I. Introduction
Parties and their elites control candidate selection and play an important role in shaping the ideological composition of legislatures and policy outcomes. In many cases, such as in Europe and most parliamentary systems, parties hold nearly complete authority to decide the composition of their candidate lists. However, our understanding of the determinants of candidate selection is still quite limited (see, e.g., the conclusion in Dal Bó and Finan 2018 and the references therein). This paper focuses on electoral rule disproportionality as a determinant of candidate selection, empirically examining its relationship with parties’ incentives to nominate a more or less cohesive list of candidates.
We refer to electoral rule disproportionality as the expected distortions generated by the electoral rule in favor of the election winner (Taagepera 1986). In proportional rules, a party’s seat share in the elected body (e.g., council or parliament) is in expectation (almost) identical to its vote share. Conversely, disproportional electoral rules generate distortions in the mapping of vote shares to seat shares that in expectation favor the election winner. These distortions can be attributed to various specific institutional characteristics, among which is the size of the elected body (Lijphart 1995; Herron, Pekkanen, and Shugart 2018). Indeed, in (very) large bodies, a party’s seat share in the elected body is, in expectation, (almost) identical to its vote share. That is, the electoral rule is proportional. Conversely, in small bodies, distortions are expected to give advantage to the election winner, making the electoral rule more disproportional. How does the magnitude of such expected advantage for the election winner affect parties’ choices when nominating their lists?
We inform our empirical approach by relying on a well-grounded conceptual framework in which parties present a list of candidates “tethered by a rubber band to the ideology espoused by the parties whose label they run on” (Grofman 2008). On the one hand, parties have incentives to obtain high vote shares so that they can influence the implemented policy in their preferred direction. Parties can attain this by adopting a more pluralistic approach, akin to a catch-all party. This involves including a diverse set of candidates in their lists, thus attracting a broader spectrum of voters with diverse ideological leanings (Kirchheimer 1966). On the other hand, electing candidates that adhere to a variety of ideologies can be harmful for the party since they may promote policies far from the party’s ideology. Under this trade-off, what is the optimal degree of a party’s ideological cohesion? Given that disproportional electoral rules are expected to favor the election winner, they strengthen parties’ incentives to become electorally appealing and hence propose a rather heterogeneous list of candidates. In contrast, relatively proportional electoral rules offer weaker incentives for an increased vote share, leading to the formation of ideologically cohesive lists.
Our empirical analysis focuses on Finnish municipal elections. This setup offers the opportunity to establish a meaningful identification strategy and utilize a novel dataset that records individual candidates’ policy positions. Causal evidence is obtained by focusing on quasi-experimental empirical evidence due to municipalities’ council size being determined solely as a step function of their population. That is, we use changes in the council size as a proxy for changes in the rule’s disproportionality, allowing for a regression discontinuity design (RDD). Data on parties’ ideological cohesion is obtained by leveraging a unique dataset recording the individual candidates’ policy positions in 2008 and 2012. These data come from the voting aid application of the Finnish public broadcasting company, YLE, and are further linked to electoral data and other candidate-level information. The questions in the voting aid applications allow us to construct two cohesion indexes, one using all available questions and one focusing on candidates’ redistribution preferences.
We start our empirical analysis by showing that council size serves as a reliable proxy for changes in the rule’s disproportionality (sec. III.D). That is, our first set of results validates our proxy by showing that, within our dataset, realized distortions in favor of parties securing high vote shares are more pronounced in small councils than in large councils. Consequently, this implies that the distortions imposed by the rule in our setting decrease with council size. The magnitude of the effect of crossing the threshold is a decrease of approximately 21%–26% relative to the mean value of the outcome, or 40%–67% of the outcome standard deviation, depending on the measure of distortions. While this result aligns with theoretical predictions and is not the primary focus of our study, this paper, to the best of our knowledge, is the first to offer causal evidence regarding the impact of council size on realized distortions. Indeed, within the existing literature, this relationship has been regarded as a folk theorem, with supporting arguments primarily being theoretical or based solely on correlations (Benoit 2000; Herron, Pekkanen, and Shugart 2018).
Our main results in section III.E show that, in the elections for smaller councils, competing lists tend to be less ideologically cohesive than in the elections for larger councils. The magnitude of the effect of crossing the threshold translates into a decrease in our two cohesion indexes by 7% relative to the mean value of these outcomes, or 20%–30% of the outcome standard deviation.
As our RDD identifies only the total effect on cohesion, we use our rich data to perform covariate balance tests and do not observe evidence of other jumps. For instance, our empirical setting permits us to focus on the effect of council size on intraparty cohesion without worrying about potential confounding effects through a change in the number of parties or candidates (Rae 1967; Taagepera and Shugart 1989; Cox 1990). Furthermore, no other policy changes take place at the thresholds that determine the council size. We also show that our results are not consistent with an alternative mechanism in which candidates optimally reposition in response to changes in the council size. Admittedly, we cannot be perfectly sure there is no other channel through which the council size could influence cohesion. Nevertheless, the electoral rule disproportionality is a plausible channel since our empirical results are in line with our conceptual framework and our balance tests do not reveal other evident determinants of cohesion. We also show that there is no sorting across the thresholds, which is natural as the municipal population is not self-reported. Last but not least, the results are robust to further batteries of robustness and validity checks. Here of particular interest is our novel use of placebo cutoff tests to assess the appropriate level of clustering in the optimal bandwidth selection.
The paper is structured as follows: In the “contribution” section just below, we present related literature and highlight our contribution. In section II we propose a conceptual framework to guide our empirical analysis. In section III we present our estimation strategy and the evidence. In section IV, we conclude. All supporting material, further discussion of our data and empirical results (e.g., robustness), and a formal analysis of our conceptual framework appear in the appendix.
Contribution.—Our estimates on the effect of electoral institutions on parties’ ideological cohesion constitute a novel finding. In a broader context, evidence regarding the causal effects of electoral institutions on any outcome remains relatively scarce (Shugart 2013). Typically, researchers have had to rely on cross-country or panel variation leaving room for potential confounding. In contrast, we leverage plausibly as-good-as-random variation within a country by utilizing the change in council size as a proxy for electoral rule disproportionality and join scholars exploiting subnational variation to identify causal effects of various dimensions of electoral institutions (see, e.g., Eggers 2015; Sanz 2017).
As Shugart (2013) and Buisseret and Prato (2017) argue, we still have a limited overall understanding of the mechanisms linking electoral institutions and the behavior of political actors. We contribute to this literature by proposing a potential mechanism through which electoral institutions could influence parties’ candidate selection on ideology. Our theoretical framework builds on recent literature, modeling proportional representation (PR) as a system of power sharing, as compared to majoritarian winner-take-all elections (Herrera, Morelli, and Palfrey 2014; Herrera, Morelli, and Nunnari 2016; Matakos, Troumpounis, and Xefteris 2016; Herrera, Llorente-Saguer, and McMurray 2019), and extends knowledge on parties’ cohesion across electoral institutions (Carey 2007; Buisseret and Prato 2022).
Closest to our setup, Matakos, Troumpounis, and Xefteris (2016) propose an electoral competition model in which two parties optimally choose their platforms on a unidimensional policy space (i.e., each party selects a unique point on the unit interval) under different levels of electoral rule disproportionality. Their main result suggests that polarization (i.e., the distance between parties’ platforms) decreases as electoral disproportionality increases, since parties have incentives to increase their vote share by converging toward the median. In our setup instead, parties propose an interval of policies (i.e., a list) that takes into account parties’ preferences over the overall composition of the elected body. In this setting, we introduce a personal vote dimension and characterize the optimal length of parties’ lists as the electoral rule disproportionality varies.1 While Matakos, Troumpounis, and Xefteris (2016) provide cross-country evidence on the correlation between the electoral rule disproportionality and between-party polarization, we focus on a within-country setting and obtain causal evidence on the effects of council size as a proxy of the electoral rule disproportionality on intraparty cohesion.
Overall, our work contributes toward a better understanding of politicians’ characteristics in representative democracies.2 While the elected officials’ characteristics and policy positions are known to matter for policy outcomes in several environments (see, e.g., Chattopadhyay and Duflo 2004; Lee, Moretti, and Butler 2004; Jones and Olken 2005; Washington 2008; Besley, Montalvo, and Reynal-Querol 2011)—including the context of our empirical exercise and via an intraparty channel (Hyytinen et al. 2018a)—different institutions seem to elect politicians with dissimilar traits (Best and Cotta 2000; Meriläinen 2022). Variation in electoral institutions has been offered as one potential explanation for differences in political selection (Carey and Shugart 1995; Galasso and Nannicini 2011; Gagliarducci and Nannicini 2013; Beath et al. 2016; Galasso and Nannicini 2017), with the present paper being, to the best of our knowledge, the first to provide causal evidence on electoral institutions affecting political selection on ideology. This result, hence, complements a long literature highlighting electoral institutions as a key determinant of the economy and the political system in representative democracies.3
II. A Conceptual Framework
Electoral rules define the allocation of a number of seats available in an elected body (e.g., a council or parliament) to parties competing in an election. That is, electoral rules define a mapping from vote shares to seat shares. Generating a fully proportional mapping in which seat shares and vote shares coincide is practically impossible due to distortions emerging when dividing a discrete number of seats. However, specific features of the electoral rule generate diverse distortions. A large literature focuses on such specific characteristics of electoral rules (Herron, Pekkanen, and Shugart 2018).4 Among those, one is the size of the elected body, the focus of our empirical application. Imagine that only one candidate is elected. One vote more than the opponent is enough and the distortion in the mapping from votes to seats is maximized. Instead, when many candidates are elected the seat share allocation is expected to generate less distortion.
In our theoretical framework, we abstract from the exact source creating distortions in the vote to seat share mapping. We follow conventional terminology by referring to the electoral rule disproportionality as the expected distortions generated by the rule in favor of the election winner and discuss a channel through which electoral rule disproportionality could affect parties’ intraparty cohesion. Our argument is based on standard assumptions of the electoral competition literature, and while presented informally here, it is derived in full mathematical detail in appendix A1. In the appendix, we also detail modifications that would not affect the intuition behind the result presented (app. A1.1). However, let us clarify that our main result essentially depends on the following key premises: On the voters’ side, our model requires that a personal vote dimension be present.5 In our benchmark model, voters vote for the party that includes in its list the candidate closest to their ideal point. In the appendix, we show that our results are robust when voters are permitted to also care about the average ideology of a given list or when some voters choose the list that offers them the higher expected utility. On the parties’ side instead, our model requires that, ceteris paribus, parties prefer to nominate candidates who share the parties’ ideology, and parties have full power in nominating their party lists.6
The electoral rule and disproportionality.—To fix ideas and provide meaningful comparative statics, our conceptual framework below presents the simplest case of a linear electoral rule (Taagepera 1986). This rule serves as a continuous approximation of the D’Hondt rule present in our empirical setting (Flis, Słomczyński, and Stolicki 2020), and links with our empirical measures of realized distortions. In the appendix, we present further details on the D’Hondt rule and obtain theoretical results for the latter as well as a larger family of electoral rules than the linear rule.
To formalize the electoral rule and its degree of disproportionality, consider a two-party election in which party $j$ obtains vote share $v_j$ and party $-j$ obtains vote share $v_{-j} = 1 - v_j$. The electoral rule in our model maps each party’s vote share $v_j$ into a seat share $s_j$, with $s_j + s_{-j} = 1$. The linear rule, indexed by a parameter $n$, is formally defined as follows:
[Equation: linear rule $s_j(v_j)$ with parameter $n$]
with $n \geq 1$ (see fig. 1).

Fig. 1. Seat share allocation given parties’ vote shares according to the linear rule, where the electoral rule disproportionality D (in gray) is increasing in n.
We define the electoral rule disproportionality as the expected distortions generated by the rule in favor of the election winner, formally defined as follows:
[Equation: definition of the disproportionality $D$]
For the linear rule, $D$ has a closed-form expression in $n$, which makes evident that the electoral rule disproportionality is increasing in the parameter $n$. This is illustrated in figure 1. When $n = 1$, parties’ seat shares accurately represent parties’ vote shares and the disproportionality is equal to zero. The extreme case of maximal disproportionality occurs in a winner-take-all election, in which $n \to \infty$.
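As a minimal numerical sketch of how disproportionality can grow with $n$, the snippet below assumes a clipped linear rule, $s(v) = 1/2 + n(v - 1/2)$ truncated to $[0, 1]$, and measures the winner's average seat bonus when the winner's vote share is drawn uniformly from $[1/2, 1]$. Both the functional form and the uniform weighting are assumptions made for illustration; they need not coincide with the exact definitions in the appendix.

```python
import numpy as np

def linear_seat_share(v, n):
    """Seat share under an ASSUMED clipped linear rule with slope n >= 1."""
    return np.clip(0.5 + n * (v - 0.5), 0.0, 1.0)

def winner_distortion(n, grid=100_001):
    """Average seat bonus s(v) - v of the election winner.

    The winner's vote share v is taken to be uniform on [1/2, 1];
    this weighting is an assumption of the sketch.
    """
    v = np.linspace(0.5, 1.0, grid)
    return float(np.mean(linear_seat_share(v, n) - v))

for n in (1, 2, 5, 50):
    # The bonus is zero under proportionality (n = 1) and rises with n.
    print(n, round(winner_distortion(n), 4))
```

Under these assumptions the bonus is zero at $n = 1$ and approaches its upper bound as the rule tends to winner-take-all.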
The election, parties, and voters.—Consider an election in a unidimensional policy space in which two parties (L, R) strategically choose the ideological heterogeneity of a continuum of candidates (list) competing in the election under the party’s label. That is, candidates’ ideologies are just numbers on the real line, and a list of candidates is given by a closed interval containing the party’s ideal policy. For example, in figure 2, the two parties propose lists [0.2, 0.4] and [0.8, 0.9], respectively.7

Fig. 2. An example in which parties propose lists [0.2, 0.4] and [0.8, 0.9]. Parties’ vote shares are $v_L = 0.6$ and $v_R = 0.4$. The electoral rule disproportionality determines parties’ seat shares and hence the distribution of the represented ideologies in the elected body.
Voters’ preferred policies are uniformly distributed on the same policy space. Voters vote for the party that included the candidate closest to their preferred policy in its list. In the example of figure 2, all voters to the left of the indifferent voter (located at 0.6) vote for party L, and all voters to the right of the indifferent voter vote for party R, implying vote shares $v_L = 0.6$ and $v_R = 0.4$.
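The arithmetic behind the example can be sketched in a few lines. The snippet assumes that the policy space is the unit interval and that the two lists do not overlap, so the indifferent voter sits at the midpoint between the most moderate candidates of the two lists; the function names are hypothetical.

```python
def indifferent_voter(list_left, list_right):
    """Midpoint between the most moderate candidate of each list."""
    return (list_left[1] + list_right[0]) / 2

def vote_shares(list_left, list_right):
    """Vote shares (v_L, v_R) with voters uniform on [0, 1].

    The unit-interval policy space is an assumption of this sketch.
    """
    cut = indifferent_voter(list_left, list_right)
    return cut, 1 - cut

# Lists from the example: L proposes [0.2, 0.4], R proposes [0.8, 0.9].
v_L, v_R = vote_shares((0.2, 0.4), (0.8, 0.9))
print(round(v_L, 3), round(v_R, 3))
```

Extending the left list toward the center, say to [0.2, 0.5], moves the indifferent voter to the right and raises party L's vote share, which is the vote-seeking motive discussed in the predictions.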
The elected body.—Parties’ lists and seat shares $s_L$ and $s_R$ determine the distribution of ideologies in the elected body. The represented ideologies coincide with those of the nominated candidates, and each ideology has an equal weight within each party.8 Take for example party L in figure 2A. This party obtains seat share $s_L$, and each of the ideologies in [0.2, 0.4] is represented in the elected body with an equal weight (i.e., with density $s_L/0.2$). In contrast, in figure 2B, under a more disproportional rule, party L obtains a larger seat share, and each of the proposed ideologies in [0.2, 0.4] is represented in the elected body with a higher weight than before.
Parties’ payoffs.—We assume that parties are policy motivated and their payoffs depend on the overall distribution of ideologies in the elected body. Specifically, we consider that parties evaluate each ideology $t$ represented in the body using a quadratic disutility function, with their overall aim being to minimize the average disutility from the represented ideologies. Formally, the payoff of party $j \in \{L, R\}$, given the elected body, is
$$U_j = -\int (t - x_j)^2 f(t)\,dt,$$
where $f(t)$ is the density of the represented ideologies as previously discussed and illustrated in figure 2, and $x_j$ denotes the party’s ideal policy.
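To make the payoff concrete, the sketch below evaluates the average quadratic disutility for the lists of figure 2, using the closed form for the mean of $(t - x)^2$ over a uniform interval. The ideal policy of 0.3 and the two seat-share scenarios are hypothetical values chosen for illustration, and the function names are not taken from the paper.

```python
def mean_sq_distance(a, b, x):
    """Average of (t - x)^2 for t uniform on [a, b], with a < b."""
    return ((b - x) ** 3 - (a - x) ** 3) / (3 * (b - a))

def party_payoff(x_j, lists, seat_shares):
    """Negative average quadratic disutility over the elected body.

    lists: {party: (a, b)}; seat_shares: {party: s}, summing to one.
    Ideologies are weighted uniformly within each party's list.
    """
    return -sum(seat_shares[p] * mean_sq_distance(a, b, x_j)
                for p, (a, b) in lists.items())

lists = {"L": (0.2, 0.4), "R": (0.8, 0.9)}
x_L = 0.3  # hypothetical ideal policy of party L

# Same lists and votes, two hypothetical seat allocations.
payoff_proportional = party_payoff(x_L, lists, {"L": 0.6, "R": 0.4})
payoff_disproportional = party_payoff(x_L, lists, {"L": 0.7, "R": 0.3})
print(payoff_proportional, payoff_disproportional)
```

In this example party L, the vote winner, is better off when a larger seat share shifts weight in the elected body toward its own list.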
Predictions.—Parties propose a list of candidates running for a seat in the elected body (e.g., council or parliament) aiming at maximizing their payoff. So what determines parties’ incentives to propose a more or less cohesive list of candidates?
A party has incentives to expand its list toward moderate grounds since the indifferent voter moves in its favor and the party attracts more votes. Any expansion toward moderate grounds in search of a higher vote share should be counterbalanced by the inclusion of more extreme candidates too, so that parties make sure that the elected candidates are not too moderate compared to the party’s ideal policy. Hence, parties benefit when increasing the ideological heterogeneity of their candidates by increasing their vote share and hence their seat share. However, proposing a heterogeneous list of candidates comes at a cost: the ideologies represented in the elected body may be too different from those parties advocate and, thus, negatively affect parties’ payoffs, despite parties’ optimally balancing their lists.
The above trade-off and the incentives to propose a more or less cohesive list of candidates depend on the electoral rule disproportionality. Recall that, as the rule becomes more disproportional, the expected advantage for the election winner is increasing. That is, disproportional rules reward high vote shares more generously in terms of seat allocation than proportional rules. Therefore, parties’ incentives to become less ideologically cohesive are increasing in the electoral rule disproportionality since the expected benefits of doing so outweigh the associated cost. The following comparative static result provides our main theoretical prediction formally presented in the appendix.
Prediction 1.
Parties’ lists of candidates become less ideologically cohesive as the electoral rule disproportionality D increases.
Figure 3 illustrates prediction 1 plotting the ideology of the two most extreme candidates in each list, highlighting how parties’ lists become less ideologically cohesive as the electoral rule disproportionality increases. Prediction 1 provides the main testable prediction of our theoretical model. In our empirical setting, we are using council size as a proxy of the electoral rule disproportionality and the above prediction implies that parties’ lists of candidates are predicted to be more ideologically cohesive as the council size increases.9

Fig. 3. An example of equilibrium lists considering the linear rule for different levels of disproportionality $D$, given parties’ ideal policies $x_L$ and $x_R$. For each party $j$, the ideologies of the two most extreme candidates in its list are plotted.
III. Empirical Evidence
We first describe the institutional details (sec. III.A), data of the empirical setup (sec. III.B), and then detail our identification strategy (sec. III.C). The same identification strategy is then used twice. First, we validate the use of council size as a proxy of the electoral rule disproportionality (sec. III.D). Second, we present our main results on the effect of the council size on parties’ ideological cohesion (sec. III.E). Finally, we discuss alternative mechanisms that could potentially explain the effect of council size on cohesion instead of the electoral rule disproportionality (secs. III.F and III.G) and provide further discussion on our methodology and robustness results (sec. III.H).
A. Institutional Details
In our period of analysis, municipalities have a central role in the highly decentralized Finnish system. They spend more than 5,000 euros per capita annually and employ around 20% of the total workforce. Most of this expenditure is used to take care of statutory responsibilities, including social care, health care, and primary education. To cover these expenditures, Finnish municipalities are allowed to set and collect income taxes, property taxes, and out-of-pocket payments from users of municipal services. In addition, municipalities receive a share of corporate taxes and fiscal grants from the central government. Therefore, municipalities wield a lot of power over public expenditures and revenues.
Municipal councils are the main seat of power in the municipal decision-making system. No official ruling coalition government is formed after the elections, and councils decide by simple majority vote on an issue-by-issue basis. Relative to the parliamentary politics in Finland, Finnish local politics do not have very strong party discipline in place. Mayors are merely civil servants chosen by the council. The council also nominates a municipal board that has a preparatory role. Councilors are “leisure” politicians who receive small meeting fees while holding regular jobs. There is no evidence of large monetary gains from holding local office (Kotakorpi, Poutvaara, and Terviö 2017). Despite the small personal monetary stakes for the politicians, these are high-visibility elections that concern positions of power over important policies. Indeed, extensive existing causal evidence shows that individual councilors’ characteristics matter for key policies in Finnish local governments (Hyytinen et al. 2018a; Meriläinen 2022; Harjunen, Saarimaa, and Tukiainen 2023).
Our conceptual framework was constructed with close parallels to Finnish municipal elections, where parties propose a list of candidates competing in an open list, and voters vote for a candidate who belongs to one of the lists. The number of candidates elected in each municipality (i.e., council size $k$) varies between 13 and 85 and is a deterministic step function of the municipal population.10 In each municipality of council size $k$, parties propose an open list of up to $1.5k$ candidates.11 Candidates appear on the list in alphabetical order. Each voter votes for one candidate and cannot vote for a party without specifying a candidate. Candidates’ personal votes are aggregated at the list level to determine lists’ vote shares.12 Lists’ vote shares are mapped into lists’ seats according to the D’Hondt allocation method. Seats are in turn allocated to the candidates with the most votes within a list. The D’Hondt method is the most common way of allocating seats in proportional representation systems and is extensively used in western Europe (Kotanidis 2019). Two key elements of this rule are that (a) in expectation this rule favors the winner of the election (Gallagher 1991; Schuster et al. 2003; Herron, Pekkanen, and Shugart 2018; Kotanidis 2019), and (b) the advantage to the winner is decreasing in the number of seats $k$ (Benoit 2000; Schuster et al. 2003; Herron, Pekkanen, and Shugart 2018; Kotanidis 2019; Fiva and Hix 2021).
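The seat allocation step can be sketched as follows. This is a minimal illustration of the D'Hondt highest-averages method; the function name, the vote counts, and the tie-breaking rule are assumptions made for the example and are not details of the Finnish implementation.

```python
def dhondt(votes, seats):
    """Allocate seats by the D'Hondt highest-averages method.

    votes: {list_name: vote_count}. Ties go to the list that appears
    first in the dict, a simplification made for this sketch.
    """
    alloc = {name: 0 for name in votes}
    for _ in range(seats):
        # The next seat goes to the list with the largest quotient v / (s + 1).
        best = max(votes, key=lambda name: votes[name] / (alloc[name] + 1))
        alloc[best] += 1
    return alloc

votes = {"A": 480, "B": 330, "C": 190}  # hypothetical vote counts
total = sum(votes.values())
for k in (13, 27, 85):
    alloc = dhondt(votes, k)
    # Seat share minus vote share of the largest list.
    bonus = alloc["A"] / k - votes["A"] / total
    print(k, alloc, round(bonus, 3))
```

Running the loop for several council sizes shows how the largest list's seat bonus depends on $k$ in one hypothetical vote profile; a single profile is only suggestive, since the claim in the text concerns the expected advantage.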
B. Data Sources
We combine data from several sources covering the Finnish municipal elections in 2008 and 2012. First, our key data on individual candidates’ policy positions originate from the voting aid application (VAA) of the Finnish public broadcasting company, YLE. The YLE voting aid application is first open only to candidates who may reply to closed-ended questions focusing on current policy issues (see app. A4 for a detailed description). During the response period, each candidate has access only to their own replies, which can be modified during this time but not afterward. Once the candidates’ response period is over, the voting aid applications become publicly available. A voter can fill in the same questionnaire online and compare their replies to those of the candidates. The application also provides a list of candidates whose replies are closest to those of the voter. The open list makes Finland a fertile ground for the use of the voting aid application, as voters must find an individual candidate to vote for, so mere party-level information is not enough to guide the choice. Using the application is free of charge for both candidates and voters.13 We have access to these data only for 2008 and 2012 elections.14
Given the importance of the VAA in generating votes, candidates may have incentives for strategic responses. However, as the matching algorithms are trade secrets, they are not trivial to game. The strategic behavior by the candidates is further complicated by the fact that voter responses are not available, even afterward. In addition, the responses of the candidates are fixed once the response period has ended so that candidates cannot react to other candidates’ responses. Accordingly, Ilmarinen et al. (2022) show that the candidates’ responses are sincere rather than strategic by showing that candidates respond in the same way to a confidential survey as to the VAA.
Filling in the voting aid application questionnaire is not obligatory for the candidates. The median response rate by municipality in 2008 was 47.8% of the candidates, and on average, the candidates who did fill in a voting aid application questionnaire received 56.2% of the votes of the municipality.15 The equivalent figures for 2012 were 47.2% of the candidates and 54.3% of the votes. Generally, the candidates who respond to the voting aid application are politically more successful, experienced, younger, more educated, and more likely to be women (table A9). However, as we show in table 3, both the number of respondents and the number of all candidates (and thus, also the response rate) are balanced across the cutoffs used in the RDD, and hence, attrition is unlikely to pose any threat to our identification strategy. Nonetheless, we can still have the issue that different types of candidates respond across the cutoff. However, as detailed in table A2, respondents’ observed characteristics are also balanced across the cutoff.
Second, we use electoral data available from the Ministry of Justice with candidate-level information on candidates’ age, gender, party affiliation, their election outcomes (number of votes and whether elected), and the possible incumbency status. These electoral data are linked to the data from Statistics Finland on candidates’ education, occupation, and socioeconomic status. Moreover, we match the candidate-level data with Statistics Finland’s data on municipal characteristics. We have also collected information on pre-electoral coalitions of parties.
Using the electoral data, we construct our main disproportionality measures that we detail in section III.D. Similarly, using the YLE data, we construct the main outcome variables on within-party cohesion that we detail in section III.E. All variables are summarized in table A10.
C. Identification Strategy and Estimation
The deterministic council size rule allows for a sharp regression discontinuity design (RDD). The idea of our empirical strategy is to compare outcomes in municipalities just below and above the council size cutoff points.16 The identifying assumption in such an RDD is that individuals cannot precisely manipulate the forcing variable (see, e.g., Lee and Lemieux 2010). This is true in our case because municipalities do not self-report their population. The sufficient identification assumption is that the potential outcomes develop smoothly over the threshold.
We are interested mainly in two outcomes. First, we show that the council size has the expected effect on the distortions in the mapping of vote shares to seat shares. This validates that changes in the council size determined by the municipalities’ population serve as a proxy of the electoral rule disproportionality. Second, as the main empirical contribution, we analyze whether there is an increase in intraparty cohesion at the threshold (that is, a discontinuous jump downward in our within-party heterogeneity indexes). Finally, we discuss other possible mechanisms that could explain the cohesion result.
To achieve this, we estimate regression models of these outcomes on a set of zero-one indicators for being above a cutoff point. We also include a flexible, but smooth, function of the population as control variables. The population variables should pick up the impact of all the determinants of within-party cohesion correlated with the population, apart from the council size. Hence, we will obtain a reliable estimate of the causal effect of the council size on party cohesion, clean of confounding factors, which might otherwise bias our estimates.
As is standard in the literature, we use nonparametric local linear regressions as our main specification. We apply the bias correction and robust inference procedure by Calonico, Cattaneo, and Titiunik (2014), which we implement using the Calonico et al. (2016) “rdrobust” package in Stata. Based both on the Monte Carlo evidence by Calonico, Cattaneo, and Titiunik (2014) and Calonico, Cattaneo, and Farrell (2018) and on an experimental benchmark by Hyytinen et al. (2018b), this approach performs best among the standard implementation options (i.e., vs. conventional local linear without the bias-correction and/or robust inference, and parametric polynomial specifications). We use the latest mean squared error (MSE)–optimal bandwidth procedure proposed in Calonico et al. (2016) and apply a triangular kernel.
We report the conventional local linear MSE-optimal coefficients, because of the method’s optimal properties when it comes to point estimation. However, for statistical inference, and because of the superior coverage properties of the latter method, we report confidence intervals based on the bias-corrected coefficients and the associated robust inference by Calonico, Cattaneo, and Titiunik (2014). This is somewhat nonstandard reporting, as it implies that the reported 95% confidence interval is not centered precisely around the reported coefficient (but rather around the bias-corrected coefficient) but is, nonetheless, a well-motivated way to report. We report both classical and clustered inferences. The classical (nonclustered) inference has long been standard in RDD, as the typical optimal bandwidth selection methods have not been optimized for clustering. Because of the recent advances by Calonico et al. (2016), we can now also optimize the bandwidth selection while clustering. Note that, as opposed to the normal (non-RDD) case, clustering also changes the coefficients because the optimal bandwidths change.
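For readers unfamiliar with the estimator, the conventional local linear point estimate with a triangular kernel can be sketched as below. This is not the rdrobust implementation: the bandwidth is supplied by hand rather than chosen by the MSE-optimal procedure, and no bias correction, robust inference, or clustering is attempted. The data are simulated and the function name is hypothetical.

```python
import numpy as np

def local_linear_rd(z, y, h):
    """Conventional sharp-RD estimate at cutoff 0 with a triangular kernel.

    Fits a separate weighted linear regression on each side of the
    cutoff within bandwidth h and returns the difference in intercepts.
    """
    z, y = np.asarray(z, float), np.asarray(y, float)
    w = np.clip(1 - np.abs(z) / h, 0, None)  # triangular kernel weights

    def intercept(mask):
        X = np.column_stack([np.ones(mask.sum()), z[mask]])
        sw = np.sqrt(w[mask])
        # Weighted least squares via rescaling rows by sqrt(weight).
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y[mask] * sw, rcond=None)
        return beta[0]

    keep = w > 0
    return intercept(keep & (z >= 0)) - intercept(keep & (z < 0))

rng = np.random.default_rng(0)
z = rng.uniform(-1, 1, 2000)  # simulated forcing variable
y = 0.5 * z - 0.3 * (z >= 0) + rng.normal(0, 0.05, z.size)  # true jump: -0.3
est = local_linear_rd(z, y, h=0.5)
print(round(est, 3))
```

The triangular kernel gives the most weight to observations closest to the cutoff, which is why the estimate is driven by municipalities just below and just above a threshold.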
One complication to our analysis is how to deal with multiple thresholds. One standard option is to calculate the forcing variable as a population distance to the nearest threshold and simply define a single group for being above a threshold. Given the limited number of observations, we use this pooling option here. Cattaneo et al. (2016) show that, even if the pooling results in a loss of information, it produces meaningful (particularly weighted) treatment effect estimates. We can express this pooling approach as estimating regression functions of the form
$$Y_{it} = \alpha + \delta \cdot \mathbb{1}[z_{it} > 0] + f(z_{it}) + \varepsilon_{it},$$
where Yit is the outcome of interest, zit is the forcing variable measuring the distance from the normalized population cutoffs for each observation i in election t, 1[zit > 0] is an indicator function for being above a cutoff, α is a constant, εit is an error term, and δ is the coefficient of interest. If f(zit) is approximately correctly specified within a bandwidth and there is no precise manipulation of the forcing variable (i.e., the density is smooth at the threshold), the covariates should evolve smoothly at the boundary and, thus, δ is the causal estimate of interest.17
In all the analysis, we limit our sample to the municipalities with a population below 22,500, the midpoint between the fourth and fifth thresholds. This focuses the analysis around the four smallest thresholds where the data are densest. Moreover, omitting larger cutoffs is theoretically motivated, as the relative changes in the council size are too small to induce a substantial treatment; that is, the changes to proportionality are very small. This also means that the treatment effect on cohesion should be very small at the large cutoffs (see fig. A4). By including such observations we arguably add more noise than information.18 In addition, we omit the municipalities that underwent a municipal merger prior to the elections, as these have an impact on the council size and many other features of political competition. Overall, our sample contains 76% of all Finnish municipalities and 31% of the Finnish population.
Even if our pooling approach is standard in the literature, it is not entirely unproblematic. The main issue is that one could possibly end up comparing, for example, a municipality with a population of 1,999 (just below the first threshold) to a municipality with a population of 8,001 (just above the third threshold). This is clearly not a valid comparison for causal inference. Therefore, a further identifying assumption for pooling is that the share of identifying observations on both sides of each threshold is the same (which would happen in large samples due to local randomization). Thus, the McCrary (2008) density tests need to be reported separately for each threshold rather than only for the pooled sample. We do not observe any jumps in the density at any of the individual cutoffs or at the pooled one (see fig. A9).
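The pooled normalization, and the comparison problem it can create, can be sketched in a few lines of Python. This is a hedged illustration rather than our estimation code: the names are hypothetical, the cutoffs are the four smallest thresholds of the council-size schedule reported in the notes, and the treatment of a population exactly halfway between two cutoffs is an assumption made only for this sketch.

```python
# Four smallest population cutoffs (from the council-size schedule in the notes).
CUTOFFS = (2000, 4000, 8000, 15000)
MAX_POPULATION = 22500  # sample restriction: midpoint of the 4th and 5th thresholds


def pooled_running_variable(population):
    """Return (nearest cutoff, distance z to it, indicator for z > 0)."""
    if population >= MAX_POPULATION:
        raise ValueError("municipality is outside the estimation sample")
    # Assumption: ties between two equidistant cutoffs go to the lower one.
    nearest = min(CUTOFFS, key=lambda c: abs(population - c))
    z = population - nearest
    return nearest, z, int(z > 0)


for pop in (1999, 2001, 8001):
    print(pop, pooled_running_variable(pop))
```

In this sketch, populations of 1,999 and 8,001 receive z = −1 and z = +1, respectively, so they sit next to each other in the pooled sample even though they belong to different thresholds. This is why the balance of observations around each individual threshold matters.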
The standard identifying assumptions of our model imply that other possible determinants of intraparty cohesion should develop smoothly with respect to the population and, therefore, be captured by the f function. This assumption is violated if there are other relevant factors that also depend on the same population rule. Eggers et al. (2018) have raised this concern especially in relation to the case of analyzing population thresholds since, in many countries, municipal responsibilities, grants, and regulation, as well as politicians’ salaries, also depend on the very same thresholds. In such cases, there are several simultaneous exogenous treatments and RDD is able to identify only their joint effect. None of these concerns is present in the Finnish system. However, the council size in itself can have different electoral effects because candidates, parties, and voters may respond to it in various ways. To argue that the empirical mechanism is in line with our conceptual framework, we rely mainly on the covariate balance tests of the pretreatment (before elections) variables (sec. III.F).
D. Council Size and Disproportionality
Note that the degree of disproportionality of an electoral rule—indicating the rule’s tendency to favor parties obtaining high vote shares—is conceptually distinct from the precise advantage it confers on those parties in a specific election.19 However, given a large enough sample of electoral results, a more disproportional rule should be associated with a greater average advantage in favor of parties obtaining high vote shares. Indeed, in this section, we illustrate that the realized distortions favoring such parties in the mapping between vote shares and seat shares are, on average, more pronounced in smaller councils (see, also, Lijphart 1995; Herron, Pekkanen, and Shugart 2018). That is, we validate that council size is a good proxy of the electoral rule disproportionality.20
1. Empirical Measures of Distortions
We use two established empirical measures of realized distortions in a certain election: (i) the slope index, directly linking to the linear rule presented in our conceptual framework, and (ii) a version of the Gallagher index. Both indexes capture the distortions in the mapping of parties’ vote shares to parties’ seat shares and take larger values the higher the distortion.21 In total, we have 505 observations at the municipality-year level for which we compute the two indexes as our main dependent variables.
The slope index first proposed by Cox and Shugart (1991) is constructed as follows: For each municipality-year observation, we regress the difference sj − vj between each party j’s seat share sj and vote share vj on vj. Then, we define the slope of the line obtained from this regression as the slope index. The advantage of the slope index is that it captures the size of the distortions alongside whether they favor large or small parties. The slope index is also linked to the way we have modeled the electoral rule disproportionality in our conceptual framework using the linear rule.
Figure 4 illustrates how the slope of the regression line captures not only the size of the realized distortions but also the direction. On the left, we depict one municipality-year observation for which the distortions are small and the slope of the regression is relatively flat (0.011). In a pure PR rule, the slope would be zero. On the right, we depict another observation for which the slope is positive and relatively large (0.27), pointing to distortions favoring parties obtaining high vote shares. That is, the slope of the line used as our slope index indicates whether distortions favor parties obtaining high vote shares (positive slope), parties obtaining low vote shares (negative slope), or the absence of distortions (flat line).

Fig. 4. The slope index as the slope of the regression of sj − vj on vj. The slope index takes the value of 0.0113 on the left (the Ilmajoki municipality in year 2012 with five competing lists and a council size of 35) and of 0.2712 on the right (the Utsjoki municipality in year 2008 with six competing lists and a council size of 21).
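The construction of the slope index can be sketched as follows. This is a minimal illustration, assuming that the regression of sj − vj on vj includes an intercept; the function name and the vote and seat shares are hypothetical and do not correspond to any municipality in our data.

```python
def slope_index(vote_shares, seat_shares):
    """OLS slope (with intercept) from regressing s_j - v_j on v_j across parties."""
    n = len(vote_shares)
    diffs = [s - v for v, s in zip(vote_shares, seat_shares)]
    v_bar = sum(vote_shares) / n
    d_bar = sum(diffs) / n
    sxx = sum((v - v_bar) ** 2 for v in vote_shares)
    sxy = sum((v - v_bar) * (d - d_bar) for v, d in zip(vote_shares, diffs))
    return sxy / sxx


# Hypothetical election in which the largest list gains seats at the
# expense of the smallest one.
votes = [0.5, 0.3, 0.2]
seats = [0.6, 0.3, 0.1]
print(slope_index(votes, seats))

# Under pure proportionality the seat shares equal the vote shares.
print(slope_index(votes, votes))
```

A positive value indicates distortions in favor of lists with high vote shares, a negative value distortions in favor of lists with low vote shares, and zero the absence of distortions.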
We employ the modified Gallagher index as a second index to guarantee robustness of our results to alternative measures of realized distortions given that the early debate on the “best” way to capture the latter is still open (e.g., Lijphart 1995). The modified Gallagher index in municipality i in year t is defined as
where j = 1, … , p denotes the p different parties running in municipality i in year t. The difference in the summation term represents the distortions when a party j that obtains vote share vj is allocated a seat share sj. The value of the index is increasing in the level of distortions. In a pure PR system with no distortions, the index takes value zero since this difference is zero for all parties. Notice that, while the modified Gallagher index (as most indexes proposed in the literature) does well in representing the level of distortions in the vote-to-seat-share mapping, it remains silent on the direction of these distortions.22
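For intuition, the sketch below computes the textbook Gallagher (least squares) index, the square root of half the sum of squared differences between vote and seat shares. It is a stand-in under an explicit assumption: the modified version used in this paper may differ from the textbook form in its scaling or weighting, so the numbers produced here are not comparable to those in our tables. The function name and the shares are hypothetical.

```python
import math


def gallagher_index(vote_shares, seat_shares):
    """Textbook Gallagher index: sqrt(0.5 * sum_j (v_j - s_j)^2).

    Assumption: this is the standard form, not necessarily the modified
    index used in the paper.
    """
    squared = sum((v - s) ** 2 for v, s in zip(vote_shares, seat_shares))
    return math.sqrt(0.5 * squared)


votes = [0.5, 0.3, 0.2]
seats = [0.6, 0.3, 0.1]
print(gallagher_index(votes, seats))

# The mirror-image distortion (favoring the smallest list) gives the same value,
# which illustrates that the index is silent on the direction of distortions.
print(gallagher_index(votes, [0.4, 0.3, 0.3]))
```

Both calls return the same value because the squared differences ignore the sign of the distortion, which is the reason for also reporting the slope index.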
2. Distortion Results
We begin this RDD analysis with a graphical visualization of the jumps at the pooled cutoff. In figure 5, we report the results for both indexes of realized distortions. The results are very similar for both indexes and both of them jump down at the cutoff.23

Fig. 5. Pooled RDD. We use the “rdplot” command from the “rdrobust” package in STATA. We report bins that mimic variance by using an evenly spaced method with spacing estimators. We use a fourth-order polynomial for the fit.
In table 1, we report the nonparametric RDD results on the effect of the council size on realized distortions. As for the main results, we report the conventional local linear MSE-optimal coefficients. For statistical inference, and because of its superior coverage properties, we report confidence intervals based on the bias-corrected coefficient and the associated robust inference by Calonico, Cattaneo, and Titiunik (2014). We also report both the nonclustered results and those clustered at the municipality level. In line with the properties of the D’Hondt rule (see app. A1.1), all the coefficients have the expected negative sign, implying that distortions are smaller as the council size increases. They are statistically significant at the 5% level for the modified Gallagher index and (barely) insignificant for the slope index. The magnitude of the effect of crossing the threshold translates into a decrease in the slope index (modified Gallagher index) by roughly 21% (26%) relative to the mean value of the outcome, or 40% (67%) of the outcome standard deviation.
| | (1) | (2) |
|---|---|---|
| A. Slope Index | ||
| Conventional local linear RD coefficient | −.023 | −.023 |
| 95% confidence interval with bias correction and robust inference | [−.063, .007] | [−.062, .007] |
| Observations within main bandwidth | 267 | 268 |
| MSE-optimal bandwidths (main/bias) | 807/1,385 | 811/1,409 |
| Clustered bandwidths and standard errors | No | Yes |
| Outcome mean (standard deviation) | .108 (.057) | |
| B. Modified Gallagher Index | ||
| Conventional local linear RD coefficient | −.012 | −.012 |
| 95% confidence interval with bias correction and robust inference | [−.022, −.003] | [−.023, −.002] |
| Observations within main bandwidth | 239 | 240 |
| MSE-optimal bandwidths (main/bias) | 714/1,097 | 724/1,090 |
| Clustered bandwidths and standard errors | No | Yes |
| Outcome mean (standard deviation) | .047 (.018) | |
While it is a well-known fact in political science that council size (or district magnitude) serves as a proxy for electoral rule disproportionality (see, e.g., Carey and Shugart 1995; Carey and Hix 2011), and this association is well established at a theoretical and descriptive level, there has been, to our knowledge, a lack of causal evidence on this relationship. Thus, our paper makes an additional independent contribution through these results, even if they are not its primary focus.
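The theoretical association between council size and distortions under the D’Hondt rule can be illustrated with a small Python sketch. The vote counts are hypothetical and the function names are ours; the two council sizes are taken from the Finnish schedule. The example shows a single vote profile only: for particular profiles the seat-vote gap need not fall monotonically with the number of seats, and the claim in the text concerns average distortions.

```python
def dhondt(votes, seats):
    """Allocate `seats` by the D'Hondt highest-averages rule (ties go to the first list)."""
    won = [0] * len(votes)
    for _ in range(seats):
        # The next seat goes to the list with the highest quotient votes / (seats won + 1).
        winner = max(range(len(votes)), key=lambda j: votes[j] / (won[j] + 1))
        won[winner] += 1
    return won


def largest_gap(votes, seats):
    """Largest absolute difference between seat share and vote share across lists."""
    won = dhondt(votes, seats)
    total = sum(votes)
    return max(abs(w / seats - v / total) for w, v in zip(won, votes))


votes = [100, 80, 30, 20]  # hypothetical vote counts for four lists
for size in (13, 35):
    print(size, dhondt(votes, size), round(largest_gap(votes, size), 4))
```

For this profile the allocation with 13 seats is [6, 5, 1, 1] and with 35 seats is [16, 12, 4, 3], so the largest gap between seat and vote shares is smaller in the larger council.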
E. Main Results: Council Size and Intraparty Heterogeneity
We now analyze how the council size influences intraparty cohesion. Guided by our theoretical framework, we are expecting that lists become less heterogeneous as the council size becomes larger (prediction 1).
1. Empirical Measures of Cohesion
Our main outcome variables are two indexes of candidate heterogeneity, based on candidates’ responses to questions present in the voting aid application. The all-questions index is constructed using all available responses. The redistribution index uses only the subset of questions on taxation and redistribution (see app. A4 for the subset). The benefit of the comprehensive index is that it avoids selecting on questions. However, its drawback is that some of the questions may be less relevant for the voters, and the interpretation of this index is more complex. Therefore, we also use the redistribution index, which is policy relevant and matches a classic left-right dimension in political economy.
We measure ideological heterogeneity using Euclidean distances (scaled by the number of questions) for both indexes.24 That is, we first compute for each candidate the distance between their response and their local party mean response for each question, and square it. Then we sum these squared distances over all the questions included in the index and take the square root of that sum. Finally, we divide this root by the number of questions each year to reduce residual variance due to the fact that the number of questions differs by year.25 Our scaling makes not only the mean but also the entire distribution more comparable across the years, which is useful in the RDD analysis for computing optimal bandwidth and avoiding the need to control for a year fixed effect. If the distance is zero for a candidate, their ideology coincides with the local party mean. The larger the distance is, the more this candidate deviates from the local mean. We present descriptive statistics of the indexes in table A10 and their histograms in figure 6.
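The steps above can be sketched in Python for a single local party list. This is an illustration under assumptions: the function name is hypothetical, the responses are invented and placed on an arbitrary numeric scale, and every candidate is assumed to have answered every question (missing answers are not handled).

```python
import math


def candidate_distances(responses):
    """Scaled Euclidean distance of each candidate from the local party mean.

    `responses` holds one list of numeric answers per candidate, all on the
    same local party list and covering the same questions.
    """
    n_candidates = len(responses)
    n_questions = len(responses[0])
    party_mean = [
        sum(r[q] for r in responses) / n_candidates for q in range(n_questions)
    ]
    distances = []
    for r in responses:
        squared = sum((r[q] - party_mean[q]) ** 2 for q in range(n_questions))
        # Root of the summed squares, divided by the number of questions.
        distances.append(math.sqrt(squared) / n_questions)
    return distances


# Hypothetical list with three candidates answering two questions.
answers = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
print(candidate_distances(answers))
```

In this example the third candidate coincides with the local party mean and has distance zero, while the first two candidates are equally far from it. Averaging these candidate distances yields the party-level and municipality-level outcomes.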

Fig. 6. Histograms of the all-questions and redistribution indexes at the candidate level.
For the analysis, we include only the parties with more than five candidates responding to the YLE voting aid application at the municipality-party-election level.26 This leaves us with 14,999 candidate–election year, 1,184 party–election year, and 475 municipality–election year observations.27
2. Cohesion Results
We begin the main RDD analysis with a graphical visualization of the jumps at the cutoff. In figure 7, we report the results for the two heterogeneity indexes using a pooled RDD and observations aggregated to the municipality-year level. Both heterogeneity indexes jump down at the threshold.28 That is, parties become more ideologically cohesive as the council size increases.

Fig. 7. Pooled RDD. We use the “rdplot” command from the “rdrobust” package in STATA. We report bins that mimic variance by using an evenly spaced method with spacing estimators. We use a fourth-order polynomial for the fit.
We present the nonparametric regression results in table 2. Overall, the evidence is in line with our conceptual framework: The estimate is always negative indicating that party cohesion increases (that is, our dependent variable decreases) as the council size increases.
| | All Questions (1) | All Questions (2) | Redistribution (3) | Redistribution (4) |
|---|---|---|---|---|
| A. Candidate-Year Level Analysis | | | | |
| Conventional local linear RD coefficient | −.012 | −.009 | −.023 | −.021 |
| 95% confidence interval with bias correction and robust inference | [−.018, −.008] | [−.019, −.003] | [−.043, −.012] | [−.043, −.009] |
| Observations within main bandwidth | 2,014 | 4,256 | 3,589 | 4,357 |
| MSE-optimal bandwidths (main/bias) | 408/820 | 693/1,143 | 588/1,019 | 679/1,118 |
| Clustered bandwidths and standard errors | No | Yes | No | Yes |
| Outcome mean (standard deviation) | .136 (.029) | | .292 (.101) | |
| B. Municipality-Party-Year Level Analysis | | | | |
| Conventional local linear RD coefficient | −.009 | −.008 | −.020 | −.020 |
| 95% confidence interval with bias correction and robust inference | [−.017, −.004] | [−.018, −.001] | [−.044, −.004] | [−.042, −.006] |
| Observations within main bandwidth | 348 | 417 | 382 | 363 |
| MSE-optimal bandwidths (main/bias) | 617/1,072 | 711/1,152 | 663/1,100 | 651/1,100 |
| Clustered bandwidths and standard errors | No | Yes | No | Yes |
| Outcome mean (standard deviation) | .137 (.015) | | .293 (.042) | |
| C. Municipality-Year Level Analysis | | | | |
| Conventional local linear RD coefficient | −.006 | −.006 | −.028 | −.028 |
| 95% confidence interval with bias correction and robust inference | [−.016, .003] | [−.015, .002] | [−.060, −.007] | [−.058, −.009] |
| Observations within main bandwidth | 243 | 242 | 161 | 160 |
| MSE-optimal bandwidths (main/bias) | 813/1,191 | 813/1,206 | 558/956 | 550/958 |
| Clustered bandwidths and standard errors | No | Yes | No | Yes |
| Outcome mean (standard deviation) | .136 (.013) | | .292 (.032) | |
Panel A of table 2 reports the analysis at the individual candidate level. Especially in this specification, clustering (optimal bandwidth selection and inference) at the municipality level is the most reliable approach as our treatment has no variation within the municipality-year level. To confirm that this does not give us excess power, we repeat the analysis at the municipality-party-year (panel B) and municipality-year (panel C) level in table 2. These aggregated outcomes are calculated as means over the individual candidate distances aggregated to the respective level. The results are robust to these modifications.29
The effect of crossing the threshold in table 2 translates into a decrease of roughly 7% relative to the mean value of the outcome for both heterogeneity indexes. In terms of the outcome standard deviation, this corresponds to a decrease of roughly 20% for the redistribution index and 30% for the all-questions index.
F. Balance Tests
We conduct balance tests on a wide range of observables to check for other possible effects of moving across the population thresholds. In table 3, we report the most important ones for our purposes and relegate further standard validity tests of RDD on municipality and candidate characteristics to the appendix (table A2).
| | Number of Parties | Effective Number of Parties | Candidates per Seat | Respondents | Candidates |
|---|---|---|---|---|---|
| Conventional local linear RD coefficient | .256 | .008 | −.51 | .83 | 2.07 |
| 95% confidence interval with bias correction and robust inference | [−.41, 1.15] | [−.53, .58] | [−.91, −.21] | [−1.36, 2.84] | [−1.87, 5.98] |
| Observations within main bandwidth | 230 | 243 | 211 | 616 | 566 |
| MSE-optimal bandwidths (main/bias) | 778/1,188 | 825/1,187 | 705/1,102 | 1,067/1,601 | 955/1,367 |
| Clustered bandwidths and standard errors | Yes | Yes | Yes | Yes | Yes |
| Outcome mean (standard deviation) | 5.9 (1.4) | 3.5 (.9) | 2.8 (.7) | 12.7 (6.5) | 23.5 (11.3) |
| Unit of observation | Municipality-year | Municipality-year | Municipality-year | Party-year | Party-year |
First, it is important to stress that, across the cutoffs, the number of candidates and the number of parties (lists), measured either as a simple count or as an effective number of parties, are balanced (i.e., the effect is not statistically significant).30 The balance in the number of parties is particularly relevant since, at first sight, one could think that the changes in council size could also affect the number of parties. Note, however, that here we are focusing on the same PR setup across all the thresholds and, hence, the incentives known since the early work by Duverger (1954) may not be fully in place. Moreover, in the current context, most local parties are also well known at the national level. Thus, in practice, parties do not merge or split at the local level in Finland (unless the national party splits first). The main choice that affects the number of parties is whether the national party wants to enter or exit a given municipality. It is likely that the variations in the council size are not the parties’ main concern when considering this major discrete choice. Entry and exit are more likely driven, for example, by the existence of a large enough support base in the municipality. The balanced number of parties is also relevant because, as explained in appendix A1.1, our conceptual framework is valuable in analyzing multiparty settings for an exogenous number of parties.
Additionally, the number of candidates is not significantly influenced at the cutoff. This highlights that parties and voters act in similar environments across the cutoffs. This is good news for our identification; yet, one could expect jumps across the cutoffs since parties can nominate a number of candidates up to 1.5 times the council size.31 The absence of jumps most likely implies either that new candidates are not sufficiently incentivized to run for office by the increase in the likelihood of getting elected due to the larger number of seats and, hence, are not very strategic (see also sec. III.G), or that parties play an important role in curating their lists more efficiently in the face of increased candidate supply, thus facilitating more cohesion without any need to vary the number of candidates. Note that while the supply of candidates is certainly influenced by the municipal population, our research design controls for this by estimating the effect at the cutoff, with municipalities being very similar across the thresholds both in population size and other respects (table A2).
Finally, we show that the response rate of the candidates on the YLE application is balanced by also reporting the balance test for the number of respondents. Here, in correspondence with our sample from the main analysis, we also omit all candidates from parties with fewer than five respondents. Given that using the application is voluntary, one could be concerned about a possible selection bias. The balanced number of respondents (and, given the balanced number of all candidates, the balanced response rate), however, indicates that a possible selection bias resulting from the response rate is unlikely to be present in the RDD estimates. As the possible selection bias seems to be the same across the cutoffs, it is thus differenced out from the RDD estimates. However, it could still be that there are differences in how the candidates select into the survey across the cutoff. To address this, we show the balance of candidate characteristics in the appendix (table A2).
Of course, we face the standard caveat of balance tests that the results may be statistically imprecise. In our case, we cannot rule out small effects of the council size on these outcomes. To further evaluate our cohesion results, we show in the appendix (table A3) that the disproportionality and the cohesion indexes move significantly at the same cutoffs, whereas the alternative variables do so to a lesser extent.
G. Strategic Candidates
Our posited mechanism to explain our empirical results requires that parties be strategic and respond to the changes in electoral incentives (council size), while candidates be sincere. Yet, one cannot exclude an alternative mechanism, one that entails strategic candidates optimally repositioning in response to the change in the institutional setting. As the number of candidates remains about constant across the threshold, the number of available seats per candidate varies (see table 3). Therefore, one could argue that as the number of votes necessary to win a seat goes down, and thus intraparty competition becomes less intense, candidates face weaker incentives to diversify and distance themselves from each other. In other words, it is theoretically possible that the observed effects arise from candidates’ rather than parties’ strategic responses to changes in council size. While from a policy perspective it may not be relevant which agents are driving the link between council size and cohesion, a candidate-driven mechanism is different from the one our model proposes, and this should be acknowledged.
Here, we present empirical evidence showing that the candidates, at least in the Finnish context, do not seem to act strategically in how they position. That is, while the alternative candidate-centered mechanism is theoretically possible, and may well be important in other institutional setups, it is less likely to be the one driving our empirical results. Table A4 provides evidence in this direction by showing that candidates do not change their policy positions reported in the VAA application when their electoral incentives change. To achieve this, we analyze within-candidate changes between 2008 and 2012. We focus on those VAA questions that remain the same across the years when the candidates’ institutional environment changes. First, we analyze changes in council size, which may happen because of changes in municipal population, municipal mergers, or candidate mobility. Whichever the reason, all should affect candidates’ incentives to respond strategically. Second, we analyze party switching. Out of 8,550 candidates in total who rerun and respond in both survey years, 5.3% switch their party. While candidates overall change their responses somewhat over time (“constant” in the table), candidates do not respond to changes in their contextual environment.32 Only one out of 12 estimates of interest in table A4 is statistically significant and the magnitudes of the point estimates in relation to the constants are small. Hence, table A4 does not support the presence of strategic repositioning in our context. This result is actually in line with past studies of Finnish local politics: Savolainen (2020) has documented with a causal design that candidates do not change their VAA policy responses if they get elected. Moreover, nonstrategic VAA responses seem natural in a setting where candidates have low office-seeking motives (see sec. III.A) while their election does affect policy. As Hyytinen et al. (2018a) and Meriläinen (2022) show, there are substantial policy effects related to (quasi-randomly) electing individual candidates with certain characteristics.
At any rate, the two approaches are not mutually exclusive or in contradiction. Indeed, as long as parties are strategic actors, they can select candidates in anticipation of the candidates’ strategic positioning at any given institutional setting. Thus, it is likely that in various settings both mechanisms could operate in tandem.
H. Robustness, Validity, and Discussion
In appendix A2, we report and discuss the standard validity and robustness checks in detail. The McCrary (2008) test for manipulation shows no evidence of municipalities manipulating their population count at any individual cutoff, or in the pooled data (fig. A9). This makes perfect sense because, since population counts are not self-reported by the municipalities, there are no incentives to manipulate this information. No other policies or municipality responsibilities change at these cutoffs. We also report that the main results are robust across a fair range of bandwidths around the optimal ones (fig. A10).
We report the placebo cutoff analysis in figure A11 (app. A2). This analysis is especially useful for understanding whether the applied RDD specification is appropriate (Hyytinen et al. 2018b). Moreover, it shows that we should trust the clustered results much more than the nonclustered ones because there is within-municipality correlation in the policy positions of the candidates. If the bandwidth calculation does not account for this clustering problem, the optimal bandwidths are too narrow in the sense that the results are derived using only a couple of clusters. This leads to the standard problem that, in small samples, any result is possible by chance even if the design is as good as random. The placebo cutoff test for the clustered specifications works as it is supposed to and the coefficients are zero at the placebo cutoffs.
Finally, in tables A7 and A8 we report 2SLS analysis where we estimate the effect of the realized distortions on intraparty cohesion while using the RDD cutoffs as instrumental variables. All coefficients across all specifications have the expected sign in the second stage (and the first stage and the OLS); that is, as the distortions increase party heterogeneity increases. However, the second stage results are not statistically significant given the low power of 2SLS in general and our weak first stage for 2SLS purposes. Importantly, since party lists are formed before the actual electoral distortion is realized, actual distortions are de facto not a determinant of party list cohesion. Thus, a 2SLS model (i.e., the council size affects the distortions at a given election, and these distortions affect party list cohesion) is arguably not the most suitable estimation approach in our setting.
Notes
We are grateful to James Adams, Manuel Bagues, Peter Buisseret, Alessandra Casella, John Duggan, Olle Folke, Alexander Fouirnaies, Anthony Fowler, Bernard Grofman, Tasos Kalandrakis, David Kang, Eva Mörk, Tuomas Pekkarinen, Carlo Prato, Johanna Rickne, Jim Snyder, Stephane Wolton, and the editor and anonymous reviewers for their constructive comments and suggestions. For valuable feedback we thank participants in the Conference on Research on Economic Theory and Econometrics and conferences of the American Political Science Association, European Economic Association, European Political Science Association, Midwest Political Science Association, Workshop on Political Economy and Political Science, and Società Italiana di Economia Pubblica, as well as seminar audiences at Bocconi, Chicago Harris, Columbia, Harvard, Helsinki Center of Economic Research, King’s College London, Lancaster, Leicester, London School of Economics, Padova, VATT, and Zurich. This research is funded by the European Union (Tukiainen, European Research Council, INTRAPOL, grant 101045239). Views and opinions expressed are however those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. This paper was edited by Anna Dreber Almenberg.
1 Our results would not be qualitatively distinct from those of Matakos, Troumpounis, and Xefteris (2016) if we entirely removed the personal vote element. In the appendix we provide extensions of our main setting guaranteeing robustness of our results when voters also care about the lists’ mean ideology or compare the expected utility obtained by the two lists.
2 The focus of our paper is on intraparty heterogeneity at the candidate level. Nevertheless, our main result transcends the stage of candidate selection (and list composition) and rises to the ideological composition of the elected council (see table A6).
3 The literature on electoral institutions is vast. For some representative examples among many others, we refer the reader to studies on the effect of electoral institutions on polarization (Cox 1990), turnout (Blais and Carty 1990; Herrera, Morelli, and Palfrey 2014), campaign spending (Iaryczower and Mattozzi 2013), and corruption, redistribution, public spending, and the provision of public goods (Lizzeri and Persico 2001; Milesi-Ferretti, Perotti, and Rostagno 2002; Persson, Tabellini, and Trebbi 2003; Persson, Roland, and Tabellini 2007). For more references see therein as well as Taagepera and Shugart (1989), Lijphart (1995, 1999), Persson and Tabellini (2002, 2005), and Grofman (2008).
4 Even in the class of proportional representation rules, the exact allocation method decisively affects the seat allocation. For instance, while the D’Hondt method present in our empirical setting “tends to increase the advantage for the electoral lists which gain most votes to the detriment of those with fewer votes” (Kotanidis 2019, 1), other methods generate less disproportional results (e.g., Hare-Niemeyer and Sainte-Laguë methods; see Schuster et al. 2003). Additionally, further explicit distortions in favor of the election winner, aimed at strong governments, can be introduced through the presence of a minimum vote threshold to obtain representation, a bonus to the winner of the election, or others (Herron, Pekkanen, and Shugart 2018).
5 A long-standing literature on personal vote has found that many voters condition their electoral choices on individual candidates’ characteristics, attributes, and promises (proposals) in a variety of countries and institutional contexts (for a summary see seminal contributions such as King 1991; Carey and Shugart 1995; Cain, Ferejohn, and Fiorina 2013). This, in turn, generates strong incentives for candidates to seek and cultivate their “personal” vote. Such incentives are further intensified in open-list multimember district PR systems, such as in Finland (see, e.g., Carey and Shugart 1995). Moreover, in Finnish municipal councils there is no ruling coalition and the role of parties as legislative teams is relatively weak, with several voters liberated to adopt a candidate-oriented perspective in deciding how to vote. For example, in the 2017 Finnish Municipal Election Survey a majority of the respondents stated that candidates were more important than parties in determining their vote (Borg 2018). Personal vote is also important in Finnish parliamentary elections with 42%–51% of the voters voting based on the candidate instead of party in surveys covering five parliamentary elections up to 2011 (see, e.g., Borg 2012, table 18.2).
6 The party-centered process of candidate selection is highly relevant for the Finnish context (even at the local level). While it is possible to run as a candidate independently of parties if the aspiring candidate can collect 10 signatures from local eligible voters in support of the candidacy, this is very rare in practice: In the 2012 municipal elections only 2.2% of candidates were independent.
7 In our benchmark model parties propose non-overlapping intervals. One could relax this assumption without affecting the direction of our results. Consider party L and party R proposing intervals that overlap, so that the right endpoint of party L’s interval lies to the right of the left endpoint of party R’s interval. All the voters located in the overlap are then indifferent between the two parties and, hence, randomize their vote. As party L extends the right endpoint of its interval, its vote share still increases but at a slower pace than in our model. Thus, the incentives to extend the list and to gain votes as the disproportionality increases are still present.
8 Here we assume that elected ideologies are uniformly distributed in the parliament despite the vote distribution being nonuniform. Considering the continuous list as an approximation of a discrete list of equidistant candidates, this assumption requires that once the party elects more than two candidates, the two extreme and hence most voted candidates be definitely elected with the remaining seats distributed among the interior candidates with equal probability. For completeness of our argument, in app. A1 we show that our argument holds for other distributions.
9 In app. A1.1 we provide further theoretical details on the link between the electoral rule disproportionality and council size under the D’Hondt method present in our empirical setting.
10 The council sizes by municipal population are as follows: population of less than or equal to 2,000 (council size of 13, 15, or 17), 2,001–4,000 (21), 4,001–8,000 (27), 8,001–15,000 (35), 15,001–30,000 (43), 30,001–60,000 (51), 60,001–120,000 (59), 120,001–250,000 (67), 250,001–400,000 (75), and over 400,000 (85).
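The schedule above can be written as a simple lookup. This is a minimal sketch: the function and variable names are hypothetical, and because the footnote lists three admissible sizes for the smallest bracket, the function returns a tuple of admissible sizes rather than a single number.

```python
import bisect

# Upper population bound of each bracket, transcribed from the footnote.
# Municipalities above the last bound fall in the final (open-ended) bracket.
UPPER_BOUNDS = [2_000, 4_000, 8_000, 15_000, 30_000,
                60_000, 120_000, 250_000, 400_000]

# Admissible council sizes per bracket; the smallest bracket allows three sizes.
COUNCIL_SIZES = [(13, 15, 17), (21,), (27,), (35,), (43,),
                 (51,), (59,), (67,), (75,), (85,)]


def admissible_council_sizes(population):
    """Return the tuple of council sizes allowed for a given population."""
    if population < 0:
        raise ValueError("population must be non-negative")
    # bisect_left keeps a population exactly at a bound in the lower bracket,
    # matching the "less than or equal to" wording of the schedule.
    bracket = bisect.bisect_left(UPPER_BOUNDS, population)
    return COUNCIL_SIZES[bracket]


for pop in (1_500, 2_000, 2_001, 8_000, 8_001, 400_001):
    print(pop, admissible_council_sizes(pop))
```

The jumps in council size at each population bound are the cutoffs exploited in the regression discontinuity design.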
11 Open lists are frequently encountered in national and subnational elections around the world using a PR system. For instance, at least 40 countries, including many western European democracies, use an open-list PR system when electing the single or lower chamber of their national parliament (Wall 2021).
12 Parties can form pre-electoral coalitions and propose a joint list of candidates. The allocation of seats takes place at the coalition list level.
13 Finland was one of the first countries to introduce voting aid applications. They have gained popularity, with surveys indicating that approximately 40% of the Finnish electorate used an application prior to the 2007 parliamentary election, with 15% of the users claiming that they had no favorite candidate and followed the application’s recommendation (see Wagner and Ruusuvirta 2012 and references therein).
14 Data from the 2017 election are also available, but the council size rule was no longer binding in that election.
15 Figure A12 reports the histograms of the share of respondents by municipality and by party.
16 Regression discontinuity at population thresholds is a common approach to isolate causal effects. See, e.g., Ferraz and Finan (2009), Egger and Koethenbuerger (2010), Fujiwara (2011), Gagliarducci, Nannicini, and Naticchioni (2011), Pettersson-Lidbom (2012), Brollo et al. (2013), Gagliarducci and Nannicini (2013), Eggers (2015), and Bordignon, Nannicini, and Tabellini (2016). For a recent literature review and possible issues with the use of RDD at population thresholds see Eggers et al. (2018). We carefully address the concerns they raise. Similarly to us, Sanz (2017) and Lyytikäinen and Tukiainen (2019) use population thresholds to study political consequences of electoral systems.
17 In the reported results, the bandwidth is optimized after pooling the data. However, the results are robust both to optimizing at each cutoff before pooling and to controlling for the cutoff fixed effects (not reported).
18 However, the main results remain statistically significant in the nonclustered specifications even if we include the merged units and the larger municipalities, but as expected, the point estimates are closer to zero (table A5). The results are also robust to limiting the sample further to the one to three most densely populated cutoffs.
19 As an illustration, consider elections using the linear rule explained in our conceptual framework (see fig. 1B), with a slope of 3, and two scenarios: one in which the winner gets 55% of the vote and another in which the winner gets 65%. With the same rule, the 55% vote results in a 65% seat share, while the 65% vote results in a 95% seat share. That is, while the disproportionality of the rule in both scenarios is the same, the realized distortions vary.
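The arithmetic in this illustration can be checked with a short sketch. It assumes that the linear rule maps the winner's vote share v to the seat share 1/2 + d(v − 1/2), truncated to the unit interval, and that d = 3; both the functional form and the value of d are inferred from the two numerical scenarios in this footnote rather than quoted from the conceptual framework, and the function name is hypothetical.

```python
def linear_seat_share(vote_share, slope):
    """Seat share of the winner under an assumed linear rule.

    The rule is taken to pass through (0.5, 0.5) with the given slope and
    to be truncated to [0, 1]; this form is an assumption for illustration.
    """
    raw = 0.5 + slope * (vote_share - 0.5)
    return min(1.0, max(0.0, raw))


SLOPE = 3  # assumed: the value implied by the 55% -> 65% and 65% -> 95% cases

for vote_share in (0.55, 0.65):
    seat_share = linear_seat_share(vote_share, SLOPE)
    distortion = seat_share - vote_share  # realized advantage of the winner
    print(f"vote {vote_share:.2f} -> seat {seat_share:.2f}, "
          f"realized distortion {distortion:.2f}")
```

Under these assumptions the same rule yields a realized advantage of 10 percentage points in the first scenario and 30 percentage points in the second, which is why we distinguish the disproportionality of the rule from the realized distortions.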
20 According to our conceptual framework, the disproportionality of a D’Hondt rule that elects a k-member council is decreasing in k (see the relevant subsection in app. A1.1).
21 If parties have formed a pre-election coalition (roughly 15% of the lists in our sample) and, thus, run as a single joint list in the election, we define the list as a single party when calculating the distortions. This is to reflect the actual vote share to seat share mapping. When analyzing party cohesion, the RDD analysis on party cohesion could as well be conducted at the party level. While for consistency we report in the paper our cohesion analysis only at the coalition level, the results are similar at the party level.
22 The modified Gallagher index builds on the simpler Gallagher index (Gallagher 1991), possibly the most standard measure of distortions, defined as $\sqrt{\tfrac{1}{2}\sum_i (v_i - s_i)^2}$, where $v_i$ and $s_i$ denote the vote share and the seat share of party $i$. Koppel and Diskin (2009) formalized the concerns raised by Taagepera and Grofman (2003) about the Gallagher index, showing that the modified version of the Gallagher index satisfies relevant properties that the Gallagher index does not (e.g., Dalton’s principle of transfers, scale invariance, orthogonality).
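For reference, the standard (unmodified) Gallagher least-squares index can be computed as in the sketch below. The function name and the two-party shares are hypothetical, and the code does not implement the modified index of Koppel and Diskin (2009).

```python
from math import sqrt


def gallagher_index(vote_shares, seat_shares):
    """Standard Gallagher (1991) least-squares index of disproportionality.

    Both arguments are sequences of shares (fractions summing to one),
    listed in the same party order.
    """
    if len(vote_shares) != len(seat_shares):
        raise ValueError("vote and seat shares must cover the same parties")
    squared_gaps = sum((v - s) ** 2 for v, s in zip(vote_shares, seat_shares))
    return sqrt(0.5 * squared_gaps)


# Hypothetical two-party example: the winner turns 55% of votes into 65% of seats.
print(gallagher_index([0.55, 0.45], [0.65, 0.35]))
# A perfectly proportional outcome gives an index of zero.
print(gallagher_index([0.55, 0.45], [0.55, 0.45]))
```

In the two-party case the index reduces to the absolute gap between the winner's seat share and vote share.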
23 We show that the results are similar when using a linear fit both for the whole sample and when limiting the sample within the MSE-optimal bandwidth in figs. A5 and A6.
24 There are obviously many other ways one could calculate similar indexes. We have the luxury of using this simple and transparent metric as our interest is only in the static relative position of a candidate in relation to their party. Moreover, we cannot use, e.g., Mahalanobis or similar standardized distance measures at the within-party level because, by construction, they would force all the lists to be about equally cohesive.
25 We also implement a correction by multiplying the distance by , where n is the number of respondents. This correction is similar to one used in computing sample variance.
26 The probability that parties have at least five respondents does not change at the cutoff. The RDD effect (MSE-optimal point estimate) on an indicator for “party has at least five candidates” is 0.02. This is small in magnitude and not statistically significant (robust and bias-corrected 95% CI is [−0.096, 0.138]).
27 Note that, because we impose this minimum of five responses, we have 30 fewer observations at the municipal level than in the previous disproportionality analysis (sec. III.D).
28 We show that the results are similar when using a linear fit both for the whole sample and when limiting the sample within the MSE-optimal bandwidth in figs. A5 and A6.
29 In table A1, we show that these results are robust to using a more flexible specification in which we include cutoff-specific fixed effects, allow different linear trends in the running variable around each cutoff, and optimize the bandwidths for each cutoff separately before pooling the data.
30 As is common in the literature, we compute the effective number of parties (lists) by the inverted Herfindahl index of vote shares of party lists.
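The inverted Herfindahl index is the reciprocal of the sum of squared vote shares. A minimal sketch follows, with a hypothetical function name and hypothetical vote totals; the inputs are normalized so that raw vote counts can be passed directly.

```python
def effective_number_of_parties(votes):
    """Inverse Herfindahl index of the vote shares of party lists.

    votes -- sequence of vote counts or vote shares, one entry per list
    """
    total = sum(votes)
    if total <= 0:
        raise ValueError("votes must sum to a positive number")
    shares = [v / total for v in votes]
    # Herfindahl index: sum of squared shares; its inverse is the effective number.
    herfindahl = sum(s * s for s in shares)
    return 1.0 / herfindahl


# Hypothetical vote totals for four lists.
print(effective_number_of_parties([4800, 2900, 1500, 800]))
# Equal-sized lists recover the raw count of lists.
print(effective_number_of_parties([1, 1, 1, 1]))
```

When all lists are equally large the measure equals the number of lists, and it falls toward one as votes concentrate on a single list.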
31 However, this does not occur and lists usually include fewer candidates than the maximum permitted. Only 3.4% of lists are full in our sample. Typically only larger parties in larger municipalities do fill the lists completely.
32 We study absolute changes so that possible changes in opposite directions do not cancel each other out in the estimation.
33 For indicative evidence refer to the cases of Italy regarding welfare state reforms (Ceron, Curini, and Negri 2019) and Germany during the 1980s and the low internal cohesion of the Christian democrats (Zohlnhöfer 2003).
