Author Archives: Michael Alvarez

Seo-young Silvia Kim: The Benefits of In-Person Election Observation

Guest Blog by Seo-young Silvia Kim

Silvia Kim is a PhD candidate at Caltech, currently finishing her dissertation research on American Politics and Political Methodology. Silvia has been a key collaborator on the Monitoring the Election project. She’ll be starting her new position as an Assistant Professor in the Department of Government at American University in August 2020.

On Super Tuesday, I drove more than 170 miles alone in my tattered old car, zigzagging through both Los Angeles and Orange County for election observations, visiting nine vote centers from noon to 10pm. Usually our team policy is to go out in pairs, but this year I was determined to roam the new vote centers far and wide all day, so I volunteered to go alone.

I am a quantitative data analyst; that is to say, I revel in gathering and analyzing numbers. Qualitative research, which focuses on unquantifiable, non-numerical data, is usually neither my turf nor my main interest. Yet ever since I jumped into the world of elections and election administration, I have been observing elections on every primary and general Election Day. And I would never underestimate the importance of in-person observations in research.

The benefits of in-person observation are numerous. One gets to observe the election take place out in the open, with the street-level bureaucrats and voters in their natural “habitat.” Once I arrive, I observe the exterior of the location, ask permission from the center lead to observe, stand still in a corner so that I do not get in the way of voters, and then observe for 10 to 30 minutes. When there is not much traffic at the center and there are no notable troubles, ten minutes could be enough. When there are long lines and apparent trouble at the location, sometimes even 30 minutes is not enough.

If time and place allow, I may also be able to chat with various vote center staff. This did not happen as much as in 2016 or 2018, as there was high turnout and the staff were busy. But during the early voting period, or in locations with fewer voters, or when a staff member is biting into cold pizza slices, I get to ask questions about what is going on at the voting location. Are all check-in devices working properly? Did they receive all necessary equipment on time? Were communications with the Registrar smooth and readily available? Were there any particular spikes in provisional voting or voter information edits, and if so, why? In most cases, they are happy to provide answers, as I consolidate these into recommendations for the Registrar, as in the Los Angeles Vote Center Observation Report.

While these anecdotes may not necessarily be generalizable, they provide important intuition as to what to look for when numerical data actually arrives. For instance, I personally observed thousands of students milling in a line to vote at a vote center in Los Angeles County. Did the same happen in Orange County, where I did not get to see any college locations? When I analyzed the wait time data, as reported in our Orange County Vote Center Observation and Wait Time Report, I did indeed see long wait times in the data for UCI, CSUF, and Chapman University. Based on the intuition built from my observations, I can look for common patterns in the data more quickly. In other words, the qualitative analysis that I undertake provides direction and guidance.


With COVID-19, administrators all around the United States are scrambling to prepare for the voting experience in the midst of a pandemic. It may not be possible to “observe” the election as I usually have. It will still be beneficial if alternatives can be implemented, for example, researchers talking by phone with staff who have worked at the in-person voting locations. If not, we still hope that the intuition that we have gathered for Los Angeles and Orange County will improve the voting experience of Southern California’s electorate in future elections.

The Blue Shift in California Elections

Guest Blog by Michelle Hyun

Michelle is an undergraduate student at Caltech. In the summer of 2019, she was a Summer Undergraduate Research Fellow (SURF) at Caltech. Her research was conducted in collaboration with Yimeng Li and R. Michael Alvarez. We have recently released a working paper, now available online: “Why Do Election Results Change After Election Day? The Blue Shift in California Elections.”

With the presidential primaries ongoing, and the November 2020 general election looming, it is critical to explain the trends of voters and the integrity of the election. In many past elections, the occurrence of the electoral “Blue Shift,” in which vote margins are observed to shift towards favoring Democratic candidates, has provided a surge of votes in the later parts of the vote count that has caused changes in leads more often than expected. This shift can cause people to call into question the integrity of the election system, which is dangerous for the legitimacy of democratic elections and the participation of voters in their government.
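
As a rough illustration of the definition above, the shift can be expressed as the change in the Democratic margin between the election-night count and the final certified count. The sketch below is only one way to write this down: the two-party margin formula, the function names, and the vote counts are all assumptions made for illustration, not the measure or the data used in the working paper.

```python
def dem_margin(dem_votes: int, rep_votes: int) -> float:
    """Democratic margin, in percentage points of the two-party vote."""
    total = dem_votes + rep_votes
    if total == 0:
        raise ValueError("no two-party votes recorded")
    return 100.0 * (dem_votes - rep_votes) / total


def blue_shift(night: tuple[int, int], final: tuple[int, int]) -> float:
    """Change in Democratic margin from election night to the final count.

    Each argument is a (Democratic votes, Republican votes) pair.
    A positive value means the margin moved toward the Democrat.
    """
    return dem_margin(*final) - dem_margin(*night)


# Hypothetical counts for a single contest, not real returns.
election_night = (48_000, 52_000)   # Democrat trails on election night
final_count = (101_000, 99_000)     # Democrat leads once all ballots are counted

shift = blue_shift(election_night, final_count)
print(f"Blue shift: {shift:+.1f} percentage points")
```

In this made-up contest the lead changes hands even though no ballot was altered: the later-counted ballots simply lean differently from the early ones.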

The blue shift has been observed in several past elections. One such example was the 39th District’s 2018 election for the U.S. House of Representatives: Young Kim believed she had won the race, but after a few weeks, it was revealed that her opponent, Gil Cisneros, had actually won. Elections in Orange County, California in 2012, 2014, 2016, and 2018 for House of Representatives seats, gubernatorial seats, and presidential seats were analyzed to seek the cause of the drift. Shifts occurred across almost all of these elections, and while they may not have been enough to change the results of all the elections, the drift was certainly enough to raise questions.

Technological advances have made ballot counting and transmission of election results faster than ever. In many states including California, soon after polls close, election officials release results from early-voting ballots and mail ballots that have been processed before Election Day, followed by regular ballots cast on Election Day as precincts report them. Major cable networks, radio stations, and other media organizations receive these results from The Associated Press correspondents stationed at local government offices and data feeds provided by local governments as soon as they become available and make projections on most races. As a result, for voters following Election Day coverage on TV, radio, the Internet, or through morning newspapers, it may appear that elections are mostly over except for a few close contests by the end of Election Night. This perception masks the reality that a significant fraction of ballots is counted after Election Day, especially in states like California where voting by mail and provisional ballots are common.

This paper shows that the demographics of the voters and the number of ballots that are counted later in the election process are directly related to the magnitude of the blue shift. Using data from the Orange County Registrar of Voters (OCROV), we were able to find a positive association between Democratic voters and blue shifts; additionally, using data from the Cooperative Congressional Election Study (CCES) and the Survey of the Performance of American Elections (SPAE), we were able to find that young, non-white voters are more likely to cast ballots that are counted later in the vote count process. Our findings are significant in that they explain a phenomenon that may call into question the integrity of our voting system. As the presidential primaries continue and as the general election approaches, we seek to establish voter confidence to encourage voter participation and ensure a smooth transition of power.

Guest Blog by Andrew Sinclair: California’s Top-Two Primary

California’s Top-Two Primary, by Andrew Sinclair

After 2020’s “Super Tuesday,” many of the news headlines in California (and the rest of the nation) will focus on the outcome of the Democratic Party’s presidential nomination contests. Yet, there are other elections taking place in California at the same time, continuing California’s experiment with the “top-two primary.”

There are actually three types of elections taking place in California. The presidential contest is a traditional partisan primary. Unaffiliated (“no party preference”) voters can participate in the Democratic Party primary if they request a Democratic ballot. Still, the election is partisan in nature: it helps determine which candidate will be the party’s nominee for the November general election. The other two kinds of elections are both variants of nonpartisan elections.

California uses the nonpartisan top-two election procedure for “voter-nominated offices.” This year, these are the State Assembly, State Senate, and U.S. House elections (there are no statewide office or U.S. Senate elections this year). For these elections, every voter can choose between all of the candidates. The two candidates with the most votes advance to the general election in November, even if they come from the same party. Candidates for these offices list their party preference on the ballot. For a short explainer on how this works, Christian Grose (USC) has a great 4-minute description in an interview with NPR: here.

California also holds elections for “nonpartisan offices.” This can be a bit confusing, since the top-two for the voter-nominated offices also is a kind of nonpartisan election – but for these elections (county supervisor, etc.), the candidates also do not list their own party preference on the ballot and, if one candidate gets more than 50% in the primary, that person is simply elected. Otherwise, these are pretty similar to the voter-nominated top-two elections.
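
The two nonpartisan procedures described above can be sketched in a few lines. This is a simplified illustration, not a statement of the election code: the `Candidate` type, the function names, and the vote totals are hypothetical, and ties, write-ins, and special cases are ignored.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    party: str   # party preference as listed on the ballot, if any
    votes: int


def top_two(candidates: list[Candidate]) -> list[Candidate]:
    """Voter-nominated offices: the two highest vote-getters advance,
    regardless of party preference."""
    ranked = sorted(candidates, key=lambda c: c.votes, reverse=True)
    return ranked[:2]


def nonpartisan_office(candidates: list[Candidate]) -> list[Candidate]:
    """Nonpartisan offices: a candidate with more than 50% of the primary
    vote is elected outright; otherwise the top two advance."""
    total = sum(c.votes for c in candidates)
    ranked = sorted(candidates, key=lambda c: c.votes, reverse=True)
    if total > 0 and ranked[0].votes * 2 > total:
        return [ranked[0]]
    return ranked[:2]


# Hypothetical primary field, not real returns.
field = [
    Candidate("Avery", "Democratic", 40_000),
    Candidate("Blake", "Democratic", 35_000),
    Candidate("Casey", "Republican", 20_000),
    Candidate("Devon", "No Party Preference", 5_000),
]

advancing = top_two(field)
print([f"{c.name} ({c.party})" for c in advancing])
```

With these made-up numbers, two candidates with the same party preference advance to November, which is exactly the outcome a traditional partisan primary rules out.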

California adopted the nonpartisan top-two procedure for voter-nominated offices by passing Proposition 14 in 2010. Michael Alvarez and I wrote a book – Nonpartisan Primary Election Reform: Mitigating Mischief (Cambridge University Press, 2015) – about the first use of the top-two in California in 2012. I have been following along since then as we have learned more in each cycle about how this rule operates.

The top-two procedure is different than what we see in most states, both in the primary and general election (Ian O’Grady and I make this argument in the Routledge Handbook of Primary Elections – which is a nice resource; the other chapters are great). For political scientists, the institutional variation the top-two represents provides an interesting window into primary elections, voter behavior more generally, and the operation of political parties. This post highlights a few things to look for in the 2020 cycle here in California.

Why 2020 is unique.

This will be the fifth cycle with the rules in California, although each election year has taken place in a unique context in terms of top-of-the-ticket races and the state of national politics. The 2020 election will be the first of this era to take place in March, with the Democratic Party’s nomination far from over. With all of the previous primaries in June, California had a more limited role in the 2012 and 2016 presidential primaries. It is also the first of the top-two elections to have no statewide elections (either statewide offices as in 2014 and 2018 or a U.S. Senate election as in 2012, 2016, and 2018).

It may be the case that Republicans, without a meaningful presidential contest or statewide election, will turn out at much lower rates than Democrats. Registered Republican voters make up only 24% of the state’s electorate (as of the Feb. 18 report from the Sec. of State). Since all candidates – Republicans and Democrats alike – are in the same primary for the voter-nominated offices, low turnout for one party can potentially make it hard for that party’s candidates to make it to the November ballot.

What to look for.

How many same-party general elections will there be? In past election cycles, it has still been the case that most primaries sent one Republican and one Democrat to the general election. If the Republican vote does collapse, it is possible that we may see more Democrat-on-Democrat general elections.

Where are the same-party elections? Following up on an observation in the Alvarez-Sinclair book, I (and coauthors Ian O’Grady, Brock McIntosh, and Carrie Nordlund) examined over a couple of cycles where these same-party elections tended to take place. The punchline: the more politically lopsided the district, the more likely it is to see two candidates of the same party advance to the general election. These contests can make the general election a lot more competitive than it likely would be otherwise, too. Does that finding continue to hold?

How well do party-endorsed candidates do? One of the prevailing theories of primary elections is that parties generally do a pretty good job of organizing to back their preferred candidates (for a neat recent book on this, see Hans Hassell’s The Party’s Primary, among others).

What happens to races with crowded fields? The parties do not always manage to keep the number of candidates of their own party down. While somewhat odd results do not happen very often, they can: nearly everyone cites the 2012 Miller-Dutton CD 31 race when they want an example, in part because it illustrates a potential issue and in part because there are not many other examples. The idea is that the vote can split across candidates in such a way that the majority party could, if the cards fall exactly right, end up shut out of the general election.
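
To make the vote-splitting logic concrete, here is a small sketch with an entirely hypothetical field (these are not the 2012 CD 31 returns). The majority party wins 52% of the primary vote but spreads it over four candidates, while the minority party's 48% is split between only two.

```python
from collections import defaultdict

# Hypothetical primary results: (candidate, party, vote share in percent).
field = [
    ("D1", "DEM", 16),
    ("D2", "DEM", 14),
    ("D3", "DEM", 12),
    ("D4", "DEM", 10),
    ("R1", "REP", 25),
    ("R2", "REP", 23),
]

# Total share won by each party across all of its candidates.
party_totals = defaultdict(int)
for _, party, share in field:
    party_totals[party] += share

# Under the top-two rule, only the two highest individual shares advance.
advancing = sorted(field, key=lambda row: row[2], reverse=True)[:2]
advancing_parties = {party for _, party, _ in advancing}

print(dict(party_totals))   # party-level totals
print(advancing)            # the two candidates on the November ballot
```

Even though the first party's candidates collectively win a majority, both general election slots go to the other party, because no single candidate from the crowded side outpolls either of the two on the less crowded side.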

How do moderate candidates do? This was, of course, one of the original claims about the top-two primary (for a nice summary from Eric McGhee at PPIC: here). As Nolan McCarty wrote in his recent book on polarization, results of studies on the impact of the primary reform on polarization are “decidedly mixed.” This is part of the fun of political science – as multiple scholars use different approaches to work towards an understanding of a problem. See work from Christian Grose (here) and Eric McGhee and Boris Shor (here).

Do any incumbents look to face stiff challenges in November? Beyond the question of ideological moderation, it may also be the case that voters can use the top-two to eject candidates from office who have performed poorly (potentially democratically meaningful, even if the replacement is ideologically similar).

What happens with third party/independent candidates? These candidacies have not generally been very successful under the top-two at reaching the general election (although, under the old partisan system, they were also not very successful at winning offices).

Are voters happy with their choices? In the surveys for our book, Michael Alvarez and I found that voters were fairly uncertain about what they expected from the top-two. As they experience it: do they learn about it, and get used to it? Or discover that they don’t like it?

Particular Races to Watch.

Congressional District 8. This used to be Paul Cook’s (R) district; in announcing his retirement, he set off one of the more interesting contests in this cycle. Republican Assemblyman Jay Obernolte won the Republican Party endorsement but there are five Republicans on the ballot in total, including a former candidate for Governor, Tim Donnelly. Yet there are also three Democrats and one NPP candidate on the ballot.

Congressional District 25. This is the Katie Hill seat – and it drew six Democrats, six Republicans, and one NPP candidate. The state Republican Party did not endorse a candidate and there are several potential contenders. The Democratic Party has not issued an endorsement in this race either. To further compound the story, there is a simultaneous special election to fill the remainder of the term.

There are others that will be pretty interesting, but those two are certainly worth checking in on as the returns come in. Of course, it may be quite some time before we really know what the results are for some tightly contested races.

What’s hard to see this week.

The top-two election procedure impacts both the primary and the general election. Some of the research on the top-two points towards finding the most interesting results in the general election. In the surveys Michael Alvarez and I conducted in 2012, we found that voters mostly tried to choose ideologically proximate candidates in the primary. (See also a nice study: Ahler, Citrin, and Lenz: here). But what happens once two candidates of the same party advance to the general election?

There is interesting work now (see Badas and Stauffer 2019) on gender and nonpartisan elections as well as race and ethnicity in same-party general elections under the top-two (see Sadhwani and Mendez 2018). If party is not a cue, voters have to decide somehow; Betsy Sinclair and Michael Wray (2015) found that same-party elections corresponded with increased Google searching just before the election.

One of the more commonly mentioned consequences of same-party elections in November, also, is the possibility that “orphaned” voters will not participate (see Nagler 2015, Masket, via Vox, 2016, Fisk 2019 – Fisk with the on-point title of “No Republican, No Vote”). Yet, many still do participate, and we will have to see over the next several months how candidates in these same-party races try to appeal to these voters. It is possible to get something like what reformers intended (something like the 2012 AD5 race) if a centrist candidate holds on to a sizable enough share – if short of a majority – of the voters of their own party, and pulls in just enough of the orphaned voters as well. That’s a delicate balancing act, though, because going too far towards recruiting support from the other party can cost a candidate support within their own.

In the larger picture, I would also say that it can take a long time to assess how political institutions function. Politics is a complicated business, and many of the institutions interact. Some of the scholars referenced here have noted that California passed the top-two procedure and the citizen redistricting process around the same time, making it hard to sort out the impact of each. I also wrote, in a paper on the adoption of the simple majority requirement to pass the state budget (Proposition 25, also in 2010), that it directly impacted one of the main motivations for passing the top-two (to get more moderate legislators, to be able to pass a budget). The election rules also operate within a political context defined by the party and ideological divisions present at the time (I am particularly fond of Hans Noel’s book on the difference between these: link). With different ideological and social cleavages, the rules may also have different consequences.

About the author: Andrew Sinclair is an Assistant Professor of Government at Claremont McKenna College. He has studied and written about California’s top-two primary process, as well as the primary election procedures in other states.

California’s Super, Super Tuesday: What to Expect

On March 3, California will be one of fourteen states holding primary elections (American Samoa will have caucuses that day). California’s 494 delegates to the Democratic National Convention will be at stake on March 3, meaning that California is a very large prize for candidates still seeking the Democratic presidential nomination.

But there’s a very good chance that we will not know the winner of California’s Democratic presidential nomination primary the evening of March 3. In fact, we may not know how California’s delegates will be allocated until much later in March. This will be especially true if there’s no clear front-runner in the Democratic presidential nomination contest by March 3.

So why are we anticipating that we may not know the winner of the Democratic presidential primary in California after polls close on March 3?

California is in the midst of sweeping changes in election administration procedures and voting technologies. While some of these changes started in 2018 in some counties, they are now hitting the larger counties in the state, in particular Orange and Los Angeles Counties. Election officials throughout the state have been working in recent years to make the process of registration, getting a ballot, and returning that marked ballot, much easier and more convenient. And it’s these changes that are likely to introduce significant delays in the tabulation of ballots after the polls close on March 3, and which could well delay the determination of a winner in California’s Democratic primary for days or weeks, if the contest is close statewide.

California election officials have sent out an unprecedented number of ballots by mail. For example, in Orange County the Registrar of Voters has mailed just over 1.6 million absentee ballots to registered voters. Many of those ballots (269,690 as of February 27 in Orange County) have been returned, but the vast majority of them are still in the hands of voters. We expect that many voters will be dropping their voted absentee ballots in the mail in the coming days, or dropping them off at voting centers between now and Election Day. And if a vote-by-mail ballot is postmarked on or before Election Day, and is received by the election official no later than 3 days after March 3, it will be included in the tabulation. This means that there are likely to be a large number of these by-mail ballots that will be received on Election Day, and in the 3 days following Election Day, that will all need to be processed, validated, and included in the tabulation (mostly after March 3).
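
The timing rule for by-mail ballots can be written as a simple date check. This is a minimal sketch that assumes the first condition refers to the postmark date and that signature validation happens separately; the function name and constants are made up for illustration.

```python
from datetime import date, timedelta

ELECTION_DAY = date(2020, 3, 3)
GRACE_DAYS = 3  # days after Election Day in which a mailed ballot may still arrive


def arrives_in_time(postmarked: date, received: date) -> bool:
    """Return True if a mailed ballot meets both timing conditions:
    postmarked on or before Election Day (assumed reading of the rule),
    and received no later than GRACE_DAYS after Election Day."""
    deadline = ELECTION_DAY + timedelta(days=GRACE_DAYS)
    return postmarked <= ELECTION_DAY and received <= deadline


# Mailed on Election Day, delivered on the last allowed day.
print(arrives_in_time(date(2020, 3, 3), date(2020, 3, 6)))
# Mailed on Election Day, delivered one day too late.
print(arrives_in_time(date(2020, 3, 3), date(2020, 3, 7)))
```

Ballots that pass this check still have to be processed and validated before they are tabulated, which is why so much of the count happens after March 3.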

Californians who for some reason haven’t registered yet to vote, but who want to register now and participate in the March primary, can do so using what is called “Conditional Voter Registration” (CVR), in which they can register and vote at many locations in their county (usually the county election headquarters, a vote center, or a polling place). It’s unknown how many potentially eligible Californians may take advantage of the CVR opportunity, but it’s possible that we might see large numbers of conditional registration voters between now and March 3. Of course, many of these voters will not have their materials processed, or, if they are eligible to vote, have their ballots included in the tabulation, until after the primary on March 3. If there is a swell of interest in the primary election among currently unregistered but eligible voters, this could significantly slow down the reporting of final results after March 3.

Finally, there is also a good chance that there will be strong turnout on March 3, potentially resulting in crowded voting locations statewide, and producing a very large number of ballots, CVR applications and provisional ballots, and by-mail ballots dropped off on Election Day. If turnout is strong in the March primary, the large amount of election material that will need to be reconciled and examined after Election Day could also slow the tabulation process, and could introduce significant delays in the reporting of results.

Now that’s just on the administrative side. It also turns out that the rules governing the allocation of California’s 494 Democratic National Convention (DNC) delegates are exceptionally complex, so complex that they will require another blog post. The important issues are that most of the state’s DNC delegates are allocated proportionally based on the statewide primary results and on the primary results in each of the state’s Congressional Districts, but only those candidates receiving more than 15% of the votes cast in either case get delegates. So in order to know the delegate count from California’s Super Tuesday primary, we’ll need accurate counts of the votes cast in each Congressional District, and that could take days or even weeks.
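
To see why exact vote counts matter, here is a rough sketch of threshold-based proportional allocation for a single district. It follows this post's wording of the threshold ("more than 15%"), and it resolves fractional delegates with largest-remainder rounding, which is an assumption made for illustration; the official delegate selection rules are more detailed and may treat edge cases differently. The candidate names, vote totals, and delegate count are hypothetical.

```python
def allocate(votes: dict[str, int], delegates: int,
             threshold: float = 0.15) -> dict[str, int]:
    """Split `delegates` proportionally among candidates whose share of all
    votes cast exceeds `threshold`. Leftover delegates go to the largest
    remainders (an assumed rounding rule, not taken from the post)."""
    total = sum(votes.values())
    if total == 0:
        return {}
    qualified = {c: v for c, v in votes.items() if v / total > threshold}
    if not qualified:
        return {}
    q_total = sum(qualified.values())

    # Integer arithmetic avoids floating-point surprises in the rounding step.
    result = {c: v * delegates // q_total for c, v in qualified.items()}
    remainders = {c: v * delegates % q_total for c, v in qualified.items()}

    leftover = delegates - sum(result.values())
    for c in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        result[c] += 1
    return result


# Hypothetical district with 6 delegates, not real returns.
district_votes = {"A": 40_000, "B": 30_000, "C": 20_000, "D": 10_000}
allocation = allocate(district_votes, delegates=6)
print(allocation)
```

Candidate D falls below the threshold and receives nothing, and the remaining shares are recomputed among A, B, and C only. A candidate hovering near 15% in a district can therefore gain or lose delegates as late ballots are added, which is why the final delegate count depends on the complete tabulation.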

There’s a good chance that we may not know the final delegate count until a few weeks after the primary. So patience is in order: the process will take time, and let’s give our election officials the opportunity to do their jobs and to produce an accurate tabulation of the results of California’s Super Tuesday March primary.

Twitter Monitoring for the 2020 Super Tuesday Primaries

We’ve launched our Twitter election monitors for the 2020 Super Tuesday primaries. The visualizations are now being posted at Monitoring the Election, or you can see them on GitHub. These data are being collected using a similar process to the one we tested and deployed in 2018. And if you are interested in seeing the code from 2018, here’s a link for that GitHub repo.

The major improvement since 2018 is that we’ve rebuilt the code base, and it now runs in the cloud. That should make our collection stream more reliable, and will allow us to scale the collection process to cover additional keywords and hashtags when necessary. These improvements are the subject of a technical paper that is now under development and that we hope to release soon.

We also continue to work on the geo-classification of the data we are collecting, and have a few improvements in our process that we’ll roll out soon. These improvements should allow us to monitor social media discussion about the Super Tuesday primaries in Southern California.

The team working on this project includes Nicholas Adams-Cohen (now a post-doc at Stanford University) and Jian Cao (a post-doc here at Caltech).

The Iowa Democratic Caucus: Why Elections Need To Be Fully and Independently Audited

It’s Friday morning, and by this time I think that everyone who follows American elections thought that we’d have some clear sense of the outcome of the Iowa Democratic Caucus.

Instead, we have headlines like this, from the New York Times, “Iowa Caucus Results Riddled With Errors and Inconsistencies.” While it’s not necessarily surprising that there are errors and inconsistencies in the current tabulation reports from the Iowa Democratic caucuses, the issue is that we may never get a clear, trustworthy, and accurate tabulation of the caucus results.

It’s helpful that the caucuses produced tabulation results on paper — and these paper tabulation records can be examined, and these records can form the basis for recounting and even auditing the caucus results. But it doesn’t seem that there ever was any intention for anyone to try to audit or validate the results of the caucus. And I keep scratching my head, wondering why, given how close and competitive the Democratic presidential selection contest has been, it doesn’t appear that anyone considered building a process to audit and validate the caucus results in near-real time.

For example, in our Monitoring The Election project, we pilot tested independent and near real-time quantitative auditing of a number of aspects of the election process in Orange County (CA) in 2018. We are now just starting to do that same type of auditing in both Orange County and Los Angeles County for the March Super Tuesday primary (we’ll start releasing some of our auditing reports very soon). A similar process could have been used in the Iowa caucuses.

What would it involve? Quite simply, the Iowa Democratic Party could work out a data provision plan with an independent auditing group (say the Caltech/MIT Voting Technology Project and/or university and college teams in Iowa). They could securely provide encrypted images of the tabulation reports from the caucus sites, and the independent auditing team would then produce auditing reports for each round of tabulation. These reports, like those that we currently produce as part of our project, would of course be provided to the appropriate officials and then posted to a public website. As rounds of tabulation proceed, this process could continue, until the final tabulation is complete, at which time the independent auditing group could provide their evaluation of the final reported tabulation.

This could have been done earlier this week, and had such a system been in place, it might have helped provide an independent perspective on the problems with the initial tabulations on Monday night, and quite likely could have alleviated a lot of the rumors and misinformation about why the tabulation was proceeding so slowly and why the results were riddled with errors and inconsistencies. By announcing, in advance of the caucuses, a plan for independent auditing of the tabulation results by a trustworthy third party, the Iowa Democratic Party could have relied on the auditing process to help them figure out the issues in the tabulation, and perhaps to help buttress confidence in the accuracy of the reported results.

At this point in time, while the data is being released, it’s unfortunate that there wasn’t an independent auditing process established before the crisis hit.

In my opinion, one of the most important lessons from this experience this week is that election processes need to be fully and independently audited. Whether those audits are conducted by academic researchers, or by other third parties, they need to be a regular component of the administration of any public election process (caucuses, primaries, special elections, and general elections). I think that election officials throughout the United States can learn a lesson about the importance of independent election performance auditing from the chaos of the Iowa Democratic caucuses.

The Iowa Caucus: A Frustrating Start to Election 2020

Like most observers of elections, I got my bowl of popcorn and turned on the TV last night, expecting to learn more about who “won” the Democratic caucuses in Iowa. I enjoyed the popcorn, but got a bit bored watching the pundits speculating endlessly about why they didn’t have immediate results from the caucuses last night.

While like everyone else, I’d like to learn more about who “won” the Democratic caucuses in Iowa, I’d also like to make sure that when the officials there announce the results, they provide the most accurate results they can — and they provide a detailed explanation for why there has been such a delay in reporting the results.

As my colleagues on the Caltech/MIT Voting Technology Project and I have said for nearly two decades now, democracy can be a messy business. Elections (including primaries and caucuses) are complex to administer, they inevitably involve new and old technology, and with hundreds of thousands of people participating they take time to get right. I suggest that we all take a deep breath, let the Iowa Democratic Party figure it all out, and be patient. It’s much better for American democracy, and for the confidence of voters and stakeholders, if we get accurate results and an explanation for the delay, rather than hurried and incorrect results.

And for the rest of this election cycle, I suggest continued patience. As we move further into the primary season, and then into the fall general election, issues like what we are now witnessing in Iowa will continue to arise. It’s likely that on Super Tuesday we might not know who “won” California immediately after the polls close that evening, for example. But we should let election officials have the time and space to get the results right, and to be transparent and open with the public about why delays or issues arise in the administration and tabulation of elections.

The 2018 Voting Experience

My fellow VTP Co-director, Charles Stewart III, and some of his research team released an important study last week: “The 2018 Voting Experience: Polling Place Lines.” Charles and his team continue to find that long lines can be an issue in many states and election jurisdictions. They estimated that in 2018, 5.7% of those who tried to vote on Election Day waited more than 30 minutes to vote, and that this was significantly longer than what they had found in the previous federal midterm election in 2014. Importantly, they also show that wait times are not distributed uniformly across the electorate, with nonwhite voters and voters in densely populated areas waiting longer to vote than white voters and voters in less densely populated areas. Finally, because, as they note, wait times are strongly correlated with a voter’s overall experience at the polls, long wait times are an issue that needs continued attention in the United States. This is especially true as we are heading into what may be a very closely contested array of state and federal primary and general elections in 2020, where many states and jurisdictions may see much higher turnout than in 2016 and 2018.

Seo-young Silvia Kim’s research on the costs of moving on turnout

Seo-young Silvia Kim, one of our PhD students at Caltech working on our Monitoring the Election project, has recently posted a really interesting working paper online, “Getting Settled in Your New Home: The Costs of Moving on Voter Turnout.” Silvia has recently presented this paper at a couple of conferences and in research seminars at a number of universities.

What is the dynamic impact of moving on turnout? Moving depresses turnout by imposing various costs on voters. However, movers eventually settle down, and such detrimental effects can disappear over time. I measure these dynamics using United States Postal Services (USPS) data and detailed voter panel data from Orange County, California. Using a generalized additive model, I show that previously registered voters who move close to the election are significantly less likely to vote (at most -16.2 percentage points), and it takes at least six months on average for turnout to recover. This dip and recovery is not observed for within-precinct moves, suggesting that costs of moving matter only when the voter’s environment has sufficiently changed. Given this, can we accelerate the recovery of movers’ turnout? I evaluate an election administration policy that resolves their re-registration burden. This policy proactively tracks movers, updates their registration records for them, and notifies them by mailings. Using a natural experiment, I find that it is extremely effective in boosting turnout (+5.9 percentage points). This success of a simple, pre-existing, and non-partisan safety net is promising, and I conclude by discussing policy implications.

This is important and innovative work, and I highly recommend her paper to readers interested in voter registration and voter turnout. She uses two different methods, one observational and the other causal, to show the reduction in the likelihood of turnout for registered voters who move.

Election forensics and machine learning

We recently published a new paper on election forensics in PLOS ONE, “Election forensics: Using machine learning and synthetic data for possible election anomaly detection.” It’s a paper that I wrote with Mali Zhang (a recent PhD student at Caltech) and Ines Levin at UCI. PLOS ONE is an open access journal, so there is no paywall!

Here’s the paper’s abstract:

Assuring election integrity is essential for the legitimacy of elected representative democratic government. Until recently, other than in-person election observation, there have been few quantitative methods for determining the integrity of a democratic election. Here we present a machine learning methodology for identifying polling places at risk of election fraud and estimating the extent of potential electoral manipulation, using synthetic training data. We apply this methodology to mesa-level data from Argentina’s 2015 national elections.

This new PLOS ONE paper builds on the paper that Ines and I coauthored with Julia Pomares, “Using machine learning algorithms to detect election fraud,” which appeared in the volume of papers that I edited, Computational Social Science: Discovery and Prediction. This is an area where my research group and some of my collaborators are continuing to develop methodologies to quickly obtain elections data and analyze it for anomalies and outliers, similar to our Monitoring the Election project. More on all of this soon!