Auditing Voter Registration Databases

As many readers know, we’ve been working on a variety of election performance auditing projects, focusing on voter registration database auditing and other types of statistical forensics (you can see a number of examples on our Monitoring The Election project page). We have also been working on post-election ballot auditing, including our recent Election Audit Summit.

The first paper from our new election integrity project in Orange County (CA) was recently published in the peer-reviewed journal American Politics Research. This paper, “Evaluating the Quality of Changes in Voter Registration Databases”, was co-authored by me, Silvia Seo-young Kim (a Ph.D. student here at Caltech), and Spencer Schneider (a Caltech undergrad). Here’s the paper’s abstract:

The administration of elections depends crucially upon the quality and integrity of voter registration databases. In addition, political scientists are increasingly using these databases in their research. However, these databases are dynamic and may be subject to external manipulation and unintentional errors. In this article, using data from Orange County, California, we develop two methods for evaluating the quality of voter registration data as it changes over time: (a) generating audit data by repeated record linkage across periodic snapshots of a given database and monitoring it for sudden anomalous changes and (b) identifying duplicates via an efficient, automated duplicate detection, and tracking new duplicates and deduplication efforts over time. We show that the generated data can serve not only to evaluate voter file quality and election integrity but also as a novel source of data on election administration practices.
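
To give a rough sense of what the two methods in the abstract involve, here is a minimal Python sketch. It is an illustration only, not the code or the algorithms used in the paper. It assumes each snapshot is a dictionary keyed by a stable voter ID (the paper instead relies on record linkage, which does not need such an ID), and the field names (`first`, `last`, `dob`, `address`), the function names, and the median-based threshold are all hypothetical choices made for this example.

```python
from collections import defaultdict
from statistics import median


def diff_snapshots(old, new):
    """Count records added, removed, or changed between two snapshots.

    Assumes both snapshots are dicts keyed by a stable voter ID; this is a
    simplification of the record linkage described in the abstract.
    """
    added = new.keys() - old.keys()
    removed = old.keys() - new.keys()
    changed = {k for k in old.keys() & new.keys() if old[k] != new[k]}
    return {"added": len(added), "removed": len(removed), "changed": len(changed)}


def flag_anomalies(daily_counts, threshold=5.0):
    """Return the indices of days whose change count is far above the median.

    The multiplier is an arbitrary illustrative choice, not a value from the paper.
    """
    baseline = max(median(daily_counts), 1)
    return [i for i, count in enumerate(daily_counts) if count > threshold * baseline]


def find_duplicates(snapshot):
    """Group voter IDs that share a normalized name and date of birth.

    Exact matching on normalized fields is the simplest possible rule; a real
    audit would likely need fuzzier matching.
    """
    groups = defaultdict(list)
    for voter_id, rec in snapshot.items():
        key = (rec["first"].strip().lower(), rec["last"].strip().lower(), rec["dob"])
        groups[key].append(voter_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Two tiny made-up snapshots (not real voter data).
day1 = {
    "A1": {"first": "Ana", "last": "Lopez", "dob": "1980-01-02", "address": "1 Main St"},
    "B2": {"first": "Ben", "last": "Ng", "dob": "1975-05-06", "address": "2 Oak Ave"},
}
day2 = {
    "A1": {"first": "Ana", "last": "Lopez", "dob": "1980-01-02", "address": "9 Elm Rd"},
    "C3": {"first": " ana ", "last": "LOPEZ", "dob": "1980-01-02", "address": "9 Elm Rd"},
}

changes = diff_snapshots(day1, day2)
duplicates = find_duplicates(day2)
spikes = flag_anomalies([3, 4, 2, 5, 400, 3])
```

Running the diff on each consecutive pair of snapshots yields a daily series of change counts, which is the kind of audit data that can then be monitored for sudden jumps; tracking the duplicate groups across snapshots shows when new duplicates appear and when they are cleaned up.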

An ungated pre-print version of this paper is available from the Caltech/MIT Voting Technology Project’s website, as Working Paper 134.

We are continuing this work with Orange County, and in recent months we have been exploring how these same voter registration database auditing methodologies can work in larger jurisdictions (Los Angeles County) and at the state level (Oregon). More on those results soon.

The process that led to the development of this project, and to the publication of this paper, is also worth recounting. In this paper, we make use of daily voter registration “snapshots” that we obtained from the Orange County Registrar of Voters, starting in April 2018. This required close collaboration with Neal Kelley, the Orange County Registrar of Voters, and his staff. We are very happy to participate in this collaborative effort, and we thank Neal and his team for their willingness to work with us. It’s been a very productive partnership, and we are excited to continue our collaboration with them going into the 2020 election cycle. This is the sort of partnership between academics and election officials that we have worked to build and foster at the Caltech/MIT Voting Technology Project since our project’s founding in the immediate aftermath of the 2000 presidential election.

It’s also fun to note that both of my coauthors are Caltech students. Silvia is in her final year in our Social Science Ph.D. program, and she is pursuing related research for her dissertation (I’ll write later about some of that work, which you can see on Silvia’s website). Spencer worked closely with us on this project in 2018 as a participant in Caltech’s Summer Undergraduate Research Fellowship program. His project was to help us build the methodology for voter registration database auditing. Currently, Spencer is working in computer science and engineering here at Caltech. This paper is a great example of how we like to involve graduate and undergraduate students in our voting technology and election administration research.