VTP report: “The Voting Technology Project: Looking Back, Looking Ahead”

The Caltech/MIT Voting Technology Project has recently released the first in a series of reports for the 2016 U.S. presidential election. This report, “The Voting Technology Project: Looking Back, Looking Ahead,” outlines the history of the Caltech/MIT Voting Technology Project (VTP) and discusses some of the issues and states on which the VTP will focus its collective research activities in this fall’s election.

As this report discusses, the VTP was formed immediately after the 2000 presidential election. In particular, the project was established to study the problems associated with voting technologies in that election, and to propose solutions for those technological problems before the next presidential election in 2004. To assess the problems with voting technology in 2000, the VTP was constituted as a bicoastal, interdisciplinary research group.

While studying the issues with voting technology in the 2000 presidential election was our initial focus, the team quickly discovered that voting technology was not the only problem plaguing U.S. elections: our studies revealed that significant numbers of votes were lost in the 2000 presidential election to problems other than bad voting technology. The non-technological issues the VTP identified included voter registration, absentee voting, and polling place practices.

The VTP issued our first major research studies in June and July 2001 — fifteen years ago! The first of those studies examined the reliability of existing voting technologies, using the residual votes metric; the second took a broader focus, using the lost votes measure to compare the problems of voting technology with those associated with other aspects of the U.S. election process. Both of these 2001 studies were significant: they were widely read by policymakers, election officials, other academics, and the interested public, and they played important roles in the development of federal, state, and local election reform efforts after 2001, including the Help America Vote Act. These studies also laid the foundation for a surge of academic interest in the study of election administration and voting technology. Fifteen years later, that research community has grown to include scholars across the globe, who have jointly produced many important books and articles on voter registration, voter identification, absentee voting, voting technology, polling place practices, and election administration.
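
For readers unfamiliar with the residual vote metric, the idea is simple: it is the share of ballots cast that record no countable vote for a particular office, lumping together undervotes, overvotes, and otherwise uncountable marks. A minimal sketch of the calculation, with illustrative numbers of my own invention:

```python
def residual_vote_rate(ballots_cast: int, countable_votes: int) -> float:
    """Share of ballots cast that recorded no countable vote for a given
    office (undervotes + overvotes + otherwise uncountable marks)."""
    if ballots_cast <= 0:
        raise ValueError("ballots_cast must be positive")
    return (ballots_cast - countable_votes) / ballots_cast

# Hypothetical county: 100,000 ballots cast, 97,800 countable votes
# in the top-of-ticket race, for a residual vote rate of 2.2%.
print(f"{residual_vote_rate(100_000, 97_800):.1%}")
```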

Fifteen years later, the VTP continues to carry out ambitious and important research on voting technology and election administration. As a project, we have released a number of important policy reports since 2001, we have published our research widely, we have helped election officials and policymakers across the globe improve their voting technology and election administration practices, we have trained many students in the science of elections, and we have collaborated widely with researchers at many other colleges and universities.

As this new report discusses, going into the 2016 November general elections in the U.S., the VTP will be focusing on many of the same issues that have received our attention in the past (ironically, in some of the same states where we have focused our studies in past elections). Our efforts will involve the study of how to improve polling place practices, in particular the elimination of long lines at polling places on Election Day. We will continue our studies of voter identification and authentication procedures, and of how new technologies might allow for accurate identification without disenfranchisement. The VTP will be looking closely at the performance of old and new voting technologies that will be used this fall. Finally, we will also be studying voting-by-mail and early voting in the states that make wide use of those convenience voting options. We’ll provide additional reports about those studies as the election season progresses, and issue post-election evaluations when we have results to share with our colleagues and readers.

The 2000 presidential election was unique, and the combination of problematic voting technologies with a very close election focused the attention of the world on how American elections are conducted. The good news is that much has improved in the conduct of federal elections in the U.S. since 2000, and the research community now has metrics and methods to study election performance well.

However, in the battleground states, where the presidential election will be fought, it’s likely that attention will again focus on administrative and technological issues in November 2016, especially if none of the presidential candidates can easily claim an Electoral College victory on the evening of November 8, 2016. We hope that the release of this report, and the others that we will publish between now and Election Day, will help minimize the number of votes that are lost in the electoral process.

Media exit polls, election analytics, and conspiracy theories

The integrity of elections is a primary concern in a democratic society. One of the most important developments in the study of elections in recent decades has been the rapid development of tools and methods for the evaluation of elections, specifically what many call “election forensics.” A number of my colleagues and I have written extensively on election evaluation and forensics; I refer interested readers to the book that Lonna Atkeson, Thad Hall, and I wrote, Evaluating Elections, and to the book that I edited with Thad and Susan Hyde, Election Fraud.

One question that continues to arise is whether observed differences between election results and media exit polls are evidence of electoral manipulation or election fraud. These questions have been raised in a number of recent U.S. presidential elections, and they have come up again in the recent presidential primary elections in the U.S. In a recent piece in the New York Times, Nate Cohn wrote about these claims and about why we should be cautious in using media exit polls to detect election fraud. Each of the points that Cohn makes is valid and important, so this is an article worth reading closely.
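
To make one of these cautions concrete: even before considering nonresponse bias or early, unweighted tabulations, the clustered design of media exit polls inflates their sampling error well beyond what the raw respondent count suggests. Here is a rough sketch of that point, using a hypothetical poll and a design effect I have assumed purely for illustration:

```python
import math

def exit_poll_moe(p: float, n: int, design_effect: float = 1.0, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p estimated from n
    respondents, inflated by a design effect for the clustered precinct sample."""
    return z * math.sqrt(design_effect * p * (1 - p) / n)

# Hypothetical: a candidate polling at 52% among 1,500 exit-poll respondents,
# with an assumed design effect of 1.8 for the precinct-cluster design.
moe = exit_poll_moe(0.52, 1500, design_effect=1.8)
print(f"+/- {moe:.1%}")  # roughly +/- 3.4 points, so a 3-point gap between
                         # the poll and the official count proves very little
```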

I’d add to Cohn’s arguments, and note that while media exit polls have clear weaknesses as the sole forensic tool for determining the integrity of an election, we have a wide variety of other tools and methods to use in situations where there are questions raised about an election.
As Lonna, Thad, and I wrote in Evaluating Elections, a good post-election study of an election’s integrity should draw on a variety of data sources and multiple methods, including surveys and polls, post-election audits, and forensic analysis of disaggregated election returns. Each analytic approach has its strengths and weaknesses (media exit polls included), so by approaching the study of election integrity with as many data sources and methods as we can, we are best positioned to identify where further investigation of potential problems in an election is warranted.
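
As one concrete example of what forensic analysis of disaggregated returns can look like, here is a minimal sketch of a second-digit, Benford’s-law-style screen of precinct-level vote counts, one of several tests discussed in the election forensics literature. The precinct totals below are invented, and a large test statistic is only a flag for closer scrutiny, never proof of fraud.

```python
import math
from collections import Counter

# Expected second-digit frequencies under Benford's law.
BENFORD_2ND = [
    sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
    for d in range(10)
]

def second_digit_chisq(vote_counts):
    """Chi-square statistic comparing the second digits of precinct-level
    vote counts (only counts of 10 or more) to the Benford expectation."""
    digits = [int(str(c)[1]) for c in vote_counts if c >= 10]
    n = len(digits)
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * BENFORD_2ND[d]) ** 2 / (n * BENFORD_2ND[d])
        for d in range(10)
    )

# Invented precinct totals, for illustration only; compare the statistic
# to a chi-square distribution with 9 degrees of freedom.
precinct_votes = [382, 417, 129, 864, 205, 391, 570, 248, 333, 702]
print(round(second_digit_chisq(precinct_votes), 2))
```

In practice such screens would be run on full precinct files and alongside other checks, with any anomaly treated as a prompt for the kind of multi-method investigation described above.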

I have no doubt that we will hear more about the use of exit polls to evaluate the integrity of the presidential election this fall. Keep in mind Cohn’s cautionary points about using exit polls for this purpose, and also keep in mind that there are many other ways to evaluate the integrity of an election that have been tested and used in past elections. Media exit polls aren’t a great forensic tool, as Cohn argues: the types of exit polls that the news media uses to make inferences about voting behavior are not designed to detect election fraud or manipulation. Rather, those interested in a detailed examination of an election’s integrity should instead use the full array of analytic forensic tools that have been developed and tested in the research literature.

California’s 2016 Primary Election: Lessons Learned?

I headed to my local polling place this morning at about 8am, which this election was our neighborhood fire station. I have to confess, I had quite a feeling of déjà vu this morning, having once again experienced many of the same things that I’ve seen in the hundreds of polling places I’ve visited during my time with the Caltech/MIT Voting Technology Project (since the 2000 presidential election).

First, some good and bad news. The good news is that my polling place was busy this morning; the bad news is that the crowd made parking difficult (I had to circle the block to find street parking, a typical problem for polling places in densely populated urban areas). I also had to wait a bit: it took me about 3-4 minutes to check in and get my ballot, and then just over 5 minutes more for a ballot booth with an InkaVote device (I’m an L.A. County voter). Part of my wait time, of course, arose because one voter cut in front of the folks waiting in line (which generated some irritation).

But based on what I saw this morning, I’m betting that turnout in today’s California primary election will be higher than seen in recent statewide primary elections.

Second, there was again the same confusion we’ve seen in polling places in previous elections regarding the rules of the top-two primary, which admittedly are complex. Some no-party-preference voters don’t understand that they can request a party ballot only from certain parties, and at least one person who seemed to be a registered Green Party voter was vocal in their irritation at not being able to vote for Sanders. There was also confusion among some voters about which ballot booths to use, because in this election they needed to use a specific vote recorder (and thus a specific party’s booth).

Third, one of the two InkaVote ballot scanners seemed to be having problems. When I arrived, there was some sort of commotion around one of the scanners, with three poll workers trying to figure out why a voter was having trouble scanning their ballot. Whether the problem was resolved was hard for me to tell, as it wasn’t the scanner associated with my voting precinct.

Fourth, as I had written about earlier, the ballot for the U.S. Senate race was confusing. The ballot had a warning page about the ballot design, and while I think it was laid out about as well as possible for the InkaVote system, it was still confusing. My hypothesis is that once the election is over and we have data to study, we are likely to see a higher-than-expected residual vote (caused by overvoting) in this race.

So, a mixed evaluation. Clearly, since 2000 academics and election officials have learned a lot about how to study and conduct elections. But it’s also somewhat frustrating to see the same issues cropping up again: not terribly accessible polling locations, lines, voter confusion about the rules of the election, potential problems with ballot designs, and seemingly glitchy voting technologies. In general, it all seemed a bit more chaotic than necessary. Hopefully many of these issues will be mitigated or eliminated between now and November, as it’s looking like a contested and controversial general election is heading our way.

Computational Social Science and Election Administration

I recently edited a volume for Cambridge University Press, Computational Social Science: Discovery and Prediction. A summary of the book, along with some ideas about new directions for this evolving field, appears in a post just published on CUP’s blog, fifteeneightyfour: “Computational Social Sciences: Advances and Innovation in Research Methods.”


There’s a lot in this book that will be of interest to readers of this blog. Contributions include an essay by Ines Levin, Julia Pomares, and me on using machine learning techniques to detect election fraud; papers on how to use big data tools to improve government policymaking (Price and Gelman, and Griepentrog et al.); and chapters on a variety of new tools for analyzing text data, networks, and high-dimensional data.

Here we go again? Ballot design and the June California primary

Political observers will remember the 2003 California gubernatorial recall election, where 135 candidates ran in the election to replace Governor Davis. Rod Kiewiet and I wrote about how this complex election produced difficult decision problems for voters, and in a different paper (with Goodrich, Kiewiet, Hall and Sled) I also wrote about how the complexity of the recall election posed administrative problems for election officials.

The upcoming June primary in California is shaping up as one where we again have a complicated race, though this time it is for the U.S. Senate. Thirty-four candidates are competing in the Senate primary for the chance to move on to the general election in November. There have been a few preliminary reports in the media about how the crowded ballot might make it hard for voters to find their candidates in the primary election.

So I took a look at the sample Democratic ballot for Los Angeles County, which is reproduced below. The first page is designed to tell voters that the ballot for the U.S. Senate will have two pages, and it reproduces each page.

[Sample ballot, page 1]

The next two pages show what the ballot will look like in L.A. County; in this example, there are 19 candidates listed on the first page, and 15 candidates listed on the second page. The way this ballot is laid out, voters would need to look for their candidate on the first page, and then if they don’t find their candidate listed there, flip to the second page to find their preferred candidate.

[Sample ballot, page 2]

[Sample ballot, page 3]

This is just an example from L.A. County, which uses a unique voting system (“InkaVote”). While the sample ballot provides ample warnings to voters to vote for only one candidate and to check both pages for their candidate of choice, there’s a good chance that we’ll see voters make mistakes. In particular, we may see an increased risk of overvotes in the U.S. Senate race, as some voters may not understand that they are supposed to vote for only one candidate (or may not see the warnings) and instead may believe they are supposed to mark a candidate on each page. We may also see voters get confused and simply skip this race, which might result in an increased rate of undervoting in this election.

As other counties are using different ballot designs and layouts for this race, this is just an example of what might happen in L.A. County. Given the complexity of this ballot and election, there’s a good chance that we might see increased rates of both undervoting and overvoting across the state this June, though the exact causes for that will depend on the specifics of ballot design and layout in each county.
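
Once post-election data are available, decomposing the residual vote for this race into overvotes and undervotes is straightforward. A hedged sketch, using a made-up ballot-level record format (the number of Senate marks per ballot) rather than any county’s actual data layout:

```python
def over_under_rates(marks_per_ballot):
    """For a vote-for-one contest, return (overvote_rate, undervote_rate)
    as shares of ballots cast, given the number of marks on each ballot."""
    n = len(marks_per_ballot)
    overvotes = sum(1 for marks in marks_per_ballot if marks > 1)
    undervotes = sum(1 for marks in marks_per_ballot if marks == 0)
    return overvotes / n, undervotes / n

# Toy example: most ballots mark one candidate, a few mark one per page
# (an overvote), and a few skip the race entirely (an undervote).
ballots = [1] * 950 + [2] * 30 + [0] * 20
over, under = over_under_rates(ballots)
print(f"overvotes: {over:.1%}, undervotes: {under:.1%}")  # 3.0% and 2.0%
```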

Whether these designs and layouts will lead to systematic voter errors of the sort seen in 2000 is not clear at this point, as I’ve not yet had a chance to look at sample ballots from many of the larger counties in the state. However, we do know from research published about the infamous “butterfly ballot” used in Palm Beach County in the 2000 presidential election that even a relatively small number of voter mistakes can be influential in a close election (see the paper by Wand et al. on the butterfly ballot). If the U.S. Senate primary is close in June, we could see some scrutiny of ballot design and layout, and of whether problems with design and layout may have led to voter error.

Two new election science pieces in Political Analysis

Two new methodological pieces that will be of interest to students of election administration just came out in Political Analysis (which is edited by my VTP co-conspirator, Mike Alvarez).

(Warning to my non-academic followers:  serious math is involved in these papers.)

The first, by Kosuke Imai and Kabir Khanna, is entitled “Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records.”  In a nutshell, there are a lot of times when we need to know the race of registered voters, but we don’t have race as a data field in the voter file.  (This is true in all but a handful of states.)  Some people have dealt with this problem by relying on proprietary modeling techniques, such as those employed by Catalist, and others have simply used Census Bureau lists that classify last names by (likely) ethnicity.  Imai and Khanna have developed a technique, based on Bayes’s rule, that combines a variety of information, ranging from the surname list to geocoded information, to produce an improved method for modeling a voter’s ethnicity.  The technique is tested using the Florida voter file, which already has race coded, to make “ground truth” comparisons.
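
The core of their approach is a familiar application of Bayes’s rule: combine what a surname implies about race with what a voter’s neighborhood implies about race.  The sketch below is my own stylized rendering of that combination, with made-up probability tables; the actual method draws on the Census surname and geography information and additional predictors.

```python
def posterior_race(p_surname_given_race, p_race_given_geo):
    """Bayes's rule: P(race | surname, geo) is proportional to
    P(surname | race) * P(race | geo), assuming the surname is
    conditionally independent of geography given race."""
    unnormalized = {
        race: p_surname_given_race.get(race, 0.0) * p_geo
        for race, p_geo in p_race_given_geo.items()
    }
    total = sum(unnormalized.values())
    return {race: p / total for race, p in unnormalized.items()}

# Invented numbers: the surname's frequency within each group (as in a
# Census-style surname table) and the racial composition of the voter's
# census block.
p_surname_given_race = {"white": 0.002, "black": 0.0005, "hispanic": 0.010, "asian": 0.0002}
p_race_given_geo = {"white": 0.30, "black": 0.10, "hispanic": 0.55, "asian": 0.05}
print(posterior_race(p_surname_given_race, p_race_given_geo))
```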

The second, by Gabriel Cepaluni and F. Daniel Hidalgo, is entitled “Compulsory Voting Can Increase Political Inequality:  Evidence from Brazil.”  This article will definitely be relevant for those interested in proposals to institute mandatory voting in the US.  Brazil is the largest country in the world with mandatory voting, which makes this case of particular interest.  Cepaluni and Hidalgo show that the causal effect of making voting mandatory is to increase SES disparities in turnout.  The reason is that the non-monetary penalties for non-voting primarily affect voters with higher incomes.

Happy reading!


The mystery of the Brooklyn voter reg “purge”

Reports from Brooklyn about the “purge” of over 125,000 voters between last November and the recent presidential primary have turned the spotlight on the maintenance of voter lists. Today’s news brings word that the Kings County Board of Elections’ chief clerk apparently erred by removing voters from the rolls contrary to law.

Pam Fessler’s excellent NPR report on Wednesday about the rules governing removing voters from the rolls makes the point that the laws governing voter list maintenance are pretty clear.  Voters (and reporters) don’t always understand those rules, and when they do, they don’t necessarily agree with them.  For that reason, I’m going to sit back and wait for the reports of the New York City Comptroller and the state Attorney General before passing judgment on what exactly happened and who was at fault.

That said, the whole story remains a bit of a mystery: first, because statistics about New York’s list maintenance activities are opaque, and second, because no one really knows how many people “should be” on the voting rolls and, therefore, how many people “should be” removed when list maintenance activities are done.

New York’s murky voter registration statistics

On the issue of statistical opacity: every two years, the U.S. Election Assistance Commission (EAC) is required by the National Voter Registration Act (NVRA) to issue a report about voter registration activities at the state level.  (Here is a link to the post-2014 report.)  To prepare the report, the EAC sends a survey to the states asking them to report, at the county level, statistics that describe the number of voters removed from the rolls and why they were removed.  (The three major categories of removals are “failure to vote,” “moved from jurisdiction,” and “death.”)  In recent years, most states have complied with the request to provide this detailed information, but not New York.

As recently as 2008, New York only reported statistics for the whole state, not for individual counties.  In 2010 and 2012 New York finally started providing county-level statistics to the EAC, but the state backslid in 2014, providing no detailed breakdown for why voters were removed from any county in the state.  Not only that, but New York reported that between the 2012 and 2014 elections, only 47,634 voters had been removed from the rolls statewide, which is approximately the same number removed by Delaware.  (To provide further perspective, Florida removed over 484,000 voters and Pennsylvania removed over 853,000.)

Over the past few days, many people have asked me if the number of voters removed from the rolls in Brooklyn was unusual, to which I have to answer, “who knows?” because the relevant list maintenance statistics from New York (meaning the whole state, not just the city or one borough) are not being made public, as they are for most of the rest of the nation.

We don’t know how many people “should be” on the rolls

On the issue of how many people “should be” on the rolls and how many “should be” removed by list maintenance activities every year:  It turns out that this is a very hard question to answer. One attempt to answer it was made in a recent conference paper that I wrote with a Harvard graduate student, Stephen Pettigrew.  (You can download the paper at this link.)  Because there is no national registry of all eligible adults (at least not one that is available to the public) and no single national voter registration list, we don’t know the “true” number of registered voters.  (By “true number,” I mean people who are eligible to vote in the state in which they are registered, which excludes people on the rolls who have moved or died.)  Thus, official voter registration lists are, to some extent, “too big,” but by how much is currently unknown (and hotly contested among various groups).

Even so, it is possible to get an approximate sense of how many voters should be removed from the rolls on an annual basis, since two reasons dominate all others:  moving out of a jurisdiction and dying.  Let’s see where Brooklyn (Kings County) stands on those measures.

WARNING:  Detailed calculations involving math follow

Deaths are easy.  The Centers for Disease Control maintain a database that records the number of deaths in each county of the United States, broken down by age.  In 2014 (the most recent year for which statistics are available), Kings County recorded 15,347 deaths among those 20 years and older.  (Unfortunately, the CDC database breaks down population groupings in five-year intervals, so we can’t add the deaths of 18- and 19-year-olds.  But, given the nature of death statistics, this is not a large number of people.)

Moving is a little trickier, because there isn’t a national registry of movers, and the Census Bureau data are cumbersome to use for estimating how many people have moved out of a county or state.  However, the IRS (who knew?) provides data on county-to-county migration, based on income tax filings, which can be used to estimate how many people move out of Kings County each year.

From what I can tell, between 2013 and 2014 (the most recent data available), about 110,000 people moved from Brooklyn — over 59,000 moving to other counties in New York and over 50,000 moving to other states.  Not all of these are registered voters, of course, nor are all of them eligible.  The Census Bureau tells us that there were roughly 2.0 million residents in Brooklyn in 2014 who were 18 and older, out of the borough’s 2.6 million residents.  If all of these adults were registered, my back-of-the-envelope calculation suggests that about 60,000 registered voters from Brooklyn move somewhere else in New York each year and about 51,000 registered voters move out of state.  The out-of-state movers should certainly be removed from the rolls (eventually); the in-state movers would presumably be removed from the Kings County rolls eventually, but would reappear on the rolls of another county.

However, the most recent official reports from the state indicate that there are only between 1.3 and 1.4 million registered voters in Kings County, depending on which set of statistics you trust (last November’s or this April’s).  Either way, my back-of-the-envelope calculations suggest that with this more reasonable estimate of how many registered voters there actually are in Brooklyn, you probably have only about 39,000 registered voters moving within New York in any given year and 33,000 moving out of state.  And, if people who die are registered at the same rate as those who survive another year, that gives us only about 10,000 deaths that need to be taken care of each year.
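
For readers who want to check my arithmetic, here is the back-of-the-envelope calculation in a few lines of code.  The figures are the approximate ones quoted above, and the scaling simply assumes that movers and decedents are registered at the same rate as everyone else:

```python
# Approximate figures for Kings County (Brooklyn) quoted above.
adults = 2_000_000              # residents 18 and older (Census, 2014)
registered = 1_350_000          # midpoint of the 1.3-1.4 million registered voters
deaths = 15_347                 # deaths among those 20 and older (CDC, 2014)
moved_in_state = 59_000         # movers to other New York counties (IRS, 2013-14)
moved_out_of_state = 50_000     # movers to other states (IRS, 2013-14)

# Scale each flow by the share of adults who are actually registered.
reg_rate = registered / adults
print(f"registration rate among adults: {reg_rate:.0%}")
print(f"expected in-state mover removals:  {moved_in_state * reg_rate:,.0f}")
print(f"expected out-of-state removals:    {moved_out_of_state * reg_rate:,.0f}")
print(f"expected death removals:           {deaths * reg_rate:,.0f}")
```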

This is a long way of saying that the only way you could get 125,000 voters removed from the rolls in a year (assuming that list maintenance happens annually) is if everyone eligible to vote is registered and everyone who moves or dies is then taken off the rolls.  More likely, if only about 60% of eligible voters are registered in Brooklyn, then the expected number of removals would be in the range of 40,000 to 80,000 voters each year.

As a side note, in 2014, Kings County reported to the EAC that it removed only 4,548 voters from the rolls for all reasons between the 2012 and 2014 elections.  Thus, it is reasonable to infer that Brooklyn (and the rest of New York state) isn’t even removing voters who die, which should be the easiest part of the removal process to manage.

If you’ve read this far, you deserve a medal, but you should also now have a sense about why the question of how many voters we should expect to be removed via regular list maintenance activities is so unclear.  It would help if New York’s counties started reporting the same detailed list maintenance statistics as the rest of the nation.  If they did, then at least we would have a better sense about the efforts being undertaken to keep the rolls reasonably free of deadwood.  Until then, no one outside the state board of elections and the county boards will be able to judge the efforts that are going into making sure the voter rolls in New York are accurate.


Making sure that California election officials are ready for the upcoming primary

California’s statewide primary is approaching rapidly, and it sounds as if voter interest in the primary is building. This could be an important test of the state’s top-two primary system, and it might be the first time that we see strong voter turnout under the top-two. Clearly election officials throughout the state need to be prepared — there might be a lot of last-minute new registrants, a lot of ballots cast by mail, and perhaps many new voters showing up on election day. The LA Times editorialized about exactly this concern: “How do we prevent the California primary from becoming another Arizona?”

Printing errors raise concerns about voter confusion in New York

The New York Times is reporting that there have been significant problems printing ballots in advance of New York’s upcoming primary elections. The article, “A $200,000 Ballot Error and Other Misprints at New York City’s Board of Elections”, reports that various incorrect mailings have been sent to voters, ranging from notices of upcoming elections with incorrect dates to errors in the printing of absentee ballots. How many lost votes these errors might generate in the state’s primary is difficult to estimate at this point, but once the primary elections are over and data are available, it may be possible to determine whether these mistakes misled primary voters.

Competing Lessons from the Utah Republican Caucus

If you want a case that illustrates the clash of expectations in the presidential nomination process, you need look no further than Utah’s Republican caucuses that have just been held.

These problems were well illustrated in two postings that recently came across my computer screen (h/t to Steve Hoersting via Rick Hasen’s Election Law Blog).  I make no claims about the accuracy of the accounts (especially in Post # 1), but the sentiments expressed are certainly genuine and representative.

Post # 1, a very interesting (to say the least) description of one person’s experience at the caucus, is a classic clash-of-expectations account.  In this posting, we learn that the lines to check in were long, ballots were given out in an unsecured fashion, those running the event didn’t always seem to know what was going on, one-person-one-vote may have been violated, the ballot wasn’t exactly secret, and those counting the ballots didn’t want too many people looking over their shoulders.  About which I think, “sounds like a caucus to me.”

Caucuses are a vestige of early 19th century America, intended to pick nominees, to be sure, but with other goals in mind as well, such as rewarding the party faithful with meaningful activity and exerting control over the party base.  What we moderns value about primaries — that they are run by professionals, are designed to minimize coercion, and value access and security simultaneously — is precisely what caucuses are not.  Primaries were not gifted to Americans by a benevolent God, but were fought for by reformers over many years.  Primaries have their problems (among them the fact that primary laws also had the [intended] effect of killing off minor parties), but it is a mistake to judge caucuses as if they were primaries.

Post # 2 is a story in Wired about the Utah Republican Party’s use of an online elections vendor to run an absentee voting process for the caucuses over the Internet. The writer’s point of view is that online voting in an election is an outrage because of the well-known problems with security and auditability of voting over the Internet.  Fair enough.  But, this is not a secret ballot election, it is a caucus.  If there is outrage to be expressed along these lines, it is for Republican leaders lending the appearance of a secret ballot election to a different sort of proceeding.

The Wired story also uncovers frustration among many thousands (probably) of would-be Internet voters that they were unable to vote because their party registration could not be verified, which may be another way of saying they were not eligible to vote in the first place, and would have been turned away from a physical caucus if they had appeared there instead.  Thus, we have another mismatch of expectations, pitting party leaders, who have every right to guard the associational rights of the party organization, against party voters, whose affiliation with the parties is one of identity rather than organizational membership.

This presidential nomination season has been infinitely interesting, one that will go down in history.  As the process drags on, moving from the high-profile early states to the low-profile middle and later states, we are seeing more and more examples of inconsistent expectations between process organizers and voters.  I suspect this will lead to an interesting round of reform activity (The Republican Party Meets McGovern-Fraser, anyone?) once the dust has settled in November.