The Internet lit up on Monday over the news, reported in New York Magazine, that a team of computer scientists and lawyers had reported to the Clinton campaign that “they’ve found persuasive evidence that results in Wisconsin, Michigan, and Pennsylvania may have been manipulated or hacked.”
A later posting in Medium by University of Michigan computer science professor J. Alex Halderman, who was quoted in the NY Mag piece, stated that the reporter had gotten the point of the analysis wrong, along with some of the numbers. As he notes, the important point is that all elections should be audited, and not only if you have statistics suggesting that something might be fishy.
Unfortunately, the cat is out of the bag. Because of the viral spread of the NY Mag article, the conspiratorially minded now have something to hang their hats on, if they want to believe that the 2016 election (like the 2004 election) was stolen by hacked electronic voting machines.
Many of my friends who are not conspiratorially minded have been asking me if I believe the statistical analysis suggested by the NY Mag piece is evidence that something is amiss. They’re not satisfied with me echoing Alex Halderman’s point that this is beside the point. So, here are some thoughts about the statistical analysis.
- Some very good commentary about the statistical analysis has already appeared in fivethirtyeight.com and vox.com. Please read it. (And, do read Halderman’s Medium post, referenced above.)
- I should start my own commentary by saying that I have not seen the actual statistical analysis alluded to by the NY Mag piece. I know no one else who has seen it, either. (I’ve asked.) Therefore, I must make assumptions about what was done. I’ve been doing analyses such as this for over 16 years, so I have a good idea about what was probably done, but without the actual study and the data on which the analysis was conducted, I can’t claim to be replicating the study. (By the way, I’m also assuming that a “study” was done, but it’s also not at all clear that this was the case. It could be that Halderman and his colleagues provided some thoughts to the Clinton campaign, and this communication was misconstrued by the public when word got out.)
- The gist of the analysis described by NY Mag appears to be comparing Clinton’s vote share across the types of voting machines used by voters in Michigan, Pennsylvania, and Wisconsin. To attempt a replication of this analysis, it would be necessary to obtain election returns and voting machine data at the appropriate unit of analysis from these three states.
- Voting machine use. Both Michigan and Wisconsin only use paper ballots for Election Day voting.* Therefore, one simply cannot compare the performance of electronic and paper systems within these states. This sentence in the NY Mag article must be false: “The academics presented findings showing that in Wisconsin, Clinton received 7 percent fewer votes in counties that relied on electronic-voting machines compared with counties that used optical scanners and paper ballots.” On the other hand, some counties in Pennsylvania do use electronic voting machines, known in the election administration field as “DREs” for “direct recording electronic” devices. Pennsylvania, therefore, could be used to compare results for voters who used electronic machines with those who used paper.
- Voting machine data. For many decades Kim Brace, the owner of Election Data Services, has collected data about the use of voting technologies as a part of his business. Every four years I buy Kim’s updated data, which I have done for 2016. Verified Voting also has a publicly available data set that reports voting machine use at the local level. I tend to prefer Brace’s data because of his long track record of gathering it. As I show below, both data sources tell similar stories about the use of voting machines in Pennsylvania. The comparisons are the same, regardless of the voting machine data set.
- Election return data. Here, I use county-level election return data I purchased from Dave Leip at his wonderful U.S. Election Atlas website. (This is from data release 0.5.)
- The Pennsylvania comparison. Using the Brace voting machine data to classify counties, Clinton received 39.3% of the vote in Pennsylvania counties that used opscans and 49.0% of the vote in counties that used DREs. However, when the standard statistical controls are included to account for the other factors that would predict the Clinton vote share in a county (race, population density, and education), the difference in Clinton’s vote share between DRE and opscan counties is reduced to 0.095%. Using the Verified Voting data to classify counties, Clinton received 40.2% of the vote in opscan counties and 52.4% of the vote in DRE counties. (The Brace and Verified Voting data sets differ in reporting the machines used in four counties.) In this case, when the statistical controls for race, population density, and education are included, the difference in Clinton’s vote share between DRE and opscan counties goes down to 0.6%.
- Virtually all Michigan and Wisconsin Election Day voters (and absentee voters, for that matter) use paper ballots. In Michigan, these ballots are counted on scanners; in Wisconsin, some are counted by hand, but most by scanners. Election returns from these states cannot be used to compare voting patterns using electronic machines and paper-based systems. The core empirical claim in the NY Mag article that has the Internet all atwitter cannot be true.
- The difference in voting patterns between Pennsylvania voters who used electronic machines and those who used optically scanned ballots is accounted for by the fact that voting machines are not randomly distributed in Pennsylvania. Clinton received proportionally more votes in counties with electronic machines, but that is because these counties were disproportionately nonwhite and metropolitan — factors that are correlated with using DREs in Pennsylvania.
- The importance of advocating for post-election audits to ensure that the ballots were counted correctly is not a matter of electronic vs. paper ballots, or a matter of whether doing so will save the election for one’s favored candidate. The reason all systems, regardless of technology, should be audited after every election is to ensure that the election was fair and that the equipment and counting procedures functioned properly. This critical message was unfortunately garbled by playing to conspiratorial fears about the outcome of the 2016 election.
- My biggest fear in this episode is that election officials, state legislators, and voters will now regard advocates for post-election audits as part of the movement to discredit the election of Donald Trump as president. I know that this is not the intention. My biggest hope is that decisionmakers will look beyond the sensational headlines and recognize that post-election audits are simply a good tool to make sure that the election system has functioned as intended.
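For readers who want to see the logic of the Pennsylvania comparison in miniature, here is a rough sketch in Python. It is not the analysis reported above. The counties are synthetic and made up for illustration, the function names are hypothetical, the group means are unweighted, and only one control (nonwhite share) stands in for race, population density, and education. The data are built so that the voting machine has no effect at all, yet the raw gap between DRE and opscan counties is still large.

```python
# Synthetic, made-up counties: (uses_dre, nonwhite_share, clinton_share).
# Clinton's share is constructed as 0.30 + 0.50 * nonwhite_share, so the
# type of voting machine has no effect by design. These are NOT real returns.
def make_counties():
    nonwhite_opscan = [0.02, 0.04, 0.05, 0.08, 0.10, 0.20]
    nonwhite_dre = [0.06, 0.15, 0.25, 0.30, 0.40, 0.45]
    rows = []
    for dre, shares in ((0, nonwhite_opscan), (1, nonwhite_dre)):
        for nw in shares:
            rows.append((dre, nw, 0.30 + 0.50 * nw))
    return rows


def mean(values):
    return sum(values) / len(values)


def raw_gap(rows):
    """Unweighted mean Clinton share in DRE counties minus opscan counties."""
    dre = [share for d, _, share in rows if d == 1]
    opscan = [share for d, _, share in rows if d == 0]
    return mean(dre) - mean(opscan)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows):
    """Least-squares fit of clinton_share on [intercept, DRE dummy, nonwhite share]."""
    X = [[1.0, float(d), nw] for d, nw, _ in rows]
    y = [share for _, _, share in rows]
    k = 3
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


if __name__ == "__main__":
    counties = make_counties()
    intercept, dre_effect, nonwhite_effect = ols(counties)
    print(f"raw DRE minus opscan gap:   {raw_gap(counties):.4f}")
    print(f"DRE coefficient with control: {dre_effect:.4f}")
```

In this toy data the raw gap is a bit over 9 percentage points, while the coefficient on the DRE indicator is zero up to rounding once nonwhite share is in the regression. That is the same pattern described for Pennsylvania: the machines are not randomly distributed, so a raw difference by machine type mostly reflects who lives in the counties that use them.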
*I have learned that between 5% and 10% of Wisconsin voters who are not physically disabled do use the so-called “accessibility machines,” rather than the regular opscan paper ballots. However, I know of no election returns that have reported the results of ballots cast on these machines alone, nor do I believe that the reports discussed in the NY Mag article were referring to these ballots.