Two More Thoughts about the NC 9th CD Situation

The North Carolina 9th congressional district controversy is an interesting case of how the data-rich environment of North Carolina elections allows election geeks to explore in great detail the dynamics of an election, using the incomparable North Carolina State Board of Elections data website.  In particular, Nathaniel Rakich at FiveThirtyEight and Michael Bitzer at Old North State Politics have mined the data deeply.

I don’t have much more to add, but I did want to put my oar in on two topics that may have relevance to the unfolding scandal.  The topics are:

  • Unreturned ballots by newly registered voters
  • Unreturned ballots by infrequent voters

Thing # 1: Unreturned ballots by newly registered voters

The first topic is the return rate of absentee ballots by newly registered voters.  Robeson County officials noticed a large number of absentee ballot requests being dropped off in batches, along with new voter registration forms.  This apparently was one of the things that alerted officials to the possibility that something was up.  In all the analyses posted, I hadn’t seen any reports of the percentage of unreturned absentee ballots by newly registered voters.  Here goes.

First, this pattern of batches of absentee ballots along with registration forms was reported in August.  It turns out that the non-return rate of absentee ballots requested in August in Robeson County when the registration was also received in August was quite high — 95%, compared to 33% in the rest of the county.  The number of affected ballots was quite small, 21, but this is still an eye-popping statistic when compared to other counties.

Second, broadening the window a bit, the non-return rate of absentee ballots among those who registered any time in 2018 in Robeson County was 81%, compared to 67% for those who had registered before 2017.

Thus, it’s likely that some sort of registration+absentee request bundling was going on in Robeson.  However, the non-return rate is still high if we exclude the (possibly) bundled requests.  Clearly, if there was fraud, it was multi-strategy.

 

Thing # 2: Unreturned ballots by infrequent voters

The second topic is whether infrequent voters were more likely to request an absentee ballot and not return it.  This question occurred to me because it fits into a scenario I’ve talked about with other election geeks, about how absentee ballots might be used fraudulently.  The idea is that if someone wants to request a ballot to use it fraudulently, they need to request it for someone who is unlikely to vote.  Otherwise, when they — the actual legitimate voter — do go to vote, it will be noticed that they had already requested an absentee ballot.  If this happens a lot in a jurisdiction, the fraud is more likely to be noticed.

Were a disproportionate number of absentee ballot requests being generated among likely non-voters in the 9th CD?  Yes, but mostly in Bladen County.

To investigate whether this type of calculation may have played into the strategy, I looked a bit more closely at the unreturned absentee ballots in the recent North Carolina election.  I hypothesized that registered voters who had not voted in a long time would be more likely to have an absentee ballot request manufactured for them than a regular voter.  To test this hypothesis, I went to the North Carolina voter history file, and counted up the number of general and primary elections each currently registered voter had participated in since 2010.  There have been nine statewide elections in this time (5 primaries and 4 general elections, not counting November 2018).
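The participation tally described above can be sketched in a few lines of pandas.  This is a hedged illustration on a toy stand-in for the voter history file; the column names (`voter_id`, `election_date`) are illustrative, not the NCSBE's actual schema, and real history files hold millions of rows.

```python
# Sketch of the tabulation described above: count the statewide
# elections each registered voter participated in since 2010,
# excluding November 2018. Toy data; column names are illustrative.
import pandas as pd

# One row per (voter, election) participation record.
history = pd.DataFrame({
    "voter_id": ["A", "A", "A", "B", "C", "C"],
    "election_date": ["2010-11-02", "2012-11-06", "2016-11-08",
                      "2014-11-04", "2010-05-04", "2012-05-08"],
})
history["election_date"] = pd.to_datetime(history["election_date"])

# Restrict to the window used in the text.
since_2010 = history[(history["election_date"] >= "2010-01-01") &
                     (history["election_date"] < "2018-11-01")]

# Participation count per voter; registered voters with no history
# rows would be merged in with a count of zero before computing
# non-return rates by this count.
participation = since_2010.groupby("voter_id").size().rename("n_elections")
```

The resulting count (0 through 9 for the period in question) is then the grouping variable for the non-return rates reported in the next paragraph.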

Sure enough, frequent voters were less likely to have an unreturned absentee ballot than non-voters.  Statewide, voters who had participated in the past 9 statewide elections had a non-return rate of 14%, compared to a non-return rate of 32% for those who had never voted.  (Among those who had never voted, but had registered in 2010 or before, the non-return rate was 38%.)  In the 9th CD, these percentages were 25% and 43%, respectively.  In Bladen, they were 22% and 72%.

Interestingly enough, in Robeson County, which had the highest non-return rate in the district — and in the state — the relationship between being an infrequent voter and not returning the absentee ballot was not as strong.  Among registered voters who had not cast a ballot since 2010, 81% failed to return their absentee ballot.  Among those who had voted in every election, the non-return rate was 60%.

The accompanying graph shows the more general trend.  The grey circles represent each county in North Carolina.  (Counties in more than one CD show up more than once.)  Throughout the state, infrequent voters are more likely to request absentee ballots that are not returned.

Bladen County is highlighted with the blue hollow circles.  Robeson is highlighted with the hollow red circles.  All the other counties in the districts are the hollow green circles.

If the unreturned absentee ballots reflect, in part, artificial generation of absentee ballot requests, the logic of who was getting targeted looks to have been different in Bladen and Robeson Counties.  Bladen County’s non-returns look more like they were associated with the strategy of requesting absentee ballots from people who would not notice.  Something else was going on in Robeson County.

A Quick Look at North Carolina’s Absentee Ballots

News comes that North Carolina’s State Board of Elections and Ethics Enforcement has chosen not to certify the results of the 9th congressional district race, which was (provisionally) won by the Republican Mark Harris over Democrat Dan McCready by only 905 votes.  News accounts provide speculation that this is related to “irregularities” among absentee ballots in the district.  Because North Carolina has such a great collection of election-related datasets, I thought I’d dive in quickly to see what we can see.

(For the data geeks out there, go to this web page, and enjoy!)

My interest is guided by a number of statements that have appeared in news sources and filings with the SBOE.  Among these are:

  • Charges of an unusually large number of mail absentee ballot requests in the “eastern” part of the district, especially Robeson and Bladen Counties.
  • Charges that an unusually high proportion of mail absentee ballots were unreturned.
  • Charges that “ballot harvesters” were gathering up ballots and collecting them in unsealed envelopes (presumably allowing the harvesters to fill in choices on the ballot and then submit them).

What do the data show?  Here are some quick takes.  This is certainly not the last word, but reveals what one can glean from the SBOE’s public data.

Number of ballots by county

It certainly is true that Bladen County had a disproportionately high level of absentee ballot usage in the 2018 congressional election, but it goes beyond Bladen County and beyond the 9th CD.  The accompanying graph shows the percentage of votes that were cast by mail absentee ballots for each district-county unit.  (For instance, Mecklenburg County is in two districts, so it appears twice in the graph, once for each district.)  The part of Bladen County that is in the 9th District did cast the highest percentage of mail absentee ballots in a congressional race, at 7.3%.  In the entire district, 3.8% of ballots were cast absentee.  And in the part of Bladen County that is not in the 9th District, a lower percentage (4.6%) was cast by mail.

(As with all the graphs in this post, you can click on them to enlargify them.)

Note, however, that Mecklenburg County also cast a notably high percentage of mail ballots in the race — 5.8% of all votes.  Also, because Mecklenburg is about ten times larger than Bladen, it turns out that its absentee ballots (over 5500) swamped Bladen’s (nearly 700).

Finally, it should be said that one other county, Yancey, is an even bigger outlier, if what we’re looking for is a comparison of mail absentee ballot use with the rest of a district.  Nearly six percent (5.6%) of Yancey’s votes were cast by mail, compared to 2.4% in the rest of the 11th district.

Party composition of ballots by county

For absentee ballots to have a major influence on the outcome of a race, they need to overwhelmingly support one of the candidates.  Here, we encounter even more interesting and unexpected patterns.

In this case, the accompanying graph has two parts.  The left part is a scatterplot of the percentage of the two-party vote given to the Democratic congressional candidate in all mail absentee ballots (y-axis) against the percentage of the vote given to the Democratic congressional candidate in all ballots.  Again, the unit is the county-district.  The red dashed line is set to 45-degrees (ignoring the aspect ratio).  Most counties are above the red line, indicating that in most counties, Democratic congressional candidates did better in the mail absentee vote than they did in the other voting modes.  The data tokens are clustered around the line.  There are outliers, to be sure — a few counties are below the line, where Republican candidates actually out-performed in the absentee ballots, and a few are well above the cloud of circles.

The right part of the graph pulls out the counties that are part of the 9th CD.  There are three counties of note (at least) in the graph.  The first is our friend, Bladen County, which is identified here as one of the few counties in the state in which the Republican congressional candidate actually did better in the mail absentee ballots than in the other modes.  No wonder Democrats were suspicious.  At the same time, Union and Anson Counties are outliers on the other side of the equation.  Union County’s absentee ballots were 21 points more Democratic than votes overall.

As an aside, in the part of Bladen County that is in the 7th congressional district, the Democratic share of the mail absentee vote was 86.6%, compared to an overall Democratic share of 61.3% in that part of the county.  It makes one wonder whether the Democrats and Republicans were concentrating their efforts to get their supporters to cast mail ballots at opposite ends of the county.

Unreturned ballots

This is where it gets interesting.  Some of the speculation that has been floating around suggests that there was a significant number of unreturned mail absentee ballots in the district.  This has been attributed to a number of things.  It could be that political activists were requesting ballots for voters without their consent, and those ballots simply went unreturned.  Another possibility is that “ballot harvesters” were going door-to-door asking people to give them their ballots — and then maybe not delivering them to the county.

I looked at the percentage of requested mail absentee ballots that were never returned for counting, and sure enough, Bladen and Robeson Counties stand out.  The pattern stands out in the accompanying graph, which really needs to be enlarged to be fully appreciated.  (Again, you can enbiggify the graph by clicking on it.)  The graph shows the percentage of mail absentee ballots requested by Democrats (blue dots), Republicans (red dots), and unaffiliated voters (purple dots) that were unreturned in each county.  I have made the dots associated with the counties in the 9th district a bit bigger.  Statewide, about 24% of mail absentee ballots were not returned after being requested — 27% of Democrats, 19% of Republicans, and 24% of unaffiliated.  In Anson, Bladen, and Robeson Counties, the nonreturn rates were 43%, 47%, and 69%, respectively.

Robeson County stands out, because not only is the overall nonreturn rate high, but the partisan discrepancy is so high, as well.  The overall nonreturn rate was 69%, but it was 73% for ballots requested by Democrats and 66% for ballots requested by unaffiliated voters.  Still, the Republican nonreturn rate was also unusually high, at 49%.

Some news accounts remarked that Robeson County officials started noticing batches of absentee requests being delivered in August, and started keeping track.  This made me wonder whether the unreturned ballots were associated with these batch requests.  To explore this, I calculated the percentage of mail absentee ballots that were unreturned, based on the week of the year when they were requested.

That led to the accompanying graph.  The grey circles represent the fraction of mail ballots requested each week of 2018 that ended up not being returned for counting, in each county.  Note that the grey circles become a grey blob toward the end.  The black line shows the average nonreturn rate for the whole state, as a function of the week when the ballot was requested.  The hollow blue circles represent Robeson County.  Note the large number of unreturned ballots that appear after week 30 — the August period noted before.  After Labor Day, the nonreturn rate in Robeson fell, although it was still high by statewide standards.
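The by-week calculation behind that graph can be replicated with a short pandas sketch.  The toy frame and its field names (`request_date`, `returned`) are made up for illustration; the SBOE's actual absentee file uses different column names.

```python
# Minimal sketch of the nonreturn-rate-by-week-of-request calculation,
# using a toy absentee-request file. Column names are illustrative.
import pandas as pd

requests = pd.DataFrame({
    "request_date": pd.to_datetime(["2018-07-30", "2018-08-01",
                                    "2018-08-02", "2018-09-10"]),
    "returned": [True, False, False, True],
})

# Week of the year when each ballot was requested.
requests["week"] = requests["request_date"].dt.isocalendar().week

# Share of each week's requests that were never returned for counting.
nonreturn_by_week = 1 - requests.groupby("week")["returned"].mean()
```

Plotting `nonreturn_by_week` per county against the statewide series is essentially what the accompanying graph does.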

I’ve also shown the Bladen and Anson nonreturn rates by week.  We don’t see the same patterns in these counties that we see in Robeson.

Some concluding remarks

The purpose of this post has been to show the reader the type of numerical exploration one can engage in, using data provided by the North Carolina elections board on its incredible data page.  The analysis seems to confirm the suspicion that “something’s going on” with absentee ballots in the 9th district, but it also suggests complications that aren’t always clear from news accounts.  It seems quite likely that the campaigns — or individuals acting to support them — targeted absentee ballots in some counties, and not just in the 9th district.  (I have generated similar graphs to the ones shown here for the 2016 election, and there are some stories to be told…)  Whether this was just a small bit of tactical political warfare or something more nefarious, we’ll have to wait to see.

 

Confidence in Election Cyber-Preparedness Sees Post-Election Improvements

Pre-election worries about the conduct of the 2018 election centered on the threats of cyber-attacks on election systems from abroad and hacking of voting machines from, well, everywhere.  Although the election produced the usual supply of stories that raised concerns about election administration overall, there was no verified successful attack on computerized election equipment this year.  The question this raises is whether this news percolated down to the general public.

Based on public opinion research I conducted after the election, it seems that it did.

However, the public was already becoming more optimistic about cyber-preparation before the election.  Last June, 53% of the public stated they were either very or somewhat confident that local election officials had “taken adequate measures to guard against voting being interfered with this November, due to computer hacking.” By October, this proportion had risen to 62%.  Immediately after the election, 68% of the public stated they were either very or somewhat confident that local officials had, in fact, taken adequate measures to guard against computer hacking in the election.

Not surprisingly, both before and after the election, attitudes about election cyber-preparation were structured along partisan lines. Republicans were more confident than Democrats in June (66% vs. 51%), October (80% vs. 60%), and November (79% vs. 71%).

What is probably more interesting is that attitudes about cyber-preparation also varied by respondent education and attention to the news.  As we will see, the pattern of responses by education was especially interesting.

The data in this post were taken from three surveys I conducted during June 7-11 and October 5-7, before the election, and during November 7-9, after the election.  In each case, I interviewed 1,000 adults as a part of the YouGov Omnibus survey.  The research was supported by a grant from NEO Philanthropy, which bears no responsibility for the results or analysis.

Partisan attitudes

Unsurprisingly, attitudes about election administration have been structured around partisanship for many years.  In the case of attitudes about cyber preparations, in June Republicans were 14 points more likely than Democrats to agree that local officials were taking adequate precautions against computer hacking in the upcoming election.  By October, that gap had opened up a bit to 17 points, although both Democrats and Republicans had become more confident across those four months.

Experience from the election did not change how Republicans viewed cyber preparations, but it did alter the views of Democrats quite a bit.  Republicans were still more sanguine, but the gap between Democratic and Republican attitudes had been cut in half.

Respondents who were neither Democrats nor Republicans — which includes both “pure” independents (about 17% of respondents) and minor-party identifiers (6%) — were much, much less likely to express confidence in preparations against computer hacking across all three surveys.  They were also immune to changing opinions across the five months.

Interest in the news

The fact that partisans of all stripes became more confident in the preparations of local election officials to handle computer security suggests there were other factors that led Americans to change their attitudes about cyber preparations. What might these be?  A couple come immediately to mind.  The first is attention to the news.  The second is education.

The YouGov omnibus has a question intended to measure how closely respondents pay attention to the news and public affairs: “Would you say you follow what’s going on in government and public affairs … most of the time/ some of the time/ only now and then/ hardly at all.”

Throughout the past five months, the respondents who were the most confident that local officials had taken adequate precautions against election hacking were also the most likely to follow what’s going on in government.  Right after the election, 78% of those who followed public affairs “most of the time” had confidence in these preparations, compared to 69% of those who followed public affairs “some of the time” or “now and then.”  Among those who followed public affairs “hardly at all” or who didn’t know, only 34% were confident.

In addition, respondents at all levels of attention to public affairs increased their confidence in the adequacy of computer-hacking preparations over the three surveys.

The fact that high-information respondents — political junkies — have consistently expressed the greatest confidence in the adequacy of the response to potential election cyber attacks is interesting, considering the amount of negative press that election officials received before the election about their security preparations for the 2018 election.  This finding suggests that the negative tone of many of these articles did not sink into the consciousness of all readers.  Or, it could suggest that high-information respondents are already more likely to trust election officials as a general matter anyway.

Education

The correlations between educational attainment and attitudes about cyber preparations are probably the most interesting in the surveys.  All educational groups became more confident over time in the degree of preparations to counter hacking the election.  However, one group stands out in how this correlation changed — those with postgraduate degrees.

Back in June, when the question about cyber preparation was first asked, respondents with postgraduate degrees were by far the most skeptical.  Only 43% of postgraduates had confidence in the level of preparation, compared to 54% of all other respondents.

As summer turned to fall, all groups, with the possible exception of those with no more than a high school education, became more confident, but the biggest movement came from those with postgraduate degrees.  Finally, in the month that bracketed the election, all educational groups became more confident, but the increase in confidence among postgraduate degree-holders is especially striking.

Opinions and election machines

Finally, one of the major topics in the election security realm was the fact that about 20% of voters, including all in-person voters in five states, continued to cast ballots on paperless voting machines (direct-recording electronic machines, or DREs).  The past couple of years have seen a relentless attack on these machines by reform groups and expert bodies (including one I served on), and so it would be natural to see if voters from states with a high degree of DRE usage had a lower opinion about hacking preparations at the state and local level.

It is notable that in the five states that rely entirely on DREs without a voter-verifiable paper audit trail (Delaware, Georgia, Louisiana, New Jersey, and South Carolina), a majority of respondents were not confident in computer hacking preparations in the summer.  In June, 42% of respondents from these five states expressed confidence, compared to 54% of respondents from all other states.  By October, these numbers had tightened up, to a mere 58%/63% differential.  Finally, in the November poll, 67% of respondents from the all-DRE states were confident in their states’ preparations to combat computer hacking, compared to 69% of respondents in the non-DRE states.

The number of voters in the surveys from the DRE states is relatively small (only about 90), so I would not bet too much on this analysis.  However, as I have written before (see this link, for instance), until recently, voters in all-DRE states have been quite confident in the voting machines they use.  The fact that respondents in these states may have been less confident in overall computer hacking preparations during the summer may be further evidence of the gradual erosion of confidence in these machines, where they are being used.  Still, we don’t see evidence here of those voters being more worried about whether their states are adequately pushing back against the dangers of hacking elections.

Conclusion

Computer security is a new topic in the area of election administration for most of the public.  It is unsurprising, therefore, that attitudes are fluid.  Like other election-administration attitudes, they are amenable to being viewed through a partisan lens.  But, because the issue is so new, attitudes about hacking are also amenable to being changed by unfolding events.  No verifiable computer attacks on voting machines were reported in 2018, and some of the public picked up on it.  Whether this positive state of affairs remains unchanged is, of course, subject to the unfolding of history.  It will be interesting to see what happens, as we move into the 2020 election season, and the outcome of the election (and thus the threat environment) moves to a different level.

On the recounts: Let’s get it right

Why don’t we immediately know the results of American elections right after polls close on election night?

The answer is simple. American elections are highly decentralized, and highly complex. The laws, procedures, and technologies used for our elections are not designed to produce quick results. Rather the way we administer elections in America requires patience, as we want to get the numbers right, not rely on guesswork.

In America we pride ourselves on our federalist system. One important principle of our democracy is that states retain many rights under the U.S. Constitution, and one of the most important of those rights is running elections. States have wide authority to determine the conduct of their elections, and that’s one reason that we see such vast differences in how elections are run in America.

But the decentralization goes further, because in most states elections are largely run by counties or even municipalities. This means that we don’t have a single federal election, nor do we have fifty elections in the states. Rather we have thousands of elections in the November of each even-numbered year, with very different procedures and technologies.

The reality of this extreme decentralization of election administration in America, which is nearly unique in the world, is that we have to rely on under-resourced local governments to run elections with integrity. That’s a big ask, because elections are complex administrative tasks.

At Caltech, we’ve been working in collaboration with the Orange County Registrar of Voters here in Southern California, and studying various methods to help audit their election administration practices. When you look under the hood, and see exactly how elections are administered in Orange County, you see quickly how complicated it is.

In the elections this fall, Orange County had over 1500 polling locations, and had to recruit thousands of poll workers to service the polling locations. They have about 1.5 million registered voters, with at least 8,000 of them living abroad or serving in the military. 1.1 million ballots were sent to voters in the mail before the election.

Our research group spent time observing voting in five of Orange County’s early voting centers, and in 35 polling places on Election Day. Seeing how poll workers do their jobs, how the technology works, and witnessing voter experiences directly, is an invaluable experience. We observed just how diligent polling place inspectors and clerks were about trying to provide a good experience for voters.

But we also saw how complicated the process is for poll workers, and saw first-hand why it takes so long for final election results to be tabulated and certified in places like Orange County.

In every Election Day polling place we visited, we saw many voters bringing in their completed and sealed mail ballots, depositing them in the ballot box. Many voters who had received a by-mail ballot brought them along, and surrendered them at the polling place, preferring to vote at the polling place instead. Some of the by-mail voters forgot to bring their ballots to surrender, and others could not be found in the registration books, leading many voters to cast provisional ballots.

All of these ballots have to be confirmed and reconciled after the polls close on Election Day. Despite what people may claim, election officials count every valid ballot — but they must first determine which ballots are valid, and they need to reconcile the vast array of ballots coming from different sources: from in-person early voting, absentee ballots sent by mail, ballots from overseas voters and military personnel, Election Day ballots, provisionals, and mail ballots dropped off on Election Day.

Keep in mind that this process happens in every election jurisdiction in America. The exact procedures and voting technologies used differ across states and counties, but every one of those jurisdictions is doing this very process to come up with a final and accurate tally of all valid votes that were cast in this midterm election. Some jurisdictions do it quickly, others will be slower, but in every single election jurisdiction in America, it takes time to count all of the votes.

This process isn’t pretty to watch, but it’s vital for the health of our democracy. And this process just takes time, because election officials want to get the most accurate count of the vote as is possible.

Not having final election results just after the polls close is not an indication of fraud, or any necessary indication that there was something wrong with the election. Instead, the delay in reporting final results is generally a good thing, as it means that election officials are working hard to make sure that all valid votes are included in the final tabulation.

So why don’t we have final results in many places, a week after the election? Because American elections are decentralized, and complex. Election officials are working to get the results right. We need to give them the time to do that, free from political pressure.

My advice?

Be patient, let the process continue, and make sure that every valid vote cast in the midterm election is counted.

The close gubernatorial election in Georgia: monitoring public opinion about the administration of the election

By Nicholas Adams-Cohen

This is a guest essay, written by Nicholas Adams-Cohen, a Ph.D. student at Caltech, who is working on the Monitoring the Election project.

Nearly half of the American public turned out to vote on November 6, 2018, casting more ballots than in any midterm election in the last 50 years. As is often the case in a closely contested election, concerns about voter fraud and suppression were broadcast by various media institutions, with journalists and pundits concerned about the ways the democratic process might have been compromised. What if there were a way to detect problem areas in real time, gauging how voters react to problems in the voting process as incidents occur? Detecting these issues early might allow us to troubleshoot areas where voting procedures break down, ultimately improving the democratic process.

With these goals in mind, the California Institute of Technology’s “Monitoring the Election” project has built a social media election monitor aimed at pinpointing problem areas through social media discussions. If we can determine how the intensity of discussions about various instances of voter fraud correlates with the severity of issues in the voting process, it becomes possible to detect and address voting issues as they occur.

Historically, if social scientists wanted to study whether or not voters had concerns about the voting process, they might rely on voter satisfaction surveys. While useful, survey methods suffer from numerous issues, including non-response biases that are increasingly difficult to correct and a lag between when citizens vote and when they eventually fill out a survey. Our method instead tracks social media streams, specifically Twitter, to discover when, who, and how voters discuss problems in real-time. By collecting all messages mentioning keywords related to potential problems in the voting process, we can extract a signal about where and when the voting process breaks down.

This monitor ran throughout the November 6, 2018 election, and with the data we collected we can analyze how conversations concerning voter fraud evolved throughout this historic midterm. One of the most insightful ways we can use these data is by determining which areas of the United States faced the most criticisms about voter fraud and suppression. To that end, we used various natural language processing methodologies to determine which messages about fraud and suppression were directed at specific states. The results of this analysis are shown in the following map, where we use a gradient to highlight how many messages about voter fraud mention a specific state. As shown in the plot below, which charts the number of tweets, we find an unusually high number of messages concerned with Georgia, where the Governor’s race between Brian Kemp and Stacey Abrams was inundated with concerns about voter suppression. For examples of news reports, you can see the articles here and here.
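The state-attribution step can be illustrated with a toy keyword tally. The project's actual pipeline uses more sophisticated natural language processing; the state terms, hashtags, and sample messages below are invented for illustration only.

```python
# Toy sketch of attributing fraud/suppression messages to states and
# tallying mentions. Terms and example tweets are made up.
from collections import Counter

STATE_TERMS = {
    "Georgia": ["georgia", "#gapol"],
    "Florida": ["florida", "#flpol"],
}

def states_mentioned(text):
    """Return the set of states whose terms appear in the message."""
    lowered = text.lower()
    return {state for state, terms in STATE_TERMS.items()
            if any(term in lowered for term in terms)}

tweets = [
    "Long lines and broken machines in Georgia today #gapol",
    "Voter suppression reports out of Georgia are alarming",
    "Florida recount chatter already starting",
]

# Per-state message counts; these are what feed a choropleth gradient.
counts = Counter(state for t in tweets for state in states_mentioned(t))
```

Counts like these, aggregated in time windows, are what a map gradient or a time-series plot would display.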

As shown in the line plot below, which plots the number of tweets concerned with voter suppression in Georgia over time, our monitor detected a potential issue with Georgia as early as 12pm PST, before many media groups had widely broadcast these concerns.

As voters become more vocal about the electoral process on social media platforms, these maps and monitors can serve as important and powerful diagnostic tools, helping officials solve problems and citizens discover disturbances in the voting process. Ultimately, we hope to continue developing tools that provide transparency, increase efficiency, and help us understand the American electoral process.

A High-Intensity Midterm Election: Lessons

Yesterday’s midterm elections across the U.S. were intense. There were highly contested gubernatorial, U.S. Senate, and U.S. House elections, across the country. While final results on voter turnout, and the exact outcome of many of the contested races, will take days or weeks to determine, the good news is that despite the pressure that was put on U.S. election infrastructure yesterday, in general the elections went smoothly.

Keep in mind that before Tuesday, there were concerns about potential attempts to infiltrate the infrastructure of U.S. elections. At this point there’s no evidence of any successful hacks. And as we move into post-election ballot tabulation and reconciliation, we’ll be paying close attention and continue to monitor the integrity of the midterm elections.

And our electoral infrastructure was under pressure yesterday. We will be working to put together data from our OC election integrity pilot project, in particular, documenting the observations from our election-day monitoring, from our Twitter monitor, and the various auditing and forensics analyses we will be doing in coming weeks. All of these will be summarized on the general election dashboard for our project, and we’ll also be pushing out notifications via social media.

So stay tuned.

Boom or Bust in 2018 House Election

Every indication suggests that the 2018 midterm election will come in as expected from longstanding political models:  seat losses in the House for the president’s party in the midterm, like usual, and a standoff in the Senate, which is also to be expected, given the specific configuration of the president’s party in 2012 and 2018.  (On this latter point, see my post from yesterday by clicking here.)

Despite the fact that the auguries are pointing toward a Democratic pick-up in the House, fretting is beginning to emerge over whether the pick-up might evaporate or, at the very least, may not be big enough to give the House Democrats the freedom they would like to dominate business in the House.  While the former is highly unlikely, the latter does have some basis in the facts about the marginal House seats in 2018 — that is, the seats on which control of the House will turn in this election.

To appreciate the situation, we first need to return to the election of 2016 and the distribution of returns from the House election.  The accompanying graph shows the percentage of the two-party vote received by the Republican candidate in each district. (Click on the graph to enlargify.)

The dashed line shows the location of the median district — the 218th from the left, or the district that would flip the House to Democratic control if we added the same percentage of Democratic votes to each district.  That district  (NC-2) had a Republican two-party vote share of 56.7% in 2016.  Thus, if we were to shift the entire distribution to the left by 6.7 points, we would get a majority of Democratic seats.

Note, however, that the median district is located right as the fat part of the two-party vote-share distribution begins for Republicans.  This means that if the shift in vote share from 2016 is just slightly less than 6.7 points, it won’t make much of a difference in the party distribution of the House — other than the fact that Republicans still control it — but if we shift it slightly more than 6.7 points, it makes a huge difference.  If, for instance, the shift is a point greater, at 7.7 points, Democrats control the House with 14 seats to spare; at a shift of 8.7 points, Democrats control the House with 25 seats to spare.
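The mechanics of this uniform-swing calculation can be sketched in a few lines. The district shares below are made-up placeholders (the post's actual numbers come from the 2016 two-party House returns), constructed to put the median district at 56.7% and to bunch Republican districts just above it, so the knife-edge behavior around the median is visible.

```python
# Uniform-swing sketch with illustrative (made-up) district shares.

def democratic_seats(rep_shares, swing):
    """Count Democratic seats after shifting every district's Republican
    two-party vote share down by `swing` percentage points.
    A district at exactly 50.0 after the shift is a tie, not a Dem win."""
    return sum(1 for share in rep_shares if share - swing < 50.0)

# Toy distribution of 435 districts: 217 safe Democratic seats, then
# Republican seats starting at the 56.7% median and climbing from there.
rep_shares = [42.0] * 217 + [56.7 + 0.25 * i for i in range(218)]

for swing in (6.7, 6.8, 7.7, 8.7):
    print(swing, democratic_seats(rep_shares, swing))
```

In this toy distribution a 6.7-point swing lands the median district at exactly 50.0, and each extra fraction of a point past that flips several more seats at once — the same "small shift, big seat difference" dynamic the real 2016 distribution produces around the pivotal districts.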

As of right now, the FiveThirtyEight models are consistent with a shift of about 8.5 points compared to 2016.  That’s consistent with a healthy Democratic majority, but also notice that because of the distribution of partisan support in the pivotal districts, it’s possible for the actual outcome to significantly over- or under-shoot that mark.  It’s for that reason that the Democrats’ fortunes are in “boom or bust” territory:  If they come in slightly ahead of expectations on the popular vote, they will have a healthy majority to control the chamber with.  If they come in slightly behind expectations, controlling the chamber will be very, very difficult, from a practical perspective.

Caveats and conclusions

The analysis I just performed is a simplistic version of “uniform swing analysis,” which has been around in political science for a century.  The advantage of uniform swing analysis is that it gives us intuitions about how more sophisticated modelling techniques work.  Without reference to the 2016 two-party vote distribution, for instance, it is not necessarily clear why the various modelers are hedging their predictions a bit.  All models are uncertain, of course, but 2018 is especially uncertain because of how partisan support arrays itself among the pivotal House districts.

On the whole, new- and old-school models of midterm elections are pointing to a Democratic pick-up of seats in the House.  The degree of that pick-up is hard to nail down at this point, mainly because of the districts that are in play.

Following the 2018 midterms on Twitter

As part of our election integrity study in Orange County (CA) we are tracking what people are saying on Twitter about the 2018 midterm elections.

We are summarizing Twitter discussion about the midterm elections on a number of topics: tweets about Election Day Voting, Remote Voting, Voter Fraud, Voter ID, and Polling Places.

If you are interested in following the online conversation hourly or daily, the dashboard is live. There’s also a series of maps where we display the Twitter conversation about the administration of the 2018 midterms by state, for Tweets that we can geocode.

We have a Caltech/MIT Voting Technology Project working paper that describes the general approach to how we collect, process, and categorize these Tweets.

Election Fundamentals in 2018

The modelers at FiveThirtyEight have made a compelling case that we should expect Republicans to pick up a seat or two in the upcoming U.S. Senate election.  The purpose of this post is to show that this is essentially the same prediction we would have made two years ago, once we knew a Republican would be president at the midterm.

Before launching in, I must do my political science duty by recommending a symposium on election forecasting that appeared in  the October edition of PS: Political Science and Politics.  You can access that symposium by clicking here.

In the interest of brevity, I am leaving aside the intellectual justifications for the two simple predictive models I will use here.  The first model, the presidential partisanship model, predicts the net change in seats experienced by the president’s party at midterm by taking into account (1) the party of the president who won when the current class of senators was last elected and (2) the party of the president at midterm.  The second model, the seats-at-risk model, substitutes the number of seats held by the incumbent president’s party for the party of the president who won the last time this class of senators were up for election.

Presidential partisanship model

The presidential partisanship model focuses on the role of the president in driving outcomes of national elections.  It is obvious that we would take into account the party of the incumbent president in predicting the outcome of a midterm Senate election, because midterm elections are always, in part, a referendum on the incumbent’s performance.  We take into account the party of the previous president because the class of senators running for reelection in a midterm were last elected when the previous president was on the ballot.

For 2018, Republican Senate candidates are disadvantaged by the fact that the incumbent president is a Republican.  This would be true if the Republican were named Donald Trump or John Kasich.  Since 1946, Republicans have lost an average of 2.9 seats in the Senate when the president at midterm has been a Republican, compared to gaining 4.4 seats under Democratic presidents.

At the same time, Republican Senate candidates in 2018 are helped by the fact that the class of senators up in 2018 was last elected in 2012, which was a moderately good Democratic year — Barack Obama was elected president, Democrats picked up a net of eight seats in the House, and picked up two seats in the Senate.  Since 1946, Republicans have gained an average of 3.6 seats in the Senate when the previous president was a Democrat, compared to losing 2.0 seats when the previous president was a Republican.

We can put these two factors together.  The accompanying table shows the average change in Republican Senate seats since 1946, based on the party of the current and previous president.  The cell colored yellow is the one relevant to 2018 — a Republican incumbent and a Democratic previous president.  Note that the average change in Republican seats under these circumstances has been half a seat, which is essentially the same as FiveThirtyEight’s prediction of 0.8 as of this morning (Sunday before Election Day).
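Computing a table like this amounts to grouping midterm seat changes by the two party variables and averaging each cell. The sketch below uses placeholder seat-change values (not the actual 1946–2014 series), chosen so that the (Republican incumbent, Democratic predecessor) cell echoes the post's half-seat figure.

```python
# Sketch of the presidential partisanship table: average net change in
# Republican Senate seats, grouped by the party of the current and
# previous president. Values are illustrative placeholders.
from collections import defaultdict
from statistics import mean

# (current president's party, previous president's party, net GOP seat change)
midterms = [
    ("R", "D", +1), ("R", "R", -4), ("D", "D", +5), ("D", "R", +3),
    ("R", "D",  0), ("D", "R", +6), ("R", "R", -2), ("D", "D", +4),
]

cells = defaultdict(list)
for current, previous, change in midterms:
    cells[(current, previous)].append(change)

table = {cell: mean(changes) for cell, changes in cells.items()}

# The cell relevant to 2018: GOP incumbent, Democratic predecessor
print(table[("R", "D")])  # 0.5
```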

Seats-at-risk model

The seats-at-risk model can be thought of as modifying the presidential partisanship model in one important way.  Rather than just noting the partisanship of the previous president, we can note how much of a boost to that president’s party was experienced in the senatorial election.  It is reasonable to expect that Senate candidates swept into office on the coattails of a presidential candidate will do worse the next time the president is not on the ballot.  If the president has long senatorial coattails, that means the number of vulnerable Senate seats will be greater six years later (without the same president on the ballot)  than if the coattails were short.

The numbers bear this out.  Since 1946, an average of 14.7 Republican seats have been “at risk” in each midterm Senate election.  In elections with more than 14 seats at risk, Republicans have lost an average of 1.7 seats; with fewer than 14 seats at risk, they have gained an average of 3.9 seats.  Not surprisingly, controlling for seats at risk, Republicans have done better when the incumbent president was a Democrat than when he was a Republican.

One way to illustrate this is in the accompanying figure.  The figure is a scatterplot that shows the net change in Republican seats plotted against the number of Republicans up for re-election.  Red circles are midterms with Republican incumbents; blue circles have Democratic incumbents.  The two lines are simply the result of fitting a linear regression through the data, with a dummy variable indicating whether the incumbent president is a Republican.
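The regression behind those two lines is straightforward to set up: regress the net change in Republican seats on the number of GOP seats at risk plus a dummy for whether the incumbent president is a Republican. The data below are illustrative placeholders, not the real postwar series, so only the signs of the fitted coefficients are meaningful here.

```python
# Sketch of the figure's regression with a Republican-incumbent dummy.
# Data are made-up placeholders standing in for the postwar midterms.
import numpy as np

seats_at_risk = np.array([8, 10, 13, 15, 17, 19, 12, 20])
rep_incumbent = np.array([1, 0, 1, 0, 1, 0, 0, 1])  # 1 = GOP president
seat_change   = np.array([2, 6, -1, 1, -4, -2, 4, -7])

# Design matrix: intercept, seats at risk, incumbency dummy
X = np.column_stack([np.ones_like(seats_at_risk), seats_at_risk, rep_incumbent])
coef, *_ = np.linalg.lstsq(X, seat_change, rcond=None)
b0, b1, b2 = coef
print(f"intercept={b0:.2f}, slope={b1:.2f}, GOP-incumbent shift={b2:.2f}")

# Point prediction for 2018: eight GOP seats up, Republican incumbent
pred_2018 = b0 + b1 * 8 + b2 * 1
```

With the real data, reading the fitted line for Republican incumbents at eight seats at risk gives the 0.8-seat point prediction discussed in the post.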

This graph illustrates the two major features of the seats-at-risk model.  First, fewer Republicans up for re-election are correlated with more Republican gains in the Senate.  Second, Republican presidents at midterm are associated with smaller gains/bigger losses.

On the x-axis I have indicated the number of Republican seats up for reelection in 2018, eight.  Note that the point prediction of the change in Republican seats in 2018 is a pick-up of 0.8, precisely what FiveThirtyEight is predicting today.

Caveats and conclusions

The point of this posting has been to provide a bit of historical context to the most likely outcome of the upcoming Senate election — Republicans might pick up a seat or two.  These models — and the much more sophisticated ones that one can read in the political science literature — don’t need to know anything about the factors that are currently the subject of so much discussion, such as the unpopularity of the president, political polarization, the mobilization of the resistance, and the counter-mobilization of the President’s base.

There are two things that this posting is not.  First, it is not a dig at more sophisticated models, such as one finds in the political science literature or on websites such as FiveThirtyEight.  In fact, it’s just the opposite.  The value of these more sophisticated models is that they allow us to probe generic “fundamental” expectations in more depth.

Second, this posting is not an effort to argue that campaigns don’t matter, or that current political activism doesn’t matter.  Yes, as I’ve noted, it’s possible to generate plausible predictions about the outcome of the 2018 Senate election without any reference to any “real world” politics.  But, it’s also important to note that these simple models work because they are characterizing a political system that is in a type of equilibrium, such that when one set of conditions is met — for instance, a Republican incumbent is in place at midterm following a Democratic president — the political environment shifts in predictable ways.  Those working pieces are difficult, if not impossible, to model with a high degree of confidence.  That’s why we work with the simpler models.

We won’t know whether these predictions work out until all the votes are counted, which won’t be until the days and weeks following Election Day.  We can be certain that the actual results will deviate from the predictions, at least somewhat.  But, I’m also feeling confident that the analytical tools at our disposal will help us to make sense of what can sometimes seem like chaos.

OCRV project gearing up for the general election

Our Orange County election integrity project is gearing up for the general election.

At this point, we are tracking by-mail ballots; the most recent data on ballots mailed and ballots returned are on the general election dashboard, at “Vote By Mail Return.”

We are also monitoring a number of different conversations about the elections on Twitter; you can see what that conversation looks like at the “National Twitter Monitor.” We are currently seeing a lot of Twitter conversation about Election Day voting and about remote voting (early voting and voting by mail).

Finally, we have recently posted a summary report that presents the results from our voter registration auditing collaboration with OCRV. The summary report can be found on the “Voter Registration Database Auditing” tab, on the general election dashboard.

We will continue to update the dashboard over the next few weeks!