In the EAC’s NVRA 2010 dataset, the FIPS code is stored as a numeric value, and this will cause problems with anyone who is trying to match / merge this with other data because 10 state FIPS codes start with a “0”. Most statistical programs will strip that leading zero, making it hard to identify the states properly.
For instance, lines 2-4 of the Excel file look like this:
AK | ALASKA | 200000000 | 560146 | only active voters | |
AL | AUTAUGA COUNTY | 0100100000 | 34727 | active and inactive registered voters | |
AL | BALDWIN COUNTY | 0100300000 | 114952 | active and inactive registered voters |
The third column contains the FIPS code; columns 1-2 contain the state code, columns 3-5 the county code, an additional columns refer to smaller jurisdictions (mainly townships in NE and the Midwest).
The first problem you’ll notice is that Alaska’s FIPS is already artificially shortened to “2” not “02” an users will need to fix this manually. Next, additional FIPS codes will come across a “10010000” (for example) without the leading zero.
The solution for now is a kludge; insert a dummy line into the Excel file and include a string value such as an “x” in column 3. This will trick Excel into outputting the values as text, and statistical programs will input the data as character strings. Then the process is straightforward; capture substrings (01, 02, etc) and match / merge away!