Benford’s law states that in naturally occurring numbers (as opposed to made-up numbers), the leading digit is more likely to be a one than a two, more likely to be a two than a three, and so forth.
In other words, the distribution of digits in a naturally occurring set of numbers should look something like this:
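For reference, the expected share of each first digit d under Benford’s law is given by the standard formula P(d) = log10(1 + 1/d). Here is a minimal Python sketch that prints those shares. The function name is made up for illustration, and the formula covers only the first digit of a number, not digits in later positions.

```python
import math

def benford_first_digit_probs():
    """Return {digit: probability} for first digits 1-9 under Benford's law."""
    # Standard first-digit formula: P(d) = log10(1 + 1/d)
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

probs = benford_first_digit_probs()
for digit, p in probs.items():
    print(f"{digit}: {p:.1%}")
# 1: 30.1%
# 2: 17.6%
# 3: 12.5%
# 4: 9.7%
# 5: 7.9%
# 6: 6.7%
# 7: 5.8%
# 8: 5.1%
# 9: 4.6%
```

By this formula, a leading one is expected about 1.7 times as often as a leading two.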
When someone makes up a set of numbers, the result rarely follows Benford’s law. In fact, Benford’s law has been used to detect accounting fraud. It has also been used to look for election fraud. Therefore, I decided to use it to see whether there was any suspicious activity in the Pennsylvania election.
I want to warn you that I am not a mathematician. I also have a real job, so I don’t have unlimited time to comb through data. Therefore, rather than using precinct-level data, which would have given me a much larger sample size, I used county-level data. Since this shrank my sample size, rather than simply using the first two leading digits, as I would with a very large sample, I used every leading digit, up to five, excluding the final digit.
For example, Biden had 252,719 face-to-face votes in Philadelphia County on election day. I eliminated the final digit, the 9, and kept the first five digits: 2, 5, 2, 7, and 1. So, there were two 2s, one 5, one 7, and one 1 in that county-level data for Biden’s face-to-face votes. I did this for every county in the state for face-to-face (election day) votes, mail-in votes, and provisional ballots that were counted.
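As a rough illustration of the counting described above, here is a minimal Python sketch. The function names are hypothetical and nothing here is taken from the spreadsheet itself. It also assumes that zeros are tallied like any other digit and that a single-digit total contributes nothing once its final digit is dropped.

```python
from collections import Counter

def counted_digits(vote_total, max_digits=5):
    """Drop the final digit, then keep at most `max_digits` leading digits."""
    digits = str(vote_total)[:-1]   # e.g. 252719 -> "25271"
    return digits[:max_digits]      # cap at the first five digits

def tally(vote_totals, max_digits=5):
    """Count how often each digit appears across all vote totals."""
    counts = Counter()
    for total in vote_totals:
        counts.update(counted_digits(total, max_digits))
    return counts

# The Philadelphia County example from the text
example = tally([252719])
print(sorted(example.items()))
# [('1', 1), ('2', 2), ('5', 1), ('7', 1)]
```

Passing a full column of county totals to `tally` would give the digit counts for one candidate and one vote type.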
Here is what the distribution looked like for Trump:
As you can see, it does not look exactly like the expected curve, since sixes occur more often than fives, but by and large it follows the ideal pretty closely.
Here is what Biden’s looks like:
This does not look like what we would expect. Twos are actually higher than ones, even though we would expect ones to be well ahead of twos (roughly 30% versus 18% for first digits under Benford’s law).
This does not mean that there certainly was fraud. As I previously stated, I am not a mathematician, and I have a real job, so I cannot look at data all day. However, it does present a curiosity, and is something that should be explored further.
For a look at the data I used, follow this link: https://docs.google.com/spreadsheets/d/125RE761Xn0bhx6FO0PqT3rA-XtXWBbOcbpY9JJR_Bkg/edit?usp=sharing
I welcome feedback from real mathematicians.