Benford's Law and Statistical Sleuthing on the Iran Election - The Numbers Guy

My print column this week examines the use of statistical techniques to search for election fraud. These techniques have gotten a workout on the contested election in Iran, but have also been used in prior races and likely will be used in the future, as vote counts get posted quickly to the Internet and blogs and electronic journals allow for quick publishing.

One of these tools, Benford’s Law, is more commonly associated with accounting. Mark Nigrini, a professor of accounting at the College of New Jersey and popularizer of the law, says he has consulted with companies that have used it successfully. One example: It found too many 4s as leading digits in one employee’s expense accounting, which was caused by profligate expensing of daily Starbucks purchases. Of course, the law couldn’t account for whether that was an authorized repeat expense. (It wasn’t.)
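For readers who want to see the arithmetic, here is a minimal Python sketch of a first-digit check like the one described for expense accounts. Benford’s Law predicts that a leading digit d shows up with probability log10(1 + 1/d), so 1 leads about 30% of the time and 4 a bit under 10%. The function names and the chi-square comparison are illustrative assumptions, not Nigrini’s actual procedure.

```python
import math
from collections import Counter


def benford_first_digit_probs():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(amount):
    """First significant digit of a nonzero number (4.35 -> 4, 0.045 -> 4)."""
    # Scientific notation always starts with the first significant digit.
    return int(f"{abs(amount):.15e}"[0])


def first_digit_chi_square(amounts):
    """Pearson chi-square statistic: observed leading digits vs. Benford."""
    values = [a for a in amounts if a != 0]  # zero has no leading digit
    n = len(values)
    counts = Counter(leading_digit(a) for a in values)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_first_digit_probs().items()
    )
```

Feeding it a hypothetical list such as fifty identical 4.35 charges would produce a very large statistic, because nearly all of the leading digits are 4s. A large value only flags a pattern; it says nothing about whether the expense was authorized.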

Nigrini is skeptical about the application of the law to vote counts. “This really depends on how the fraud was committed,” Nigrini said. “If someone went in and rigged paper ballots, Benford’s Law will struggle to pick that up. Benford’s Law has potential if they blatantly made up the numbers.”

But why not make up numbers that conform to Benford’s Law? Aaron Clauset, a computer scientist at the Santa Fe Institute, said, “People are very strategic in their behavior, and they can use it to game the system. It sets up a coevolutionary arms race between the detector and the evader.”

Others are skeptical of the possibility of evading detection. “If people are actually cheating, they’re probably doing it in a hurry,” said Andrew Gelman, a statistician at Columbia University. “It may be more challenging to do an artificial set of votes that look realistic from all these perspectives,” added Walter Mebane, a pioneer of using Benford’s Law to test election results and a political scientist at the University of Michigan.

Mebane, a Democrat, first became interested in digging for fraud on November 8, 2000, when he recalls “walking around in a daze” after staying up late the night before and hearing Florida declared for Al Gore, then for George W. Bush, and then for neither. Later he went to Florida and, in 2003, he wrote a paper entitled, “The Wrong Man is President!” While he’s employed a variety of methods, Mebane was prompted to study the digits of election returns by the inquiry of a student in 2005.

Benford’s Law had already gotten a workout. Stephen Ansolabehere, a political scientist at Harvard, recalls using it, among other tests, in an analysis of the 2000 U.S. presidential election. And in the wake of a disputed Venezuelan recall referendum in 2004, Imre Mikoss, a physicist at Simón Bolívar University in Caracas, was suspicious of Hugo Chávez’s victory. “I saw this election and said, this is the opportunity to prove something with Benford,” Mikoss recalls. He found that leading digits of vote counts didn’t conform to Benford’s Law, but his findings were questioned by the Carter Center. “The question is open,” Mikoss says now of his analysis.

Mebane has refined his technique, and in regular updates on his analysis of Iran numbers, he is careful to note what he does and doesn’t know. He took his analysis a step further, showing that the Benford anomalies stem from ballot boxes in Iran where the proportion of invalid votes was particularly low, suggesting that if there was tampering, it happened in these stations with ballots that should have been discarded. (The Iranian consulate to the United Nations didn’t return a call seeking comment.)

His usage of Benford focused on the second digit of vote counts. Others studied the first digit, or the last digit. Mebane said he welcomed all the participation. “I think it’s great. Things move at light speed here,” Mebane said. “We’re not talking about waiting for papers to go through peer review. Nowadays it’s dueling blogs.”
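The digit tests differ mainly in which digit they look at and what pattern they expect. A rough Python sketch, under stated assumptions: the second-digit version of Benford’s Law sums log10(1 + 1/(10a + b)) over every possible first digit a, while a last-digit test simply expects each of 0 through 9 about a tenth of the time. The helper names and the chi-square statistic here are hypothetical choices for illustration, not the analysts’ actual code.

```python
import math


def benford_second_digit_probs():
    """Expected share of each second digit 0-9 under Benford's Law."""
    return {
        b: sum(math.log10(1 + 1 / (10 * a + b)) for a in range(1, 10))
        for b in range(10)
    }


def uniform_last_digit_probs():
    """Last digits of large counts are expected to be roughly uniform."""
    return {d: 0.1 for d in range(10)}


def second_digit(count):
    """Second digit of a vote count, or None if it has only one digit."""
    s = str(abs(int(count)))
    return int(s[1]) if len(s) >= 2 else None


def last_digit(count):
    """Final digit of a vote count."""
    return abs(int(count)) % 10


def chi_square(digits, expected_probs):
    """Pearson chi-square statistic for observed digits vs. expectation."""
    observed = [d for d in digits if d is not None]
    n = len(observed)
    return sum(
        (observed.count(d) - n * p) ** 2 / (n * p)
        for d, p in expected_probs.items()
    )
```

Under these formulas a second digit of 0 is expected about 12% of the time and a 9 about 8.5% of the time, a much flatter curve than the first-digit one. To test a set of returns, one would pass `[second_digit(c) for c in counts]` or `[last_digit(c) for c in counts]` to `chi_square` along with the matching expected shares.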

One risk is that what seems like an anomaly is really the result of lots of analysts running lots of tests. “If you don’t specify ahead of time what you’re looking for, you’d be surprised if you don’t find rare events,” said Steven Miller, a mathematician at Williams College.
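Miller’s point can be put in numbers with a back-of-the-envelope sketch. Assuming the tests are independent and each uses the same significance level (both simplifications, since real tests on the same returns overlap), the chance that at least one of k tests raises a false alarm is 1 - (1 - alpha)^k. The Bonferroni threshold shown is one standard, conservative remedy, offered as an illustration rather than something the analysts quoted here said they used.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests
    flags clean data purely by chance."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Stricter per-test cutoff that keeps the overall false-alarm
    rate at or below alpha."""
    return alpha / num_tests
```

With 20 independent tests at the usual 5% level, the chance of at least one spurious anomaly is about 64%. Holding each test to a cutoff of 0.25% instead would keep the overall rate at or below 5%.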

Bernd Beber, a political-science graduate student at Columbia University who, with Alexandra Scacco, studied final digits of vote counts, said the criticism that their findings were cherry-picked is wrong, because he based the chosen tests on prior work on Nigerian elections.

Arlene Ash, a research professor at Boston University’s medical school, finds all the hunting for data anomalies in Iran a bit unseemly. She said that the U.S. election system has plenty of flaws of its own. “One question that comes up rarely is whether we’re sure who won the election,” Ash said. “But something that comes up every day is where we have elections with massive amounts of problems.” Mebane added that the “worst country I’ve encountered” for getting election data is the U.S., in part because of the decentralized election system.

University of California, Berkeley, political scientist Henry E. Brady said he’d like to see some small fraction of the money spent on U.S. elections, perhaps a few million dollars a year, spent on election auditing.

What do you think? What methods are most effective for detecting election fraud? Please let me know in the comments.

Further reading: Many bloggers have crunched numbers on Iran election results. Benford’s Law has also been applied to Bernie Madoff and football. The New York Times spotlighted Benford’s Law in 1998, and USA Today did the same in 2000. One recent study analyzed which numbers people pick when falsifying data.