The Spectre of Math

June 30, 2010

Cheating pollsters

Filed under: Mathematics,Politics — jlebl @ 4:38 pm

A second pollster in recent history got caught cheating. And they get caught by really simple means. If you are a pollster, I would presume you know at least a little bit about statistics. You would know that making up numbers by hand will get you caught, even though it takes only a few minutes to generate numbers that look credible.

Guide to cheating pollsters:

1) Pick the numbers you want in some way, probably by looking at other pollsters and then adding or subtracting a little bit, but make sure they add up to 100. The biggest rule here is: only run a poll that other pollsters have done already. Don’t be suckered into polling something you can’t make up credible-sounding numbers for.

2) Write the few lines of code that run an experiment using your numbers as weights. If you can’t do this, drop me a line and I’ll write the code for you in a few minutes for a percentage of your ill-earned money (though you’ll have to pay me enough so that I won’t make more money turning you in to your clients).
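To show how little code step 2 actually takes, here is a minimal sketch in Python. The function name `simulate_poll`, the candidate labels, the target percentages, and the sample size of 600 are all made up for illustration; the only idea taken from the text is drawing simulated respondents with the chosen numbers as weights.

```python
import random
from collections import Counter


def simulate_poll(weights, sample_size, seed=None):
    """Draw simulated respondents using the target percentages as weights.

    Returns the resulting percentages, which carry realistic sampling
    noise instead of hand-picked digits.
    """
    rng = random.Random(seed)
    options = list(weights)
    draws = rng.choices(
        options,
        weights=[weights[option] for option in options],
        k=sample_size,
    )
    counts = Counter(draws)
    return {option: 100.0 * counts[option] / sample_size for option in options}


# Hypothetical target numbers from step 1; they add up to 100.
target = {"Candidate A": 46, "Candidate B": 44, "Undecided": 10}

result = simulate_poll(target, sample_size=600, seed=1)
for option, pct in result.items():
    print(f"{option}: {pct:.1f}%")
```

The point is that the noise now comes from a random number generator rather than from a human, so the digits presumably have the statistical properties that real samples would have.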

Now what could really blow your mind is that it is quite possible that there are many pollsters who do this. The above scheme is very hard to catch. The only way it can get caught is if there is an actual election that gives you the real numbers. Then if all the pollsters are simply copying and fudging numbers that some pollster made up at some point, it is reasonably likely that the election will be a surprise (unless voter intent is changed by the polling numbers).

So let’s assume there are cheating pollsters. Now what is the probability that a cheating pollster is a moron with no understanding of statistics? I would have thought fairly low, but let’s assume that it is 50% in the absence of further data on the stupidity of cheating pollsters. Therefore, given that there are at least 2 stupid cheating pollsters, we should expect at least 2 smarter, much harder to detect cheating pollsters.
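The arithmetic behind that estimate can be spelled out in a couple of lines. This is only a sketch of the back-of-the-envelope reasoning: the function name `expected_smart_cheaters` is made up, and it assumes the caught pollsters are exactly the stupid ones, which is why the result is a lower bound.

```python
def expected_smart_cheaters(stupid_caught, p_stupid):
    """Estimate how many smart cheaters go with the stupid ones we caught.

    Assumes a fraction p_stupid of all cheaters is stupid, so the
    expected total is stupid_caught / p_stupid and the rest are smart.
    """
    if not 0 < p_stupid <= 1:
        raise ValueError("p_stupid must be in (0, 1]")
    expected_total = stupid_caught / p_stupid
    return expected_total - stupid_caught


# The numbers from the text: 2 caught, 50% assumed to be stupid.
estimate = expected_smart_cheaters(stupid_caught=2, p_stupid=0.5)
print(estimate)  # 2.0
```

If the true fraction of stupid cheaters is lower than 50%, as I would have guessed, the same formula gives an even larger number of undetected ones.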

Furthermore, cheating is probably a fuzzy logic kind of thing. It’s not that a cheating pollster is simply a random number generator (and as we see they’re pretty bad at that too). I assume that a typical cheater actually does some polls and only cheats to cut costs. Suppose that someone wants a daily poll. You could run a poll weekly, then do some sort of interpolation and make up the daily numbers (of course the interpolation is going to be a week late, but most of these numbers do not jump quickly). You should probably factor in somebody else’s poll numbers to reduce the error. That’s doing 1/7th of the work for the money. Or you could perhaps inflate your sample by running a simulation off of your numbers and other polls. I bet the temptation to cheat must be high, since well-run cheating is hard to catch. If you are making up all your numbers, on the other hand, then sooner or later you might get caught.
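The weekly-to-daily trick is just as short. Below is a rough sketch using plain linear interpolation; the function name `interpolate_daily` and the weekly support numbers are invented for illustration, and "some sort of interpolation" could just as well mean something fancier.

```python
def interpolate_daily(weekly, days_per_week=7):
    """Fill in daily values by linear interpolation between weekly polls.

    Each week's daily numbers need the following week's poll as the
    right endpoint, which is why the output is a week late.
    """
    daily = []
    for start, end in zip(weekly, weekly[1:]):
        for day in range(days_per_week):
            t = day / days_per_week
            daily.append(start + t * (end - start))
    daily.append(weekly[-1])  # the last real poll closes the series
    return daily


# Hypothetical weekly results (percent support) from three real polls.
weekly_support = [44.0, 45.4, 44.7]

daily_support = interpolate_daily(weekly_support)
for day, value in enumerate(daily_support):
    print(f"day {day}: {value:.1f}%")
```

Three real polls turn into fifteen daily numbers. A perfectly straight line would look suspicious, so a careful cheater would presumably also add sampling noise to each interpolated value.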
