First off, so that you don’t have to make assumptions or bring implicit bias, some facts: I’m an independent who voted for Bush two; then Obama, barely the first time (Sarah Palin swayed me) and enthusiastically the second time. I would have voted for anyone besides Trump in 2016, and this year I remain hopeful about the prospect of Biden doing a better job on COVID and having at least a chance of getting the country on a path to being united. You can love me for this or hate me for this or anything in between. That’s up to you! Either way, the rest of this post is going to be about why the polls were so wrong and what we can learn from it.
About a week before election night, I thought the pollsters were, if anything, going to over-correct from their massive error in 2016. Like other people wanting Trump’s presidency to end, I took solace in the large polling leads Biden had nationally, and also in nearly all the key swing states. I kept telling myself some version of, The pollsters have a vested interest (i.e., saving their industry) in getting this right and thus have got to be extra conservative. So many reputable polls are pointing in the same direction. There is a real pattern here. Obviously, there was some motivated reasoning on my part, since I wanted to hear that Trump was losing. But even after discounting for that, it still seemed highly likely Biden would win by a pretty good margin.
Then, a day before the election, I read a sound piece by Zeynep Tufekci (if you don’t follow her work, you should; she’s great) arguing that we should stop paying any attention to election projections. In short, she wrote the following:
- The earliest reliable data set for presidential election polling is from 1972. This means the sample size is 12. No real “science” has a sample size of 12.
- Even worse, the sample size since the proliferation of social media is 3.
- On top of that, the sample size in a pandemic when states have been in and out of various forms of economic restrictions is 0.
I agreed with Tufekci and, as a result of her piece, lost some confidence in Trump getting routed. But even so, I told myself, the above arguments relate to projections, not polls. Polls are just raw data.
As of this writing, Wednesday at around 3 PM EST, it looks pretty clear I am going to be wrong again. The national polls are way off. Many of the swing-state polls are way off too. Here’s what I came up with for why, and what I guess I couldn’t (or didn’t want to) see before:
- Polls by people and teams with good methodologies that all point in the same direction are probably an accurate measurement of what they are measuring.
- But, and this is a big but, what they are measuring is not the electorate (i.e., the average voter). What they are actually measuring is the average person who picks up a phone call from a random number and talks to a pollster. That is a very different data set from America.
- There may be some real diversity in the polls with regard to age, gender, race, geography, education level, etc. But there is no diversity in terms of people who answer random phone calls and talk to pollsters versus those who do not. By the nature of taking a poll, everyone in the data set falls into the former group.
It’s kind of like taking a bunch of runners and using push-ups to predict how good a runner each one is, and then extrapolating out to all runners. You are only getting the runners who want to do push-ups and want to have you watch. That’s not all runners. It’s a small minority. And of course all the patterns would be the same, because your sample population is alike in that they all want to do push-ups for you!
What the polls told us is that, among people who answer the phone and talk to pollsters, Biden had a commanding lead. And that’s all they told us.
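For anyone who wants to see this argument with numbers, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration, not real polling data: an electorate split exactly 50/50, supporters of a hypothetical candidate A who pick up the phone 8% of the time, and supporters of a hypothetical candidate B who pick up 6% of the time. The function names are my own.

```python
import random


def expected_polled_share(true_support, response_rate_a, response_rate_b):
    """Share of candidate A among respondents, computed directly.

    All inputs are hypothetical: true_support is A's real share of voters,
    and the response rates are how often each side answers a pollster.
    """
    answering_a = true_support * response_rate_a
    answering_b = (1 - true_support) * response_rate_b
    return answering_a / (answering_a + answering_b)


def simulate_poll(n_calls, true_support, response_rate_a, response_rate_b, seed=0):
    """Call n_calls random voters and count only the ones who answer."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    answered_a = 0
    answered_b = 0
    for _ in range(n_calls):
        supports_a = rng.random() < true_support
        rate = response_rate_a if supports_a else response_rate_b
        if rng.random() < rate:  # this voter picked up and talked
            if supports_a:
                answered_a += 1
            else:
                answered_b += 1
    respondents = answered_a + answered_b
    return answered_a / respondents, respondents


# Made-up scenario: a dead-even race with unequal willingness to respond.
analytic_share = expected_polled_share(0.50, 0.08, 0.06)
polled_share, n_respondents = simulate_poll(200_000, 0.50, 0.08, 0.06)
print(f"true support for A: 50.0%")
print(f"expected share among respondents: {analytic_share:.1%}")
print(f"simulated share among {n_respondents} respondents: {polled_share:.1%}")
```

Under these made-up numbers, a tied race shows up in the poll as roughly 57 to 43. Making more calls does not help; it only makes the wrong answer more precise, because every additional respondent still comes from the group that answers the phone.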
The small lesson is to stop paying attention to election polls.
The big lesson is that being able to measure something does not make it meaningful. And not everything meaningful can be measured. This year’s election serves as a stark reminder.