The classic polling error comes from the 1948 presidential election in the U.S. On election night, the Chicago Tribune printed the headline DEWEY DEFEATS TRUMAN, which turned out to be mistaken. The reason the Tribune was mistaken is that its editor trusted the results of a phone survey. Survey research was then in its infancy, and few academics realized that a sample of telephone users was not representative of the general population. Telephones were not yet widespread, and those who had them tended to be prosperous and have stable addresses.
Opinion polls or surveys are very useful tools in human rights measurement. We can use them to measure public opinion on certain human rights violations, such as torture or gender discrimination. High levels of public approval of such rights violations may make them more common and more difficult to stop. And surveys can measure what governments don’t want to measure. Since we can’t trust oppressive governments to give accurate data on their own human rights record, surveys may fill in the blanks. Although even that won’t work if the government is so utterly totalitarian that it doesn’t allow private or international polling of its citizens, or if it has scared its citizens to such an extent that they won’t participate honestly in anonymous surveys.
But apart from physical access and respondent honesty in the most dictatorial regimes, polling in general is vulnerable to mistakes and fraud (fraud being a conscious mistake). Here’s an overview of the issues that can mess up public opinion surveys, inadvertently or not.
There’s the well-known problem of question wording, which I’ve discussed in detail before. Pollsters should avoid leading questions, questions that are put in such a way that they pressure people to give a certain answer, questions that are confusing or easily misinterpreted, wordy questions, questions using jargon, abbreviations or difficult terms, double or triple questions, etc. Also quite common are “silly questions”, questions that don’t have meaningful or clear answers: for example “is the Catholic Church a force for good in the world?” What on earth can you answer to that? It depends on what elements of the church you’re talking about, and on what circumstances, country or even historical period you’re asking about. The answer is most likely “yes and no”, and hence useless.
The importance of wording is illustrated by the often substantial effects of small modifications in survey questions. Even the replacement of a single word by another, related word, can radically change survey results: see this post for examples.
Of course, it is often claimed that biased poll questions corrupt the average survey responses, but that the overall results of the survey can still be used to learn about time trends and differences between groups. As long as you make a mistake consistently, you may still find something useful. That’s true, but it’s no reason not to take care of wording. The same trends and differences can be seen in survey results that have been produced with correctly worded questions.
Order effect or contamination effect
Answers to questions depend on the order they’re asked in, and especially on the questions that preceded. Here’s an example:
Fox News yesterday came out with a poll that suggested that just 33 percent of registered voters favor the Democrats’ health care reform package, versus 55 percent opposed. … The Fox News numbers on health care, however, have consistently been worse for Democrats than those shown by other pollsters. (source)
The problem is not the framing of the question. This was the question: “Based on what you know about the health care reform legislation being considered right now, do you favor or oppose the plan?” Nothing wrong with that.
So how can Fox News ask a seemingly unbiased question of a seemingly unbiased sample and come up with what seems to be a biased result? The answer may have to do with the questions Fox asks before the question on health care. … the health care questions weren’t asked separately. Instead, they were questions #27-35 of their larger, national poll. … And what were some of those questions? Here are a few: … Do you think President Obama apologizes too much to the rest of the world for past U.S. policies? Do you think the Obama administration is proposing more government spending than American taxpayers can afford, or not? Do you think the size of the national debt is so large it is hurting the future of the country? … These questions run the gamut from slightly leading to full-frontal Republican talking points. … A respondent who hears these questions, particularly the series of questions on the national debt, is going to be primed to react somewhat unfavorably to the mention of another big Democratic spending program like health care. And evidently, an unusually high number of them do. … when you ask biased questions first, they are infectious, potentially poisoning everything that comes below. (source)
If you want to avoid this mistake – if we can call it that (since in this case it’s quite likely to have been a “conscious mistake” aka fraud) – randomizing the question order for each respondent might help.
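Randomizing the order per respondent is simple to do in practice. Here is a minimal Python sketch of the idea; the function name `randomized_order`, the `survey_seed` parameter and the sample questions are my own illustrative assumptions, not taken from any real polling software.

```python
import random


def randomized_order(questions, respondent_id, survey_seed=0):
    """Return a shuffled copy of the questions for one respondent.

    The shuffle is seeded with the (hypothetical) survey seed and the
    respondent id, so each respondent gets a reproducible order that
    does not depend on the order used for anyone else.
    """
    rng = random.Random(f"{survey_seed}-{respondent_id}")
    shuffled = list(questions)  # copy, so the master list stays untouched
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical question list, for illustration only.
questions = [
    "Do you favor or oppose the health care plan?",
    "Is the national debt hurting the future of the country?",
    "Is the government proposing more spending than taxpayers can afford?",
]

for respondent_id in (101, 102, 103):
    print(respondent_id, randomized_order(questions, respondent_id))
```

Randomizing doesn't remove the priming effect for any single respondent. It only spreads the effect evenly over all questions, so that no single question is systematically asked after the leading ones.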
Similar to the order effect is the effect created by follow-up questions. It’s well-known that follow-up questions of the type “but what if…” or “would you change your mind if …” change the answers to the initial questions.
The Bradley effect is a theory proposed to explain observed discrepancies between voter opinion polls and election outcomes in some U.S. government elections where a white candidate and a non-white candidate run against each other.
Contrary to the wording and order effects, this isn’t an effect created – intentionally or not – by the pollster, but by the respondents. The theory proposes that some voters tend to tell pollsters that they are undecided or likely to vote for a black candidate, and yet, on election day, vote for the white opponent. It was named after Los Angeles Mayor Tom Bradley, an African-American who lost the 1982 California governor’s race despite being ahead in voter polls going into the elections.
The probable cause of this effect is the phenomenon of social desirability bias. Some white respondents may give a certain answer for fear that, by stating their true preference, they will open themselves to criticism of racial motivation. They may feel under pressure to provide a politically correct answer. The existence of the effect is, however, disputed. (Some say the election of Obama disproves the effect, thereby making another statistical mistake.)
Another effect created by the respondents rather than the pollsters is the fatigue effect. As respondents grow increasingly tired over the course of long interviews, the accuracy of their responses could decrease. They may be able to find shortcuts to shorten the interview; they may figure out a pattern (for example that only positive or only negative answers trigger follow-up questions). Or they may just give up halfway, causing incompletion bias.
However, this effect isn’t entirely due to respondents. Survey design can be at fault as well: there may be repetitive questioning (sometimes deliberately for control purposes), the survey may be too long or longer than initially promised, or the pollster may want to make his life easier and group different polls into one (which is what seems to have happened in the Fox poll mentioned above, creating an order effect – but that’s the charitable view of course). Fatigue effect may also be caused by a pollster interviewing people who don’t care much about the topic.
Ideally, the sample of people who are to be interviewed for a survey should represent a fully random subset of the entire population. That means that every person in the population should have an equal chance of being included in the sample. It also means that there shouldn’t be self-selection (a typical flaw in many if not all internet surveys of the “Polldaddy” variety) or self-deselection. Both reduce the randomness of the sample, which can be seen from the fact that self-selection leads to polarized results. The size of the sample is also important. Samples that are too small don’t so much produce biased results as imprecise ones, with wide margins of error. Conversely, a large sample doesn’t cure a biased selection method.
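What a small sample costs you can be sketched with the standard margin-of-error formula for a simple random sample, z * sqrt(p(1 - p) / n). The Python below is only an illustration under that assumption: the function name is mine, p = 0.5 is the usual worst case, and z = 1.96 corresponds to a 95% confidence level.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from a
    simple random sample of size n (z = 1.96 for roughly 95% confidence).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    return z * math.sqrt(p * (1 - p) / n)


for n in (100, 400, 1000, 2500):
    print(f"n={n:>5}: +/- {margin_of_error(n) * 100:.1f} points")
# n=  100: +/- 9.8 points
# n=  400: +/- 4.9 points
# n= 1000: +/- 3.1 points
# n= 2500: +/- 2.0 points
```

Note that the formula only holds for a genuinely random sample. A self-selected internet poll with a million respondents has a tiny computed margin of error and can still be badly off.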
Even the determination of the total population from which the sample is taken, can lead to biased results. And yes, that has to be determined… For example, do we include inmates, illegal immigrants etc. in the population? See here for some examples of the consequences of such choices.
A house effect is a systematic tendency of a particular pollster’s surveys to lean toward one party’s candidates, compared with the surveys of other pollsters; Rasmussen is known for that.
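A crude way to spot a house effect is to compare each pollster's average result with the average of all polls on the same race. The Python sketch below does just that. The pollster names and numbers are made up for illustration, and the approach is a simplification of my own: serious estimates also account for poll dates, sample sizes and the mix of races each pollster covers.

```python
from collections import defaultdict
from statistics import mean


def house_effects(polls):
    """Estimate each pollster's lean relative to the all-poll average.

    polls: list of (pollster, margin) pairs, where margin is candidate A's
    share minus candidate B's share, in percentage points.
    Returns a dict mapping pollster -> average margin minus overall average.
    """
    overall = mean(margin for _, margin in polls)
    by_pollster = defaultdict(list)
    for pollster, margin in polls:
        by_pollster[pollster].append(margin)
    return {name: mean(margins) - overall for name, margins in by_pollster.items()}


# Hypothetical polls of the same race (invented numbers).
polls = [
    ("Pollster A", 4.0), ("Pollster A", 6.0),
    ("Pollster B", -1.0), ("Pollster B", 1.0),
    ("Pollster C", 2.0), ("Pollster C", 3.0),
]

for name, lean in house_effects(polls).items():
    print(f"{name}: {lean:+.1f} points")
# Pollster A: +2.5 points
# Pollster B: -2.5 points
# Pollster C: +0.0 points
```

A consistent lean of this kind doesn't prove that a pollster is wrong, only that it differs from the pack. The pack can be wrong as well.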
I probably forgot an effect or two. Fill in the blanks if you care. Go here for other posts in this series.