Should lies and false statements of fact be protected by free speech laws, or can the speech rights of those who intentionally lie be limited in some cases? The US Supreme Court believes the latter is true, somewhat surprisingly given the often quasi-absolutist nature of First Amendment jurisprudence in the US. In Gertz v. Robert Welch, the Court claimed that
there is no constitutional value in false statements of fact.
There are some obvious problems with this exception to free speech. First, it can’t work unless it’s possible to distinguish real lies from false statements of fact that are simple errors. This means it must be possible to determine someone’s intentions, and that’s always difficult. However, one could claim that a person’s speech rights can only be limited on account of lying when his or her intentions are clear.
That would save the exception, but it wouldn’t undo some of its harmful consequences. People who speak in good faith may still fear that their speech will unwittingly come across as false and that their good intentions won’t be absolutely clear. Hence, they may worry about running afoul of the law and limit their speech preemptively. The lies exception to freedom of speech therefore has a chilling effect, one that is amplified by the fuzzy line between facts and opinions.
Given these problems with the lies exception to free speech, how could we instead argue in favor of free speech protection for lies and knowingly false statements of fact?
One rather ironic way to do it is to appeal to the metaphor of the marketplace of ideas: free speech is necessary for the pursuit of truth (or, in a weaker form, for the improvement of the quality of our ideas). John Stuart Mill has the canonical quote:
The peculiar evil of silencing the expression of an opinion is, that it is robbing the human race; posterity as well as the existing generation; those who dissent from the opinion, still more than those who hold it. If the opinion is right, they are deprived of the opportunity of exchanging error for truth: if wrong, they lose, what is almost as great a benefit, the clearer perception and livelier impression of truth, produced by its collision with error.
As such, this doesn’t really justify protecting the expression of lies. If we need lies to see the truth more clearly, you could also say that we need evil to see the good more clearly, and I suspect few would accept the latter statement. However, if we interpret this quote liberally (pun intended), we may get somewhere. We could argue that someone’s lies can motivate others to search for, investigate and disseminate the truth. For example, I think it’s fair to say that Holocaust deniers have done a lot for Holocaust education. They have given teachers and researchers a hook.
Another reason why we wouldn’t want to prohibit lying, at least not across the board, is the fact that lies are often necessary for the protection of human rights. This is the case that’s made in jest in the cartoon on the right, and is also the origin of the rejection of Kant’s claim that we shouldn’t lie to the murderer inquiring about the location of his intended victim. (I have an older post about the usefulness of lying here).
Obviously, nothing said here implies that lying is generally beneficial or that it should be welcomed and protected whatever the circumstances. If lying becomes the norm, we will most likely lose our humanity. In the words of Montaigne, “we are men, and hold together, only by our word”, and our civilization and systems of cooperation would come crashing down if we couldn’t generally trust each other. However, the general albeit not exceptionless moral good of telling the truth doesn’t translate into a right to be told the truth or a legal duty to tell the truth (and to shut up if we can’t). Morality and human rights don’t completely overlap.
If lying were to become the normal habit, free speech would lose its meaning. We have free speech rights precisely because we want to share information, opinions and beliefs, and because we want to learn and pay attention to verbal assertions. There has to be some level of general trust that people speak their minds rather than the opposite. Otherwise it’s better if there’s no speech at all, and hence also no right to free speech. Hence, the free speech defense of lying has to be limited somewhere.
That is why, despite the fact that in general there shouldn’t be a right to be told the truth or a legal duty to tell the truth, we do want some cases in which there is such a right and such a duty. Lying is legitimately prohibited in the case of libel, of witnesses testifying under oath, of someone impersonating a doctor etc. But those are cases of different rights having to be balanced against each other: the free speech rights of the liars against the rights of those suffering harmful consequences when people lie (consequences such as bad medical treatment, miscarriages of justice etc.). The duty of government officials and elected politicians to tell the truth is based on the requirement of democratic transparency, and is therefore also a case of balancing rights: democracy is a human right, and democracy can’t function if there’s no transparency and if people in power don’t tell the truth about what they are doing.
What I want to criticize in this installment of our series on lies and statistics is the ordinal ranking of relatively similar entities in a way that creates the illusion of substantial disparity. You often see it combined with color schemes: one entity just below a threshold value gets one color, the next one just above gets another, and suddenly they appear to differ substantially. It’s rather common in maps, of which there’s an example here:
(source, more information on the Human Development Index is here; note: the criticism offered in this post is not directed against the HDI itself)
Louisiana has a score of .801, West Virginia .800, and Mississippi .799, and that threshold makes Mississippi stand out on the map although it’s really no different from the other two.
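To make the mechanism concrete, here’s a minimal sketch in Python (the 0.800 cut-off and the color names are assumptions for illustration, not the map’s actual legend) showing how threshold-based color coding turns a 0.001 difference into a visually distinct category:

```python
scores = {"Louisiana": 0.801, "West Virginia": 0.800, "Mississippi": 0.799}

def color_for(score, threshold=0.800):
    # Assign a map color purely by which side of the threshold the score falls on.
    return "dark green" if score >= threshold else "yellow"

for state, score in scores.items():
    print(f"{state}: {score:.3f} -> {color_for(score)}")
# Mississippi gets a different color than the other two, even though
# the scores differ by only 0.001-0.002.
```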
Something to keep in mind when looking at all the maps I post on this blog, or any ordinal ranking for that matter.
More about lies and statistics here.
(source)
I discussed the so-called Omitted Variable Bias before on this blog (here and here). So I suppose I can mention this other example: what do you think is the correlation, at the country level, between per capita smoking rates and life expectancy? High smoking rates go with low life expectancy, right? And vice versa?
Actually, and surprisingly, the correlation goes the other way: the higher the smoking rate – the more people smoke in a certain country – the longer the citizens of that country live, on average.
Why is that the case? Smoking is unhealthy and should therefore shorten lives, on average. However, people in rich countries smoke more; in poor countries they can’t afford it. And people in rich countries live longer. But they obviously don’t live longer because they smoke more; they live longer because they have the good luck to live in a rich country, which tends to be a country with better healthcare and the like. If they smoked less, they would live even longer.
Why is this important? Not because I’m particularly interested in smoking rates (although I am interested in life expectancy). It’s important because it shows how easily we are fooled by simple correlations, how we imagine what correlations should be like, and how we can’t see beyond the two elements of a correlation when we’re confronted with one that goes against our intuitions. We usually assume that, in a correlation, one element should cause the other. And apart from the common mistake of switching the direction of the causation, we often forget that there can be a third element causing the two elements in the correlation (in this example, the prosperity of a country causing both high smoking rates and high life expectancy), rather than one element in the correlation causing the other.
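For the skeptical, here’s a toy simulation (all numbers invented) of the mechanism described above: a third variable – national wealth – pushes both smoking and life expectancy up, so the country-level correlation comes out positive even though smoking is modeled as harmful:

```python
import random
from statistics import correlation  # requires Python 3.10+

random.seed(1)
smoking, life_exp = [], []
for _ in range(200):                                      # 200 hypothetical countries
    wealth = random.uniform(0, 1)                         # 0 = very poor, 1 = very rich
    s = 5 + 20 * wealth + random.gauss(0, 2)              # richer countries smoke more
    le = 55 + 25 * wealth - 0.3 * s + random.gauss(0, 1)  # wealth helps, smoking hurts
    smoking.append(s)
    life_exp.append(le)

# Positive correlation, despite the explicit "- 0.3 * s" harm from smoking.
print(round(correlation(smoking, life_exp), 2))
```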
More posts in this series are here.
I’ve discussed the role of framing before: the way in which you ask questions in surveys influences the answers you get and therefore modifies the survey results. (See here and here for instance). It happens quite often that polling organizations or media inadvertently or even deliberately frame questions in a way that nudges people toward a particular answer. In fact, you can frame questions in such a way that you get almost any answer you want.
However, the questioner may matter just as much as the question.
Consider this fascinating new study, based on surveys in Morocco, which found that the gender of the interviewer and how that interviewer was dressed had a big impact on how respondents answered questions about their views on social policy. …
[T]his paper asks whether and how two observable interviewer characteristics, gender and gendered religious dress (hijab), affect survey responses to gender and non-gender-related questions. [T]he study finds strong evidence of interviewer response effects for both gender-related items, as well as those related to support for democracy and personal religiosity … Interviewer gender and dress affected responses to survey questions pertaining to gender, including support for women in politics and the role of Shari’a in family law, and the effects sometimes depended on the gender of the respondent. For support for gender equality in the public sphere, both male and female respondents reported less progressive attitudes to female interviewers wearing hijab than to other interviewer groups. For support for international standards of gender equality in family law, male respondents reported more liberal views to female interviewers who do not wear hijab, while female respondents reported more liberal views to female interviewers, irrespective of dress. (source, source)
Other data indicate that the effect occurs in the U.S. as well. This is potentially a bigger problem than the framing effect since questions are usually public and can be verified by users of the survey results, whereas the nature of the questioner is not known to the users.
Opinion polls or surveys are very useful tools in human rights measurement. We can use them to measure public opinion on certain human rights violations, such as torture or gender discrimination. High levels of public approval of such rights violations may make them more common and more difficult to stop. And surveys can measure what governments don’t want to measure. Since we can’t trust oppressive governments to give accurate data on their own human rights record, surveys may fill in the blanks. Although even that won’t work if the government is so utterly totalitarian that it doesn’t allow private or international polling of its citizens, or if it has scared its citizens to such an extent that they won’t participate honestly in anonymous surveys.
But apart from physical access and respondent honesty in the most dictatorial regimes, polling in general is vulnerable to mistakes and fraud (fraud being a conscious mistake). Here’s an overview of the issues that can mess up public opinion surveys, inadvertently or not.
There’s the well-known problem of question wording, which I’ve discussed in detail before. Pollsters should avoid leading questions – questions put in such a way that they pressure people to give a certain answer – as well as questions that are confusing or easily misinterpreted, wordy questions, questions using jargon, abbreviations or difficult terms, double or triple questions etc. Also quite common are “silly questions”, questions that don’t have meaningful or clear answers: for example, “is the Catholic Church a force for good in the world?” What on earth can you answer to that? It depends on which parts of the church you’re talking about, and on what circumstances, country or even historical period you’re asking about. The answer is most likely “yes and no”, and hence useless.
The importance of wording is illustrated by the often substantial effects of small modifications in survey questions. Even the replacement of a single word by another, related word, can radically change survey results: see this post for examples.
Of course, it’s often claimed that biased poll questions corrupt the average survey responses, but that the overall results can still be used to learn about time trends and differences between groups. As long as you make the same mistake consistently, you may still find something useful. That’s true, but it’s no reason not to take care with wording: the same trends and differences can be seen in survey results produced with correctly worded questions.
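A quick sketch with invented numbers shows why a consistent bias leaves group differences (and trends) intact even though the levels are wrong:

```python
true_support = {"group A": 40, "group B": 55}   # invented "true" support levels (%)
bias = 10                                       # constant inflation from a leading question

measured = {group: value + bias for group, value in true_support.items()}
print(measured)                                   # {'group A': 50, 'group B': 65}
print(measured["group B"] - measured["group A"])  # 15: the true gap survives the bias
```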
Order effect or contamination effect
Answers to questions depend on the order they’re asked in, and especially on the questions that preceded them. Here’s an example:
Fox News yesterday came out with a poll that suggested that just 33 percent of registered voters favor the Democrats’ health care reform package, versus 55 percent opposed. … The Fox News numbers on health care, however, have consistently been worse for Democrats than those shown by other pollsters. (source)
The problem is not the framing of the question. This was the question: “Based on what you know about the health care reform legislation being considered right now, do you favor or oppose the plan?” Nothing wrong with that.
So how can Fox News ask a seemingly unbiased question of a seemingly unbiased sample and come up with what seems to be a biased result? The answer may have to do with the questions Fox asks before the question on health care. … the health care questions weren’t asked separately. Instead, they were questions #27-35 of their larger, national poll. … And what were some of those questions? Here are a few: … Do you think President Obama apologizes too much to the rest of the world for past U.S. policies? Do you think the Obama administration is proposing more government spending than American taxpayers can afford, or not? Do you think the size of the national debt is so large it is hurting the future of the country? … These questions run the gamut from slightly leading to full-frontal Republican talking points. … A respondent who hears these questions, particularly the series of questions on the national debt, is going to be primed to react somewhat unfavorably to the mention of another big Democratic spending program like health care. And evidently, an unusually high number of them do. … when you ask biased questions first, they are infectious, potentially poisoning everything that comes below. (source)
If you want to avoid this mistake – if we can call it that (since in this case it’s quite likely to have been a “conscious mistake” aka fraud) – randomizing the question order for each respondent might help.
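A minimal sketch of what such per-respondent randomization could look like (the question texts are abbreviated placeholders, not Fox’s actual wording):

```python
import random

questions = [
    "Views on the national debt",
    "Views on government spending",
    "Favor or oppose the health care plan",
]

def questionnaire_for_respondent():
    order = questions[:]      # copy, so the master list stays intact
    random.shuffle(order)     # a fresh random order for each respondent
    return order

for respondent in range(3):
    print(respondent, questionnaire_for_respondent())
```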
Similar to the order effect is the effect created by follow-up questions. It’s well-known that follow-up questions of the type “but what if…” or “would you change your mind if …” change the answers to the initial questions.
The Bradley effect is a theory proposed to explain observed discrepancies between voter opinion polls and election outcomes in some U.S. government elections where a white candidate and a non-white candidate run against each other.
Contrary to the wording and order effects, this isn’t an effect created – intentionally or not – by the pollster, but by the respondents. The theory proposes that some voters tend to tell pollsters that they are undecided or likely to vote for a black candidate, and yet, on election day, vote for the white opponent. It was named after Los Angeles Mayor Tom Bradley, an African-American who lost the 1982 California governor’s race despite being ahead in voter polls going into the elections.
The probable cause of this effect is the phenomenon of social desirability bias. Some white respondents may give a certain answer for fear that, by stating their true preference, they will open themselves to criticism of racial motivation. They may feel under pressure to provide a politically correct answer. The existence of the effect is, however, disputed. (Some say the election of Obama disproves the effect, thereby making another statistical mistake).
Another effect created by the respondents rather than the pollsters is the fatigue effect. As respondents grow increasingly tired over the course of long interviews, the accuracy of their responses could decrease. They may be able to find shortcuts to shorten the interview; they may figure out a pattern (for example that only positive or only negative answers trigger follow-up questions). Or they may just give up halfway, causing incompletion bias.
However, this effect isn’t entirely due to respondents. Survey design can be at fault as well: there may be repetitive questioning (sometimes deliberately for control purposes), the survey may be too long or longer than initially promised, or the pollster may want to make his life easier and group different polls into one (which is what seems to have happened in the Fox poll mentioned above, creating an order effect – but that’s the charitable view of course). Fatigue effect may also be caused by a pollster interviewing people who don’t care much about the topic.
Ideally, the sample of people interviewed for a survey should be a fully random subset of the entire population, meaning that every person in the population has an equal chance of being included. That rules out self-selection (a typical flaw in many if not all internet surveys of the “Polldaddy” variety) and self-deselection, both of which reduce the randomness of the sample; one telltale sign is that self-selected samples produce polarized results. The size of the sample also matters: samples that are too small typically produce biased results.
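Here’s a toy simulation (all numbers invented) of that self-selection effect: if people with strong opinions are much more likely to opt in, the opt-in sample ends up far more polarized than the population it’s supposed to represent:

```python
import random

random.seed(2)
# Opinions from 0 (strongly against) to 10 (strongly in favor); most people are moderate.
population = [random.gauss(5, 1.5) for _ in range(100_000)]

def opts_in(opinion):
    # People far from the middle are far more likely to bother answering an opt-in poll.
    intensity = abs(opinion - 5)
    return random.random() < 0.02 + 0.15 * intensity

internet_poll = [x for x in population if opts_in(x)]

def share_extreme(values):
    return sum(1 for x in values if abs(x - 5) > 3) / len(values)

print(round(share_extreme(population), 3))      # a few percent of the population is "extreme"
print(round(share_extreme(internet_poll), 3))   # a much larger share of the opt-in sample is
```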
Even the determination of the total population from which the sample is taken can lead to biased results. And yes, that has to be determined… For example, do we include inmates, illegal immigrants etc. in the population? See here for some examples of the consequences of such choices.
A house effect occurs when a particular pollster’s surveys systematically lean toward one party’s candidates; Rasmussen is known for this.
I probably forgot an effect or two. Fill in the blanks if you care. Go here for other posts in this series.
Inflation is often a significant part of growth in any time series measured in dollars (or other currencies), or – in other words – it’s an important part of an increase over time in data expressed in dollars. So when you compare data for the current year, month or whatever with the same data for some period in the past, you may just see inflation rather than actual growth or increases. By adjusting for inflation, you uncover the real growth. You may even discover that growth hides decline. Here’s an innocuous example of the consequences of failing to adjust data for inflation:
Over the last month, newspapers and film Web sites have proclaimed Avatar the highest-grossing film in American history. … Moviegoers in [the U.S.] have now spent about $700 million on tickets to Avatar. … No. 2 on the all-time list is Titanic, which brought in about $600 million. Avatar surpassed Titanic in late January. The problem with these numbers is that they aren’t adjusted for inflation. … When you adjust movie grosses for inflation, as Box Office Mojo does, you see that “Gone With the Wind” remains the top-grossing movie of all time, with $1.5 billion in box-office sales (using today’s dollars). (source)
This won’t do much damage. The problems start when unadjusted data are being used to push a political point or legislation. For example, one can claim that it isn’t a good idea to raise gasoline taxes because gasoline prices are already very high compared to the old days, but this claim loses much of its strength when you adjust the prices for inflation and it turns out that they are actually rather average, historically.
Of course, you can make mistakes while trying to adjust for inflation, and there are several techniques available, none of which will produce the same numbers. But any adjustment, especially for comparisons over long periods of time, is better than no adjustment at all.
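For what it’s worth, the adjustment itself is simple arithmetic. A minimal sketch (the index values and amounts are placeholders, not real CPI figures or real grosses):

```python
price_index = {1939: 14, 1997: 161, 2010: 218}   # hypothetical price-index levels

def to_constant_dollars(amount, year, base_year=2010):
    # Deflate/inflate a nominal amount into base-year dollars.
    return amount * price_index[base_year] / price_index[year]

# Illustrative nominal grosses (in millions):
print(round(to_constant_dollars(200, 1939)))   # a 1939 gross of 200 expressed in 2010 dollars
print(round(to_constant_dollars(600, 1997)))   # a 1997 gross of 600 expressed in 2010 dollars
```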
There’s a cool inflation adjusting tool here (only for U.S. data I’m afraid).
Following up from an older post on the importance of survey questions, here’s a nice example of the way in which small modifications in survey questions can radically change survey results:
Our survey asked the following familiar question concerning the “right to die”: “When a person has a disease that cannot be cured and is living in severe pain, do you think doctors should or should not be allowed by law to assist the patient to commit suicide if the patient requests it?”
57 percent said “doctors should be allowed,” and 42 percent said “doctors should not be allowed.” As Joshua Green and Matthew Jarvis explore in their chapter in our book, the response patterns to euthanasia questions will often differ based on framing. Framing that refers to “severe pain” and “physicians” will often lead to higher support for ending the patient’s life, while including the word “suicide” will dramatically lower support. (source)
Similarly, seniors are willing to pay considerably more for “medications” than for “drugs” or “medicine” (source). Yet another example involves the use of “Wall Street”: there’s greater public support for banking reform when the issue is more specifically framed as regulating “Wall Street banks”.
What’s the cause of this sensitivity? Difficult to tell. Cognitive bias probably plays a role, as does the psychology of associations (“suicide” brings up images of blood and pain, whereas “physicians” brings up images of control; similarly “homosexual” evokes sleazy bars, “gay” evokes art and design types). Maybe there’s also a willingness not to offend the person asking the question. Anyway, the conclusion is that pollsters should be very careful when framing questions. One tactic could be to use as many different words and synonyms as possible in order to avoid a bias created by one particular word.
Push polls are used in election campaigns, not to gather information about public opinion, but to modify public opinion in favor of a certain candidate, or – more commonly – against a certain candidate. They are called “push” polls because they intend to “push” the people polled towards a certain point of view.
Push polls are not cases of “lying with statistics” as we usually understand them in this blog series, but it’s appropriate to talk about them since they are very similar to a “lying technique” that we discussed many times, namely leading questions (see here for example). The difference here is that leading questions aren’t used to manipulate poll results, but to manipulate people.
The push poll isn’t really a poll at all, since the purpose isn’t information gathering, which is why many people don’t like the term and call it an oxymoron. A better term would indeed be “advocacy telephone campaign”. A push poll is more like a gossip campaign, a propaganda effort or telemarketing. Push polls are very similar to political attack ads, in the sense that they intend to smear candidates, often with little basis in fact. Compared to political ads, push polls have the “advantage” that they don’t seem to emanate from the campaign offices of one of the candidates (push polls are typically conducted by bogus polling agencies). Hence it’s more difficult for recipients to classify the “information” contained in the push poll as political propaganda, and they are therefore more likely to believe it. Which is of course the reason push polls are used. Also, the fact that they are presented as “polls” rather than campaign messages makes it more likely that people listen, and as they listen more, they internalize the messages better than in the case of outright campaigning (which they often dismiss as propaganda).
Push polls usually, but not necessarily, contain lies or false rumors. They may also be limited to misleading or leading questions. For example, a push poll may ask people: “Do you think that the widespread and persistent rumors about Obama’s Muslim faith, based on his own statements, connections and acquaintances, are true?”. Some push polls may even contain some true but unpleasant facts about a candidate, and then hammer on these facts in order to change the opinions of the people being “polled”.
Infamous examples include the push poll used by Bush against McCain in the Republican primaries of 2000 (insinuating that McCain had an illegitimate black child), and the one used by McCain (fast learner!) against Obama in 2008 (alleging that Obama had ties with the PLO).
One way to distinguish legitimate polls from push polls is the sample size. The former are usually content with relatively small sample sizes (but not too small), whereas the latter typically want to “reach” as many people as possible. Push polls won’t include demographic questions about the people being polled (gender, age, etc.) since there is no intention to aggregate results, let alone aggregate by type of respondent. Another way to identify push polls is the selection of the target population: normal polls try to reach a random subset of the population; push polls are often targeted at certain types of voters, namely those likely to be swayed by negative campaigning about a certain candidate. Push polls also tend to be quite short compared to regular polls, since the purpose is to reach a maximum number of people.
(source, cartoon by Eric Allie)
In a previous post in this series, I already mentioned the temptation to see things in data that just aren’t there, or to make data say things they don’t really say. I focused on the correlation-causation problem, a typical case of “jumping to conclusions”.
Elsewhere I gave the following example: there are data doing the rounds claiming that Republicans follow political news more closely than Democrats, which has some people saying that Republicans are more knowledgeable and make better political choices. However, people don’t read more news because they are Republicans, but because they are relatively wealthy and older – and people who are wealthy and older also tend to be Republicans. So if you see data showing a correlation between political conservatism and attention to the news, don’t jump to conclusions and say that conservatives are inherently more attentive to the news, let alone that they make better political choices. A young and relatively poor conservative probably pays less attention than a wealthy and older liberal. Attention isn’t a function of political orientation; it has other causes.
However, as is evident from the cartoon above, data don’t have to be of the correlation type for people to see things in them that aren’t there. People have indeed interpreted popular rejection of healthcare reform or of the Obama administration in general as an expression of underlying racism, as if there can’t be any other reasons for rejection.*
Polling on the health-care bill is … complicated. Voters don’t know much about the plan. Most disapprove of it, but many disapprove because they want to see it go further. (source)
So there’s a “double jump” to conclusions in the cartoon:
- First, jumping from disapproval of healthcare reform to anti-Obama racism (blaming the former on the latter when this isn’t shown by the data), which the cartoon ridicules – rightly so, to the extent that people really do make this jump.
- Second, jumping from disapproval ratings on “something” to disapproval ratings on “healthcare reform”. The data only show that people disapprove of “something”: people may disapprove of only a part of healthcare reform, or may disapprove of the fact that it doesn’t go far enough rather than disapprove of reform as such; or they may disapprove of something that is not really proposed and hence misunderstand the whole thing and base their disapproval on lack of knowledge. Needless to say, this second jump in the cartoon is quite unconscious and probably not on purpose.
All this jumping is quite understandable. We always have to interpret data, and we can easily lose our way in the process. It’s also tempting to “find” explanations for data that fit with our pre-established opinions and biases.
* Personally, I’m in favor of reform.
I’ve mentioned in a previous post how some numbers or stats can make a problem appear much bigger than it really is (the case in the previous post was about the numbers of suicides in a particular company). The error – or fraud, depending on the motivation – lies in the absence of a comparison with a “normal” number (in the previous post, people failed to compare the number of suicides in the company with the total number of suicides in the country, which made them leap to conclusions about “company stress”, “hyper-capitalism”, “worker exploitation” etc.).
The error is, in other words, the absence of context and of distance from the “fait divers”. I’ve now come across a similar example, cited by Aleks Jakulin here. As you know, one of the favorite controversies (some would say nontroversies) of the American right wing is the fate of the prisoners at Guantanamo. President Obama has vowed to close the prison, and either release those who cannot be charged or transfer them to prisons on the mainland. Many conservatives fear that releasing them would endanger America (some even believe that locking them away in supermax prisons on the mainland is a risk not worth taking). Even those who can’t be charged with a crime, they say, may be a threat in the future. I won’t deal with the perverse nature of this kind of reasoning, except to say that it would justify arbitrary and indefinite detention of large groups of “risky” people.
What I want to deal with here is one of the “facts” that conservatives cite in order to substantiate their fears: recidivism by former Guantanamo detainees.
Pentagon officials have not released updated statistics on recidivism, but the unclassified report from April says 74 individuals, or 14 percent of former detainees, have turned to or are suspected of having turned to terrorism activity since their release.
Of the more than 530 detainees released from the prison between 2002 and last spring, 27 were confirmed to have engaged in terrorist activities and 47 were suspected of participating in a terrorist act, according to Pentagon statistics cited in the spring report. (source)
Stats such as these are ostentatiously displayed and repeated by partisan mouthpieces as a means to scare the s*** out of us and keep possibly innocent people in jail. The problem is that the recidivism levels cited above are way below normal recidivism levels.
I read the following headline in a local newspaper recently:
Most prisoners escape from their cells
My first reaction was: Christ! What’s the world coming to! We can’t keep the majority of prisoners inside? It turned out that what they wanted to say was that prisoners, when they escape, do so from their cell, most of the time. Other prisoners escape from the workshop, while being transported etc. That looks much better already.
More serious posts in this series are here.
Time for a more lighthearted post in this blog series. Suppose you find a correlation between two phenomena, and you’re tempted to conclude that there’s a causal relation as well. The problem is that this causal relation – if it exists at all – can go either way. It’s a common mistake – or a case of fraud, as the case may be – to choose one direction of causation and forget that the real causal link can go the other way, or both ways at the same time.
An example. We often think that people who play violent video games are more likely to show violent behavior because they are incited by the games to copy the violence in real life. But can it not be that people who are more prone to violence are more fond of violent video games? (See also here). We choose a direction of causation that fits with our pre-existing beliefs.
Another widely shared belief is that uninformed and uneducated voters will destroy democracy, or at least diminish its value (see here, here and here). No one seems to ask the question whether it’s not a diminished form of democracy that renders citizens apathetic and uninformed. Maybe a full or deep democracy can encourage citizens to participate and become more knowledgeable through participation.
A classic example is the correlation between education levels and GDP (see also here). Do countries with higher education levels experience more economic growth because of the education levels of their citizens? Or is it that richer countries can afford to spend more on education and hence have better educated citizens? Probably both.
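One thing worth remembering: the correlation coefficient itself is perfectly symmetric, so it can’t possibly tell you which way causation runs. A trivial demonstration with invented data:

```python
from statistics import correlation  # requires Python 3.10+

education = [8, 10, 12, 13, 15, 16]   # invented data
gdp       = [5, 9, 14, 15, 21, 25]

print(correlation(education, gdp) == correlation(gdp, education))  # True: direction-blind
```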
Another cartoon that expresses the same risk:
More posts in this blog series.
I explained what I mean by “omitted variable bias” in a previous post in this series, so go there first if the following isn’t immediately clear. (In a few words: you see a correlation between two variables – for example, clever people wear fancy clothes. Then you assume that one variable must cause the other: higher intelligence also gives people better aesthetic taste, or good taste in clothing somehow makes people smarter. In fact, you may be overlooking a third variable which explains the other two, as well as their correlation. In our case: clever people earn more money, which makes it easier to buy clothes in shops that help you with your aesthetics. Nonsense, I know, but it’s just to make a point).
I gave a few examples in the previous post, but found some others in the meantime. This one’s from Nate Silver’s blog:
Gallup has some interesting data out on the percentage of Americans who pay a lot of attention to political news. Although the share of Americans following politics has increased substantially among partisans of all sides, it is considerably higher among Republicans than among Democrats:
The omitted variable here is age, and the data should be corrected for it in order to properly compare these two populations.
News tends to be consumed by people who are older and wealthier, which is more characteristic of Republicans than Democrats.
People don’t read more or less news because they are Republicans or Democrats. And here’s another one from Matthew Yglesias’ blog:
It’s true that surveys indicate that gay marriage is wildly popular among DC whites and moderately unpopular among DC blacks, but I think it’s a bit misleading to really see this as a “racial divide”. Nobody would be surprised to learn about a community where college educated people had substantially more left-wing views on gay rights than did working class people. And it just happens to be the case that there are hardly any working class white people living in DC. Meanwhile, with a 34-48 pro-con split it’s hardly as if black Washington stands uniformly in opposition—there’s a division of views reflecting the diverse nature of the city’s black population.
A morbid one in our ongoing series on mistakes and lies in statistics, from a news report some weeks ago:
French Finance Minister Christine Lagarde Thursday voiced her support for France Telecom’s chief executive, who is coming under increased pressure from French unions and opposition politicians over a recent spate of suicides at the company.
Ms. Lagarde summoned France Telecom CEO Didier Lombard to a meeting after the telecommunications company confirmed earlier this week that one of its employees had committed suicide. It was the 24th suicide at the company in 18 months.
In a statement released after the meeting, Ms. Lagarde said she had “full confidence” that Mr. Lombard could get the company through “this difficult and painful moment.”
The French state, which owns a 27% stake in France Telecom, has been keeping a close eye on the company, following complaints by unions that a continuing restructuring plan at the company is putting workers under undue stress.
The suicide rate among the company’s 100,000 employees is in line with France’s national average. Still, unions say that the relocation of staff to different branches of the company around France has added pressure onto employees and their families.
On Tuesday, a spokesman for France’s opposition Socialist Party called for France Telecom’s top management to take responsibility for the suicides and step down. Several hundred France Telecom workers also took to the streets to protest against working conditions.
In the statement released after Thursday’s meeting, France’s Finance Ministry said Mr. Lombard had set up an emergency hotline aimed at providing help to depressed workers. The company has also increased the number of psychologists available to staffers, according to the statement. (source)
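The relevant back-of-the-envelope arithmetic, using only the figures quoted above, gives the annualized rate that should be compared with the national one – the comparison the outraged headlines skipped:

```python
suicides = 24        # suicides reported at the company
employees = 100_000  # approximate workforce
months = 18          # period covered

rate_per_100k_per_year = suicides / (months / 12) / employees * 100_000
print(rate_per_100k_per_year)   # 16.0 per 100,000 employees per year
```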
More on the problems caused by averages is here.
Did you hear the joke about the statistician who put her head in the oven and her feet in the refrigerator? She said, “On average, I feel just fine.” That’s the same message as in this more widely known joke about statisticians:
And then there’s this one: did you know that the great majority of people have more than the average number of legs? It’s obvious, really: among the 57 million people in Britain, there are probably 5,000 people who have only one leg. Therefore, the average number of legs is just under two – roughly 1.9999 – which means that anyone with two legs is above average.
But seriously now, averages can be very misleading, including in statistical work in the field of human rights. Take income data, for example. Income as such isn’t a human rights issue, but poverty is, as is income inequality. When we look at income data, we may see that average income is rising. However, this may be due to extreme increases at the top 1% of incomes. If you exclude the income increases of the top 1% of the population, the large majority of people may not experience rising income – possibly even the opposite. And rising average income – even excluding extremes at the top – is perfectly compatible with rising poverty for certain parts of the population.
Averages are often skewed by outliers. That is why it’s often necessary to remove outliers and calculate the averages without them. That will give you a better picture of the characteristics of the general population (the “real” average income evolution in my example). A simple way to neutralize outliers is to look at the median – the middle value of a series of values – rather than the average (or the mean).
An average (or a median for that matter) also doesn’t say anything about the extremes (or, in stat-speak, about the variability or dispersion of the population). A high average income can hide extremely low and extremely high incomes for certain parts of the population. So, for example, if you compare income levels across countries, you’ll use the average income. Yet country A may have a lower average income than country B, but also lower levels of poverty. That’s because the dispersion of incomes in country A is much smaller than in country B: the average in B is the result of adding together extremely low incomes (i.e. poverty) and extremely high incomes, whereas the average in A comes from incomes that are much more equal. From the point of view of poverty – which is a human rights issue – average income is misleading because it identifies country A as the poorer country, whereas in reality there are more poor people in country B. So when looking at averages, it’s always good to look at the standard deviation as well, which is a measure of dispersion around the mean.
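Here’s a small sketch with invented income figures illustrating the country A/B story: the country with the higher mean has more poverty, and the median and standard deviation reveal what the average hides:

```python
from statistics import mean, median, stdev

country_a = [20, 22, 24, 25, 26, 28, 30]   # fairly equal incomes (in thousands)
country_b = [5, 6, 7, 8, 9, 10, 160]       # widespread poverty plus one very rich outlier

for name, incomes in [("A", country_a), ("B", country_b)]:
    print(name, round(mean(incomes), 1), median(incomes), round(stdev(incomes), 1))
# Country B has the higher mean (about 29 vs 25) even though most of its people are far
# poorer; B's low median and large standard deviation tell the real story.
```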
Throughout this blog series on abuses and mistakes in statistics, we’ve often seen how flawed or missing comparisons lead to error or deceit. Here’s another example: the introduction of tin helmets during the First World War. Before their introduction, soldiers only had cloth hats to wear. The strange thing was that after the introduction of tin helmets, the number of head injuries increased dramatically. Needless to say, this was counter-intuitive: the new helmets were designed precisely to avoid or limit such injuries.
Of course, people were comparing apples with oranges, namely statistics on head injuries before and after the introduction of the new helmets. What they should have done, and eventually did after they realized their mistake, was to include in the statistics not only the injuries but also the fatalities. After the introduction of the new helmets, the number of fatalities dropped dramatically while the number of injuries went up: the tin helmet was saving soldiers’ lives, turning what would have been deaths into survivable injuries.
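A set of hypothetical numbers (not historical data) makes the pattern obvious: count only injuries and the helmet looks harmful; count injuries plus deaths and the improvement appears:

```python
# Outcomes of 1,000 head hits under each type of headgear (hypothetical figures):
cloth_hat  = {"killed": 600, "injured": 300, "unharmed": 100}
tin_helmet = {"killed": 200, "injured": 650, "unharmed": 150}

for name, outcome in [("cloth hat", cloth_hat), ("tin helmet", tin_helmet)]:
    casualties = outcome["killed"] + outcome["injured"]
    print(name, "injured:", outcome["injured"], "killed:", outcome["killed"],
          "total casualties:", casualties)
# Injuries rise (300 -> 650) even though deaths fall (600 -> 200) and
# total casualties fall (900 -> 850): deaths have been converted into injuries.
```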
Some more detailed information after my casual remark on the correlation-causation problem. Here’s a fictitious example of what is meant by “Omitted Variable Bias“, a type of statistical bias that illustrates this problem. Suppose we see from Department of Defense data that male U.S. soldiers are more likely to be killed in action than female soldiers. Or, more precisely and in order to avoid another statistical error, the percentage of male soldiers killed in action is larger than the percentage of female soldiers. So there is a correlation between the gender of soldiers and the likelihood of being killed in action.
One could – and one often does – conclude from such a finding that there is causation of some kind: the gender of soldiers affects the chances of being killed in action. Again, more precisely: one could conclude that some aspect of gender – e.g. a male propensity for risk taking – leads to higher mortality.
However, it’s here that the Omitted Variable Bias pops up. The real cause of the discrepancy between male and female combat mortality may not be gender or a gender related thing, but a third element, an “omitted variable” which doesn’t show in the correlation. In our fictional example, it may be the type of deployment: it may be that male soldiers are more commonly deployed in dangerous combat operations, whereas female soldiers may be more active in support operations away from the front-line.
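With invented numbers for this fictional example, you can see the bias appear and disappear: the raw comparison shows a gender gap, but stratifying by deployment type – the omitted variable – makes it vanish:

```python
# (gender, deployment, number of soldiers, number killed) - invented figures
groups = [
    ("male",   "combat",  8000, 400),   # 5% mortality
    ("male",   "support", 2000,  20),   # 1% mortality
    ("female", "combat",  2000, 100),   # 5% mortality
    ("female", "support", 8000,  80),   # 1% mortality
]

def mortality(gender, deployment=None):
    rows = [g for g in groups
            if g[0] == gender and (deployment is None or g[1] == deployment)]
    return sum(killed for *_, killed in rows) / sum(n for _, _, n, _ in rows)

print(mortality("male"), mortality("female"))                        # 0.042 vs 0.018: apparent gap
print(mortality("male", "combat"), mortality("female", "combat"))    # 0.05 vs 0.05: gap gone
print(mortality("male", "support"), mortality("female", "support"))  # 0.01 vs 0.01
```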
OK, time for a real example. It has to do with home-schooling. In the U.S., many parents decide to keep their children away from school and teach them at home, for different reasons: ideological ones, reasons that have to do with their children’s special needs etc. The reasons are not important here. What is important is that many people think home-schooled children are somehow less well educated (parents, after all, aren’t trained teachers). Proponents of home-schooling, however, point to a study that found that these children score above average on tests. But this is a correlation, not necessarily a causal link; it doesn’t prove that home-schooling is superior to traditional schooling. Parents who teach their children at home are, by definition, heavily involved in their children’s education, and the children of such parents also score above average in normal schooling. The omitted variable here is parental involvement. It’s not the fact that the children are schooled at home that explains their above-average scores; it’s the type of parents. Instead of comparing home-schooled children to all other children, one should compare them to children from similar families in the traditional system.
Greg Mankiw believes he has found another example of Omitted Variable Bias in this graph plotting test scores for U.S. students against their family income:
[T]he above graph … show[s] that kids from higher income families get higher average SAT scores. Of course! But so what? This fact tells us nothing about the causal impact of income on test scores. … This graph is a good example of omitted variable bias … The key omitted variable here is parents’ IQ. Smart parents make more money and pass those good genes on to their offspring. Suppose we were to graph average SAT scores by the number of bathrooms a student has in his or her family home. That curve would also likely slope upward. (After all, people with more money buy larger homes with more bathrooms.) But it would be a mistake to conclude that installing an extra toilet raises your kids’ SAT scores. … It would be interesting to see the above graph reproduced for adopted children only. I bet that the curve would be a lot flatter. Greg Mankiw (source)
Meaning that adopted children, who usually don’t receive their genes from their new families, have equal test scores, no matter if they have been adopted by rich or poor families. Meaning in turn that the wealth of the family in which you are raised doesn’t influence your education level, test scores or intelligence.
However, in his typical hurry to discard all possible negative effects of poverty, Mankiw may have gone a bit too fast. While it’s not impossible that the correlation is fully explained by differences in parental IQ, other evidence points elsewhere. I’m always suspicious of theories that take one cause, exclude every other type of explanation and end up with a fully deterministic system, especially if the one cause that is selected is DNA. Life is more complex than that. Regarding this particular matter, take a look back at this post, which shows that education levels are to some extent determined by parental income (university enrollment is determined both by test scores and by parental income, even to the extent that people from high-income families with average test scores are slightly more likely to enroll in university than people from poor families with high test scores).
What Mankiw did, in trying to avoid the Omitted Variable Bias, was in fact another type of bias, one which we could call the Singular Variable Bias: assuming that a phenomenon has a singular cause. In honor of Professor Mankiw (who does some good work, see here for example), I propose that henceforth we call it the Mankiw Bias.
More posts in this series.
In earlier posts (here and here) I described the specific difficulties faced by those wanting to measure respect for human rights in dictatorial countries. Measuring human rights requires a certain level of respect for human rights (freedom to travel, freedom to speak, to interview etc.). Trying to measure human rights in situations characterized by the absence of freedom is quite difficult, and can even lead to unexpected results: the absence of (access to) good data may give the impression that things aren’t as bad as they really are. Conversely, when a measurement shows a deteriorating situation, the cause of this may simply be better access to better data. And this better access to better data may be the result of more openness in society. Deteriorating measurements may therefore signal an actual improvement. I gave an example of this dynamic here (it’s an example of statistics on violence against women).
The graph below is a case of the way in which oppression may actually produce measurements that signal a lack of oppression:
(source, the “yes” answers in fact relate to the question “do you approve”; this is an example of a sloppy graph because the question in the title of the graph doesn’t permit the answer “yes” or “no”)
This is clearly a case of “lying with statistics”. Measuring public opinion in authoritarian countries is always difficult, but if you ask the public if they love or hate their government, it’s likely that you’ll have higher rates of “love” in the more authoritarian countries. After all, in those countries it can be pretty dangerous to tell someone in the street that you hate your government. They choose to lie and say that they approve. That’s the safest answer but probably in many cases not the real one. I don’t believe for a second that the percentage of people approving of their government is 19 times higher in Azerbaijan than in Ukraine, when Ukraine is in fact much more liberal than Azerbaijan.
In the words of Robert Coalson:
The Gallup chart is actually an index of fear. What it reflects is not so much attitudes toward the government as a willingness to openly express one’s attitudes toward the government. As one member of RFE/RL’s Azerbaijan Service told me, “If someone walked up to me in Baku and asked me what I thought about the government, I’d say it was great too”.
There seems to be no end to the number of battles in our war against the abuse of statistics. Take a look at this graph:
A poll of presidential approval ratings is a public opinion poll, so one expects to see the diverse opinions of the entire public represented in the results. That’s not the case here. As you can see, the numbers for the red and green lines don’t add up to 100%: only the extreme opinions – strong approval and strong disapproval – are shown. Now, strictly speaking, there’s nothing to object to: all necessary information is given, there’s no undue manipulation of the scales etc.
However, there’s approximately a third of public opinion that’s not included in this graph. At a minimum, this should have been made clear. I admit that my first, quick impression of this graph was that I was looking at a graph that shows the entirety of public opinion. Only after a few seconds of looking more closely did I realize that the graph doesn’t in fact offer a measurement of public opinion, but only of the opinion of the most outspoken parts of the public. Why not include a third and fourth line for “moderately (dis)approve”? Or, even better, include the moderates in the totals and just give the number for approval and disapproval, combining strong and moderate? What’s the added value of only showing the extremes? Or is this part of the current media culture?
I understand that it’s useful to know the strength of the groups who strongly approve and disapprove, but this is misleading. The graph as it is now clearly hints at a strong swing towards disapproval of Obama, but including the moderates could change that impression, and could, theoretically, show an increase in overall approval (moderate and strong). The difference between strong approval and strong disapproval is smaller than the total share of the moderates who are left out; if all or most of those moderates moderately approve (unlikely but possible), then the total approval ratings would be higher than the total disapproval ratings.
For example, the 2004 exit poll put George W. Bush’s strong approval at 33%, to strong disapproval of 34%. But his overall approval was 53% to disapproval at 46%, and he was re-elected 51%-48%. (source)
But maybe the point of this graph is precisely to create the impression that Obama is going down the drain. If that’s the case, then this is an example of statistical fraud. There’s no way to know, however. One thing I do know is that all this will strengthen the persistent criticism that Rasmussen, the author of the graph, has a Republican bias.
I said before that strictly speaking, there’s nothing wrong with this graph, apart from the fact that it could have mentioned more explicitly that a large chunk of public opinion is left out. However, if we look at this graph against the background of contemporary politics, it becomes more problematic. Politics today is often a shouting match between extreme positions. Such a spectacle is, after all, more entertaining than intelligent discussions that look for a common ground and a real possibility of persuasion of the other side. Hence, cable TV and the internet promote this kind of “gladiator politics“. Graphs such as this one only drive people further down the cul-de-sac of us-against-them politics. I don’t believe democracy was intended to end up there.
Another installment in our ongoing series on mistakes and fraud in statistics. Here’s a graph explaining that the top 1 percent of U.S. taxpayers paid 40.42 percent of total federal income taxes in 2007, and did in fact pay more taxes than the bottom 95% (if you can call that a “bottom”):
The point of this is obvious: those poor rich people pay too much in taxes, and pay more and more, presumably to finance the “state-beast” and welfare dependents. Such information is also used to argue against progressive taxation and in favor of “trickle down economics” (allowing the rich to prosper, and hence not taxing them disproportionately, is good for everyone, ultimately, because their wealth will trickle down to the rest of us).
There’s nothing theoretically, statistically or logically “wrong” with this graph. What’s wrong and misleading is that it hides relevant explanatory information. Why do the richest 1% pay an ever increasing share of the total amount of taxes collected? Is it because their tax rates have increased? In other words, is it because the government takes an ever increasing percentage of their income? Even those of us who favor a progressive tax system would admit that there are limits to this: it’s indeed economically unwise to discourage wealth creation, and there are undoubtedly some (albeit minor) trickle-down effects.
However, it’s not the case that tax rates for the top 1% have risen, on the contrary:
So then why do the rich pay more and more taxes? It’s simple: because they have become increasingly rich. Their incomes have risen sharply. These are data for the top 5%, but the top 1% have done just as well if not better (see here for top 1% data):
And because they earn more, they pay more in taxes, even with decreasing tax rates. A quote from the NY Times:
Here’s a chart showing the portion of adjusted gross income earned by the top 1 percent and by the bottom 95 percent. You’ll see that one major reason why the share of taxes paid by the richest Americans has risen is that the richest Americans have experienced much greater income growth:
The pink line in this graph clearly correlates with the blue line in the first graph above. And if you don’t believe the NYT, look here.
So the first graph above, supposedly showing “an increasing tax burden”, is misleading. The top 1% do indeed pay more and more taxes, but that’s no reason to assume that they carry a heavier burden. On the contrary: their tax rates have fallen, so they pay an ever smaller share of an ever increasing income. Their payments do indeed represent an ever increasing share of total government tax revenues, but that’s because their incomes – the base on which the tax is levied – have grown so much. Presenting this as somehow unfair and increasingly burdensome is misleading because relevant explanatory information is hidden, or not mentioned. Statistics serve to explain, and if there’s an explanation for some data, those publishing the data should, in all honesty, provide that explanation. If not, that’s a “lie of omission”.
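The arithmetic is easy to verify with stylized, invented numbers: taxes paid are rate times income, so a group’s share of all taxes can rise even while its rate falls, provided its share of total income grows enough:

```python
def top_tax_share(top_income, rest_income, top_rate, rest_rate):
    # Share of all taxes paid by the top group: taxes are simply rate * income.
    top_tax, rest_tax = top_income * top_rate, rest_income * rest_rate
    return top_tax / (top_tax + rest_tax)

# "Then": the top group earns 15% of all income and is taxed at 35%, the rest at 15%.
then = top_tax_share(top_income=15, rest_income=85, top_rate=0.35, rest_rate=0.15)
# "Now": the top group's income share has grown to 25% while its rate has *fallen* to 30%.
now = top_tax_share(top_income=25, rest_income=75, top_rate=0.30, rest_rate=0.15)

print(round(then, 2), round(now, 2))   # about 0.29 -> 0.40: a bigger tax share at a lower rate
```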
Just to show that there’s no real unfair treatment of the rich, consider this graph, showing the after-tax incomes:
Another one in our series on intended and unintended mistakes in statistics. Take for instance unemployment or employment rates. (We’ve talked about this before in this series). Employment statistics usually measure the number of people at work or unemployed, the number of people claiming unemployment benefits, the number of jobs created or lost, etc. Especially during an economic recession, like the one we have now, people look anxiously at those statistics. However, during a recession, companies that are struggling may be unwilling to lay off people, either because they feel responsible for their employees, or because – less altruistically – they don’t want to lose valuable experience which they will need when the economy recovers. Many companies therefore convince their people to work fewer hours, work part-time etc. Rather than dismissing some people, they spread the burden of the recession evenly over all employees.
The phenomenon is called “labor hoarding” and it is attributable to the costs of finding, hiring and training new workers and the costs in terms of severance pay and morale when firing workers. Jeffrey Frankel (source)
However, a simple unemployment statistic composed of numbers of jobs or job losses will fail to notice this. In times of recession, such a statistic will underestimate real unemployment because it won’t include the partial unemployment in the companies that increase part-time work. So you think you are measuring unemployment, but actually you’re not, at least not completely or accurately. A better dataset is the average weekly hours worked. Or you could include the numbers of people who are involuntarily part-timers in the numbers of unemployed:
(source, marginally attached workers are persons not in the labor force who want and are available for work, and who have looked for a job sometime in the prior 12 months, but were not counted as unemployed because they had not searched for work in the 4 weeks preceding the survey; discouraged workers are a subset of the marginally attached)
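To make that concrete, here’s a minimal Python sketch with purely illustrative numbers (not actual BLS figures), roughly following the logic of the broader unemployment measures just mentioned:

```python
# Illustrative numbers only: how adding involuntary part-timers and marginally
# attached workers changes the unemployment picture.
labor_force = 150_000          # thousands of people (hypothetical)
unemployed = 12_000
involuntary_part_time = 8_000  # working part-time for economic reasons
marginally_attached = 2_000    # want work, looked in last 12 months, not last 4 weeks

headline_rate = unemployed / labor_force
broader_rate = (unemployed + involuntary_part_time + marginally_attached) / (
    labor_force + marginally_attached
)

print(f"Headline rate: {headline_rate:.1%}")   # 8.0%
print(f"Broader rate:  {broader_rate:.1%}")    # 14.5%
```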
An example from Greg Mankiw’s blog:
Should we [the U.S.] envy European healthcare? Gary Becker says the answer is no:
“A recent excellent unpublished study by Samuel Preston and Jessica Ho of the University of Pennsylvania compare mortality rates for breast and prostate cancer. These are two of the most common and deadly forms of cancer – in the United States prostate cancer is the second leading cause of male cancer deaths, and breast cancer is the leading cause of female cancer deaths. These forms of cancer also appear to be less sensitive to known attributes of diet and other kinds of non-medical behavior than are lung cancer and many other cancers. [Health effects of diet and behavior should be excluded when comparing the quality of healthcare across countries. FS]
These authors show that the fraction of men receiving a PSA test, which is a test developed about 25 years ago to detect the presence of prostate cancer, is far higher in the US than in Sweden, France, and other countries that are usually said to have better health delivery systems. Similarly, the fraction of women receiving a mammogram, a test developed about 30 years ago to detect breast cancer, is also much higher in the US. The US also more aggressively treats both these (and other) cancers with surgery, radiation, and chemotherapy than do other countries.
Preston and Hu show that this more aggressive detection and treatment were apparently effective in producing a better bottom line since death rates from breast and prostate cancer declined during the past 20 [years] by much more in the US than in 15 comparison countries of Europe and Japan.” (source)
Another example: the website of the National Alert Registry for sexual offenders used to post a few “quick facts”. One of them said:
“The chance that your child will become a victim of a sexual offender is 1 in 3 for girls… Source: The National Center for Victims of Crime”.
Someone took the trouble of actually checking this source, and found that it said:
Twenty-nine percent [i.e. approx. 1 in 3] of female rape victims in America were younger than eleven when they were raped.
One in three female rape victims is a young girl, but you can’t turn that around and say that one in three young girls will become the victim of rape. Perhaps they will, but you can’t know that from these data. Similarly, you can’t conclude from the way the U.S. deals with two diseases that it “shouldn’t envy European healthcare”. Perhaps it shouldn’t, but more general data on life expectancy suggest that it should.
These are two examples of induction or inductive reasoning (sometimes called inductive logic) gone wrong: reasoning that formulates general laws based on limited observations of recurring patterns. Induction is employed, for example, in using specific propositions such as:
This door is made of wood.
to infer general propositions such as:
All doors are made of wood. (source)
As I’ve stated before in this series about errors and lies in statistics, many things can go wrong in the design and execution of opinion surveys. And opinion surveys are a common tool in data gathering in the field of human rights.
As it’s often impossible (and undesirable) to question a whole population, statisticians usually select a sample from the population and ask their questions only to the people in this sample. They assume that the answers given by the people in the sample are representative of the opinions of the entire population. But that’s only the case if the sample is a fully random subset of the population – that means that every person in the population should have an equal chance of being chosen – and if the sample hasn’t been distorted by other factors such as self-selection by respondents (a common thing in internet polls) or personal bias by the statistician who selects the sample.
A sample that is too small is also not representative of the entire population. For example, if we ask 100 people whether they approve or disapprove of discrimination against homosexuals, and 55 of them say they approve, we might assume that about 55% of the entire population approves. But it could possibly be that only 45% of the total population approves, and that we just happened, by chance, to interview an unusually large percentage of people who approve. For example, this may have happened because, by chance and without being aware of it, we selected the people in our sample in such a way that there are relatively more religious conservatives in the sample than in society as a whole.
This is the problem of sample size: the smaller the sample, the greater the influence of luck on the results we get. Asking the opinion of 100 people, and taking this as representative of millions of citizens, is like tossing a coin 10 times and assuming – after getting 3 heads and 7 tails – that the probability of heads is 30%. We all know that it’s not 30% but 50%. And we know this because we know that when we increase the “sample size” – i.e. when we toss more than 10 times, say a thousand times – we will get heads and tails approximately half of the time each. Likewise for our survey on homosexuality: increasing the sample size reduces the chance that religious conservatives (or other groups) are disproportionately represented in the sample.
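For those who like to check this for themselves, here’s a small Python simulation of the coin-toss point; the fair coin and the particular toss counts are just the example’s assumptions:

```python
import random

# With only 10 tosses the observed share of heads bounces around a lot;
# with thousands of tosses it settles near the true 50%.
random.seed(42)

def share_of_heads(n_tosses):
    return sum(random.random() < 0.5 for _ in range(n_tosses)) / n_tosses

for n in (10, 100, 1_000, 100_000):
    print(n, share_of_heads(n))
```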
When analyzing survey results, the first thing to look at is the sample size, together with the confidence level (usually 95%) and the margin of error (usually plus or minus 5%). A small margin of error at a high confidence level indicates that the sample was sufficiently large; it says nothing, however, about whether the sample was actually random.
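And here’s a rough sketch of how the margin of error shrinks as the sample grows, using the standard approximation for a sample proportion (the 55% figure is the hypothetical survey result from above):

```python
import math

# Approximate 95% margin of error for a sample proportion p: 1.96 * sqrt(p*(1-p)/n).
def margin_of_error(p, n, z=1.96):
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 500, 1_000, 10_000):
    print(f"n={n}: ±{margin_of_error(0.55, n):.1%}")
# n=100 gives roughly ±10%; n=1,000 roughly ±3%.
```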
I often see graphs that contain a time series of some sort, but the numbers are just plain numbers, not normalized by population. Here’s an example of a graph from the Bush era, touting the supposedly beneficial effects of Bush’s labor policy on job growth (green line, “jobs on the rise”, number of jobs in thousands):
Just presenting the number of jobs without relating it to the population is meaningless. Maybe the population grew faster than the number of jobs, in which case the “growth” exhibited here is in fact a relative decline. Or maybe the population shrank, in which case the growth in the number of jobs was even more impressive than it looks.
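A minimal sketch with invented numbers shows how absolute job growth can hide a falling employment rate:

```python
# Made-up numbers: more jobs in absolute terms, yet a lower employment rate,
# because the population grew faster than employment did.
jobs_start, jobs_end = 132_000, 138_000                   # thousands of jobs (hypothetical)
population_start, population_end = 215_000, 233_000       # thousands of working-age people

print(jobs_end - jobs_start)              # +6,000: "jobs on the rise"
print(jobs_start / population_start)      # ≈ 0.614
print(jobs_end / population_end)          # ≈ 0.592: the employment rate actually fell
```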
Here’s the correct graph, showing that employment did increase under Bush, but decreased during the last years of his presidency:
“Population” can mean actual population (i.e. people or residents), but can also mean any other relevant basis of comparison. For example:
The following statistics suggest that 16-year-olds are safer drivers than people in their twenties, and that octogenarians are very safe:
As the following graph shows, the reason 16-year-olds and octogenarians appear to be safe drivers is that they don’t drive nearly as much as people in other age groups:
Another example is the national debt statistic. Often the graph shows just the national debt in dollars, without relating it to GDP. Whereas the absolute amounts do have some relevance, it’s better to express the debt as a percentage of GDP, because a bigger economy can carry a bigger debt (a poor household may go bankrupt with a debt of $10,000, whereas a rich household can live with a debt of perhaps $100,000).
Take this graph for instance:
Now compare it to this one:
Or this, slightly more recent one, including the latest recession:
And a final example: when looking at the relative safety of air travel and road travel, and the probability of dying in either a road accident or a plane accident, you can find divergent results depending on how you divide: number of casualties per trip, per mile traveled, per hour traveled, etc.
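A tiny example with entirely made-up figures shows how the ranking can flip with the denominator:

```python
# Entirely invented numbers, just to show that the "safer" mode depends on what you divide by.
road = {"deaths": 100, "trips": 1_000_000, "miles": 10_000_000}
air  = {"deaths": 10,  "trips": 10_000,    "miles": 10_000_000}

for mode, d in (("road", road), ("air", air)):
    print(f"{mode}: {d['deaths'] / d['trips']:.6f} deaths per trip, "
          f"{d['deaths'] / d['miles']:.7f} per mile")
# With these numbers, air looks more dangerous per trip but safer per mile.
```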
The data on U.S. defense spending (“defense” being of course a euphemism) are here. (I hope the connection to the issue of human rights is obvious and doesn’t need spelling out). The amounts involved are incredible, and yet you can still find national security hawks who believe that it isn’t enough, or who advocate that cutting some of this spending would be extremely dangerous. The Heritage Foundation, for example, has an article out lambasting the Obama administration for some supposed spending cuts. They have this graph for instance:
Now, this graph should be used in every textbook on statistics as a classic example of misinformation and manipulation of data. As Benjamin H. Friedman points out:
It’s true that defense spending will probably decline as a percentage of GDP, assuming the economy recovers. But that’s because GDP grows. Ours [GDP] is more than six times bigger than it was in 1950.
The correct way to measure growth or decline in defense spending is to look at the amounts spent on defense in real, inflation adjusted terms. See the solid line in this graph:
And then it’s clear that the U.S. spends more now than at the height of the Cold War. Friedman again:
By saying that defense spending needs to grow with GDP to be “level”, you are arguing for an annual increase in defense spending without saying so directly. That’s the point, of course. (source)
Defense hawks want military spending to rise together with GDP growth, whatever the international situation, whatever the threats.
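A small sketch with hypothetical figures shows why keeping spending “level” as a share of GDP usually means real growth (the GDP, share and inflation numbers below are invented for illustration):

```python
# Hypothetical figures: spending that stays flat as a share of GDP still grows
# in real terms whenever GDP grows faster than prices.
gdp_then, gdp_now = 6.0e12, 15.0e12      # nominal GDP, dollars (invented)
defense_share = 0.04                     # 4% of GDP in both years
prices_then, prices_now = 1.0, 1.6       # cumulative inflation factor (invented)

nominal_then = gdp_then * defense_share
nominal_now = gdp_now * defense_share
real_now_in_old_dollars = nominal_now / prices_now * prices_then

print(nominal_then / 1e9)                # 240 (billion, old dollars)
print(real_now_in_old_dollars / 1e9)     # 375 (billion, old dollars): a real increase
```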
As Matthew Yglesias points out:
Since economic growth causes real wages to rise over time, there is some reason for thinking that a military sized appropriately to the strategic environment would need real increases in spending to maintain its level of capabilities. But one way or another, the crucial issue is that the appropriate level of defense spending is determined by the nature of the strategic environment, not by the pace of economic growth. The US economy grew rapidly during the 1990s but the level of military threats facing the country didn’t—thus, a decline in defense expenditures relative to GDP was appropriate.
One interesting trope both in the substance and rhetoric of this argument from Heritage is the idea that 9/11 ought to have touched off a large and sustained increase in defense spending. On the merits, this is a little hard to figure out. It’s difficult to make the case that the 9/11 plot succeeded because the gap in financial expenditures between the U.S. government and Osama bin Laden was not big enough. Would an extra aircraft carrier have helped? A more advanced fighter plane? A larger Marine Corps? Additional nuclear weapons? One of the most realistic ways an organization like al-Qaeda can damage the United States is to provoke us into wasting resources on a far larger scale than they could ever destroy. The mentality Heritage is expressing here is right in line with that path.
Those who want to cover up human rights violations often manipulate statistics in ways that don’t involve any outright false statement. For example, they can change the unit of measurement. Suppose we want to know how many forced disappearances there are in Chechnya. Assuming we have good data, this isn’t hard to do. The number of disappearances that have been registered, by the government or some NGO, is x on a total Chechen population of y, giving z%. The Russian government may decide that the better measurement is for Russia as a whole. Given that there are almost no forced disappearances in other parts of Russia, the z% goes down dramatically, perhaps close to or even below the level of other comparable countries.
Good points for Russia! But that doesn’t mean that the situation in Chechnya is OK. The data for Chechnya are simply “drowned” in those for Russia as a whole, giving the impression that, “overall”, Russia isn’t doing all that badly. This, however, is misleading. The proper unit of measurement should be limited to the area where the problem occurs. The important thing here isn’t a comparison of Russia with other countries; it’s an evaluation of a local problem.
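A back-of-the-envelope sketch with invented numbers shows how dramatic this dilution can be:

```python
# Invented numbers: a regional problem "drowns" when measured against a national population.
disappearances = 1_000            # registered cases, nearly all in Chechnya (hypothetical)
population_chechnya = 1_000_000
population_russia = 143_000_000

print(f"{disappearances / population_chechnya:.3%} of Chechens")   # 0.100%
print(f"{disappearances / population_russia:.4%} of Russians")     # 0.0007%: looks negligible
```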
Something similar happens to the evaluation of the Indian economy:
Madhya Pradesh, for example, is comparable in population and incidence of poverty to the war-torn Democratic Republic of Congo. But the misery of the DRC is much better known than the misery of Madhya Pradesh, because sub-national regions do not appear on “poorest country” lists. If Madhya Pradesh were to seek independence from India, its dire situation would become more visible immediately. …
But because it’s home to 1.1 billion people, India is more able than most to conceal the bad news behind the good, making its impressive growth rates the lead story rather than the fact that it is home to more of the world’s poor than any other country. …
A 10-year-old living in the slums of Calcutta, raising her 5-year-old brother on garbage and scraps, and dealing with tapeworms and the threat of cholera, suffers neither more nor less than a 10-year-old living in the same conditions in the slums of Lilongwe, the capital of Malawi. But because the Indian girl lives in an “emerging economy,” slated to battle it out with China for the position of global economic superpower, and her counterpart in Lilongwe lives in a country with few resources and a bleak future, the Indian child’s predicament is perceived with relatively less urgency. (source)
All this should be kept in mind when browsing our human rights maps. The fact that a country compares favorably to another doesn’t mean it has no serious problems. More on Russia and Chechnya, and on poverty in India. And more posts in this series.
Sometimes, when you want to compare two time series whose values are of very different magnitudes – for example, the yearly average number of inhabitants of NY and their yearly average height (the former in the millions, the latter in single digits) – you have to plot one series on the left y-axis and the other on the right y-axis, each with its own scale. If you put both on the same y-axis (usually the left), you have to use the same scale. In my example, the line for average height would then be a flat line at the bottom of the graph, more or less coinciding with the x-axis, because its numbers are tiny compared with the population numbers. Plot them on two different y-axes and you can actually compare them.
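For the technically inclined, here’s a minimal matplotlib sketch (with invented data) of the two-axis approach:

```python
import matplotlib.pyplot as plt

# Invented data: one series in the millions, one in single digits.
years = list(range(2000, 2010))
population_millions = [8.0 + 0.05 * i for i in range(10)]   # millions of inhabitants
avg_height_m = [1.70 + 0.001 * i for i in range(10)]        # average height in meters

fig, ax_left = plt.subplots()
ax_left.plot(years, population_millions, color="tab:blue")
ax_left.set_ylabel("population (millions)")

ax_right = ax_left.twinx()                                  # second y-axis with its own scale
ax_right.plot(years, avg_height_m, color="tab:red")
ax_right.set_ylabel("average height (m)")

plt.show()
```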
Here’s an example I discussed before. Proponents of the death penalty usually show the following famous graph in order to “prove” that capital punishment results in fewer homicides in the U.S., and is therefore a successful deterrent:
What’s wrong with this graph is that they tried to jam the two series – which are totally different in terms of magnitude – into one y-axis. To do so, they rescaled the murder series: rather than giving the numbers as they are, they give the number of murders per 66,000 people. Why this strange number, 66,000? Why not the more obvious 100,000, or why not plot the two series on different y-axes? Because this way they can give the impression that the recent rise in the number of executions is closely correlated with the recent drop in the number of homicides.
Now compare this graph to this version, using the same data (but going back a bit further in time) and another graphical presentation:
The important differences:
- the second graph uses two y-axes;
- it counts the number of executions per homicide, rather than just the total number of executions – from the point of view of deterrence, this is obviously the better measure.
We can see from the second graph that the recent upswing in the number of executions is really quite small compared to earlier periods (there was a moratorium on executions in the U.S. in the early 1970s). Unless deterrence has somehow become much more effective than it was in the early part of the 20th century – which is unlikely given the relatively low number of executions and the relatively humane methods – it’s doubtful that such a relatively small increase in the number of executions over the last decades is the cause of the extraordinary decrease in the number of homicides over the same period.
When we look at the whole time series, going back far enough in time; when we use both y-axes; and when we avoid strange measures such as murders per 66,000 people, or executions tout court rather than executions per homicide – then there is no clear correlation between executions and decreasing numbers of murders.
Of course, using a left and right y-axis can also be misleading. I’ll post an example when I come across one.
Another common manipulation of statistics: play a bit with the starting and ending values on the y-axis of your graphs. This can give astonishing results. I prepared a fictional example. Compare the two graphs:
The data are exactly the same, but the y-axis in the second graph starts at 3,500 instead of 0, giving the impression that government violation of freedom of speech in Dystopia rose sharply in 2008 compared to the year before, whereas in reality things are more or less as awful as they were before.
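Here’s roughly how such a pair of graphs can be produced; the data are the same invented “Dystopia” figures in both panels, and only the y-axis minimum differs:

```python
import matplotlib.pyplot as plt

# Invented "Dystopia" data plotted twice: once with the y-axis starting at 0,
# once starting at 3,500. Only the second looks like a dramatic jump.
years = [2005, 2006, 2007, 2008]
violations = [3600, 3620, 3640, 3700]

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
for ax, ymin in zip(axes, (0, 3500)):
    ax.plot(years, violations, marker="o")
    ax.set_ylim(bottom=ymin)
    ax.set_title(f"y-axis starts at {ymin}")

plt.tight_layout()
plt.show()
```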
E.D. Kain of The League of Ordinary Gentlemen believes he has spotted a real-life example of this kind of manipulation. While it’s not difficult to find such examples, this isn’t one. On the contrary, Kain himself commits the mistake he accuses someone else of making. Let me explain. He points to this graph from Conor Clarke on Andrew Sullivan’s blog:
This graph, illustrating (or not, if you’re Kain) the drop in effective income tax rates for the top 1% of Americans from the Clinton to the Bush years, is used by many to argue that a small increase in taxation for the super-rich wouldn’t mean Armageddon. At first sight, the y-axis does indeed look like it has been manipulated in order to highlight a sharp decline in tax rates for the rich.
Hence, Kain goes to work and “corrects” the chart, making the y-axis start at 0% and end at 100%:
This just goes to show that manipulation can also mean using the apparently “neutral” starting and ending points of 0 and 100. Not only does Kain remove all useful information from the previous graph; he also assumes that tax rates can somehow come close to 0% or 100%. One shouldn’t assume this, since it never happens in reality. Making the graph start at 0 and end at 100 implies that it can, and is therefore disingenuous. An example: suppose I want to show that life expectancy hasn’t risen much over the last centuries (which isn’t true). So I use the extreme value of 500 years as the top of my y-axis. Nobody lives, or will ever live, to 500. Obviously, the graph will show no visible increase in life expectancy, even though people now live roughly twice as long, on average, as they did a thousand years ago.
Lesson: the minimum and maximum values on the y-axis should be close to realistic real-life minimums and maximums. In that respect, the Clarke graph is the better one. (Although he could have used a longer period, avoiding another error.)
Just to show that this type of lie occurs in real life:
Statistics can be dangerous, as is evident from the previous posts in this series. People making them can make mistakes, or can use them to deceive. And people reading them can misinterpret them. Our treatment of human rights on this blog depends heavily on the use of statistics, and so the quality of those statistics is important. This blog series mentions some of the things that can go wrong.
Statistical mistakes or statistical lies occur in all kinds of fields, not only the field of human rights. Here’s one that is often made in discussions on climate change. It has to do with measuring growth rates (which we also do for human rights).
Kevin Drum has a quote from George Will, and replies with a graph:
George Will [claimed] that “If you’re 29, there has been no global warming for your entire adult life”. … If you’re 29, you became an adult in 1998, and average global temperatures last year were lower than they were in 1998. So: no global warming in your adult lifetime.
The earth is actually cooling! But as about a thousand serious climate researchers have pointed out, it’s not true. Global temps have been trending up for over a century, but in any particular year they can spike up and down quite a bit. In 1998 they spiked up far above the trend line and last year they spiked below the trend line. So 2008 was cooler than 1998.
Of course, you can prove anything you want if you cherry pick your starting and ending points carefully enough. For example: The year 2000 was below the trend line and 2005 was above it. Temps were up 0.4°C in only five years! The seas will be boiling by 2050!
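To make the cherry-picking point concrete, here’s a small sketch with simulated data (not real temperatures): compare two hand-picked years and you can “show” almost anything, while a fit over the whole series recovers the underlying trend.

```python
import random

# Simulated data: a slow upward trend of +0.01 per year plus yearly noise.
random.seed(1)
years = list(range(1980, 2010))
temps = [0.01 * (year - 1980) + random.gauss(0, 0.1) for year in years]

# Comparing two hand-picked years can point in either direction...
print(temps[years.index(2008)] - temps[years.index(1998)])

# ...while a least-squares slope over the whole series stays close to the true +0.01/year.
n = len(years)
mean_year, mean_temp = sum(years) / n, sum(temps) / n
slope = sum((y - mean_year) * (t - mean_temp) for y, t in zip(years, temps)) / sum(
    (y - mean_year) ** 2 for y in years
)
print(slope)
```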
Here’s another example of cherry-picking starting or ending dates in a time series so as to highlight or drown a growth rate (positive or negative), this time more closely related to the issue of human rights (more specifically the right to work).* Compare these two graphs (in the first graph, just look at the red line for “unemployment rate”; the rest isn’t important for now – I’ll come back to it in a future post because there are other problems with this first graph):
The first graph makes the – honest? – mistake of starting in 2003, giving the impression that Bush’s economic policies brought down unemployment. The second graph, however, gives more historical perspective because it starts earlier, and shows that unemployment was much lower before Bush (who took office in 2001) and that the decrease during his presidency wasn’t as spectacular as the first graph suggests.
Of course, you can’t hold a president responsible for unemployment, at least not exclusively. But then neither should you tweak graphs so as to give the impression that the president’s policies have a beneficial impact (read the title of the first graph).
* Technically, this isn’t a growth rate, just a time series, but the same logic holds.
I don’t think it’s a good idea to be blinded by love, and I apply that to my love for statistics. If you’re tempted to take the statistics on this blog (or elsewhere) too seriously, take a look at the image below (and also this one).
The same thing happens in this joke:
Two statisticians were flying from Los Angeles to New York. About an hour into the flight, the pilot announced, “Unfortunately, we have lost an engine, but don’t worry: There are three engines left. However, instead of five hours, it will take seven hours to get to New York.”
A little later, he told the passengers that a second engine had failed. “But we still have two engines left. We’re still fine, except now it will take ten hours to get to New York.”
Somewhat later, the pilot again came on the intercom and announced that a third engine had died. “But never fear, because this plane can fly on a single engine. Of course, it will now take 18 hours to get to New York.”
At this point, one statistician turned to another and said, “Gee, I hope we don’t lose that last engine, or we’ll be up here forever!”
Unfortunately, such things don’t happen only in jokes. It’s quite common to take a trend and assume it will continue on the same path it has taken in the past.
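In the same spirit as the joke, here’s a deliberately silly linear extrapolation (the numbers are invented):

```python
# A child grows 6 cm per year between ages 5 and 10, so by age 60 ...
height_at_5, height_at_10 = 110, 140            # cm, made-up but plausible values
growth_per_year = (height_at_10 - height_at_5) / 5

print(height_at_10 + growth_per_year * 50)      # 440 cm at age 60, if the trend "continued"
```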
You know I love graphs and statistics, so here’s one showing how importing lemons from Mexico reduces highway fatality rates in the U.S.:
And here’s another one. Just so that you don’t automatically believe everything I write (as if you would), and a funny reminder that correlation doesn’t necessarily imply causation.
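For what it’s worth, a high correlation coefficient is easy to manufacture from two unrelated series that merely trend in the same direction over time; the numbers below are invented, not the actual lemon-import or fatality figures:

```python
# Two invented series that both happen to decline over time; the Pearson
# correlation is close to +1, yet neither causes the other.
lemon_imports = [530, 510, 470, 450, 400, 360]          # invented tons per year
highway_deaths = [15.9, 15.7, 15.3, 15.2, 14.9, 14.8]   # invented deaths per 100,000

def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

print(pearson(lemon_imports, highway_deaths))   # close to +1, no causal link whatsoever
```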
A busload of politicians was driving down a country road, when suddenly the bus ran off the road and crashed into an old farmer’s barn.
The old farmer got off his tractor and went to investigate. Soon he dug a hole and buried the politicians. A few days later, the local sheriff came out, saw the crashed bus and asked the old farmer where all the politicians had gone.
The old farmer told him he had buried them.
The sheriff asked the old farmer, “Lordy, they were ALL dead?”
The old farmer said, “Well, some of them said they weren’t, but you know how them crooked politicians lie.”
Here is something on the role of truth and opinions in politics.