lies and statistics, statistics

Lies, Damned Lies, and Statistics (40): The Composition Effect

wage stagnation

Take the evolution of the median wage in the US over the last decades. The trend is nearly flat and one would therefore naturally assume that there have been hardly any income gains for the average US citizen. However, some have argued that this conclusion is wrong because it ignores the composition effect. In this example, the composition of the labor force has obviously changed over the last decades, and has changed dramatically. More women and immigrants have entered the workforce and those tend to be lower income groups, especially at the moment of entry. When they enter the labor force, their incomes go up, obviously, but they bring the average and the median down. When, at the same time, the wages of white men go up, the aggregate effect may be close to zero. And yet, paradoxically, all groups have progressed. The conclusion that the average citizen did not progress would only hold if the composition of the population whose wages are compared over time had not changed.
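
To see how this can happen, here is a minimal sketch in Python with invented numbers. The group names, headcounts and wages are assumptions for illustration only, not actual US data, and the sketch uses the mean rather than the median because a weighted mean keeps the arithmetic easy to follow:

```python
# Hypothetical workforce in two periods: group -> (headcount, average wage).
# All numbers are invented for illustration; they are not real US data.
earlier = {"incumbents": (90, 20.0), "new entrants": (10, 10.0)}
later = {"incumbents": (60, 23.0), "new entrants": (40, 13.0)}


def overall_mean(groups):
    """Average wage across all workers, weighting each group by headcount."""
    total_workers = sum(count for count, _ in groups.values())
    total_wages = sum(count * wage for count, wage in groups.values())
    return total_wages / total_workers


for name in earlier:
    print(name, earlier[name][1], "->", later[name][1])  # every group gains
print("overall:", overall_mean(earlier), "->", overall_mean(later))
```

With these made-up figures both groups gain 3 per hour, yet the overall average stays at 19.0 in both periods, simply because the lower-paid group grew from 10 to 40 percent of the workforce.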

Now, it seems to be the case that in this particular example there is really no large composition effect (see here). However, this effect is always a possibility and one should at least consider it and possibly rule it out before drawing hasty conclusions from historical time series. If you don’t do this, or don’t even try, then you may be “lying with statistics”.

More posts in this series are here.

lies and statistics, statistics

Lies, Damned Lies, and Statistics (39): Availability Bias

example of availability bias on a newspaper’s frontpage

(source)

This is actually only about one type of availability bias: if a certain percentage of your friends are computer programmers or have red hair, you may conclude that the same percentage of a total population are computer programmers or have red hair. You’re not working with a random and representative sample – perhaps you like computer programmers or you are attracted to people with red hair – so you make do with the sample that you have, the one that is immediately available, and you extrapolate on the basis of that.

Most of the time you’re wrong to do so – as in the examples above. In some cases, however, it may be a useful shortcut that allows you to avoid the hard work of establishing a random and representative sample and gathering information from it. If you use a sample that’s not strictly random but also not biased by your own strong preferences such as friendship or attraction, it may give reasonably adequate information on the total population. If you have a reasonably large number of friends and if you couldn’t care less about their hair color, then it may be OK to use your friends as a proxy of a random sample and extrapolate the rates of each hair color to the total population.

The problem is the following: because the use of available samples is sometimes OK, we are perhaps fooled into thinking that they are OK even when they’re not. And then we come up with arguments like:

  • Smoking can’t be all that bad. I know a lot of smokers who have lived long and healthy lives.
  • It’s better to avoid groups of young black men at night, because I know a number of people who have been attacked by young black men (and I’ll forget that I’ll hardly ever hear of people not having been attacked).
  • Cats must have a special ability to fall from great heights and survive, because I’ve seen a lot of press reports about such events (and I forget that I’ll rarely read a report about a cat falling and dying).
  • Violent criminals should be locked up for life because I’m always reading newspaper articles about re-offenders (again, very unlikely that I’ll read anything about non-re-offenders).

As is clear from some of the examples above, availability bias can sometimes have consequences for human rights: it can foster racial bias, it can lead to “tough on crime” policies, etc.

More posts in this series are here.

lies and statistics, statistics

Lies, Damned Lies, and Statistics (38): The Base-Rate Fallacy

help wanted white only

(source)

When judging whether people engage in discrimination it’s important to make the right comparisons. Take the example of an American company X where 98 percent of employees are white and only 2 percent are black. If you compare to (“if your base is”) the entire US population – of which about 13 percent are African American – then you’ll conclude that company X is motivated by racism in its employment decisions.

However, in cases such as these, it’s probably better to use another base rate, namely the pool of applicants rather than the total population. If only 0.1 percent of job applications were from blacks, then an employment rate of 2 percent blacks actually suggests that company X has favored black applicants.
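
The comparison can be spelled out in a few lines of Python. This is only a sketch: the percentages are the ones from the example above, and `representation_ratio` is a name made up here for the simple ratio between a group's share in the company and its share in the chosen base:

```python
def representation_ratio(share_in_company, share_in_base):
    """Ratio > 1: the group is over-represented relative to the chosen base;
    ratio < 1: it is under-represented."""
    return share_in_company / share_in_base


employees_black = 2.0    # percent of company X's employees (from the example)
population_black = 13.0  # percent of the US population (from the example)
applicants_black = 0.1   # percent of job applicants (from the example)

vs_population = representation_ratio(employees_black, population_black)
vs_applicants = representation_ratio(employees_black, applicants_black)

print(round(vs_population, 2))  # well below 1
print(round(vs_applicants, 2))  # well above 1
```

The same 2 percent looks like under-representation against one base and like strong over-representation against the other, which is why the choice of base has to be argued for rather than assumed.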

The accusation of racism betrays a failure to point to the real causes of discrimination. It’s a failure to go back far enough and to think hard enough. The fact that only 0.1 percent of applicants were black – instead of the expected 13 percent – may still be due to racism, but not racism in company X. Blacks may suffer from low quality education, which results in a skill deficit among blacks, which in turn leads to a low application rate for certain jobs.

The opposite error is also quite common: people point to the number of blacks in prison, compare this to the total number of blacks, and conclude that blacks must be more attracted to crime. However, they should probably compare incarceration rates to arrest rates (blacks are arrested at higher rates because of racial profiling). And they should take into account jury behavior as well.

More about racism. More posts in this series.

lies and statistics, statistics

Lies, Damned Lies, and Statistics (37): When Surveyed, People Express Opinions They Don’t Hold

dumbfounded

(source)

It’s been a while since the last post in this series, so here’s a recap of its purpose. This blog promotes the quantitative approach to human rights: we need to complement the traditional approaches – anecdotal, journalistic, legal, judicial etc. – with one that focuses on data, country rankings, international comparisons, catastrophe measurement, indexes etc.

Because this statistical approach is important, it’s also important to engage with measurement problems, and there are quite a few in the case of human rights. After all, you can’t measure respect for human rights like you can measure the weight or size of an object. There are huge obstacles to overcome in human rights measurement. On top of the measurement difficulties that are specific to the area of human rights, this area suffers from some of the general problems in statistics. Hence, there’s a blog series here about problems and abuse in statistics in general.

Take for example polling or surveying. A lot of the information on human rights violations, though not all of it, comes from surveys and opinion polls, and it’s therefore of the utmost importance to describe what can go wrong when designing, implementing and using surveys and polls. (Previous posts about problems in polling and surveying are here, here, here, here and here).

One interesting problem is the following:

Simply because the surveyor is asking the question, respondents believe that they should have an opinion about it. For example, researchers have shown that large minorities would respond to questions about obscure or even fictitious issues, such as providing opinions on countries that don’t exist. (source, source)

Of course, when people express opinions they don’t have, we risk drawing the wrong conclusions from surveys. We also risk that a future survey asking the same questions comes up with totally different results. Confusion guaranteed. After all, if we make up our opinions when someone asks us, and those aren’t really our opinions but rather unreflected reactions we give because of a sense of obligation, it’s unlikely that we will express the same opinion in the future.

Another reason for this effect is probably our reluctance to come across as ignorant: rather than selecting the “I don’t know/no opinion” answer, we just pick one of the other possible answers. Again a cause of distortions.

(image source)
governance, lies and statistics, statistics

Lies, Damned Lies, and Statistics (36): Manipulating the X-axis Scale in Graphs

Although less common than its sister lie – manipulating the y-axis in graphs – manipulation of the x-axis does occur.

But first a technical note: “bins” are clusters of subpopulations for which the frequency of some characteristic is measured. Together, the bins form a histogram or a graphical representation showing the distribution of a characteristic for an entire population (like a survey group). Here’s an example:

histogram example

A survey of 31 black cherry trees revealed that 3 of them had a height between 60 and 65 feet, 8 had a height between 70 and 75 feet, etc. There are 6 bins on this graph’s x-axis, probably because the person analyzing the survey data thought that 6 would be an adequate number. And indeed, dividing a population of 31 into 20 or 2 subgroups would probably not result in interesting numbers, at least not in this case.

Working with bins means that the x-axis shows a split of the surveyed population into smaller groups according to certain ranges of the characteristic that was surveyed (height in this case), making it possible to see how many individuals (trees in this case) belong to a certain range or subgroup. Notice that in this example the bins are

  • not too numerous
  • not too few
  • of equal size (always a range of 5 feet)
  • consecutive and
  • non-overlapping.

As they should be. (The bins needn’t always be of equal size, but they often are.)

Many histograms have a “bell shape” like in this example (in which case they show what is called a “normal distribution”), but they can also have other shapes, depending on the population and the characteristic surveyed. A histogram of the frequency of a certain disease among the population of a country, with the population divided into bins according to individuals’ age, would be skewed to the left since older people – on the right – may suffer more frequently from the disease.

Since all this is probably old news to most of you, let’s go straight to an example of manipulation of bins. Such manipulation often involves tinkering with the ranges of certain bins, so that the different bins are no longer of the same size. The following example is about income shares across the population of the U.S. Technically, the graph below is not a histogram because the y-axis shows cumulated income for ranges of income groups rather than frequencies, but for our purposes it’s equivalent:

wsj graph of income distribution

(source, source)

This graph is then used by the Wall Street Journal to argue against increased taxation of the rich as a means to close the budget deficit, because supposedly that’s not where the money is. Or, better, the money is there, otherwise they wouldn’t be rich, but there are just not enough of them; taxing the middle class would be better according to the WSJ because it’s they who have all the money … at least if you believe their graph. The problem is that the highest bar in their graph is for people making $100-$200K, whereas the bar immediately to the left of this one is for the income range of $75K to $100K – an income range only one-quarter the size. No surprise that the bar for $100-$200K is so much larger than the rest…
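
One way to check whether a tall bar is just a wide bin is to divide each bar by the width of its bin. The sketch below does this in Python for invented income shares; the bins and percentages are assumptions chosen for illustration and are not the WSJ's data:

```python
# Hypothetical income bins: (lower bound, upper bound) in $1,000s -> share of
# total income in percent. Invented numbers, not the WSJ's data.
bins = {
    (50, 75): 10.0,
    (75, 100): 10.0,
    (100, 200): 25.0,
}


def share_per_thousand(bins):
    """Divide each bin's share by its width so bins of unequal size compare."""
    return {
        bounds: share / (bounds[1] - bounds[0])
        for bounds, share in bins.items()
    }


tallest_raw = max(bins, key=bins.get)
density = share_per_thousand(bins)
tallest_adjusted = max(density, key=density.get)

print(tallest_raw)       # the widest bin has the tallest raw bar
print(tallest_adjusted)  # but not the highest share per $1,000 of range
```

In this made-up case the $100-$200K bin towers over the others in the raw graph, yet per $1,000 of income range it holds less than either of the narrower bins next to it.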

If you want to argue that taxing the rich does make it possible to bring in a lot more revenue, then you could use this alternative graph, made from the same data:

wsj graph of income distribution alternative

(source)

Or this one:

blog_where_money_is

(source)

More alternative presentations of the same data are here.

It all depends on what you mean by “rich” and “middle class”, but claiming – as does the WSJ – that $200K is still “middle class” is stretching the point.

More posts in this series are here.

data, economics, lies and statistics, statistics

Lies, Damned Lies, and Statistics (35): Sample Sizes, Ctd.

cherry picking

(source)

This isn’t the first time I’ve mentioned sample sizes as a common problem in statistics. Usually, the problem is one of survey design: samples of respondents that are too small produce unreliable survey results.

However, the same error – or fraud, when the error is willful – can occur in data interpretation. Take a look at this graph by John Taylor:

investment GDP and unemployment John Taylor graph

(source)

The problem?

Taylor’s conclusion: The data on spending shares show that the most effective way to reduce unemployment is to raise investment as a share of GDP. But why begin the scatter plot in 1990? There’s no good reason. In fact, most folks typically download the entire history of available macro data. … The chart below goes back to 1948:

investment GDP and unemployment John Taylor graph 2

(source)

This is a form of cherry-picking data that allows you to “prove” a strong correlation where there’s actually none at all. In this way, you can find a correlation in almost any data set, as long as you pick a sufficiently small sample of the set. In this example, you can only limit the selection to the last two decades if you have a good argument about why the economy is different now compared to some decades ago, and why there’s a correlation now when there wasn’t before. However, that argument – which would be interesting – seems to be lacking. And if it’s lacking, there’s no excuse for cherry-picking the last two decades.
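
A toy illustration in Python shows how a carefully chosen window can manufacture a correlation. The two series below are invented so that the effect is easy to see; they are not Taylor's data, and the variable names are only borrowed from his graph:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented series, oldest observation first. These are NOT Taylor's data.
investment_share = [1, 2, 3, 4, 5, 6, 7, 8]
unemployment = [1, 2, 3, 4, 4, 3, 2, 1]

full_history = pearson(investment_share, unemployment)
recent_only = pearson(investment_share[-4:], unemployment[-4:])

print(round(full_history, 2))  # no linear relationship over the whole series
print(round(recent_only, 2))   # a perfect negative one in the last four points
```

Over the whole invented series there is no linear relationship at all, while the last four observations on their own line up perfectly. Which of the two pictures you present depends entirely on where you start the chart.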

Other examples of cherry-picking are here. More posts about lies and errors in statistics are here.

data, lies and statistics, statistics

Lies, Damned Lies, and Statistics (34): The Narcissism of Small Differences

What I want to criticize in this installment of our series on lies and statistics, is the ordinal ranking of relatively similar entities in a way that creates the illusion of substantial disparity. You often see it combined with color schemes: one entity that’s just below a threshold value gets one color, and the next one which is just above gets another color, and then it’s like they differ substantially. It’s rather common in maps, of which there’s an example here:

HDI by states of the US

(source, more information on the Human Development Index is here; note: the criticism offered in this post is not directed against the HDI itself)

Louisiana has a score of .801, West Virginia .800, and Mississippi .799, and that makes Mississippi stand out although it’s really not different from the other two.

Something to keep in mind when looking at all the maps I post on this blog, or any ordinal ranking for that matter.

More about lies and statistics here.

lies and statistics, statistics

Lies, Damned Lies, and Statistics (33): The Omitted Variable Bias, Ctd.

dilbert statistical joke

(source, click image to enlarge)

I discussed the so-called Omitted Variable Bias before on this blog (here and here). So I suppose I can mention this other example: guess what is the correlation, on a country level, between per capita smoking rates and life expectancy rates? High smoking rates equal low life expectancy rates, right? And vice versa?

Actually, and surprisingly, the correlation goes the other way: the higher smoking rates – the more people smoke in a certain country – the longer the citizens of that country live, on average.

Why is that the case? Smoking is unhealthy and should therefore make life shorter, on average. However, people in rich countries smoke more; in poor countries they can’t afford it. And people in rich countries live longer. But they obviously don’t live longer because they smoke more but because of the simple fact they have the good luck to live in a rich country, which tends to be a country with better healthcare and the lot. If they smoked less they would live even longer.

Why is this important? Not because I’m particularly interested in smoking rates (although I am interested in life expectancy). It’s important because it shows how easily we are fooled by simple correlations, how we imagine what correlations should be like, and how we can’t see beyond the two elements of a correlation when we’re confronted with one that goes against our intuitions. We usually assume that, in a correlation, one element should cause the other. And apart from the common mistake of switching the direction of the causation, we often forget that there can be a third element causing the two elements in the correlation (in this example, the prosperity of a country causing both high smoking rates and high life expectancy), rather than one element in the correlation causing the other.
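
The mechanism can be reproduced with a handful of invented countries. In the Python sketch below, the income levels, smoking rates and the `life_expectancy` formula are all assumptions made up for illustration: income adds years of life, smoking takes some away, and richer countries smoke more.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented countries: (income level, smoking rate). Not real data.
countries = [(1, 8), (1, 12), (2, 18), (2, 22), (3, 28), (3, 32)]


def life_expectancy(income, smoking):
    """Assumed model: income adds years, smoking takes some away."""
    return 50 + 10 * income - 0.5 * smoking


smoking = [s for _, s in countries]
life = [life_expectancy(i, s) for i, s in countries]

# Across all countries the correlation between smoking and life is positive ...
print(round(pearson(smoking, life), 2))

# ... but at every income level, the heavier-smoking country lives shorter.
for level in (1, 2, 3):
    group = [(s, life_expectancy(i, s)) for i, s in countries if i == level]
    print(level, sorted(group))
```

Even though smoking shortens life by construction in this toy model, the simple correlation across countries comes out strongly positive. Only when income is held fixed does the harmful effect of smoking become visible.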

More posts in this series are here.

data, lies and statistics, statistics

Lies, Damned Lies, and Statistics (32): The Questioner Matters

I’ve discussed the role of framing before: the way in which you ask questions in surveys influences the answers you get and therefore modifies the survey results. (See here and here for instance). It happens quite often that polling organizations or media inadvertently or even deliberately frame questions in a way that nudges people to answer in a particular fashion. In fact, you can frame questions in such a way that you get almost any answer you want.

However, the questioner may matter just as much as the question.

Consider this fascinating new study, based on surveys in Morocco, which found that the gender of the interviewer and how that interviewer was dressed had a big impact on how respondents answered questions about their views on social policy. …

[T]his paper asks whether and how two observable interviewer characteristics, gender and gendered religious dress (hijab), affect survey responses to gender and non-gender-related questions. [T]he study finds strong evidence of interviewer response effects for both gender-related items, as well as those related to support for democracy and personal religiosity … Interviewer gender and dress affected responses to survey questions pertaining to gender, including support for women in politics and the role of Shari’a in family law, and the effects sometimes depended on the gender of the respondent. For support for gender equality in the public sphere, both male and female respondents reported less progressive attitudes to female interviewers wearing hijab than to other interviewer groups. For support for international standards of gender equality in family law, male respondents reported more liberal views to female interviewers who do not wear hijab, while female respondents reported more liberal views to female [interviewers], irrespective of dress. (source, source)

Other data indicate that the effect occurs in the U.S. as well. This is potentially a bigger problem than the framing effect since questions are usually public and can be verified by users of the survey results, whereas the nature of the questioner is not known to the users.

There’s an overview of some other effects here. More on the headscarf is here. More posts in this series are here.

democracy, lies and statistics, statistics

Lies, Damned Lies, and Statistics (31): Common Problems in Opinion Polls

truman and dewey and opinion polls

The classic polling error is from a poll on the 1948 Presidential election in the U.S. On Election night, the Chicago Tribune printed the headline DEWEY DEFEATS TRUMAN, which turned out to be mistaken. The reason the Tribune was mistaken is that their editor trusted the results of a phone survey. Survey research was then in its infancy, and few academics realized that a sample of telephone users was not representative of the general population. Telephones were not yet widespread, and those who had them tended to be prosperous and have stable addresses.

Opinion polls or surveys are very useful tools in human rights measurement. We can use them to measure public opinion on certain human rights violations, such as torture or gender discrimination. High levels of public approval of such rights violations may make them more common and more difficult to stop. And surveys can measure what governments don’t want to measure. Since we can’t trust oppressive governments to give accurate data on their own human rights record, surveys may fill in the blanks. Although even that won’t work if the government is so utterly totalitarian that it doesn’t allow private or international polling of its citizens, or if it has scared its citizens to such an extent that they won’t participate honestly in anonymous surveys.

But apart from physical access and respondent honesty in the most dictatorial regimes, polling in general is vulnerable to mistakes and fraud (fraud being a conscious mistake). Here’s an overview of the issues that can mess up public opinion surveys, inadvertently or not.

Wording effect

There’s the well-known problem of question wording, which I’ve discussed in detail before. Pollsters should avoid leading questions, questions that are put in such a way that they pressure people to give a certain answer, questions that are confusing or easily misinterpreted, wordy questions, questions using jargon, abbreviations or difficult terms, double or triple questions etc. Also quite common are “silly questions”, questions that don’t have meaningful or clear answers: for example “is the catholic church a force for good in the world?” What on earth can you answer to that? Depends on what elements of the church you’re talking about, what circumstances, country or even historical period you’re asking about. The answer is most likely “yes and no”, and hence useless.

The importance of wording is illustrated by the often substantial effects of small modifications in survey questions. Even the replacement of a single word by another, related word, can radically change survey results: see this post for examples.

Of course, it is often claimed that biased poll questions corrupt the average survey responses, but that the overall results of the survey can still be used to learn about time trends and differences between groups. As long as you make the same mistake consistently, you may still find something useful. That’s true, but it’s no reason not to take care with wording. The same trends and differences can be seen in survey results that have been produced with correctly worded questions.

Order effect or contamination effect

Answers to questions depend on the order they’re asked in, and especially on the questions that preceded. Here’s an example:

Fox News yesterday came out with a poll that suggested that just 33 percent of registered voters favor the Democrats’ health care reform package, versus 55 percent opposed. … The Fox News numbers on health care, however, have consistently been worse for Democrats than those shown by other pollsters. (source)

The problem is not the framing of the question. This was the question: “Based on what you know about the health care reform legislation being considered right now, do you favor or oppose the plan?” Nothing wrong with that.

So how can Fox News ask a seemingly unbiased question of a seemingly unbiased sample and come up with what seems to be a biased result? The answer may have to do with the questions Fox asks before the question on health care. … the health care questions weren’t asked separately. Instead, they were questions #27-35 of their larger, national poll. … And what were some of those questions? Here are a few: … Do you think President Obama apologizes too much to the rest of the world for past U.S. policies? Do you think the Obama administration is proposing more government spending than American taxpayers can afford, or not? Do you think the size of the national debt is so large it is hurting the future of the country? … These questions run the gamut from slightly leading to full-frontal Republican talking points. … A respondent who hears these questions, particularly the series of questions on the national debt, is going to be primed to react somewhat unfavorably to the mention of another big Democratic spending program like health care. And evidently, an unusually high number of them do. … when you ask biased questions first, they are infectious, potentially poisoning everything that comes below. (source)

If you want to avoid this mistake – if we can call it that (since in this case it’s quite likely to have been a “conscious mistake” aka fraud) – randomizing the question order for each respondent might help.
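
A minimal sketch of such per-respondent randomization in Python might look as follows. The question texts are placeholders and the function name `question_order` is made up for this illustration; a real survey tool would have its own way of doing this:

```python
import random

# Placeholder questionnaire; the question texts are hypothetical.
QUESTIONS = [
    "Question about the national debt",
    "Question about government spending",
    "Question about foreign policy",
    "Question about health care reform",
]


def question_order(respondent_id, questions=QUESTIONS):
    """Return the questions in a random order that is specific to one
    respondent. Seeding with the respondent id makes the order reproducible,
    so it can be recorded and audited later."""
    rng = random.Random(respondent_id)
    return rng.sample(questions, k=len(questions))


for respondent_id in (1, 2, 3):
    print(respondent_id, question_order(respondent_id))
```

Note that this doesn't remove priming for any individual respondent. It only ensures that no single question systematically comes after the same leading questions, so that the contamination tends to average out over the whole sample.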

Similar to the order effect is the effect created by follow-up questions. It’s well-known that follow-up questions of the type “but what if…” or “would you change your mind if …” change the answers to the initial questions.

Bradley effect

Tom Bradley

The Bradley effect is a theory proposed to explain observed discrepancies between voter opinion polls and election outcomes in some U.S. government elections where a white candidate and a non-white candidate run against each other.

Unlike the wording and order effects, this isn’t an effect created – intentionally or not – by the pollster, but by the respondents. The theory proposes that some voters tend to tell pollsters that they are undecided or likely to vote for a black candidate, and yet, on election day, vote for the white opponent. It was named after Los Angeles Mayor Tom Bradley, an African-American who lost the 1982 California governor’s race despite being ahead in voter polls going into the elections.

The probable cause of this effect is the phenomenon of social desirability bias. Some white respondents may give a certain answer for fear that, by stating their true preference, they will open themselves to criticism of racial motivation. They may feel under pressure to provide a politically correct answer. The existence of the effect is, however, disputed. (Some say the election of Obama disproves the effect, thereby making another statistical mistake).

Fatigue effect

Another effect created by the respondents rather than the pollsters is the fatigue effect. As respondents grow increasingly tired over the course of long interviews, the accuracy of their responses could decrease. They may be able to find shortcuts to shorten the interview; they may figure out a pattern (for example that only positive or only negative answers trigger follow-up questions). Or they may just give up halfway, causing incompletion bias.

However, this effect isn’t entirely due to respondents. Survey design can be at fault as well: there may be repetitive questioning (sometimes deliberately for control purposes), the survey may be too long or longer than initially promised, or the pollster may want to make his life easier and group different polls into one (which is what seems to have happened in the Fox poll mentioned above, creating an order effect – but that’s the charitable view of course). Fatigue effect may also be caused by a pollster interviewing people who don’t care much about the topic.

Sampling effect

Ideally, the sample of people who are to be interviewed for a survey should be a fully random subset of the entire population. That means that every person in the population should have an equal chance of being included in the sample. It also means that there shouldn’t be self-selection (a typical flaw in many if not all internet surveys of the “Polldaddy” variety) or self-deselection. Both reduce the randomness of the sample, which can be seen from the fact that self-selection leads to polarized results. The size of the sample is also important. Samples that are too small typically produce unreliable results.

Even the determination of the total population from which the sample is taken can lead to biased results. And yes, that has to be determined… For example, do we include inmates, illegal immigrants etc. in the population? See here for some examples of the consequences of such choices.

House effect

A house effect occurs when there are systematic differences in the way that a particular pollster’s surveys tend to lean toward one or the other party’s candidates; Rasmussen is known for that.

I probably forgot an effect or two. Fill in the blanks if you care. Go here for other posts in this series.

data, economics, lies and statistics, statistics, trade

Lies, Damned Lies, and Statistics (30): Failing to Correct for Inflation

Inflation is often a significant part of growth in any time series measured in dollars (or other currencies), or – in other words – it’s an important part of an increase over time in data expressed in dollars. So when you compare data for the current year, month or whatever with the same data for some period in the past, you may just see inflation rather than actual growth or increases. By adjusting for inflation, you uncover the real growth. You may even discover that growth hides decline. Here’s an innocuous example of the consequences of failing to adjust data for inflation:

Over the last month, newspapers and film Web sites have proclaimed Avatar the highest-grossing film in American history. … Moviegoers in [the U.S.] have now spent about $700 million on tickets to Avatar. … No. 2 on the all-time list is Titanic, which brought in about $600 million. Avatar surpassed Titanic in late January. The problem with these numbers is that they aren’t adjusted for inflation. … When you adjust movie grosses for inflation, as Box Office Mojo does, you see that “Gone With the Wind” remains the top-grossing movie of all time, with $1.5 billion in box-office sales (using today’s dollars). (source)

This won’t do much damage. The problems start when unadjusted data are being used to push a political point or legislation. For example, one can claim that it isn’t a good idea to raise gasoline taxes because gasoline prices are already very high compared to the old days, but this claim loses much of its strength when you adjust the prices for inflation and it turns out that they are actually rather average, historically.

Of course, you can make mistakes while trying to adjust for inflation, and there are several techniques available, no two of which will give exactly the same numbers. But any adjustment, especially for comparisons over long periods of time, is better than no adjustment at all.
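
The simplest of these techniques rescales an amount by the ratio of a price index at two points in time. Here is a sketch in Python; the box-office totals and the index values are invented for illustration and are not actual grosses or CPI figures:

```python
def adjust_for_inflation(amount, index_then, index_now):
    """Convert an amount from the price level of an earlier period to the
    price level of a later one, using a price index such as a CPI."""
    return amount * index_now / index_then


# Invented figures: two box-office totals and the price index in the year of
# each release. Neither the grosses nor the index values are real data.
older_film = {"gross": 600.0, "index": 160.0}
newer_film = {"gross": 700.0, "index": 215.0}

older_in_todays_money = adjust_for_inflation(
    older_film["gross"], older_film["index"], newer_film["index"]
)

print(newer_film["gross"] > older_film["gross"])    # nominal ranking
print(newer_film["gross"] > older_in_todays_money)  # ranking after adjustment
```

With these made-up numbers the newer film wins in nominal terms, but once the older gross is expressed at the later price level the ranking flips.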

There’s a cool inflation adjusting tool here (only for U.S. data I’m afraid).

More lying with statistics. More on taxation. More on inflation.

discrimination and hate, equality, lies and statistics, statistics

Lies, Damned Lies, and Statistics (29): How (Not) to Frame Survey Questions, Ctd.

Following up from an older post on the importance of survey questions, here’s a nice example of the way in which small modifications in survey questions can radically change survey results:

homosexual or gay importance of survey questions

(source, source, source)

Another example:

Our survey asked the following familiar question concerning the “right to die”: “When a person has a disease that cannot be cured and is living in severe pain, do you think doctors should or should not be allowed by law to assist the patient to commit suicide if the patient requests it?”

57 percent said “doctors should be allowed,” and 42 percent said “doctors should not be allowed.” As Joshua Green and Matthew Jarvis explore in their chapter in our book, the response patterns to euthanasia questions will often differ based on framing. Framing that refers to “severe pain” and “physicians” will often lead to higher support for ending the patient’s life, while including the word “suicide” will dramatically lower support. (source)

Similarly, seniors are willing to pay considerably more for “medications” than for “drugs” or “medicine” (source). Yet another example involves the use of “Wall Street”: there’s greater public support for banking reform when the issue is more specifically framed as regulating “Wall Street banks”.

survey wording effect

(source)

What’s the cause of this sensitivity? Difficult to tell. Cognitive bias probably has some effect, as does the psychology of associations (“suicide” brings up images of blood and pain, whereas “physicians” brings up images of control; similarly “homosexual” evokes sleazy bars, “gay” evokes art and design types). Maybe also the wish not to offend the person asking the question. Anyway, the conclusion is that pollsters should be very careful when framing questions. One tactic could be to use as many different words and synonyms as possible in order to avoid a bias created by one particular word.

More on DADT and homosexuals in the military. More on assisted suicide. More on lying with statistics.

Standard
democracy, lies and statistics, statistics

Lies, Damned Lies, and Statistics (28): Push Polls

Push polls are used in election campaigns, not to gather information about public opinion, but to modify public opinion in favor of a certain candidate, or – more commonly – against a certain candidate. They are called “push” polls because they intend to “push” the people polled towards a certain point of view.

Push polls are not cases of “lying with statistics” as we usually understand them in this blog series, but it’s appropriate to talk about them since they are very similar to a “lying technique” that we discussed many times, namely leading questions (see here for example). The difference here is that leading questions aren’t used to manipulate poll results, but to manipulate people.

push poll

The push poll isn’t really a poll at all, since the purpose isn’t information gathering, which is why many people don’t like the term and label it oxymoronic. A better term would indeed be “advocacy telephone campaign”. A push poll is more like a gossip campaign, a propaganda effort or telemarketing. Push polls are very similar to political attack ads, in the sense that they intend to smear candidates, often with little basis in fact. Compared to political ads, push polls have the “advantage” that they don’t seem to emanate from the campaign offices of one of the candidates (push polls are typically conducted by bogus polling agencies). Hence it’s more difficult for the recipients of a push poll to classify the “information” it contains as political propaganda, and they are therefore more likely to believe it. Which is of course the reason push polls are used. Also, the fact that they are presented as “polls” rather than campaign messages makes it more likely that people listen, and as they listen more, they internalize the messages better than in the case of outright campaigning (which they often dismiss as propaganda).

Push polls usually, but not necessarily, contain lies or false rumors. They may also be limited to misleading or leading questions. For example, a push poll may ask people: “Do you think that the widespread and persistent rumors about Obama’s Muslim faith, based on his own statements, connections and acquaintances, are true?”. Some push polls may even contain some true but unpleasant facts about a candidate, and then hammer on these facts in order to change the opinions of the people being “polled”.

Two infamous examples of push polls are the poll used by Bush against McCain in the Republican primaries of 2000 (insinuating that McCain had an illegitimate black child) and the poll used by McCain (fast learner!) against Obama in 2008 (alleging that Obama had ties with the PLO).

One way to distinguish legitimate polls from push polls is the sample size. The former are usually content with relatively small sample sizes (but not too small), whereas the latter typically want to “reach” as many people as possible. Push polls won’t include demographic questions about the people being polled (gender, age, etc.) since there is no intention to aggregate results, let alone aggregate by type of respondent. Another way to identify push polls is the selection of the target population: normal polls try to reach a random subset of the population; push polls are often targeted at certain types of voters, namely those likely to be swayed by negative campaigning about a certain candidate. Push polls also tend to be quite short compared to regular polls, since the purpose is to reach a maximum number of people.
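Why are legitimate polls content with relatively small samples? A rough sketch, assuming simple random sampling and the textbook 95% margin-of-error formula for a proportion (the function name and the sample sizes are my own choices for illustration):

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


for n in (100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: +/- {margin_of_error(n):.2%}")
```

Under these assumptions the gain in precision flattens out quickly once you get beyond a thousand or so respondents. Someone who insists on calling tens of thousands of people is probably after something other than precision.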

More posts about lying with statistics.

Standard
health, lies and statistics, statistics

Lies, Damned Lies, and Statistics (27): Jumping to Conclusions, Ctd.

poll healthcare reform anti-obama racism

(source, cartoon by Eric Allie)

In a previous post in this series, I already mentioned the temptation to see things in data that just aren’t there, or to make data say things they don’t really say. I focused on the correlation-causation problem, a typical case of “jumping to conclusions”.

Elsewhere I gave the following example: there are data doing the rounds claiming that Republicans follow political news more closely than Democrats, which has some people saying that Republicans are more knowledgeable and make better political choices. However, people don’t read more news because they are Republicans, but because they are relatively wealthy and older, and people who are wealthy and older also tend to be more of the Republican type. So if you see data showing a correlation between political conservatism and attention to the news, don’t jump to conclusions and say that conservatives are inherently more attentive to the news, let alone that they make better political choices. A young and relatively poor conservative probably pays less attention than a wealthy and older liberal. Attention isn’t a function of political orientation. It has other causes.

However, as is evident from the cartoon above, data don’t have to be of the correlation type for people to see things in them that aren’t there. People have indeed interpreted popular rejection of healthcare reform or of the Obama administration in general as an expression of underlying racism, as if there can’t be any other reasons for rejection.*

Regarding the specific issue mentioned in the cartoon, there’s also another interesting statistical point related to the difficulty of doing a good survey (see also here, here and here):

Polling on the health-care bill is … complicated. Voters don’t know much about the plan. Most disapprove of it, but many disapprove because they want to see it go further. (source)

So there’s a “double jump” to conclusions in the cartoon:

  • First, jumping from disapproval of healthcare reform to anti-Obama racism (blaming the former on the latter when this isn’t shown by the data), which is ridiculed, rightly to the extent that it is something real.
  • Second, jumping from disapproval ratings on “something” to disapproval ratings on “healthcare reform”. The data only show that people disapprove of “something”: people may disapprove of only a part of healthcare reform, or may disapprove of the fact that it doesn’t go far enough rather than disapprove of reform as such; or they may disapprove of something that is not really proposed and hence misunderstand the whole thing and base their disapproval on lack of knowledge. Needless to say, this second jump in the cartoon is quite unconscious and probably not on purpose.

All this jumping is quite understandable. We always have to interpret data, and we can easily lose our way in the process. It’s also tempting to “find” explanations for data that fit with our pre-established opinions and biases.

* Personally, I’m in favor of reform.

More posts in this series are here. More on healthcare. More on racism.

Standard
law, lies and statistics, statistics

Lies, Damned Lies, and Statistics (26): Objects in Statistics May Appear Bigger Than They Are, Ctd.

I’ve mentioned in a previous post how some numbers or stats can make a problem appear much bigger than it really is (the case in the previous post was about the numbers of suicides in a particular company). The error – or fraud, depending on the motivation – lies in the absence of a comparison with a “normal” number (in the previous post, people failed to compare the number of suicides in the company with the total number of suicides in the country, which made them leap to conclusions about “company stress”, “hyper-capitalism”, “worker exploitation” etc.).

The error is, in other words, absence of context and of distance from the “fait divers”. I’ve now come across a similar example, cited by Aleks Jakulin here. As you know, one of the favorite controversies (some would say nontroversies) of the American right wing is the fate of the prisoners at Guantanamo. President Obama has vowed to close the prison, and either release those who cannot be charged or transfer them to prisons on the mainland. Many conservatives fear that releasing them would endanger America (some even believe that locking them away in supermax prisons on the mainland is a risk not worth taking). Even those who can’t be charged with a crime, they say, may be a threat in the future. I won’t deal with the perverse nature of this kind of reasoning, except to say that it would justify arbitrary and indefinite detention of large groups of “risky” people.

What I want to deal with here is one of the “facts” that conservatives cite in order to substantiate their fears: recidivism by former Guantanamo detainees.

Pentagon officials have not released updated statistics on recidivism, but the unclassified report from April says 74 individuals, or 14 percent of former detainees, have turned to or are suspected of having turned to terrorism activity since their release.

Of the more than 530 detainees released from the prison between 2002 and last spring, 27 were confirmed to have engaged in terrorist activities and 47 were suspected of participating in a terrorist act, according to Pentagon statistics cited in the spring report. (source)

These and similar stats are ostentatiously displayed and repeated by partisan mouthpieces as a means to scare the s*** out of us, and keep possibly innocent people in jail. The problem is that the levels of recidivism cited above are way below normal levels of recidivism:

[In the] general population, … about 65% of prisoners are expected to be rearrested within 3 years. The numbers seem lower in recent years, about 58%. More at Wikipedia. (source)
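The arithmetic behind this comparison can be sketched as follows, using only the figures quoted above. The variable names are mine. Keep in mind that the two rates measure somewhat different things (return to terrorism versus rearrest for any offense), so this is a rough comparison at best.

```python
# Figures quoted above. "More than 530" released means the true rates are,
# if anything, slightly lower than computed here.
released = 530
confirmed = 27
suspected = 47

confirmed_rate = confirmed / released
combined_rate = (confirmed + suspected) / released

# General prison population, as quoted: about 65% rearrested within 3 years.
general_rearrest_rate = 0.65

print(f"Confirmed only:        {confirmed_rate:.1%}")
print(f"Confirmed + suspected: {combined_rate:.1%}")
print(f"General rearrest rate: {general_rearrest_rate:.0%}")
```

Even when the merely suspected cases are counted, the rate is a fraction of the "normal" one.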

Another post on risk is here. There are more posts in this blog series here.

Standard
lies and statistics, statistics

Lies, Damned Lies, and Statistics (25): Misleading Headlines

I read the following headline in a local newspaper recently:

Most prisoners escape from their cells

My first reaction was: Christ! What’s the world coming to! We can’t keep the majority of prisoners inside? It turned out that what they wanted to say was that prisoners, when they escape, do so from their cell, most of the time. Other prisoners escape from the workshop, while being transported etc. That looks much better already.

More serious posts in this series are here.

Standard
democracy, education, lies and statistics, statistics

Lies, Damned Lies, and Statistics (24): Mistakes in the Direction of Causation

penguin cartoon global warming direction of causation

Time for a more lighthearted post in this blog series. Suppose you find a correlation between two phenomena. And you’re tempted to conclude that there’s a causal relation as well. The problem is that this causal relation – if it exists at all – can go either way. It’s a common mistake – or a case of fraud, as it happens – to choose one direction of causation and forget that the real causal link can go the other way, or both ways at the same time.

An example. We often think that people who play violent video games are more likely to show violent behavior because they are incited by the games to copy the violence in real life. But can it not be that people who are more prone to violence are more fond of violent video games? (See also here). We choose a direction of causation that fits with our pre-existing beliefs.

Another widely shared belief is that uninformed and uneducated voters will destroy democracy, or at least diminish its value (see here, here and here). No one seems to ask the question whether it’s not a diminished form of democracy that renders citizens apathetic and uninformed. Maybe a full or deep democracy can encourage citizens to participate and become more knowledgeable through participation.

A classic example is the correlation between education levels and GDP (see also here). Do countries with higher education levels experience more economic growth because of the education levels of their citizens? Or is it that richer countries can afford to spend more on education and hence have better educated citizens? Probably both.

Another cartoon that expresses the same risk:

dilbert direction of causation

(source)

More posts in this blog series.

Standard
lies and statistics, statistics

Lies, Damned Lies, and Statistics (23): The Omitted Variable Bias, Ctd.

I explained what I mean by “omitted variable bias” in a previous post in this series, so go there first if the following isn’t immediately clear. (In a few words: you see a correlation between two variables, for example clever people wear fancy clothes. Then you assume that one variable must cause the other, in our case: a higher intellect gives people also a better sense of aesthetic taste, or good taste in clothing somehow also makes people smarter. In fact, you may be overlooking a third variable which explains the other two, as well as their correlation. In our case: clever people earn more money, which makes it easier to buy your clothes in shops which help you with your aesthetics. Nonsense, I know, but it’s just to make a point).

I gave a few examples in the previous post, but found some others in the meantime. This one’s from Nate Silver’s blog:

Gallup has some interesting data out on the percentage of Americans who pay a lot of attention to political news. Although the share of Americans following politics has increased substantially among partisans of all sides, it is considerably higher among Republicans than among Democrats:

attention to political news

The omitted variable here is age, and the data should be corrected for it in order to properly compare these two populations.

News tends to be consumed by people who are older and wealthier, which is more characteristic of Republicans than Democrats.

People don’t read more or less news because they are Republicans or Democrats. And here’s another one from Matthew Yglesias’ blog:

It’s true that surveys indicate that gay marriage is wildly popular among DC whites and moderately unpopular among DC blacks, but I think it’s a bit misleading to really see this as a “racial divide”. Nobody would be surprised to learn about a community where college educated people had substantially more left-wing views on gay rights than did working class people. And it just happens to be the case that there are hardly any working class white people living in DC. Meanwhile, with a 34-48 pro-con split it’s hardly as if black Washington stands uniformly in opposition—there’s a division of views reflecting the diverse nature of the city’s black population.

More on same-sex marriage here, here and here. More posts in this series here.

Standard
health, lies and statistics, statistics, work

Lies, Damned Lies, and Statistics (22): Objects in Statistics May Appear Bigger Than They Are

France Telecom employees gather in memory of a colleague who committed suicide last July. Photograph: Anne-christine Poujoulat/AFP/Getty Images

(source)

A morbid one in our ongoing series on mistakes and lies in statistics, from a news report some weeks ago:

French Finance Minister Christine Lagarde Thursday voiced her support for France Telecom’s chief executive, who is coming under increased pressure from French unions and opposition politicians over a recent spate of suicides at the company.

Ms. Lagarde summoned France Telecom CEO Didier Lombard to a meeting after the telecommunications company confirmed earlier this week that one of its employees had committed suicide. It was the 24th suicide at the company in 18 months.

In a statement released after the meeting, Ms. Lagarde said she had “full confidence” that Mr. Lombard could get the company through “this difficult and painful moment.”

The French state, which owns a 27% stake in France Telecom, has been keeping a close eye on the company, following complaints by unions that a continuing restructuring plan at the company is putting workers under undue stress.

The suicide rate among the company’s 100,000 employees is in line with France’s national average. Still, unions say that the relocation of staff to different branches of the company around France has added pressure onto employees and their families.

On Tuesday, a spokesman for France’s opposition Socialist Party called for France Telecom’s top management to take responsibility for the suicides and step down. Several hundred France Telecom workers also took to the streets to protest against working conditions.

In the statement released after Thursday’s meeting, France’s Finance Ministry said Mr. Lombard had set up an emergency hotline aimed at providing help to depressed workers. The company has also increased the number of psychologists available to staffers, according to the statement. (source)
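To put a bare count like “the 24th suicide in 18 months” into context, it helps to turn it into an annual rate per 100,000 people, which is the form in which national suicide statistics are usually reported. A minimal sketch using only the figures from the report quoted above (the variable names are mine):

```python
suicides = 24
period_months = 18
employees = 100_000

# Convert the 18-month count into a yearly count, then scale it to the
# conventional "per 100,000 people per year" form.
suicides_per_year = suicides / (period_months / 12)
annual_rate_per_100k = suicides_per_year / employees * 100_000

print(f"{annual_rate_per_100k:.1f} suicides per 100,000 employees per year")
```

It’s this rate, and not the raw count, that should be set against the national average. A fairer comparison would also take the age and gender mix of the workforce into account.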

More on the problems caused by averages is here.

Standard
comedy, equality, lies and statistics, poverty, statistics

Lies, Damned Lies, and Statistics (21): Misleading Averages

Did you hear the joke about the statistician who put her head in the oven and her feet in the refrigerator? She said, “On average, I feel just fine.” That’s the same message as in this more widely known joke about statisticians:

statistician drowning in a pond with an average depth of 3ft

(source)

And then there’s this one: did you know that the great majority of people have more than the average number of legs? It’s obvious, really: among the 57 million people in Britain, there are probably 5,000 people who have only one leg. Therefore, the average number of legs is

(5,000 × 1 + 56,995,000 × 2) / 57,000,000 ≈ 1.9999123

And because most people have two legs, they have more legs than the average number of legs (1.9999123) (source). In this case, the median would be a better measure than the average or the mean.

But seriously now, averages can be very misleading, also in statistical work in the field of human rights. Take income data, for example. Income as such isn’t a human rights issue, but poverty is, as well as income inequality. When we look at income data, we may see that average income is rising. However, this may be due to extreme increases for the top 1% of incomes. If you then exclude the income increases of the top 1% of the population, the large majority of people may not experience rising income. Possibly even the opposite. And rising average income – even excluding extremes at the top levels – is perfectly compatible with rising poverty for certain parts of the population.

Averages are often skewed by outliers. That is why it’s often necessary to remove outliers and calculate the averages without them. That will give you a better picture of the characteristics of the general population (the “real” average income evolution in my example). A simple way to neutralize outliers is to look at the median – the middle value of a series of values – rather than the average (or the mean).
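A small sketch of the difference, with entirely made-up incomes for ten people in two consecutive years. Only the top earner’s income changes between the two years.

```python
from statistics import mean, median

# Hypothetical incomes, for illustration only.
year_1 = [20_000, 22_000, 25_000, 27_000, 30_000,
          32_000, 35_000, 38_000, 40_000, 1_000_000]

# In year 2 only the top earner gains; everyone else stands still.
year_2 = year_1[:-1] + [2_000_000]

for label, incomes in (("Year 1", year_1), ("Year 2", year_2)):
    print(f"{label}: mean = {mean(incomes):>9,.0f}   median = {median(incomes):>7,.0f}")
```

The mean jumps from 126,900 to 226,900 although nine people out of ten gained nothing. The median stays at 31,000 in both years.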

An average (or a median for that matter) also doesn’t say anything about the extremes (or, in stat-speak, about the variability or dispersion of the population). A high average income level can hide extremely low and high income levels for certain parts of the population. So, for example, if you start to compare income levels across different countries, you’ll probably use the average income. Yet country A may have a lower average income than country B, but also lower levels of poverty than country B. That’s because the dispersion of income levels in country A is much smaller than in country B. The average in B is the result of adding together extremely low incomes (i.e. poverty) and extremely high incomes, whereas the average in A comes from the sum of incomes that are much more equal. From the point of view of poverty – which is a human rights issue – average income is misleading because it identifies country A as the poorer country, whereas in reality there are more poor people in country B. So when looking at averages, it’s always good to look at the standard deviation as well. SD is a measure of the dispersion around the mean.
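Here’s an illustration of the two countries with invented numbers: five incomes per country and an arbitrary poverty line. None of these figures refer to real countries.

```python
from statistics import mean, pstdev

# Hypothetical incomes, for illustration only.
country_a = [18_000, 19_000, 20_000, 21_000, 22_000]
country_b = [5_000, 6_000, 7_000, 40_000, 62_000]
poverty_line = 10_000  # arbitrary threshold, chosen for the example


def summarize(incomes):
    """Return mean, population standard deviation and number of people below the line."""
    poor = sum(income < poverty_line for income in incomes)
    return mean(incomes), pstdev(incomes), poor


for name, incomes in (("A", country_a), ("B", country_b)):
    avg, sd, poor = summarize(incomes)
    print(f"Country {name}: mean = {avg:,.0f}   SD = {sd:,.0f}   below poverty line = {poor}")
```

Country B has the higher average, but also a far larger standard deviation, and three of its five inhabitants are poor. Nobody in country A is.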

More posts in this series.

Standard
discrimination and hate, equality, justice, lies and statistics, poverty, statistics

Lies, Damned Lies, and Statistics (19): Fun With Percentages

Affirmative action cartoon

(source, cartoon by Rob Rogers)

This instalment in the series on “how to lie with statistics” deals with an example of “honest mistake followed by conclusions based on prejudice” rather than “outright lie”. (At least I hope. It’s often extremely difficult to distinguish between mistakes and manipulation when confronted with misuse of statistics).

It’s about the dangers of using percentages, and I’ll start with a funny anecdote. A certain company discovered that 40% of all sick days were taken on a Friday or a Monday. They immediately clamped down on sick leave before they realised their mistake. Forty percent represents two days out of a five-day working week and is therefore a normal spread. Nothing to do with lazy employees wishing to extend their weekends. They are just as sick on any other day.
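The baseline the company forgot to compute, assuming people are equally likely to fall sick on any working day:

```python
workdays = ["Mon", "Tue", "Wed", "Thu", "Fri"]
next_to_weekend = {"Mon", "Fri"}

# Share of sick days you would expect on Mondays and Fridays by chance alone.
expected_share = sum(day in next_to_weekend for day in workdays) / len(workdays)

print(f"Expected share of sick days on Mon/Fri: {expected_share:.0%}")  # 40%
```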

A more serious example, now, more relevant also to human rights:

The stunning statistic that 70% of black babies are born out of wedlock is driven, to be sure, by the fact that many poor black women have a lot of children. But it turns out it is also driven by the fact that married black women have fewer children than married white women. (source)

The fact that married black women have fewer children than married white women obviously inflates the percentage of black babies born out of wedlock. If married black women had just as many children as married white women, the proportion or percentage of black babies out of wedlock would drop mechanically. But why do they have fewer children? It seems it’s a matter of being able to afford children.
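The mechanical effect can be shown with a toy calculation. The numbers of women and the births per woman below are invented, and the function name is mine; the only point is to show how the percentage reacts when births within marriage go down while nothing else changes.

```python
def share_born_outside_marriage(married_women, births_per_married,
                                unmarried_women, births_per_unmarried):
    """Share of all births that occur outside marriage."""
    births_in = married_women * births_per_married
    births_out = unmarried_women * births_per_unmarried
    return births_out / (births_in + births_out)


# Hypothetical populations, for illustration only.
same_fertility = share_born_outside_marriage(100, 2.0, 100, 2.0)
fewer_married_births = share_born_outside_marriage(100, 1.0, 100, 2.0)

print(f"Equal births per woman:         {same_fertility:.0%}")
print(f"Fewer births per married woman: {fewer_married_births:.0%}")
```

In this toy example the share rises from 50% to 67% although the number of children born outside marriage is exactly the same in both scenarios.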

It’s well known that the black middle class has a lot less in the way of assets than whites of similar income levels – hardly surprising, given the legacy of generations of discrimination and poverty. But that also means that things that a lot of white middle class people take for granted - like help with a down-payment on a house when you have your first kid – are less available. Middle class black parents have less in the way of a parental safety net than their white equivalents, so they’re less likely to have a second kid. (source)

The 70%, when compared to the national average which is about 40%, may seem high, but it’s artificially inflated by the relatively low number of black babies in wedlock. So before you go out yelling (see here for example) that all the poverty and educational problems of African-Americans are caused by the fact that too many of their children are born and raised out of wedlock, and presumably by single parents (although the latter doesn’t follow from the former), and that it’s better to promote “traditional marriage” instead of affirmative action, welfare etc., you may want to dig a bit deeper first. If you do, you’ll paint a more nuanced picture than the one about dysfunctional black families and irresponsible black fathers.

Nevertheless, while the percentages may not be as high as they seem at first glance, it remains true that black babies still make up a disproportionate share of kids born out of wedlock. And if “born out of wedlock” means “single parents” (usually mothers) then this can be a problem. Although many single parents do a great job raising their children (and often a better job than many “normal” families), it can be tough and the risks of ending up in poverty are much higher. And yet, even this is not enough to justify sermons about irresponsible black fathers. Maybe the misguided war on drugs, racial profiling and incarceration statistics have something to do with it.

More posts in this series.

Standard
lies and statistics, statistics, war

Lies, Damned Lies, and Statistics (18): Comparing Apples and Oranges

helmet bullet hole world war 1

(source)

Throughout this blog-series on abuses and mistakes in statistics, we’ve often seen how comparing things that cannot be validly compared leads to error or deceit. Here’s another example: the introduction of tin helmets during the First World War. Before this introduction, soldiers only had cloth hats to wear. The strange thing was that after the introduction of tin hats, the number of injuries to the head increased dramatically. Needless to say, this was counter-intuitive. The new helmets were designed precisely to avoid or limit such injuries.

Of course, people were comparing apples with oranges, namely statistics on head injuries before and after the introduction of the new helmets. In fact, what they should have done, and effectively did after they realized their mistake, was to include in the statistics, not only the injuries, but also the fatalities. After the introduction of the new helmets, the number of fatalities dropped dramatically, but the number of injuries went up because the tin helmet was saving soldiers’ lives, but the soldiers were still injured.
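A toy version of the helmet statistics, with invented counts for 100 soldiers hit in the head before and after the introduction of the helmet:

```python
# Hypothetical outcomes per 100 soldiers hit in the head, for illustration only.
before_helmets = {"killed": 60, "injured": 40}
after_helmets = {"killed": 20, "injured": 80}

for label, outcome in (("Before", before_helmets), ("After", after_helmets)):
    total_hit = outcome["killed"] + outcome["injured"]
    print(f"{label}: injured = {outcome['injured']}, "
          f"killed = {outcome['killed']}, total hit = {total_hit}")
```

Looking at the injuries alone, the helmet seems to have doubled the damage. Add the fatalities and it turns out that the same number of soldiers were hit, but far more of them survived.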

Standard
education, lies and statistics, poverty, statistics, war

Lies, Damned Lies, and Statistics (17): The Correlation-Causation Problem and Omitted Variable Bias, aka “Jumping to Conclusions”

correlation vs causation

(source)

Some more detailed information after my casual remark on the correlation-causation problem. Here’s a fictitious example of what is meant by “Omitted Variable Bias”, a type of statistical bias that illustrates this problem. Suppose we see from Department of Defense data that male U.S. soldiers are more likely to be killed in action than female soldiers. Or, more precisely and in order to avoid another statistical error, the percentage of male soldiers killed in action is larger than the percentage of female soldiers. So there is a correlation between the gender of soldiers and the likelihood of being killed in action.

One could – and one often does – conclude from such a finding that there is a causation of some kind: the gender of soldiers increases the chances of being killed in action. Again more precisely: one can conclude that some aspects of gender – e.g. a male propensity for risk taking – leads to higher mortality.

However, it’s here that the Omitted Variable Bias pops up. The real cause of the discrepancy between male and female combat mortality may not be gender or a gender related thing, but a third element, an “omitted variable” which doesn’t show in the correlation. In our fictional example, it may be the type of deployment: it may be that male soldiers are more commonly deployed in dangerous combat operations, whereas female soldiers may be more active in support operations away from the front-line.
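The fictional example can be put into numbers. In the sketch below all counts are invented, and they are chosen so that gender has no effect at all within each type of deployment. The names `soldiers` and `death_rate` are mine.

```python
# Hypothetical counts, for illustration only:
# (gender, deployment) -> (number of soldiers, number killed in action)
soldiers = {
    ("male", "combat"): (9_000, 90),
    ("female", "combat"): (1_000, 10),
    ("male", "support"): (1_000, 1),
    ("female", "support"): (9_000, 9),
}


def death_rate(gender, deployment=None):
    """Share killed for a gender, overall or within one type of deployment."""
    rows = [counts for (g, d), counts in soldiers.items()
            if g == gender and (deployment is None or d == deployment)]
    total = sum(n for n, _ in rows)
    killed = sum(k for _, k in rows)
    return killed / total


for gender in ("male", "female"):
    print(f"{gender:>6}: overall = {death_rate(gender):.2%}, "
          f"combat = {death_rate(gender, 'combat'):.2%}, "
          f"support = {death_rate(gender, 'support'):.2%}")
```

Overall, men appear almost five times more likely to be killed than women. Yet within combat units and within support units the rates are identical for both genders. The whole difference comes from who gets deployed where.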

correlation and causation

(source)

OK, time for a real example. It has to do with home-schooling. In the U.S., many parents decide to keep their children away from school and teach them at home. For different reasons: ideological ones, reasons that have to do with their children’s special needs etc. The reasons are not important here. What is important is that many people think that home-schooled children are somehow less well educated (parents, after all, aren’t trained teachers). Proponents of home-schooling, however, point to a study that found that these children score above average in tests. But this is a correlation, not necessarily a causal link. It doesn’t prove that home-schooling is superior to traditional schooling. Parents who teach their children at home are, by definition, heavily involved in their children’s education. The children of such parents do above average in normal schooling as well. The omitted variable here is parents’ involvement. It’s not the fact that the children are schooled at home that explains their above average scores. It’s the type of parents. Instead of comparing home-schooled children to all other children, one should compare them to children from similar families in the traditional system.

correlation

(source)

Greg Mankiw believes he has found another example of Omitted Variable Bias in this graph plotting test scores for U.S. students against their family income:

sat scores by income

(source, the R-square for each test average/income range chart is about 0.95)

[T]he above graph … show[s] that kids from higher income families get higher average SAT scores. Of course! But so what? This fact tells us nothing about the causal impact of income on test scores. … This graph is a good example of omitted variable bias … The key omitted variable here is parents’ IQ. Smart parents make more money and pass those good genes on to their offspring. Suppose we were to graph average SAT scores by the number of bathrooms a student has in his or her family home. That curve would also likely slope upward. (After all, people with more money buy larger homes with more bathrooms.) But it would be a mistake to conclude that installing an extra toilet raises yours kids’ SAT scores. … It would be interesting to see the above graph reproduced for adopted children only. I bet that the curve would be a lot flatter. Greg Mankiw (source)

Meaning that adopted children, who usually don’t receive their genes from their new families, have equal test scores, no matter if they have been adopted by rich or poor families. Meaning in turn that the wealth of the family in which you are raised doesn’t influence your education level, test scores or intelligence.

However, in his typical hurry to discard all possible negative effects of poverty, Mankiw may have gone a bit too fast. While it’s not impossible that the correlation is fully explained by differences in parental IQ, other evidence points elsewhere. I’m always suspicious of theories that take one cause, exclude every other type of explanation and end up with a fully deterministic system, especially if the one cause that is selected is DNA. Life is more complex than that. Regarding this particular matter, take a look back at this post, which shows that education levels are to some extent determined by parental income (university enrollment is determined both by test scores and by parental income, even to the extent that people from high income families but with average test scores, are slightly more likely to enroll in university than people from poor families but with high test scores).

What Mankiw did, in trying to avoid the Omitted Variable Bias, was in fact another type of bias, one which we could call the Singular Variable Bias: assuming that a phenomenon has a singular cause. In honor of Professor Mankiw (who does some good work, see here for example), I propose that henceforth we call it the Mankiw Bias.

More posts in this series.

Standard
democracy, freedom, lies and statistics, statistics

Lies, Damned Lies, and Statistics (16): Measuring Public Opinion in Dictatorships

In earlier posts (here and here) I described the specific difficulties faced by those wanting to measure respect for human rights in dictatorial countries. Measuring human rights requires a certain level of respect for human rights (freedom to travel, freedom to speak, to interview etc.). Trying to measure human rights in situations characterized by the absence of freedom is quite difficult, and can even lead to unexpected results: the absence of (access to) good data may give the impression that things aren’t as bad as they really are. Conversely, when a measurement shows a deteriorating situation, the cause of this may simply be better access to better data. And this better access to better data may be the result of more openness in society. Deteriorating measurements may therefore signal an actual improvement. I gave an example of this dynamic here (it’s an example of statistics on violence against women).

The graph below is a case of the way in which oppression may actually produce measurements that signal a lack of oppression:

government approval ratings in post-soviet countries

(source, the “yes” answers in fact relate to the question “do you approve”; this is an example of a sloppy graph because the question in the title of the graph doesn’t permit the answer “yes” or “no”)

This is clearly a case of “lying with statistics”. Measuring public opinion in authoritarian countries is always difficult, but if you ask the public if they love or hate their government, it’s likely that you’ll have higher rates of “love” in the more authoritarian countries. After all, in those countries it can be pretty dangerous to tell someone in the street that you hate your government. Many people therefore choose to lie and say that they approve. That’s the safest answer but probably in many cases not the real one. I don’t believe for a second that the percentage of people approving of their government is 19 times higher in Azerbaijan than in Ukraine, when Ukraine is in fact much more liberal than Azerbaijan.

In the words of Robert Coalson:

The Gallup chart is actually an index of fear. What it reflects is not so much attitudes toward the government as a willingness to openly express one’s attitudes toward the government. As one member of RFE/RL’s Azerbaijan Service told me, “If someone walked up to me in Baku and asked me what I thought about the government, I’d say it was great too”.

More posts in this series.

Standard
democracy, lies and statistics, statistics

Lies, Damned Lies, and Statistics (15): Measuring Public Opinion But Leaving Out a Chunk of the Public

There seems to be no end to the number of battles in our war against the abuse of statistics. Take a look at this graph:

presidential approval rasmussen

(source)

A poll of presidential approval ratings is a public opinion poll, so one expects to see the diverse opinions of the entire public represented in the results. That’s not the case here. As you can see, the numbers for the red and green lines don’t add up to 100%. Only the extreme opinions – strong approval and disapproval – are shown. Now, strictly speaking, there’s nothing to object to: all necessary information is given, there’s no undue manipulation of the scales etc.

However, there’s approximately a third of public opinion that’s not included in this graph. At a minimum, this should have been made clear. I admit that my first, quick impression of this graph was that I was looking at a graph that shows the entirety of public opinion. Only after a few seconds of looking more closely did I realize that the graph doesn’t in fact offer a measurement of public opinion, but only of the opinion of the most outspoken parts of the public. Why not include a third and fourth line for “moderately (dis)approve”? Or, even better, include the moderates in the totals and just give the number for approval and disapproval, combining strong and moderate? What’s the added value of only showing the extremes? Or is this part of the current media culture?

I understand that it’s useful to know the strength of the groups who strongly approve and disapprove, but this is misleading. The graph as it is now clearly hints at a strong swing towards disapproval of Obama, but including the moderates could change that impression, and could, theoretically, show an increase in overall approval (moderate and strong). The difference between strong approval and strong disapproval is smaller than the total share of the moderates who are left out; if all or most of those moderates moderately approve (unlikely but possible), then the total approval ratings would be higher than the total disapproval ratings.

For example, the 2004 exit poll put George W. Bush’s strong approval at 33%, to strong disapproval of 34%. But his overall approval was 53% to disapproval at 46%, and he was re-elected 51%-48%. (source)
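
The arithmetic behind the Bush example can be spelled out in a few lines. This is only a sketch that uses the four percentages quoted above; the variable names are mine, and the moderate shares are derived by simple subtraction rather than taken from the poll itself.

```python
# Figures taken from the 2004 exit poll quoted above (percent of respondents).
strong_approve, strong_disapprove = 33, 34
overall_approve, overall_disapprove = 53, 46

# Whatever is not a "strong" opinion must be a moderate one (derived, not reported).
moderate_approve = overall_approve - strong_approve
moderate_disapprove = overall_disapprove - strong_disapprove

# Margin = approval minus disapproval, in percentage points.
strong_margin = strong_approve - strong_disapprove
overall_margin = overall_approve - overall_disapprove

print(f"moderates: {moderate_approve}% approve, {moderate_disapprove}% disapprove")
print(f"margin among strong opinions only: {strong_margin:+d} points")
print(f"margin including moderates:        {overall_margin:+d} points")
```

A graph of the strong opinions alone would show a president slightly under water, while the full picture shows a lead of several points.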

But maybe the point of this graph is precisely the creation of the impression that Obama is going down the drain. If that’s the case, then this is an example of statistical fraud. There’s no way to know this, however. One thing I know is that all this will strengthen the persistent criticism that Rasmussen, the author of the graph, has a Republican bias.

I said before that strictly speaking, there’s nothing wrong with this graph, apart from the fact that it could have mentioned more explicitly that a large chunk of public opinion is left out. However, if we look at this graph against the background of contemporary politics, it becomes more problematic. Politics today is often a shouting match between extreme positions. Such a spectacle is, after all, more entertaining than intelligent discussions that look for a common ground and a real possibility of persuasion of the other side. Hence, cable TV and the internet promote this kind of “gladiator politics”. Graphs such as this one only drive people further down the cul-de-sac of us-against-them politics. I don’t believe democracy was intended to end up there.

Standard
justice, lies and statistics, statistics

Lies, Damned Lies, and Statistics (14): Leaving Out Relevant Explanatory Variables

Another installment in our ongoing series on mistakes and fraud in statistics. Here’s a graph explaining that the top 1 percent of U.S. taxpayers paid 40.42 percent of total federal income taxes in 2007, and did in fact pay more taxes than the bottom 95% (if you can call that a “bottom”):

tax burden of the top 1 percent of taxpayers

(source, source)

The point of this is obvious: those poor rich people pay too much in taxes, and pay more and more, presumably to finance the “state-beast” and welfare dependents. Such information is also used to argue against progressive taxation and in favor of “trickle down economics” (allowing the rich to prosper, and hence not taxing them disproportionately, is good for everyone, ultimately, because their wealth will trickle down to the rest of us).

There’s nothing theoretically, statistically or logically “wrong” with this graph. What’s wrong and misleading is that it hides relevant explanatory information. Why do the 1% richest people pay an ever increasing share of the total amount of taxes collected? Is it because their tax rates have increased? In other words, is it because the government takes an ever increasing percentage of their income? Even those of us who favor a progressive tax system would admit that there are limits to this: it’s indeed economically unwise to discourage wealth creation, and there are undoubtedly some (albeit minor) trickle down effects.

However, it’s not the case that tax rates for the top 1% have risen, on the contrary:

tax rates top 1 percent

(source)

So then why do the rich pay more and more taxes? It’s simple: because they have become increasingly rich. Their incomes have risen sharply. These are data for the top 5%, but the top 1% have done just as well if not better (see here for top 1% data):

income inequality in the us

(source)

And because they earn more, they pay more in taxes, even with decreasing tax rates. A quote from the NY Times:

Here’s a chart showing the portion of adjusted gross income earned by the top 1 percent and by the bottom 95 percent. You’ll see that one major reason why the share of taxes paid by the richest Americans has risen is that the richest Americans have experienced much greater income growth:

gross income by income group

(source)

The pink line in this graph clearly correlates with the blue line in the first graph above. And if you don’t believe the NYT, look here.

So the first graph above, supposedly showing “an increasing tax burden”, is misleading. The top 1% do indeed pay more and more taxes, but that’s no reason to assume that they carry a heavier burden. On the contrary: their tax rates have fallen. So they pay an ever smaller share of an increasing income. Taken together, their shares do indeed represent an ever increasing share of total government tax revenues, but that’s because the tax base has increased. Presenting this as somehow unfair and increasingly burdensome is misleading because relevant explanatory information is hidden, or not mentioned. Statistics serve to explain, and if there’s an explanation for some data, those publishing this data should, in all honesty, provide this explanation. If not, then that’s a “lie of omission”.
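
To see how a group’s share of total tax revenue can rise while its tax rate falls, here is a small sketch. All incomes and rates in it are hypothetical round numbers, chosen only to illustrate the mechanism; they are not taken from any of the graphs above, and the function name is my own.

```python
def top_group_tax_share(top_income, top_rate, rest_income, rest_rate):
    """Share of total tax revenue that is paid by the top group."""
    top_tax = top_income * top_rate
    rest_tax = rest_income * rest_rate
    return top_tax / (top_tax + rest_tax)

# Hypothetical "early" year: the top group earns 100 and is taxed at 30%.
early = top_group_tax_share(top_income=100, top_rate=0.30,
                            rest_income=900, rest_rate=0.15)

# Hypothetical "late" year: the top group's income has tripled, its rate has dropped to 25%.
late = top_group_tax_share(top_income=300, top_rate=0.25,
                           rest_income=1000, rest_rate=0.15)

print(f"share of revenue paid by the top group, early: {early:.1%}")
print(f"share of revenue paid by the top group, late:  {late:.1%}")
```

In this made-up case the top group’s share of revenue rises from under a fifth to a third, even though its rate fell by five points, simply because its income grew much faster than everyone else’s.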

Just to show that there’s no real unfair treatment of the rich, consider this graph, showing the after-tax incomes:

real after tax average income by income group

(source)
Standard
lies and statistics, statistics, work

Lies, Damned Lies, and Statistics (13): You’re Not Measuring What You Think You Are

unemployment

(source)

Another one in our series on intended and unintended mistakes in statistics. Take for instance unemployment or employment rates. (We’ve talked about this before in this series.) Employment statistics usually measure the number of people at work or unemployed, the number of people claiming unemployment benefits, the number of jobs that are created or lost, etc. Especially during an economic recession, like the one we have now, people look anxiously at those statistics. However, during a recession, companies that are struggling may be unwilling to lay off people, either because they feel responsible for their employees, or because – less altruistically – they don’t want to lose valuable experience which they will need when the economy recovers. Many companies therefore choose to convince their people to work fewer hours, work part-time etc. Rather than dismissing some people, the burden of the recession is then spread more evenly over all employees.

The phenomenon is called “labor hoarding” and it is attributable to the costs of finding, hiring and training new workers and the costs in terms of severance pay and morale when firing workers. Jeffrey Frankel (source)

However, a simple unemployment statistic composed of numbers of jobs or job losses will fail to notice this. In times of recession, such a statistic will underestimate real unemployment because it won’t include the partial unemployment in the companies that increase part-time work. So you think you are measuring unemployment, but actually you’re not, at least not completely or accurately. A better dataset is the average weekly hours worked. Or you could include the numbers of people who are involuntarily part-timers in the numbers of unemployed:

unemployment rate including involuntary part-timers

part-time for economic reasons
(source, marginally attached workers are persons not in the labor force who want and are available for work, and who have looked for a job sometime in the prior 12 months, but were not counted as unemployed because they had not searched for work in the 4 weeks preceding the survey; discouraged workers are a subset of the marginally attached)
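
The difference between the two measures can be sketched as follows. The numbers are hypothetical, and the broader rate here is a deliberately simplified one (it only adds involuntary part-timers to the unemployed); it is not the official definition used by any statistical agency.

```python
def headline_rate(unemployed, labor_force):
    """Conventional unemployment rate: unemployed as a share of the labor force."""
    return unemployed / labor_force

def broader_rate(unemployed, involuntary_part_time, labor_force):
    """Simplified broader rate that also counts involuntary part-timers."""
    return (unemployed + involuntary_part_time) / labor_force

# Hypothetical economy in a recession (all figures invented for illustration).
labor_force = 1000
unemployed = 90
involuntary_part_time = 60

print(f"headline rate: {headline_rate(unemployed, labor_force):.1%}")
print(f"broader rate:  {broader_rate(unemployed, involuntary_part_time, labor_force):.1%}")
```

If firms respond to a downturn by cutting hours rather than jobs, the first number barely moves while the second one does.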

More on the recession. More on unemployment. More on misleading statistics.

Standard
health, lies and statistics, statistics

Lies, Damned Lies, and Statistics (12): Generalization

induction

(source)

An example from Greg Mankiw’s blog:

Should we [the U.S.] envy European healthcare? Gary Becker says the answer is no:

“A recent excellent unpublished study by Samuel Preston and Jessica Ho of the University of Pennsylvania compare mortality rates for breast and prostate cancer. These are two of the most common and deadly forms of cancer – in the United States prostate cancer is the second leading cause of male cancer deaths, and breast cancer is the leading cause of female cancer deaths. These forms of cancer also appear to be less sensitive to known attributes of diet and other kinds of non-medical behavior than are lung cancer and many other cancers. [Health effects of diet and behavior should be excluded when comparing the quality of healthcare across countries. FS]

These authors show that the fraction of men receiving a PSA test, which is a test developed about 25 years ago to detect the presence of prostate cancer, is far higher in the US than in Sweden, France, and other countries that are usually said to have better health delivery systems. Similarly, the fraction of women receiving a mammogram, a test developed about 30 years ago to detect breast cancer, is also much higher in the US. The US also more aggressively treats both these (and other) cancers with surgery, radiation, and chemotherapy than do other countries.

Preston and Ho show that this more aggressive detection and treatment were apparently effective in producing a better bottom line since death rates from breast and prostate cancer declined during the past 20 [years] by much more in the US than in 15 comparison countries of Europe and Japan.” (source)

Even if all this is true, how on earth can you assume that a healthcare system is better because it is more successful in treating two (2!) diseases? See here and here for a more complete picture.

Another example: the website of the National Alert Registry for sexual offenders used to post a few “quick facts”. One of them said:

“The chance that your child will become a victim of a sexual offender is 1 in 3 for girls… Source: The National Center for Victims of Crime”.

Someone took the trouble of actually checking this source, and found that it said:

Twenty-nine percent [i.e. approx. 1 in 3] of female rape victims in America were younger than eleven when they were raped.

One in three rape victims is a young girl, but you can’t generalize from that by saying that one in three young girls will be the victim of rape. Perhaps they will be, but you can’t know that from these data. Likewise, you can’t conclude from the way the U.S. deals with two diseases that it “shouldn’t envy European healthcare”. Perhaps it shouldn’t, but more general data on life expectancy say it should.

These are two examples of induction or inductive reasoning, sometimes called inductive logic, a reasoning which formulates laws based on limited observations of recurring phenomenal patterns. Induction is employed, for example, in using specific propositions such as:

This door is made of wood.

to infer general propositions such as:

All doors are made of wood. (source)

More posts in this series.

Standard
lies and statistics, statistics

Lies, Damned Lies, and Statistics (11): Polarized Statistics as a Result of Self-Selection

shouting match, composite image of works by Michelangelo and Grünewald

I’ve stated many times before in this series about errors and lies in statistics that one of the most important things in the design of an opinion survey - and opinion surveys are a common tool in data gathering in the field of human rights – is the definition of the sample of people who will be interviewed. We can only assume that the answers given by the people in the sample are representative of the opinions of the entire population if the sample is a fully random subset of the population – that means that every person in the population should have an equal chance of being part of the survey group.

Unfortunately, many surveys depend on self-selection – people get to decide themselves if they cooperate – and self-selection distorts the randomness of the sample:

Those individuals who are highly motivated to respond, typically individuals who have strong opinions, are overrepresented, and individuals that are indifferent or apathetic are less likely to respond. This often leads to a polarization of responses with extreme perspectives being given a disproportionate weight in the summary. (source)

Self-selection is almost always a problem in online surveys (of the PollDaddy variety), phone-in surveys for television or radio shows, and so-called “red-button” surveys in which people vote with the remote control of their television set. However, it can also occur in more traditional types of surveys. When you survey the population of a brutal dictatorial state (if you get the chance) and ask the people about their freedoms and rights, many will deselect themselves: they will refuse to cooperate with the survey for fear of the consequences.
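
The polarizing effect of self-selection can be illustrated with a small calculation. Both the population shares and the response rates below are invented for the purpose of the example; the only assumption that matters is that people with strong opinions are more likely to respond than people in the middle.

```python
# Hypothetical distribution of opinions in the whole population (shares sum to 1).
population_share = {
    "strongly against": 0.10,
    "mildly against": 0.25,
    "neutral": 0.30,
    "mildly for": 0.25,
    "strongly for": 0.10,
}

# Hypothetical probability that a person holding each opinion bothers to respond.
response_rate = {
    "strongly against": 0.50,
    "mildly against": 0.10,
    "neutral": 0.02,
    "mildly for": 0.10,
    "strongly for": 0.50,
}

def respondent_shares(population_share, response_rate):
    """Distribution of opinions among the people who actually respond."""
    weights = {k: population_share[k] * response_rate[k] for k in population_share}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

shares = respondent_shares(population_share, response_rate)
extremes_in_population = population_share["strongly against"] + population_share["strongly for"]
extremes_in_survey = shares["strongly against"] + shares["strongly for"]

print(f"strong opinions in the population: {extremes_in_population:.0%}")
print(f"strong opinions among respondents: {extremes_in_survey:.0%}")
```

With these made-up rates, a fifth of the population ends up supplying almost two thirds of the answers.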

us vs them

When we limit ourselves to the effects of self-selection (or self-deselection) in democratic states, we may find that this has something to do with the often ugly and stupid “us-and-them” character of much of contemporary politics. There seems to be less and less room for middle ground, compromise or nuance (the president of the country is either a Muslim-socialist terrorist friend, or a warmongering Texas hillbilly idiot).

Standard
discrimination and hate, lies and statistics, statistics

Lies, Damned Lies, and Statistics (10): How (Not) to Frame Survey Questions

leading survey questions

(source, some more background on the controversial Lancet survey of casualties in Iraq following the U.S. invasion)

I’ve mentioned before that information on human rights depends heavily on opinion surveys. Unfortunately, surveys can be wrong and misleading for so many different reasons that we have to be very careful when designing surveys and when using and interpreting survey data. One reason I haven’t mentioned before is the framing of the questions.

Even very small differences in framing can produce widely divergent answers. And there is a wide variety of problems linked to the framing of questions:

  • Questions can be leading questions, questions that suggest the answer. For example: “It’s wrong to discriminate against people of another race, isn’t it?” Or: “Don’t you agree that discrimination is wrong?”
  • Questions can be put in such a way that they put pressure on people to give a certain answer. For example: “Most reasonable people think racism is wrong. Are you one of them?” This is also a leading question of course, but it’s more than simply “leading”.
  • Questions can be confusing or easily misinterpreted. Such questions often include a negative, or, worse, a double negative. For example: “Do you agree that it isn’t wrong to discriminate under no circumstances?” Needless to say, your survey results will be infected by answers that are the opposite of what they should have been.
  • Questions can be wordy. For example: “What do you think about discrimination (a term that refers to treatment taken toward or against a person of a certain group that is based on class or category rather than individual merit) as a type of behavior that promotes a certain group at the expense of another?” This is obviously a subtype of the confusing variety.
  • Questions can also be confusing because they use jargon, abbreviations or difficult terms. For example: “Do you believe that UNESCO and ECOSOC should administer peer-to-peer expertise regarding discrimination in an ad hoc or a systemic way?”
  • Questions can in fact be double or even triple questions, but there is only one answer required and allowed. Hence people who may have opposing answers to the two or three sub-questions will find it difficult to provide a clear answer. For example: “Do you agree that racism is a problem and that the government should do something about it?”
  • Open questions should be avoided in a survey. For example: “What do you think about discrimination?” Such questions do not yield answers that can be quantified and aggregated.
  • You also shouldn’t ask questions that exclude some possible answers, and neither should you provide a multiple-choice set of answers that doesn’t include some possible answers. For example: “How much did the government improve its anti-discrimination efforts relative to last year? Somewhat? Average? A lot?” Notice that such a framing of the question doesn’t allow people to respond that the effort had not improved or had worsened. Another example: failure to include “don’t know” as a possible answer.

Here’s a real-life example:

In one of the most infamous examples of flawed polling, a 1992 poll conducted by the Roper organization for the American Jewish Committee found that 1 in 5 Americans doubted that the Holocaust occurred. How could 22 percent of Americans report being Holocaust deniers? The answer became clear when the original question was re-examined: “Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened?” This awkwardly-phrased question contains a confusing double-negative which led many to report the opposite of what they believed. Embarrassed Roper officials apologized, and later polls, asking clear, unambiguous questions, found that only about 2 percent of Americans doubt the Holocaust. (source) (More on holocaust denial here).

Holocaust Denial Joke

(source)

More posts in this series on “statistics gone wrong”.

Standard
discrimination and hate, equality, lies and statistics, statistics

Lies, Damned Lies, and Statistics (9): Too Small Sample Sizes in Surveys

counting girl

(source)

As I’ve stated before in this series about errors and lies in statistics, many things can go wrong in the design and execution of opinion surveys. And opinion surveys are a common tool in data gathering in the field of human rights.

As it’s often impossible (and undesirable) to question a whole population, statisticians usually select a sample from the population and ask their questions only to the people in this sample. They assume that the answers given by the people in the sample are representative of the opinions of the entire population. But that’s only the case if the sample is a fully random subset of the population – that means that every person in the population should have an equal chance of being chosen – and if the sample hasn’t been distorted by other factors such as self-selection by respondents (a common thing in internet polls) or personal bias by the statistician who selects the sample.

A sample that is too small is also not representative of the entire population. For example, if we ask 100 people if they approve or disapprove of discrimination of homosexuals, and 55 of them say they approve, we might assume that about 55% of the entire population approves. Now it could possibly be that only 45% of the total population approve, but that we just happened, by chance, to interview an unusually large percentage of people who approve. For example, this may have happened because, by chance and without being aware of it, we selected the people in our sample in such a way that there are more religious conservatives in our sample than there are in society, relatively speaking.

This is the problem of sample size: the smaller the sample, the greater the influence of luck on the results we get. Asking the opinion of 100 people, and taking this as representative of millions of citizens, is like throwing a coin 10 times and assuming – after having 3 heads and 7 tails – that the probability of throwing heads is 30%. We all know that it’s not 30 but 50%. And we know this because we know that when we increase the “sample size” – i.e. when we throw more than 10 times, say a thousand times – we will have heads approximately half of the time. Likewise, if we take our example of the survey on homosexuality: increasing the sample size reduces the chance that religious conservatives (or other groups) are disproportionately represented in the sample.

When analyzing survey results, the first thing to look at is the sample size, as well as the level of confidence (usually 95%) that the results are within a certain margin of error (usually + or – 5%). A small margin of error at a high level of confidence indicates that the sample was sufficiently large. It does not by itself show that the sample was random; that has to be checked separately.
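
For readers who want to see the numbers, here is a rough sketch of how sample size drives the margin of error. It relies on the usual textbook normal approximation for a proportion and assumes a simple random sample, which real surveys only approximate; the function name and the choice of z = 1.96 for a 95% confidence level are my own framing, not something taken from a particular survey.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Uses the normal approximation; z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# The survey example from the text: 55 out of 100 respondents approve.
for n in (100, 1000, 10000):
    moe = margin_of_error(0.55, n)
    print(f"n={n:>5}: 55% plus or minus {moe * 100:.1f} points")

# The coin example: probability of exactly 3 heads in 10 tosses of a fair coin.
p_three_heads = math.comb(10, 3) / 2 ** 10
print(f"P(exactly 3 heads in 10 fair tosses) = {p_three_heads:.3f}")
```

Under these assumptions a sample of 100 gives a margin close to ten points either way, far wider than the customary five, and a fair coin produces the “30% heads” result in more than one series of ten tosses out of ten.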

Standard
lies and statistics, statistics, work

Lies, Damned Lies, and Statistics (8): Failure to Divide by Population

I often see graphs that contain a time series of some sort, but the numbers are just plain numbers, not normalized by population. Here’s an example of a graph from the Bush-era, flaunting the supposedly beneficial effects of Bush’s labor policy on job growth (green line, “jobs on the rise”, number of jobs in thousands):

graph number of jobs

(source)

Just presenting the number of jobs without relating it to the population is meaningless. Maybe the population grew faster than the number of jobs, in which case the growth exhibited here is in fact a decrease. Or the population shrank, in which case the growth in the number of jobs was even bigger.
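
A quick sketch of the point, with hypothetical figures (in millions) that are not taken from the graph above:

```python
def employment_ratio(jobs, population):
    """Jobs per person: the raw job count normalized by population."""
    return jobs / population

# Hypothetical start and end of a period, in millions.
jobs_start, population_start = 130, 220
jobs_end, population_end = 135, 235

job_growth = (jobs_end - jobs_start) / jobs_start
ratio_start = employment_ratio(jobs_start, population_start)
ratio_end = employment_ratio(jobs_end, population_end)

print(f"growth in the raw number of jobs: {job_growth:+.1%}")
print(f"employment-population ratio: {ratio_start:.1%} -> {ratio_end:.1%}")
```

In this invented case the number of jobs rises by a few percent while the share of the population that has a job falls.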

Here’s the correct graph, showing that employment did increase under Bush, but decreased during the last years of his presidency:

employment-population ratio

“Population” can mean actual population (i.e. people or residents), but can also mean any other relevant basis of comparison. For example:

The following statistics suggest that 16-year-olds are safer drivers than people in their twenties, and that octogenarians are very safe:

misleading stat accidents

(source)

As the following graph shows, the reason 16-year-olds and octogenarians appear to be safe drivers is that they don’t drive nearly as much as people in other age groups:

misleading stat accidents2

(source)

Another example is the national debt statistic. Often the graph shows just the national debt in dollars, without relating it to GDP. Whereas the absolute amounts do have some relevance, it’s better to express the debt as a percentage of GDP because a bigger economy can carry a bigger debt (a poor household may go bankrupt with a debt of $10,000, whereas a rich household can live with a debt of perhaps $100,000).

Take this graph for instance:

us national debt corrected for inflation

(source)

Now compare it to this one:

National debt as a percentage of GDP

(source)

Or this, slightly more recent one, including the latest recession:

National debt as percentage of GDP

(source)

And a final example: looking at the relative safety of air travel and road travel and the probability of dying in either a road accident or a plane accident, you can also find divergent data depending on how you divide: number of casualties per trip, per miles traveled, per hours traveled etc.

More posts in this series.

Standard
lies and statistics, statistics

Lies, Damned Lies, and Statistics (7): “Drowning” Data

40 percent of Statistics Are Wrong

(source)

Those who want to cover up human rights violations can often manipulate statistics without stating anything that is strictly false. For example, they can change the unit of measurement. Suppose we want to know how many forced disappearances there are in Chechnya. Assuming we have good data this isn’t hard to do. The number of disappearances that have been registered, by the government or some NGO, is x out of a total Chechen population of y, giving z%. The Russian government may decide that the better measurement is for Russia as a whole. Given that there are almost no forced disappearances in other parts of Russia, the z% goes down dramatically, perhaps close to or even below the level of other comparable countries.

Good points for Russia! But that doesn’t mean that the situation in Chechnya is OK. The data for Chechnya are simply “drowned” in those of Russia, giving the impression that “overall”, Russia isn’t doing all that badly. This, however, is misleading. The proper unit of measurement should be limited to the area where the problem occurs. The important thing here isn’t a comparison of Russia with other countries; it’s an evaluation of a local problem.
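
The dilution is easy to reproduce with made-up numbers. The case counts and population figures below are purely hypothetical placeholders for the x and y mentioned above; they are not estimates for Chechnya or Russia.

```python
def rate_per_100k(cases, population):
    """Number of cases per 100,000 inhabitants."""
    return cases / population * 100_000

# Hypothetical region where the problem is concentrated.
region_cases, region_population = 500, 1_000_000

# Hypothetical rest of the country, with no recorded cases at all.
rest_cases, rest_population = 0, 140_000_000

regional = rate_per_100k(region_cases, region_population)
national = rate_per_100k(region_cases + rest_cases,
                         region_population + rest_population)

print(f"rate within the region:       {regional:.1f} per 100,000")
print(f"rate for the country overall: {national:.2f} per 100,000")
```

The same 500 cases look more than a hundred times less alarming once they are divided by the national population.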

Something similar happens to the evaluation of the Indian economy:

Madhya Pradesh, for example, is comparable in population and incidence of poverty to the war-torn Democratic Republic of Congo. But the misery of the DRC is much better known than the misery of Madhya Pradesh, because sub-national regions do not appear on “poorest country” lists. If Madhya Pradesh were to seek independence from India, its dire situation would become more visible immediately. …

But because it’s home to 1.1 billion people, India is more able than most to conceal the bad news behind the good, making its impressive growth rates the lead story rather than the fact that it is home to more of the world’s poor than any other country. …

A 10-year-old living in the slums of Calcutta, raising her 5-year-old brother on garbage and scraps, and dealing with tapeworms and the threat of cholera, suffers neither more nor less than a 10-year-old living in the same conditions in the slums of Lilongwe, the capital of Malawi. But because the Indian girl lives in an “emerging economy,” slated to battle it out with China for the position of global economic superpower, and her counterpart in Lilongwe lives in a country with few resources and a bleak future, the Indian child’s predicament is perceived with relatively less urgency. (source)

All this should be kept in mind when browsing our human rights maps. It’s not because a country compares favorably to another that it doesn’t have serious problems. More on Russia and Chechnya, and on poverty in India. And more posts in this series.

Standard
lies and statistics, statistics

Lies, Damned Lies, and Statistics (6): Statistical Bias in the Design and Execution of Surveys

dilbert statistician

(source)

Statisticians can – wittingly or unwittingly – introduce bias in their work. Take the case of surveys for instance. Two important steps in the design of a survey are the definition of the population and the selection of the sample. As it’s often impossible (and undesirable) to question a whole population, statisticians usually select a sample from the population and ask their questions only to the people in this sample. They assume that the answers given by the people in the sample are representative of the opinions of the entire population.

Bias can be introduced

  • at the moment of the definition of the population
  • at the moment of the selection of the sample
  • at the moment of the execution of the survey (as well as at other moments of the statistician’s work, which I won’t mention here).

Population

Let’s take a fictional example of a survey. Suppose statisticians want to measure public opinion regarding the level of respect for human rights in the country called Dystopia.

First, they set about defining their “population”, i.e. the group of people whose “public opinion” they want to measure. “That’s easy”, you think. So do they, unfortunately. It’s the people living in this country, of course, or is it?

Not quite. Suppose the level of rights protection in Dystopia is very low, as you might expect. That means that probably many people have fled the country. Including in the survey population only the residents of the country will then overestimate the level of rights protection. And there is another point: dead people can’t talk. We can assume that many victims of rights violations are dead because of them. Not including these dead people in the survey will also artificially push up the level of rights protection. (I’ll mention in a moment how it is at all possible to include dead people in a survey; bear with me).

Hence, doing a survey and then assuming that the people who answered the survey are representative of the whole population means discarding the opinions of refugees and dead people. If those opinions were included, the results would be different and more accurate. Of course, in the case of dead people it’s obviously impossible to include their opinions, but perhaps it would be advisable to make a statistical correction for it. After all, we know their answers: people who died because of rights violations in their country presumably wouldn’t have a good opinion of their political regime.
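What such a statistical correction might look like can be sketched in a few lines of Python. This is only an illustration: Dystopia is fictional, and the group sizes and opinion shares below are invented assumptions, as is the helper name `positive_share`.

```python
# All figures are invented for illustration; none come from a real survey.
groups = {
    # name: (number of people, assumed share with a positive opinion)
    "residents": (9_000_000, 0.50),     # what the survey actually measured
    "refugees": (800_000, 0.10),        # assumption: mostly negative
    "dead_victims": (200_000, 0.00),    # assumption: none positive
}

def positive_share(selected):
    """Population-weighted share of positive opinions over the selected groups."""
    total = sum(groups[name][0] for name in selected)
    positive = sum(groups[name][0] * groups[name][1] for name in selected)
    return positive / total

residents_only = positive_share(["residents"])
corrected = positive_share(["residents", "refugees", "dead_victims"])

print(f"residents only: {residents_only:.3f}")
print(f"corrected:      {corrected:.3f}")
```

With these made-up numbers, the resident-only figure overstates the positive share. How large the correction should be in a real case depends entirely on how well the missing groups can be estimated.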

Sample

And then there are the problems linked to the definition of the sample. An unbiased sample should be a fully random subset of the entire and correctly defined population (needless to say, if the population is defined incorrectly, as in the example above, then the sample is by definition also biased, even if no sampling mistakes have been made). That means that every person in the population should have an equal chance of being chosen. It also means that there shouldn’t be self-selection (a typical flaw in many if not all internet surveys of the “Polldaddy” variety) or self-deselection. The latter is very likely in my Dystopia example. People who are too afraid to talk won’t talk. The harsher the rights violations, the more people will fail to cooperate. So you have the perverse effect that very cruel regimes may score better on human rights surveys than modestly cruel regimes. The latter are cruel, but not cruel enough to scare the hell out of people.
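The perverse effect of self-deselection can be made concrete with a small calculation. The sketch below assumes two hypothetical regimes and invented response rates; `observed_approval` is an illustrative helper, not a standard survey formula.

```python
def observed_approval(true_approval, response_rate_approvers, response_rate_critics):
    """Approval share among the people who actually answer the survey."""
    approvers = true_approval * response_rate_approvers
    critics = (1 - true_approval) * response_rate_critics
    return approvers / (approvers + critics)

# Hypothetical regimes. In the very cruel one, only 20% truly approve,
# but just 10% of the critics dare to answer.
very_cruel = observed_approval(0.20, 0.90, 0.10)
# In the modestly cruel one, 40% truly approve and 70% of critics still answer.
modestly_cruel = observed_approval(0.40, 0.90, 0.70)

print(f"very cruel regime, observed approval:     {very_cruel:.2f}")
print(f"modestly cruel regime, observed approval: {modestly_cruel:.2f}")
```

Under these assumptions, the regime with the lower true approval ends up with the higher measured approval, simply because its critics stay silent.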

The classic sampling error is from a poll on the 1948 Presidential election in the U.S.

On Election night, the Chicago Tribune printed the headline DEWEY DEFEATS TRUMAN, which turned out to be mistaken. In the morning the grinning President-Elect, Harry S. Truman, was photographed holding a newspaper bearing this headline. The reason the Tribune was mistaken is that their editor trusted the results of a phone survey. Survey research was then in its infancy, and few academics realized that a sample of telephone users was not representative of the general population. Telephones were not yet widespread, and those who had them tended to be prosperous and have stable addresses. (source)

truman holding the newspaper with the headline dewey defeats truman

(source)

Execution

Another reason why bias in the sampling may occur is the way in which the surveys are executed. If the government of Dystopia allows statisticians to operate on its territory, it will probably not allow them to operate freely, or circumstances may not permit them to operate freely. So the people doing the interviews are not allowed to, or don’t dare to, travel around the country. Hence they themselves deselect entire groups from the survey, distorting the randomness of the sample. Again, the more repressive the regime, the more this happens, with possible adverse effects. The people who can be interviewed are perhaps only those living in urban areas, close to the residence of the statisticians. And those living there may have a relatively large stake in the government, which makes them paint a rosy image of the regime.

More posts in this series.

lies and statistics, statistics

Lies, Damned Lies, and Statistics (5): (Not) Using a Left and a Right Y-axis

Sometimes, when you want to compare two time-series which are far apart from each other in terms of numbers – such as, for example, the yearly average number of inhabitants of NY and their yearly average height (the former being in the millions, the latter in the single digits) – you have to plot one series on the left y-axis and the other on the right y-axis, each with a different scale. If you put both on the same y-axis (usually the left) you have to use the same scale. In my example, the line for the average height would just be a flat line at the bottom of the graph and coincide more or less with the x-axis because the numbers are too small compared with the numbers for population. If you put them on two different y-axes, you’ll be able to compare them.
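To see how flat the smaller series becomes on a shared axis, one can compute where each value would be drawn. The following Python sketch uses invented figures for New York and a hypothetical 400-pixel plot; `pixel_height` is an illustrative helper.

```python
def pixel_height(value, axis_min, axis_max, plot_height_px=400):
    """Vertical position of a value, in pixels above the bottom of the plot."""
    return (value - axis_min) / (axis_max - axis_min) * plot_height_px

# Hypothetical figures: about 8 million inhabitants, average height in feet.
population = [7_900_000, 8_000_000, 8_100_000]
avg_height_ft = [5.5, 5.6, 5.6]

# Shared axis: one scale running from 0 to the largest population value.
shared = [pixel_height(h, 0, max(population)) for h in avg_height_ft]
# Separate right-hand axis scaled to the height series alone (5 to 6 feet).
separate = [pixel_height(h, 5.0, 6.0) for h in avg_height_ft]

print(shared)    # tiny fractions of a pixel: the line hugs the x-axis
print(separate)  # values spread over the plot: changes become visible
```

On the shared axis every height value lands well below one pixel, which is why the second scale is needed.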

Here’s an example I discussed before. Proponents of the death penalty usually show the following famous graph in order to “prove” that capital punishment results in fewer homicides in the U.S., and is therefore a successful deterrent:

deterrence capital punishment death penalty

What’s wrong with this graph is that they tried to jam the two series – which are totally different in terms of magnitude – into one y-axis. To do so, they recalculated the murder series. Rather than giving the numbers as they are, they give the numbers per 66,000 people. Why this strange number: 66,000? Why not the more obvious 100,000, or why not plot the two series on different y-axes? Because now they can give the impression that the recent rise in the number of executions is closely correlated with the recent drop in the number of homicides.

Now compare this graph to this version, using the same data (but going back a bit further in time) and another graphical presentation:

homicides and executions in the U.S.

(source)

The important differences:

  • the second graph uses two y-axes
  • it counts the number of executions per homicide, and not just the total number of executions – from the point of view of deterrence, this is obviously the better measure.

We can see from the second graph that the recent upswing in the number of executions is really quite small, compared to earlier periods (there was a moratorium on executions in the U.S. in the early 1970s). Unless deterrence has somehow become much more effective than it was in the early parts of the 20th century – which is doubtful given the relatively low numbers of executions and the relatively humane methods – it’s unlikely that such a relatively small increase in the number of executions during the last decades is the cause of the extraordinary decrease in the number of homicides during the same period.

When we look at the whole time series, going back in time long enough; when we use both y-axes; and when we avoid using strange measures such as murders per 66,000 people or executions tout court rather than executions per homicide, then there isn’t a clear correlation between executions and decreasing numbers of murders.

Of course, using a left and right y-axis can also be misleading. I’ll post an example when I come across one.

freedom, lies and statistics, statistics

Lies, Damned Lies, and Statistics (4): Manipulating the Y-axis Scale in Graphs

Another common manipulation of statistics: play a bit with the starting and ending values on the y-axis of your graphs. This can give astonishing results. I prepared a fictional example. Compare the two graphs:

growth rate 2

growth rate 1

The data are exactly the same, but the y-axis in the second graph starts at 3,500 instead of 0, giving the impression that government violation of freedom of speech in Dystopia rose sharply in 2008 compared to the year before, whereas in reality things are just as awful, more or less, as before.
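The size of the distortion is easy to quantify. In the sketch below, only the axis start of 3,500 comes from the example; the two yearly counts are invented for illustration, and `drawn_height_ratio` is a hypothetical helper.

```python
def drawn_height_ratio(earlier, later, axis_start):
    """How many times taller the later value is drawn than the earlier one."""
    return (later - axis_start) / (earlier - axis_start)

# Invented counts of violations for the two years being compared.
violations_2007, violations_2008 = 3_600, 3_700

from_zero = drawn_height_ratio(violations_2007, violations_2008, axis_start=0)
truncated = drawn_height_ratio(violations_2007, violations_2008, axis_start=3_500)

print(f"axis starts at 0:     drawn {from_zero:.2f} times as tall")
print(f"axis starts at 3,500: drawn {truncated:.2f} times as tall")
```

With these numbers, an increase of under 3% is drawn as a doubling once the axis starts at 3,500.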

E.D. Kain of The League of Ordinary Gentlemen believes he has spotted a real-life example of this kind of manipulation. While it’s not difficult to find such examples, this isn’t one. On the contrary, Kain himself commits the mistake he accuses someone else of making. Let me explain. He points to this graph from Conor Clarke on Andrew Sullivan’s blog:

graph effective federal tax rate for top 1 percent of households by income

(source)

This graph, illustrating (or not, if you’re Kain) the drop in effective income tax rates for the top 1% of Americans from the Clinton to the Bush years, is used by many to argue that a small increase in taxation for the super-rich wouldn’t mean Armageddon. At first sight, the y-axis does indeed look like it has been manipulated in order to highlight a sharp decline in tax rates for the rich.

Hence, Kain goes to work and “corrects” the chart, making the y-axis start at 0% and end at 100%:

income tax graph

Just goes to show that manipulation can also mean using the apparently “neutral” starting and ending points of 0 and 100. Not only does he remove all useful information from the previous graph; he also assumes that taxes can somehow be close to 0% or 100%. One shouldn’t assume this, since it never happens in reality. Making the graph start at 0 and end at 100 means assuming it can happen, and is therefore disingenuous. An example: suppose I want to show that life expectancy hasn’t risen a lot over the last centuries (which isn’t true). So I include the extreme of 500 years as the end value in my y-axis. Nobody ever lives or will live till he or she is 500. Obviously, the graph will show no visible increase in life expectancy, even if people now live twice as long as a thousand years ago, on average (which is the case).

Lesson: the minimum and maximum values on the y-axis should be close to realistic real-life minimums and maximums. In that respect, the Clarke graph is better. (Although he could have used a longer period, avoiding another error).

Just to show that this type of lie occurs in real life:

Fox News manipulates Y-axis in graph

lies and statistics, statistics, work

Lies, Damned Lies, and Statistics (3): Growth Rates and Cherry Picking

Statistics can be dangerous, as is evident from the previous posts in this series. People making them can make mistakes, or can use them to deceive. And people reading them can misinterpret them. Our treatment of human rights on this blog depends heavily on the use of statistics, and so the quality of those statistics is important. This blog series mentions some of the things that can go wrong.

Statistical mistakes or statistical lies occur in all kinds of fields, not only the field of human rights. Here’s one that is often made in discussions on climate change. It has to do with measuring growth rates (which we also do for human rights).

Kevin Drum has a quote from George Will, and replies with a graph:

George Will [claimed] that “If you’re 29, there has been no global warming for your entire adult life”. … If you’re 29, you became an adult in 1998, and average global temperatures last year were lower than they were in 1998. So: no global warming in your adult lifetime.

Global warming 1998 2008

The earth is actually cooling! But as about a thousand serious climate researchers have pointed out, it’s not true. Global temps have been trending up for over a century, but in any particular year they can spike up and down quite a bit. In 1998 they spiked up far above the trend line and last year they spiked below the trend line. So 2008 was cooler than 1998.

Of course, you can prove anything you want if you cherry pick your starting and ending points carefully enough. For example: The year 2000 was below the trend line and 2005 was above it. Temps were up 0.4°C in only five years! The seas will be boiling by 2050!
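The point about cherry-picked endpoints can be checked on a toy series. The temperature anomalies below are invented (they are not real climate data) and merely mimic an upward trend with a spike in 1998; the sketch compares two hand-picked endpoint pairs with a least-squares slope over the whole series.

```python
# Invented anomalies in °C, for illustration only (not real measurements).
anomaly = {
    1994: 0.30, 1995: 0.40, 1996: 0.32, 1997: 0.45, 1998: 0.62,
    1999: 0.40, 2000: 0.38, 2001: 0.50, 2002: 0.56, 2003: 0.56,
    2004: 0.52, 2005: 0.66, 2006: 0.58, 2007: 0.60, 2008: 0.50,
}

def endpoint_change(start_year, end_year):
    """Change between two hand-picked years, ignoring everything in between."""
    return anomaly[end_year] - anomaly[start_year]

def least_squares_slope(series):
    """Ordinary least-squares slope (°C per year) over the whole series."""
    years, values = list(series), list(series.values())
    mean_x, mean_y = sum(years) / len(years), sum(values) / len(values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx

cooling = endpoint_change(1998, 2008)   # starts on a spike: looks like cooling
heating = endpoint_change(2000, 2005)   # starts in a dip: looks like rapid warming
trend = least_squares_slope(anomaly)    # uses every year

print(f"1998 to 2008: {cooling:+.2f} °C")
print(f"2000 to 2005: {heating:+.2f} °C")
print(f"trend over all years: {trend:+.4f} °C per year")
```

The same series supports "cooling" or "runaway warming" depending on which two years are picked, while the fitted trend uses all of them.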

Here’s another example of cherry picking start or ending dates in a time series so as to highlight or drown out a growth rate (positive or negative), this time more closely related to the issue of human rights (more specifically the right to work).* Compare these two graphs (in the first graph, just look at the red line for “unemployment rate”, the rest isn’t important, for now – I’ll come back to it in a future post because there are other problems with this first graph):

graph number of jobs

(source)

US unemployment rate 1998 present

(source)

The first graph makes the – honest? – mistake of starting in 2003, giving the impression that Bush’s economic policies brought down unemployment. The second graph, however, gives some more historical perspective because it starts earlier, and shows that unemployment was much lower before Bush (Bush took office in January 2001) and that the decrease during his presidency wasn’t as spectacular as the first graph suggests.

Of course, you can’t hold a president responsible for unemployment, at least not exclusively. But then neither should you tweak graphs so as to give the impression that the president’s policies have a beneficial impact (read the title of the first graph).

* Technically, this isn’t a growth rate, just a time series, but the same logic holds.

comedy, lies and statistics, statistics

Lies, Damned Lies, and Statistics (2): Extrapolation

I don’t think it’s a good idea to be blinded by love, and I apply that to my love for statistics. If you’re tempted to take the statistics on this blog (or elsewhere) too seriously, take a look at the image below (and also this one).

statistics extrapolation

(source)

The same thing happens in this joke:

Two statisticians were flying from Los Angeles to New York. About an hour into the flight, the pilot announced, “Unfortunately, we have lost an engine, but don’t worry: There are three engines left. However, instead of five hours, it will take seven hours to get to New York.”

A little later, he told the passengers that a second engine had failed. “But we still have two engines left. We’re still fine, except now it will take ten hours to get to New York.”

Somewhat later, the pilot again came on the intercom and announced that a third engine had died. “But never fear, because this plane can fly on a single engine. Of course, it will now take 18 hours to get to New York.”

At this point, one statistician turned to another and said, “Gee, I hope we don’t lose that last engine, or we’ll be up here forever!”

Unfortunately, such things don’t happen only in jokes. It’s quite common to take a trend and assume it will continue on the same path it has taken in the past.
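The joke can be turned into a small calculation. The sketch below fits a straight line to the pilot's four announcements and extends it to the case the data never covered. The straight-line model is an assumption made purely for illustration, and `fit_line` is a hypothetical helper.

```python
# Flight times from the joke: hours to New York by number of engines lost.
engines_lost = [0, 1, 2, 3]
flight_hours = [5, 7, 10, 18]

def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(engines_lost, flight_hours)

# Extrapolate to four engines lost, a case outside the observed range.
predicted_with_no_engines = intercept + slope * 4
print(f"predicted flight time with no engines: {predicted_with_no_engines:.1f} hours")
```

The fitted line happily predicts a long but finite flight with no engines at all. The arithmetic is fine; the mistake is assuming that the pattern still holds outside the range where it was observed.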

comedy, lies and statistics, statistics

Lies, Damned Lies, and Statistics (1): Correlation

You know I love graphs and statistics, so here’s one showing how importing lemons from Mexico reduces highway fatality rates in the U.S.:

lemon graph correlation is not causation

(source)

And here’s another one. Just so that you don’t automatically believe everything I write (as if you would), and a funny reminder that correlation doesn’t necessarily imply causation.
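Why do such unrelated series correlate so well? Often because both simply drift over time. The Python sketch below uses invented numbers (not the data behind the lemon graph) and a hand-written Pearson coefficient to show this.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented figures for illustration only.
years = [1996, 1997, 1998, 1999, 2000]
lemon_imports = [230, 280, 330, 400, 480]       # rising over time
fatality_rate = [15.9, 15.8, 15.6, 15.3, 15.0]  # falling over time

r_lemons_fatalities = pearson(lemon_imports, fatality_rate)
r_year_lemons = pearson(years, lemon_imports)
r_year_fatalities = pearson(years, fatality_rate)

print(f"lemons vs fatalities: {r_lemons_fatalities:+.3f}")
print(f"year vs lemons:       {r_year_lemons:+.3f}")
print(f"year vs fatalities:   {r_year_fatalities:+.3f}")
```

Both series correlate almost perfectly with the calendar year, so they also correlate with each other, without either one having any effect on the other.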

For some real statistics, see here. For something more on the famous quote in the title, go here.
