# Lies, Damned Lies, and Statistics (4): Manipulating the Y-axis Scale in Graphs

Another common way to manipulate statistics is to play with the starting and ending values on the y-axis of a graph. The results can be astonishing. I have prepared a fictional example. Compare the two graphs:

The data are exactly the same, but the y-axis in the second graph starts at 3,500 instead of 0. That gives the impression that government violations of freedom of speech in Dystopia rose sharply in 2008 compared to the year before, whereas in reality things are more or less as awful as before.
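To put a number on the effect, here is a minimal sketch that compares how high the later value looks relative to the earlier one under each baseline. The counts 3,700 and 3,900 are hypothetical stand-ins (the actual values of my fictional example aren't given in the text), and `apparent_ratio` is just an illustrative helper name.

```python
def apparent_ratio(earlier, later, axis_min=0.0):
    """Ratio of the plotted heights of two values above the axis baseline."""
    if earlier <= axis_min:
        raise ValueError("the earlier value must lie above the axis minimum")
    return (later - axis_min) / (earlier - axis_min)


# Hypothetical counts for 2007 and 2008 (assumed for illustration only).
v2007, v2008 = 3700, 3900

honest = apparent_ratio(v2007, v2008, axis_min=0)        # y-axis starts at 0
truncated = apparent_ratio(v2007, v2008, axis_min=3500)  # y-axis starts at 3,500

print(f"axis from 0:     2008 looks {honest:.2f}x as high as 2007")
print(f"axis from 3,500: 2008 looks {truncated:.2f}x as high as 2007")
```

With these assumed numbers, an actual rise of about 5% is drawn as a doubling once the axis starts at 3,500.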

E.D. Kain of The League of Ordinary Gentlemen believes he has spotted a real-life example of this kind of manipulation. While it’s not difficult to find such examples, this isn’t one. On the contrary, Kain himself commits the mistake he accuses someone else of making. Let me explain. He points to this graph from Conor Clarke on Andrew Sullivan’s blog:


This graph, illustrating (or not, if you’re Kain) the drop in effective income tax rates for the top 1% of Americans from the Clinton to the Bush years, is used by many to argue that a small increase in taxation for the super-rich wouldn’t mean Armageddon. At first sight, the y-axis does indeed look like it has been manipulated in order to highlight a sharp decline in tax rates for the rich.

Hence, Kain goes to work and “corrects” the chart, making the y-axis start at 0% and end at 100%:

This just goes to show that manipulation can also mean using the apparently “neutral” starting and ending points of 0 and 100. Not only does Kain remove all useful information from the previous graph; he also implies that tax rates can somehow be close to 0% or 100%. One shouldn’t assume this, since it never happens in reality. Making the graph start at 0 and end at 100 suggests that it can happen, and is therefore disingenuous. An example: suppose I want to show that life expectancy hasn’t risen much over the last centuries (which isn’t true). So I use the extreme value of 500 years as the top of my y-axis, even though nobody lives or ever will live to 500. Obviously, the graph will show no visible increase in life expectancy, even if people now live twice as long, on average, as a thousand years ago (which is the case).
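The life expectancy example can be made concrete with a rough sketch that measures how much of the y-axis a change actually occupies. The figures of 35 and 70 years are assumed for illustration (a doubling, as in the example, not measured data), and `share_of_axis` is a hypothetical helper name.

```python
def share_of_axis(start, end, axis_min, axis_max):
    """Fraction of the y-axis height taken up by the change from start to end."""
    span = axis_max - axis_min
    if span <= 0:
        raise ValueError("axis_max must be greater than axis_min")
    return abs(end - start) / span


# Assumed life expectancies in years: a doubling from 35 to 70.
then, now = 35, 70

stretched = share_of_axis(then, now, axis_min=0, axis_max=500)
tighter = share_of_axis(then, now, axis_min=0, axis_max=100)

print(f"axis 0-500: the rise fills {stretched:.0%} of the chart height")
print(f"axis 0-100: the rise fills {tighter:.0%} of the chart height")
```

On an axis that runs to 500 years, a doubling of life expectancy fills only a few percent of the chart height, which is why it all but vanishes from view.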

Lesson: the minimum and maximum values on the y-axis should be close to realistic real-life minimums and maximums. In that respect, the Clarke graph is better (although he could have used a longer period, avoiding another error).

Just to show that this type of lie occurs in real life:
