
Lies, Damned Lies, and Statistics (12): Generalization



An example from Greg Mankiw’s blog:

Should we [the U.S.] envy European healthcare? Gary Becker says the answer is no:

“A recent excellent unpublished study by Samuel Preston and Jessica Ho of the University of Pennsylvania compare[s] mortality rates for breast and prostate cancer. These are two of the most common and deadly forms of cancer – in the United States prostate cancer is the second leading cause of male cancer deaths, and breast cancer is the leading cause of female cancer deaths. These forms of cancer also appear to be less sensitive to known attributes of diet and other kinds of non-medical behavior than are lung cancer and many other cancers. [Health effects of diet and behavior should be excluded when comparing the quality of healthcare across countries. FS]

These authors show that the fraction of men receiving a PSA test, which is a test developed about 25 years ago to detect the presence of prostate cancer, is far higher in the US than in Sweden, France, and other countries that are usually said to have better health delivery systems. Similarly, the fraction of women receiving a mammogram, a test developed about 30 years ago to detect breast cancer, is also much higher in the US. The US also more aggressively treats both these (and other) cancers with surgery, radiation, and chemotherapy than do other countries.

Preston and [Ho] show that this more aggressive detection and treatment were apparently effective in producing a better bottom line since death rates from breast and prostate cancer declined during the past 20 [years] by much more in the US than in 15 comparison countries of Europe and Japan.” (source)

Even if all this is true, how on earth can you assume that a healthcare system is better because it is more successful in treating two (2!) diseases? See here and here for a more complete picture.

Another example: the website of the National Alert Registry for sexual offenders used to post a few “quick facts”. One of them said:

“The chance that your child will become a victim of a sexual offender is 1 in 3 for girls… Source: The National Center for Victims of Crime”.

Someone took the trouble of actually checking this source, and found that it said:

Twenty-nine percent [i.e. approx. 1 in 3] of female rape victims in America were younger than eleven when they were raped.

Roughly one in three female rape victims is a young girl, but you can’t generalize from that to the claim that one in three young girls will become a victim of rape. Perhaps they will, but you can’t know that from these data. In the same way, you can’t conclude from how the U.S. deals with two diseases that it “shouldn’t envy European healthcare”. Perhaps it shouldn’t, but more general data on life expectancy suggest it should.
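To see why the 29% figure says nothing about the risk to girls in general, here is a minimal sketch in Python. Only the 0.29 comes from the quoted source. The total number of victims and the sizes of the age group are invented purely for illustration, and the function name is hypothetical as well.

```python
def share_of_group_affected(share_of_cases_in_group, total_cases, group_size):
    """Turn "x% of all cases fall in this group" into
    "y% of this group are cases". This needs two extra numbers
    that the quoted statistic does not provide."""
    cases_in_group = share_of_cases_in_group * total_cases
    return cases_in_group / group_size


# From the quoted source: 29% of female victims were younger than eleven.
SHARE_OF_VICTIMS_UNDER_11 = 0.29

# HYPOTHETICAL numbers, made up for this example only.
HYPOTHETICAL_TOTAL_VICTIMS = 1_000
HYPOTHETICAL_GROUP_SIZES = (10_000, 100_000, 1_000_000)

for group_size in HYPOTHETICAL_GROUP_SIZES:
    risk = share_of_group_affected(
        SHARE_OF_VICTIMS_UNDER_11, HYPOTHETICAL_TOTAL_VICTIMS, group_size
    )
    print(f"girls under 11: {group_size:>9,}  ->  share affected: {risk:.4%}")
```

The same 29% is compatible with very different risks depending on the two numbers the "quick fact" left out. None of the made-up scenarios comes anywhere near 1 in 3, but that only reflects the invented inputs. The point is that the risk cannot be read off the 29% at all.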

Both are examples of faulty induction. Induction, or inductive reasoning (sometimes called inductive logic), formulates general laws based on limited observations of recurring patterns. It is employed, for example, in using specific propositions such as:

This door is made of wood.

to infer general propositions such as:

All doors are made of wood. (source)


