Recently I came across this Washington Post headline: “59,000 farmer suicides in India over 30
years may be linked to climate change, study says.” The article explains that a researcher looked
back almost 50 years, comparing suicide data with climate records, and “concluded that
temperature may have ‘a strong influence’ on suicide rates during the growing
season.” Note the words “may have a strong [but unspecified] influence.” The
researcher goes on to project an increase in the number of “lives lost to
self-harm” in India.
What are we to make of this?
The number of suicides correlates with temperature. Is global warming to blame?
Correlation is a mathematical expression of how closely any
two measurements move relative to each other.
If the first one rises in exact, steady proportion as the second rises, the
correlation equals 1, a perfect positive correlation. If the first one falls in exact,
steady proportion as the second rises, the correlation equals -1, a perfect
negative correlation.
In our imperfect world this rarely happens, so a correlation
of 0.9 (often quoted as 90%) indicates that the two measurements are moving very closely
together. A correlation close to zero
shows no straight-line relationship between them. The math
is not difficult. A laptop can do it
easily and show the graph, with points either closely tracking a line for a
strong correlation or looking like a random scattering for a weak
(near zero) correlation.
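For readers who want to see the arithmetic, here is a minimal Python sketch of the usual (Pearson) correlation calculation. The guest, pizza and leftover numbers are made up for illustration, and `pearson_r` is just a helper name chosen here, not something from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How much the two lists move together around their averages.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each list moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list never changes")
    return co_movement / math.sqrt(spread_x * spread_y)

# Made-up party data, for illustration only.
guests = [4, 8, 12, 16, 20]
pizzas = [2, 4, 6, 8, 10]      # one pizza for every two guests
leftover = [10, 8, 6, 4, 2]    # slices left over shrink as guests grow

r_up = pearson_r(guests, pizzas)      # perfect positive correlation
r_down = pearson_r(guests, leftover)  # perfect negative correlation

print(r_up, r_down)
```

With these numbers the two results come out as 1 and -1. Real data almost never lines up this neatly, which is why values like 0.9 or 0.2 are what you normally see.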
The most important thing to understand about correlation is
that correlation is not causation! This
is a major point of emphasis in every first-year statistics class. Just because two things vary together
does not mean they are related in any real way.
Lots of things get bigger together and are related, like the
size of a tree and the amount of lumber available from the tree or the number
of pizzas (or amount of beer) needed to feed guests at a party – more people =
more pizza. Gas mileage (MPG) may be
related to the size of a truck, the power of the engine or the weight of the
load. These correlations are easy to
explain.
When two measures are correlated, sometimes one influences the
other, or perhaps they are both affected by some unseen common factor. Often, though, two measures are mathematically related but have no
logical link. Some websites specialize
in finding odd examples of these supposed relationships.
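The "unseen common factor" case can be sketched with a small simulation. Everything here is invented for illustration: `hidden` stands in for some unmeasured factor, and `a` and `b` are two measurements that each depend on it but have no effect on each other. The sizes and noise levels are arbitrary assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / math.sqrt(spread_x * spread_y)

random.seed(1)  # fixed seed so reruns give the same numbers

# Hypothetical hidden factor that nobody measured.
hidden = [random.gauss(0, 1) for _ in range(500)]

# Two measurements that each follow the hidden factor plus their own noise.
# Neither one influences the other.
a = [h + random.gauss(0, 0.5) for h in hidden]
b = [h + random.gauss(0, 0.5) for h in hidden]

r_raw = pearson_r(a, b)

# Take the hidden factor back out and only the unrelated noise is left.
a_rest = [x - h for x, h in zip(a, hidden)]
b_rest = [y - h for y, h in zip(b, hidden)]
r_rest = pearson_r(a_rest, b_rest)

print(f"a vs b: {r_raw:.2f}")
print(f"a vs b with the hidden factor removed: {r_rest:.2f}")
```

The first number should come out strongly positive and the second close to zero. The catch in real life is that the hidden factor is hidden, so nobody knows to remove it.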
Here is one I’ve used before, describing a study
correlating dog ownership with eating eggrolls, eating cabbage with having an
innie bellybutton, and many others.
Another site, with lovely graphs, correlates the divorce rate in Maine
with consumption of margarine, consumption of cheese with the number of people
who died by becoming tangled in their bed sheets, and several more entertaining examples. There are the joking relationships between the
stock market and skirt lengths, or between the stock market and the conference of the team that won the last Super
Bowl. I wrote last year about the journalist who compared many measurements and found a surprising correlation between eating dark chocolate and weight loss in one data set. He later admitted that it was bad science and should not be taken seriously. If you look hard enough, you can
find all kinds of weird examples of data that correlate just by
coincidence. That’s what makes it
tricky.
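How hard do you have to look? A rough way to see it is to generate a pile of purely random number lists and go hunting for the pair that matches best. This is only a sketch: the 100 lists of 10 points each are arbitrary choices, and `pearson_r` is a helper name picked for this example.

```python
import itertools
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / math.sqrt(spread_x * spread_y)

random.seed(2)  # fixed seed so reruns give the same numbers

N_SERIES = 100  # how many unrelated "measurements" we invent
N_POINTS = 10   # how many data points each one has (say, ten years)

# Every list is pure noise; none has anything to do with any other.
series = [[random.gauss(0, 1) for _ in range(N_POINTS)]
          for _ in range(N_SERIES)]

# Compare every list against every other list and keep the best match.
best_pair = max(
    itertools.combinations(range(N_SERIES), 2),
    key=lambda pair: abs(pearson_r(series[pair[0]], series[pair[1]])),
)
best_r = pearson_r(series[best_pair[0]], series[best_pair[1]])
n_pairs = N_SERIES * (N_SERIES - 1) // 2

print(f"Checked {n_pairs} pairs of random lists")
print(f"Best match: lists {best_pair[0]} and {best_pair[1]}, r = {best_r:.2f}")
```

With a few thousand pairs to choose from, the best match typically looks impressively strong even though every list is noise. Short lists and many comparisons are exactly the conditions under which coincidences turn up.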
The use of correlation is quite common in studies about
health and other areas.
Studies find a mathematical relationship and use the magic word linked.
An action or habit is linked to a healthy or unhealthy outcome. Eating A is linked to longer or shorter life. The relationship researchers find may be the direct, pizza-supply-to-number-of-guests kind, or it may be the cheese-to-death-by-bed-sheets
kind. But at that point it's only math, not reality.
This is what came to mind when I saw that headline. To be fair, the researcher tries to explain
how there can be an other-than-mathematical relationship. “High temperatures in the growing season
reduce crop yields, putting economic pressure on India's farmers.” I think any farmer can tell you that weather
often increases economic pressure, but how often does that lead to suicide? Don’t we need more than just two sets of
numbers to compare on a graph? Is this reality or just math?
Whenever critical thinkers hear the word linked, they know someone has found a mathematical
relationship. Researchers hope to discover a real
relationship so they can use the word causes,
as in action A causes outcome B, but that is much more difficult. Understanding this is very important to keep
from panicking over the latest headline, insisting that someone do something or
trying to persuade friends and family to change eating habits to avoid catastrophe. You don’t have to be a math or statistics wizard
to know that linked is not
necessarily the same as causes.