We often hear the term "statistically significant" in
reference to scientific studies. What
does it mean, and why does it matter?
In life, in nature, in the world, almost everything varies: the heights of trees outside your window,
your temperature and blood pressure, your heating bill, the shoe sizes of all
your friends, the rate at which the plants in your garden grow. Even in finely tooled machine parts,
precision is a matter of how closely you care to measure.
Variation is everywhere, but we take it for granted. It’s not until things start repeating exactly that we
begin to ask questions. If trees are exactly the same
height, we suspect that we are looking at a cartoon or drawing, not a true
picture of nature. If you meet someone
with the same birthday as you, it’s unexpected.
If your electric usage is exactly the same month after month, it seems suspicious. I’m sure you can think of many other
examples. Variation is natural and
expected.
When scientists develop a new treatment, whether it be a
drug or anything else, they want to know if it works, that is, if it produces
the effect they are looking for. They
treat one group and compare the results with those of a second, untreated group (the control group). If the treated group does better, they want
to believe that the treatment works, but they know that everything varies. There are subjects (patients, crops, or white
mice) in the treated group that improve a lot, some that improve a little and
some that may not change. There are
subjects in the control group that improve, despite the fact that they were not
treated. Comparing the two groups, they
may find quite a bit of overlap where a few untreated individuals do better
than a few treated ones.
The question to answer is: Has there been enough improvement overall in the treated group to attribute
it to the treatment, or could it be the result of natural variation? Has variation among individuals tricked them
into seeing a favorable but false effect?
In fact, there is no way to know for sure. They can only judge whether the probability that natural
variation alone would produce such an improvement is low enough to make them confident that the treatment is working. They use statistics to calculate that
probability, and if it’s low enough, they call the result statistically
significant. That means they are
confident, but never absolutely certain.
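One way to see how such a probability can be calculated is a shuffling experiment, which statisticians call a permutation test. The sketch below is only an illustration, not the method used in any particular study: the improvement scores are made up, and the function name `permutation_p_value` and its parameters are hypothetical. It repeatedly mixes the treated and untreated subjects together at random and counts how often chance alone produces a difference at least as large as the one actually observed. That fraction is an estimate of the probability described above, often called a p-value.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Estimate how often random regrouping alone gives a difference in
    group means at least as large as the observed one (one-sided)."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = mean(treated) - mean(control)

    pooled = list(treated) + list(control)
    n_treated = len(treated)

    at_least_as_large = 0
    for _ in range(n_shuffles):
        # Pretend the treatment did nothing: group labels are arbitrary,
        # so reshuffle everyone and split into two groups again.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if diff >= observed:
            at_least_as_large += 1

    return at_least_as_large / n_shuffles


# Made-up improvement scores, for illustration only.
treated = [4.1, 5.3, 2.2, 6.0, 3.8, 5.5, 0.4, 4.9]
control = [1.0, 2.9, 0.2, 3.5, 1.8, 4.4, 0.9, 2.1]

p = permutation_p_value(treated, control)
print(f"Estimated probability that variation alone explains the gap: {p:.3f}")
```

How low is "low enough" is a matter of convention. Many fields use a cutoff of 0.05, but that threshold is a choice, not a law of nature.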
This explains how two studies of the same drug, herb, food
or fertilizer can produce different results.
It explains such things as the recent reversal on the benefits of calcium
and vitamin D supplements for post-menopausal women and the finding that cranberry juice "does not appear to have significant benefits" in preventing urinary tract infections. Try as researchers might to eliminate or account for
all the causes of variation, the result is never perfect. Variation is everywhere. Greater confidence requires multiple tests that
may confirm or may contradict prior results.
Variation makes certainty unachievable.
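A small simulation can show why repeated studies of the same treatment disagree. This is a sketch under stated assumptions, not a model of any real trial: the function `simulate_study`, the group size, the true effect of 1.0, and the spread of 3.0 are all invented for illustration. Every simulated study tests exactly the same treatment, yet the observed improvement differs each time because the individual subjects vary.

```python
import random
from statistics import mean


def simulate_study(rng, n_per_group=20, true_effect=1.0, spread=3.0):
    """Run one imaginary study and return the observed difference
    between the treated and control group averages."""
    # Each subject's outcome is natural variation around a group average.
    control = [rng.gauss(0.0, spread) for _ in range(n_per_group)]
    treated = [rng.gauss(true_effect, spread) for _ in range(n_per_group)]
    return mean(treated) - mean(control)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Ten studies of the same treatment with the same true effect.
differences = [simulate_study(rng) for _ in range(10)]
for i, d in enumerate(differences, start=1):
    print(f"study {i:2d}: observed improvement = {d:+.2f}")
```

Averaged over many such studies, the observed improvement settles close to the true effect, which is why multiple tests give greater confidence than any single one.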