I was writing back and forth with Gautham a bit, and he observed that statistics truly is a grim science: the whole point of it is to curb your enthusiasm, be the wet blanket, rain on the parade. I think it’s also true that statistics as applied to science today is perhaps not used the way statisticians really intended it to be used. From what I’ve read (correct me if I’m wrong, dear readers), things like the p-value were designed as quick tests just to make sure that whatever you were observing might not just be a fluke. It is not a substitute for scientific thought. But in our desire to systematize our analyses, we’ve created a system in which the p-value is an end in and of itself. We’ve all heard of people digging through statistical tests to check whether such and such effect is statistically significant, and that is of course bad, and we'd of course never do anything like that ourselves :). But at least there is an effect that someone has reasoned may have *scientific* significance, so the problem can in principle be rectified with more data. The other situation, where you just look for statistical significance as the scientific finding itself, seems more troubling to me from the perspective of gaining scientific insight (I have been guilty of this myself). We have a tricky situation in biology now, where our measurement tools are getting so good that we can pick up tons of effects with statistical significance. The real question is scientific significance. As an aside, at Nature, I think you can’t even utter the word significant unless it comes with a p-value. To me, this is placing the wrong emphasis on the use of the word significant.
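To put rough numbers on the point about measurement tools, here is a minimal sketch of how a fixed, tiny effect becomes "statistically significant" once the sample is large enough. It assumes a two-sample z-test with known unit variance and equal group sizes; the function name `two_sample_z_pvalue` and the 0.01 standard deviation effect size are made up for illustration and not taken from any real dataset.

```python
import math


def two_sample_z_pvalue(effect_in_sd, n_per_group):
    """Two-sided p-value for a difference in means between two groups.

    Assumes both groups have known unit variance and n_per_group samples,
    so the standard error of the difference is sqrt(2 / n_per_group).
    """
    z = effect_in_sd * math.sqrt(n_per_group / 2)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Same hypothetical effect (0.01 standard deviations), growing sample size.
for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {two_sample_z_pvalue(0.01, n):.3g}")
```

The effect size never changes across the three lines of output; only the p-value does, going from far above any usual threshold to far below it. Whether a shift of 0.01 standard deviations matters is a scientific question that the p-value does not answer.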

One of my favorite Gautham quotes is “Would Newton have come up with the theory of gravitation by machine learning?” A corollary quote could be “What is the p-value on the theory of gravitation?” Lately, my thinking is that a theory cannot have a p-value: it is the product of scientific thinking and trying to understand our world, and *that* is what is significant. A theory can lead to a model of a system, which you can use to make predictions which can have p-values (like the five-sigma standard for the discovery of the Higgs) to evaluate whether the theory is right or wrong. But the theory itself stands or does not stand on the collective judgement of us as scientists and humans as to whether it is telling us the truth about reality. I feel like that is the truest test of significance.
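For a sense of scale on the five-sigma standard, here is a small sketch that converts a number of sigmas into a p-value. It assumes the one-sided Gaussian tail convention commonly used in particle physics; the helper name `sigma_to_one_sided_p` is made up for this example.

```python
import math


def sigma_to_one_sided_p(n_sigma):
    """One-sided tail probability of a standard normal beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


for n_sigma in (2, 3, 5):
    print(f"{n_sigma} sigma -> p = {sigma_to_one_sided_p(n_sigma):.3g}")
```

Under that assumption, five sigma works out to a p-value of roughly 3 in 10 million, which attaches to a specific prediction of the model and not to the theory as a whole.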

The reliance on p-values in science is pretty nuts. Many people, including scientists, either do not understand them or do not interpret them properly. They are not useless, but frequently do not answer the question scientists are interested in. See here for a good discussion:

http://bayesianbiologist.com/2011/08/21/p-value-fallacy-on-more-or-less/

You are very correct. Theorems (not theories) are real science. Not statistics.
