I started thinking about this because I saw Yoav Gilad’s reanalysis of some previously published expression profile data, which showed that the “interesting” finding went away after correcting for batch effects. Someone on Twitter asked whether the paper should be retracted. Should it?
I grew up with the maxim “Flawed data, retract; flawed interpretation, don’t retract”. I think that made a lot of sense. If the data themselves are not reproducible (fraudulent or otherwise), then that’s of course grounds for retraction. Flawed interpretations come in a couple of varieties. Some are only visible in hindsight. For example, “I thought that this band on the gel showed proof of XYZ effect, but actually it’s a secondary effect due to ABC that I didn’t realize at the time” is a flaw, yes, but at the time, the author would have been fine in believing that the interpretation was right. Not really retraction worthy, in my opinion. Especially because all theories and interpretations are wrong on some level or another: should we retract Newton’s theory of gravitation because of Einstein?
Now, there’s another sort of interpretational flaw, one that comes from a logical error. These also come in a number of types. Some are just plain old interpretational flaws, like claiming something that your data don’t fully support. This can be subtle, like failing to consider a reasonable alternative explanation, which is a common problem. (Flawed experimental design also falls under this heading, I think.) Certainly overclaiming and so forth are rampant and generally considered relatively benign.
Where it gets more interesting is when there is a flaw in the analysis, an issue that is becoming more prevalent as complex computational analyses become more common (and where many authors essentially have to trust that someone did something right). Is data processing part of the analysis or part of the data? I think that puts us squarely in the grey zone. What makes it complex is the interplay between the biological interpretation and the nature of the technical flaw. Here are some examples:
- The one that got me thinking about this was when Yoav Gilad reanalyzed some existing expression profiles from human and mouse tissues. The conclusion of the original paper was that the profiles clustered by species rather than by tissue (surprise!), but upon removing batch effects, one finds that tissues cluster together more tightly than species (whoops!). Retraction? Is this an obvious flaw in methodology? Would it matter whether people figured out the importance of batch effects before or after it was published? If so, how long after? I would say this should not be retracted, because these lines seem rather arbitrarily drawn.
- Furthermore, if we were to retract papers because the analysis method was not right, then we would go down a slippery slope. What if I analyze my RNA-seq data using an older aligner that doesn’t do quite as good a job as a newer one? Is that grounds for retraction? I’m pretty sure most people would say no. But how is that really so different from the above? One could say that in this case, there is little change in the biological conclusion. But there are very few biological conclusions that stand the test of time, so I’m less swayed by that argument.
- Things may seem more complicated depending on where the error arises. Take the controversial paper from a few years back that reported widespread RNA/DNA sequence differences detected by sequencing. Many people provided rebuttals with evidence that many of the differences were in fact sequencing artifacts. I’m no expert, but on the face of it, it seems as though the artifact people have a point. Should this paper be retracted? Here, the issue is allegedly a flaw in the early stages of the analysis. Does this count as data or interpretation? To many, it feels like it should be retracted, but where’s the real difference between this and the two previous examples?
- I know a very nice and influential paper in which there is a minor mathematical error in a formula in part of the analysis method (I am not associated with this paper). This changes literally all the results, but only by a small amount, and none of the main conclusions of the paper are affected. Here, the analysis is wrong, but the interpretation is right. I believe they were contacted by a theorist who pointed out the error and asked “when will you retract the paper?”. Should they retract? I would say no, as would most people in this case. Erratum? Maybe that’s the way to go? But I am somewhat sympathetic to the fact that a stated mathematical result is wrong, which is bad. And this is a case in which I’m saying that the biological conclusion should trump the analysis flaw.
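To make the batch-effect scenario in the first example concrete, here is a toy sketch with entirely simulated numbers (not the actual human/mouse data, and not Gilad’s pipeline): when the per-batch shift is larger than the tissue signal, samples group by batch, and a naive correction that subtracts each batch’s mean profile flips the grouping to tissue.

```python
# Toy illustration only: simulated expression profiles where the batch
# shift deliberately dominates the tissue signal. "liver"/"brain" and the
# batch names are made up for this sketch.
import numpy as np

rng = np.random.default_rng(0)
n_genes = 500
tissues = ["liver", "brain"]
batches = ["batch1", "batch2"]

tissue_signal = {t: rng.normal(0, 1, n_genes) for t in tissues}
batch_shift = {b: rng.normal(0, 3, n_genes) for b in batches}  # larger than signal

# Each (tissue, batch) sample = tissue signal + batch shift + measurement noise.
samples = {(t, b): tissue_signal[t] + batch_shift[b] + rng.normal(0, 0.5, n_genes)
           for t in tissues for b in batches}

def nearest(data, key):
    """Most-correlated other sample -- a crude stand-in for clustering."""
    others = [k for k in data if k != key]
    return max(others, key=lambda k: np.corrcoef(data[key], data[k])[0, 1])

before_neighbor = nearest(samples, ("liver", "batch1"))  # shares its batch

# Naive correction: subtract each batch's mean profile from its samples.
corrected = {}
for b in batches:
    keys = [k for k in samples if k[1] == b]
    mean_profile = np.mean([samples[k] for k in keys], axis=0)
    corrected.update({k: samples[k] - mean_profile for k in keys})

after_neighbor = nearest(corrected, ("liver", "batch1"))  # now shares its tissue

print("before correction:", before_neighbor)
print("after correction: ", after_neighbor)
```

Real reanalyses use more careful tools than per-batch mean-centering, but the qualitative flip is the same: the “right” biological conclusion depends on a processing step that the original authors may reasonably not have performed.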
I realize this is a pretty high bar for retraction. For me, that’s fine because, practically speaking, I think it’s far better to just leave flawed papers in the literature. Retractions in biomedical science carry an association with fraud, and I think that lumping non-fraudulent but flawed papers in with examples of fraud is very harmful. Also, perhaps the data are useful to someone else down the road. We wouldn’t want the data to be marked as “retracted” just because of some mistake in the analysis, right? But this also depends on at what point the data are considered data. For instance, let’s say I used the wrong annotations to quantify transcript abundance per gene and reported those numbers. Then the processed data are flawed, but the raw reads are probably fine. Hmm. Retract one and not the other?
Anyway, I think it’s something worth thinking about.
Update, 5/12/2015: Lots of interesting commentary around this, especially in the case of the Gilad reanalysis of the PNAS paper. Leonid Kruglyak had a nice point:
> @arjunrajlab @lizatucsf @BioMickWatson @joe_pickrell line is "in light of new evidence..." vs "we messed up."
>
> — Leonid Kruglyak (@leonidkruglyak) May 11, 2015
Sounds reasonable in this case, right? I still think there are many situations in which this distinction is pretty arbitrary, though. Here, the issue was that they didn’t watch out for batch effects. Once people realized that batch effects were a thing, how long did it take before correcting for them was considered standard procedure? 1 year? 2 years? A consensus of 90% of the community? 95%? And what if it turns out 10 years from now that batch effects were not actually a problem after all and the original conclusion was valid? These all sound less relevant in this instance, but I think the principle still applies.
Great point from Joe Pickrell:
> @arjunrajlab but overall in a world where online commenting is easy, retraction does seem like a pretty blunt instrument
>
> — Joe Pickrell (@joe_pickrell) May 11, 2015
I really like the idea of just marking papers as wrong in the comments, perhaps accompanied by making comments more visible. (A more involved version of this could be paper versioning.) In this case, the data were fine, and had the paper been retracted, nobody could have done the reanalysis showing that the opposite conclusion actually holds (which is also useful information).