RajLab: Pre-registration in molecular biology

A few years back, perhaps in pre-pandy times, I was on a faculty development panel in which I was one of two presenters. I was of course there to present on how to use Twitter to build your brand (sigh, I’m lame), and a more senior faculty member (I think a neuroscientist) was there to talk about pre-registration in lab work. He was very kind and wise-seeming, and explained how his lab had been pre-registering their experiments for a while, and how it had transformed their work.

What is pre-registration? It’s probably most familiar to you in the form of clinical studies, where there was a notorious selection bias in which results would be reported. Like, does drinking coffee cause flatulence? One would have to do a randomized controlled trial to check. But if people did, say, 100 clinical trials and only reported the ones where there was a “positive” result, then you would expect to see about 5 clinical trials with p < 0.05 showing that coffee causes flatulence, even if coffee does nothing of the sort, and none of the contradictory results. So now you have to pre-register a trial, meaning that you have to say, I am going to do this trial with this power and whatnot, and then you are obligated to report the outcome, no matter what the outcome is. A great idea!
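To make the arithmetic concrete, here is a minimal simulation sketch of the coffee example. Everything in it is an assumption for illustration: the trial size, the normally distributed outcome, the simple z-test, and the function names `simulate_null_trial` and `run_trials` are all made up, and coffee is given no real effect at all. The point is only to show how many “positive” trials selective reporting would surface by chance.

```python
import random
from statistics import NormalDist, mean


def simulate_null_trial(rng, n_per_arm=50):
    """Run one hypothetical trial in which coffee has NO real effect.

    Both arms are drawn from the same standard normal distribution, so any
    difference between them is pure noise. Returns a two-sided p-value from
    a z-test that assumes a known variance of 1 in each arm.
    """
    coffee = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
    # Standard error of the difference between two means of n unit-variance draws.
    standard_error = (2.0 / n_per_arm) ** 0.5
    z = (mean(coffee) - mean(control)) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def run_trials(n_trials=100, alpha=0.05, seed=0):
    """Simulate many null trials and pick out the ones that would get reported."""
    rng = random.Random(seed)
    p_values = [simulate_null_trial(rng) for _ in range(n_trials)]
    reported = [p for p in p_values if p < alpha]  # only the "positive" results
    return p_values, reported


p_values, reported = run_trials()
print(f"trials run: {len(p_values)}")
print(f"'positive' trials that would be reported: {len(reported)}")
print(f"null results left in the file drawer: {len(p_values) - len(reported)}")
```

The exact count changes with the seed, but over many runs it averages about 5 per 100 trials. A reader who only sees the reported trials has no way of knowing how many null results are sitting in the file drawer.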

But here was someone advocating for pre-registration much closer to home, in our day-to-day lab work. I remember being vehemently and vocally opposed. Sure, clinical trials are one thing, with a clearly stated hypothesis and major resources devoted to a single experiment. But in my line of work, where we are constantly trying new experiments and checking out new avenues of work, where there are tons of false leads and new directions? How could that possibly work without gumming up the works with needless bureaucracy? I said as much, to which the senior faculty member just patiently and calmly responded, “Sure, I hear you, just think about it”.

Ever since, I keep coming back to that moment, and it has come to have a major effect on how I approach our science—and especially our reporting of it. The key take home point is: if you did an experiment to answer a question, and you don’t have any reason to exclude it based on the experiment itself, then you have to report the results. Repeat: unless there is an independent basis for the exclusion of a result, you have to report the results. Or, to put it another way: if you would have included the data if the result had come out the other way, you have to report it.

Selective reporting of data is a strange issue in molecular biology in that almost everyone agrees that it is wrong and yet the overall culture of the field leans towards selective reporting in so many ways. Here is an example from our own work. In a recent paper, we were trying to confirm the knockdown of a particular protein. We were able to show a convincing knockdown by RNA FISH, but also wanted to show that the protein levels went down. We did a bunch of westerns, but the results came out ambiguously: sometimes we saw an effect and sometimes not (there are reasons that that could be the case, but we didn't confirm those because they were very difficult). The standard thing to do here would be to not report the western results. But there was no reason to exclude the experiment other than being annoyed with the results. So, we reported it.

But again, the cultural standard in molecular biology is often not to report such ambiguous results. I saw this mindset a lot early in my career, back when RNA FISH was considered cool and people wanted our help to add some RNA FISH to their paper to spice it up. There were several times when people came to us with data in support of a, shall we say… “fanciful” hypothesis, and then we would do the RNA FISH, which would basically show the hypothesis was wrong. At which point, the would-be collaborator would beg out, saying that given the “ambiguous” nature of the RNA FISH results, “perhaps we should save the data for the next paper” (which of course never materialized). After enough of these moments, I started asking potential collaborators what stage of their paper they were at, and if they were close to the end, whether they really wanted us to do this experiment. At least one time, when faced with this choice, the person said, uhhh, let’s not!

There have also been many times when we’ve tried following up on work where we are pretty sure there has been a lot of selective reporting of positive results. Let’s just say that that is an unpleasant realization to make.

I want to emphasize that I don’t think that people are being malicious or fraudulent in their work. I think the vast majority of scientists are honest people and are not trying to do something wrong. I just think that science would benefit from having a more transparent reporting of results, because it is sometimes the data that doesn’t fit the narrative that leads to something new in the future. I also don’t necessarily think we need to formally pre-register our work, although it might be an interesting experiment to try. We should just try and shift our culture a bit towards transparent reporting.

One potential challenge in doing science this way is that our stories are a lot less likely to be “perfect”. There will almost always be some bits of conflicting evidence, and given our adversarial peer review system, there is seemingly a lot of pressure to keep these conflicting results out. Or is there? We have been doing this for quite a while, and I would say that our experience has been largely fine in the sense that reviewers don’t mind as long as you are transparent about it. I say “largely” because there have definitely been cases in which reviewers point out some issue that we were transparent about and reject our paper because of it. So at least in my experience, I would say that adopting this more transparent reporting of results is not entirely without consequence. All I can say is that if we do decide to make this cultural shift, we also have to be more tolerant of imperfections in the “story” when we put our reviewer hats on.

By the way, I think a lot of people tend to think of selective reporting as a problem of experimental science. Not at all the case! Same goes for every analysis of e.g. some large scale dataset: if you checked for some signal in the data, you have to report the result, regardless of whether the result came out the way you wanted. It’s actually if anything even more of an issue in computational work in some ways, where many hypotheses can be tested with the same data in (relatively) rapid fashion.
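As a rough illustration of why rapid hypothesis testing on one dataset matters, the sketch below computes the chance of getting at least one p < 0.05 purely by luck when none of the tested signals is real. It assumes the tests are independent, which is a simplification (tests run on the same dataset usually are not), and `chance_of_false_positive` is a hypothetical helper name.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability that at least one of n_tests independent tests of true
    null hypotheses comes out 'significant' at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# How the chance of a spurious "hit" grows with the number of things checked.
for n_tests in (1, 5, 20, 100):
    print(f"{n_tests:>3} tests: {chance_of_false_positive(n_tests):.3f}")
```

Under that independence assumption, checking 20 signals already gives better than even odds of at least one spurious hit, which is why reporting only the check that “worked” is so misleading.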

There is also a bit of a gray area in terms of what to do about false leads. Sometimes, you have an idea that goes in a new direction that has nothing to do with the story of the paper. I don’t know what to do in this case. Certainly, science would be in some ways better for having these results out there, since there was probably (hopefully?) some basis for the experiment or analysis in the first place. But it may just serve to distract from the main thread of the paper, making it harder to follow. I don’t know how best to balance these competing and important principles, but I think it’s an important discussion for us to have.

I’m very curious how people will respond to this discussion. Ultimately, there is no form or checklist that can solve the issues we have in science. Pre-registration sounds like a bureaucratic solution, but in the end, it’s just a call for careful, honest thought about the work we do. I’m sure some people reading this will have a strongly negative reaction, much like I did at first. All I’m saying is “Sure, I hear you, just think about it.” 🙂

1 comment:

Nikolai Slavov, February 5, 2024 at 2:42 PM
More complete and transparent reporting clearly benefits research and the community. Since it has associated cost, each instance is associated with a cost benefit analysis, which can also help determine the degree of documentation. For example, ambiguous and inconclusive results should be reported as having occurred, but their thorough documentation is unlikely to merit the needed time and effort.

As you wrote, more complete and transparent disclosure requires a culture change. What can help is the realization of the benefits, including the benefits for the scientific teams engaging in more transparent disclosure of their results.

Monday, February 5, 2024
