RajLab: August 2015

Tuesday, August 25, 2015

New York Gyro lost... and found!

We had some serious issues in the lab over the last several months. In mid-June, our favorite food truck (okay, Ally's and mine), New York Gyro, disappeared from the corner of 38th and Walnut, only to be replaced by some other New York Gyro guy. Okay, no big deal, one New York Gyro is as good as the next, right? Oh, how so very naive, my friend. There is this New York Gyro (affectionately referred to as "Far Gyro" because it's sort of, well, far), and then there's everyone else. His chicken over rice is hands-down the best. And a free drink?! What more can you ask for? So we thought, okay, he's leaving for Ramadan, no big deal. Well, Ramadan came and went, but no Far Gyro. At first, well, maybe he tacked on a vacation. Then maybe it was some family thing. Then... well, let's just say we were hoping that he had somehow gotten bumped off his spot and was out there, somewhere, waiting for us to find him. It got to the point where Martha (from the Winkelstein lab) made this sign:

Meanwhile, some usurper had taken his place, driving down the ratings on Yelp for New York Gyro on 38th and Walnut. Blasphemy! And then...

Found Gyro

Far Gyro is now Super Far Gyro, having resurfaced on 41st and Walnut! Rohit just happened to see him because he lives right there. Well worth the additional 3 blocks of walking. Whew.

Anyway, if you have never been to this particular truck, give it a try sometime. Chicken over rice, all the sauces, hold the lettuce (well, that last part is just for me–I have boycotted lettuce as a waste of precious time and space).

Sunday, August 23, 2015

Top 10 signs that a paper/field is bogus

These days, there has been a lot of hand-wringing about how most papers are wrong and reproducibility and so forth. Often this is accompanied by some shrill statements like “There’s a crisis in the biomedical research system! Everything is broken! Will somebody please think of the children?!” And look, I agree that these are all Very Bad Things. The question is what to do about it. There are some (misguided, in my view) reproducibility efforts out there, with things like registered replication studies and publishing all negative results and so forth. I don’t really have too much to say about all that except that it seems like a pretty boring sort of science to do.

So what to do about this supposed crisis? I remember someone I know telling me that when he was in graduate school, he went to his (senior, pretty famous) PI with a bunch of ideas based on what he'd been reading, and the PI said something along the lines of "Look, don't confuse yourself by reading too much of that stuff, most of it’s wrong anyway". I've been thinking for some time now that this is some of the best advice you can get.

Of course, that PI had decades of experience to draw upon, whereas the trainee obviously didn't. And I meet a lot of trainees these days who believe in all kinds of crazy things. I think that learning how to filter out what is real from the ocean of scientific literature is a skill that hopefully most trainees get some exposure to during their science lives. That said, there’s precious little formalized advice out there for trainees on this point, and I believe that a little knowledge can go a long way: for trainees, following up on a bogus result can lead to years of wasted time. Even worse is choosing a lab that works on a bogus field–a situation from which escape is difficult. So I actually think it is fair to ask “Will somebody please think of the trainees?”.

With this in mind, I thought it might be useful to share some of the things I've learned over the last several years. A lot of this is very specific to molecular biology, but maybe useful beyond. Sadly, I’ll be omitting concrete examples for obvious reasons, but buy me a beer sometime and then maybe I'll spill the beans. If you’re looking for a general principle underlying these thoughts, it’s to have a very strong underlying belief system based in Golden Era molecular biology. Like: DNA replication, yeah, I’m pretty sure that’s a thing. Golgi Apparatus, I think that exists. Transcription and translation, pretty sure those really happen. Beyond that, well…

Run the numbers. One consistent issue in molecular biology is that because it tends to be so qualitative, we have little sense for magnitudes and plausibility of various mechanisms. That said, we now are getting to the point where we have a lot more quantitative data that lets us run some basic sanity checks (BioNumbers is a great resource for this). An example that I’ve come across often is mRNA localization. Many people I’ve met have, umm, fairly fanciful notions of the degree to which mRNA is localized. From what we’ve seen in the lab, almost every mRNA seems to just be randomly distributed around the cytoplasm, with the exception being ER-localized ones, which are, well, localized to the ER. Ask yourself: why should there be any mRNA localization? Numbers indicate that proteins diffuse quite rapidly around the cell, on a timescale that is likely faster than mRNA transport. So for most cells, the numbers say that you shouldn’t localize mRNA–rather, just localize proteins. And, uh, that’s what we see. There are of course exceptions, like lncRNA, that show interesting localization patterns–again, this makes sense because there is no protein to localize downstream. There are other things that people say about lncRNA that don’t make sense, though. I’ll leave that as an exercise for the reader… :) (Also should point out that these considerations can actually help make the case for mRNA localization in neurons, which I think is a thing.)
Consider why nobody has seen this Amazing New Phenomenon before. Was it a lack of technology? Okay, then it might be real. Was it just brute force? Also possible that it's real. Was it just waiting for someone to think of the idea? Well, in my experience, nutty ideas are relatively cheap. So I'd be very suspicious if this result was just apparently sitting there without anyone noticing. Ask yourself: should this putative set of genes have shown up in a genetic screen? Should this protein have co-purified with this other protein? Did people already do similar experiments a long time ago and come up empty handed? What are other situations in which people may have inadvertently seen the same thing before? It’s also possible that the result is indeed true, but represents a “one-off” special case: consider this exchange about a recent paper (I have to say that I was surprised that some people in the lab didn’t even find this result surprising!). Whether you choose to pursue one-offs is I think a largely aesthetic choice.
Trust your brain, not stats. If looking at an effect makes you wonder what the p-value is, you’re already on thin ice, so tread carefully. Also, beware of new statistical methods that claim to extract results from the same data where none existed before. Usually, these will at best find only marginally interesting new examples. More often, they just find noise. If there was something really obvious, probably the original authors would have found it by manual inspection of the data. Also, if there’s a clear confounding factor that the authors claim to have somehow controlled for, be suspicious.
Beware of the "dynamic process". Sometimes, when you press someone on the details of a particular entity or process in the cell whose existence is dubious, they will respond with "Well, it's a dynamic object/process." Often (though certainly not always), this is an excuse for lazy thinking. Remember that just because something is "dynamic" doesn't mean that you should not be able to see it! Equilibrium, people.
For some crazy new proposed mechanism, ask yourself if that is how you think the cell would do it. We often hear that nothing in biology makes sense except in light of evolution. In this context, I think it's worth wondering whether the proposed mechanism would be a reasonable way for the cell to do something it was not otherwise able to do. Otherwise, maybe it’s some sort of artifact. As a (made up) example, cells have many well-established mechanisms for communicating with each other. For a new mechanism of communication to be plausible (in my book), it should offer some additional functionality beyond these existing mechanisms. Evolution can do weird stuff, though, so this line of reasoning is inherently somewhat suspect.
Check for missing obvious-next-step experiments. Sometimes you’ll find a paper describing something cool and new, and you’ll immediately wonder “Hmm, if what they’re saying is true, then shouldn’t those particles also be able to…”. Well, if you thought of it after reading a paper for 30 minutes, then presumably the authors had the same idea at some point as well. And presumably tried it. And it presumably didn’t work. (Oh, wait, sorry, I meant the results were “inconclusive”.) Or they tried to get RNA from those cells to profile. And they just didn’t get enough RNA. And so on. Keep an eye out for these, especially if multiple papers are missing these key experiments.
For methods, look for validation with known biology. The known positives should be positive and presumed negatives should be negative. Let’s say you have some new sequencing method for measuring all RNA-protein interactions (again, completely hypothetical). Have a list of known interactions that should show up and a list of ones for which there’s no plausible reason to expect an interaction. Most people think about the positives, but less often about the negatives. Think carefully about them.
Dig carefully into validation studies. I remember reading some paper in which they claimed to have detected a bunch of new molecules and then “validated” their existence. Then the validation had things like blots exposed for weeks to show signals and PCRs run for 80 cycles and stuff like that. Hmm. Often this data is buried deep in supplements. Spend the time to find it.
Be suspicious of the interpretation of biological perturbations. Cells are hard to manipulate. And so it’s perhaps unsurprising that most perturbations can lead you astray. Off-target effects for knockdown are notoriously difficult to control for. And even if you do have target specificity, another problem is that as our measurements get better, biological complexity means that virtually all hypotheses will be true at least 50% of the time. Overexpression often leads to hugely non-biological protein levels and can lead to artifacts. Cloning out single cells leads to weird variability. Frankly, playing with cells is so difficult that I’m sort of amazed we understand anything!
Know the limitations of methods. If you’re looking for differential gene expression, how much can you trust RT-PCR? Have you heard of the MIQE guidelines for RT-PCR? I hadn't, but they are extensive. For RNA-seq, how well-validated is it in your expression regime? If you’re analyzing sequence variants, how do you know it’s not sequencing error? (The since largely discredited claims of extensive RNA editing are one widely publicized example of this issue.) ChIP-seq hotspots? The list goes on. If you don’t know much about a method, ask someone who does.
Bonus: autofluorescence. Enough said.
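
To make the "run the numbers" point a bit more concrete, here's a back-of-the-envelope sketch of the kind of sanity check I mean. It uses the standard estimate that diffusing a distance L in three dimensions takes roughly L²/(6D). Fair warning: the cell size and both diffusion coefficients below are placeholder assumptions I picked for illustration, not measurements from the lab, and the "mRNA" number just stands in for some generic slow-moving particle. Look up real values (BioNumbers is good for this) before believing anything.

```python
def diffusion_time_s(distance_um, diff_coeff_um2_per_s, dims=3):
    """Characteristic time (seconds) to diffuse a given distance.

    Uses the mean-squared-displacement relation <x^2> = 2 * dims * D * t,
    solved for t. This is an order-of-magnitude estimate only.
    """
    return distance_um ** 2 / (2 * dims * diff_coeff_um2_per_s)


# All three values are ASSUMED placeholders for illustration only.
CELL_SIZE_UM = 10.0          # assumed distance across a cell, in micrometers
PROTEIN_D_UM2_PER_S = 10.0   # assumed diffusion coefficient of a cytoplasmic protein
MRNA_D_UM2_PER_S = 0.1       # assumed effective coefficient for a bulky mRNA particle

protein_t = diffusion_time_s(CELL_SIZE_UM, PROTEIN_D_UM2_PER_S)
mrna_t = diffusion_time_s(CELL_SIZE_UM, MRNA_D_UM2_PER_S)

print(f"protein crosses the cell in roughly {protein_t:.1f} s")
print(f"mRNA particle crosses the cell in roughly {mrna_t:.0f} s")
print(f"ratio (mRNA / protein): about {mrna_t / protein_t:.0f}x")
```

With these made-up inputs, the protein gets across the cell about a hundred times faster than the mRNA particle does. The exact numbers don't matter much. The point is that a five-line estimate like this can tell you whether a proposed mechanism is even in the right ballpark.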

I offer these more as a set of guidelines for how I like to think about new results, and I’m sure we can all think of several counterexamples to virtually every one of these. My point is that high-level thinking in molecular biology requires making decisions, and making a real decision means leaving something else on the table. Making decisions based on the literature means deciding what avenues not to follow up on, and I think that most good molecular biologists learn this early on. Even more importantly, they develop the social networks to get the insider’s perspective on what to trust and what to ignore. As a beginning trainee, though, you typically will have neither the experience nor the network to make these decisions. My advice would be to pick a PI who asks these same sorts of questions. Then keep asking yourself these questions during your training. Seek out critical people and bounce your ideas off of them. At the same time, don’t become one of those people who just rips every paper to shreds in journal club. The point is to learn to exhibit sound judgement and find a way forward, and that also means sifting out the good nuggets and threading them together across multiple papers.

As a related point, as I mentioned earlier, there’s a lot of fuss out there about the “reproducibility crisis”. There are two possible models: one in which we require every paper to be “true”, and the more laissez-faire model I am advocating for in which we just assume a lot of papers are wrong and train people to know the difference. Annoying reviewers often say that extraordinary claims require extraordinary evidence. I think this is wrong, and I’m not alone (hat tip: Dynamic Ecology). I think that in any one paper, you do your best, and time will tell whether you were right or wrong. It’s just not practical or efficient for a paper to solve every potential problem, with every result cross-validated with every method. Science is bigger than any one paper, and I think it’s worth providing our trainees with a better appreciation of that fact.

Update, 8/25/2015:
A couple of nice suggestions from various folks. One from anonymous suggests "#12: Whenever the word 'modulating' appears anywhere in title/abstract." Well said!

Another point from Casey Bergman:

Lemaitre: be skeptical of results where there is a big paper with no follow-up #wtdros14
— Casey Bergman (@caseybergman) August 2, 2014

Wednesday, August 19, 2015

A fun day in the lab

Yesterday was a really great day in the lab! Olivia, who graduated in May, came back from her trip to dance camp to have lunch with us... and showed off her new engagement ring! Awesome! And with one of the best engagement speeches ever from her fiancé Derek (as seen on Facebook). Olivia's answer to the big question? "Okay, fine."

Also, it was Maggie's birthday, and Ally made this fantastic cake made of fruit and yogurt:

Happy birthday, Maggie!

Then we went to Taco Tuesday at Wahoo's and enjoyed a lovely summer evening outside with beer, tacos and cornhole.

Sigh, now back to writing a grant...

Tuesday, August 11, 2015

The impact-factor introduction

Last week, I went to the Penn MSTP retreat (for MD/PhD and VMD/PhD students), which was really cool. It truly is The Best MSTP Program in the Galaxy™, with tons of very talented students, including, I'm proud to say, four in our lab! There was lots of interesting and inspiring science in talks and posters throughout the day, and I also got to meet with a couple of cool incoming students, which is always a pleasure.

One thing I noticed several times, however, was the pernicious habit of mentioning what journals folks in the program were publishing in or somehow associated with, emphasizing, of course, the fancy ones like Nature, etc. I noticed this in particular in the introduction of the keynote speaker, Chris Vakoc (Penn alum from Gerd Blobel's lab), because the introduction only mentioned where his work was published and didn't say anything about what science he actually did! I feel it bears mentioning that Chris gave a magnificent talk about his work on chromatin and cancer, including finding an inhibitor that actually seems to have cured a patient of leukemia. That's real impact.

I've seen these "impact-factor introductions" outside of the MSTP retreat a few times as well, and it really rubs me the wrong way. Frankly, being praised for the journals you've published in is just about the worst praise one could hope for. In a way, it's like saying "I don't even care enough to learn about what you do, but it seems like some other people think it's good". Remember, "where" we publish is just something we invent to separate out the mostly uninteresting science from the perhaps-marginally-less-likely-to-be-uninteresting-but-still-mostly-uninteresting science. If you actually are lucky enough to do something really important, it won't really matter where it's published.

What was even more worrisome was that the introduction for the speaker came from a (very well-intentioned) trainee. I absolutely do not want to single out this trainee, and I am certain the trainee knows about Chris's work and holds it in high regard. Rather, I think the whole thing highlights a culture we have fostered in which trainees have come to value perceived "impact" more than science itself. As another example, I remember bumping into a (non-MSTP) student recently and mentioning that we had recently published a paper, and rather than first asking what it was about, they only asked about where it was published! I think that's frightening, and shows that our trainees are picking up the worst form of scientific careerism from us. Not that I'm some sort of saint, either. I found it surprising to read BioRxiv recently and feel a bit disoriented without a journal name on the paper to help me know whether a paper was worth reading. Hmm. I'm clearly still in recovery.

Now, I'm not an idealist, nor particularly brave. I still want to publish papers in glossy journals for all the same reasons everyone else does, mostly because it will help ensure someone actually reads our work, and because (whether I like it or not) it's important for trainees and also for keeping the lab running. I also personally think that this journal hierarchy system has arisen for reasons that are not easy to fix, some of which are obvious and some less so. More ideas on that hopefully soon. But in the meantime, can we all at least agree not to introduce speakers by where they publish?

Incidentally, the best introduction I've ever gotten was when I gave a talk relatively recently and the introducer said something like "... and so I'm excited to hear Dr. Raj talk about his offbeat brand of science." Now that's an introduction I can live up to!