Friday, March 20, 2015

Priority in science is mostly a mirage

It seems these days that there are a lot of CRISPR priority fights out there, with the biggest being the patent dispute between Doudna and Zhang. Got me thinking about why we place so much emphasis on priority.

As scientists, we have a deeply ingrained goal: to be first. First to think of an idea, first to work it out, first to publish it. And in our culture, whoever does it first typically gets all the credit, sometimes even if they just “win” by a couple of months (although sometimes other factors come into play). It is what separates those perceived as the shiniest stars from the rest of us.

But think about it. If you’re working on something and get there just a few months before someone else, are you really that much shinier than the next person? If the goal is to win a footrace, sure, you win. But if the goal is to create knowledge, the world would be essentially unchanged if you had never existed. Sobering. And true for the vast majority of us.

I think a pretty rational definition of a truly original advance is one that, had the researcher not existed, would likely have taken a very long time for anyone else to come up with. Almost by definition, such an advance is much more likely to come from one person. Like maybe RNAi. Or PCR. Although who knows? Maybe someone would have figured out these same things a few years later.

I’m consistently surprised at how often you think you’re working on something completely alone only to find someone else hot on your heels. I think there are two reasons for this. One is that as technology and knowledge develop, the time becomes ripe for certain discoveries. Once sequencing was around, for example, the discovery of several new long non-coding RNAs was essentially an inevitability. The other reason is that there are just so many smart people out there these days. With so many scientists, it’s virtually impossible that nobody else is thinking about the same things you are. Even math, which historically has worshipped at the altar of the solitary genius, often has multiple names attached to new theorems. Which makes the exceptions all the more remarkable, like Yitang Zhang’s amazing theorem on bounded gaps between primes, or the proof that PRIMES is in P by a small group in India working largely independently. Kudos to them!

This is not to say that “meat and potatoes” science is not important. In fact, I think the steady, cumulative effects of incremental advances of the entire scientific community mostly outweigh the contributions of those few geniuses, especially in biomedical sciences in the current era. Somehow, I find this very reassuring and in many ways freeing: if you realize that we parcel out winners and losers in this race based on essentially arbitrary factors–and probably our innate desire to create heroic narratives–then it’s okay to just continue doing what you’re doing and not worry about it. In the long run, nobody really wins or loses, and science will continue moving, regardless.

So what is the strategy if you want to do something really original? Well, if the goal is to make a discovery that would take a long time to happen if you did not exist, then you can either do something really original or something that nobody cares about. Often one and the same!

Friday, January 23, 2015

Some thoughts on Tomasetti and Vogelstein (and post-publication review)

Interesting paper from Tomasetti and Vogelstein entitled “Variation in cancer risk among tissues can be explained by the number of stem cell divisions” (screw the paywall). This paper has generated a lot of controversy on Twitter and blogs, which is in many ways a preview of what a post-publication review environment might look like. I worry that it’s been largely negative, so here are my (admittedly relatively uninformed) thoughts.

Here is the abstract:
Some tissue types give rise to human cancers millions of times more often than other tissue types. Although this has been recognized for more than a century, it has never been explained. Here, we show that the lifetime risk of cancers of many different types is strongly correlated (0.81) with the total number of divisions of the normal self-renewing cells maintaining that tissue’s homeostasis. These results suggest that only a third of the variation in cancer risk among tissues is attributable to environmental factors or inherited predispositions. The majority is due to “bad luck,” that is, random mutations arising during DNA replication in normal, noncancerous stem cells. This is important not only for understanding the disease but also for designing strategies to limit the mortality it causes.
Basically, the idea is that part of the reason some tissues are more prone to cancer than others is that they have a lot of stem cell divisions–an idea supported by the data they present. I think this is a really important point! In particular, it establishes what I consider an important null: in thinking about cancer incidence, it seems reasonable to expect that more proliferative tissues will be more prone to cancer simply because of the increased number of cell divisions. Darryl Shibata (USC) has a series of really nice papers on this point, focusing on colorectal cancer. In particular, in this paper, he points out that such models would predict that taller (i.e., bigger) people would have more stem cells and thus should have a higher incidence of cancer. And that’s actually what they find! I saw Shibata give an (excellent) talk on this at a Physics of Cancer workshop, and afterwards, a cancer biologist criticized this height result, saying incredulously, “Well, but there are so many other factors associated with being tall!” Fair enough. But I think Darryl’s is an economical model that explains the data, and it is what I would consider an important null against which deviations should be measured. I think this is a nice point that Tomasetti and Vogelstein make as well.

What are the consequences of such a null? Tomasetti and Vogelstein frame their discussion around stochastic, environmental and genetic influences on cancer incidence between tissues. Emphasis on between tissues. What exactly does this mean? Well, what they are saying is that if you compare lung cancer rates in smokers vs. non-smokers (environmental effect), then the rate of getting cancer is around 10-20 times higher, but your chances of getting lung cancer even as a non-smoker are still much higher than your chances of getting, say, head osteosarcoma, and a plausible reason for this is that there are way more stem cell divisions in lung than in the bones in your head. Similarly, colorectal cancer incidence rates are much higher in people with a genetic predisposition (APC mutation), but again, even without the genetic predisposition, the rate is still many orders of magnitude higher than in other tissues with much lower rates of stem cell divisions. I think this is pretty interesting! Of course, as with Shibata’s height association, the association with stem cell divisions is not proof that the stem cell divisions are per se the cause of this association, but one of the nice things about Shibata’s work is that he shows that a model of stem cell divisions and number of genetic “hits” required for a particular cancer can match the actual cancer incidence data. So I think this is a plausible null model for a baseline of how much certain tissues will get cancer. Incidentally, this made me realize a perhaps obvious point about the genetic determinants of cancer: if you find an association of a gene with cancer incidence, it may be that the association arises because the gene is associated with, e.g., height, in which case, yes, there is technically a genetic underpinning for that variation, but it is hard to imagine designing any sort of drug based on this finding. Tomasetti and Vogelstein make this point in their paper.
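To make this null concrete, here’s a toy sketch (mine, not from either paper) of the kind of divisions-and-hits model Shibata describes: each stem cell lineage picks up driver mutations at some rate per division, and cancer requires some number of hits. The mutation rate, hit count, and cell numbers below are illustrative assumptions only.

```python
import math

def p_transformed(divisions, mu, hits):
    """P(a single stem cell lineage accumulates >= `hits` driver
    mutations over `divisions` divisions, with mutation probability
    `mu` per division), via the binomial tail."""
    p_below = sum(math.comb(divisions, k) * mu**k * (1 - mu)**(divisions - k)
                  for k in range(hits))
    return 1 - p_below

def tissue_lifetime_risk(n_stem, divisions, mu=1e-6, hits=3):
    """Risk that at least one of `n_stem` independent lineages is
    transformed -- the 'bad luck' null, with no environmental or
    genetic effects at all."""
    return 1 - (1 - p_transformed(divisions, mu, hits))**n_stem

# Under this null, a tissue whose stem cells divide 10x more often has
# a far higher baseline risk, everything else held equal:
low = tissue_lifetime_risk(n_stem=1_000_000, divisions=500)
high = tissue_lifetime_risk(n_stem=1_000_000, divisions=5000)
```

Note that per lineage the risk scales roughly like (mu × divisions)^hits, which is one intuition for why total lifetime divisions ends up being such a strong predictor across tissues in a model like this.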

The authors then go on to further analyze their data and separate cancers into ones in which the variance in incidence is dominated by “stochastic” effects vs. “deterministic” effects. I can’t say I’ve gone into the details of this analysis, but it seems interesting–and a natural question to ask with these data. Here are a few thoughts on the ideas this analysis explores. One question that has come up a lot is why this correlation is not stronger, especially on a linear scale. I think that one issue is that the division into stochastic, environmental and genetic is missing a big component, which is the tissue, cell and molecular biology of cancer. Some tissues may require more genetic “hits” than others, or a long series of epigenetic effects, or have structures that enable rapid removal of defective stem cells, and so even tissues with the same number of divisions, in the absence of any genetic or environmental factors, will have different rates of cancer. Another issue is that these data are imperfect, and so you will get some spread no matter what. Still, I think the association is real and interesting.

Anyway, I think this “null model” is pretty cool. I wonder if one of the reasons that we focus so much on environmental and genetic effects is that we can do “experiments” on them, whereas the causal links in the stem cell division hypothesis are hard to prove.

There was a very interesting critique from Yaniv Erlich that said that the authors’ analysis implicitly assumes that there is no interaction between the number of stem cell divisions and genetic and environmental factors. A good point, although I do think that Tomasetti and Vogelstein have thought about this–as I mentioned, they say explicitly:
The total number of stem cells in an organ and their proliferation rate may of course be influenced by genetic and environmental factors such as those that affect height or weight.
Their example about the mouse vs. human incidence of colon vs. small intestine cancer in the case of the APC mutation is, I think, a nice piece of evidence suggesting that the number of divisions is a very important factor in determining cancer incidence. Although again, there are many alternative explanations here.

I think some of the confusion out there about this paper can be summed up as follows:
“You are a smoker and I am not, so I have a lower rate of getting lung cancer.”
“Yeah, but you still have a much higher rate of getting lung cancer than bone cancer.”
“Uhh… okay… sure… don’t think I’m gonna take up smoking anytime soon, though.”
It’s just a weird comparison to make. That said, I don’t think the authors really make this comparison anywhere in their manuscript. What I think they are saying at the end is that for cancers with strong environmental determinants, lifestyle changes and other such interventions could be useful (like quitting smoking), whereas for other cancers that arise more randomly, we should just focus on detection. Although perhaps I’m missing something, this seems like a point one could make even without this analysis.

There has been a lot of discussion out there about how weak the correlation is and whether it’s appropriate to use log-log or linear scales and so forth. I think the basic point they are trying to make is that more highly proliferative tissues are more prone to cancer. I think the data they present are consistent with this conclusion. Whether the specific amount of variance they quote in the abstract is right or not is an important technical matter that I think other people are already talking about a lot, but I think the fundamental conclusion is sound.
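As a toy illustration of why the choice of scale matters (this is my own synthetic example, not their data): when both variables span many orders of magnitude with multiplicative noise, the Pearson correlation on log-transformed values answers a different question than the one on a linear scale, which is dominated by the few largest points.

```python
import math
import random

random.seed(0)

# Synthetic "tissues": lifetime stem cell divisions spanning ~6 orders
# of magnitude, with risk proportional to divisions times lognormal
# noise. Purely illustrative numbers.
divisions = [10 ** random.uniform(5, 11) for _ in range(31)]
risk = [d * 1e-12 * 10 ** random.gauss(0, 0.5) for d in divisions]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r_linear = pearson(divisions, risk)
r_log = pearson([math.log10(d) for d in divisions],
                [math.log10(r) for r in risk])
# r_log captures the fold-change relationship across all tissues;
# r_linear is driven almost entirely by the highest-division tissues.
```

Neither number is “the” correlation; they are different summaries of the same data, which is why quoting a coefficient without specifying the scale invites exactly this kind of argument.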

A note about the reaction to this paper: in principle, I like the concept of moving from pre-publication anonymous peer review to a post-publication peer review world. I think that pre-publication anonymous peer review is slow, arbitrary, and (most importantly) demoralizing, especially for trainees. That said, now that I’ve seen a bit of post-publication peer review happen online, I think the sad thing I must report is that in many cases, the culture seems to be one of the hardcore takedown, often in a rather accusatorial tone. And I thought it was hard to get a positive review from a journal! Here are some nice thoughts from Kamoun, who recently responded (admirably) to an issue raised on Pubpeer.

My view is that in any paper with real-world data, there will be points that are solid and points that are weak. In post-publication peer review, we run the risk of reducing a paper to a negative soundbite that propagates very fast, and thus throwing out the baby with the bathwater, not to mention putting the author (often a trainee) under very intense public scrutiny that they might not be equipped to handle. I think we should be very careful in how we approach post-publication review because of its viral nature online. Anyway, those are my two cents.

PS: Apropos of discussions of log-log correlations vs. linear correlations, we have a fairly extensive comparison of RNA-seq data to RNA FISH data. More very soon.

Friday, January 16, 2015

Gordon Conference turns graduate student into crazy reptile lady

Just got back from a cool Gordon conference on Stochastic Physics in Biology with a couple students in the lab. Lots of interesting science, and lots of cool people to talk with as well!

The food was overall really good, but one day, we decided to go get some Mexican food from a local taco shack. Delicious! On the way back, we noticed a little store on the side of the road called "Exotic Emporium". When we went inside, what did we find but a reptile pet store. Olivia fell in love with those little critters, and here's the evidence:

"Hello, strange lizard":


"That is a large snake!"


"I think I like snakes."


"Okay, put the snake around your neck then." "Umm, okay..."


"The colors! The colors!"



"Can I keep it?"






Saturday, December 27, 2014

Three observations about anonymity in peer review

I made a vow to myself to not blog about peer review ever again. Oh well. Anyway, I have been thinking about a few things related to anonymity in the review process that I don’t think I’ve heard discussed elsewhere:
  1. Everyone I talk to who has published in eLife has raved about the experience. Like, literally everyone–in fact, they have all said it was one of their best publication experiences, with a swift, fair, and responsive review process. I was wondering what it was in particular that made the review process so much less painful. Then somebody told me something that made a ton of sense (I forget who, but thanks, Dr. Insight, wherever you are!). The referees confer to reach a joint verdict on the paper. In theory, this is to build a scientific consensus to harmonize the feedback. In practice, Dr. Insight pointed out that the main benefit is that it’s a lot harder to give those crazy jackass reviews we all get because you will be discussing it with your fellow reviewers, who are presumably peers in some way or another. You don’t want to look like a complete tool or someone with an axe to grind in front of your peers. And so I think this process yields many of the benefits of non-anonymous peer review while still being anonymous (to the author). Well played, eLife!
  2. One reimagining of the publishing system that I definitely favor is one in which every paper gets published in a journal that only publishes based on technical soundness, like PLOS ONE. Then the function of the “selective journal” is just to publish a “Best of…” list of the papers they like the best. I think that a lot of people like this idea, one which decouples assessments of whether the paper is technically correct from assessments of “impact”. In theory, sounds good. One issue, though, is that it ignores the hierarchy on the reviewer side of the fence. Editors definitely do not just randomly select reviewers, nor select them just based on field-specific knowledge. And not every journal gets the same group of reviewers–you better believe that people who are too busy to review for Annals of the Romanian Plant Society B will somehow magically find time in their schedule to review for Science. Perhaps what might happen is that this new version of “Editor” (i.e., literature curator) might commission further post-publication reviews from a trusted critic before putting a paper on their list. Anyway, it’s something to work out.
  3. I recently started signing all my reviews (not sure if they ever made it to the authors, but I can at least say I tried). I think this makes sense for a number of reasons, most of which have been covered elsewhere. As I had noted here, though, there is “Another important factor that gets discussed less often, which is that in the current system, editors have more information than you as an author do. Sometimes you’ll get 2/3 good reviews and it’s fine. Sometimes not. Whether the editor is willing to override the reviewer can often depend on relative stature more than the content of the review–after all, the editor is playing the game as well, and probably doesn’t want to override Prof. PowerPlayer who gave the negative review. This definitely happens. The editor can have an agenda behind who they send reviews to and who they listen to. So no matter how much blinding is possible (even double blind doesn’t really seem plausible), as long as we have editors choosing reviewers and deciding who to listen to, there will be information asymmetry. Far better, in my mind, to have reviewer identities open–puts a bit of the spotlight on editors, also.” Another interesting point: as you work your way down the ladder, if you get a signed negative review, you will know who to exclude next time around. Not sure of all the implications of that.
Anyway, that’s it–hopefully will never blog about peer review again until we are all downloading PDFs from BioRxiv directly to our Google self-driving cars.

Friday, December 26, 2014

Posting comments on papers

For many years, people have wondered why most online forums rack up hundreds of comments, while even the most exciting scientific results are met with the sound of crickets chirping. Lots of theories as to why: fear of scientific reprisal, fear of saying something stupid, lack of anonymity.

Perhaps. But I wonder if part of it is just that it feels… incongruous to post comments on scientific papers. To date, I have posted exactly two comments on papers. My first owed its genesis (I think) to the fact that I had just read something about how nobody comments on papers, and so I was determined to post a comment on something. And it was a nice paper on something I found interesting and so I wanted to say something. I just now wrote my second comment. It was on this AWESOME paper (hat tip to Sri Kosuri) comparing efficiency of document preparation using Word vs. LaTeX (verdict: LaTeX loses, little surprise to me). Definitely something I found interesting, and so I somehow felt the urge to comment.

And then, as I started writing my comment, something just felt… wrong. Firstly, the process was annoying. I had to log in to my PLOS account, which I of course forgot all the details of. Then, as I was leaving my comment, I noticed a radio button at the bottom to say whether I had a competing interest. The whole process was starting to feel a whole lot more official than I had anticipated. Suddenly, the relatively breezy and light-hearted nature of my comment felt very out of place. It’s just very hard to escape the feeling that any commentary on a scientific paper must be couched in the stultifying language and framework of the typical peer review, which is just so different from the far more informal commentary you get on, for instance, blog posts. And heaven forbid you actually post a joke or something like that.

I feel like part of the reason nobody comments is that publishing a paper seems like a Very Serious Business™, and so any writing or commentary associated with it seems like it should be just as serious. Well, I agree that publishing a paper is a very tedious business, but I think making scientific discourse a bit more lighthearted would be a good thing overall. And who knows, one side-effect could be that maybe someone might actually read the paper for a change!

Tuesday, December 23, 2014

Fortune cookies and peer review

Ever play that game where you take the fortune from a fortune cookie and then add “in bed” to the end of it for a funny reinterpretation? I’ve found it works pretty well if you just replace “in bed” with “in peer review”. Behold (from some recent fortune cookies I got):

Look for the dream that keeps coming back. It is your destiny in peer review.

Wiseness makes for oneself an island which no flood can overwhelm in peer review.

Ignorance never settles a question in peer review.

In the near future, you will discover how fortunate you are in peer review.

Every adversity carries with it the seed of an equal or greater benefit in peer review.

You will find luck when you go home in peer review.

Also reminds me of the weirdest fortune I ever got: “Alas! The onion you are eating is someone else’s water lily.” Not sure exactly what that means, in peer review or otherwise…