Friday, July 31, 2020

Alternative hypotheses and the Gautham Transform

As I have mentioned several times, having Gautham in the lab really changed how I think about science. In particular, I learned a lot about how to take a more critical approach to science. I think this has made me a far better and more rigorous scientist, and I want to impart those lessons to all members of the lab.

The most important thing I learned from Gautham was to consider alternative hypotheses. I know this sounds like duh, that’s what I learn in my RCR meetings, “expected outcomes and potential pitfalls” sections of grants, and boring classes on how to do science, but I think that’s because we so rarely see how powerful it is in practice. I think it was one of Gautham’s favorite pastimes, and really exemplified his scientific aesthetic (indeed, he was very well known for demonstrating some alternative hypotheses for carrier multiplication, I believe). There were many, many times Gautham proposed alternative hypotheses in our lab, and it was always illuminating. Indeed, one of the main points of his second paper from the lab was about how one could explain “fluctuations between states” by simple population dynamics without any state switching—a whole paper’s worth of alternative hypothesis!

Why do we generally fail to consider alternative hypotheses? One reason is that it’s scary and not fun. Generally, the hypothesis you want to believe is the fun one. It is scary to contemplate the idea that something fun might turn out to be something boring. (Gautham and I used to joke that the “Gautham Transform” was taking something seemingly interesting and showing that it was actually boring.) The truth of it, though, is that most things are boring. Sure, in biology, there are a lot more surprises than in, say, physics, but there are still far fewer interesting things than are generally claimed. I think that we would all do better to come in with a stronger prior belief that most findings actually have a boring explanation, and a concrete way to act on that belief is to propose alternative hypotheses. Keep in mind also that when we are trained, we are typically presented with a list of facts with no alternatives. This manner of pedagogy leaves most of us with very little appreciation for all the wrong turns that comprise science as it’s being made, as opposed to the tidy little diagrams in the textbooks.

The other reason we fail to consider alternatives is that it’s a lot of work. It is hard to spend as much time actively trying to show that your pet theory is wrong as you spend building it up, and in my experience, coming up with plausible alternative hypotheses usually takes more work than coming up with the original idea. Usually, this difficulty manifests as a proclamation of “there’s just no other way it could be!” Thing is… there’s ALWAYS an alternative hypothesis. All models are wrong. You may get to a point where you just get tired, or the alternatives seem too outlandish, but there’s always another alternative to exclude. I remember as we were wrapping up our transcriptional-scaling-with-cell-size manuscript, we got this cool result suggesting that transcription was cut in half upon DNA replication (a decrease in burst frequency). I was really into this idea, and Gautham was like, that’s really weird, there must be some other explanation. I was like, I can’t think of one, and I remember him saying “Well, it’s hard, but there has to be something, what you’re proposing is really weird”. So… I spent a couple of days thinking about it, and then, voila, an alternative! (The alternative was a global decrease in transcription in S-phase, which Olivia eliminated with a clever experiment measuring transcription from a late-replicating gene.) Point is, it’s hard but necessary work.
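(To make the logic of that example concrete, here’s a toy simulation. This is just my own illustration with made-up rates and a simple Poisson model of burst counts, not the actual analysis from the paper. The point is that for an early-replicating gene, “burst frequency per copy halves upon replication” and “global decrease in transcription in S-phase” predict the same total transcription, but for a late-replicating gene that hasn’t yet replicated, they predict different things, which is why measuring a late-replicating gene is discriminating.)

```python
# Toy sketch (illustration only, made-up numbers; not the analysis from the paper).
# Two hypotheses that look identical for an early-replicating gene in S-phase
# but make different predictions for a late-replicating (not-yet-replicated) gene.
import numpy as np

rng = np.random.default_rng(0)
n_cells = 10000
base_rate = 10.0  # arbitrary expected bursts per gene copy per time window

def mean_bursts(copies, per_copy_rate, global_factor):
    """Mean burst count per cell under a simple Poisson toy model."""
    return rng.poisson(copies * per_copy_rate * global_factor, n_cells).mean()

# Pre-replication (G1) baseline: 1 copy, full rate
g1 = mean_bursts(copies=1, per_copy_rate=base_rate, global_factor=1.0)

# Early-replicating gene in S-phase (already 2 copies):
hypA_early = mean_bursts(copies=2, per_copy_rate=base_rate / 2, global_factor=1.0)  # per-copy halving
hypB_early = mean_bursts(copies=2, per_copy_rate=base_rate, global_factor=0.5)      # global S-phase decrease
print("G1 baseline:", g1)
print("Early-replicating gene in S-phase:", hypA_early, hypB_early)  # ~same under both hypotheses

# Late-replicating gene in S-phase (still 1 copy):
hypA_late = mean_bursts(copies=1, per_copy_rate=base_rate, global_factor=1.0)  # unaffected until it replicates
hypB_late = mean_bursts(copies=1, per_copy_rate=base_rate, global_factor=0.5)  # global decrease hits it anyway
print("Late-replicating gene in S-phase:", hypA_late, hypB_late)  # ~2-fold apart, so this discriminates
```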

(Note: I’m wondering about ways to actively encourage people to consider alternatives on a more regular basis. One suggestion was to stop, say, group meeting somewhere in the middle and just explicitly ask everyone to think of alternatives for a few minutes, then check in. Another option (HT Ben Emert) is to have a lab buddy whose job is to work with you to challenge hypotheses. Anybody have other thoughts?)

So when do you stop making alternatives? I think that’s largely a matter of taste. At some point, you have to stand by a model you propose, exclude as many plausible alternatives as you can, and then acknowledge that there are other possible explanations for what you see that you just didn’t think of. Progress continues, excluding one alternative at a time…

“Hipster” overlay journals

Been thinking a lot about overlay journals and their implications these days. For those who don’t know, an overlay journal is sort of like a “meta-journal” in that it doesn’t formally publish its own papers. Rather, it provides links to other preprints/papers that it thinks are interesting. On some level, the idea is that the true value of a journal is to serve as a filter for what someone thinks is science worth reading so that you don’t have to read every single paper. An overlay journal provides that filter function without the need for the rest of the (costly) trappings of a journal, like peer review and, uhh, color figures ;). 

There is one very interesting aspect of an overlay journal that I don’t think has been discussed very much: in contrast with regular journals, they are fundamentally non-exclusive, meaning that ANY overlay journal can in principle “publish” ANY paper. What this non-exclusivity means is that there is no jockeying between journals to publish the “obviously important” papers, which have a perhaps slightly elevated chance of actually being important. You know, like “we sequenced 10x more single cells than the last paper in a fancy journal” kind of papers. If you run an overlay journal, you never have to gaze longingly at those “high impact” papers—if you want to publish it, just add it to your overlay!

What are the consequences of non-exclusivity? Primarily, I think it would serve to diminish the value of “obviously important” papers. Everyone can identify them based on authors and number of genomes sequenced or whatever, so there’s really not that much value in including them per se. It would be like saying “Here’s my playlist, it’s like a copy of the Billboard Top 40”. Nobody’s going to look to your overlay journal for that kind of stuff (which you can readily get from CNS or Twitter). Rather, the real value would be in making lists of papers that are awesome but might otherwise be overlooked—essentially a hipster playlist. As an editor, your cachet would be in your ability to identify these new, cool papers and make Michael Cera-esque mixtapes out of them. You can leave the Hot 100 to Casey Kasem/Spotify algorithms.

Measuring the importance of an overlay journal would also be interesting. Clearly, impact factor is not a useful metric, since anybody can make their impact factor as high as they want by including highly cited papers. I would guess that the number of followers of the journal would be a far more sensible metric anyway.

Another interesting aspect of an overlay journal is that it can be retrospective. You could include old papers as well, highlighting old gems that may have been forgotten.

Of course, an interesting question is whether there is any difference between an overlay journal and someone’s Twitter feed. Not sure, actually…

Also, thoughts on existing journals that have hipster qualities to them? I vote Current Biology, my lab votes eLife.

Friday, July 17, 2020

My favorite "high yield" guides to telling better stories

Guest post by Eric Sanford


In medical school, we usually have five lectures’ worth of new material to memorize each day. Since we can’t simply remember it all, we are always seeking “high yield” resources (a term used so often by med students that it quickly becomes a joke): those concise one- or two-pagers that somehow contain 95 percent of what we need to know for our exams. My quest to find the highest yield resources has continued in full force since becoming a PhD student.


A major goal of mine has been to improve my scientific communication skills (you know, writing, public speaking, figure-making… i.e., those extremely important skills that most of us scientists are pretty bad at), and I’ve come across a few very high yield resources as I’ve worked on this. Here are my favorites so far:


Resonate, by Nancy Duarte:

  • The best talks are inspiring, but “be more inspiring” is not easy advice to follow.

  • This book teaches you how to turn your content into a story that inspires an audience.

  • I received extremely positive feedback and a lot of audience questions the first time I gave a talk where I tried to follow the suggestions of this book.

  • This was both the most fun and the most useful of all my recommendations.


The Visual Display of Quantitative Information, by Edward Tufte:

  • Tufte is probably the most famous “data visualization” guru, and I think this book, his first book, is his best one. (I’ve flipped through the sequels and would also recommend the chapter on color from “Envisioning Information.”)

  • This book provides a useful framework for designing graphics that convey information in ways that are easy (easier?) for readers to understand. Some pointers include removing clutter, repeating designs in “small multiples”, labeling important elements directly, and using space consistently when composing multiple elements in the same figure.
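
To give a feel for what a couple of those pointers look like in practice, here is a minimal matplotlib sketch (the data and gene names are made up, and the styling is just one possible take, not anything prescribed by the book): small multiples that share axes, data labeled directly instead of with a legend, and spines/ticks pared down.

```python
# Toy example of a few Tufte-style pointers: small multiples, direct labels,
# and reduced chart junk. Data and gene names are made up for illustration.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
genes = ["geneA", "geneB", "geneC", "geneD"]  # hypothetical names
x = np.linspace(0, 10, 50)

fig, axes = plt.subplots(1, 4, figsize=(10, 2.5), sharex=True, sharey=True)
for ax, gene in zip(axes, genes):
    y = np.exp(-x / rng.uniform(2, 6)) + rng.normal(0, 0.03, x.size)
    ax.plot(x, y, color="black", linewidth=1)
    # Label the line directly rather than using a legend
    ax.text(x[-1], y[-1] + 0.05, gene, ha="right", fontsize=9)
    # Strip clutter: drop top/right spines, keep ticks sparse
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    ax.set_xticks([0, 5, 10])
    ax.set_xlabel("time (hr)")
axes[0].set_ylabel("signal (a.u.)")
fig.tight_layout()
plt.show()
```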


The Elements of Style, by Strunk and White, pages 18-25:


Words to Avoid When Writing, by Arjun Raj


Raj Lab basic Adobe Illustrator (CC) guide, by Connie Jiang


There are many other great resources out there that are also worth going through if you have the time (Style: Lessons in Clarity and Grace by Bizup and Williams is another excellent writing guide), but for me the ones above had the highest amount-learned-per-minute-of-concentration-invested.


