Friday, July 31, 2020

Alternative hypotheses and the Gautham Transform

As I have mentioned several times, having Gautham in the lab really changed how I think about science. In particular, I learned a lot about how to take a more critical approach to science. I think this has made me a far better and more rigorous scientist, and I want to impart those lessons to all members of the lab.

The most important thing I learned from Gautham was to consider alternative hypotheses. I know this sounds like duh, that’s what I learn in my RCR meetings, “expected outcomes and potential pitfalls” sections of grants, and boring classes on how to do science, but I think that’s because we so rarely see how powerful it is in practice. I think it was one of Gautham’s favorite pastimes, and really exemplified his scientific aesthetic (indeed, he was very well known for demonstrating some alternative hypotheses for carrier multiplication, I believe). There were many, many times Gautham proposed alternative hypotheses in our lab, and it was always illuminating. Indeed, one of the main points of his second paper from the lab was about how one could explain “fluctuations between states” by simple population dynamics without any state switching—a whole paper’s worth of alternative hypothesis!

Why do we generally fail to consider alternative hypotheses? One reason is that it’s scary and not fun. Generally, the hypothesis you want to consider is the option that is the fun one. It is scary to contemplate the idea that something fun might turn out to be something boring. (Gautham and I used to joke that the “Gautham Transform” was taking something seemingly interesting and showing that it was actually boring.) The truth of it, though, is that most things are boring. Sure, in biology, there are a lot more surprises than in, say, physics, but there are still far fewer interesting things than are generally claimed. I think that we would all do better to come in with a stronger prior belief that most findings actually have a boring explanation, and a critical implementation of that belief is to propose alternative hypotheses. Keep in mind also that when we are trained, we typically are presented with a list of facts with no alternatives. This manner of pedagogy leaves most of us with very little appreciation for all the wrong turns that comprise science as it’s being made as opposed to the little diagrams in the textbooks.

The other reason we fail to consider alternatives is that it’s a lot of work. It’s always going to be harder to spend time actively thinking of ways to show that your pet theory is incorrect than to think of ways to support it, and so in my experience it’s usually more work to come up with plausible alternative hypotheses. Usually, this difficulty manifests as a proclamation of “there’s just no other way it could be!” Thing is… there’s ALWAYS an alternative hypothesis. All models are wrong. You may get to a point where you just get tired, or the alternatives seem too outlandish, but there’s always another alternative to exclude. I remember as we were wrapping up our transcriptional-scaling-with-cell-size manuscript, we got this cool result suggesting that transcription was cut in half upon DNA replication (decrease in burst frequency). I was really into this idea, and Gautham was like, that’s really weird, there must be some other explanation. I was like, I can’t think of one, and I remember him saying “Well, it’s hard, but there has to be something, what you’re proposing is really weird”. So… I spent a couple days thinking about it, and then, voila, an alternative! (The alternative was a global decrease in transcription in S-phase, which Olivia eliminated with a clever experiment measuring transcription from a late-replicating gene.) Point is, it’s hard but necessary work.

(Note: I’m wondering about ways to actively encourage people to consider alternatives on a more regular basis. One suggestion was to stop, say, group meeting somewhere in the middle and just explicitly ask everyone to think of alternatives for a few minutes, then check in. Another option (HT Ben Emert) is to have a lab buddy whose job is to work with you to challenge hypotheses. Anybody have other thoughts?)

So when do you stop making alternatives? I think that’s largely a matter of taste. At some point, you have to stand by a model you propose, exclude as many plausible alternatives as you can, and then acknowledge that there are other possible explanations for what you see that you just didn’t think of. Progress continues, excluding one alternative at a time…

“Hipster” overlay journals

Been thinking a lot about overlay journals and their implications these days. For those who don’t know, an overlay journal is sort of like a “meta-journal” in that it doesn’t formally publish its own papers. Rather, it provides links to other preprints/papers that it thinks are interesting. On some level, the idea is that the true value of a journal is to serve as a filter for what someone thinks is science worth reading so that you don’t have to read every single paper. An overlay journal provides that filter function without the need for the rest of the (costly) trappings of a journal, like peer review and, uhh, color figures ;). 

There is one very interesting aspect of an overlay journal that I don’t think has been discussed very much: in contrast with regular journals, they are fundamentally non-exclusive, meaning that ANY overlay journal can in principle “publish” ANY paper. What this non-exclusivity means is that there is no jockeying between journals to publish the “obviously important” papers, which have a perhaps slightly elevated chance of actually being important. You know, like “we sequenced 10x more single cells than the last paper in a fancy journal” kind of papers. If you run an overlay journal, you never have to gaze longingly at those “high impact” papers—if you want to publish it, just add it to your overlay!

What are the consequences of non-exclusivity? Primarily, I think it would serve to diminish the value of “obviously important” papers. Everyone can identify them based on authors and number of genomes sequenced or whatever, so there’s really not that much value in including them per se. It would be like saying “Here’s my playlist, it’s like a copy of the Billboard Top 40”. Nobody’s going to look to your overlay journal for that kind of stuff (which you can readily get from CNS or Twitter). Rather, the real value would be in making lists of papers that are awesome but might otherwise be overlooked: essentially a hipster playlist. As an editor, your cachet would be in your ability to identify these new, cool papers and make Michael Cera-esque mixtapes out of them. Can leave the Hot 100 to Casey Kasem/Spotify algorithms.

Measuring the importance of an overlay journal would also be interesting. Clearly, impact factor is not a useful metric, since anybody can make their impact factor as high as they want by including highly cited papers. I would guess a far more sensible metric would be number of followers of the journal (which makes more sense anyway).
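To make the point about impact factor concrete, here is a tiny sketch in Python. It is only an illustration: the citation counts and paper names are made up, and "mean citations per included paper" is a simplified stand-in for the real impact factor calculation.

```python
from statistics import mean

# Hypothetical citation counts for a pool of papers (made-up numbers).
citations = {
    "paper_a": 250,
    "paper_b": 120,
    "paper_c": 15,
    "paper_d": 8,
    "paper_e": 3,
    "paper_f": 1,
}


def mean_citations(selected):
    """Simplified stand-in for impact factor: mean citations per included paper."""
    return mean([citations[paper] for paper in selected])


# An overlay that just copies the "obviously important" papers.
top_40_overlay = ["paper_a", "paper_b"]

# An overlay that highlights papers that might otherwise be overlooked.
hipster_overlay = ["paper_c", "paper_d", "paper_e", "paper_f"]

print("Top 40 overlay: ", mean_citations(top_40_overlay))
print("Hipster overlay:", mean_citations(hipster_overlay))
```

Since any overlay can include any paper, the first number says nothing about the editor's taste. It only reflects a choice to list papers that were already highly cited.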

Another interesting aspect of an overlay journal is that it can be retrospective. You could include old papers as well, highlighting old gems that may have been forgotten.

Of course, an interesting question is whether there is any difference between an overlay journal and someone’s Twitter feed. Not sure, actually…

Also, thoughts on existing journals that have hipster qualities to them? I vote Current Biology, my lab votes eLife.

Friday, July 17, 2020

My favorite "high yield" guides to telling better stories

Guest post by Eric Sanford


In medical school, we usually have five lectures’ worth of new material to memorize each day. Since we can’t simply remember it all, we are always seeking “high yield” resources (a term used so often by med students that it quickly becomes a joke): those concise one or two-pagers that somehow contain 95 percent of what we need to know for our exams. My quest of finding the highest yield resources has continued in full force after becoming a PhD student.


A major goal of mine has been to improve my scientific communication skills (you know, writing, public speaking, figure-making… i.e. those extremely-important skills that most of us scientists are pretty bad at), and I’ve come across a few very high yield resources as I’ve worked on this. Here are my favorites so far:


Resonate, by Nancy Duarte:

  • The best talks are inspiring, but “be more inspiring” is not easy advice to follow.

  • This book teaches you how to turn your content into a story that inspires an audience.

  • I received extremely positive feedback and a lot of audience questions the first time I gave a talk where I tried to follow the suggestions of this book.

  • This was both the most fun and the most useful of all my recommendations.


The Visual Display of Quantitative Information, by Edward Tufte:

  • Tufte is probably the most famous “data visualization” guru, and I think this book, his first book, is his best one. (I’ve flipped through the sequels and would also recommend the chapter on color from “Envisioning Information.”)

  • This book provides a useful framework for designing graphics that convey information in ways that are easy (easier?) for readers to understand. Some pointers include removing clutter, repeating designs in “small multiples”, labeling important elements directly, and using space consistently when composing multiple elements in the same figure.


The Elements of Style, by Strunk and White, pages 18-25:


Words to Avoid When Writing, by Arjun Raj


Raj Lab basic Adobe Illustrator (CC) guide, by Connie Jiang


There are many other great resources out there that are also worth going through if you have the time (Style: Lessons in Clarity and Grace by Bizup and Williams is another excellent writing guide), but for me these ones above had the highest amount-learned-per-minute-of-concentration-invested. 






Wednesday, August 21, 2019

I <3 Adobe Illustrator (for scientific figure-making) and I hope that you will too

Guest post by Connie Jiang

As has been covered somewhat extensively (see here, here, and here), we are a lab that really appreciates the flexibility and ease with which one can use Illustrator to compile and annotate hard-coded graphical data elements to create figures. Using Illustrator to set things like font size, marker color, and line weighting is often far more intuitive and time-efficient than trying to do so programmatically. Furthermore, it can easily re-arrange/re-align graphics and create beautiful vector schematics, with far more flexibility than hard-coded options or PowerPoint.

So why don’t more people use Illustrator?

For one, it’s not cheap. We are lucky to have access to relatively inexpensive licenses through Penn. If expense is your issue, I’ve heard good things about Inkscape and Gimp, but unfortunately I have minimal experience with these and this document will not discuss them. Furthermore, as powerful and flexible as Illustrator is, its interface can be overwhelming. Faced with the activation energy and cognitive burden of having to learn how to do even basic things (drawing an arrow, placing and reshaping a text box without distorting the text it contains), maybe it’s unsurprising that so many people continue to use PowerPoint, a piece of software that most people in our lab first began experimenting with prior to 8th grade [AR editor’s note: uhhh… not everyone]. 

Recently, I decided to try to compile a doc with the express purpose of decreasing that activation energy of learning to use Illustrator to accomplish tasks that we do in the lab setting. Feel free to skip to the bottom if you’d just like to get to that link, but here were the main goals of this document:
  1. Compile a checklist to run through for each figure before submission. This is a set of guidelines and standards we aim to adhere to in lab to maintain quality and consistency of figures.
  2. Give a basic but thorough rundown of essentially everything in Illustrator that you need to begin to construct a scientific figure. Furthermore, impart the Illustrator “lingo” necessary to empower people to search for more specific queries.
  3. Answer some of what I feel to be the most FAQs. Due to my love of science-art and general artistic/design experimentation, I’ve spent a lot of time in Illustrator, so people in lab will sometimes come to me with questions. These are questions like: “my figure has too many points and is slowing my Illustrator down: how can I fix it?” and “what’s the difference between linked and embedded images?”. Additionally, there are cool features that I feel like every scientist should be able to take advantage of, like “why are layers super awesome?” and “how can I select everything of similar appearance attributes?”.
Finally, a disclaimer: This document will (hopefully) give you the tools and language to use Illustrator as you see fit. It does not give any design guidance or impart aesthetic sense (aside from heavily encouraging you to not use Myriad Pro). Make good judgments~

Full Raj lab basic Illustrator guide can be found here.

Sunday, August 4, 2019

I need a coach

I’ve been ruminating over the course of the last several years on a conversation I had with Rob Phillips about coaches. He was saying (and hopefully he will forgive me if I’m mischaracterizing this) that he has had people serve the role of coach in his life before, and that that really helped push him to do better. It’s something I keep coming back to over and over, especially as I get further along in my career.

In processing what Rob was saying, one of the first questions that needed answering is exactly what is a coach? I think most of us think about formal training interactions (i.e., students, postdocs) when we think of coaching in science, and I think this ends up conflating two actually rather disparate things, which are mentoring and coaching. At least for me, mentorship is about wisdom that I have accumulated about decision making that I can hopefully pass on to others. These can be things like “Hmm, I think that experiment is unlikely to be informative” or “That area of research is pretty promising” or “I don’t think that will matter much for a job application, I would spend your time on this instead”. A coach, on the other hand, is someone who will help push you to focus and implement strategies for things you already know, but are having trouble doing. Like “I think we can get this experiment done faster” or “This code could be more cleanly written” or “This experiment is sloppy, let’s clean it up”. Basically, a mentor gives advice on what to do, a coach gives advice on how to actually do it.

Why does this decoupling matter, especially later in your career? When in a formal training situation, you will often get both of these from the same people: the same person, say, guiding your research project is the same person pushing you to get things done right. But after a few years in a faculty position, the N starts to get pretty small, and as such I think the value of mentorship per se diminishes significantly; basically, everybody gives you a bunch of conflicting advice on what to do in any given situation, which is frankly mostly just a collection of well-meaning but at best mildly useful anecdotes. But while the utility of mentorship (or perhaps the availability of high quality mentorship) decreases, I have found that I still have a need for someone to hold me accountable, to help me implement the wisdom that I have accumulated but am sometimes too lazy or scared to put into practice. Like, someone to say “hey, watch a recording of your lecture finally and implement the changes” or “push yourself to think more mechanistically, your ideas are weak” or “that writing is lazy, do better” or “finish that half-written blog post”. To some extent, you can get this from various people in your life, and I desperately seek those people out, but it’s increasingly hard to find the further along you are. Moreover, even if you do find someone, they may have a different set of wisdom that they would be trying to implement for you, like, coaching you towards what they think is good, not what you yourself think is good (“Always need a hypothesis in each specific aim” whereas maybe you’ve come to the conclusion that that’s not important or whatever). If you have gotten to the point where you’ve developed your own set of models of what matters or doesn’t in the world, then you somehow need to be able to coach yourself in order to achieve those goals.

Is it possible to self-coach? I think so, but I’ve always struggled to figure out how. I guess the first step is to think about what makes a good coach. To me, the role of a good coach is to devise a concrete plan (often with some sort of measurable outcome) that promotes a desired change in default behavior. For example, when working with people in the lab in a coaching capacity, one thing I’ve tried to do is to propose concrete goals to try and help overcome barriers. If someone could be participating more in group meeting and seminars, I’ll say “try to ask at least 3 questions at group meeting and one at every seminar” and that does seem to help. Or I’ll push someone to make their figures, or write down their experiment along with results and conclusions. Or make a list of things to do in a day and then search for one more thing to add. Setting these sorts of rules can help provide the structure to achieve these goals and model new behaviors.

How do you implement these coaching strategies for yourself? I think there are a few steps, the first of which are relatively easy. The first step is to identify the issue, which is actually usually fairly clear: “I want to reduce time spent on email”, “I want to write clean code”, “I want to construct a set of alternative hypotheses every time I come up with some fun new idea”, “Push myself to really think in a model-based fashion”. Next comes reduction to a concrete set of goals, which is also usually pretty easy: “Read every email only once and batch process them for a set period of time” or “write software that follows XYZ design pattern” or “write down alternative hypotheses”. The biggest struggle is accountability, which is where having a coach would be good. How do I enforce the rules when I’m the only one following them?

I’m not really sure, but one thing that works for me (which is perhaps quite obvious) is to rely on something external for accountability. For example, I am always looking for ways to improve my talks, and value being able to do a good job. However, it was hard to get feedback, and even when I did, I often didn’t follow through to implement said feedback. So I did this thing where I show the audience a QR code which leads them to a form for feedback. Often, they pointed out things I didn’t realize were unclear, which was of course helpful. But what was also helpful was when they pointed out things that I already knew were unclear, but had been lazy about fixing. This provided me with a bit of motivation to finally fix the issue, and I think it’s improved things overall. Another externalization strategy I’ve tried is to imagine that I’m trying to model behavior for someone else. Example: I was writing some software a while back for the lab, and there were times where I could have done something in the quick, lazy, and wrong way, rather than in the right way. What helped motivate me to do it right was to say to myself, “Hey, people in the lab are going to look at this software as an example of how to do things, and I need to make sure they learn the right things, so do it right, dummy”.

Some things are really hard to externalize, like making sure you stress test your ideas with alternative hypotheses and designing the experiments that will rigorously test them. One form of externalization that works for me is to imagine former lab members who were really smart and critical and just imagine them saying to me “but what about…”. Just imagining what they might say somehow helps me push myself to think a bit harder.

Any thoughts on other ways to hold yourself accountable when nobody else is looking?

Monday, May 6, 2019

Wisdom of crowds and open, asynchronous peer review

I am very much in favor of preprints and open review, but something I listened to on Planet Money recently gave me some food for thought, along with a recent poll I tweeted about re-reviewing papers. The episode was about wisdom of the crowds, and how magically if you take a large number of non-expert guesses about, say, the weight of an ox, the average comes out pretty close to the actual value. Pretty cool effect!

But something in the podcast caught my ear. They talked about how when they asked some kids, you had to watch out, because once one kid said, say, 300 pounds (wildly inaccurate), then if the other kids heard it, then they would all start saying 300 pounds. Maybe some minor variations, but the point is that they were strongly influenced by that initial guess, rather than just picking something essentially purely random. The thing was that if you had no point of reference, then even a guess provides that point of reference.

Okay, so what does this have to do with peer review? What got me thinking about it was the tweet about re-reviewing a paper you had already seen but for a different journal. I'm like nah not gonna do it because it's a waste of time, but some people said, well, you are now biased. So… in a world where we openly and asynchronously review papers (preprints, postpub, whatever), we would have the same problem that the kids guessing the weight of the cow did: whoever gives the first opinion would potentially strongly influence all subsequent opinions. With conventional peer review, everyone does it blind to the others, and so reviews could be considered more independent samplings (probably dramatically undersampled, but that's another blog post). But imagine someone comments on a preprint with some purported flaw. That narrative is very likely to color subsequent reviews and discussions. I think we've all seen this coloring: take eLife collaborative peer review, or even grant review. Everyone harmonizes their scores, and it's often not an averaging. One could argue that unlike randos on the internet guessing a cow's weight, peer reviewers are all experts. Maybe, but I am somehow not so sure that once we are in the world of experts reviewing what is hopefully a reasonably decent paper that there's much signal beyond noise.
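Here is a rough simulation of that worry, written in Python. It is a sketch under made-up assumptions, not a model of real reviewers: the true weight, the spread of individual guesses, and the strength of the pull toward the first opinion (`ANCHOR_PULL`) are all hypothetical numbers picked for illustration. Only the 300-pound first guess comes from the podcast story above.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_WEIGHT = 1300.0  # hypothetical true weight in pounds (made up)
NOISE_SD = 400.0      # hypothetical spread of one person's private guess
N_GUESSERS = 1000
ANCHOR_PULL = 0.8     # hypothetical: how strongly later guessers follow the first opinion


def private_guess():
    """One person's independent, noisy, unbiased guess."""
    return random.gauss(TRUE_WEIGHT, NOISE_SD)


# Blind review: everyone guesses without hearing anyone else.
independent = [private_guess() for _ in range(N_GUESSERS)]

# Open, asynchronous review: the first opinion is public and wildly off,
# and every later guess is pulled toward it.
first_opinion = 300.0
anchored = [first_opinion] + [
    ANCHOR_PULL * first_opinion + (1 - ANCHOR_PULL) * private_guess()
    for _ in range(N_GUESSERS - 1)
]

independent_error = abs(mean(independent) - TRUE_WEIGHT)
anchored_error = abs(mean(anchored) - TRUE_WEIGHT)

print(f"independent crowd error: {independent_error:.0f} lb")
print(f"anchored crowd error:    {anchored_error:.0f} lb")
```

In this toy setup, averaging washes out the independent noise but not the shared pull toward the first opinion, so adding more anchored guessers does not fix the crowd's estimate.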

What could we do about this? Well, we could commission someone to hold all the open reviews in confidence and then publish them all at once… oh wait, I think we already have some annoying system for that. I dunno, not really sure, but anyway, was something I was wondering about recently, thoughts welcome.

Sunday, April 28, 2019

Reintegrating into lab following a mental health leave

[From AR] These days, there is a greatly increased awareness and decreased stigmatization of mental health amongst trainees (and faculty, for that matter), which is great. For mentors, understanding mental health issues amongst trainees is super important, and something we have until recently not gotten a lot of training on. More recently, it is increasingly common to get some training or at least information on how to recognize the onset of mental health issues, and in graduate groups here at Penn at least, it is fairly straightforward to initiate a leave of absence to deal with the issue, should that be required. However, one aspect of handling mental health leaves for which there appears to be precious little guidance out there is what challenges trainees face when returning from a mental health leave of absence, and what mentors might do about it. Here, I present a document written by four anonymous trainees with some of their thoughts (and I will chime in at the end with some thoughts from the mentor perspective).


[From trainees] This article is a collection of viewpoints from four trainees on mental health in academia. We list a collection of helpful practices on the part of the PI and the lab environment in general for cases when the trainees return to lab after recovering from mental health issues.

A trainee typically returns either because they feel recovered and ready to get back to normalcy, or they are **better** than before and have self-imposed goals (e.g. finishing their PhD), or they just miss doing science. Trainees in these situations are likely to have spent time introspecting on multiple fronts and they often return with renewed drive. However, it is very difficult to shake off the fear of recurrence of the episode (here we use episode broadly to refer to a phase of very poor mental health), which can make trainees more vulnerable and sensitive to external circumstances than an average person; for instance, minor stresses can appear much larger. In particular, an off-day after a mental health issue can make one think they are already slipping back into it. In some cases, students may find it more difficult to start a new task, perhaps due to the latent fear of not being able to learn afresh. Support from the mentor and lab environment in general can be crucial in both providing and sustaining the confidence of the trainee. It is important that the mentor recognize that the act of returning to the lab is an act of courage in itself. The PI’s interactions with the trainee have a huge bearing on how the trainee re-integrates into his/her work. Here are some steps that we think can help:

Explicitly tell trainees to seek the PI out if they need help. This can be important for all trainees to hear because the default assumption is that these are personal problems to be dealt with personally in their entirety. In fact, advisors should do this with every trainee -- explicitly tell them that they are there to be reached out to, should their mental health be compromised/affected in any way. Restating this to a returning trainee can help create a welcoming and safe environment.

Reintegrating the trainee into the lab environment. The PI should have an open conversation with the trainee about how much information they want divulged to the rest of the group/department, and how they communicate the trainee’s absence to the group, if at all.

Increased time with the mentee. More frequent meetings with a returning student for the first few months help immensely for multiple reasons: a. It can help quell internal fears by a process of regular reinforcement; b. It can get the students back on track with their research faster; c. The academically stimulating conversations can provide the gradual push needed to think at a level they were used to before mental health issues. Having said that, individuals have their preferred way of dealing with the re-entering situation and a frank conversation about how they want to proceed helps immensely.

Help rebuild the trainee’s confidence. One of the authors of this post recounts her experience of getting back on her feet. Her advisor unequivocally told her: “Your PhD will get done; you are smart enough. You just need to work on your mental health, and I will work with you to make that the first priority.” Words of encouragement can go a long way -- there is ample anecdotal evidence that people can fully recover from their mental health state if proper care is taken by all stakeholders.

Create a small, well-defined goal/team goals. One of the authors of this article spent her first few months working on a fairly easy and straightforward project with a clear message, one that was easy to keep pushing on as she settled in to lab again. While this may not be the best way forward for everyone depending on where they are with their research, a clearly-defined goal can come as a quick side-project, or a deliberate breaking-down of a large project into very actionable smaller ones. Another alternative is to allow the trainee to work with another student/postdoc, something which allows a constant back-and-forth, and quicker validation which can lead to less room for mental doubt.

Remember that trainees may need to come back for a variety of other reasons as well. There are costs associated with a prolonged leave of absence, and for some trainees, they may need to come back before they are totally done with their mental health work. It's likely that some time needs to be set aside to continue that work, and it's helpful if PIs can work with students to accommodate that, within reason.

Finally, it is important for all involved parties to realize that the job of a PI is not to be the trainee’s parent, but to help the student along in their professional journey. Facilitating a lab environment where one feels comfortable, respected, and heard goes a long way, even if that means going an extra mile on the PI’s part to ensure such conditions, case-by-case.

[Back to AR] Hopefully this article is helpful for mentors and also for trainees as they try to reintegrate into the lab. For my part as a mentor, I think that a little extra empathy and attention can go a long way. I think it's important for all parties to realize that mentors are typically not trained mental health professionals, but some common sense guidelines could include increased communication, reasonable expectations, and in particular a realization that tasks that would seem quite easy for a trainee to accomplish before might be much harder now at first, in particular anything out of the usual comfort zone, like a new technique, etc.

Comments more than welcome; it seems this is a relatively under-reported area. And a huge thank you to the anonymous writers of this letter for starting the discussion.