As a junior PI, you get a lot of advice about when to say “no”. One PI I know told me that he and his other junior PIs have a rule that they have to say no to at least one thing a day. And it is sage advice. The demands on our time are huge, and so every minute counts.
Sometimes I worry, though, that the pendulum may have swung too far in the other direction, to the point where the received wisdom to say no to everything prevents us from saying yes every once in a while. Say yes and you might just end up on a new adventure you may not have anticipated with cool and interesting people. Say no and you will never know.
I started thinking about this when I read this excellent blog post with advice for new PIs. All great tips, and one that really resonated with me was the tip to “Be a good colleague”. Basically, the point is that while there are some reasons you might think it wise to do a bad job on something so that nobody asks you again, it’s far better to do a good job. I think the same holds for interpersonal interactions. I think it’s important to make time for the people you care about in your work life. Sometimes you might do a favor for a senior (or junior) colleague. Then you might have lunch and end up with an awesome collaboration. Or maybe the favor doesn’t get repaid. That’s okay, too; it happens. And some people are just not going to make fun collaborators, and you might get burned. It takes time to get better at identifying those beforehand, and I know I still have much to learn about that. But I’m also learning not to be quite as suspicious of every request, and also trying to just go with the flow a bit. It’s led to some really great collaborations from which I've learned a lot.
My point is that by reflexively saying no to everything, I think we’re denying ourselves some of the richness of the life of a PI that comes through interactions with colleagues and their trainees, which I’ve found to be very valuable. And enjoyable. That’s the point, right?
Thursday, June 25, 2015
Biking in a world of self-driving cars will be awesome
While I was biking home the other day, I had a thought: this ride would be so much safer if all these cars were Google cars. I think it’s fair to say that most bikers have had some sort of a run-in with a car at some point in their cycling lives, and the asymmetry of the situation makes it very dangerous for bikers. Thing is, we can (and should) try to raise bike awareness in drivers, but the fact is that bikes can often come out of nowhere and in places that drivers don’t expect, and it’s just hard for drivers to keep track of all these possibilities. Whether it’s “fair” or “right” or not is beside the point: when I’m biking around, I just assume every driver I meet is going to do something stupid. It’s not about being right, it’s about staying alive.
But with self-driving cars? All those sensors mean that the car would be aware of bikers coming from all angles. I think this would result in a huge increase in biker safety. I think it would also greatly increase ridership. I know a lot of people who at least say they would ride around a lot more if it weren’t for their fear of getting hit by a car. It would be great to get all those people on the road.
Two further thoughts: self-driving car manufacturers, if you are reading this, please come up with some sort of idea for what to do about getting “doored” (when someone opens a door in the bike lane). Perhaps some sort of warning, like “vehicle approaching”? Not just bikes, actually–would be good to avoid cars getting doored (or taking off the door) as well.
Another thing I wonder about is whether bike couriers and other very aggressive bikers will take advantage of cautious and safe self-driving cars to completely disregard traffic rules. I myself would never do that :), but I could imagine it becoming a problem.
Wednesday, June 24, 2015
And you thought Tim Hunt was bad?
I’ve been sort of following the ink trail on Tim Hunt’s comments (which, incidentally, seems to have made the trail following Alice Huang go cold, just like in politics!), so the topic of sexism in academia has been on my mind. I don’t think I have anything useful to say on the Hunt thing beyond what’s already out there. I have a lot of female trainees in my lab, and even before Tim Hunt, I thought I had a sense of the sort of serious obstacles they face, including comments like those from Hunt. And yes, comments like those are a serious obstacle. It is disappointing and damaging, but not entirely surprising, to hear stuff like that, although perhaps not in such a public forum.
It is in that context that I was absolutely shocked to hear someone I know tell me about her experiences at a major US institution. Seriously inappropriate comments in the workplace, including heavy-handed sexual advances. Women being groped and physically pushed around behind closed doors. Men in power using that power to touch women inappropriately, and as was intimated without details, worse. Worse to the point that a woman has a physical reaction when a certain man enters the room. And an institution that essentially protects these predators.
My jaw was on the floor. And the response to my shock was “Arjun, you have no idea, this stuff is happening all the time.” All the time. (To be clear, this institution is not Penn.)
The woman said that it seems to be much more of a problem with the older generation of men. I suppose we can wait for them to retire and go away. My sense of justice makes me feel like these people should have to pay for what has likely been a career of preying on women. And any institution that enables this sort of behavior needs some pretty deep soul searching. Even if such behavior is less prevalent in the newer generation, that is no guarantee that it is eliminated. And even having one such person around is one too many.
I am purposefully not naming any names because this is some pretty serious stuff, and ultimately, it’s not really my story to tell. I just wanted to bring it up because while I think we have come a long way, for me, it was a wake-up call that we still have a very long way to go.
Also, I want to make sure that this post isn’t misconstrued as some sort of minimization of the negative impact of Tim Hunt’s frankly bewildering statements. Words matter. Actions matter. It all matters. Indeed, I see the reaction to Tim Hunt’s comments as a strongly positive indicator of how far the discussion has come. Rather, I also want to point out that the reason the discussion is where it is comes from the tireless efforts of women through the decades who have put up with things I couldn’t even imagine, whose very decision to stay in science can be regarded as an act of deep courage and bravery. The thing that blew my mind is that women are still making those decisions to this day.
Sunday, June 14, 2015
RNA-seq vs. RNA FISH for 26 genes
Been meaning to post this for a while. Anyway, in case you're interested, here is a comparison of mean number of RNA per cell measured by RNA FISH to FPKM as measured by RNA-seq for 26 genes (bulk and also combined single cell RNA-seq). Experimental details in Olivia's paper. We used a standard RNA-seq library prep kit from NEB for the bulk, and used the Fluidigm C1 for the single cell RNA-seq. Cells are primary human foreskin fibroblasts.
[Figure: Bulk RNA-seq vs. RNA FISH (avg. # molecules per cell)]
[Figure: Single cell RNA-seq vs. RNA FISH (avg. # molecules per cell), linear scale]
Probably could be better with UMIs and so forth, but anyway, for whatever it's worth.
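If you wanted to run this sort of comparison on your own data, a minimal sketch might look like the following. To be clear, the gene names and numbers here are made-up placeholders and not the measurements in the plots above, and comparing on a log10 scale with a Pearson correlation is just one reasonable choice, not necessarily what we did.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only (NOT the data from the figures):
# mean RNA FISH count per cell and RNA-seq FPKM for four hypothetical genes.
fish_counts = {"geneA": 5.0, "geneB": 40.0, "geneC": 120.0, "geneD": 900.0}
fpkm = {"geneA": 2.0, "geneB": 10.0, "geneC": 55.0, "geneD": 300.0}

# Pair up the two measurements gene by gene, on a log10 scale.
genes = sorted(fish_counts)
log_fish = [math.log10(fish_counts[g]) for g in genes]
log_fpkm = [math.log10(fpkm[g]) for g in genes]

r = pearson(log_fish, log_fpkm)
print(f"Pearson r (log10 scale): {r:.3f}")
```

The log scale matters here because expression levels span orders of magnitude, so on a linear scale a handful of highly expressed genes would dominate the correlation.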
Saturday, June 6, 2015
Gene expression by the numbers, day 3: the breakfast club
(Day 0, Day 1, Day 2, Day 3)
So day 3 was… pretty wild! And inspiring. A bit hard to describe. There was one big session. The session had some dancing. A chair was thrown. Someone got a butt in the face. I’m not kidding.
How did such nuttiness come to pass? Well, today the 15 of us all gave exit talks, where we have the floor to discuss a point of our choosing. On the heels of the baseball game, we decided (okay, someone decided) that everyone should choose a walk-up song, and we’d play the song while the speaker made their way up for the exit talk. Later, I’ll post the playlist and the conference attendees and set up a matching game. The playlist was so good!
(Note: below is a fairly long post about various things we talked about. Even if you don’t want to read it all, check out the scientific Rorschach test towards the end.)
I was somehow up first. (See if you can guess my song. People in my lab can probably guess my song.) The question I posed was “does transcription matter?” More specifically, if I changed the level of transcription of a gene from, say, 196 transcripts per cell to 248 transcripts per cell, does that change anything about the cell? I think the answer depends on the context. Which led me to my main point that I kind of mentioned in an earlier post, which is that (I think) we need strong definitions based on functional outcomes in order to shape how we approach studying transcriptional regulation. I personally think this means that we really need to have much better measurements of phenotype so we can see what the consequences are of, say, a 25% increase in transcription. If there is no consequence, then should we bother studying why transcription is 25% higher in one situation vs. the other? Along these lines, Mo Khalil made the point that maybe we can turn to experimental evolution to help us figure out what matters, and maybe that could help guide our search for what matters in regulation.
Barak raised another great point about definitions. He started his talk by posing the question “Can someone please give me a good definition of an enhancer?” In the ensuing discussion, folks seemed to converge on the notion that in molecular biology, definitions of entities are often very vague and typically shaped much more by the experiments that we can do. Example: is an enhancer a stretch of DNA that affects a gene independently of its position? At a distance? These notions often come from experiments in which people move the enhancer around and find that it still drives expression. Yet from the quantitative point of view, the tricky thing with experimentally based definitions is that these were often qualitative experiments. If moving the enhancer changes expression by 50%, then is that “location independent”?
Justin made an interesting point: can we come up with “fuzzy” definitions? Is there a sense in which we can build models that incorporate this fuzziness that seems to be pervasive in biology? I think this idea got everyone pretty excited: the idea of a new framework is tantalizing, although we still have no idea exactly what this would look like. I have to admit that personally, I’m not so sure that dispensing with the rigidity of definitions is a good thing–without rigid definitions, we run the risk of not saying anything useful and concrete at all. Perhaps having flexible definitions is actually similar to just saying that we can parametrize classes of models, with experiments eliminating some fraction of those model classes.
Jané brought in a great perspective from physics, saying that actually having a lot of arguments about definitions is a great thing. Maybe having a lot of competing definitions, with all of us trying to prove ours and contrast it with others, will eventually lead us to the right answer, whereas myopia in science can really lead to stagnation. I really like this thought. I feel like “big science” endeavors often fail to provide real progress because of exactly this problem.
The discussion of definitions also fed into a somewhat more meta discussion about interdisciplinary science and different approaches. Rob is strongly of the opinion that physicists should not need to get the permission of biologists to study biology, nor should they allow them to dictate what’s “biologically relevant”. I think this is right, and I also find myself often annoyed when people tell us what’s important or not.
Al made a great point about the role of theory in quantitative molecular biology. The point of theory is to say, “Hey, look at this, this doesn’t make sense. When you run the numbers, the picture we have doesn’t work–we need a new model.” Jané echoed this point, saying that at least with a model, we have something to argue about.
He also said that it would be great if we could formulate “no-go” models. Can we place constraints on the system in the abstract? Gasper put this really nicely: let’s say I’m a cell in a bicoid gradient trying to make a decision on what to do with my life. Let’s say I had the most powerful regulatory “computer” in the world in that cell. What’s the best that that computer could do with the information it is given? How precisely can it make its decision? How close do real cells get to this? I think this is a very powerful way to look at biology, actually.
Some of the discussions on theory and definitions brought up an important meta point relating to interdisciplinary work. I think it’s important that we learn to speak each other’s languages. I’ve very often heard physicists give a talk where they garble the name of a protein or something like that, and when a biologist complains, the response is sort of “well, whatever, it doesn’t matter”. Perhaps it doesn’t matter, but it can be grating to the ear, and the attitude can come across as somewhat disrespectful. I think that if a biologist were to give a talk and say “oh, this variable here called p… oh, yes, you call it h-bar, but whatever, doesn’t matter, I call it p”, it would not go over very well. I think we have to be respectful and aware of each other’s terminology and definitions and world view if we want to get each other to care about what we are both doing. And while I agree with Rob that physicists shouldn’t need permission from biologists to study biology, I also think it would be nice to have their blessing. Personally, I like to be very connected to biologists, and I feel like it has opened my mind up a lot. But I also think that’s a personal choice, perhaps informed by my training with Sanjay Tyagi, a biologist whom I admire tremendously.
Another point about communicating across fields came up in discussing synthetic biology approaches to transcriptional regulation. If you take a synthetic approach to regulatory DNA, you will often encounter fierce resistance that you’re studying a “toy model” and not the real system. The counter, which I think is a reasonable argument, is that if you study just the existing DNA, you end up throwing your hands in the air and saying “complexity, who knew!”. (One conferee even said complexity is a waste of time: it’s not a feature but rather a reflection of our ignorance. I disagree.) So the synthetic approach may allow us to get at the underlying principles in a controlled and rigorous manner. I think that’s the essence of mechanistic molecular biology: make a controlled environment and then see if we can boil something down to its parts. Sort of like working in cell extracts. I think this is a sensible approach and one that deserves support in the biological community–as Angela said, it’s a “hearts and minds” problem.
That said, personally, I’m not so sure that it will be so easy to boil things down to their parts–partly because it's clearly very hard to find non-regulatory DNA to serve as the "blank slate" to work with for synthetic biology. I'm thinking lately that maybe a more data-first approach is the way to go, although I weirdly feel quite strongly against this view at the same time (much more on this in a perspective piece we are writing right now in lab). But that’s fundamentally scary, and for many scientists, this may not be a world they want to live in. Let me subject you to a scientific Rorschach test:
Image from here
What do you see here?
1. A catalog of data points.
2. A rule with an exception.
3. A best fit line that explains, dunno, 60% of the variance, p = 0.002 (or whatever).
Which leads us to #2 vs. #3. I posit that worldview #2 is science as we traditionally know it. A theory is a matter of belief, and doesn’t have a p-value. It can have exceptions, which point to places where we need some new theory, but in and of itself, it is a belief that is absolute. #3 is a different world, one in which we have abandoned understanding as we traditionally define it (and there is little right now to lead us to believe that #3 will give us understanding like #2, sorry omics people).
I would argue that the complexity of biological regulation may force us out of #2 and into #3. At this meeting, I saw some pretty strong evidence that a simple thermodynamic model can explain a fair amount of transcriptional regulation. So is that a theory, a simple explanation that most of us believe? And we just need some additional theory to explain the exceptions? Or, alternatively, can we just embrace the exceptions, come up with some effective theory based on regression, and then say we’ve solved it totally? The latter sounds “wrong” somehow, but really, what’s the difference between that and the thermodynamic model? I don’t think that any of us can honestly say that the thermodynamic model is anything other than an effective representation of molecular processes that we are not capturing fully. So then how different is that from an SVM telling us there are 90 features that explain most of the variance? How much variance do you explain before it’s a theory and not a statistical model? 90%? How many features before it’s no longer science but data science? 10? I think that where we place these bars is a matter of aesthetics, but also defines in some ways who we are as scientists.
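To make that question a bit more concrete, here is a toy sketch. It assumes the simplest textbook form of a thermodynamic model, occupancy = [TF] / (K_d + [TF]), as a stand-in; that is not necessarily what anyone presented at the meeting, and the K_d, the concentrations, and the function names are all made up for illustration. It generates noiseless "data" from the occupancy curve and then asks how much of the variance a plain best fit line explains.

```python
def occupancy(tf_conc, k_d):
    """Fraction of time a single binding site is occupied (simple binding curve)."""
    return tf_conc / (k_d + tf_conc)


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def r_squared(ys, predictions):
    """Fraction of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


K_D = 2.0  # hypothetical dissociation constant, arbitrary units
tf_concs = [0.5 * i for i in range(1, 21)]  # hypothetical TF levels, 0.5 to 10
expression = [occupancy(c, K_D) for c in tf_concs]  # noiseless toy "data"

slope, intercept = linear_fit(tf_concs, expression)
line_predictions = [slope * c + intercept for c in tf_concs]

r2_line = r_squared(expression, line_predictions)
r2_model = r_squared(expression, expression)  # the generating model, by construction

print(f"best fit line explains {r2_line:.0%} of the variance")
print(f"generating model explains {r2_model:.0%} of the variance")
```

The line explains a decent chunk of the variance without knowing anything about binding, which is sort of the point: variance explained alone doesn't tell you which of the two descriptions counts as the "theory".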
Personally, I feel like complexity is making things hopeless and we have to have a fundamental rethink transitioning from #2 to #3 in some way. And I say this with utmost fear and trepidation, not to mention distaste. And I’m not so sure I’m right. Rob holds very much the opposite view, and we had a conversation in which he said, well, this field is messy right now and it might take decades to figure it out. He could be right. He also said that if I’m right, then it’s essentially saying that his work on finding a single equation for transcription is not progress. Did I agree that that was not progress? I felt boxed in by my own arguments, and so I had to say “Yeah, I guess that’s not progress”. But I very much believe that it is progress, and it’s objectively hard to argue otherwise. I don’t know, I’m deeply ambivalent on this myself.
Whew. So as you can probably tell, this conference got pretty meta by the end. Ido said this meeting was not a success for him, because he hasn’t come away with any tangible, actionable items. I agree and disagree. This meeting was sort of like The Breakfast Club. It was a bunch of us from different points of view, getting together and arguing, and over time getting in touch with our innermost hopes and anxieties. Here’s a quote from Wikipedia on the ending of the movie:
Although they suspect that the relationships would end with the end of their detention, their mutual experiences would change the way they would look at their peers afterward.
I think that’s where I am. I actually learned a lot about regulatory DNA, about real question marks in the field, and got some serious challenges to how I’ve been thinking about science these days. It’s true that I didn’t come away with a burning experiment that I now have to do, but I would be surprised if my science were not affected by these discussions in the coming months and years (in fact, I am now resolved to work out a theory together with Ian in the lab by the end of the summer).
At the end, Angela put up Ann Friedman’s Disapproval Matrix:
She remarked, rightly, that even when we disagreed, we were all pretty much in the top half of the matrix. I think this speaks to the level of trust and respect everyone had for each other, which was the best part of this meeting. For my part, I just want to say that I feel lucky to have been a part of this conference and a part of this community.
Walk-up song match game coming soon, along with a playlist!
Friday, June 5, 2015
Gene expression by the numbers, day 2: take me out to the ballgame
(Day 0, Day 1, Day 2, Day 3 (take Rorschach test at end of Day 3!))
First off, just want to thank a commenter for providing an interesting and thoughtful response to some of the topics we discussed in day 1. Highly recommended reading.
Day 2 started with Rob trying to stir the pot by placing three bets (the stakes are dinner in Paris at a fancy restaurant, yummy!). First bet was actually with me, or really a bet against pessimism. He claimed that he would be able to explain Hana’s complicated data on transcription in different conditions once we measured the relevant parameters, like, say, transcription factor concentration (wrote about this in the day 1 post). My response was, well, even if you could explain that with all the transcription factor concentrations, that’s not really the problem I have. My problem is that it is impossible to build a simple predictive model of transcription here. The input-output relationship depends on so many other factors that we end up with a mess–there are no well-defined modules. To which Rob rightfully responded by saying that that's moving the goalposts: I said he can't do X, he does X, I say now you have to do Y. Fair enough. I accept the original challenge: I claim that he will not be able to explain the differences in Hana's data using just transcription factor concentration.
Next bet was with Barak. In the day 1 post, I mention the statistical approach vs. the mechanistic approach. Rob and Barak still have to formulate the bet precisely (and I think they actually agree mostly), but basically, it is a bet against the statistical approach. Hmm. Personally, I don't know how I come down on this. I am definitely sympathetic to Rob's point of view, and don't like the overemphasis these days on statistics (my thoughts). But my thoughts are evolving. Rob asked "Would it really have been possible to derive gravitation with a bunch of star charts and machine learning?" To which I responded with something along the lines of "well, we are machines, and we learned it." Sort of silly, but sort of not.
Final bet was with Ido (something about universality of noise scaling laws). Ido also had a bet of his own on this point, in this case offering up a bottle of Mezcal for a resolution. More on this some other time. I am going to try and get the bottle!
The talks were again great (I mean really great), if perhaps a bit more topically diffuse than yesterday. Started with evolution. Very cool, with beautiful graphs of clonal sweeps. An interesting point was that experimental evolution arrives at different answers than you expect initially. They are rational (or can be), but not what you expect early on–amazingly even in pathways as well worked out as the metabolic pathways. I'm wondering if we could leverage this to understand pathways better in some way?
On to the "tech development" section, which was only somewhat about tech development, somewhat not. Stirling gave a great talk about human NET-seq. What I really liked about it was that in the end, there was a simple answer to a simple question (is transcription different over exons when they're skipped? exons vs. introns?). I think it's awesome to see that genome-wide data can give such clear results.
So far, everything was about control of the mean levels of transcription. Both Ido and I talked about the variance around that mean, with Ido providing beautiful data on input-output functions. On the Mezcal, Ido shows that there is a strong relationship between the Fano factor and the mean. I am wondering whether this is due to volume variation. Olivia's paper has some data on this. Probably the subject of another blog post at some point in the future.
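To make the volume-variation hunch concrete, here is a minimal sketch of one way it could play out. This is my own toy assumption, not anything Ido or Olivia showed: transcript counts in each cell are Poisson with a rate equal to a fixed concentration times that cell's volume. The function name and the example volumes are made up for illustration.

```python
from statistics import fmean, pvariance


def fano_with_volume_variation(concentration, volumes):
    """Return (mean, Fano factor) for a toy mixed-Poisson model.

    Assumption (hypothetical): each cell's count is Poisson with
    rate = concentration * volume, so volume is the only extrinsic noise.
    """
    rates = [concentration * v for v in volumes]
    mean = fmean(rates)
    # Law of total variance: Poisson part (equal to the mean)
    # plus the spread of the rates across cells.
    variance = mean + pvariance(rates)
    return mean, variance / mean


# Same relative volume spread, two different expression levels.
low_mean, low_fano = fano_with_volume_variation(10, [0.5, 1.5])
high_mean, high_fano = fano_with_volume_variation(100, [0.5, 1.5])
print(low_mean, low_fano)
print(high_mean, high_fano)
```

In this toy model the Fano factor works out to 1 + mean × CV², where CV is the coefficient of variation of volume, so it rises with the mean even though nothing about the gene itself has changed. Whether that accounts for the real data is exactly the open question.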
Theory: great discussion about Hill coefficients with Jeremy! How can you actually get thresholds in transcriptional regulation? Couple ideas. There's conventional cooperativity, and there could also be other mechanisms, like titration via dummy binding sites like in Nick Buchler's work. Surprising that we still have a lot of questions about mechanisms of thresholds after all this time.
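For anyone who wants to play with the two ideas, here is a rough sketch. The first function is the standard Hill form. The second is a simple equilibrium titration picture in which decoy sites soak up transcription factor before any is free to bind the promoter. The function names, parameter values, and the one-to-one binding assumption are mine, chosen for illustration, and are not taken from the talks or from any particular paper.

```python
import math


def hill(conc, k_half, n):
    """Hill function: fractional activation with Hill coefficient n."""
    return conc**n / (k_half**n + conc**n)


def free_tf(total, decoys, kd):
    """Free TF concentration when decoy sites bind TF one-to-one.

    Solves free + decoys * free / (free + kd) = total for free,
    i.e. the positive root of free^2 + (kd + decoys - total) * free - kd * total = 0.
    """
    b = kd + decoys - total
    return (-b + math.sqrt(b * b + 4.0 * kd * total)) / 2.0


def promoter_occupancy(total, decoys, kd_decoy, kd_promoter):
    """Simple (non-cooperative) promoter binding, driven by free TF only."""
    free = free_tf(total, decoys, kd_decoy)
    return free / (free + kd_promoter)


# Sweep total TF across the number of decoy sites (100 here, hypothetical).
for total in (25, 50, 75, 100, 125, 150):
    print(total, round(promoter_occupancy(total, 100, 0.1, 5.0), 3))
```

With tight decoy binding, promoter occupancy stays near zero until total TF exceeds the number of decoys and then climbs quickly, which gives a threshold without any cooperativity at the promoter itself.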
Conversation with Jeremy and Harinder: how much do we know about whether sequence fully predicts binding? Thought for an experiment–if you sweep through transcription factor concentrations, what happens to binding as measured by e.g. ChIP-seq? Has anyone done this experiment?
Then, off to the Red Sox vs. the Twins. Biked over there on Hubway with Ron, which was perfect on a really lovely day in Cambridge. The game was super fun! Apparently there were some people playing baseball there, but that didn't distract me too much. Had a great time chatting with various folks, including two really awesome students from Angela's lab, Clarissa Scholes and Ben Vincent, who joined in the fun. Talked with them about the leaky pipeline, which is something I will never, ever discuss online for various reasons. Also crying in lab–someone at the conference told me that they've made everyone in their lab cry, which is so surprising if you know this person. Someone also told me that I'm weird. Like, they said "Arjun, you are weird." Which is true.
Oh, and the Twins won, which made me happy–not because I know the first thing about baseball, but because I hate the Red Sox, mostly because of their very annoying fans. Oops, did I say that out loud?
Okay, fireworks are happening here on day 3. More soon!
Thursday, June 4, 2015
Gene expression by the numbers, verdict on day 1: awesome!
(Day 0, Day 1, Day 2, Day 3 (take Rorschach test at end of Day 3!))
Yesterday was day 1 of Gene expression by the numbers, and it was everything I had hoped it would be! Lots of discussion about big ideas, little ideas, and everything in between. Jané Kondev said at some point that we should have a “controversy meter” based on the loudness of the discussion. Some of the discussions would definitely have rated highly, which is great! Here are some thoughts, very much from my own point of view:
We started the day with a lively discussion about how I am depressed (scientifically) :). I’m depressed because I’ve been thinking lately that maybe biology is just hopelessly complex, and we’ll never figure it out. At the very least, I’ve been thinking we need wholly different approaches. More concretely for this meeting, will we ever truly be able to have a predictive understanding of how transcription is regulated? Fortunately (?), only one other person in the room admitted to such feelings, and most people were very optimistic on this count. I have to say that at the end of the day, I’m not completely convinced, but the waters are muddier.
Who is an optimist? Rob Phillips is an optimist! And he made a very strong point. Basically, he’s been able to take decades of data on transcriptional regulation in E. coli and reduce it to a single, principled equation. Different conditions, different concentrations, whatever, it all falls on a single line. I have to say, this is pretty amazing. It’s one thing to be an optimist, another to be an optimist with data. Well played.
And then… over to eukaryotes. I don’t think anyone can say with a straight face that we can predict eukaryotic transcription. Lots of examples of a lot of effects that don’t resolve with simple models, and Angela DePace gave a great talk highlighting some of the standard assumptions that we make that may not actually hold. So what do we do? Just throw our hands in the air and say “Complexity, yipes!”?
Not so fast. First, what is the simple model? The simplest model is the thermodynamic model. Essentially, each transcription factor binds to the promoter independently of the others, and its effect on transcription is independent of theirs. Um, duh, that can’t work, right? I was of the opinion that decades of conventional promoter bashing hasn’t really provided much in the way of general rules, and more quantitative work along these lines hasn’t really done so either.
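As a minimal sketch of what "independent" means here (my own toy parameterization, not the specific models shown at the meeting): each factor occupies its site with probability c / (c + Kd) regardless of the others, and a bound factor multiplies the transcription rate by a fixed amount. The function names and numbers below are hypothetical.

```python
def occupancy(conc, kd):
    """Equilibrium probability that a single site is bound."""
    return conc / (conc + kd)


def expression(basal, factors):
    """Toy thermodynamic model with fully independent factors.

    factors: iterable of (conc, kd, effect), where effect is the fold
    change in rate when that factor is bound (>1 activates, <1 represses).
    Independence means the average rate is a product of per-factor terms.
    """
    level = basal
    for conc, kd, effect in factors:
        p = occupancy(conc, kd)
        level *= (1.0 - p) + p * effect
    return level


# One activator and one repressor, each half-bound (conc == kd).
print(expression(1.0, [(10.0, 10.0, 3.0)]))
print(expression(1.0, [(10.0, 10.0, 3.0), (10.0, 10.0, 0.0)]))
```

Anything like cooperativity, competition for overlapping sites, or distance dependence would show up as a departure from this simple product form.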
But Barak brought up an extremely good point, which is that a lot of these approaches to seeing how promoter changes affect transcription suffer from being very statistically underpowered. They also made the point (with data) that once you really start sampling, maybe things are not so bad–and amazingly enough, maybe some of the simplest and “obviously wrong” caricatures of transcriptional regulation are not all that far off. Maybe with sufficient sampling, we can start to see rules and exceptions, instead of a big set of exceptions. Somehow, this really resonated with me.
I’m also left a bit confused. So do we have a good understanding of regulation or not? I saw some stuff that left me hopeful that maybe simple models may be pretty darn good, and maybe we’re not all that far off from the point where if I wanted to dial up a promoter that expressed at a certain level, I just type in this piece of DNA and I’ll get close. I also saw a lot of other stuff that left me scratching my head and sent me back to wondering how we’ll ever figure it all out.
There was also an interesting difference in style here. Some approach things from a very statistical point of view (do a large number of different things and look for emergent patterns). Some approach things from a very mechanistic point of view (tweak particular parameters we think are important, like distances and individual bases, and see what happens). I usually think it’s very intellectually lazy to say things like “we need both approaches, they are complementary”, but in this case, I think it’s apt, though if I had to lean one way, personally, I think I favor the statistical approach. Deriving knowledge from the statistical approach is a tricky matter, but that’s a bigger question. How much variance do we need to explain? As yet unanswered, see later for some discussion about the elephant in the room.
Models: some cool talks about models. One great point: “No such thing as validating a model. We can only disprove models.” A point of discussion was how to deal with models that don’t fit all the data. Do we want to capture everything? How many exceptions to the rule can you tolerate before it’s no longer a rule?
Which brings me to a talk that was probably highest on the controversy meter. In this one, the conferee who shares my depression showed some results that struck me as very familiar. The idea was to build a quantitative model, then go build some experiments to show transcriptional response, and the model fits nicely. Then you change something in the growth medium, and suddenly, the model is out the window. We’ve all seen this: day to day variability, batch variability, “weird stuff happened that day”, whatever. So does the model really reflect our understanding of the underlying system?
This prompted a great discussion about what our goals are as a community. Is the goal really to predict everything in every condition? Is that an unreasonable thing to expect from a model? This got down to understanding vs. predicting. Jané brought up the point that these are different: Google can predict traffic, but it doesn’t understand traffic. A nice analogy, but I’m not sure that it works the other way around. I think understanding means prediction, even if prediction doesn’t necessarily mean understanding. Perhaps this comes down to an aesthetic choice. Practically speaking, for the quantitative study of transcription, I think that the fact that the model failed to predict transcription in a different condition is a problem. One of my big issues with our field is that we have a bunch of little models that are very context specific, and the quantitative (and sometimes qualitative) details vary. How can we put our models together if the sands are shifting under our feet all the time? I think this is a strong argument against modularity. Rob made the solid counter that perhaps we’re just not measuring all the parameters–if we could measure transcription factor concentration directly, maybe that would explain things. Perhaps. I’m not convinced. But that’s just, like, my opinion, man.
So to me the big elephant in the room that was not discussed is what exactly matters about transcription? As quantitative scientists, we may care about whether there are 72 transcripts in this cell vs. 98 in the one next door, but does that have any consequences? I think this is an important question because I think it can shape what we measure. For instance, this might help us answer the question about whether explaining 54% of the variance is enough–maybe the cell only cares about on vs. off, in which case, all the quantitative stuff is irrelevant (I think there is evidence for and against this). Maybe then all we should be studying is how genes go from an inactive to an active state and not worry about how much they turn on. Dunno, all I’m saying is that without any knowledge of the functional consequences, we’re running the risk of heading down the wrong path.
Another benefit to discussing functional consequences is that I think it would allow us to come up with useful definitions that we can then use to shape our discussion. For instance, what is cross-talk? (Was the subject of a great talk.) We always talk about it like it’s a bad thing, but how do we know that? What is modularity? What is noise? I think these are functional concepts that must have functional definitions, and armed with those definitions, then maybe we will have a better sense of what we should be trying to understand and manipulate with regard to transcriptional output.
Anyway, looking forward to day 2!
Tuesday, June 2, 2015
Gene expression by the numbers, day 0: Big picture questions about transcription
(Day 0, Day 1, Day 2, Day 3 (take Rorschach test at end of Day 3!))
So just about to get on a plane to go to Boston/Cambridge for a meeting on transcription–I think it's going to be a lot of fun! Bunch of folks with a quantitative bent getting together, including the organizers Al Sanchez, Hernan Garcia, Jané Kondev, Angela DePace and Rob Phillips (big thanks for all their hard work!). The big reason I'm excited is that this is not going to be a typical meeting: the goal is to dispense with the usual formalities of a meeting (like a bunch of boring talks that nobody pays attention to) and instead actually talk with each other about where we want the field to head and how we might get there. We even all made short videos beforehand as a sort of pre-conference introduction!
This is going to require changing our usual scientific behavior, which is to stamp out wild ideas as soon as we hear them. You know that crazy person who asks you some weird question at the end of your seminar about bees and the number 12? Well, that's going to be me, and I won't be satisfied with "talking about it later off-line". :)
Nor is it going to be completely off-line. I'm going to blog about the goings-on in the hope that others can participate as well in what is sadly (but perhaps necessarily) a rather small event. So drop me a line if you have any burning questions about transcription.
What are the sorts of questions we'll be discussing? Here are a few I’ve been thinking about after watching everyone’s videos:
- How close are we to a predictive understanding of the regulatory code? I.e., if I give you a cell type and a piece of DNA, can I predict how much transcription there will be?
- (Related bonus question) How do we deal with the complexity of metazoan transcriptional regulation? What new conceptual frameworks will we need to make further progress?
- What are some new methods that we could develop that would help us understand transcription? What are the quantities that we would like to measure?
- Development appears to be incredibly precise–how do developing organisms achieve this despite the sloppiness of chemical reactions? To what extent is this precision an intrinsic property of the cell and to what extent is it an emergent property of the interaction of different cells?
- What are the functional consequences of transcription? Which aspects of transcription “matter” and which ones are irrelevant? In chemistry, we talk about rate-limiting reactions. What are the biology-limiting reactions in transcription? What should we be measuring?
More soon!