Sunday, February 19, 2017

Results from the Guess the Impact Factor Challenge

By Uschi Symmons and Arjun Raj

tl;dr: We wondered if people could guess the impact factor of the journal a paper was published in from its title alone. The short answer is not really. The longer answer is sometimes yes. The results suggest that talking about any sort of weird organism makes people think your work is boring, unless you’re talking about CRISPR. This raises the question of whether the people who took this quiz are cynical or just shallow. Much future research will be needed to make this determination.

[Arjun] This whole thing came out of a Tweet I saw:

It showed the title: “Superresolution imaging of nanoscale chromosome contacts”, along with the beginning of the link. Looking at the title, I thought, well, this sounds like it could plausibly be a paper in Nature, that most impacty of high impact journals (the article is actually in Scientific Reports, which is part of the Nature Publishing Group and is generally considered to be low impact). This got Uschi and me thinking: could you tell what journal a paper went into by its title alone? Would you be fooled?

[Switching to Uschi and Arjun] By the way, although this whole thing is sort of a joke, we think it does hold some lessons for our glorious preprint-based future, in which the main things you have to go on are the title and the authors. Without the filter/recommendation role that current journals provide, will visibility in such a world be dominated by who the authors are and by increasingly bombastic and hype-filled titles? (Not that that’s not the case already, but…)

To see if people could guess the impact factor of the journal a paper was published in solely based on the title we made up a little online questionnaire. More than 300 people filled out the questionnaire—and here are the results.

Our methodology was cooked up in an hour or two of discussion over Slack, and has so many flaws it’s hard to enumerate them all. But we’ll try and hit the highlights in the discussion. Anyway, here’s what we did: we chose journals with a range of impact factors, three each in the high, medium, and low categories (>20, 8-20, <8, respectively). We tried to pick journals that would have papers with a flavor that most of our online audience would find familiar. We then chose two papers from each journal, picked from a random issue around December 2014/January 2015. The idea was to pick papers that have maybe receded from memory (and have also accumulated some citation statistics, reported as of Feb. 13, 2017), but not from so long ago that the titles would be misleading or seem anachronistic. We picked the paper titles pretty much at random: we picked an issue or did a search by date and basically just took the first paper from the list that was in this area of biomedical science. The idea here was to avoid bias, so there was no attempt to pick “tricky” titles. There was one situation where we looked at an issue of Molecular Systems Biology and the first couple of titles had colons in them, which we felt were perhaps a giveaway that it was not high profile, so we picked another issue. Papers and journals are given in the results below.

The questionnaire itself presented the titles in random order and asked for each whether it was high, medium, or low impact, based on the cutoffs of 0-8, 8-20, 20+. Answering each question was optional, and we asked people to not answer for any papers that they already knew. At least a few people followed that instruction. We posted the questionnaire on Twitter (Twitter Inc.) and let Google (Alphabet) do its collection magic.

Google response analysis here, code and data here.

In total, we got 338 responses, mostly within the first day or two of posting. First question: how good were people at guessing the impact factor of the journal? Take a look:

The main conclusion is that people are pretty bad at this game. The average score was around 42%, which was not much above random chance (33%). Also, the best anyone got was 78%. Despite this, it looks like the answers were spread pretty evenly between the three categories, which matches the actual distribution, so there wasn’t a bias towards a particular answer.

Now the question you’ve probably been itching for: how well were people able to guess for specific titles? The answer is that they were good for some and not so good for others. To quantify how well people did, we calculated a “Perception score”, which is the average score given to a particular title, with low = 1, medium = 2, high = 3. Here is a table with the results:

Title | Journal | Impact factor | Perception score
Single-base resolution analysis of active DNA demethylation using methylase-assisted bisulfite sequencing | Nature Biotechnology | 43.113 | 2.34
The draft genome sequence of the ferret (Mustela putorius furo) facilitates study of human respiratory disease | Nature Biotechnology | 43.113 | 1.88
Dietary modulation of the microbiome affects autoinflammatory disease | Nature | 38.138 | 2.37
Cell differentiation and germ–soma separation in Ediacaran animal embryo-like fossils | Nature | 38.138 | 1.77
The human splicing code reveals new insights into the genetic determinants of disease | Science | 34.661 | 2.55
Opposite effects of anthelmintic treatment on microbial infection at individual versus population scales | Science | 34.661 | 1.44
Dynamic shifts in occupancy by TAL1 are guided by GATA factors and drive large-scale reprogramming of gene expression during hematopoiesis | Genome Research | 11.351 | 2.11
Population and single-cell genomics reveal the Aire dependency, relief from Polycomb silencing, and distribution of self-antigen expression in thymic epithelia | Genome Research | 11.351 | 1.81
A high‐throughput ChIP‐Seq for large‐scale chromatin studies | Molecular Systems Biology | 10.872 | 2.22
Genome‐wide study of mRNA degradation and transcript elongation in Escherichia coli | Molecular Systems Biology | 10.872 | 2.02
Browning of human adipocytes requires KLF11 and reprogramming of PPARγ superenhancers | Genes and Development | 10.042 | 2.15
Initiation and maintenance of pluripotency gene expression in the absence of cohesin | Genes and Development | 10.042 | 2.09
Non-targeted metabolomics and lipidomics LC–MS data from maternal plasma of 180 healthy pregnant women | GigaScience | 7.463 | 1.55
Reconstructing a comprehensive transcriptome assembly of a white-pupal translocated strain of the pest fruit fly Bactrocera cucurbitae | GigaScience | 7.463 | 1.25
Asymmetric parental genome engineering by Cas9 during mouse meiotic exit | Scientific Reports | 5.228 | 2.43
Dual sgRNA-directed gene knockout using CRISPR/Cas9 technology in Caenorhabditis elegans | Scientific Reports | 5.228 | 2.25
A hyper-dynamic nature of bivalent promoter states underlies coordinated developmental gene expression modules | BMC Genomics | 3.867 | 2.16
Transcriptomic and proteomic dynamics in the metabolism of a diazotrophic cyanobacterium, Cyanothece sp. PCC 7822 during a diurnal light–dark cycle | BMC Genomics | 3.867 | 1.25
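For concreteness, the perception score is nothing fancy; here is a minimal sketch of the calculation (the responses below are made up for illustration, not the actual survey data):

```python
# Minimal sketch of the perception score: the mean of responses coded
# low = 1, medium = 2, high = 3. Responses here are invented examples.
CODES = {"low": 1, "medium": 2, "high": 3}

def perception_score(responses):
    """Mean coded rating for one title; skipped answers simply don't appear."""
    coded = [CODES[r] for r in responses]
    return sum(coded) / len(coded)

# A title most respondents rated "high" lands near the top of the 1-3 scale:
score = perception_score(["high", "high", "medium", "high", "low"])
```

Since answering each question was optional, skipped answers just don’t contribute to a title’s average.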

In graphical form:

One thing really leaps out, which is the “bowtie” shape of this plot: while people, averaged together, tend to get medium-impact papers right, there is high variability in aggregate perception for the low and high impact papers. For the middle tier, one possibility is that there is a bias towards the middle in general (like an “uh, dunno, I guess I’ll just put it in the middle” effect), but we didn’t see much evidence for an excess of “middle” ratings, so maybe people are just better at guessing these ones. Definitely not the case for the high and low end, though. Of the two titles apiece from Nature and Science, one was perceived as high impact and the other as low. Also, the two Scientific Reports papers had very high perceived impact, presumably because they have CRISPR in the title.

So what, if anything, makes a paper seem high or low impact? Here’s the same table sorted by perception score. Notice what all the low-scoring ones have in common?

Title | Journal | Impact factor | Perception score
The human splicing code reveals new insights into the genetic determinants of disease | Science | 34.661 | 2.55
Asymmetric parental genome engineering by Cas9 during mouse meiotic exit | Scientific Reports | 5.228 | 2.43
Dietary modulation of the microbiome affects autoinflammatory disease | Nature | 38.138 | 2.37
Single-base resolution analysis of active DNA demethylation using methylase-assisted bisulfite sequencing | Nature Biotechnology | 43.113 | 2.34
Dual sgRNA-directed gene knockout using CRISPR/Cas9 technology in Caenorhabditis elegans | Scientific Reports | 5.228 | 2.25
A high‐throughput ChIP‐Seq for large‐scale chromatin studies | Molecular Systems Biology | 10.872 | 2.22
A hyper-dynamic nature of bivalent promoter states underlies coordinated developmental gene expression modules | BMC Genomics | 3.867 | 2.16
Browning of human adipocytes requires KLF11 and reprogramming of PPARγ superenhancers | Genes and Development | 10.042 | 2.15
Dynamic shifts in occupancy by TAL1 are guided by GATA factors and drive large-scale reprogramming of gene expression during hematopoiesis | Genome Research | 11.351 | 2.11
Initiation and maintenance of pluripotency gene expression in the absence of cohesin | Genes and Development | 10.042 | 2.09
Genome‐wide study of mRNA degradation and transcript elongation in Escherichia coli | Molecular Systems Biology | 10.872 | 2.02
The draft genome sequence of the ferret (Mustela putorius furo) facilitates study of human respiratory disease | Nature Biotechnology | 43.113 | 1.88
Population and single-cell genomics reveal the Aire dependency, relief from Polycomb silencing, and distribution of self-antigen expression in thymic epithelia | Genome Research | 11.351 | 1.81
Cell differentiation and germ–soma separation in Ediacaran animal embryo-like fossils | Nature | 38.138 | 1.77
Non-targeted metabolomics and lipidomics LC–MS data from maternal plasma of 180 healthy pregnant women | GigaScience | 7.463 | 1.55
Opposite effects of anthelmintic treatment on microbial infection at individual versus population scales | Science | 34.661 | 1.44
Reconstructing a comprehensive transcriptome assembly of a white-pupal translocated strain of the pest fruit fly Bactrocera cucurbitae | GigaScience | 7.463 | 1.25
Transcriptomic and proteomic dynamics in the metabolism of a diazotrophic cyanobacterium, Cyanothece sp. PCC 7822 during a diurnal light–dark cycle | BMC Genomics | 3.867 | 1.25

One thing is that the titles at the bottom seem to be longer, and that is borne out quantitatively, although the correlation is perhaps not spectacular:
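The length trend is easy to check yourself. Here is a rough sketch using six rows from the table above (the scores come from the table; the lengths are plain character counts of ASCII-hyphenated versions of the titles, so they may differ by a character or two from the published ones):

```python
# Rough check of title length vs. perception score, using six rows from the
# table above (scores from the table; lengths are simple character counts).

def pearson(xs, ys):
    # Plain Pearson correlation coefficient, no libraries needed.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rows = [
    ("Asymmetric parental genome engineering by Cas9 during mouse meiotic exit", 2.43),
    ("Dietary modulation of the microbiome affects autoinflammatory disease", 2.37),
    ("A high-throughput ChIP-Seq for large-scale chromatin studies", 2.22),
    ("Population and single-cell genomics reveal the Aire dependency, relief from "
     "Polycomb silencing, and distribution of self-antigen expression in thymic epithelia", 1.81),
    ("Reconstructing a comprehensive transcriptome assembly of a white-pupal "
     "translocated strain of the pest fruit fly Bactrocera cucurbitae", 1.25),
    ("Transcriptomic and proteomic dynamics in the metabolism of a diazotrophic "
     "cyanobacterium, Cyanothece sp. PCC 7822 during a diurnal light-dark cycle", 1.25),
]

lengths = [len(title) for title, _ in rows]
scores = [score for _, score in rows]
r = pearson(lengths, scores)  # negative: longer titles, lower perceived impact
```

On these six titles the correlation comes out strongly negative, though with n this small, take the exact number with a grain of salt.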

Any other features of the title? We looked at specificity (which was the sum of the times a species, gene name or tissue was mentioned), declarativeness (“RNA transcription requires RNA polymerase” vs. “On the nature of transcription”), and mention of a “weird organism”, which we basically defined as anything not human or mouse. Check it out:

Hard to say much about declarativeness (declariciousness?), not much data there. Specificity is similarly undersampled, but perhaps there is some tendency for medium impact titles to have more specific information than others? Weird organism, however, really showed an effect. Basically, if you want people to think you wrote a low impact paper, put axolotl or something in the title. Notably, for each of the high impact journals, we had one title perceived as high impact and one as low, and this “weird organism” metric explained that difference completely. The exception to this is, of course, CRISPR: indeed, among the most highly perceived low impact papers was the CRISPR knockout in C. elegans. Note that we also included E. coli as “weird”, although we probably should not have.

We then wondered: does this perception even matter? Does it have any bearing on citations? So many confounders here, but take a look:

First off, where you publish is clearly strongly associated with citations, regardless of how your title is perceived. Beyond that, it was murky. Of the high impact titles, the ones with high perception scores definitely were cited more, but the n is small there, and the effect is not there for medium and low impact titles. So who knows.

Our conclusion seems to be that mid-tier journals publish things that sound like they should be in mid-tier journals, perhaps with titles with more specificity. Flashy and non-flashy papers (as judged by actual impact factor) both seem to be playing the same hype game, and some of them screw up by talking about a weird organism.

Anyway, before reading too much into any of this, like we said in the methods section, there are lots of problems with this whole thing. First off, we are vastly underpowered: the total of 18 titles is nowhere near enough to get any real picture of anything but the grossest of trends. It would have been better to have a large number of titles and have the questionnaire randomly select 18 of them, but if we didn’t get enough responses, then we would not have had very good sampling for any particular title. Also, it would have been interesting to have more titles per journal, but we instead opted for more journals just to give a bit more breadth in that respect. Oh well. Some folks also mentioned that 8 is a pretty aggressive cutoff for “low impact”, and that’s probably true. Perception of a journal’s importance and quality is not completely tied to its numerical impact factor, but we think the particular journals we chose would be pretty commonly associated with the tiers of high, medium and low. With all these caveats, should we have given our blog post the more accurate and specific title “Results from the Guess the Impact Factor Challenge in the genomicsy/methodsy subcategory of molecular biology from late 2014/early 2015”? Nah, too boring, who would read that? ;)

We think one very important thing to keep in mind is that what we measured is perceived impact factor. This is most certainly not the same thing as perceived importance. Indeed, we’re guessing that many of you played this game with your cynic hat on, rolling your eyes at obviously “high impact” papers that are probably overhyped, while in the back of your mind remembering key papers in low impact journals. That said, we think there’s probably at least some correspondence between a seemingly high profile title and whether people will click on it—let’s face it, we’re all a bit shallow sometimes. Both of these factors are probably at play in most of us, making it hard to decipher exactly how people made the judgements they did.

Question is what, if anything, should we do in light of this? A desire to “do” something implies that there is some form of systematic injustice that we could either try to fix or, conversely, try to profit from. To the former, one could argue that the current journal system (which we are most definitely not a fan of, to be clear), may provide some role here in “mixing things up”. Since papers in medium and high impact journals get more visibility than those in low impact journals, our results show that high impact journals can give exposure to poorly (or should we say specific or informatively?) titled papers, potentially giving them a citation boost and providing some opportunity for exposure that may not otherwise exist, however flawed the system may be. We think it’s possible that the move to preprints may eliminate that “mixing-things-up” factor and thus increase the incentive to pick the flashiest (and potentially least informative) title possible. After all, let’s say we lived in a fully preprint-based publishing world. Then how would you know what to look at? One obviously dominant factor is who the authors are, but let’s set that aside for now. Beyond that, one other possibility is to try and increase whatever we are measuring with perception score. So perhaps everyone will be writing like that one guy in our field with the crazy bombastic titles (you know who I mean) and nobody will be writing about how “Cas9–crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria” any more. Hmm. Perhaps science Twitter will basically accomplish the same thing once it recovers from this whole Trump thing, who knows.

Perhaps one other lesson from all of this is that science is full of bright and talented people doing pretty amazing work, and not everybody will get the recognition they feel they deserve, though our results suggest that it is possible to manipulate at least the initial perception of our work somewhat. A different question is whether we should care about such manipulations. It is simplistic to say that we should all just do the work we love and not worry about getting recognition and other outward trappings of success. At the same time, it is overly cynical to say that it’s all just a rat race and that nobody cares about the joy of scientific discovery anymore. Maybe happiness is realizing that we are most accurately characterized by living somewhere in the middle… :)

Friday, February 17, 2017

Introducing Slideboards, a tool for scientific communication

Given the information overload we all deal with, I think it’s pretty safe to say that scientific communication is more important than ever these days. The problem is that we’re still mostly using the same format we’ve been using for ages, namely the paper. And the bottom line is that people just don’t read them. The problem, deep down, is that papers serve two not entirely overlapping purposes: one is to tell people what you learned, and the other is to document precisely how you learned it. This is particularly problematic when trying to understand work outside your particular subdomain—all the details make it hard to focus on the bigger picture.

How do we normally solve the problem of giving a big-picture version of what your paper is about? Personally, I feel like the 5-10 minute short talk like you hear at a conference, when done well, accomplishes this nicely. So our first foray into communicating our science more efficiently was to make slidecasts, which are short videos consisting of slides and a voiceover narration—basically, an online version of the short conference talk. I think these are generally pretty effective, and I’ve gotten generally positive feedback on them, usually along the lines of “We should make these, too” (more on that later). One person I sent a slidecast to, though, had an interesting response. He said that he liked it, but that it was “too slow, I want to get through the slides faster” and that “I want to know the answer to particular details, but I can’t get them.” Hmm. How do you make something simultaneously faster and include more information? So after a fair amount of thinking, we took a cue from the web. If you need to renew your driver’s license, do you download the entire operational manual of the DMV? No, you go to the website and get the overview. And if you have some special case scenario, like your boat-car needs a special game-and-fisheries license or something? Just look at the FAQ. Which got me thinking: maybe this is the solution the “faster, but more content” crowd is looking for. Have a slidecast that one can flip through quickly, then a FAQ on the side that answers those “supplementary figure” questions that people often have during a short talk.

So we made exactly this! (And by we, I mean my awesome technician Rohit, who coded the whole thing from scratch.) We call them Slideboards, and you can check out our first fully-featured “Slideboard” here. I think it pretty much realizes our initial concept. Feel free to post a question and I will try and answer it!

Of course, it’s nice for us to make slidecasts and now Slideboards, but this always raises the question: how do we get others to make them, too? This brings me back to the feedback we got on our slidecasts, which was “We should make these”, after which approximately zero people ever actually made one. Why not? Well, after having made a few of these myself, the answer is that it’s a lot of work—you really have to have a fully written out script, and it usually requires at least a couple of takes, which all adds up to the better part of a day. (Of course, the fact that the work itself probably took two to four years never seems to enter into this calculus, but whatever.) Which is why we really wanted to make an authoring tool that would make the task of creating a Slideboard as simple as possible. Problem is, it’s hard. The reason is simple: making content just plain takes time, as anyone who’s made endless graphical abstracts, bullet points, and the like can relate to. So we thought to ourselves, what is the content that pretty much everyone already has on their work? We thought of two things: a slide deck for a talk on the work, and the PDF of the preprint or other written version that has the various figures and supplementary figures. Our authoring tool leverages these to let you make a Slideboard quickly and easily. Basically, upload the slides to make the slideshow part and type in captions for the slides to provide some narration, then make questions and answers through a quick interface that allows you to drag and select images from the PDF to insert into your answers. Here’s a very short video showing how to do it:

And that’s it! Also, the viewer interface allows your audience to ask you questions, which you can then choose to answer if it seems appropriate (not that there are any dumb questions or anything, but… ;) ). We’ve tried to make the whole process as painless as possible, and hope to see your work soon!

Still, in a world with a steady stream of new ways to reformat and share your scientific work, why use this one? We believe that our approach provides a simple, rapidly digestible format that simultaneously provides a lot of information. Meanwhile, we’ve provided an authoring tool that makes it as easy as possible to develop Slideboards of your own.

And what can you do with Slideboards? Our primary goal so far has been to make a format for sharing scientific papers, and you can easily share links to either the entire Slideboard or a specific slide or question; you just edit the URL like this:

(More convenient URL generating buttons coming soon!) We think there are plenty more possibilities, however, including outreach to young students just getting interested in science, and probably many others we haven’t thought of. Anyway, give it a try, and just let us know if you have any questions, happy to help!

Sunday, February 5, 2017

A bigly new method: the most tremendous FISH ever invented

Post by Ian Mellis.

Here we present a novel method for the visualization and quantification of previously unobservable...what am I saying? This isn't how we write papers anymore! Now that our elected officials can so unceremoniously dispense with objective fact and insist on a personally profitable alternative reality (in a permanent tantrum televised 24 hours a day), I think it's about time that we update our scientific discourse to match the political.

You can access our BIGLy-FISH paper here FREE OF CHARGE. Patriotic!

Saturday, January 7, 2017

I think Apple is killing the keyboard by slow boiling

I’m pretty sure Apple is planning to kill the mechanical keyboard in the near future. What’s interesting is how they’re going about it.

Apple has killed/"moved forward" a lot of tech by unilateral fiat, including the floppy drive, DVD drive, and of course various ports and cables. (Can we just stop for a minute and consider the collective internet brainpower wasted arguing about the merits of these moves? (Yes, I can appreciate the irony.).)

The strategy with the keyboard, however, is something different. For the past several design iterations, the keyboard travel has been getting thinner and thinner, to the point where the travel on the latest keyboards is pretty tiny. It’s pretty easy to see that the direction Apple is headed is towards a future in which the keyboard has no mechanical keys, but is rather some sort of iPad like thing, perhaps with haptic feedback, but with no keys in the traditional sense of the term. (The force touch trackpad and the new touch bar are perhaps harbingers of this move.)

What’s interesting is how Apple is making this transition. With the other transitions, Apple just pulls the plug on a tech (Firewire, I barely knew thee), leading to squealing by a small but not insignificant number of very visible angry users, modestly annoyed shrugs from everyone else, and a Swiss-Army-knife-like conglomeration of old projector adapters wherever I go give presentations. With this keyboard transition, though, the transition has been far more gradual—and the pundit class has consequently been far more muted. Instead of the usual “Apple treats their users with utter contempt!” “Apple is doomed by their arrogance!” and so forth, the response is more like “huh, weird, but you’ll get used to it.” Perhaps this reflects more the fact that there’s no way to “transition” to a new port interface (there is no port equivalent to “reduced key travel”, although perhaps microUSB qualifies), but still.

Why might Apple be doing this? There are three possibilities I can think of. First, one formal possibility is that there could be some convenience/cost benefit to Apple in doing this, like reduced component cost or whatever. This strikes me as unlikely for a number of reasons, not least of which being that it is almost certainly a pain in the butt to keep designing new keyboards. Another possibility is that there is some tradeoff, most obviously with thickness: clearly, having a shorter travel will let you make a thinner computer. While this is a likely scenario, and perhaps the most likely, there are some reasons to question this explanation. For instance, why do the keyboards on the desktop Macs (remember those?) also have shorter key travel now? One could say that it’s to maintain parity with laptops, but then again, anyone suffering through desktop Macs these days knows that parity isn’t exactly the name of Apple’s game lately—frankly, the keyboard is just about the only thing that got updated on the iMacs in the last several years. Which leads to the third possibility, which is that having a non-mechanical keyboard (essentially a big iPad) down there would enable new interfaces and so forth. Hmm. Well, either way, I think we’ll find out soon.

Thursday, January 5, 2017

Why care about the Dow? Why not?

Just listened to this Planet Money podcast all about hating on the Dow Jones Industrial Average. Gist of it: the Dow Jones calculates its index in a weird (and most certainly nonsensical) way, and is an anachronism that must die. They also say that no market "professional" (quotes added by me) ever talks about the Dow, while measures like the S&P 500 and the Wilshire 5000 are far more sensible.

This strikes me as a criticism that distracts from the real issue, which is whether one should be using any stock market indicator as an indicator of anything. Sure, the Dow is "wrong" and the S&P 500 is more "right" in that it weights by market cap. Whatever. Take a look at this:

Pretty sure that this fact goes back longer as well, but Wolfram Alpha only goes back 5 years and I've already wasted too much time on this. Clearly, also, short term fluctuations are VERY strongly correlated—here's the correlation with the S&P 500 in terms of fluctuations:

So I think the onus is on the critics to show that whatever differences there are between the S&P and the Dow are meaningful as predicting something about the economy. Good luck with that.

Of course, as an academic, far be it from me to decry the importance of doing something the right way, even if it has no practical benefit :). That said, in the podcast, they make fun of how the Dow talks about its long historical dataset as an asset, one that outweighs its somewhat silly mode of computation. This strikes me as a bit unfair. Given the very strong correlation between the Dow and S&P 500, this long track record is a HUGE asset, allowing one to make historical inferences way back in time (again, to the extent that any of this stuff has meaning anyway).

I think there are some lessons here for science. I think that it is of course important to calculate the right metric, e.g. TPM vs. FPKM. But let's not lose sight of the fact that ultimately, we want these metrics to reflect meaning. If the correspondence between a new "right" metric and an older, flawed one is very strong, then there's no a priori reason to disqualify results calculated with older metrics, especially if those differences don't change any *scientific* conclusions. Perhaps that's obvious, but I feel like I see this sort of thing a lot.
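To make the TPM vs. FPKM point concrete, here is a toy sketch (the counts and gene lengths below are made up for illustration): the two metrics normalize by gene length and sequencing depth in different orders, and within a single sample TPM is just a rescaling of FPKM, so they rank genes identically.

```python
# Toy TPM vs. FPKM comparison; counts and gene lengths below are invented.

def fpkm(counts, lengths_bp):
    # Fragments Per Kilobase of transcript per Million mapped reads:
    # normalize by sequencing depth first, then by gene length.
    total = sum(counts)
    return [c * 1e9 / (l * total) for c, l in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    # Transcripts Per Million: normalize by gene length first,
    # then rescale so the values sum to one million.
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    s = sum(rates)
    return [r * 1e6 / s for r in rates]

counts = [100, 2000, 60, 400]        # hypothetical reads per gene
lengths = [1000, 4000, 500, 2000]    # hypothetical gene lengths (bp)

f = fpkm(counts, lengths)
t = tpm(counts, lengths)
```

TPM sums to a million by construction while FPKM's total varies by sample, which is exactly the kind of "rightness" difference that often doesn't change any scientific conclusion within a sample.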

Friday, December 30, 2016

Last post ever on postdoc pay

Original post, first follow up, this post

Short intro: wrote a post about how I didn't like how some folks were (seemingly) bragging about how high they pay their postdocs on the internet, got a lot of responses, wrote a post with some ideas about how postdocs and PIs could approach the subject of pay. That was meant to deal with short term practical consequences. Here, I wanted to highlight some of the responses I got about aspects of postdoc pay that have to do with policy, likely with no surprises to anyone who's thought about this for more than a few minutes. Again, no answers here, just mostly reporting what I heard. So sorry, first part of the post is probably kind of boring. At the end, I'll talk about some things I learned about discussing this sort of thing on the internet.

First off, though, again, for the record, I support paying postdocs well and support the increased minimum. I think a minimum starting salary of $48K (however inadvertently that number was reached) seems to be a reasonable minimum to enforce across the US. Based on what, I dunno, honestly. I just think we need a flat national minimum: it would be hard/weird for NIH to do it by cost of living across the US, but at the same time, relying on institutions to set their own wage scales is ripe for abuse. More on that later.

Anyway, it is clear that one of the top concerns about postdoc pay was child care. No surprise there, postdoc time often coincides with baby time, and having kids is expensive, period. One can get into debates about whether one's personal life choices should figure into how much pay someone "deserves", but considering that the future of the human race requires kids, I personally think it's a thing we absolutely must be considering. There are no easy answers here, though. Igor Ulitsky summed it up nicely:

I think Igor is absolutely right, an institutional child care subsidy is really the only way to do it. The problem otherwise is that the costs are so high for childcare that just paying everyone enough for childcare regardless of family status would quickly bankrupt most PIs' grants. But just paying more based on "need" has a lot of flaws. I think it was telling that at least some trainees said that they wouldn't begrudge their coworker with a kid if the PI paid them more. Well, what if your coworker had parents who lived with them? Or parents who could live with them? Or a spouse who earned a lot of money? Or was home from work often because of the kid? And how much extra should they be paid? Enough for "cadillac" child care? Bare minimum child care? I just don't think it's reasonable or wise for PIs to be making these decisions. If, on the other hand, the institution stepped in to make this a priority (as both my postdocs have argued), then this would solve a lot of problems. They could either provide a voucher applicable to local daycares or provide daycare itself at a heavily subsidized rate (I think Penn does provide a subsidy, but it's not much). This is, of course, a huge expense for institutions to take on, and I'm sure they won't do it willingly, but perhaps it's time to have that discussion. Anecdotally, I think there really has been a change—before, many academics would wait until getting a faculty position (maybe even tenure) before having kids, whereas now, many academics come into the faculty position with kids. I think this is good and important especially for women, and I think it's pushing this particular issue for postdocs into the foreground.

The other big issue folks brought up was diversity. Low wages mean that those without means face a pretty steep price for staying in science, potentially forcing them out, as this commenter points out from personal experience. I think this is a real problem, and again, no real answer here. I'm not convinced, however, that the postdoc level is where that gap typically emerges—I'm guessing that it's mostly at the decision to go to graduate school in the first place. (The many confounders likely make such analyses difficult to interpret, though I don't know much about it.) Which is in some ways perhaps a bit surprising, since unlike medical/law/business school, you actually get paid to do a PhD (although I believe most analyses still suggest that you could earn more overall by just getting a job straight away, maybe depending on the field). Also, higher pay would mean fewer postdoc positions, making the top ones more competitive, thus potentially further hurting the chances of those facing bias, although my guess is that this latter concern would not outweigh the former on diversity.

Along these lines is the notion of opportunity cost, with at least a few people (typically computational) noting that the postdocs they want to hire can earn so much on the open market that if they didn't pay them a lot, it would be hard to get them. At the same time, interestingly, a couple trainees invoked the ideals of the free market, saying that people should be paid whatever they can earn. Hmm. Well, I think this gets into the question of what the cost of doing science is. Scientists at all stages (from trainees to PIs) probably earn less on average than they could in private industry, with that differential varying by field and circumstance—that is the price for doing what we love. The obvious question is whether this sets up a system primed for abuse. There are some who are willing to work like a dog for next to nothing for the chance to keep doing science. For this reason, there has to be a reasonable minimum to ensure at least some degree of diversity in the talent pool. Beyond that, I personally have no problem with people paying above the minimum if they so choose (and institutional policies that prevent that strike me as pretty unfair and something to fight against). If this helps keep talented people in science, great!

The notion of a free-market approach to pay is an interesting one, one that led me to the following question about the cost of doing science. Let's say that I had a ton of money. Is there some amount of money I could pay to get a postdoc that I otherwise would lose to some big name PI? Like, let's say I paid my postdoc $1M per year. Well, I'd probably be getting a lot of top quality postdoc applications (although still probably not even close to all). But what about $100K? How much would that factor into someone's decision to do a postdoc with me? I venture to say that the answer is not much. How little would someone be willing to accept for the opportunity to work with a big name who could greatly aid their quest for a faculty job? All I can say is I'm glad there's a minimum. :)

I also learned a bit about online discussions on this topic. As I said in my first post, I was super reluctant to discuss this topic at all online, given the opportunity for misunderstanding and so forth. And sure enough, I got some of what I thought were unfairly accusatory responses. Which, of course, is something that I was guilty of myself (and I apologize to MacArthur for that). Hmm. I still stand by, sort of, my point that the original tweet from MacArthur came across in a way that was perceived by many as boastful, even if that was not his intent, and that that may not be the most productive way to start a discussion. That said, I also have to acknowledge that waiting for the "perfect" way to discuss the issue means waiting forever, and in the meantime, just saying something, anything, publicly can have an effect. Clearly the collective tweets, posts and responses on the topic (most are imperfect, though I particularly like this one from Titus Brown) are having the desired effect of engendering a discussion, which is good. And, as a practical matter, I'm hopeful that airing some of the institutional differences in postdoc pay may help both trainees and mentors (see some examples in my second post). It is clear that there's a lot of mystery shrouding the topic, both for trainees and PIs alike, and a little sunlight is a good thing.

All that said, I still think that in addition to online rants of various kinds, with an issue this complex, it's pretty important for us all to talk with each other face to face as well. After all, we're all on the same team here. Academia is a small world, and while it's important to disagree, personal attacks generally serve nobody… and you might as well be transparent about who you're disagreeing with so they can disagree back:

(In my defense, the only reason I "subtweeted" is that I really didn't want to call MacArthur out personally because his was just the latest tweet out of many of this kind I had seen. And I suppose it worked in that many people I know who read the post indeed had no idea who I was referring to. But giving him the chance to respond is probably on balance the right thing to do.)

Anyway, while I have not met MacArthur in person, I'm guessing we'll probably cross paths at some point, at which point my main concern is that we'll discover we agree on many things and so I won't have anything else to write about… :)

Wednesday, December 7, 2016

Some less reluctant(ish) follow up thoughts on postdoc pay

(Original post, this post, second follow up)

Well, looks like that last post incited some discussion! tl;dr from that post: I wrote that I found tweeting about how high you pay your postdocs above what most other labs pay to be off-putting. There are many factors that go into pay, and I personally don't think talking about how much you yourself pay is a productive way to discuss the important issue of postdoc pay in general. Even if the intent is not to boast, it certainly comes across as boastful to a number of people, which turns them off from the conversation. To be clear, I also said that I support paying postdocs well and support the increased minimum. It's the perceived boast, not the intent, that I have issue with.

So I learned a LOT from the feedback! Lots of comments, fair number of tweets (and these things called "subtweets"; yay internet!) and several personal e-mails and messages—more on all that in a later post; suffice it to say there's a "diversity of opinion". Anyway, okay, I said that I didn't like this particular way of bringing about discussion about postdoc pay. But at the same time, I do think it's a good thing to discuss, and discuss openly. Alright, so it's easy for me to criticize others about their tweets or whatever, but what, then, do I think is a good way to discuss things? Something I've been thinking about, and so I want to write a couple posts with some ideas and thoughts.

Overall, I think there are two somewhat separate issues at play. One is the immediate, practical issue of how to increase awareness of the problems people have and bring about some better outcomes in the near-term. The other is long-term policy goals and values that I will bring up in a later post (with relatively few ideas on what specifically to do, sorry).

So, to the first point, one of the things I learned is how surprisingly mysterious the subject of postdoc pay is, both to prospective postdocs and to PIs alike. Morals and high-minded policy discussions aside, it seems like many just don't know some basic practical matters that can have a real impact. Anyway, here are a few relatively off-the-cuff suggestions of things to think about based on what I've heard, and feel free to add to the list.

First, for potential postdocs, the main thing to do is to remember that while science should in my opinion be the primary factor in choosing a postdoc, pay is another important factor and one you should definitely not shy away from, awkward though it may seem. I think advocacy begins here, on a practical level, by advocating for yourself. Keeping in mind that I haven't hired that many postdocs and I'm not sure how some of these ideas might hold up in practice, here is some information and some ideas for trainees on how to approach pay:
  • Ask about pay relatively early on, perhaps once there's real interest on both sides, during or maybe better after a visit (dunno on that). It may be uncomfortable, but at least make sure that it's clear that it's on your radar as a thing to discuss. Doesn't mean that you have to come to a hard number right away, but signal that it's worth talking about.
  • Before having such a discussion, it's worth thinking about what number seems fair to you. There is the NIH minimum, and then there's your life situation and location and so forth. You are an adult with a PhD, so take stock of what you think you need to be happy and productive, and don't be afraid of saying so. What can help with this is to think about what you might otherwise make outside of academia, or what the average cost of living is in your area, or your particular personal situation, or whatever other factors, and come up with a number. Having some rationalization for your number, whatever it may be, is important to help you maintain fortitude when you do discuss pay and not feel like you're being impudent. Remember that the PI probably finds this awkward as well, and so having guidance can actually help both parties! And if you're a decent candidate, you may have a surprising amount of bargaining power. At the same time, remember that the PI may have their own expectations for the discussion (which may include not having the conversation!), and so you may catch them a bit off guard, depending.
  • Some basic orientation about pay: the major national guideline comes from the NIH. The NIH sets a *minimum for fellowship* pay. This used to be ~$42K a year for a starting postdoc, and then there was some labor ruling that caused that to increase to ~$48K a year. Institutions often follow this NIH guidance to set up their pay guidelines. This ruling got overturned recently, and so now some institutions have gone back to $42K starting, while some others have not. These are the national guidelines for a baseline. Clearly, some places in the country are going to be more expensive than others.
  • This is the NIH guidance on the minimum. At some places, yes, you can definitely be paid more than the minimum (apparently, many trainees didn't know that). At some places, there are institutional rules that prevent PIs from paying more than the minimum or some other defined number or range. At some places, there are institutional rules that require PIs to pay above the minimum. If the PI has flexibility, they may have their own internal lab policy on pay, including a "performance raise" if you get a fellowship. And it's also possible that the PI just doesn't have any clue about any of this and just goes along with what HR tells them. At the same time, keep in mind that the PI does manage a team with existing players, and they must manage issues of fairness as well. Anyway, point is ask, do not ever assume.
  • Some points of reference. Many (most?) postdocs work for the NIH minimum (which of course does not mean you should or should not, necessarily). Stanford institutionally starts at $50K. As mentioned last time, some folks pay $60K (Tweet was from Daniel MacArthur, who has asked that I not subtweet, sorry). Right or wrong, clearly some PIs take issue with this. I've heard of some fellowships that went up north of $80K. I think that $80K is probably considered by most to be a pretty eye-poppingly high salary for a postdoc, but dunno, I'm old now. Computational work often pays more than straight biology because a lot of those folks could make so much in industry that it's harder to attract them for less (maybe $10K+ premium?). Math often pays higher than biology because postdocs are considered sort of like junior faculty. Physics often pays better as well, perhaps dependent on whether you have some named fellowship. Anyway, you have an advanced degree, do some homework. I think it makes sense to be sure your number reflects your self-assessed worth but is within reasonable norms, however you choose to define "reasonable".
  • As in any negotiation, there may be back and forth. As this happens, you may have areas in which you are flexible, and maybe the PI is flexible. It is also possible that the PI is unable or unwilling to bend on pay. At that point, it is up to you to make the decision about whether that sacrifice is worth it for you. There are of course further policy discussions that must happen in this regard, but for now, this is what you are faced with, and it's your decision to make.
  • It is possible that PIs may not even know all the options for pay. Sometimes, there is some institutional inertia on "how they do things" that everyone just goes along with. This can be hard to find out until you get there and find out who to ask, though.
  • There are often some hidden costs, and it's worth considering what those may be in your case. These can include things like out of pocket payments for health insurance (including family), gym memberships, and various other benefits. Note that sometimes these costs can vary depending on your official position at the institution, which in turn can change depending on whether you have a fellowship or whatever (sometimes, a fellowship reduces your status, thus costing you more for many things, ironically). There may be some sort of child care benefit or something, or at least access to the university daycare. And there may be some commuting benefits, in case that's relevant. Some places are able to cover moving costs if the PI wishes.
  • There are a host of issues for foreign postdocs, and someone more knowledgeable than I should probably write about them, but some costs I've seen are visa costs (sometimes paid by institution, sometimes not, very confusing), and also travel costs associated with yearly return visits to the home country for visa purposes. These return visits, by the way, may be avoidable with longer contracts, which may or may not be available, which was something I just learned recently myself.
  • For a lot of the above hidden costs, the PI may not even realize that these sorts of things are going on, and they may be willing to help. There is a possibility that they can cover some of these costs, depending on institutional rules, or maybe it can be a rationale to negotiate a higher salary.
Here are some thoughts for PIs, probably mostly for junior people (which I still consider myself, but I'm probably just kidding myself). Most of these I'm just kind of making up on the spot, being a relatively inexperienced postdoc-hirer myself:
  • It took me a while to learn all the intricacies of what constitutes pay. What are the pay scales? What can I pay for? Moving costs? Commuting costs? Benefits? I still don't think I fully understand all of this, but I wish I had a better understanding when I started. When I started, it was like "you can hire a postdoc, here you go."
  • I'm still not fully clear on all the hidden costs to my people and what benefits they get, and I should really brush up on that, potentially making a plain English document for new lab members.
  • At the institutional level, it took me a while to disentangle what is actual policy on things like pay vs. what is just "the way we have always done it". Breaking these unofficial rules gave me some flexibility to do good things for my people.
  • I am thinking of developing a coherent lab policy on pay, explicitly stating what I will and won't consider when figuring out overall pay level, relative pay between people, etc. I haven't really worried about it so far, and that's been fine, but having something like that would really help. I guess that's sort of obvious, so maybe I'm just sort of late to this bit of common sense. Am I alone in that?
  • I think in the course of coming up with such a policy on pay, I'll probably think about exactly what my values are, what these kids' opportunity costs are, and how much I think is reasonable to live on in Philly. I mean, I kinda do this already, but haven't really thought about it very seriously, and periodic reexamination seems appropriate.
  • I'm not entirely sure I would share this policy within the lab, though. Thing is, everyone's circumstances are different, and exceptions are frankly pretty much the rule. I think the point is just to have some sort of internal guidance so that at least you won't forget about anything when deliberating.
  • I'm wondering whether and to what extent it's worth discussing lab cost management with the people in your lab so that they see how the sausage gets made. I had one trainee who was surprised to find out (not from me, but rather from Penn HR) exactly how much their pay actually counted against a grant once all the benefits and so forth were added in. There is an argument to be made (that I've mostly subscribed to) that postdocs should just focus on their work and not worry about the lab bills. There's another argument to be made that sharing such information gives people a sense of the true costs of running a lab for training purposes. Then again, it's a fine line between being informative and passive-aggressive. Dunno on this one.
Anyway, who knows if this will help anything, but consider this my contribution to the discussion for now. While it certainly won't solve all the problems out there, given the surprising lack of basic knowledge on the subject, perhaps this information will be of some use. More in another post later on policy things that came up, as well as how to talk about these things on the internet.