Considerations on Careerscienceability
When the best look like obstinate anti-science conspiracy theorists
Last November saw the 200th Open Thread on Scott Alexander’s blog, formerly Slate Star Codex, now Astral Codex Ten and number one in Substack’s science category. Consider the following passage, from one of the notes he included in the post hosting the thread:
Suppose that some parapsychologist has done twenty studies, all of which prove psychic phenomena exist (and there are many such parapsychologists!) Does it help if she works for another year or two, and we get forty such studies? How about another decade, and we get two hundred such studies? Who cares?! We already know that this parapsychologist, using whatever methodology she uses, is able to consistently get positive results.
The actual context was ivermectin as a COVID-19 medication, rather than psychic phenomena. But parapsychology is well-suited to make the point, since we don’t believe in it even though it passes the usual tests of scientific validity. Instead, we have to recognise that those tests are faulty — which obviously opens a can of worms. Scott presented the problem eight years ago in his classic The Control Group is Out Of Control, which includes thoughts closely parallel to the ones just quoted (such as “you can keep coming up with more studies till the cows come home”).
Now, what I find interesting and potentially destructive here is that, once we look at ivermectin rather than parapsychology, Scott is probably right in his skepticism but, as he himself indicates later in the abovementioned note, at the same time in danger of looking unreasonable. Doubly so, even. He looks not only like a person who refuses to submit to the authority of scientific studies, but also like a “bad Bayesian” who refuses to “update” his beliefs, obstinately clinging to them no matter how much contrary evidence accumulates.
First of all, let’s be clear about one thing: can Scott really be right? Do the standards of science really let spurious results pass on a large scale? To convince those hesitant to believe it, there is not only his Control Group, but also his more recent post on the serotonin transporter gene 5-HTTLPR, a gene around which a large body of work formed from 1996 onwards that, as has become clear more recently, is simply baseless. He describes in the post, to quote from its concluding paragraph, “a rare case where methodological improvements allowed a conclusive test of a popular hypothesis, and it failed badly. How many other cases like this are there, where there’s no geneticist with a 600,000 person sample size to check if it’s true or not? How many of our scientific edifices are built on air?”
We need to acknowledge this. Scientists tend to come up with confirmation for many of their hypotheses even if those are false. How do they do it? That’s the obvious question. In his Control Group, Scott ultimately has to engage with it, so let me quote him once more. He suggests (this is written in the context of psychology; other contexts may feature additional ingredients)
a combination of subconscious emotional cues [think of the horse Clever Hans], subconscious statistical trickery, perfectly conscious fraud which for all we know happens much more often than detected, and things we haven’t discovered yet which are at least as weird as subconscious emotional cues. But rather than speculate, I prefer to take it as a brute fact. Studies are going to be confounded by the allegiance of the researcher.
However it works, the tendency is there. Here I will speak of “researcher’s magic sauce” and follow Scott’s brute-fact approach. That is, I will try to comment on the consequences and limits, but not on the obvious question of what’s behind it.
Consequences like, as noted, the reasonable side in a debate appearing anti-science and obstinate. But it gets even worse. If scientists find non-existent phenomena, even on a large scale, that does not yet really come across as sinister, however bad the consequences. As long as there is no ideological motivation, it’s more reminiscent of a group of children developing theories about fairies together — innocent enough. But what if the game is about hiding rather than inventing something? Now we’re truly hitting rock bottom. Now those aware of the problems are set to appear like obstinate anti-science conspiracy theorists.
Looking for examples of phenomena that scientists would be incentivised to hide, I suppose one could consider politically incorrect aspects of reality. There is an obvious incentive to play down those aspects: combine that with researcher’s magic sauce, and one can imagine what might be going on. However, what one can also imagine, even if a field were politically innocuous, is simply that a dogma has taken hold. That one of several competing schools has managed to monopolise such commanding heights as funding bodies, editorial posts, etc. Today’s scientific enterprise is a professional one; scientists need their careers; and hence the incentives to toe the party line, if one can be established, would be strong.
How many scientific fields this describes I don’t know, but one that surely fits the bill is nutrition science, as documented in The sugar conspiracy, a 2016 article from The Guardian, by Ian Leslie. The root dogma here, still influential today, is the claim that saturated fat causes heart disease. In the broader context of what causes obesity and Western diseases, the main competing theory in the 1960s, subsequently stamped out by the anti-fat brigade, pointed to sugar instead. However bad or good the evidence against sugar, it certainly was neglected by the field — hence “sugar conspiracy”.
The extent of the failure of nutrition science as a scientific endeavour (see also here) is hopefully exceptional, but think also of the fields that famously suffer from a “replication crisis” (though at least those scientists are aware that experiments should be replicable — I’m not sure nutrition science could even have a replication crisis). The problems are pronounced. My post is not about solving them, just about coming to terms with the situation. And even in that respect it has no aspirations of breaking new ground. But one possible way to get a handle on an issue is to name it, so here’s a definition:
A careerscienceable body of scientific evidence for a hypothesis is one that, if the hypothesis is false, is still feasibly produced by the cumulative work of scientists in light of their career incentives.
Not the least of the emphasis should be on “cumulative”. Before we can competently spread the word, most of us probably have to retrain our own intuitions about how little volume means when it comes to scientific evidence. According to Scott’s 5-HTTLPR post, there are hundreds of studies “discovering” all sorts of different aspects and connections of the phenomenon, including meta-analyses “establishing” certain effects of the gene with very high confidence (just 1 in 50,000 against). Can we intuitively grasp that such an impressive edifice, erected by professional scientists, is all built on air? And yet it is. Meanwhile, nutrition science was at it for even longer than the 5-HTTLPR researchers were. Expect thousands of studies (at least) in that case, using different approaches and techniques, but all “converging” on the suggestion that saturated fat is bad for you — and yet it means nothing. Or at least the enormous volume of evidence produced means nothing.
I’m not saying that, in a field captured by dogma, mainstream scientists lie and contrarians don’t. That would be questionable on several levels. First, one doesn’t have to accuse researchers of outright lies. I quoted Scott on “subconscious emotional cues” and “subconscious statistical trickery”; besides, short of lying there is still freedom in how to present results in publications, for instance what to put prominently in the abstract and what to keep away from plain sight. I should also admit that the notion of careerscienceability finds a competitor in something like “confirmscienceability”, if researchers just take prevailing wisdom for granted independently of career incentives. Second, the allegiances of many contrarian researchers would seem to be, if anything, even stronger. And there can also be career incentives to be contrarian. However, contrarians are a small minority by definition, and hence a body of contrarian evidence loses careerscienceability once it reaches a certain volume, whereas a body of dogma-confirming evidence does not. (Although, if the contrarians in question are as highly productive as Scott’s parapsychologist, the threshold will be high.)
A similar loss of careerscienceability can affect a voluminous body of negative evidence. It will depend on the circumstances, but I assume that in a case like the 5-HTTLPR one the career incentive will have been to add planks to the edifice rather than to express doubts. It is well known, after all, that positive results are more publishable than negative ones. And better known yet is the mantra “publish or perish”. Thus, conceivably, a body of scientific evidence that is uniform in type and mostly positive could still warrant a negative conclusion — if what negative evidence the body does contain, evidence that the phenomenon is not real, is not careerscienceable once added together.
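The publication-bias mechanism can be made concrete with a toy simulation. The numbers below are hypothetical, not drawn from any of the studies discussed: a field runs many studies of an effect that does not exist, and only results clearing a significance threshold get published. A naive reading of the published literature then shows a substantial “effect”:

```python
import random
import statistics

random.seed(0)

SD = 1.0                        # outcome standard deviation in both arms
N = 30                          # participants per arm, per study
SE = (2 * SD**2 / N) ** 0.5     # standard error of the mean difference

def run_study(true_effect=0.0):
    """One two-arm study of a null effect; returns the estimated mean difference."""
    treat = [random.gauss(true_effect, SD) for _ in range(N)]
    control = [random.gauss(0.0, SD) for _ in range(N)]
    return statistics.mean(treat) - statistics.mean(control)

# The field runs 500 studies of a phenomenon that is not real...
all_studies = [run_study() for _ in range(500)]

# ...but only "positive" results, those clearing a rough z > 1.96
# significance cutoff in the expected direction, reach the journals.
published = [d for d in all_studies if d > 1.96 * SE]

# A naive meta-analysis then averages the published effects.
pooled = statistics.mean(published)
print(f"{len(published)} of {len(all_studies)} studies published")
print(f"pooled 'effect' in the literature: {pooled:.2f} (true effect: 0)")
```

Every published study here “confirms” the effect, and pooling them yields a sizeable estimate even though the true effect is exactly zero; running more studies only entrenches the illusion. The filter is deliberately crude (real publication bias is softer), but the direction of the distortion is the same.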
What else can be said about careerscienceability? More people need to know about it — but getting the point across looks tough. The best chance, I guess, is finding and exposing cases of scientific edifices built on air, such as the 5-HTTLPR one. As I indicated, I think such cases are less risky than cases of dogma-bound science: even if you don’t manage to convince everyone, the label of conspiracy theorist hopefully won’t quite fit.
A label you wouldn’t want since the underlying problem here, which I have not yet mentioned explicitly, is that conspiracy theorists tend to be wrong! Often grotesquely so. Similarly for people who reject science wholesale, and people who cling to their beliefs. This is the tragedy of making the best look like obstinate anti-science conspiracy theorists. The latter sort are usually not worth listening to at all. And believers in scientific expertise and authority like to outright ban them from public platforms when possible. Careerscienceability forces the best, those most needed for society’s intellectual health, to mimic the worst, and even to run the risk of getting banned with them.
Notice also that finding and exposing scientific edifices built on air, an obvious strategy and one just advocated here, supports not only the best but equally the worst. What we really need is differentiation between the best and the worst. And, to end on a positive note, when it comes to “obstinate” and “anti-science”, the quest for this might not be hopeless. What the best have to try is to specify which kind of scientific evidence would actually change their minds. The “normal” kind of scientific study is not good enough; a high volume of such studies isn’t; and meta-analyses certainly aren’t. What would be, by contrast? Is there anything?
A body of studies that show high effect sizes, not to be confused with high confidence as in the 5-HTTLPR meta-analyses, loses careerscienceability rather quickly. This I take from Scott’s Control Group. So please come up with such studies . . . then again, if an effect is real but small, that’s a stupid demand. And if it’s real and big, we presumably wouldn’t be in this situation. Is there anything, once we’re in it?
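The distinction between effect size and confidence is worth a back-of-the-envelope calculation (illustrative numbers only, not taken from any real literature): with a large enough pooled sample, a negligible effect can reach extreme statistical significance, while a genuinely large effect in one modest study looks less impressive on paper:

```python
import math

def z_and_p(effect, n):
    """Two-sample z statistic and one-sided p-value for a mean difference
    `effect`, assuming unit standard deviations and n participants per arm."""
    se = math.sqrt(2 / n)
    z = effect / se
    p = 0.5 * math.erfc(z / math.sqrt(2))   # upper tail of the standard normal
    return z, p

# A tiny effect measured across an enormous pooled sample...
z_small, p_small = z_and_p(effect=0.05, n=15000)
# ...versus a large effect in a single modest study.
z_big, p_big = z_and_p(effect=0.8, n=40)

print(f"tiny effect, huge n:  z={z_small:.2f}, p={p_small:.1e}")
print(f"big effect, small n:  z={z_big:.2f}, p={p_big:.1e}")
```

The tiny effect comes out as the statistically “stronger” result, with a p-value beyond the 1-in-50,000 level, even though it is practically negligible; the big effect, the kind that resists careerscienceability, carries the larger p-value. Confidence measures how sure we are the effect isn’t exactly zero, not how much it matters.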
Well, there might be. If we look again at the Guardian article, the one thing about nutrition it really confirms is not that sugar is bad (research on that was dialled down, after all), but rather that saturated fat is not. Although the harm caused by saturated fat was exactly what scientists tried to demonstrate! Yet however voluminous the careerscienceable body of dodgy evidence produced, randomised controlled trials meant to seal the deal backfired. Especially one huge trial. They defeated researcher’s magic sauce. This is not something that randomised controlled trials can just do routinely: see Scott’s Control Group again. So how big does a trial have to be to gain that crucial power? What other factors play into it?
People who understand better how science is practised would be better placed to characterise the kind of evidence that escapes careerscienceability. I wanted to mention one more candidate, though, particularly for the social sciences: natural experiments. They should take a lot of researcher’s magic sauce out of play. One can imagine how a single natural experiment of the right kind might already be too strong to be careerscienceable. We end up in a paradoxical situation, where the seemingly obstinate anti-science person, prone to seemingly ignoring huge numbers of studies, may suddenly concede the point when faced with just a single one, a strong randomised controlled trial or natural experiment. In the age of careerscienceability, quantity of scientific evidence is truly nothing against quality.
So, in sum, here is what science writers could do: find and expose clear instances of careerscienceability at work, and, most importantly, investigate and communicate the limits of careerscienceability.