On candidate genes for predicting depression, the authors of this study report:
The implication of our study, therefore, is that previous positive main effect or interaction effect findings for these 18 candidate genes with respect to depression were false positives. Our results mirror those of well-powered investigations of candidate gene hypotheses for other complex traits, including those of schizophrenia and white matter microstructure.
Read Scott Alexander’s narrative about their findings.
As I understand it, a bunch of old studies looked at one gene at a time in moderate samples and found significant effects. This study looks at many genes at the same time in very large samples and finds that no one gene has significant effects.
The results are not reported in a way that lets me see clearly what is happening, so the following is speculative:
1. It is possible that the prior reports of a significant association of a particular gene with greater incidence of depression are due to specification searches (trying out different “control” variables until you find a set that produces “significant” results).
2. It is possible that publication bias meant that although many attempts by other researchers to find “significant” results failed, those efforts were not reported.
3. These authors use a different, larger data sample, and perhaps in that sample the incidence of depression is measured with greater error than in the smaller samples used by previous investigators. A larger sample increases your chances of finding “significant” results, but measurement error reduces them. The authors are aware of the measurement-error issue, and they conduct an exercise intended to show that it could not be the main source of their failure to replicate other studies.
4. If I understand it correctly, previous studies each tended to focus on a small number of genes, perhaps just one. This study includes many genes at once. If my understanding is correct, then in this new study the authors are now controlling for many more factors.
Think of it this way. Suppose you do a study of cancer incidence, and you find that growing up in a poor neighborhood is associated with a higher cancer death rate. Then somebody comes along and does a study that includes all of the factors that could affect cancer incidence. This study finds that growing up in a poor neighborhood has no effect. A reason that this could happen is that once you control for, say, propensity to smoke, the neighborhood effect disappears.
In the case of depression, suppose that the true causal process is for 100 genes to influence depression together, so that a polygenic score explains, say, 20 percent of the variation in the incidence of depression across a population. Now you go back to an old study that just looks at one gene that happens to be relatively highly correlated with the polygenic score. That study could find a “significant” effect for the gene even if the gene contributes little or nothing on its own.
In finance, we say that a stock whose movements are highly correlated with those of the overall market is a high-beta stock. The fact that XYZ corporation’s share price is highly correlated with the S&P 500 does not mean that XYZ’s shares are what is causing the S&P to move. Similarly, a “high-beta” gene for depression would not signify causality, if instead a broad index of genes is what contributes to the underlying causal process.
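To make the “high-beta” gene idea concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption chosen for illustration, not something taken from the study: a standardized polygenic score that explains 20 percent of the variance in a continuous depression measure, a single gene that has a correlation of 0.6 with that score and no direct effect of its own, and the hypothetical function name simulate_high_beta_gene. It compares the gene's regression coefficient when the gene is the only regressor with its coefficient once the polygenic score is also included.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def simulate_high_beta_gene(n=20000, seed=0):
    """Return (gene-only coefficient, coefficient controlling for the score)."""
    rng = random.Random(seed)
    # Assumed: a standardized polygenic score standing in for the 100 genes.
    score = [rng.gauss(0, 1) for _ in range(n)]
    # Assumed: the single gene tracks the score (correlation 0.6), nothing more.
    gene = [0.6 * s + 0.8 * rng.gauss(0, 1) for s in score]
    # Assumed: the score explains 20 percent of the outcome variance;
    # the single gene has no direct effect at all.
    outcome = [0.2 ** 0.5 * s + 0.8 ** 0.5 * rng.gauss(0, 1) for s in score]

    # Old-style study: regress the outcome on the single gene alone.
    gene_only = cov(gene, outcome) / cov(gene, gene)

    # Regress the outcome on the gene and the score together
    # (closed-form OLS solution for two regressors).
    s_gg, s_ss, s_gs = cov(gene, gene), cov(score, score), cov(gene, score)
    s_gy, s_sy = cov(gene, outcome), cov(score, outcome)
    controlled = (s_ss * s_gy - s_gs * s_sy) / (s_gg * s_ss - s_gs ** 2)
    return gene_only, controlled


gene_only, controlled = simulate_high_beta_gene()
print(f"gene alone: {gene_only:.3f}   gene, controlling for score: {controlled:.3f}")
```

Under these assumptions the first coefficient should come out clearly positive and the second close to zero, which is the pattern that explanation 4 describes. It says nothing about whether that is what actually happened in the depression literature.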
Further comments:
(1) and (2) are fairly standard explanations for a failure to replicate. But Alexander points out that in this case it is not just one or two studies that fail to replicate, but hundreds. That would make this a very, very sobering example.
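A rough sketch of how (1) and (2) can manufacture a literature of positive findings: the Python below simulates small studies of a gene that truly has no effect and counts how many reach “significance” when the researcher tries one specification versus ten. The setup is entirely hypothetical. Each specification is modeled as a fresh, independent comparison of 50 carriers with 50 non-carriers, which is a simplification, since real specifications reuse the same data and so are correlated. The function names and the 1.96 cutoff for an approximate two-sided 5 percent test are likewise my assumptions.

```python
import random


def z_stat(group_a, group_b):
    """Approximate z statistic for a difference in means."""
    mean_a, mean_b = sum(group_a) / len(group_a), sum(group_b) / len(group_b)
    var_a = sum((x - mean_a) ** 2 for x in group_a) / (len(group_a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in group_b) / (len(group_b) - 1)
    std_err = (var_a / len(group_a) + var_b / len(group_b)) ** 0.5
    return (mean_a - mean_b) / std_err


def share_significant(n_studies, n_specs, n_per_group=50, seed=0):
    """Share of null studies with at least one 'significant' specification."""
    rng = random.Random(seed)
    hits = 0
    for study in range(n_studies):
        for spec in range(n_specs):
            # The gene has no true effect: both groups share one distribution.
            carriers = [rng.gauss(0, 1) for _ in range(n_per_group)]
            others = [rng.gauss(0, 1) for _ in range(n_per_group)]
            if abs(z_stat(carriers, others)) > 1.96:
                hits += 1
                break  # the researcher stops searching and writes it up
    return hits / n_studies


one_try = share_significant(n_studies=500, n_specs=1)
ten_tries = share_significant(n_studies=500, n_specs=10)
print(f"one specification: {one_try:.2f}   ten specifications: {ten_tries:.2f}")
```

With a single specification, roughly one null study in twenty should clear the bar. With ten independent tries the chance is about 1 - 0.95^10, or roughly 40 percent. If only the studies that clear the bar get published, the published record would show nothing but positive results, even though the gene does nothing.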
If (3) is the explanation (i.e., more measurement error in the new study), then the older studies may have merit, and it is the new study that would be misleading.
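The tug-of-war between sample size and measurement error can be sketched with a short simulation. This is only an illustration under assumptions of my own: a gene with a small true effect (0.1) on a continuous depression measure, extra noise added to the measured outcome, and a hypothetical function slope_t_stat that returns the t-statistic on the gene's coefficient. None of the numbers come from the study.

```python
import random


def slope_t_stat(n, noise_sd, effect=0.1, seed=0):
    """t-statistic on the gene's slope when the outcome is measured with noise."""
    rng = random.Random(seed)
    gene = [rng.gauss(0, 1) for _ in range(n)]
    # Measured outcome = true effect + ordinary variation + measurement error.
    measured = [effect * g + rng.gauss(0, 1) + rng.gauss(0, noise_sd) for g in gene]

    mean_g, mean_y = sum(gene) / n, sum(measured) / n
    s_xx = sum((g - mean_g) ** 2 for g in gene)
    s_xy = sum((g - mean_g) * (y - mean_y) for g, y in zip(gene, measured))
    slope = s_xy / s_xx

    # Residual variance and the standard error of the slope.
    rss = sum((y - mean_y - slope * (g - mean_g)) ** 2
              for g, y in zip(gene, measured))
    std_err = (rss / (n - 2) / s_xx) ** 0.5
    return slope / std_err


print("small sample, clean outcome:", round(slope_t_stat(n=500, noise_sd=0.0), 2))
print("large sample, clean outcome:", round(slope_t_stat(n=50000, noise_sd=0.0), 2))
print("large sample, noisy outcome:", round(slope_t_stat(n=50000, noise_sd=3.0), 2))
```

In this setup the expected t-statistic grows roughly with the square root of the sample size and shrinks in proportion to the total noise in the outcome. A sample 100 times larger can therefore absorb about a tenfold increase in noise before it loses its advantage. Whether the new study's measurement error is anywhere near that large is exactly what the authors' exercise is meant to address.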
If (4) is the explanation, then the “true” model of genes and depression is closer to a polygenic model. The single-gene results reflect correlation with other genes that influence the incidence of depression rather than direct causal effects.
If (4) is correct, then the “new” approach to genetic research, using large samples and looking at many genes at once, should be able to yield better predictions of the incidence of depression than the “old” single-gene, small-sample approach. But neither approach will yield useful information for treatment. The old approach gets you correlation without causation. The new approach results in a causal model that is too complex to be useful for treatment, because too many genes are involved and no one gene suggests any target for intervention.
I thank Russ Roberts for a discussion last week over lunch, without implicating him in any errors in my analysis.