Start with what I said in my review of Robert Plomin’s Blueprint.
Plomin is excited by polygenic scores, a recent development in genetic studies. Researchers use large databases of DNA-sequenced individuals to identify combinations of hundreds of genes that correlate with traits.
The most predictive polygenic score so far is for height, which explains 17 percent of the variance in adult height… height at birth scarcely predicts adult height. The predictive power of polygenic scores is greater than that of any other predictor, even the height of the individuals’ parents.
One can view this 17 percent figure either as encouraging or not. It represents progress over attempts to find one or two genes that predict height, an effort that is futile. But compared to the 80 percent heritability of height it seems weak.
Plomin is optimistic that with larger sample sizes better polygenic scores will be found, but I am skeptical.
My question, to which I do not have the answer, is this: if height is 80 percent heritable, why does the best polygenic score explain only 17 percent of the variance in height?
I do not know any biology. But as a statistician, here is how I would go about developing a polygenic score.
1. I would work with one gender at a time. Assume we have a sample of 100,000 adults of one gender, with measurements of height and DNA sequences. I would throw out the middle 80,000 and just work with the top and bottom deciles.
2. For every gene, count the number of people in the top decile with that gene and the number in the bottom decile with that gene, and see where the differences are the greatest. If 8,500 in the top decile have a particular gene and 1,200 in the bottom decile have the gene, that is a huge difference. 7,500 and 7,200 would be a small difference. Take the 100 largest differences and build a score that is a weighted average of the presence of those genes.
3. To try to improve the score, see whether adding the gene with the 101st largest difference improves predictive power. My guess is that it won’t.
4. Also to try to improve the score, see whether adding two-gene interactions helps the score. That is, does having gene 1 and gene 2 make a difference other than what you would expect from having each of those genes separately? My guess is that some of these two-gene interactions will prove significant, but not many.
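To make steps 1 and 2 concrete, here is a minimal sketch in Python on made-up data. Everything in it is an assumption for illustration, not a description of how geneticists actually build polygenic scores: the function names (`simulate`, `decile_weights`, `score`, `r_squared`) are hypothetical, each "gene" is a simple present/absent flag, the first 20 of 200 genes each add 1 cm, and the rest is random noise. I weight each selected gene by its difference in frequency between the two deciles, which is one possible reading of "weighted average," and I check the score on a second simulated sample, a step the list above does not mention. Steps 3 and 4 are not implemented.

```python
import random

def simulate(n_people, n_genes=200, n_causal=20, seed=0):
    """Toy data (hypothetical): each gene is present with probability 0.5;
    the first n_causal genes each add 1 cm; the rest of height is noise."""
    rng = random.Random(seed)
    genes = [[rng.random() < 0.5 for _ in range(n_genes)] for _ in range(n_people)]
    heights = [170.0 + sum(row[:n_causal]) + rng.gauss(0.0, 3.0) for row in genes]
    return genes, heights

def decile_weights(genes, heights, top_k=100):
    """Steps 1-2: keep the top and bottom height deciles, then weight the
    top_k genes by their difference in frequency between the two deciles."""
    order = sorted(range(len(heights)), key=lambda i: heights[i])
    k = len(order) // 10
    bottom, top = order[:k], order[-k:]
    diffs = []
    for g in range(len(genes[0])):
        in_top = sum(genes[i][g] for i in top)
        in_bottom = sum(genes[i][g] for i in bottom)
        diffs.append((in_top - in_bottom) / k)
    chosen = sorted(range(len(diffs)), key=lambda g: abs(diffs[g]), reverse=True)[:top_k]
    return {g: diffs[g] for g in chosen}

def score(row, weights):
    """Weighted sum over the selected genes this person carries. Dividing by
    a constant to make it an average would not change any correlation."""
    return sum(w for g, w in weights.items() if row[g])

def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. the share of variance explained."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

# Build the score on one simulated sample, then check it on a fresh one.
train_genes, train_heights = simulate(2000, seed=0)
test_genes, test_heights = simulate(2000, seed=1)
weights = decile_weights(train_genes, train_heights, top_k=20)
test_scores = [score(row, weights) for row in test_genes]
print("causal genes among the 20 selected:", sum(g < 20 for g in weights))
print("share of variance explained:", round(r_squared(test_scores, test_heights), 2))
```

In this toy world the genetic variance is 20 × 0.25 = 5 and the noise variance is 9, so no score can explain more than 5/14, or about 36 percent, of the variance. The sketch only shows the mechanics of the procedure under ideal conditions. It says nothing about whether real genomes behave this way.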
It seems to me that one should be able to extract most of the heritability from the data by doing this. But perhaps this approach is not truly applicable.
Another possibility is that heritability comes from factors other than DNA. Perhaps the reliance on twin studies to try to separate environmental factors from genetic factors is flawed, and the heritability of height comes in large part from environmental factors. Or perhaps DNA is not the only biological force affecting heritability, and we need to start looking for that other force.
Another possibility is that scientists are working with sample sizes much smaller than the 100,000 I assumed. If you have a sample of one thousand, then the top decile has only one hundred cases in it, and that is not enough to pick out the important DNA differences.
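A rough way to see the sample-size point is to ask how many standard errors separate the two deciles for a gene with a modest difference, such as the 7500 versus 7200 example above (75 percent versus 72 percent). The sketch below uses the usual unpooled standard error for a difference between two proportions. The helper name `z_for_difference` is my own, and treating the two deciles as independent random samples is a simplifying assumption.

```python
import math

def z_for_difference(p_top, p_bottom, decile_size):
    """Approximate z-statistic for the gap between two proportions,
    assuming the two deciles are independent random samples."""
    variance = (p_top * (1 - p_top) + p_bottom * (1 - p_bottom)) / decile_size
    return (p_top - p_bottom) / math.sqrt(variance)

for sample_size in (1_000, 100_000):
    decile_size = sample_size // 10
    z = z_for_difference(0.75, 0.72, decile_size)
    print(f"sample of {sample_size}: z is about {z:.1f}")
```

With 100 people per decile the gap is roughly half a standard error, which is indistinguishable from noise. With 10,000 per decile the same gap is close to five standard errors. The statistic grows with the square root of the sample size, so a hundredfold larger sample gives a tenfold larger z. Because many genes are being screened at once, the bar for calling any single difference real would need to be higher still.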
As a related possibility, the effective sample sizes might be small, because of a lot of duplication. Suppose that the top decile in your sample had mostly Scandinavians, and the bottom decile had mostly Mexicans. Your score will be good at separating Scandinavians from Mexicans, but it will be of little use in predicting heights within a group of Russians or Greeks or Kenyans or Scots.
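The duplication worry can be illustrated with a deliberately extreme toy simulation. In it, no gene affects height at all. Two groups simply differ in average height and in how common 50 "marker" genes are. All names and numbers here (`make_group`, `fit_weights`, `r_squared`, the frequencies, the mean heights) are hypothetical choices for illustration, and the score is built with the same decile-contrast idea described in the numbered steps.

```python
import random

def make_group(rng, n, marker_freq, mean_height, n_markers=50, n_other=50):
    """Hypothetical group: NO gene affects height. Marker genes differ in
    frequency between groups; the other genes are 50/50 everywhere."""
    people = []
    for _ in range(n):
        genes = [rng.random() < marker_freq for _ in range(n_markers)]
        genes += [rng.random() < 0.5 for _ in range(n_other)]
        people.append((genes, rng.gauss(mean_height, 5.0)))
    return people

def fit_weights(people, top_k=50):
    """Weight the top_k genes by their frequency gap between the tallest
    and shortest deciles of the pooled sample."""
    ranked = sorted(people, key=lambda p: p[1])
    k = len(ranked) // 10
    bottom, top = ranked[:k], ranked[-k:]
    n_genes = len(ranked[0][0])
    diffs = [(sum(p[0][g] for p in top) - sum(p[0][g] for p in bottom)) / k
             for g in range(n_genes)]
    chosen = sorted(range(n_genes), key=lambda g: abs(diffs[g]), reverse=True)[:top_k]
    return {g: diffs[g] for g in chosen}

def r_squared(people, weights):
    """Share of height variance explained by the score within `people`."""
    xs = [sum(w for g, w in weights.items() if genes[g]) for genes, _ in people]
    ys = [height for _, height in people]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

rng = random.Random(1)
tall_group = make_group(rng, 1000, marker_freq=0.8, mean_height=180.0)
short_group = make_group(rng, 1000, marker_freq=0.2, mean_height=165.0)
third_group = make_group(rng, 1000, marker_freq=0.5, mean_height=172.0)

weights = fit_weights(tall_group + short_group)
print("pooled sample:", round(r_squared(tall_group + short_group, weights), 2))
print("third group:  ", round(r_squared(third_group, weights), 2))
```

In the pooled sample the score appears to explain a large share of the variance, because it has learned to tell the two groups apart. Within the third group, where the marker genes carry no information about group membership, it should explain essentially nothing. Real data would be far less clean than this, but the sketch shows how a score can look predictive without capturing any causal genetic effect.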
I am just throwing out wild guesses about why polygenic scores do not work very well. I probably misunderstand the problem. I wish that someone could explain it to me.