Here’s a dirty little science secret: If you measure a large number of things about a small number of people, you are almost guaranteed to get a “statistically significant” result. Our study included 18 different measurements—weight, cholesterol, sodium, blood protein levels, sleep quality, well-being, etc.—from 15 people. (One subject was dropped.) That study design is a recipe for false positives.
Usually, I think of health studies as bad because they are non-experimental. But this is a way to scam experimental studies.
See also the Slate Star Codex commentary on the same study:
http://slatestarcodex.com/2015/05/30/that-chocolate-study
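Even with a large sample you still have the multiple comparisons problem. No matter how many subjects you have, each test of a true null effect still comes up “significant” about 5% of the time at the usual p < 0.05 threshold, so running 18 tests gives you 18 chances at a false positive.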
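To put a rough number on that, here is a minimal sketch of the chance of at least one false positive across 18 tests. It assumes the tests are independent and that every null hypothesis is true; real measurements such as weight and cholesterol are correlated, so treat the result as a ballpark figure rather than the study's actual error rate. The function name `familywise_error_rate` is made up for illustration.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among num_tests tests.

    Assumes independent tests and that every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # 18 measurements, as in the chocolate study quoted above.
    print(f"{familywise_error_rate(18):.3f}")  # prints 0.603
```

Note that sample size does not appear anywhere in the formula, which is the point: more subjects do not reduce the roughly 60% chance of finding something "significant" among 18 null measurements.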
What a small sample does is exacerbate the statistical significance filter: the false positives you do get will look like really big effects.
http://andrewgelman.com/2011/09/10/the-statistical-significance-filter/
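The filter can be seen in a small simulation. This is only a sketch under simplifying assumptions that are not from the post: two groups with a known standard deviation of 1, a small true effect of 0.1, and a plain z-test at the 1.96 cutoff. The function name and all parameter values are hypothetical.

```python
import math
import random
import statistics


def mean_significant_effect(n_per_group, true_effect=0.1, n_sims=2000, seed=0):
    """Average |estimated effect| among simulated studies that reach p < 0.05.

    Assumes two groups with known unit variance and a two-sided z-test.
    """
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n_per_group)  # standard error of the difference in means
    significant = []
    for _ in range(n_sims):
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        estimate = statistics.fmean(treated) - statistics.fmean(control)
        if abs(estimate) / se > 1.96:  # keep only the "significant" studies
            significant.append(abs(estimate))
    return statistics.fmean(significant) if significant else float("nan")


if __name__ == "__main__":
    print("n = 8 per group:  ", mean_significant_effect(8))
    print("n = 200 per group:", mean_significant_effect(200))
```

With 8 subjects per group the standard error is 0.5, so any estimate that clears the significance bar has to be at least about 0.98, roughly ten times the true effect of 0.1. With 200 per group the bar drops to about 0.2, so the significant estimates are still exaggerated but far less so.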