I wrote this review of Uncontrolled, by Jim Manzi, for National Review, but I cannot find it online. Here is the version that was in my files.
Why are social sciences less scientific than natural sciences? And what does this imply about public policy?
To the first question, many people probably would answer, “Because social sciences involve human beings, and human beings sometimes do things that are not predictable.”
That answer turns out to be at best shallow and at worst off base. Moreover, the fact that human beings are not perfectly predictable has never stopped economists, sociologists, or political scientists from believing that they can contribute useful knowledge.
Jim Manzi’s book attempts to provide an answer that is both more rigorous and more helpful. Manzi, an entrepreneur and a contributing editor to NR, ends up making a case that social scientists would be better served by (cautiously) undertaking more experiments. Undertaking rigorous experiments is also Manzi’s message to policy makers.
The ideas in this book are important, and I think it belongs on the syllabus of graduate programs and high-level undergraduate programs in social science and public policy. It is unfortunate that Manzi probably does not have enough academic street cred to gain that sort of audience. For instance, even though he skewers famous studies by renowned Princeton political science professor Larry Bartels and renowned University of Chicago economist Steven Levitt, their position in the professional hierarchy probably makes them impregnable, particularly when attacked by someone from outside the academy.
The Problem of Causal Density
Manzi introduces a new and useful term to describe the problem of the social sciences: causal density. Causal density means that there are many factors that can affect the phenomena in which social scientists are interested. Think of all of the plausible causes of the First World War, the Great Depression, or the recent financial crisis.
Even when we are not dealing with single events, causal density is a problem. How can we sort out the causes of income inequality or differences in educational outcomes?
The problem of causal density also crops up in the physical sciences, notably biology. Even though there is strong evidence of heritability of diseases and other characteristics, the hopes of pinning these traits down to specific genes or sets of genes have faded. There is too much causal density.
For me, the paradigmatic case of causal density is macroeconomics, as typified by the question of how effective fiscal stimulus is in ameliorating a recession. We want to know whether, all other things equal, more government spending raises output and employment. However, history does not hold other things equal.
When experiments are not practical, we rely on observational data. Manzi points out that this worked in the case of establishing a link between smoking and lung cancer. He notes that in that context, the circumstances under which observational studies can demonstrate causality were spelled out by Austin Bradford Hill. Among the Hill criteria are strength of relationship, consistency of relationship, dosage-response relationship, plausibility, and coherence with other scientific findings.
The challenge with judging the effect of government deficits on economic performance is that the data that are available do not satisfy the Hill criteria. For example, one does not observe a consistently positive relationship between deficit spending and economic outcomes. Quite the contrary: perhaps because of reverse causality, large deficits are associated with weaker economic performance. Turning to other criteria, a positive relationship between deficit spending and economic outcomes is plausible and coherent for Keynesians, but not for economists who subscribe to classical theory. This debate has persisted ad nauseam.
Manzi argues that where controlled experiments are feasible (not in macroeconomics), they can provide a better, albeit imperfect, solution to the problem of causal density. For example, if one is testing a new pedagogical technique, one can randomly assign some students to be taught the old way and others to be taught using the new method. Many of the most trustworthy findings in social science have come from such experiments. There is a famous RAND study, now nearly three decades old, of health insurance policies with different deductibles. Also famous are the various experiments testing Milton Friedman's idea of a negative income tax as a tool to alleviate poverty.
Experiments and Tampering
I was once seated at a dinner table next to an official of the Department of Education involved in education research. I made an impassioned plea for more controlled experiments in education. The official responded by asking, “Would you want your child to be the subject of an experiment?”
At this, my jaw dropped, and I sputtered, "They do it to my children all the time! They constantly introduce curriculum changes, scheduling changes, and changes in teacher methods. They just don't bother to evaluate whether or not the changes work."
Statistical quality control guru W. Edwards Deming used the term “tampering” to describe this process of introducing changes without rigorously evaluating results. Tampering and experiments are two ways of disturbing the status quo. But only experiments are designed with the intent of producing reliable measurements of success or failure.
Like my dinner companion, most policy makers view experiments as at best costly and at worst immoral. Even though tampering is just as bad, if not worse, it somehow escapes such criticisms.
Manzi points out that most social experiments are too small and too limited in terms of initial conditions. Much is made of the Perry Preschool experiment, conducted in one location with fewer than 150 students. Manzi argues that best practice is to conduct multiple experiments in a variety of initial conditions.
Manzi offers a set of conclusions about experiments, which I will paraphrase as follows:
–in fields with high causal density, experimental methods are a significant tool for producing reliable results
–a single experiment is much less reliable than multiple, replicated experiments
–most new programs and policies fail to achieve their desired results. It would be better to discover this beforehand, using experiments. (Of course, to the extent that policy makers do not want to recognize failures, they will not want to conduct experiments.)
Looking at experimental results, Manzi (p. 202) notes a general finding that “programs that attempt to improve human behavior by raising skills or consciousness are even more likely to fail than those that change incentives and environment.” It is really hard to fix the flaws in human character.
A Case for Federalism?
Manzi argues that the value of experiments bolsters the case for federalism. States can be laboratories for what works in social policy.
I am not sure that this case is sound. In theory, if Washington were to approach social policy by conducting rigorous, controlled experiments in order to determine what works, that might be better, on Manzi’s own terms, than leaving the fifty states alone to engage in unsystematic tampering.
I think that the case for federalism is more subtle. Attempting to change the skills or consciousness of officials in order to influence them to conduct rigorous experiments as part of the policy process is unlikely to work. However, creating an environment in which incentives lead them to adopt experimental methods has a better chance of success. A more competitive political system, which a decentralized structure might provide, could create this sort of environment.
To sum up, this is a provocative book for people who are interested in how social science relates to public policy. I am confident that most of the people who read it will benefit from it. I am much less confident that most of the people who would benefit from it will read it. That reflects my pessimistic views of today’s intellectual culture, particularly in the academy.