The mismatch between rising greenhouse-gas emissions and not-rising temperatures is among the biggest puzzles in climate science just now. It does not mean global warming is a delusion. Flat though they are, temperatures in the first decade of the 21st century remain almost 1°C above their level in the first decade of the 20th. But the puzzle does need explaining.
On a separate but related topic, Noah Smith writes,
DSGE models are highly sensitive to their assumptions. Look at the difference in the results between the Braun et al. paper and the Fernandez-Villaverde et al. paper. Those are pretty similar models! And yet the small differences generate vastly different conclusions about the usefulness of fiscal policy. Now realize that every year, macroeconomists produce a vast number of different DSGE models. Which of this vast array are we to use? How are we to choose from the near-infinite menu of very similar models, when small changes in the (obviously unrealistic) assumptions of the models will probably lead to vastly different conclusions? Not to mention the fact that an honest use of the full nonlinear versions of these models (which seems only appropriate in a major economic upheaval) wouldn’t even give you definite conclusions, but instead would present you with a menu of multiple possible equilibria?
James Manzi’s _Uncontrolled_ pinpoints the problem in what he calls causal density. When there are many factors that have an impact on a system, statistical analysis yields unreliable results. Computer simulations give you exquisitely precise unreliable results. Those who run such simulations and call what they do “science” are deceiving themselves.
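As a toy illustration of what causal density does to statistics (my own sketch, not Manzi’s, assuming only numpy), consider a system where thirty entangled factors feed one outcome but only the first factor actually matters. Regression delivers coefficients computed to machine precision, yet the attributions shift substantially from one noise draw to the next:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 60, 30                               # 60 observations, 30 entangled factors
base = rng.normal(size=(n, 1))
X = base + 0.2 * rng.normal(size=(n, p))    # every factor shares a common driver
true_beta = np.zeros(p)
true_beta[0] = 1.0                          # only the first factor actually matters

for trial in range(3):
    y = X @ true_beta + rng.normal(size=n)  # same system, fresh noise
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(np.round(beta_hat[:5], 2))        # estimated "effects" of the first five factors
```

Each run prints tidy numbers; the tidiness is no guarantee that the causal story they suggest is stable.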
Cherry-picking from among many possible explanatory models is a problem, but I am more concerned with research that is influenced by author incentives, whether it is global warming and climate-change funding (for grants, positions, prestige, students…), monetary economics and the Fed, or fast-food “addiction” and nutrition-research funding.
One of the strange things about the _Economist_ article is that the “Falling off the scale” graph doesn’t indicate the publication date of the calculations underlying Hawkins’ “range of projections derived from 20 climate models.” From what I know of the history, I’m pretty sure the date is not before 1998, and from the quote from Hawkins, presumably it is before 2005. It would certainly be suggestive to a large fraction of the trained economists reading _The Economist_ if the divergence between model and observation coincided with the date(s) of publication. It might be suggestive to many of their less-specialized readers as well.
You write “when there are many factors that have an impact on a system, statistical analysis yields unreliable results.” I would say instead that statisticians have to be careful in order to get reliable results, or perhaps that statistical analysis can easily be made to yield unreliable results. Note that computer pattern-recognition problems — e.g. recognizing handwritten letters, or recognizing individual faces, or recognizing what language is used in a document — can involve very large numbers of factors, and can sometimes reach useful levels of reliability anyway. The statistical methods used to do this might look strange to people trained in classical statistical methods, but they are recognizably statistical methods, and for some problems they work fairly reliably.
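To make the pattern-recognition point concrete, here is a minimal sketch (assuming scikit-learn and its bundled handwritten-digit images, which stand in for the handwritten-letter example). Each image contributes 64 input factors, and the reliability claim rests on held-out data rather than on how well the model fits its training set:

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits = load_digits()                      # 8x8 images -> 64 "factors" per example
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=5000)     # regularized by default (L2 penalty)
clf.fit(X_train, y_train)

# Reliability is judged on examples the model never saw, not on the training fit.
print("held-out accuracy:", clf.score(X_test, y_test))
```

The regularization and the held-out test are the kind of care the paragraph above has in mind.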
You write “Those who run such simulations and call what they do ‘science’ are deceiving themselves.” There are interesting newfangled statistical methods which — for sufficiently elegant and clean models and sufficiently large, featureful datasets — could allow careful investigators to systematically avoid the problem of overfitting. (I’m thinking of methods involving concepts like Minimum Description Length or Vapnik–Chervonenkis dimension.) We can cook up tests which make self-deceit much harder. Are the IPCC-associated models sufficiently elegant and clean to pass such tests with current data? Not even close. Are their investigators appropriately careful? It does not seem so to me. But that doesn’t mean it absolutely can’t ever be done. It certainly can’t be done for all problems we care about, and it seems utterly impractical for useful-accuracy climate forecasts with anything like current technology. But that doesn’t mean it is never practical for any problem of interest. So I might grant you “typically those who run such simulations and call what they do ‘science’ have fooled themselves and/or are trying to fool us.” But I think this is essentially a problem of the sociology and economics of scientists and institutions, not some inevitable consequence of fundamental limitations on what is technically possible with models and statistics.
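One hedged sketch of such a test, using only numpy and a made-up toy dataset: score candidate models with a BIC-style criterion (an MDL-flavored penalty on parameter count), so that adding parameters stops being rewarded automatically. Nothing here is an IPCC procedure; it is just the shape of a test that makes overfitting harder to hide:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)  # noisy toy "observations"

def bic(degree):
    """Gaussian BIC: in-sample fit plus a penalty that grows with parameter count."""
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((y - np.polyval(coeffs, x)) ** 2)
    n, k = x.size, degree + 1
    return n * np.log(rss / n) + k * np.log(n)

for d in (1, 2, 3, 6, 9):
    print(f"degree {d}  BIC {bic(d):7.1f}")  # in-sample error keeps falling; BIC does not
```

Out-of-sample prediction on data withheld in advance is an even blunter version of the same test.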
To be less airily mathematical about my claim that politics is the real problem, not modeling: consider tide tables. Are they not derived from models? Do their predictions not work? (In my admittedly limited experience on the Oregon coast, they work well. They seem to have a good reputation for working well in many other places too.)
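For what it’s worth, the way a tide table is built is a nice specimen of a model that has its important factors under control: the tide at a port is represented as a sum of harmonic constituents whose frequencies are fixed by astronomy, with amplitudes and phases fitted to that port’s gauge record. A toy sketch follows; the constituent speeds are the standard ones, but the amplitudes and phases are placeholders I invented, not values for any real port:

```python
import math

# (name, speed in degrees per hour, amplitude in metres, phase in degrees)
# Speeds are standard astronomical values; amplitudes/phases are invented placeholders.
CONSTITUENTS = [
    ("M2", 28.9841042, 1.00, 0.0),    # principal lunar semidiurnal
    ("S2", 30.0000000, 0.30, 40.0),   # principal solar semidiurnal
    ("K1", 15.0410686, 0.40, 120.0),  # lunisolar diurnal
    ("O1", 13.9430356, 0.25, 200.0),  # principal lunar diurnal
]

def tide_height(hours_after_epoch, mean_level=0.0):
    """Predicted water level (m) as a sum of cosine constituents."""
    return mean_level + sum(
        amp * math.cos(math.radians(speed * hours_after_epoch - phase))
        for _name, speed, amp, phase in CONSTITUENTS
    )

for h in range(0, 25, 6):
    print(f"t = {h:2d} h   height = {tide_height(h):+.2f} m")
```

Real tables fit many more constituents to years of local gauge data, and their predictions get checked against the water every day, which is how they have earned their good reputation.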
It seems to me the appropriate conclusion is not that it’s important never to rely on models, but that it’s important to recognize, impartially, models for tractable problems where the model successfully gets the important factors under control. When charlatans and shills in climate, macroeconomics, public health, ecology, and other politically charged fields claim that they, too, have all the important factors under control, or indeed when anyone claims to have the important factors under control, we should respond with “see these hoops for sharp models with good predictive power? Jump your model through one of them and then we’ll talk.”
Or… on this first of April, it occurs to me that from my interested amateur’s perspective, such claims to have the important factors under control can often seem fundamentally unserious. (Ceteris paribus for effects caused by the money supply when tariffs and income tax rates rise above 50%, gold is confiscated at a discount, despite Constitutional guarantees of compensation, and then banned, and general lawlessness, insecurity, and arbitrary economic dirigisme move at least 10% of the way from 1920 USA toward 1920 Latin America? Solemnly and precisely hindcasting 20th-century climate despite big historical uncertainties in key model inputs, e.g. particulate emissions? Or, for that matter, analyzing high medical costs in the US without mentioning legal barriers to entry?) While ordinarily I’m a “savage indignation rends his heart” kind of guy, today maybe I should take those things in a different spirit, merrily celebrating highlights accumulated from humanity’s 24/7 deadpan performance of absurdist humor.
Several of the comments here provide interesting clarification and illustration of the original post.
The key point here is: “When there are many factors that have an impact on a system, statistical analysis yields unreliable results. Computer simulations give you exquisitely precise unreliable results.”
This is not, as Mr. Newman makes clear, a general critique of either model-building or statistics. Both are unavoidable and potentially illuminating. The problem is rather meta-statistical and epistemological, since (1) many of the things we would like to understand are hard to define and even harder to measure — not just technically, but because the definitions and values are in fact contentious and subject to incommensurability problems; and (2) social science has strong internal habits and reward structures that favor constructing models and carrying out statistical analysis whether or not the relationship between the model and the real-world comprehendendum is well understood and rigorous, and whether or not the data supporting the statistical analysis are adequate to the claims that the analysis wants to make. (This is above and beyond the confirmation bias, the expectations bias, and the outright self-interest bias that Jack P. evokes, I think.)
Every discipline has a void at its core that everyone sort of understands but most practitioners don’t really like to talk about. This is as true of narrative explanation for historians and tropic models for lit-crit as it is of model-building and statistical analysis for social scientists. All of us, whatever our disciplinary predilections, should firmly face the problems that our particular discipline’s gaps raise (no matter how uncomfortable that is), but then get back to work in good faith, making the necessary exceptions and provisos and statements of limitation. The world is demonstrably better off if we leave the imponderables to the philosophers, after all.
However, once again reward structures and habitus raise their dirty heads: on the whole, with commendable exceptions, rewards in the intellectual world go to those who fall in love with their epistemology and treat their models as optimal and their data as unimpeachable, who damn the torpedoes and make bold statements. We like claims that are bold, clear, and unshadowed by doubt, even though all of our hard-won understanding, in all the disciplines, is permanently shadowed by doubt and often far less clear than we might wish.
So: kudos to those who acknowledge the problem, point out the tree in their own eye before snarking about the motes that those in other disciplines so obviously suffer from, and then go ahead and do their best, in good faith — which means both trying hard to create the best models, statistics, narratives, hermeneutics, or whatever, that they can, AND acknowledging, readily and graciously, when their models turn out to lack correspondence to reality, their statistics turn out to be mostly noise about mis-measurements, their narratives turn out to have been mistaken, and their hermeneutics don’t lead anywhere.