Models, Assumptions, and Predictions

Think of a model as a set of assumptions (do I owe Quine and Duhem a citation here?).

These assumptions produce a set of predictions. The “predictions” need not be about the future. They may be explanations of events or patterns of events observed in the past.

When a model’s predictions turn out to be true, the model should be rated highly if and only if it is very difficult to obtain those predictions from other models. In order to have the best chance of being rated highly, a model should make many predictions, at least some of which are contrary to predictions made by other models.

When a model’s predictions turn out to be false, the model should be rated highly if and only if the failure is interesting. What does that mean?

Suppose that we classify the assumptions in a model as either verified or doubtful. A verified assumption is an assumption that can be demonstrated to hold true in the situation in which a prediction is being made. A doubtful assumption is one that either cannot be verified or is even known to be false.

If a model makes false predictions, then one or more of the doubtful assumptions do not hold. If there are many doubtful assumptions, then we do not know which one, and we learn almost nothing from the prediction failure. At the other extreme, if there is only one doubtful assumption, then a prediction failure is interesting in the sense that it tells us that this doubtful assumption does not hold, which has important consequences.
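To make the logic concrete, here is a minimal sketch in Python. The assumption labels are hypothetical placeholders, not drawn from any actual model; the point is only that the candidate culprits for a failed prediction are exactly the doubtful assumptions, so the fewer there are, the more a failure tells us.

```python
# Minimal sketch: a failure is informative only when few assumptions are doubtful.
# Assumption names below are hypothetical, for illustration only.

def suspects_after_failure(assumptions):
    """Given {assumption: 'verified' or 'doubtful'}, return the assumptions
    that could explain a false prediction: only the doubtful ones."""
    return [name for name, status in assumptions.items() if status == "doubtful"]

# A model with many doubtful assumptions: failure teaches us little.
loose_model = {"rational agents": "doubtful",
               "perfect information": "doubtful",
               "two factors of production": "doubtful"}
print(suspects_after_failure(loose_model))   # three suspects: uninformative

# A model with one doubtful assumption: failure pins the blame.
tight_model = {"rational agents": "verified",
               "perfect information": "verified",
               "two factors of production": "doubtful"}
print(suspects_after_failure(tight_model))   # one suspect: an interesting failure
```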

So you want to try to eliminate doubtful assumptions. One way to do this is to generalize a model. That is, if your model assumes that there are two factors of production, can you get the same result if there are many factors of production? If so, then you have eliminated the assumption of two factors of production as a doubtful assumption. Another way to eliminate doubtful assumptions is to demonstrate using data that the assumptions hold.

In short, we should rate a model highly when

(a) if almost all of its main predictions are true, at least some of them differ from the predictions made by other models

(b) if some important main predictions are false, there is a very limited set of doubtful assumptions that might explain the false predictions

My guess is that very few economic models could be rated highly by these criteria.

Good Models Fail in Interesting Ways

Consider three monetarist models.

(1) If the Fed reduces the rate of growth of high-powered money relative to recent trend, this will slow the growth rate of nominal GDP.

(2) If the Fed raises the Fed Funds rate above the interest rate on medium-term bonds, this will slow the growth rate of nominal GDP.

(3) If the Fed lowers the expected future growth rate of nominal GDP, then this will slow the growth rate of nominal GDP.

Model (3) is less likely to fail, in the sense that lower expected future growth of nominal GDP is very likely to be correlated with lower growth in nominal GDP. However, when model (1) or model (2) fails, the result is interesting. Because model (3) is not likely to fail in an interesting way, it is, in my view, an inferior model.

Over the years, economists have developed many criteria for evaluating a model. I think that if you ponder the subject, and you review situations where we learn from models and where we do not, you might agree with me that a good model is a model that is capable of failing in interesting ways. Bad models are models where either (a) failure would be met with indifference, perhaps because we already know that several key assumptions are implausible, or (b) "success" is so heavily built into the model, as in (3) above, that there is nothing to be learned from confronting the model with evidence.

Thoughts on the Use of Models in Economics

The term “model” can mean many things.

1. To an engineer, a model might be something like a flight simulator. It attempts to replicate actual conditions so well that it can be used to train a pilot who then moves on to operate in the real world. Economists sometimes treat their models this way. Think of models used to forecast the impact of tax changes, or think of Jonathan Gruber's model used to predict the impact of Obamacare on the health insurance market. In my view, the accuracy of these sorts of exercises is often far overstated.

2. In economics, models are often used for a different purpose. The economist writes down a model in order to demonstrate or clarify the connection between assumptions and conclusions. The typical result is a conclusion that states

All other things equal, if the assumptions of this model hold, then we will observe that when X happens, Y happens.

For example, X could stand for “a firm raises its price” and Y could stand for “the demand for its product goes down.”

3. Model failure is usually more interesting than model success.

Suppose that we observe a situation where X happens and Y happens. Does that confirm the model? Because there typically are other models that can explain such a pattern, we usually do not draw strong conclusions based on such evidence.

However, suppose that we observe a situation where X happens and Y does not happen. Does that refute the model? I would say that what it refutes is the prefatory clause “other things equal, if the assumptions of this model hold.” That is, we may conclude that other things were not equal or that the assumptions of the model do not hold.

In my book, I use the example of a college that raised its tuition and experienced a subsequent increase in applications for admission. I do not say that the law of demand fails to hold. Instead, I say that other things were not equal. In particular, the college also raised its level of financial aid. Thus, although its “list price” went up, the discounted price faced by many applicants went down.
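A back-of-the-envelope sketch, with numbers I am inventing purely for illustration, shows how the list price and the effective price can move in opposite directions:

```python
# Hypothetical numbers, for illustration only.
old_list_price, new_list_price = 40_000, 44_000   # tuition rises 10%
old_aid, new_aid = 10_000, 18_000                 # financial aid rises by more

old_net_price = old_list_price - old_aid          # 30,000
new_net_price = new_list_price - new_aid          # 26,000

# The price that the law of demand operates on actually fell,
# even though the list price rose: other things were not equal.
print(old_net_price, new_net_price)
```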

The prefatory clause in economic models makes it difficult to draw scientific conclusions from real-world observations. When X and Y occur as predicted, we cannot confirm any one model, because other models are consistent with the result. When they do not occur as predicted, we only know that the prefatory clause was violated: the assumptions of the model were not met, or other things were not equal.

Often, we cannot say anything very interesting about which assumptions were not met or which things were not equal. For example, the model of an aggregate production function is used to predict that differences in output per worker will be proportionate to differences in capital per worker. When this fails, there are many possible reasons: workers may differ in their human capital; physical capital may not be measured or aggregated correctly; output may not be measured or aggregated correctly; institutional differences may matter; and so on.

In fact, the primary use of the aggregate production function model is to examine its failure, which is called “the residual.” Economists place an interpretation on this residual, calling it “total factor productivity.” They interpret the rate of change in this residual over time as “productivity growth.” They interpret the change in the rate of change in this residual as “change in the trend rate of productivity growth.”
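As a sketch of how that residual is backed out in practice, suppose the aggregate production function takes the standard Cobb-Douglas form with capital share alpha. The functional form, the value of alpha, and the data below are assumptions I am supplying for illustration, not anything from the discussion above.

```python
import math

# Growth-accounting sketch under an assumed Cobb-Douglas technology:
#   Y = A * K**alpha * L**(1 - alpha)
# The residual A ("total factor productivity") is everything in output
# not explained by measured capital and labor.

alpha = 0.3  # assumed capital share

def tfp_residual(Y, K, L):
    """Back out A = Y / (K^alpha * L^(1 - alpha))."""
    return Y / (K ** alpha * L ** (1 - alpha))

# Hypothetical data for two years:
A0 = tfp_residual(Y=100.0, K=300.0, L=100.0)
A1 = tfp_residual(Y=106.0, K=309.0, L=101.0)

# "Productivity growth" is the rate of change of the residual over time.
print(math.log(A1 / A0))  # log growth of TFP between the two years
```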

Eli Lehrer on Trends in Job Mobility

He writes,

Overall, employment patterns have shifted — in the direction of increased employment by big firms and a declining role for small businesses and the self-employed. Since 1993, the earliest year for which there is comparable data, the percentage of workers employed by small firms (one to four employees) has fallen slowly, but fairly consistently, from 5.6% to a bit under 5%. Meanwhile, the percentage of the workforce employed by firms with 1,000 or more workers has risen from 35.6% to 39.2%. Average tenure with the same employer has also risen in recent years, going from 4.9 years in 2004 to 5.5 years in 2014. The percentage of workers over 25 who have been with their current employer for more than a decade has also risen consistently, from 30.6% in 2004 to 33.3% in 2014. The percentage of people who are self-employed has steadily and consistently declined over the past several decades, falling from a high of about 7.3% in 1991 to 5.3% in 2015.

I wonder how this breaks down by industry. I would bet that the market share of small businesses has been declining in medical care, restaurants, and general retail. I assume that small farms have continued a downward trend.

Robert Murphy on Mises and Economic Calculation

Murphy writes,

If a particular operation is unprofitable, that means that it absorbs resources that have a higher monetary value than the outputs it produces. In other words, everyone else in society outside of that operation thinks that its input resources are more valuable than its output goods (or services). This is feedback from everyone else telling the people running this operation: “You are reducing the value of economic resources available to the rest of us, so consider carefully what you have been doing. Is there a tweak you can make to your enterprise, so that you absorb fewer inputs and/or produce outputs that the rest of us value more highly?”

I often point out that government-promoted recycling operations tend to be unprofitable. For me, this creates a presumption that the value of the output in recycling is less than the cost of the inputs. In short, recycling wastes resources.
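With made-up numbers, the profit-and-loss test Murphy describes is simple arithmetic; the figures below are hypothetical, chosen only to show how a loss signals that inputs were valued more highly than outputs.

```python
# Hypothetical recycling operation; all numbers invented for illustration.
input_costs = {"labor": 70_000, "trucks_and_fuel": 40_000, "sorting": 20_000}
output_value = 90_000  # market value of the recovered materials

profit = output_value - sum(input_costs.values())
print(profit)  # -40,000

# A loss is the market's feedback that the operation destroyed value:
# the rest of society valued the inputs at 130,000 but the outputs at 90,000.
```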

Read the whole essay. This is the sort of analysis that ought to be stressed to first-year (and later) economics students.

Russ Roberts and Yuval Levin

The latest episode of EconTalk. Recommended. A snippet:

We don’t think enough about how unusually cohesive and consolidated America was coming out of the Second World War, after the experience of the Depression; but even more than that, half a century of industrialization, of mass media, of progressive politics left American life intensely cohesive and consolidated and focused on national unity, on solidarity above individual identity and individualism generally. And what’s happened since that time is the breakdown of that consolidated culture–the liberalization, we would say in a positive sense, or the breakdown in a negative sense–the culture has become much more fragmented

This reminds me a bit of Brink Lindsey’s The Age of Abundance.

Recall my review of Levin’s book.

Intent to Commit Gross Negligence

My superfluous and uncharitable reaction to the FBI recommendation against prosecuting Hillary Clinton:

How can “intent” ever be a standard determining whether to prosecute someone for gross negligence?

I am not a lawyer, but I think it is almost impossible for gross negligence to be considered intentional. If you include a standard of “intent,” then gross negligence is no longer a crime subject to prosecution. I’m not saying that’s a bad thing. Just weird.

Commenters: Please avoid sarcasm or generic bashing of Hillary or the FBI director or the system. Try to stick to the narrow point. How best can one reconcile a law against gross negligence with a prosecution standard that requires intent to break such a law?

My Essay on Martin Gurri

Probably the longest book review I have written. But Gurri’s book is probably the most important one I have read in the past couple of years. One snippet of the review:

The insiders’ advantages include institutional continuity, the ability to mobilize large resources, and experience at choosing strategy and tactics. The outsiders’ advantages include rapid evolution by trying many tactics and quickly discarding those that fail, the ability to use information that gets filtered out by insider institutional processes, and unshakable conviction in their cause.

A central point of Gurri’s analysis is that the Internet and social media have altered the balance between insiders and outsiders with respect to information. Fifty years ago, outsiders lacked access to a lot of the information that was available to insiders.