Think of a model as a set of assumptions (do I owe Quine and Duhem a citation here?).
These assumptions produce a set of predictions. The “predictions” need not be about the future. They may be explanations of events or patterns of events observed in the past.
When a model’s predictions turn out to be true, the model should be rated highly if and only if it is very difficult to obtain those predictions from other models. In order to have the best chance of being rated highly, a model should make many predictions, at least some of which are contrary to predictions made by other models.
When a model’s predictions turn out to be false, the model should be rated highly if and only if the failure is interesting. What does that mean?
Suppose that we classify the assumptions in a model as either verified or doubtful. A verified assumption is an assumption that can be demonstrated to hold true in the situation in which a prediction is being made. A doubtful assumption is one that either cannot be verified or is even known to be false.
If a model makes false predictions, then at least one of its assumptions does not hold, and since the verified assumptions do hold, the culprit must be among the doubtful ones. If there are many doubtful assumptions, then we do not know which one failed, and we learn almost nothing from the prediction failure. At the other extreme, if there is only one doubtful assumption, then a prediction failure is interesting in the sense that it tells us that this particular assumption does not hold, and that finding has important consequences.
So you want to try to eliminate doubtful assumptions. One way to do this is to generalize a model. That is, if your model assumes that there are two factors of production, can you get the same result if there are many factors of production? If so, then you have eliminated the assumption of two factors of production as a doubtful assumption. Another way to eliminate doubtful assumptions is to demonstrate using data that the assumptions hold.
In short, we should rate a model highly when
(a) almost all of its main predictions are true and at least some of them differ from the predictions made by other models, or
(b) some important main predictions are false, but only a very limited set of doubtful assumptions might explain the false predictions.
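For readers who think better in code, here is a rough sketch of the two criteria. Everything in it is my own illustration rather than an established method: the `Model` class, the `rate_highly` function, and the `max_doubtful` threshold are hypothetical names, "almost all predictions true" is simplified to "all predictions true", and a rival that is silent on a question is counted as differing from the model.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Model:
    """Hypothetical container: a model is its predictions plus its doubtful assumptions."""
    name: str
    predictions: dict[str, bool]  # question -> what the model predicts
    doubtful_assumptions: list[str] = field(default_factory=list)


def rate_highly(model: Model, rivals: list[Model],
                outcomes: dict[str, bool], max_doubtful: int = 1) -> bool:
    """Apply criteria (a) and (b); assumes `outcomes` covers every prediction."""
    failed = [q for q, p in model.predictions.items() if outcomes[q] != p]
    if failed:
        # Criterion (b): a failure is only informative when few
        # doubtful assumptions could be to blame.
        return len(model.doubtful_assumptions) <= max_doubtful
    # Criterion (a): true predictions count only if some rival
    # would not have made at least one of them.
    return any(
        rival.predictions.get(q) != p
        for q, p in model.predictions.items()
        for rival in rivals
    )


outcomes = {"wages_rise": True, "trade_falls": False}
bold = Model("bold", {"wages_rise": True, "trade_falls": False},
             ["two factors of production"])
vague = Model("vague", {"wages_rise": True, "trade_falls": True},
              ["assumption A", "assumption B", "assumption C"])

bold_rating = rate_highly(bold, [vague], outcomes)    # right, and unlike its rival
vague_rating = rate_highly(vague, [bold], outcomes)   # wrong, with too many suspects
```

In this toy case the `bold` model rates highly because its predictions are all true and one of them contradicts the rival, while the `vague` model does not, because its failure could be blamed on any of three doubtful assumptions.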
My guess is that very few economic models could be rated highly by these criteria.
Isn’t a model more interesting if it casts doubt on a _verified_ assumption? Examples of such theories are heliocentric theory and the explanation of the photoelectric effect.
I like this model of modeling 🙂 My only issue is that sometimes failures are due to unexpected factors, not just bad assumptions. I agree that very few economic models could be rated highly by these criteria. Where I think we differ is this: I believe that continuously applying your criteria to existing models produces either improved models or valuable insight, while you believe that economic models are inherently chaotic and that their application can only lead to misery.
You might be right. Even if you’re not, heavy skepticism of the early models is exactly what is required for long-term success.