Think of a model as a set of assumptions (do I owe Quine and Duhem a citation here?).
These assumptions produce a set of predictions. The “predictions” need not be about the future. They may be explanations of events or patterns of events observed in the past.
When a model’s predictions turn out to be true, the model should be rated highly if and only if it is very difficult to obtain those predictions from other models. In order to have the best chance of being rated highly, a model should make many predictions, at least some of which are contrary to predictions made by other models.
When a model’s predictions turn out to be false, the model should be rated highly if and only if the failure is interesting. What does that mean?
Suppose that we classify the assumptions in a model as either verified or doubtful. A verified assumption is an assumption that can be demonstrated to hold true in the situation in which a prediction is being made. A doubtful assumption is one that either cannot be verified or is even known to be false.
If a model makes false predictions, then, because the verified assumptions hold by definition, one or more of the doubtful assumptions must not hold. If there are many doubtful assumptions, then we do not know which one failed, and we learn almost nothing from the prediction failure. At the other extreme, if there is only one doubtful assumption, then a prediction failure is interesting: it tells us that this particular assumption does not hold, and that is something we can build on.
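To put a rough number on this, here is a small Python sketch that counts how many different combinations of failed assumptions are consistent with a single false prediction. It is only an illustration: the function name is made up, and every assumption in the second list except "two factors of production" is a hypothetical placeholder, not something taken from a particular model.

```python
from itertools import combinations

def candidate_explanations(doubtful):
    """Return every non-empty subset of doubtful assumptions.

    Each subset is one possible story about which assumptions failed.
    (Hypothetical helper, for illustration only.)
    """
    return [
        set(subset)
        for size in range(1, len(doubtful) + 1)
        for subset in combinations(doubtful, size)
    ]

# One doubtful assumption: the failure points to exactly one culprit.
one = candidate_explanations(["two factors of production"])

# Four doubtful assumptions (the last three are invented placeholders).
many = candidate_explanations([
    "two factors of production",
    "perfect competition",
    "constant returns to scale",
    "rational expectations",
])

print(len(one))   # 1
print(len(many))  # 15
```

With k doubtful assumptions there are 2^k - 1 such stories, so each extra doubtful assumption roughly doubles the number of ways to explain away the failure.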
So you want to try to eliminate doubtful assumptions. One way to do this is to generalize a model. That is, if your model assumes that there are two factors of production, can you get the same result if there are many factors of production? If so, then you have eliminated the assumption of two factors of production as a doubtful assumption. Another way to eliminate doubtful assumptions is to demonstrate using data that the assumptions hold.
In short, we should rate a model highly when either
(a) almost all of its main predictions are true, and at least some of them differ from the predictions made by other models; or
(b) some important main predictions are false, but only a very limited set of doubtful assumptions could explain the false predictions.
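The two criteria can be written down as a simple decision rule. The Python sketch below is one possible reading, not a definitive scoring method: the function name, its inputs, and both cutoffs (what counts as "almost all" and as "very limited") are assumptions chosen for illustration.

```python
def rate_highly(share_true, distinctive, doubtful,
                mostly_true=0.9, few_doubtful=1):
    """Return True if the model passes criterion (a) or (b).

    share_true:  fraction of main predictions that turned out true
    distinctive: number of true predictions that rival models do not make
    doubtful:    number of doubtful assumptions in the model
    The two cutoffs are illustrative assumptions, not values from the text.
    """
    if share_true >= mostly_true:
        # Criterion (a): success counts only if it is distinctive.
        return distinctive >= 1
    # Criterion (b): failure counts only if the culprit is easy to isolate.
    return doubtful <= few_doubtful

print(rate_highly(0.95, distinctive=2, doubtful=5))  # True
print(rate_highly(0.95, distinctive=0, doubtful=0))  # False
print(rate_highly(0.50, distinctive=0, doubtful=1))  # True
print(rate_highly(0.50, distinctive=3, doubtful=6))  # False
```

Note that a model whose predictions all come true can still fail the test if every rival model makes the same predictions.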
My guess is that very few economic models could be rated highly by these criteria.