In a Q&A, Jerry Muller writes,
My critique is of what I call “metric fixation.” The key components of metric fixation are the belief that it is possible and desirable to replace judgment, acquired by personal experience and talent, with numerical indicators of comparative performance based upon standardized data (metrics); that the best way to motivate people within organizations is by attaching rewards and penalties to their measured performance, rewards that are either monetary or reputational (college rankings, etc.); and that making the metrics public makes for greater professional “accountability” — as if only that which can be counted in some standardized way makes for professional probity. My book is about why this so often fails to have the desired effects and leads to unintended negative outcomes, which, after decades of experience, ought to be anticipated.
Read the whole thing. His book is The Tyranny of Metrics.
Sounding rather opposed is Bryan Caplan, who writes,
If you’re teaching something existing tests can’t detect, write a better test! But if you’re teaching something no conceivable test can detect, you probably aren’t teaching anything at all.
Bryan seems to be saying that everything one can learn is measurable in some way. Can you test for curiosity? For intellectual humility? For willingness to question one’s own beliefs?
Life is an IQ test of sorts.
Periodically we get stories about people of preternatural cunning or wiliness who were largely uneducated, or at best barely schooled:
1. Walter Laqueur describes somebody like that in his memoir; I don’t recall the details.
2. The Sicilian pentito Salvatore Contorno was described as someone of that sort: very streetwise, having survived assassination attempts that would have killed anyone else. It is claimed he could not speak standard Italian to save his life.
https://en.wikipedia.org/wiki/Salvatore_Contorno
3. Napoleon used to ask of officers ready for promotion toward generalship: “Is he lucky?”
= – = – =
A problem is that this skill, though it likely exists, perhaps cannot be taught, so you couldn’t instruct and then measure the learning.
I think it wouldn’t be too much of a creative challenge to figure out ways to test for those qualities. Monitoring one’s web browsing and reading habits could produce a rough measure of ‘curiosity’, for example.
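For instance, here is a minimal sketch of what such a proxy might look like, assuming browsing history has already been reduced to one topic label per page visit; the function name and the entropy-based score are illustrative assumptions, not an established instrument:

```python
import math
from collections import Counter

def curiosity_score(visited_topics):
    """Rough proxy for 'curiosity': Shannon entropy of the topics a
    person reads about. Higher entropy means more varied reading.
    `visited_topics` is a list of topic labels, one per page visit."""
    counts = Counter(visited_topics)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Illustrative data, not real measurements.
narrow_reader = ["sports"] * 90 + ["news"] * 10
broad_reader = ["sports", "news", "math", "history", "biology"] * 20

print(curiosity_score(narrow_reader))  # ~0.47 bits: low variety
print(curiosity_score(broad_reader))   # ~2.32 bits: high variety
```

Topic diversity is only one crude operationalization, of course; the point is that supposedly unmeasurable traits often admit rough proxies.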
But the real question is upon whom the burden of proof should be placed. Educators who charge tons of money, and who claim as justification that students leave in an improved condition because they learn certain things, should be able to show that students actually learn those things. They shouldn’t enjoy a presumption of effectiveness until proven impotent, free to say, “Oh, trust me, I can’t prove it to you, but I still know that my students leave my classes more curious about the world than they would have become without my class.”
Certainly they should not be able to “retreat into a briar patch of unmeasurables” and say that kind of thing after everything we can measure weighs toward the contrary conclusion. That’s getting into unfalsifiable territory and puts any critic in an impossible position.
I am in the Bryan Caplan camp, but I think perhaps the two views are not necessarily opposed to one another. The problem is that you are measuring human beings who, being quite clever, almost immediately find ways to game any system, and this applies both to those being measured and to those doing the measuring. It is a never-ending process.
Apropos of Bryan Caplan, I’m reminded of the epigraph of my book:
“Those who believe that what you cannot quantify does not exist also believe that what you can quantify, does.” (Aaron Haspel)
That misses Caplan’s point. Even when they resist the possibility of precise measurement, proponents of the benefits of education are still making claims with a quantitative character. They are confidently claiming that there is something like a spectrum of skill level, or of learned knowledge, or of commitment to intellectual values, and that what education does is raise the student’s fuzzy position on that spectrum to a higher value. Caplan says it doesn’t make intellectual sense to take such quantitative claims at face value without explaining how we can be sure they are true on some basis other than inscrutable intuition and assertions about subjective perceptions.
Consider statements about heat or temperature. The ability to quantify and measure temperature with any precision whatsoever is a fairly recent development. But beyond reliance on human thermoception, there were still indirect ways to test whether claims about temperature change were true. If someone said that polishing metal made it colder because it felt cooler to the touch, one could take pieces of polished and unpolished metal from the same warm place, put them in cups of snow, and see whether the amount of meltwater differed. Even though they could hardly quantify or measure temperature at all, even the ancients could observe the claim to be false. Does temperature even “exist” except as a metric, as a statistical property of countless local particles? It doesn’t, but that doesn’t matter with regard to evaluating the truth or falsity of these kinds of claims.
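As a rough back-of-the-envelope illustration of that indirect test (a sketch using textbook constants for copper and ice, not a reconstruction of any historical experiment): two metal pieces of equal mass and equal starting temperature carry the same heat, so they should melt the same amount of snow, polished or not.

```python
# Indirect test of a temperature claim: if polishing really cooled the
# metal, the polished piece would melt less snow. Constants are standard
# textbook values; the scenario is illustrative.
C_COPPER = 0.385   # specific heat of copper, J/(g*K)
L_FUSION = 334.0   # latent heat of fusion of ice, J/g

def meltwater_grams(metal_mass_g, metal_temp_c):
    """Grams of 0 C snow melted as the metal cools from metal_temp_c to 0 C."""
    heat_released = metal_mass_g * C_COPPER * metal_temp_c
    return heat_released / L_FUSION

# Polished and unpolished pieces from the same warm room: same mass,
# same starting temperature, hence identical meltwater (~2.9 g here).
print(meltwater_grams(100.0, 25.0))  # polished piece
print(meltwater_grams(100.0, 25.0))  # unpolished piece
```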
It’s fine to say, “I believe this to be true, even though I can’t prove it to you,” but that’s not the character of a scientific claim of empirical reality, or the kind of claim a skeptic should be pressured into accepting. Intellectual humility means that if you can’t show your argument and evidence to anyone else to persuade them of the validity of your claims, then you shouldn’t feel entitled to believe your confidence in the claim is based on anything but faith and intuition.
That hardly diminishes the force of Caplan’s argument. “People who argue against me concede that they have no good scientific reason to be confident in their claims, and that their opinions are only based on faith and intuition. They can’t prove it to me, because they can’t prove it to anybody, not even themselves. And that’s pretty much admitting defeat by the usual rules of these games, whether they are willing to accept that sad fact or decide to be bad losers. I’ll accept a rematch when they are able to produce some actual intellectual ammunition, but that burden’s rightly put on them.”
Jerry Muller’s interview excerpt and Bryan Caplan seem to be talking about very different things: the use of metrics in society versus tests created by the teacher of a class. Metrics in society have a multi-player game-theory aspect, and real-world constraints like cost. Look at magazine rankings of colleges, effectively funded by advertising: what is their incentive to spend a lot of money to get a “right” answer? Coming as a manager from a corporate environment, how do you test for performance against objectives across a variety of roles (and time)? It really does come down to judgment, but we have to “metricize” it to make HR happy.
Employers need to give the test, but they didn’t do the teaching. So employers have a huge problem: each has its own test, none of which matches the many teachers’ tests.
Games and tests are related.
Games have a metric (score) and are fundamentally non-lethal, so that unlike the ‘real thing’ there should be basically no cost to losing. As a result, you can practice. When the rewards of games exceed those of the ‘real world’ there are serious distortions, and they really stop being games. At worst, they become a Matrix-like artificial world with stakes as high as anything else.
Tests have a metric and an observer who rewards and punishes. Ideally, they are checks on processes like learning. The idea is that the property being tested applies to a high stakes situation – like flying a plane – and the outcome of the test allows for screening or correction. However, there is a tension between the test taker and test grader.
Teaching something that ‘can’t be tested’ is different from teaching something that can’t be observed. The gap is the duration/cost of the required test. Probably the best test of humility is giving two very different and similarly talented people a project that suits neither of them and watching them interact. However, that takes quite a bit of time and is hard to repeat. Also, long tests allow for the skill/trait to be learned during the test. Which might not be a bad thing. ‘Developmental assessments.’
“Bryan seems to be saying that everything one can learn is measurable in some way”
No, Bryan is saying that if you can’t measure something you can’t be confident that you have taught anything.
I thought that his thesis against education was that teachers testing students is not good enough.
This was my reading as well – if you claim to have taught something, then there should be a measurable test to display that learning. There may well be other things that can be learned, but they are not necessarily taught by anyone.