Null Hypothesis Watch

Two papers that claim to reject it.

1. Michael Lovenheim and Alexander Willen write,

We see consistent evidence that 12 years of exposure to a collective bargaining law negatively impacts both cognitive and noncognitive scores among men. AFQT percentile declines by 10.2, a 20.9% effect relative to the mean.

See also the abstract, quoted by Tyler Cowen.
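As a quick consistency check on the numbers quoted from the first paper, the two figures together imply a mean AFQT percentile for the sample. The sketch below only rearranges the quoted figures. The variable names are mine, and the implied mean is back-of-the-envelope arithmetic, not a number taken from the paper.

```python
# Figures quoted above from Lovenheim and Willen.
afqt_decline_points = 10.2   # decline in AFQT percentile
relative_effect = 0.209      # the same decline as a share of the mean (20.9%)

# If decline = relative_effect * mean, then mean = decline / relative_effect.
implied_mean = afqt_decline_points / relative_effect

print(f"Implied mean AFQT percentile: {implied_mean:.1f}")  # prints 48.8
```

An implied mean just under the 50th percentile is at least plausible for a percentile score, so the two quoted numbers are consistent with each other.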

2. Michael Gilraine, Hugh Macartney, and Robert McMillan write,

California’s statewide class size reduction program of the late-1990s. . .caused marked reductions in local private school shares, consequent changes in public school demographics, and significant increases in local house prices — the latter indicative of the reform’s full impact. Second, using a generalization of the differencing approach, we provide credible estimates of the direct and indirect impacts of the reform on a common scale. These reveal a large pure class size effect of 0.11 SD (in terms of mathematics scores), and an even larger indirect effect of 0.16 SD via induced changes in school demographics. Further, we show that both effects persist positively, giving rise to an overall policy impact estimated to be 0.4 SD higher after four years of treatment (relative to none).

I am skeptical of both papers. I am not convinced that the methods used truly eliminate possible confounding factors. But I have not read either paper closely.

The Tyranny of Metrics

The book by Jerry Muller will be out shortly. It makes a strong case against the over-use of quantitative measures to fix compensation. Education is one example.
If you believe the null hypothesis, then compensating teachers based on outcomes only introduces randomness into their pay.
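
A minimal simulation of that claim, assuming the strongest form of the null hypothesis: every teacher has exactly zero effect, so a class's average score gain is pure sampling noise. The function names, the class size of 25, and the number of teachers are all hypothetical choices for illustration.

```python
import math
import random


def simulate_null_outcomes(n_teachers=2000, class_size=25, seed=0):
    """Return two years of class-average gains when no teacher has any effect."""
    rng = random.Random(seed)

    def class_average():
        # Each student's gain is noise with mean 0 and SD 1; the teacher adds nothing.
        return sum(rng.gauss(0.0, 1.0) for _ in range(class_size)) / class_size

    year1 = [class_average() for _ in range(n_teachers)]
    year2 = [class_average() for _ in range(n_teachers)]
    return year1, year2


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


year1, year2 = simulate_null_outcomes()
print(f"Year-to-year correlation of teacher 'results': {pearson(year1, year2):.3f}")
```

Under these assumptions the correlation comes out close to zero: a teacher's measured result in one year says nothing about the next, so a bonus tied to it behaves like a lottery. If teachers do differ in true effectiveness, the correlation would be positive and the argument weakens accordingly.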

Meanwhile, without referring to the book, in talking about health care and education, Megan McArdle writes,

So when we measure outputs, we are getting at best a very distorted picture of the value of the services provided. Modern industrial management is simply not designed for this sort of situation. If you feed human inputs into a machine system, you are quite likely to grind up the humans in the process.

Read the whole thing.

What I’m Reading

Matthew Walker, Why We Sleep. Too far off the topic of political economy to compete for a place on my list of best books of the year, but very strongly recommended.

Walker is highly opinionated. One of his opinions is that teenagers naturally stay up later and sleep later than children or adults. So high schools that start before 9:00 AM are causing harm. It would be interesting to see if changing the start time would be an intervention that overcomes the Null Hypothesis, including finding an effect that does not fade out after several years. Meanwhile, it’s another case for home schooling–you don’t have to make your teenager wake up prematurely to go to school.

Null Hypothesis Watch

The abstract of a paper by Cynthia (CC) DuBois and Diane Whitmore Schanzenbach says,

This paper examines the effect of a court-ordered hiring guidelines intended to increase the share of black teachers employed in a school district in Louisiana. We find that the court-ordered hiring policy significantly increased the share of teachers who are black in the district relative to the rest of the state, and to a matched synthetic control sample. The policy also increased the share of new teachers hired who are black, and decreased the student-teacher representation gap, defined as the difference in enrollment share black among students and teachers in a district. There were increases in the share of black teachers observed in both predominately white and predominately black schools in the district. The policy had no measurable impacts—either positive or negative—on district-level measures of student achievement.

Does tutoring work?

In 1984, Benjamin S. Bloom wrote,

Using the standard deviation (sigma) of the control (conventional) class, it was typically found that the average student under tutoring was about two standard deviations above the average of the control class (the average tutored student was above 98% of the students in the control class). The average student under mastery learning was about one standard deviation above the average of the control class (the average mastery learning student was above 84% of the students in the control class)
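
The percentages in that passage follow from the effect sizes if test scores in the control class are roughly normally distributed. The sketch below checks the conversion. The normality assumption and the function names are mine, not Bloom's.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def control_percentile(effect_size_sd: float) -> float:
    """Percent of the control class scoring below the average treated student."""
    return 100.0 * normal_cdf(effect_size_sd)


for label, effect in [("tutoring", 2.0), ("mastery learning", 1.0)]:
    print(f"{label}: above {control_percentile(effect):.1f}% of the control class")
# tutoring: above 97.7% of the control class
# mastery learning: above 84.1% of the control class
```

These round to the 98% and 84% in the quotation, so the percentages add no information beyond the effect sizes themselves.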

The reader who sent me the link to this paper asked whether it invalidates the Null Hypothesis. The researchers ran experiments rather than observational studies, so I will give them that. But

1. Usually, educational interventions have small effects. The effect of “mastery learning” of one standard deviation seems rather implausibly high, considering its definition.

Formative tests (the same tests used with the conventional group) are given for feedback followed by corrective procedures and parallel formative tests to determine the extent to which the students have mastered the subject matter.

That does not sound like something that would cause a one-standard deviation difference. My guess is that these findings would not replicate if they were undertaken by different researchers.

2. Even when interventions show large effects for a single subject in a single year, the effects tend to fade out. That is, if you examine the experimental group and the control group three years later, any difference has vanished. Even if these results replicate in the short term, they do not invalidate the Null Hypothesis if they suffer from fade-out.

3. We do not know what would be necessary to enable tutoring to scale. Bloom seems to believe that tutoring works by adapting to the needs of the student. If so, then my guess is that the process of matching tutoring style to student characteristics would be quite a challenge.

Public schools, private goods

Salim Furth writes,

we do not, in the suburbs, have a system of public schools. We have private, government-run schools. A public good is something available to all—non-excludable and non-rival in consumption, like clean air or a radio broadcast. But access to local school is eminently excludable: those who do not buy or rent a home in the right area cannot access it. And it is at least somewhat rivalrous in consumption, since crowding and peer effects play such a large role, at least in the perception of educational quality.

He says that universal school vouchers are a political non-starter.

The two most stable organizing principles of the political economy of the American family in the twenty-first century are that educational access is purchased with one’s home, and that established suburbs do not change their character.

Jordan Peterson and Jonathan Haidt

A long, wide-ranging conversation. At the end, Haidt predicts that there will be a split in the academic world. There will be a “University of Chicago model,” which underlines a commitment to truth and spurns indoctrination, and a “Brown University model” that does the opposite. He predicts that the market will reward Chicago and punish Brown.

I am not nearly so optimistic that the Chicago model will win out decisively.

1. I think that many high school students will prefer the Brown model.

2. I think that parents, who are the real consumers here, do not feel strongly about which model is used. What they care about is the school’s prestige and their ability to tell their friends that their child got into a top school. I do not think that Brown’s brand will decline much, if at all, in that regard.

I guess what I am saying is that I do not think that high school students or parents care all that much about the issue of truth-seeking vs. social-justice-seeking institutions of higher education. But suppose that they do care. Then some possible outcomes:

a. Brown attracts students oriented its way, and Chicago attracts students oriented its way. Over time, Chicago becomes predominantly conservative, and Brown becomes even more leftist.

b. Earlier in the dialogue, Peterson tosses out the data point that illiberal leftist students score relatively low on verbal intelligence. So perhaps the quality of the student body rises at Chicago and falls at Brown.

Two different problems in education

I liked the distinction drawn by a commenter.

distinguish two situations. One, the student wants to learn about something, because she finds it interesting or wants to use it for something. She wants to learn and wants the knowledge to stay with her.

In the other situation, the student is not inherently interested and/or does not expect to use the information. The student may well want to learn enough for long enough to pass but doesn’t mind if all that knowledge just decays away once the course (or the test!) is over.

Right. The first situation calls for providing content and feedback. The second situation calls for providing reward and punishment (assuming that we know what is best for the student). Note that the reward can be psychic reward, such as the student feeling closer affinity with a teacher that the student respects.

Peter Diamandis on reinventing education

He wrote,

I just returned from a week in China meeting with parents whose focus on kids’ education is extraordinary. One of the areas I found fascinating is how some of the most advanced parents are teaching their kids new languages: through games. On the tablet, the kids are allowed to play games, but only in French. A child’s desire to win fully engages them and drives their learning rapidly.

He also puts in a plug for the “illustrated primer” of Neal Stephenson’s The Diamond Age.

Read the whole thing. Plenty of interesting ideas, but keep in mind the null hypothesis.

Could Elite Colleges Expand?

In the course of the podcast with Russ Roberts, Tyler Cowen says,

I think that a Harvard/California could work. I believe normatively Harvard should do it. I see zero signs they are about to. It would mean a dilution of control, a lot of headaches, a lot of new legal issues. You know, some reputational risk. But you could increase the number of people getting into some version of Harvard by really quite a bit. And that would be a wonderful thing for the country. And the world.

This is during a long digression on whether elite colleges could expand by orders of magnitude.

Suppose Harvard set up branches around the country, thinking that it could use its brand name to expand to, say, 250,000 students. Think about how this would play out. Many more students could get into Harvard. Assuming that other elite schools did not expand, Harvard would become far easier to get into than Princeton or maybe even Maryland. So I think you ruin the Harvard brand.

It seems to me that this is an example in which value depends on scarcity.