Null hypothesis watch

Several readers spotted a story on the results of a Bill Gates initiative to improve teaching. The null hypothesis won.

https://www.washingtonpost.com/news/answer-sheet/wp/2018/06/29/bill-gates-spent-hundreds-of-millions-of-dollars-to-improve-teaching-new-report-says-it-was-a-bust/?noredirect=on&utm_term=.e24c304f9896

Applying the null hypothesis

Kevin Drum writes,

But it’s Los Angeles that’s shown the biggest progress across the board. White kids have improved by 21 points in math and 10 in reading. Black kids have improved by 20 points in math and 16 in reading. That’s the best progress among black kids among all five cities, and close to the best progress for white kids too.

I’m showing you progress from 2003 to 2017.

Thanks to a commenter for the pointer. Drum takes it as given that the improvement in test scores reflects some improvement in quality of teaching. But this runs counter to the Null Hypothesis, which is that educational interventions do not matter.

Applying the null hypothesis, I would bet that the difference in test scores between 2003 and 2017 is due to a quirk of some sort. Were “white” and “black” always defined the same way? Did the rules of who was “eligible” to take the tests change? Etc.

Sorry to be so cynical, but I would want to see more data before I would treat the improvement as real.
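To make the "quirk" possibility concrete, here is a minimal sketch of how a change in who takes the test could produce an apparent gain with no change in teaching. The subgroup sizes and scores are made up for illustration and are not taken from Drum's data or the Los Angeles results.

```python
# Hypothetical illustration: scores within each subgroup never change,
# yet the reported average rises because the mix of test-takers shifts.

def average_score(groups):
    """groups: list of (number_of_students, mean_score) pairs."""
    total_students = sum(n for n, _ in groups)
    total_points = sum(n * score for n, score in groups)
    return total_points / total_students

# Made-up numbers: (students tested, mean score) for two subgroups
# that are reported under the same label in both years.
tested_2003 = [(50, 250), (50, 270)]
tested_2017 = [(20, 250), (80, 270)]  # same subgroup scores, different mix

avg_2003 = average_score(tested_2003)
avg_2017 = average_score(tested_2017)
apparent_gain = avg_2017 - avg_2003

print(avg_2003, avg_2017, apparent_gain)
```

In this toy case the reported average rises by 6 points even though no subgroup's score moved. That is the sort of thing I would want ruled out before treating the improvement as real.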

De-politicize college?

Six essays can be found here. For example, Tom Lindsay writes,

On both constitutional and prudential grounds, what is required to depoliticize our schools are measures that reduce the federal role in higher education.

Think of higher education as a church, and Federal government involvement in higher education as joining church and state.

Debra Mashek, of the Heterodox Academy (Jonathan Haidt’s project), writes,

In a world as complex as ours, it is unlikely that any one person holds a full and accurate understanding of problems, much less solutions. Intellectual humility compels us to at least question the completeness of our understanding while curiosity compels us to seek out and to try to understand the views of others. Resilience, in turn, helps individuals depersonalize difference. Resilient individuals are well-practiced at questioning and reframing their initial reactions to critique and challenge, and finding ways to read people and their actions with generosity and compassion.

It sounds to me like we should raise the status of the Intellectual Dark Web and lower the status of politically active professors.

Metrics meet the Null Hypothesis

From a podcast with Russ Roberts and Jerry Muller:

what’s so striking when you read through a lot of this literature on pay-for-performance and standardized measurement combined with pay-for-performance is: How often the scholarly literature shows, in a variety of fields, that it doesn’t work. And yet, politicians, policy-makers, they don’t seem to get the message.

People who are determined to try central planning aren’t interested in theories or evidence that indicate that central planning does not solve the problem.

It occurs to me that among the many problems with metrics in health care or education is that often the best way to look good is to be very selective about your customer base. Schools with children of affluent, two-parent households will tend to look “good.” Doctors who see mostly healthy, conscientious patients will look “good.” And so on.
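A minimal sketch of the selection point, with made-up numbers: two hypothetical schools add exactly the same amount to every student (here, nothing), but one enrolls mostly students who start out ahead. The names `selective`, `open_enrollment`, and `SCHOOL_EFFECT` are my own illustrative assumptions, not anything from Muller's book.

```python
# Hypothetical illustration: identical schools, different intake.

def mean_outcome(intake, school_effect):
    """intake: list of (number_of_students, baseline_score) pairs.
    Each student's outcome is assumed to be baseline + school_effect."""
    total_students = sum(n for n, _ in intake)
    total_points = sum(n * (baseline + school_effect) for n, baseline in intake)
    return total_points / total_students

def mean_baseline(intake):
    """Average score the students would have with no school effect."""
    return mean_outcome(intake, 0)

SCHOOL_EFFECT = 0  # assumed: both schools contribute the same amount

# Made-up intake: (students, baseline score)
selective = [(90, 80), (10, 60)]        # mostly high-baseline students
open_enrollment = [(30, 80), (70, 60)]  # mostly lower-baseline students

raw_selective = mean_outcome(selective, SCHOOL_EFFECT)
raw_open = mean_outcome(open_enrollment, SCHOOL_EFFECT)

# Subtracting the baseline recovers what each school actually added.
added_selective = raw_selective - mean_baseline(selective)
added_open = raw_open - mean_baseline(open_enrollment)

print(raw_selective, raw_open)     # the raw metric
print(added_selective, added_open) # the contribution of the school
```

On the raw metric the selective school scores 78 against 66, while the amount each school adds is identical. A metric that rewards the raw number rewards choosing your customers.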

The whole interview is interesting. Also, people seemed to like my essay on Jerry’s book.

Mis-measurement and Mis-leadership

My latest Medium essay, called Mis-Leadership and Metrics. An excerpt:

Instead of holding teachers accountable to a centralized statistical office, I believe that teacher evaluation should be undertaken by peers, principals, and parents. Test scores can be unreliable indicators for many reasons. Parents can readily assess whether a teacher is working conscientiously and effectively.

I have only started reading Nassim Taleb’s latest book, Skin in the Game. But I gather that he would make the point that when it comes to children’s education, parents have real skin in the game. If I allow my child to try to learn from a bad teacher, I suffer from that. If a bureaucratic system fails to remove a bad teacher, the designer of the system does not suffer consequences.

And of course I mention Jerry Muller’s book.

A Null Hypothesis Exception?

Alex Tabarrok writes,

What if I told you that there is a method of education which significantly raises achievement, has been shown to work for students of a wide range of abilities, races, and socio-economic levels and has been shown to be superior to other methods of instruction in hundreds of tests? Well, the method is Direct Instruction

Many years ago, I ordered a book on Direct Instruction. Trust me, you would hate it if you were a teacher. And you might hate it as a student. So it is quite counterintuitive that it works. It is very focused on repetitive drills.

On the other hand, I remember a 6th-grade math teacher who liked to hand out arithmetic speed drills. I didn’t hate those. And maybe having really solid fundamentals is what is important.

Me vs. Steven Pinker

In an interview, Pinker says,

I’m skeptical about that we’re going to see enhancements of human nature by genetic engineering, nanotechnology, or neural implants (though these technologies may be used to mitigate disabilities, a different matter). We now know that there is no “gene for musical talent” that ambitious parents will implant into their unborn children—psychological traits are distributed across thousands of genes, each with a teensy effect, and many with deleterious side effects (such as a gene that makes you a bit smarter while increasing your chance of getting cancer). Also, people are risk-averse (sometimes pathologically so) when it comes to their children and when it comes to genetic engineering—they don’t accept genetically modified tomatoes, let alone babies.

Just before I read this, I posted the following on a private discussion forum:

For those of you who have read The Diamond Age, what feature of the future Stephenson depicts there do you find least plausible? I’ll nominate the Illustrated Primer. I bet that no educational technology that relies on communication with the student will ever prove as successful as the primer is portrayed. When it comes to achieving dramatic gains in cognitive skills, some form of biological intervention will prove workable sooner.

When I was in high school, SAT tutors were unheard of. The whole concept would have seemed distasteful. What parent would be so neurotic and competitive as to get their kid a tutor for the SATs? But once a few parents started doing it, other parents thought that they had to do it in order to keep up. Nowadays, I get the sense that any affluent parent who does not get their kid a tutor feels like they are handicapping their child. I’ve been predicting that in another generation, biological enhancement will go through a similar phase change–going from unthinkable to commonplace very quickly.

In your comments, please address substantive issues, leaving out your personal opinions of Pinker or me.

Caplan, Hanushek, and my own views on education

A reader asked me to comment on the debate between Bryan Caplan and Eric Hanushek on the extent to which education confers real skills or is merely a signal. I thought that the only point that Hanushek scored was when he produced data showing that the sheepskin effect is smaller than in other studies.

As a proponent of the Null Hypothesis, I am not the one to defend the human capital view. Where I differ from Bryan is that I am inclined to put even more weight on an ability-bias story, leaving less room for signaling. For example, my understanding is that the difference in earnings between people who are accepted to Ivy League schools and similar people who are not accepted ends up being pretty small. If it were mostly signaling, then losing out on the brand-name seal of approval should be more costly.

Russ Roberts and Bryan Caplan

One of my favorite podcast episodes, because Russ pushes back so hard and of course Bryan debates effectively. For example, Bryan says,

I would say if there is no designable test that can show that people learn something, then they haven’t learned it. You might say the test is bad, in which case I would say, ‘Fine. Design a better test, and then show it to me.’ But, if you want to say that people have been transformed but it’s a way that no one can actually show, no matter how hard they try, then I’m going to say, ‘No. That just sounds like wishful thinking.’

Later, Bryan says:

I’m weird in this way, in that when I read something that seems true to me, like I just feel this incredible, this weight on the world: ‘I must repent. I can’t keep living the way I used to live any more. I’ve got to go and incorporate this knowledge into my decisions, day after day. And, I’m a sinner if I don’t.’ But even that is such a weird response to a book. Most people read Tetlock’s Superforecasting and say, ‘Oh, yeah. So interesting. Some people are really great at this stuff. Yeah. Right.’ And then they go back and live their normal lives.

This is interesting. Maybe there is an “ability to learn” that reflects hyper-sensitivity to new information. And can formal education affect the degree of sensitivity to new information?

The case against education

Made by Ben Wilterdink.

unsurprisingly, one of the best ways to develop the soft skills necessary for labor market success comes in the form of entry level employment. A 2015 report from USAID concludes, “Theoretical literature suggests that adolescence and young adulthood are optimal times to develop and reinforce these skills.” Additionally, a growing body of evidence suggests that actually working, or at least being in a workplace environment, is a key indicator of successful soft skill development.

Read the whole thing. One implication is that young people probably would learn more if they spent less time in school and more time working at jobs.