Null Hypothesis Watch

Scott Alexander takes a look at it.

In summary: teacher quality probably explains 10% of the variation in same-year test scores. A +1 SD better teacher might cause a +0.1 SD year-on-year improvement in test scores. This decays quickly with time and probably disappears entirely after four or five years, though there may also be small lingering effects. It’s hard to rule out the possibility that other factors, like endogenous sorting of students, or students’ genetic potential, contribute to this as an artifact, and most people agree that these sorts of scores combine some signal with a lot of noise. For some reason, even though teachers’ effects on test scores decay very quickly, studies have shown that they have significant impact on earning as much as 20 or 25 years later, so much so that kindergarten teacher quality can predict thousands of dollars of difference in adult income. This seemingly unbelievable finding has been replicated in quasi-experiments and even in real experiments and is difficult to banish. Since it does not happen through standardized test scores, the most likely explanation is that it involves non-cognitive factors like behavior. I really don’t know whether to believe this and right now I say 50-50 odds that this is a real effect or not – mostly based on low priors rather than on any weakness of the studies themselves. I don’t understand this field very well and place low confidence in anything I have to say about it.

The modesty he expresses at the end goes too far, in my opinion. I think the right way to put it is that no one should pretend to know very much, or to have great confidence, about how teaching affects outcomes, particularly in the long term.

I actually think that Alexander’s post is the best discussion of this issue that you will find. I have more confidence in what he has to say than I have in anything Raj Chetty’s groupies have to say.

For newcomers to this blog, the null hypothesis is this:

Take any educational intervention. Measure its effects.

1. The effects are likely to be zero in the year that the intervention is introduced.

2. If they are not zero in the year they are introduced, the effects are likely to fade out quickly.

3. If the effects do not fade out quickly, the results are not likely to be replicable using rigorous experimental methods.

4. If the effects are replicable, they are not likely to be replicable at scale.

In short, the null hypothesis is that educational interventions have no effect, if you study them carefully. This includes interventions involving trying to measure and reward teacher quality. On that point, I agree with the teachers’ unions that measures of teacher quality are mostly noise and not signal.

In other service businesses, we let the customer make a subjective judgment of quality. You do not pick your hair stylist or your doctor or your auto mechanic based on some distant economist’s regression analysis. The reason we don’t use customer judgment in education is that we let government, rather than the customer, pick the service provider.

9 thoughts on “Null Hypothesis Watch”

  1. Just thinking off the top of my head:

    We act like parents don’t know what is good for their kids in education, the way that kids don’t know what is good for them when planning meals. Many (most?) kids are not good at buying a meal with a fistful of dollars at the convenience store. They need to have their choice set restricted.

    Kids should have a well balanced meal–but many will buy ice cream and potato chips at the convenience store.

    The imperfect analogy: Students should have a “good” education–so, as with meal planning, someone else has to be in charge of the choice set for teachers, curriculum, etc. The school is like a dietician.

    The analogy is highly defective–the school is a local monopoly and tends to be highly influenced by producer interests.

    Also, we need to discuss “learning over time.” Some of us don’t eat ice cream for dinner because we feel better later if we eat something more balanced. All sugar makes me cranky the next day–it doesn’t happen with a salmon sandwich.

    What’s the equivalent in education? The data isn’t quite there. Everyone knows that suburban schools have better outcomes than “ghetto schools,” but there are too many confounding variables.

    One thing that even inattentive parents know about education is that people who went through Marine Boot Camp, or even Army Basic, and served for 4 years, seem to come out different. (Is this shown by statistics? Or is it a false belief?).

    The grand interventions that might matter seem to involve a restricted choice set, training, and discipline. The KIPP schools are starting to replicate it. The next step would be KIPP Schools with dorms. Or KIPP schools where you can’t leave for months at a time.

    (The Spartans didn’t certify teachers, as I recall)

    Even then there might be fade out. People do serve in the uniformed military and then later end up homeless from poor choices (not just PTSS). The drill instructor and CO don’t go with them into civilian life.

    Sorry to introduce about 9 variables at once. Still waking up.

  2. [Alexander’s modesty “goes to far” or “goes too far”?]

    Education — especially earlier education — inherently faces the third-party payer problem. Even in a voucher system, parents can usually only choose schools, rather than individual teachers. But even if the economies of scale from schools as institutions were ignored or removed, and parents could choose a teacher each year independently, it is hard for a parent to evaluate the quality of a teacher because almost all information would be either through one-off observations (which could then be “gamed”) or mediated through a self-interested child.

    Between the parental knowledge problem and the typically short time that a child is under the tutelage of any given teacher, it is no surprise that schools can be a useful agent. I tend to think school districts provide much less incremental benefit in that respect.

    • “it is hard for a parent to evaluate the quality of a teacher because almost all information would be either through one-off observations (which could then be “gamed”) or mediated through a self-interested child.”

      No, it really isn’t that hard, and that’s not how they go about it. Most of the parent information about teacher quality comes from other parents whose kids had the teacher before (or from their own experiences via an older sibling). In the schools my kids attended, officially parents had no role in selecting teachers but everyone knew that, unofficially, lobbying could be done. And most such efforts went not into trying to get that one teacher that everybody thought was great, but rather into trying to avoid the one teacher that everybody knew was terrible (but, of course, could not be fired). Such teachers weren’t numerous. But some were *really* bad, and it was well worth the effort to avoid them if at all possible.

      • “rather in trying to avoid the one teacher that everybody knew was terrible”

        Bingo

  3. “the most likely explanation is that it involves non-cognitive factors like behavior”

    Could have told you that from the fact that my Mom’s school in NYC was considered one of the best non-white public schools, because it was run by a wildly strict ex-nun principal who broke all sorts of the city’s regulations to enforce harsh discipline.

    You can’t change people’s IQ. You can’t make the dumb into doctors. You can make the dumb into slightly more conscientious people than otherwise though, if you are serious about reinforcing good habits. Instead of trying to prepare 90 IQ people for college, it’s best to teach them to merely follow orders.

    Most of the successful charter schools working with disadvantaged children follow this model. To liberals it all seems arbitrary and reactionary. School Uniforms? Harsh punishments for any deviation? All seems really racist to them.

    In reality though, Jamal is never going to college. You might be able to beat enough discipline into him to show up to his work shift on time, keep mind of his appearance and manners, maintain his household in a proper manner, etc. People who get used to following the rules when young already have the right habits for successful adulthood.

    I doubt it will make upper middle class preppies out of any of these kids, but it will move a few of them from the lumpenproletariat to the proletariat category.

    Or you could just be Alex and publish nonsense like this:
    http://marginalrevolution.com/marginalrevolution/2016/05/developmental-roots-conformity-bias.html

  4. I agree that the null hypothesis is what my instincts are telling me here. But I also think it’s important to notice when our best evidence is contradicting our instincts, and so far that seems to be true here.

    PS: I changed the first sentence of that paragraph around to make it clearer that the 10% and the 0.1 SD were referring to different things – maybe edit it on your version as well?

  5. Eric Greitens, Navy SEAL turned author, said something along the lines of

    “Education changes what you know.
    Training changes who you are.”

    Worth pondering, even if not quite right.

  6. Good teachers can make kids just like school more than they otherwise would.

  7. I remain a true believer in vouchers and more parental choice as an ed intervention (at scale) that is most likely to lead to better outcomes. Rep failure to have more voucher programs remains a big disappointment.

    What to do about below avg IQ folk? What is the optimal education for such folk? What is the optimal culture?

    Some folk I know score a low ‘3’ in military aptitude — only 1s & 2s got sent overseas (30 years ago, don’t know about now).
    With work, they can ‘get thru’ college. But most would be better off getting training to fix something mechanical (for guys) or to take care of others (for women).

    The reality of nursing, where men are welcomed, is that they don’t stay. It’s not that they can’t do the work, they don’t like the work. Similarly for most women who don’t like fixing things as much as men seem to, nor do most women like programming (writing code) as much as guys who try it.

    Culture and society should be organized around promoting unwritten rules and norms of behavior that are optimal for the third quartile, or the third and fourth quintiles. Successful religions have been successful partly because of their ability to help the below avg IQ folk follow rules of behavior that have better results.

    But there isn’t even such good agreement on what are better results — there should be clearer metrics, as well as alternative approaches that have been tried.
