This is Too Much

Jared Bernstein writes,

had R&R gone through the peer-review process, I’m fairly confident that a) the spreadsheet error would NOT have been found, but b) the paper would have been sent back to them for failing to provide even a cursory analysis of the possibility of reverse causality (slower growth leading to higher debt/GDP ratios vs. the R&R claim of the opposite). Re “a,” peer reviewers do not routinely replicate findings, though they should when possible (more work these days is with proprietary data sets which cannot legally be shared).

Pointer from Mark Thoma.

I have not been commenting on the Reinhart and Rogoff fracas. My view of empirical macroeconomics is that there are hardly any reliable findings, so I always brushed aside the notion that there is some adverse growth impact of having a debt-to-GDP ratio of 90 percent. But some people took it seriously. And now the left is howling that all of the austerity policies in the world are due to Reinhart and Rogoff, and they should be burned at the stake, or something.

But speaking of unreliable findings in empirical macroeconomics, this is the same Jared Bernstein who co-authored a memo for President Obama saying that the multiplier is 1.54, as if we know it with that precision. (I do not think we know with any precision that it even has a positive sign.) He has about as much right to complain about Reinhart and Rogoff as a crack-head has to complain about somebody who got drunk once.

And do read F. F. Wiley. (Pointer from Tyler Cowen.)

More on Schooling, Deschooling, and the Null Hypothesis

Four links.
1. A NYT article on computerized grading of essays. I highlight the response of the Luddites:

“My first and greatest objection to the research is that they did not have any valid statistical test comparing the software directly to human graders,” said Mr. Perelman, a retired director of writing and a current researcher at M.I.T.

He is among a group of educators who last month began circulating a petition opposing automated assessment software. The group, which calls itself Professionals Against Machine Scoring of Student Essays in High-Stakes Assessment, has collected nearly 2,000 signatures, including some from luminaries like Noam Chomsky.

“Let’s face the realities of automatic essay scoring,” the group’s statement reads in part. “Computers cannot ‘read.’ They cannot measure the essentials of effective written communication: accuracy, reasoning, adequacy of evidence, good sense, ethical stance, convincing argument, meaningful organization, clarity, and veracity, among others.”

Suppose, for the sake of argument, that the software does poorly now and can be fooled easily. My bet is that within five years there will be software that can pass a Turing test of the following sort.

a. Assign 100 essays to be graded by four humans and the computer.

b. Show the graded essays to professors, without telling them which set was computer-graded, and have them rank the five sets of essays in terms of how well they were graded.

c. See if the computer’s grading comes in higher than 5th.
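This scoring rule can be sketched in a few lines. The rankings below are hypothetical, purely to show how the test would be tallied: each professor ranks the five graded sets from 1 (best graded) to 5 (worst), and the computer passes if it does not come in dead last on average.

```python
def computer_passes(rankings, computer="computer", num_sets=5):
    """Pass the test if the professors' average rank for the
    computer-graded set is better than last place."""
    avg_rank = sum(r[computer] for r in rankings) / len(rankings)
    return avg_rank < num_sets

# Hypothetical rankings from three professors (1 = best graded, 5 = worst)
rankings = [
    {"human_a": 1, "human_b": 2, "computer": 3, "human_c": 4, "human_d": 5},
    {"human_b": 1, "computer": 2, "human_a": 3, "human_d": 4, "human_c": 5},
    {"human_a": 1, "human_b": 2, "human_c": 3, "computer": 4, "human_d": 5},
]
print(computer_passes(rankings))  # True: average rank 3.0, not last
```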

While we are waiting for this test, the NYT article points to a nice paper by Mark D. Shermis summarizing results of a comparison of various software essay-grading systems.

2. Isegoria points to Bloom’s 2-Sigma Problem,

The two-sigma part refers to average performance of ordinary students going up by two standard deviations when they received one-to-one tutoring and worked on material until they mastered it, and the problem part refers to the fact that such tutoring doesn’t come cheap.

I am skeptical. It is possible that this educational intervention is so radically different from anything else that has ever been tried that it works much better than other interventions. But I would bet that if another set of researchers were to attempt to replicate this study, they would fail to find similar results. In social science in general, we do too little replication. This is particularly important when someone claims to have made a striking finding.
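For a sense of scale: if scores are normally distributed, a two-standard-deviation gain would move a formerly average student to roughly the 98th percentile, which can be checked with the standard normal CDF.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# An average student (z = 0) who gains two standard deviations
print(round(100 * normal_cdf(2.0), 1))  # 97.7 -- i.e., ~98th percentile
```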

3. In the comments on this post, I found this one particularly interesting and articulate:

I think K-12 public schools are about warehousing children, giving parents childcare, whether they are at work or simply want a break from being around their kids (the quality of parenting going on is incredibly wide-ranging).

…why the current system is still in place: Cost, Convenience, Comfortability and Childcare. Unfortunately, the one-size-fits-all approach is ineffective, makes young people passionately hate school (which breeds some serious anti-intellectual pathologies) and is becoming even more centralized in curriculum and control. (See Common Core curriculum adopted by 48 states.)

I think that the Childcare aspect deserves more notice. When President Obama supports universal pre-school, the “scientific” case is based almost entirely on taking kids out of homes of low-functioning parents. But what affluent parents hear is “Obama is going to pay for my child care,” and that is what makes the policy popular.

More generally, assume that as a parent you believe that your comparative advantage is to work, rather than spend the entire day with your child. Then ask yourself why as a parent you would prefer to have your child in school rather than home without supervision. Even if the child learns less at school than they would at home, you still might prefer the school, as long as you are convinced that it reduces the risk of your child getting into really bad trouble.

4. From Michael Strong, in a long comment pushing back on my post last week.

No one doubts that if one compares one group that receives significant practice in an activity against another group with no exposure to the activity at all, that a treatment effect exists.

Why then are so many people skeptical that interventions in education make a difference? Largely because the comparisons exist between idiotic variations within a government-dominated industry.

As a rejoinder, I might start by changing “receives significant practice” to “engages in significant practice.” “Learning a skill” and “engaging in significant practice” are so closely related that I would say that, to a first approximation, they are the same thing.

This leads me to the following restatement of the null hypothesis.

The null hypothesis is that when you attempt an educational intervention, such as a new teaching method, the overall economic value of the skills that an individual acquires from age 5 to 20 is not affected by that intervention. I will grant that if you take two equivalent groups of young people and give one group daily violin lessons and the other group daily clarinet lessons, then the first group is more likely to end up better violinists on average.

But when economists measure educational outcomes, they usually look at earnings, which result from the market value of skills acquired. To affect that, you have to affect the ability and willingness of a person to engage in practice in a combination of generally applicable fields and fields that are that person’s comparative advantage.

Aptitude and determination matter. Consider Malcolm Gladwell’s “10,000 hour rule” for becoming an expert at something. There is a huge selection bias going on in that rule. How many people who have little aptitude for shooting a basketball are going to keep practicing basketball for 10,000 hours?

When you consider how hard it is to move the needle half a standard deviation on a fourth-grade reading comprehension exam, the chances are slim that you are going to come up with something that affects long-term overall outcomes. Until we get the Young Lady’s Illustrated Primer.

The Minimum Wage Debate

Clearly, the debate left things unsettled, because the protagonists are still arguing. [update: Betsey Stevenson contributes noise to the debate.]

The point that I would like to hear shouted from the rooftops is that the minimum wage in the United States is just barely effective. We teach in freshman economics that a price floor is binding only if it is above the equilibrium price. But the number of workers covered by the minimum wage is small, and the vast majority of unemployed workers are not clamoring for minimum-wage jobs, so I would say that you should draw your labor supply and demand diagram with the minimum wage just epsilon above the market equilibrium. It should be hard to see the effect of the minimum wage on employment in your diagram, and it should be hard to see the effect in the real world.
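A toy linear supply-and-demand market makes the "epsilon above equilibrium" point concrete. The demand and supply curves here are invented for illustration; the equilibrium wage is 20, and employment is whichever side of the market is shorter at the binding price.

```python
def employment(wage_floor, p_star=20.0):
    """Employment in a linear market: demand Q = 100 - 2p, supply Q = 3p,
    equilibrium at p* = 20, Q* = 60. The floor binds only above p*."""
    p = max(wage_floor, p_star)   # floor has no effect below equilibrium
    demand = 100 - 2 * p
    supply = 3 * p
    return min(demand, supply)    # the short side of the market trades

print(employment(19.0))   # 60.0 -- floor below equilibrium, no effect
print(employment(20.1))   # just under 60: barely visible in a diagram
print(employment(25.0))   # 50.0 -- a floor well above equilibrium bites
```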

In that sense, Elizabeth Warren is right–if you really want to have an effective minimum wage, it needs to be a lot higher. That is what the debate should focus on, in my opinion. What would happen if we raised the minimum wage by $5 an hour or more? In that case, if somebody wants to try to argue that the effect on employment would be negligible, good luck to them.

Causal Density is a Bear

The Economist reports,

The mismatch between rising greenhouse-gas emissions and not-rising temperatures is among the biggest puzzles in climate science just now. It does not mean global warming is a delusion. Flat though they are, temperatures in the first decade of the 21st century remain almost 1°C above their level in the first decade of the 20th. But the puzzle does need explaining.

On a separate but related topic, Noah Smith writes,

DSGE models are highly sensitive to their assumptions. Look at the difference in the results between the Braun et al. paper and the Fernandez-Villaverde et al. paper. Those are pretty similar models! And yet the small differences generate vastly different conclusions about the usefulness of fiscal policy. Now realize that every year, macroeconomists produce a vast number of different DSGE models. Which of this vast array are we to use? How are we to choose from the near-infinite menu of very similar models, when small changes in the (obviously unrealistic) assumptions of the models will probably lead to vastly different conclusions? Not to mention the fact that an honest use of the full nonlinear versions of these models (which seems only appropriate in a major economic upheaval) wouldn’t even give you definite conclusions, but instead would present you with a menu of multiple possible equilibria?

James Manzi’s Uncontrolled pinpoints the problem, in what he calls causal density. When there are many factors that have an impact on a system, statistical analysis yields unreliable results. Computer simulations give you exquisitely precise unreliable results. Those who run such simulations and call what they do “science” are deceiving themselves.
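A toy simulation of causal density, with all variables and parameters invented for illustration: eight mutually correlated factors each drive the outcome equally, we regress on the one factor we happen to observe, and the estimate badly overstates that factor's own effect because it proxies for the seven factors we left out.

```python
import random

def sample(n=200, k=8, seed=42):
    """Outcome y depends equally on k correlated factors; we observe x0 only."""
    random.seed(seed)
    rows = []
    for _ in range(n):
        common = random.gauss(0, 1)                      # shared driver
        xs = [0.7 * common + random.gauss(0, 1) for _ in range(k)]
        y = sum(xs) + random.gauss(0, 1)                 # every factor matters
        rows.append((xs[0], y))
    return rows

def slope(rows):
    """Simple OLS slope of y on the single observed factor."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    cov = sum((x - mx) * (y - my) for x, y in rows)
    var = sum((x - mx) ** 2 for x, _ in rows)
    return cov / var

# The true marginal effect of x0 is 1, but the naive estimate comes out
# roughly three times larger, because x0 stands in for its correlated kin.
print(round(slope(sample()), 2))
```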

The Albert Hirschman Bio

It is The Worldly Philosopher, by Jeremy Adelman. Tyler Cowen offers praise although he is only partly into it. I finished it, which is something I am guessing few others will do, although I imagine a lot of people will make some sort of attempt. My reactions:

1. Reading about Hirschman reminded me of my father. 1920’s Berlin and 1920’s St. Louis were both cities where assimilated German Jews tried to distance themselves from the more backward/traditional Russian Jews, and neither my father nor Hirschman identified with traditional Judaism. Hirschman grew up in an assimilated family (although the family covered up his father’s Ostjuden background), whereas my father’s parents were blatantly Polish-Russian and therefore embarrassing to him. Both Hirschman and my father had sisters who were staunch Communists, and both had their career opportunities limited in the McCarthy era. Both Hirschman and my father specialized in Latin America. Both lacked mathematical tools and instead relied on a broad-based humanistic approach, including psychology and literature. My guess is that their paths would have crossed had my father not deserted research for administration just as Hirschman’s career took off. Both were skeptical of universal laws in social science. One of my father’s favorite sayings was what he called the First Iron Law of Social Science: sometimes it’s this way, and sometimes it’s that way.

2. Success is contingent. It is easy to imagine Hirschman with a Nobel Prize. It is also easy to imagine him never emerging from obscurity. On p. 447, Adelman quotes Gordon Tullock’s scathing (we would now say snarky) review of Exit, Voice, and Loyalty.

clearly there is room in the literature for a 155-page book on the responses of customers to declining efficiency on the part of their suppliers, and on the differences between changes in quality and changes in price. Unfortunately, this is not the book.

As Adelman points out, Hirschman (unlike Tullock, I might add) left behind few disciples, much less a complete “school.”

3. Adelman develops the theme that much of Hirschman’s appeal was literary. Like a highbrow novelist, he gave the reader a sense of pride in being able to appreciate his word plays and allusions.

4. Perhaps Hirschman’s most admirable achievement was the one he liked least to discuss: his involvement in the black market in Marseilles in 1940 to extricate prominent Jews from Hitler’s Europe.

Russ Roberts and Edward Leamer

I love this video, but that is because I agree so much with Leamer.

One thing I would point out about his charts is that he uses trend lines and implies that mean reversion is the norm. That is, for most of the postwar period, if you had a recession that took GDP below trend, you would then have above-trend growth. An alternative hypothesis is that real GDP follows a random walk with drift. That would mean that it always tends to grow at 3 percent, regardless of its recent behavior. The last three recessions seem to follow such a model.

In the 1980s, some folks, notably Charles Nelson and Charles Plosser, argued strenuously against mean reversion and in favor of the random walk with drift. Note that this is back when Leamer describes output and employment as mean-reverting. I wonder whether, as data get revised over long periods of time, random walks get turned into mean-reverting trends.
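A tiny simulation makes the distinction concrete. Treat log GDP as a linear trend plus a deviation that either decays back toward trend (mean reversion) or persists forever (random walk with drift); the model and parameters are purely illustrative, not either camp's specification.

```python
import random

def simulate(n=40, rho=0.8, drift=0.03, shock=-0.10, sigma=0.0, seed=1):
    """Log GDP = linear trend + deviation. With rho < 1 the deviation decays,
    so a recession is followed by above-trend growth; with rho = 1 it
    persists, and GDP just keeps growing at `drift` from its lower level."""
    random.seed(seed)
    u, path = shock, []
    for t in range(n):
        path.append(drift * t + u)
        u = rho * u + random.gauss(0.0, sigma)
    return path

trend_end = 0.03 * 39                   # trend value at the final period
gap_mr = simulate(rho=0.8)[-1] - trend_end   # mean-reverting: gap ~ 0
gap_rw = simulate(rho=1.0)[-1] - trend_end   # random walk: gap stays -0.10
```

With the noise turned off (`sigma=0`), the mean-reverting path closes essentially all of the initial 10 percent shortfall by the end, while the random-walk path remains exactly 10 percent below the old trend forever.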

Note Tyler Cowen’s comment on the latest employment report:

we are recovering OK from the AD crisis, but the structural problems in the labor market are getting worse. It’s becoming increasingly clear those structural problems were there all along and also that they are a big part of the real story. On the AD side, mean-reversion really is taking hold, as it should and as is predicted by most of the best neo-Keynesian models.

Quintile Mobility: Built-in Properties

Timothy Taylor writes,

For example, for all those born into the bottom quintile, 44% are still in that quintile as adults. About half as many, 22%, rise to the second quintile by adulthood. The percentages go down from there. … Similarly, those born into the top income quintile are relatively likely to remain in the top. Among children born into the top quintile, 47% are still there as adults. Only 7% fall to the bottom quintile. The experiences of those born into the middle three quintiles are quite different. The distribution among income quintiles as adults is much more even for those born in these three middle groups, suggesting significant mobility for these individuals. … This pattern has led researchers to conclude that the U.S. income distribution has a fairly mobile middle, but considerable “stickiness at the ends” …

This result is nearly an arithmetical certainty. Suppose that everyone faces three equally-probable outcomes:

–their income as adults puts them in the same quintile as their parents
–their income as adults rises enough to move up a quintile
–their income as adults falls enough (in relative terms) to move down a quintile

If this were the case, then people in the top would have a 2/3 chance of remaining at the top, because those who get lucky have nowhere to go but up within the top quintile. Similarly, people would have a 2/3 chance of remaining at the bottom, because those who get unlucky have nowhere to go but down within the same quintile. People in the middle quintiles would have only a 1/3 chance of remaining in their original quintile, because they can move in either direction. This pattern would lead researchers to conclude that the U.S. income distribution has a fairly mobile middle but considerable stickiness at the ends, even though by construction everyone in all quintiles has the same probability of moving up or down the income scale.
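The arithmetic can be checked directly. This sketch builds the five-by-five transition matrix implied by the three equally probable outcomes, with movement truncated at the top and bottom quintiles, and reads off the probability of staying put.

```python
from fractions import Fraction

third = Fraction(1, 3)
n = 5
# transition[i][j]: probability a child of quintile-i parents lands in j
transition = [[Fraction(0)] * n for _ in range(n)]
for i in range(n):
    transition[i][min(i + 1, n - 1)] += third  # move up (truncated at top)
    transition[i][i] += third                  # stay in the same quintile
    transition[i][max(i - 1, 0)] += third      # move down (truncated at bottom)

stay = [str(transition[i][i]) for i in range(n)]
print(stay)  # ['2/3', '1/3', '1/3', '1/3', '2/3']
```

The "stickiness at the ends" falls out of the truncation alone, even though every row was built from identical up/stay/down probabilities.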

Ronald Coase on the State of Economics

He writes,

In the 20th century, economics consolidated as a profession; economists could afford to write exclusively for one another. At the same time, the field experienced a paradigm shift, gradually identifying itself as a theoretical approach of economization and giving up the real-world economy as its subject matter. Today, production is marginalized in economics, and the paradigmatic question is a rather static one of resource allocation. The tools used by economists to analyze business firms are too abstract and speculative to offer any guidance to entrepreneurs and managers in their constant struggle to bring novel products to consumers at low cost.

Pointer from Jason Collins.

As a counter-example, I offer Information Rules, by Varian and Shapiro. And note that Varian was hired by Google and has played a significant role there. Coase later writes,

Today, a modern market economy with its ever-finer division of labor depends on a constantly expanding network of trade. It requires an intricate web of social institutions to coordinate the working of markets and firms across various boundaries. At a time when the modern economy is becoming increasingly institutions-intensive, the reduction of economics to price theory is troubling enough. It is suicidal for the field to slide into a hard science of choice, ignoring the influences of society, history, culture, and politics on the working of the economy.

I really like the phrase “increasingly institutions-intensive.”

John Papola on the State of Economics

From a post on Facebook:

More than ever I am convinced that the professional study of economics peaked in the classical liberal era with John Stuart Mill and rapidly became a mostly destructive force in society thereafter. What was once an area of inquiry dedicated to counteracting intuitive-yet-bad ideas has overwhelmingly devolved into a pseudo-scientific industry of navel-gazing in support for some of the worst fallacies these classical thinkers devoted their lives to refuting, such as utterly absurd claims like “consumption increases output”.

My current theory for why this has occurred is that the classical thinkers were a diverse group with many of them devoting their lives to actual value creation for other people through commerce and most of them multi-disciplinary polymaths. They had an integrated view of the world that was rich and realistic. All of that has been sandblasted into oblivion. This seems to be the result of the mathematization and hyper faux-specialization of the “economics profession”, combined with the fact that most employment opportunities for economists are in academia (where the real world is irrelevant) or government directly (same problem, worse incentives).

If you want to glimpse economics from the inside, read Miles Kimball on three goals for Ph.D. courses.

My instinct is that the profession is not as bad as Papola portrays it, although I have sympathy with where he is coming from.

1. I see strong gains in economic history, understanding the causes and consequences of economic growth, finance theory, mechanism design, game theory, and other areas. Yes, there is over-hyping, but I believe that the progress is genuine.

2. In academic economics, the emphasis is on “tools,” meaning mathematical techniques. This crowds out thinking either about deep philosophical issues or the real world. I wish that philosophical rigor were emphasized as much as mathematical rigor. In terms of the real world, what troubles me most about economists’ mindset is the failure to appreciate the importance of conflict within organizations and radical ignorance/uncertainty.

3. Macro is a disaster area. I have said before that it ought to be relegated to a “history of thought” course, rather than given equal billing with micro. But hardly anything else in economics is as bad as macro.

4. For a long time, applied econometrics was a disaster area. What Angrist calls the “credibility revolution” has helped, I think, although again there is over-hyping.

5. I have said before that I think that the economics profession is far too heavily influenced by a few “top” departments. I do not know how much worse things are in economics than in other disciplines. But in economics, the control that MIT, Chicago, and Harvard exert is so strong that they can pull several generations of economists in the wrong direction. Tyler Cowen cites relevant data and remarks,

It has been evident for a while that the former “top six” is in some ways collapsing into a “top two,” namely Harvard and MIT.

It is a self-perpetuating, in-bred, smug, narrow guild. I do not know what to do about it.