Andrew Gelman on the Replication Crisis

He writes,

2011: Joseph Simmons, Leif Nelson, and Uri Simonsohn publish a paper, “False-positive psychology,” in Psychological Science introducing the useful term “researcher degrees of freedom.” Later they come up with the term p-hacking, and Eric Loken and I speak of the garden of forking paths to describe the processes by which researcher degrees of freedom are employed to attain statistical significance. The paper by Simmons et al. is also notable in its punning title, not just questioning the claims of the subfield of positive psychology but also mocking it.

Pointer from Alex Tabarrok.

I am pretty sure that at some point prior to 2011, when I criticized macro-econometrics, I said that the degrees of freedom belong to the researcher rather than to the data. That is a minor note.

More important, I think that John Ioannidis deserves a mention. Yes, Gelman is focused on research in the field of psychology and Ioannidis has focused primarily on epidemiology, but his 2005 paper, “Why Most Published Research Findings Are False,” strikes me as a milestone worth including in the timeline.

Gelman’s post is mostly about the tension between insiders and outsiders in the academic world. The insiders’ chief weapon is the peer-reviewed journal article. The outsiders’ chief weapon is the blog post. If, like me, your heart is with the outsiders, you will find Gelman’s post bracing.

I should note that in my high school statistics class last year, I had an autodidact student who, among other things, was very familiar with the term p-hacking and the related literature. This gives me hope that as the generations turn over in academia, things might improve. As Max Planck is said to have remarked, science advances one funeral at a time.
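For readers who have not run into the term, a minimal simulation sketch shows why researcher degrees of freedom matter. It is an illustration of the general idea, not a reconstruction of anything in Simmons et al. or in Gelman’s post. The assumptions are mine: every study compares two groups with no true difference, the researcher measures several independent outcomes, and only the smallest p-value gets reported. The function names, sample sizes, and the 0.05 cutoff are illustrative choices.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def one_study(rng, n_per_group, n_outcomes):
    """Simulate one study with NO true effect; return one p-value per outcome."""
    p_values = []
    for _ in range(n_outcomes):
        # Both groups are drawn from the same distribution (mean 0, sd 1).
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = sum(a) / n_per_group - sum(b) / n_per_group
        # z-test with known variance 1 in each group (a simplifying assumption).
        z = diff / math.sqrt(2 / n_per_group)
        p_values.append(two_sided_p(z))
    return p_values


def false_positive_rate(n_outcomes, n_studies=2000, n_per_group=20, seed=0):
    """Share of null studies that can report at least one p < 0.05."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # The hypothetical researcher reports only the best-looking outcome.
        if min(one_study(rng, n_per_group, n_outcomes)) < 0.05:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    print("one outcome:  ", false_positive_rate(1))
    print("five outcomes:", false_positive_rate(5))
```

With one outcome, the share of “significant” results should land near 5 percent. With five independent outcomes, it should land near 1 - 0.95^5, or about 23 percent, even though nothing real is going on.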

From the Right-Wing Conspiracy Wing Nuts

For example, the FDA assures the public that it is committed to transparency, but the documents show that, privately, the agency denies many reporters access—including ones from major outlets such as Fox News—and even deceives them with half-truths to handicap them in their pursuit of a story. At the same time, the FDA cultivates a coterie of journalists whom it keeps in line with threats. And the agency has made it a practice to demand total control over whom reporters can and can’t talk to until after the news has broken, deaf to protests by journalistic associations and media ethicists and in violation of its own written policies.

This comes from that notorious conservative outlet, Scientific American.

Boston Discussion: Try Again?

I tried a couple weeks ago, but I let the weather forecasters frighten me out of it. I will be back in Boston on October 5th and 6th. Either lunch time or dinner time would work. We will discuss my latest book, Specialization and Trade. If interested, email me at arnold at-sign arnoldkling.com and let me know which time works best for you. I will try to work something out.

What’s Wrong with Keynesian Economic Theory?

That is the title of a new book, edited by Steven Kates. It is published by Edward Elgar ($$$). I am one of the contributing authors.

My essay argues that Keynesians use two very different approaches in marketing their ideas. First, they use a simplistic approach (“spending creates jobs and jobs create spending”) to talk to politicians and the general public. Second, because among trained economists it is indefensible to ignore prices and instead talk about quantities depending on quantities, Keynesians talk in academic circles in an entirely different fashion.

I say that the fatal flaw in both approaches is aggregation–treating the economy as a GDP factory. This makes it impossible for Keynesians (or for macroeconomists in general) to think about the issue of patterns of sustainable specialization and trade. The PSST story is that some patterns of trade become unsustainable as tastes and technology change, and addressing this requires a trial-and-error process to evolve new sustainable patterns.

In terms of policy, the Keynesian assumption that all work is done in the same GDP factory suggests that government can fix a recession without knowing any specifics about the characteristics of unemployed workers. In reality, worker skills are heterogeneous, and there is no guarantee that a fiscal stimulus will be relevant to the workers who are having difficulty adjusting to new circumstances.

I reprise some of these points in Specialization and Trade, which is priced so that an entity other than a library might wish to purchase it.

To the Aspiring Econ Grad Student

Paul Romer writes,

If I am right that in recent decades the equilibrium in post-real macro has discouraged good science (and remember, many economists do not agree with me, at least not yet) there is some risk that a rear-guard of post-real macroeconomists will continue to defend their notion of methodological purity. At this point it is hard to know whether this group will fracture or dig in for a fight to death. If they dig in, I suspect that it will be in a few departments and that the variation between departments will be larger. Watch to see how this plays out and choose where you go with this in mind.

Pointer from Mark Thoma.

My own advice is to look for opportunities other than graduate school in economics.

One way to think of my latest book, Specialization and Trade, is as a denunciation of the path that academic economics has taken since 1940. It is a Quixotic attempt to pull off what Paul Samuelson did in the 1940s, which is to completely re-orient economics from undergraduate education on up. That is not going to happen. I think that academic economics (especially macro, but not just macro) is simply too far gone.

If you have a strong interest in studying economics and in joining in the intellectual conversation, you can do that on your own, without going to graduate school. This was less true forty years ago, when I was starting grad school, because we did not have the Internet, with its blogs, online working papers, podcasts, and so on.

On your own, you can be selective about what you study and how intensively you delve into various areas. Studying on your own will be a lot less expensive than going to graduate school, particularly in terms of opportunity cost. Graduate programs will make you waste a lot of time studying things that are either uninteresting to you or uninformative, or both.

It could be that you really want the academic lifestyle, and suffering through an economics Ph.D. program is the best way to get it. But be careful about assuming that the academic lifestyle is the only one for you. I think that bright college students tend to over-estimate the intellectual stimulation that they can get out of academia and under-estimate the intellectual stimulation that they could get out of working in business.

The Trust Variable

Noah Smith worries about the way economists invoke trust.

So although trust, in some form, is probably important in our economic lives, we don’t yet have the tools to measure it, we don’t know exactly how it’s important, and we definitely don’t know how to control or alter a society’s level of trust. Until we understand trust a lot better, it would be a mistake to rely on it too much when trying to explain the world around us.

Read the entire essay. I agree with his qualifications, but I would rephrase his conclusion. It sounds like he could be saying that if something is hard to measure and control, then look for other variables to explain and control the world. Instead, I would say that one should be humble about one’s ability to explain and control the world.

The first step in getting a better handle on trust is to define it well. As Smith indicates, the standard practice is to measure people’s answers to very broad survey questions (“How strongly do you agree with the statement that most people can be trusted?”). That is very unsatisfying.

When I worked at Freddie Mac, we were given some management training of the “teambuilding” sort, one of the goals of which was to improve trust within the organization. This led us to think about trust, and one insight that some of us arrived at was that trust involves more than just a belief that someone else is well motivated. Often, trust breaks down because we lose confidence in other people’s competence. Even if you have very general views about other people’s motives, you are likely to assess other people’s competence relative to their specific occupations.

This factor of competence assessment is embedded in my views of the role of finance in economic fluctuations. In Specialization and Trade, I argue that financial intermediation can expand when people trust financial intermediaries. In particular, as we experience financial intermediaries meeting their obligations, we gain confidence in their competence (as well as in their motivation). This leads to more trust, more expansion of financial intermediation, and so on, until, in Minsky fashion, the intermediaries are engaged in dangerous activities, and we get a collapse, including a collapse of trust.

So trust is not “social capital” that you want to see increased indefinitely. At least in the case of financial intermediation, it is best for trust to be at some intermediate level. Not so low that relatively low-risk, high-return investment opportunities are missed. But not so high that you get an excess of relatively high-risk, low-return projects (e.g., sub-prime mortgage loans) that are funded.

Tonight’s Debate

I’m guessing that the people most motivated to watch will be those who already have made up their minds which of the two they are voting for. I have already made up my mind, not to vote for either one of them. And I will not watch. (Note: Peggy Noonan has encountered a lot of people who are undecided. That goes against my experience, but I don’t deny living in a bubble. I remember in previous elections Jonah Goldberg wondering who the heck these undecided voters were. I sympathize with his befuddlement.)

Also, I think that Gary Johnson deserves to be in the debate. The threshold of 15 percent in the polls may have been appropriate when the two major parties were nominating acceptable candidates. However, that is not the case this year. Simply being on the ballot in every state should qualify Johnson to be in the debates in a year when the majority of people have a negative view of both Mr. Trump and Ms. Clinton. I think that Johnson should be kept out of the debates only if the polls show unfavorability ratings for the other candidates below, say, 40 percent.

While I am on the topic of the election, Tyler Cowen recommends David Brooks. Brooks writes,

We have an emerging global system, with relatively open trade, immigration, multilateral institutions and ethnic diversity. The critics of that system are screaming at full roar. The champions of that system — and Hillary Clinton is naturally one — are off in another world.

There is a strong case to be made for an open world order, and a huge majority coalition to be built in support of it.

In the nearly twenty years since Brooks wrote Bobos in Paradise, coining the expression “bourgeois bohemians,” have the Bobos achieved the status of a “huge majority coalition”? My guess is that Peggy Noonan, based on her conversations with potential voters, would have doubts.

The guardians of the open world order helped encourage a revolution in Syria that became a civil war. The guardians of the open world order were unable to stop this civil war. The guardians of the open world order have yet to convincingly demonstrate that they can cope with the refugee problem created by this civil war.

I am not joining the anti-Bobos here. But I do think that one should not over-estimate the Bobo vote, and where Mrs. Clinton needs help is with people who are not Bobos. If you talk to them about an “open world order,” they are likely to want to know where the “order” part is going to come from.

As a final point, I endorse the view that democracy works best when elections do not matter much. Let us all hope that this election does not matter much, and that the system is robust enough that we can get through the next four years regardless.

A Congressional Regulation Office?

Philip Wallach and Kevin R. Kosar write,

The office would have two core functions. First, it would perform cost-benefit analyses of agencies’ significant rules, which number around a hundred per year, in order to provide a disinterested check on agencies’ self-interested math. These CRO analyses would coincide with the prospective estimates that agencies themselves perform. This would create a legislative counterweight to the rule-review function of the Office of Information and Regulatory Affairs — which is nested within the OMB and thus the Executive Office of the President, and is therefore unable to provide a credibly neutral review process that goes beyond concerns internal to the executive branch.

The CRO’s assessment of a proposed regulation, like CBO’s bill scores, should be posted online and delivered to the committee of jurisdiction. Doing these things would increase the political salience of agency rulemaking, thereby fostering congressional oversight and encouraging policy entrepreneurs in the legislature to take up the subject. A CRO cost-benefit analysis should also be automatically submitted as public comment to the rule, which would oblige an agency response and possibly a recalibration of the rule.

Second, but perhaps just as promising, would be to have CRO perform periodic retrospective analyses informed by real data rather than forward-looking estimates. Agencies sometimes perform “look-back” assessments, but they are modest in number (certainly compared to the massive corpus of standing regulation) and produce only nominal changes. This is unsurprising, since each agency is passing judgment on its own work. CRO reports would regularly goad Congress to examine how the rules produced by existing laws are performing, such that they could work to revise those statutes that have yielded problematic results.

My thoughts.

1. Take the analogy with the Congressional Budget Office. People love the CBO, but its practical impact has been questionable. Since 1975, the Congressional budget process has become worse, not better. Congress has become more evasive of accountability, not less so.

2. The proposed CRO is a solution if the problem is that Congress lacks information about bad regulatory policy. But is that really the problem? The problem is that, as with the budget, there is not much collective will in Congress to set policy.

I think that we have the state of affairs that we do because politicians like it. That means either that the regulators are getting away with something without the public realizing it or the public is basically complacent about regulators running amok. I am afraid that it is the latter.

Would a CRO make the public less complacent? Well, CBO has not made the public any less complacent about the unfunded liabilities of Social Security and Medicare. Instead, the CBO’s main impact has been to increase policy makers’ hubris about the ability of deficit spending to create jobs.

In a better world, what are called “regulations” would be called “laws,” and every single last one of them would require Congressional votes. In an even better world, politicians in Washington would look at all the things that agencies are attempting to regulate and say, “Gee, we have no business doing that. We are not properly informed. We should only regulate in areas where we have a good set of information on which to base regulation.”

I don’t know how to get to a better world. Look, I love checks and balances in theory. And I don’t mean to discourage creative ideas for addressing what I agree is a serious problem. But I am afraid that I must assign a low probability to a CRO moving us in the right direction.

Economic Data in 1946

Scott Sumner writes,

One commenter pointed out that RGDP fell by over 12% between 1945 and 1946, and that lots of women left the labor force after WWII. So does a shrinking labor force explain the disconnect between unemployment and GDP? As far as I can tell it does not, which surprised even me. But the data is patchy, so please offer suggestions as to how I could do better.

You could do better by taking the RGDP figure with a tablespoon of salt. The way that the Commerce Department adjusts nominal GDP for price changes is pretty unreliable for that period. Part of the reason is that there was so much shifting between public sector output (who knows how much of that is “real” vs. nominal?) and private sector output, and part of the reason is that as you move away from the base year (either many years ahead or many years behind) the adjustment process gets screwy. 1946 is now many, many years away from the base year that is used to calculate real GDP. I think that if you can find old publications from the Commerce Department, you will see very different patterns of real GDP for 1946, resulting from shifts in the base year from 1958 to 1975 to ….

I think that for 1946 you are safer sticking to nominal GDP numbers.
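The base-year problem can be illustrated with a small worked example. This is a sketch under stated assumptions, not the Commerce Department’s actual procedure or data: it assumes a simple fixed-base-year calculation in which each year’s quantities are valued at one base year’s prices, and every quantity and price below is hypothetical.

```python
def real_gdp(quantities, base_prices):
    """Value a year's quantities at a fixed set of base-year prices."""
    return sum(base_prices[good] * qty for good, qty in quantities.items())


def real_growth(q_start, q_end, base_prices):
    """Growth rate of real GDP between two years under one price base."""
    return real_gdp(q_end, base_prices) / real_gdp(q_start, base_prices) - 1


# Hypothetical quantities: military output collapses, civilian output rises.
q_1945 = {"military": 40, "civilian": 60}
q_1946 = {"military": 10, "civilian": 75}

# Two hypothetical price bases. In base B, military goods are relatively
# cheap and civilian goods relatively expensive compared with base A.
prices_base_a = {"military": 1.0, "civilian": 1.0}
prices_base_b = {"military": 0.5, "civilian": 1.5}

growth_a = real_growth(q_1945, q_1946, prices_base_a)
growth_b = real_growth(q_1945, q_1946, prices_base_b)

print(f"base A: {growth_a:+.1%}")  # 85 / 100 - 1
print(f"base B: {growth_b:+.1%}")  # 117.5 / 110 - 1
```

Under the first set of prices, real output falls 15 percent. Under the second, it rises about 7 percent. The quantities are identical in both cases. Only the price weights differ, which is why a change of base year can produce a different pattern for a year like 1946, when the composition of output shifted sharply.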

By the way, here is a piece I wrote on that period.

The Status of Status Games

A commenter asks,

If beach volleyball is made an Olympic sport, does that lower the status of Usain Bolt? No probably not, but it does raise the status of beach volleyballers. What evidence is there that status is zero sum?

Within each status game, it is zero-sum. The 100-meter race can have only one winner.

But what about multiple status games? Does adding a status game lower the status of existing games?

I hope instead that with multiple status games, more people can be winners. I recall Tyler Cowen arguing that having multiple status games would be more conducive to social peace. Instead, if there is only one ultimate game, so that “status” can be reduced to a single dimension along which everyone has a rank, then conflict seems inevitable.