The WSJ had an article in the print edition on November 27 that I cannot find online (their search function is not helpful). The print article was called “‘Adversity’ Has Big Effect on SAT Scores.” What I can find online instead is this: “What Happens if SAT Scores Consider Adversity? Find Your School.”
Anyway, the WSJ uses a Georgetown education researcher’s regression equation relating SAT scores to “adversity scores” to make inferences such as
Top public magnet schools performed exceptionally well in adjusted SAT scores, meaning their scores jump when adversity is accounted for.
To see why this is not a valid inference, suppose that there were two students of identical backgrounds but different ability levels. Presumably, the magnet school would select the student with higher ability, leaving the other student to attend a regular school. The more able student would get a higher SAT score, but that would say nothing about the magnet school’s “performance.”
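The thought experiment can be sketched as a small simulation. Everything in it is an assumption made for illustration: the score formula, its coefficients, the noise level, and the admission cutoff are hypothetical and are not taken from the WSJ article or the researcher’s regression. The point is only that the score model contains no term for which school a student attends, yet the magnet school still appears to “outperform” once adversity is regressed out.

```python
import random
from statistics import fmean

rng = random.Random(1)   # fixed seed so the run is repeatable
N = 100_000              # number of simulated students (arbitrary)

def sat_score(adversity, ability):
    # Hypothetical score model. Note there is NO school effect here:
    # the score depends only on background, ability, and noise.
    return 1000 - 60 * adversity + 100 * ability + rng.gauss(0, 30)

rows = []
for _ in range(N):
    adversity = rng.gauss(0, 1)
    ability = rng.gauss(0, 1)
    magnet = ability > 1.0   # assumed rule: magnet school admits on ability
    rows.append((adversity, sat_score(adversity, ability), magnet))

# Ordinary least squares of SAT score on adversity, across all students.
mean_a = fmean(a for a, _, _ in rows)
mean_y = fmean(y for _, y, _ in rows)
slope = (sum((a - mean_a) * (y - mean_y) for a, y, _ in rows)
         / sum((a - mean_a) ** 2 for a, _, _ in rows))
intercept = mean_y - slope * mean_a

def mean_adjusted(is_magnet):
    """Average adversity-adjusted score (regression residual) for one school type."""
    return fmean(y - (intercept + slope * a)
                 for a, y, m in rows if m == is_magnet)

magnet_gap = mean_adjusted(True)
regular_gap = mean_adjusted(False)
print(f"magnet school adjusted score:  {magnet_gap:+.1f}")
print(f"regular school adjusted score: {regular_gap:+.1f}")
```

In this setup the magnet school’s adjusted score comes out well above zero and the regular school’s slightly below, even though neither school adds anything to any student’s score. The gap is produced entirely by who was admitted.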
I sent a letter to the editor of the WSJ about this, but they did not print it. But I hope that someone there gets the message that this was statistical malpractice.
This was an interesting post.
Somewhat tangentially, this made me think about simulating some students and thinking about the algorithm to choose the students for magnet schools and what that means for the type of students in the school.
First, I simulated a two-dimensional variable that is multivariate normal, with each component having a mean of zero and a standard deviation of one, and a correlation of 0.5 between the two. The first variable would represent socio-economic status and the second would represent intelligence. So basically I am assuming some modest correlation between status and intelligence. We can also calculate an intelligence score adjusted for socio-economic status by regressing intelligence on status and taking the residual. I presume this is similar to the concept of the adversity score.
We can think of the magnet school as selecting those students that are high in intelligence, e.g. one standard deviation above zero. On average, they will have an intelligence score of about 1.5 and a status score above 0 (so they are a little better off, due to the correlation assumptions above), and a status-adjusted intelligence score of modestly above 1. This reflects that the magnet school.
Of course, we can consider other types of magnet school in this simple example. If we select students with above one on the status-adjusted intelligence score, then the average status is about 0 (so average), the average intelligence is a little less than before, but the status-adjusted intelligence is higher.
We can also select students on the basis of just socio-economic status (again one standard deviation above zero), like Ivy League schools in the past. In this case, the socio-economic score is on average higher (around 1.5), while the raw intelligence score is around 0.75 (a little weaker than the other schools) and the status-adjusted intelligence score is around 0 (so all of their raw intelligence is driven by what is correlated with their status).
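The three selection rules described above can be reproduced with a short script. This is a minimal sketch rather than the code I originally ran: the sample size, the seed, and the variable names are assumptions, and it uses only the Python standard library.

```python
import math
import random
from statistics import fmean

RHO = 0.5      # assumed correlation between status and intelligence
N = 200_000    # number of simulated students (arbitrary)
CUTOFF = 1.0   # admission threshold: one standard deviation above zero

rng = random.Random(0)  # fixed seed so the run is repeatable

students = []
for _ in range(N):
    status = rng.gauss(0.0, 1.0)
    # Construct intelligence with unit variance and correlation RHO with
    # status: a standard way to draw a bivariate normal pair.
    intelligence = RHO * status + math.sqrt(1.0 - RHO**2) * rng.gauss(0.0, 1.0)
    students.append((status, intelligence))

# Regress intelligence on status (ordinary least squares, one regressor).
mean_s = fmean(s for s, _ in students)
mean_i = fmean(i for _, i in students)
slope = (sum((s - mean_s) * (i - mean_i) for s, i in students)
         / sum((s - mean_s) ** 2 for s, _ in students))
intercept = mean_i - slope * mean_s

# Each record: (status, intelligence, status-adjusted intelligence),
# where the adjusted score is the regression residual.
records = [(s, i, i - (intercept + slope * s)) for s, i in students]

def summarize(admit):
    """Average status, intelligence, and adjusted intelligence of admits."""
    chosen = [r for r in records if admit(r)]
    return tuple(round(fmean(r[k] for r in chosen), 2) for k in range(3))

rules = {
    "raw intelligence > 1": lambda r: r[1] > CUTOFF,
    "adjusted intelligence > 1": lambda r: r[2] > CUTOFF,
    "status > 1": lambda r: r[0] > CUTOFF,
}
results = {name: summarize(rule) for name, rule in rules.items()}
for name, (status, intel, adjusted) in results.items():
    print(f"{name:27s} status={status:5.2f} "
          f"intelligence={intel:5.2f} adjusted={adjusted:5.2f}")
```

With a large sample the averages should land close to the figures quoted above: selecting on one variable at a cutoff of 1 gives that variable a mean of roughly 1.5, and the correlated variable picks up about half of that.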
In other words, the characteristics of the magnet school vary significantly depending on how you choose the students. What algorithm you choose for selecting students for the magnet school may be influenced by your values about what is appropriate for the school, but it will also have an impact on the characteristics of the school’s population.
Of course, this is all rather obvious.
Kling, this is silly. When people say that a school is a “good school” and it “performs well”, the normal expectation is that that is mostly due to which students attend it. For example, people normally say Harvard University is a good school; of course, that is about its selective admission process.
Measuring a school’s added value, relative to the students that attend, is also very important, but that is a more involved process, and subjective in the sense that two different analysts/researchers can choose different methods yielding entirely different results.
One reasonable definition of “Adversity” is simply the inverse of aptitude. The SAT is an aptitude test: if you measure aptitude while correcting for aptitude, or equivalently while correcting for adversity to aptitude, you should get a single constant score for everyone, which would be rather silly.
When people say that a school is a “good school” and it “performs well”, the normal expectation is that that is mostly due to which students attend it.
Maybe where you live–but in America, just about everyone says the opposite. At least in public. Every politician complains about “failing” schools and how they’re going to “fix” them. This assumes that “good schools make good students”, not that “good students make good schools”.
Unfortunately, it is 90% true that “good students make good schools”, that when a “good school” “performs well”, “that is mostly due to which students attend it.”
So the base of most everybody’s educational policy is an untruth. Which is one reason we spend so much money and keep “failing”.
Kind of.
That claim has two pieces.
First is the Null Hypothesis, which is well-evidenced; see Bad Students, Not Bad Schools and The Long Crusade: Profiles in Education Reform, 1967-2014.
But second is the ‘failing’, which is probably not accurate, because if you look at American PISA rankings broken down by national origin or “major continental population grouping”, U.S. students do very well compared to their closest co-ethnics abroad.
Indeed, there seems to be a big Red State vs. Blue State gap in just how well the local version of the American School System succeeds or fails in teaching victim-group-students, and Red States are not just better than Blue States, but among best-in-class for scores among the global diaspora of similar-origin students.
The bottom line orbits around one’s perspective on The Gap.
If The Gap is natural and due to impersonal forces, then the rational message is “Way to go America, keep it up! It’s expensive, so, if you can, try to do it cheaper.”
If The Gap is due to evil oppression, discrimination, privilege, and so forth, then the rational message is “Terrible job, America. We need to give more help to the underprivileged and fix the bad schools to close The Gap.”
The magnet schools were at least capable of picking the more able student, which is one of the major skills required for good schools.
Intellectual malpractice to support The Narrative is the order of the day.
Arnold,
You say,
“To see why this is not a valid inference, suppose that there were two students of identical backgrounds but different ability levels. Presumably, the magnet school would select the student with higher ability, leaving the other student to attend a regular school.”
Both of my daughters attended/are attending a magnet school here in Chicago (Lane Tech, if you want to look it up.) These magnet schools actually use adversity in their selection process to help pick who gets in: CPS divides the City into Tiers (I think by Census tracts) and you are awarded bonus points for being in the worst Tiers (i.e. a Census tract with low socio-economic measures, which usually correlates well with Black or Hispanic demographics.) So in effect these schools are already helping kids from “adverse” backgrounds.
Presumably, the magnet school would select the student with higher ability, leaving the other student to attend a regular school. The more able student would get a higher SAT score, but that would say nothing about the magnet school’s “performance.”
As Dale and Krueger showed, exactly the same argument applies to elite universities (substitute ‘income’ for ‘SAT score’ and ‘elite university’ for ‘magnet school’ in the paragraph above). In both cases the apparent ‘performance’ of the institution is a function of the selectivity of the admissions office.
My current policy proposal is to bar tax-exempt schools from having more than 10% of their entering class come from the top 10% of income. Similarly, also cap the top 1%, 20%, 30%, 40%, and 50%, and possibly the 60% and 70% levels. In an entering class of 1000, no more than 100 would be from the top 10% (no more than 10 from the top 1%).
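The arithmetic of the caps can be sketched in a few lines. The function names and the way income is encoded (a percentile from 0 to 100, where 95 means richer than 95% of households) are assumptions for illustration only. The brackets are treated as cumulative, so a student in the top 1% also counts toward the top 10%.

```python
def income_caps(class_size, brackets=(1, 10, 20, 30, 40, 50)):
    """Maximum number of admits allowed from each top-income bracket.

    A bracket of 10 means "top 10% of income"; its cap is 10% of the class.
    """
    return {pct: class_size * pct // 100 for pct in brackets}

def violated_brackets(income_percentiles, brackets=(1, 10, 20, 30, 40, 50)):
    """Return the brackets whose cap an entering class exceeds.

    income_percentiles holds one hypothetical income percentile per admit.
    """
    caps = income_caps(len(income_percentiles), brackets)
    violated = []
    for pct, cap in caps.items():
        # An admit is in the top pct% if their percentile is at least 100 - pct.
        admitted = sum(1 for p in income_percentiles if p >= 100 - pct)
        if admitted > cap:
            violated.append(pct)
    return violated

caps = income_caps(1000)
print(caps[10], caps[1])  # 100 and 10, matching the example in the proposal

# Hypothetical class: 150 admits at the 95th income percentile, 850 at the 40th.
example_class = [95.0] * 150 + [40.0] * 850
print(violated_brackets(example_class))  # only the top-10% cap is exceeded
```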
Harvard needs to be encouraged more strongly to admit more poor kids. In any income range, there are also other adversities like divorce, parents as criminals, physical handicaps, bad zip-code / school district for High School; perhaps being a boy. The Chicago Tiers seem a reasonable way to collapse multivariate adversity into a simple tier point advantage. So Arnold’s example would be two kids from the same tier, with only the academically stronger one getting the one spot available.
I consider this more one of the complicated, not complex, problems — but nevertheless politically difficult because of different value systems.
One of my more recent anti-Affirmative Action thoughts is that the US black communities have been intellectually castrated by their brightest being sucked out of the local black community, and imported (as tokens or not) into the upper managerial class. Without AA, many of them would have been struggling local entrepreneurs, some of whom would fail, others succeed, but most would stay near their mother & family. The poor, adverse community would be enriched, at a price of less success for the community member who otherwise would have left / escaped.
This is similar to the desire of more development economists to avoid educating a local elite that is then empowered … and who then leave for the US or some other richer place, rather than helping do the (harder?) work of making the local place better.
We are in a school system with many public magnets that are open to all students. Seats are assigned by a lottery that takes socioeconomics into account, mostly neighborhood and parental income. The lottery itself requires parents to do some research and it can be gamed, so there is a g-loaded hurdle if you want to increase the odds of getting your kid into more popular programs.
The kids in these magnet schools tend to outperform others of the same race/ethnicity/income groups. Shocking.
Goodhart’s Law. Statistics can’t be trusted whenever there is a strong incentive and ability to game the reported measures behind the scenes. The Cheating Crisis (to include sub rosa corrupt selection) is just another instance of the same phenomenon behind the Replication Crisis.