Job testing seems to outperform human hiring, at least according to Mitch Hoffman and others, who write,
We evaluate the staggered introduction of a job test across 130 locations of 15 firms employing service sector workers. We show that testing improves the match-quality of hired workers, as measured by their completed tenure, by about 14%.
Here is the abstract of a more recent version. Some questions:
1. How does this affect diversity?
2. Can testing also work for high-skilled jobs? Software engineers? Middle managers?
3. What would Tyler Cowen say?
This might be huge. What if the robots data-mine their way to the conclusion that college is orthogonal?
“That is, when faced with similar applicant pools, managers who make more exceptions systematically end up with workers with lower tenure.”
Obvious methodological problems…but still.
So, the human evaluation function is moved over into designing the tests.
How would this work for admission to post-secondary schooling?
How would this work for positions like tutors, teachers, assistants?
“How would this work for admission to post-secondary schooling?”
It will be fun when people have to explain the explicit parameterization of legacy admissions, etc.
“So, the human evaluation function is moved over into designing the tests.”
Is it? How does it account for an employee being a “good fit” for the firm, i.e., their off-hours interests aligning with those of other people at the firm?
Then again, this study looked at service sector workers, for whom the above might be mostly irrelevant.
In software, tests have a short life before they are shared on the internet. Take-home tests are especially short-lived. It seems easy to game a large system like this.
I’ve seen tests used more for college hires and as an early-round filter.
Tests probably fail to measure whether a candidate is hard-working.
The diversity thing is going to be a big deal.
The trouble is that it is very hard to explain to people the value of tolerating a little social ambiguity and pragmatic hypocrisy, the kind that is often required to get to the best (or mandatory) result.
Once you have a hiring program and it produces unacceptable results, companies are going to need to deviate from what the robot said. So they are going to need protocols that tell them when and how often to do that, and they are going to need some narrative full of coded sophistry to ‘explain’ why they’re deviating without getting themselves into too much legal trouble.
I predict that unless the programmers can figure out a way to accommodate and mitigate this issue (which wouldn’t be that hard), robotic HR will be about as popular as internal prediction markets.