Concerning deep learning algorithms, Neil C. Thompson and others write,
"Theory tells us that computing needs to scale with at least the fourth power of the improvement in performance. In practice, the actual requirements have scaled with at least the ninth power."
The implication is that progress in machine learning will hit a wall.
The authors sort of skip the part where they explain how they get from O(n^4) to O(n^9). Then again, I would think O(n^4) is bad enough.
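To make the gap between the two exponents concrete, here's a toy calculation. This is my own illustration with made-up round numbers, not anything from the paper: if compute scales as the k-th power of the performance improvement, even a modest 10x gain explodes differently under k = 4 versus k = 9.

```python
def compute_multiplier(perf_gain, k):
    """Compute factor needed for a given performance gain under a power-law exponent k."""
    return perf_gain ** k

# Hypothetical 10x performance improvement:
theory = compute_multiplier(10, 4)    # fourth power (theoretical lower bound)
observed = compute_multiplier(10, 9)  # ninth power (claimed empirical scaling)

print(theory)    # 10,000x the compute
print(observed)  # 1,000,000,000x the compute
```

Same 10x improvement, but the ninth-power regime costs 100,000 times more compute than the fourth-power regime, which is the sense in which O(n^4) is "bad enough" and O(n^9) is a wall.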
Modern AI has, to some extent, shown that by throwing money and hardware at the problem you can get farther than more pedestrian lab research would take you. We will see how far this ultimately goes. For example, autonomous driving ("level 5") seems a bridge too far at this time.
Google is already on the green wave, by the way. I seem to recall that many of their datacenters are already run on locally sourced hydro power. It means less hydro for NYC but what can you do? Don’t blame Larry and Sergey, they’re just being responsible.
I don’t worry about this too much. The algorithms that run machine learning are amazing in the sense that we really understand something about how the brain works and can do some learning in a similar way. That being said, the algorithms are terrible: the brain does it with less data and has more transferable learning. My guess is we’ll evolve our algorithms so that the future of machine learning is small efficiency gains in the way we learn and larger transferability across data sets. Compute power is one input, and we may have maxed it out for a while, but we have a ton of runway in the other inputs. Also, we’ve barely made use of the progress since ImageNet in our actual lives; machine learning can do more for you with existing inputs, but we are still going from proof of concept to products.
In 10 years, algorithms will be 2x as efficient and computers 32x as fast. Even if Moore’s law slows down, GPT-X is going to be a step change better.
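The back-of-envelope behind that comment (using the commenter's assumed numbers, not a forecast): 32x hardware over 10 years corresponds to a doubling every two years, and algorithmic and hardware gains multiply.

```python
# Assumed numbers from the comment above, not a prediction.
algo_gain = 2        # algorithms 2x as efficient
hardware_gain = 32   # computers 32x as fast, i.e. 2**5: doubling every 2 years for 10 years
effective_compute = algo_gain * hardware_gain

print(effective_compute)  # 64x effective compute in 10 years
```

64x effective compute is roughly six doublings, which is why even a slowing Moore's law could still leave room for a step change.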
What about a hybrid approach, using deep learning in combination with expert systems? Some hand-coded rules plus deep learning might improve the results.
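A minimal sketch of what such a hybrid could look like. Everything here is hypothetical (the rules, the triage domain, and the stand-in model are my own illustration): hard-coded expert rules get first say, and a learned model handles the cases no rule covers.

```python
def rule_based(x):
    """Hand-coded expert rules: return a label when a rule fires, else None."""
    if x["temperature"] > 104.0:
        return "urgent"   # domain rule: high fever is always urgent
    if x["temperature"] < 95.0:
        return "urgent"   # domain rule: hypothermia is always urgent
    return None           # no rule fired; defer to the learned model

def model_predict(x):
    """Stand-in for a trained deep net; here just a simple threshold."""
    return "urgent" if x["heart_rate"] > 120 else "routine"

def hybrid_predict(x):
    """Rules take priority; the model covers everything the rules don't."""
    return rule_based(x) or model_predict(x)

print(hybrid_predict({"temperature": 105.2, "heart_rate": 80}))  # rule fires: urgent
print(hybrid_predict({"temperature": 98.6, "heart_rate": 90}))   # model decides: routine
```

The appeal of this design is that the rules encode expert knowledge the model never has to relearn from data, while the model absorbs the long tail the rules can't enumerate.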
Don’t worry about it, machine learning is a scam.
The tech industry has hit a wall, where we have so much computing power that companies literally don’t know what to do with it, so they waste a lot of it on these useless ML coprocessors or giant GPUs for games.
We are due for a giant reorganization of the tech sector, because the existing players, like Google, Facebook, Alibaba, Bitcoin, etc., are completely played out and incapable of taking the needed next steps. This happened before in the ’90s, when IBM and AOL were incapable of really pushing the internet and PC revolutions, so both were swept aside for new companies like Microsoft, Intel, and Dell or startups like Netscape and Google. We are on the cusp of another such changing of the guard, one that will not waste time on bad tech like ML.
This is a regular cycle in AI research. A promising new approach is discovered, and there’s a bunch of rapid progress as researchers fully explore it, and then when it’s tapped out the field goes through a period of stagnation called AI winter until the cycle starts again with a new big idea.
This has been by far the most fruitful cycle, but I think it’s pretty clear that it’s ultimately going to be a dead end in terms of developing human-level capabilities.
Robin Hanson tweeted this paper last week. Gwern says it’s bullshit because Thompson constructs his scalings by collating scalings that aren’t comparable, and also doesn’t cite important scaling papers such as Kaplan et al. (2020) on GPT. Knowing his thoroughness, I am inclined to believe him.