Are we on the verge of an intelligence explosion? Maybe, but scaling alone won't get us there. 

Why? The human data bottleneck. Today’s models are dependent on human data and human feedback. 

Human-level intelligence (AGI) might be possible by teaching AI everything we know, but superintelligence (ASI) requires learning things we 𝗱𝗼𝗻’𝘁 know.

For AI to learn something fundamentally new - something humans cannot teach it - it needs two things: exploration and ground-truth feedback.

  • Exploration: The ability to try new strategies, experiment with new ways of thinking, and discover new patterns beyond those present in human-generated training data.
  • Ground-Truth Feedback: The ability to learn from the outcomes of exploration. A way to tell if these new strategies - perhaps beyond what a human could recognize as correct - are effective in the real world.


This is how we've 𝘢𝘭𝘳𝘦𝘢𝘥𝘺 achieved superintelligence in limited realms, like games (AlphaGo, AlphaZero) and protein folding (AlphaFold).
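
To make these two ingredients concrete, here is a deliberately tiny sketch of the loop they describe: an agent proposes perturbed strategies and keeps whichever ones the environment itself verifies, with no human labels anywhere in the loop. The task (learning to add two digits) and the hill-climbing search are toy assumptions chosen only for brevity - not how AlphaZero or AlphaFold actually work.

```python
# Toy sketch: exploration + ground-truth feedback, no human labels.
# The task and search method are illustrative assumptions, not the post's method.
import random

def ground_truth_score(candidate, trials=200):
    """Reward comes from checking outcomes against reality (here, arithmetic),
    not from imitating human-provided answers."""
    score = 0
    for _ in range(trials):
        a, b = random.randint(0, 9), random.randint(0, 9)
        if candidate(a, b) == a + b:   # the environment verifies correctness
            score += 1
    return score / trials

def explore(weights, step=0.1):
    """Propose a nearby strategy: a small random perturbation of the parameters."""
    return [w + random.uniform(-step, step) for w in weights]

def make_candidate(weights):
    """A tiny parameterised guess at 'add two numbers'."""
    w1, w2 = weights
    return lambda a, b: round(w1 * a + w2 * b)

best_w = [random.random(), random.random()]
best_score = ground_truth_score(make_candidate(best_w))

for _ in range(2000):
    trial_w = explore(best_w)                                   # exploration
    trial_score = ground_truth_score(make_candidate(trial_w))   # ground-truth feedback
    if trial_score > best_score:                                # keep what actually works
        best_w, best_score = trial_w, trial_score

print(f"learned weights ~ {best_w}, accuracy {best_score:.2f}")
```

The shape of the loop is the whole point: propose, test against ground truth, keep what works. Self-play and protein-structure scoring provide the same shape at vastly larger scale.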

Without these ingredients, AI remains a reflection of human knowledge, never transcending our limited models of reality.

Full post (no paywall): https://bturtel.substack.com/p/human-all-too-human

4 comments

This is all true, but I'm not sure the claimed implications are so certain. The problem is that different minds can extract different levels of insight from the same data and tools.

First, we should assume humanity has enough data to enable the best human minds to reach the highest levels of every capability available to humans with very little real-world feedback. It's not ASI in the full sense, but there has never been a human mind that contained all such abilities at once, let alone with an AI's other default advantages.

Second, it seems extremely unlikely to me that the available data does not include patterns no human has ever found and understood. All collected data has yet to be completely correlated and put together in all possible relationships. I don't have a strong sense of the limits of what should be possible with current data. At minimum I expect an ASI to have better pure and applied math tools to apply to any task, and to require less data than we do for any given purpose.

Third, with proper tool support, I'm not sure how much physical experimentation and feedback can be replaced by high-quality simulation using software based on known physics, chemistry, and biology. At minimum, this should enable answering a lot of questions that current humanity knows how to answer by formulaic investigation but has never specifically asked or bothered writing down an answer to.

To me this indicates that at the limit of enough compute with better training methods, AI should be able to push at least somewhat beyond the limits of what humans have ever concluded from available data, in every field, before needing to obtain any additional, new data.

Hey, thanks for reading and for the thoughtful comment!  

100% agree with this: "AI should be able to push at least somewhat beyond the limits of what humans have ever concluded from available data, in every field, before needing to obtain any additional, new data."

Current methods can get us to AGI, and full AGI would result in a mind that is practically superhuman because no human mind contains all of these abilities to such a degree.  I say as much in the full post: "Models may even recombine known reasoning methods to uncover new breakthroughs, but they remain bound to known human reasoning patterns."  

Also agree that simulation is a viable path to exploration / feedback beyond what humans can explicitly provide: "There are many ways we might achieve this, whether in physically embodied intelligence, complex simulations grounded in scientific constraints, or predicting real world outcomes."

I'm mostly pointing out that at some point we will hit a bottleneck between AGI and ASI, which will require breaking free from human labels and learning new things via exploration and real-world feedback.

Got it. Then I agree with that. I'm curious if you've thought about where you'd put lower and upper bound estimates on capabilities before hitting that bottleneck?

That's a good question. I don't think I have a great idea of the lower/upper bounds on capabilities from each approach, but I also don't think it matters much - I suspect we'll be doing both well before we hit AGI's upper bound.

There's likely plenty of "low-hanging fruit" for AGI to uncover just working with human data and human labels, but I also suspect there are pretty easy ways to let AI start generating and testing hypotheses about the real world - and there are additional advantages of scale and automation to taking humans out of the loop.