In Minsky's "Steps Toward Artificial Intelligence", Planning is the second-to-last stage. The final stage is Induction, by which he means the system building its own models of the world.

As far as the current era of AI goes, you could say we saw the first signs of Planning in the primitive LLM-based agent ChaosGPT. It wasn't very good at planning, but it did talk to itself about which courses of action to take.

Apart from the method of adding planning "scaffolding" to a transformer LLM, there is the rumor that Google's Gemini combines the Monte Carlo Tree Search method of policy optimization, used in AlphaGo, with a transformer-like architecture.

I think next year's AIs will probably be good at planning, and I'll stick with my timeline, 0-5 years to superintelligence.

Capybasilisk (2y)

I think Minsky got those two stages the wrong way around.

Complex plans over long time horizons would need to be done over some nontrivial world model.

jacob_cannell (2y)

At this point I think the general shape of brain-inspired algorithms for efficient model-based planning is fairly obvious, but they translate into the use of large amounts (i.e. TBs) of 'fast weight' memory at different timescales (mostly in prefrontal cortex, basal ganglia (BG), and hippocampus-adjacent and associated regions), combined with true recurrence. That currently seems prohibitively expensive to translate directly into transformers on GPUs: fast weights are equivalent to a KV cache that is unique per experience sequence, and thus expensive for inference. Further speculation on how to improve that probably shouldn't be discussed in this public forum.
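
To put rough numbers on the KV cache cost mentioned above, here is a minimal back-of-the-envelope sketch. It assumes a standard decoder-only transformer that stores one key and one value vector per layer, per KV head, per token, in 16-bit precision. The model shape (80 layers, 64 KV heads, head dimension 128) and the 100,000-token sequence are hypothetical values picked for illustration, not figures from this thread or from any particular model.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Approximate KV cache size in bytes for ONE sequence.

    The leading factor of 2 counts the key tensor and the value tensor.
    bytes_per_elem=2 assumes fp16/bf16 storage (an assumption).
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


# Hypothetical model shape, chosen only to illustrate the scaling.
per_sequence = kv_cache_bytes(n_layers=80, n_kv_heads=64, head_dim=128, seq_len=100_000)
print(f"{per_sequence / 1e9:.1f} GB of KV cache per sequence")
```

Because the cache is specific to each sequence, this memory cannot be shared across users or episodes the way the model weights can, and it grows linearly with sequence length. Under these assumptions, memory on the TB scale would be reached at a few hundred thousand tokens of history for a single sequence.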

simon (2y)

I think:

- Long-term planning is hard, but maybe not super hard.
- If your training uses short-term feedback (which is typically all that anyone who isn't an evolution has time for, and which even evolved systems need to use a lot of), then there's usually some simpler solution than long-term planning that satisfies that feedback. So gradient descent on the short-term feedback typically doesn't reach the long-term planning solution.
- Under recursive self-improvement, a long-term planner will tend to preserve its long-term-planning nature, while a non-long-term planner will not care, making long-term planning an attractor state under recursive self-improvement.
- Sufficiently advanced metacognition might be equivalent to recursive self-improvement.

PeterMcCluskey (2y)

I realize now that some of this post was influenced by a post that I'd forgotten reading: "Causal confusion as an argument against the scaling hypothesis", which does a better job of explaining what I meant by causal modeling being hard.
