Here's a link to the DreamerV3 paper, a new model from DeepMind that can be trained on a bunch of different tasks (including a simplified version of Minecraft) and outperform more narrow models: https://arxiv.org/pdf/2301.04104v1.pdf
The most surprising bits are that:
- The tasks they train it on are fairly diverse
- Data efficiency improves as the number of parameters grows
- They haven't scaled it very far yet and still got pretty good results
I'm not familiar with LeCun's ideas, but I don't think the idea of having an actor, critic, and world model is new in this paper. For a while now, many popular RL algorithms have used an actor-critic architecture, including OpenAI's old favorite, PPO. Model-based RL has been around for years as well, so plenty of projects have probably combined an actor, a critic, and a world model.
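For anyone who hasn't seen the three pieces together, here's a toy sketch of the general pattern. To be clear, this is not DreamerV3's method and nothing here comes from the paper's code: the corridor environment, the lookup-table "world model", and the names `env_step`, `learn_world_model`, and `train_in_imagination` are all made up for illustration. The only point is the division of labor: the world model is fit to real transitions, and the actor and critic are then trained purely on what the model predicts.

```python
import math

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical toy task)
ACTIONS = (-1, +1)    # index 0 = step left, index 1 = step right
GAMMA = 0.9

def env_step(state, action):
    """The 'real' environment: reward 1 for stepping onto the last cell."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, float(done), done

def learn_world_model():
    """World model: here just a lookup table filled from real transitions."""
    return {(s, a): env_step(s, ACTIONS[a])
            for s in range(N_STATES - 1) for a in range(len(ACTIONS))}

def softmax(prefs):
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train_in_imagination(model, sweeps=200, lr=0.5):
    value = [0.0] * N_STATES                        # critic: V(s)
    prefs = [[0.0, 0.0] for _ in range(N_STATES)]   # actor: action logits per state
    for _ in range(sweeps):
        for s in range(N_STATES - 1):
            probs = softmax(prefs[s])
            # Ask the world model (not the environment) what each action would do.
            q = []
            for a in range(len(ACTIONS)):
                nxt, reward, done = model[(s, a)]
                q.append(reward + (0.0 if done else GAMMA * value[nxt]))
            expected = sum(p * qa for p, qa in zip(probs, q))
            # Critic: move V(s) toward the expected return under the current actor.
            value[s] += lr * (expected - value[s])
            # Actor: expected policy-gradient step for a softmax policy.
            for a in range(len(ACTIONS)):
                prefs[s][a] += lr * probs[a] * (q[a] - expected)
    policy = [softmax(p) for p in prefs[:N_STATES - 1]]
    return policy, value

model = learn_world_model()
policy, value = train_in_imagination(model)
for s, probs in enumerate(policy):
    print(f"state {s}: P(right) = {probs[1]:.2f}, V = {value[s]:.2f}")
```

After training, the actor should prefer stepping right in every cell, even though the actor and critic never touched `env_step` directly. Real systems replace the lookup table with a learned neural network and the exhaustive sweeps with sampled imagined rollouts, but the three roles stay the same.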
Even though the core idea isn't novel, this paper getting good results might indicate that model-based RL is making more progress than expected, so if LeCun predicted that the future would look more like model-based RL, maybe he gets points for that.