I have a lot of opinions about technology - what's feasible, what isn't, what the best ways to do things are, and so on. Sometimes I write them down on my blog, but usually I don't. I also know some people who are good at designing technology.
If you want my personal opinion on some technology or technological problem, I guess you can ask me here.
What is your opinion on general robotics (driven by an early form of AGI), and then on robots manufacturing themselves?
I have personally thought that a straightforward extension of LLM training, where you train on human manipulation from all available video, would give you a foundation model. You would then add RL fine-tuning and millions of simulated years of RL training on top of the foundation model to develop excellent and general robot performance.
Do you think this isn't that straightforward or near term?
I ask because if robots can go from their narrow and limited roles now to performing most tasks when given information such as an example, a prompt, or a goal schematic, it would change everything.
For one, it would trivialize energy transitions, as well as deep mining, ocean mining, and so on. You would task some robots with building and assembling others, some with building and deploying energy collection, and some with mining. You are then rate-limited not really by money directly but by materials, regulations, energy, or mining rights.
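To make the rate-limit point concrete, here is a toy sketch of a self-building robot fleet. Everything in it is an assumption I made up for illustration: the function name `simulate_fleet`, the split between builders and miners, and all of the numbers. It only shows the shape of the argument: growth is exponential while labor is the bottleneck, and becomes linear once a fixed cap on material (standing in for mining rights or regulation) takes over.

```python
def simulate_fleet(robots=10.0, steps=50, builder_frac=0.5, miner_frac=0.5,
                   builds_per_builder=0.1, ore_per_miner=1.0,
                   ore_per_robot=5.0, ore_cap_per_step=100.0):
    """Toy model: all parameters are hypothetical, not real-world estimates."""
    stock = 0.0              # ore stockpile carried between steps
    history = [robots]       # fleet size at each step, starting value included
    for _ in range(steps):
        builders = robots * builder_frac
        miners = robots * miner_frac
        # Mining output is capped per step (stand-in for mining rights etc.).
        stock += min(miners * ore_per_miner, ore_cap_per_step)
        # New robots are limited by whichever is scarcer: labor or material.
        labor_limit = builders * builds_per_builder
        material_limit = stock / ore_per_robot
        new_robots = min(labor_limit, material_limit)
        stock -= new_robots * ore_per_robot
        robots += new_robots
        history.append(robots)
    return history


if __name__ == "__main__":
    loose = simulate_fleet()                      # labor-limited early on
    tight = simulate_fleet(ore_cap_per_step=1.0)  # material-limited throughout
    print(f"loose cap, final fleet: {loose[-1]:.1f}")
    print(f"tight cap, final fleet: {tight[-1]:.1f}")
```

With the tight cap, the fleet can only add `ore_cap_per_step / ore_per_robot` robots per step no matter how many builders exist, which is the sense in which materials or rights, not money, set the pace.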
Your opinions must rest on an assumption that this problem cannot be solved anytime soon. How confident are you in this belief? What are the obstacles you believe are limiting in developing such a robotics technique?
Past efforts failed, but they didn't have sufficient compute. A multimodal technique like the one described would need all the compute used for GPT-4 plus all the compute for image processing, as well as much more training and quality checks during inference. So essentially the technique has never once been attempted at scale.