They’ll be able to automate ML research in the sense of coming up with experiments to try, and implementing those experiments, but never any new conceptual work.
I agree this seems unlikely. As a baseline, labs have thousands of capabilities researchers coming up with insights, and they could train the models to imitate them. There is also a path of RLVRing (RL with verifiable rewards) against the results of small-scale experiments. Collecting data for research taste is more expensive, but that looks like a difference in degree from software engineering, not a difference in kind.
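To make the RLVR-on-small-experiments path concrete, here is a toy sketch of the loop as I picture it: a policy proposes an experimental choice, the small-scale experiment actually gets run, and the measured outcome is the verifiable reward. Everything here (the candidate set, the stand-in experiment, the bandit-style update) is my own illustration, not any lab's actual recipe.

```python
import random

# Candidate experimental choices the "policy" can propose. Here just
# learning rates; in the real setting these would be whole experiment
# designs. (Illustrative stand-ins.)
CANDIDATE_LRS = [0.001, 0.01, 0.1, 1.0]

def run_small_experiment(lr: float, steps: int = 50) -> float:
    """Stand-in small-scale experiment: minimize f(w) = (w - 3)^2 by
    gradient descent. Returns a verifiable reward, the negative final
    loss: a number you can measure, not a human judgment."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)
        w -= lr * grad
    return -((w - 3.0) ** 2)

def main() -> None:
    random.seed(0)
    # Bandit-style stand-in for the policy update: keep a running
    # reward estimate per choice, act epsilon-greedily on it.
    prefs = {lr: 0.0 for lr in CANDIDATE_LRS}
    for _ in range(200):
        if random.random() < 0.1:                # explore
            lr = random.choice(CANDIDATE_LRS)
        else:                                    # exploit
            lr = max(prefs, key=prefs.get)
        reward = run_small_experiment(lr)        # the verifiable signal
        prefs[lr] += 0.1 * (reward - prefs[lr])  # incremental update
    print("learned reward estimates:", prefs)

if __name__ == "__main__":
    main()
```

The point of the sketch is only that the reward comes from running the experiment itself, so data collection is automated once the experiment harness exists; that is why it reads to me as a difference in degree rather than kind.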
How has your p(doom) changed over this period?
What do you think have been the most important applications of UDT or other decision theories to alignment?
2x uplift is already happening at the most advanced AI lab
This seems plausible to me, but it would be good to have a new METR uplift study before putting much confidence in it.
Could you give an example of an article where this was effective?
The existential risks that everyone will die or that the future will belong to the AIs are obvious.
I'm not sure this is obvious to most people, particularly outside of LW.
The one bench we definitely don't want to be bench-maxxed.
arguing that if you don't know the hazard rate, and instead maintain uncertainty over it that you update over time, then hyperbolic-looking discounting can fall out naturally
This seems relevant to X-risk discussions, where the hazard rate is exactly this kind of deeply uncertain quantity that we update on over time. A sketch of the derivation is below.
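If I'm reconstructing the argument right (this is the setup in Sozou 1998, as I recall), the one-integral version goes: put an exponential prior with rate $k$ over the unknown hazard rate $\lambda$. The probability of surviving to time $t$, marginalizing over $\lambda$, is

$$D(t) = \int_0^\infty e^{-\lambda t}\, k e^{-k\lambda}\, d\lambda = \frac{k}{k+t} = \frac{1}{1 + t/k},$$

so while any fixed $\lambda$ gives exponential discounting $e^{-\lambda t}$, averaging over the uncertainty about $\lambda$ gives exactly the hyperbolic form $1/(1 + t/k)$.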
Claude 4.6 was released about an hour ago. Just ten minutes later, OpenAI released GPT-5.3.