The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace many college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word.
In the link provided, Leopold Aschenbrenner explains why he believes AGI is likely to arrive within the decade, with superintelligence following soon after. He goes into considerable detail; the website is well organized, but the raw PDF runs over 150 pages.
Leopold is a former member of OpenAI's Superalignment team; he was fired in April for allegedly leaking company secrets. However, he contests that portrayal of events in a recent interview with Dwarkesh Patel, saying he leaked nothing of significance and was fired for other reasons.[1]
I am somewhat confused, though, by the new business venture Leopold is now promoting: an "AGI hedge fund" aimed at generating strong returns based on his predictions of imminent AGI. In the Dwarkesh Patel interview, it sounds like his intention is to make sure financial resources are available to back AI alignment and any other moves necessary to help humanity navigate a turbulent future. The discussion in the podcast, however, mostly focuses on whether such a fund would truly generate useful financial returns.
If you read this post, Leopold[2], could you please clarify your intentions in founding this fund?
[1] Specifically, he brings up a memo he sent to the old OpenAI board claiming that OpenAI wasn't taking security seriously enough. He was also one of very few OpenAI employees who did not sign the letter calling for Sam Altman's reinstatement last November, and of course the entire OpenAI Superalignment team has since collapsed for various reasons.
[2] Leopold does have a LessWrong account, but he still hasn't linked his new website here. I hope he doesn't mind me posting in his stead.
I'm curious to hear opinions on what I think is a crux of Leopold's "Situational Awareness":
This disagrees with my own intuition; the gap between chatbot and agent seems stubbornly large. He suggests three main angles of improvement:[2]
We already have pretty large context windows (which has been surprising to me, admittedly), but they've helped less than I expected: I mostly just don't need to move relevant code right next to my cursor as much when using Copilot. I haven't seen really powerful use cases; the closest is probably Devin, and that doesn't work very well. Running large context windows over documents does reasonably well, but in my personal experience LLMs are too unreliable, too biased towards the generic, and too memoryless to get solid benefit out of it.
Put another way, I think large context windows are of pretty limited benefit while LLMs have poor working memory and can't properly keep track of what they're doing over the course of their output.
That leads into the inference-time compute argument, which is both the weakest and the most essential. As I understand it, the goal is to give LLMs a working memory, but how we get there seems really fuzzy. The idea presented is to produce OOMs more tokens and keep them on-track, but the "keep them on-track" part of his writing feels like a mere restatement of the problem to me. The only substantial suggestion I can see is this single line:
And in a footnote on the same page he acknowledges:
That doesn't seem trivial or baked into current AI progress, I think? Maybe I'm misunderstanding something.
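To make my confusion concrete, here is a minimal sketch of what "give the LLM a working memory via inference-time compute" might look like: an outer loop that spends extra tokens maintaining an explicit running summary and re-injects it at every step. This is my own illustration, not anything from the paper; `llm` is a hypothetical stand-in for any chat-completion call.

```python
# Minimal sketch: an external "working memory" maintained across steps.
# `llm` is a hypothetical placeholder for a real model API call.

def llm(prompt: str) -> str:
    raise NotImplementedError  # stand-in for an actual chat-completion call

def solve_with_working_memory(task: str, max_steps: int = 10) -> str:
    memory = "Nothing done yet."
    for _ in range(max_steps):
        # Spend inference-time compute: one call to act, one to update memory.
        action = llm(
            f"Task: {task}\n"
            f"Working memory so far: {memory}\n"
            "Do the next small step. If finished, begin your reply with DONE:"
        )
        if action.startswith("DONE:"):
            return action[len("DONE:"):].strip()
        # The crux: compressing progress into memory without losing the plot.
        memory = llm(
            f"Task: {task}\n"
            f"Previous memory: {memory}\n"
            f"Latest step taken: {action}\n"
            "Rewrite the working memory: what is done, what remains."
        )
    return memory
```

The hard part is exactly the second `llm` call: every rewrite of the memory is another chance for the model to drop or corrupt state, which is why "keep them on-track" reads to me as a restatement of the problem rather than a solution.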
As for enabling full computer access: yes, multimodal models should allow this within a few years, but it will remain of limited benefit if the working memory problem isn't solved.
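For what it's worth, the shape of the computer-access piece seems straightforward by comparison: a perceive-act loop over screenshots. A rough sketch under stated assumptions (`vision_llm` is a hypothetical multimodal model call; the screenshots and input automation use the real `pyautogui` library):

```python
# Sketch of a perceive-act loop for computer use. `vision_llm` is a
# hypothetical multimodal model call; pyautogui is a real library for
# taking screenshots and sending synthetic mouse/keyboard input.
import json
import pyautogui

def vision_llm(image, prompt: str) -> str:
    raise NotImplementedError  # stand-in for a real multimodal model call

def computer_use_loop(goal: str, max_steps: int = 50) -> None:
    for _ in range(max_steps):
        screen = pyautogui.screenshot()  # PIL image of the current display
        reply = vision_llm(
            screen,
            f"Goal: {goal}\n"
            'Reply with JSON: {"action": "click"|"type"|"done", '
            '"x": int, "y": int, "text": str}'
        )
        cmd = json.loads(reply)
        if cmd["action"] == "done":
            return
        if cmd["action"] == "click":
            pyautogui.click(cmd["x"], cmd["y"])
        elif cmd["action"] == "type":
            pyautogui.typewrite(cmd["text"])
```

Note that the loop carries no state between steps beyond what is visible on screen, which is the working memory problem again in a different costume.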
EDIT 12/22/24: Well, it seems Leopold knew more than I did and just couldn't talk about it. We still don't have all the details on o3, but it really does seem like "more inference time compute" can be leveraged into reasoning capability.
[1] Page 9 of the PDF.
[2] Pages 34-37 of the PDF.
[3] Page 36 of the PDF.
I think this will be done via multi-agent architectures ("society of mind" over an LLM).
This does require plenty of calls to an LLM, and hence plenty of inference-time compute.
For example, the current leader of the GAIA benchmark (https://huggingface.co/spaces/gaia-benchmark/leaderboard) is a relatively simple multi-agent concoction by a Microsoft group: https://github.com/microsoft/autogen/tree/gaia_multiagent_v01_march_1st/samples/tools/autogenbench/scenarios/GAIA/Templates/Orchestrator
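For readers who haven't clicked through, the pattern there is roughly the following (a simplified sketch of an orchestrator-style loop, not Microsoft's actual implementation; `llm` is again a hypothetical model call):

```python
# Simplified sketch of an orchestrator-style multi-agent loop.
# Not the actual AutoGen code; `llm` is a hypothetical model call.

def llm(prompt: str) -> str:
    raise NotImplementedError  # stand-in for a real chat-completion call

AGENTS = {
    "coder": "You write and debug code.",
    "browser": "You search the web and summarize findings.",
}

def orchestrate(task: str, max_rounds: int = 20) -> str:
    ledger = []  # shared record of every delegated step and its result
    for _ in range(max_rounds):
        decision = llm(
            f"Task: {task}\nHistory so far: {ledger}\n"
            f"Reply '<agent>: <instruction>' using one of {list(AGENTS)}, "
            "or 'FINAL: <answer>' if the task is complete."
        )
        if decision.startswith("FINAL:"):
            return decision[len("FINAL:"):].strip()
        name, _, instruction = decision.partition(":")
        role = AGENTS.get(name.strip(), "You are a general assistant.")
        result = llm(f"{role}\nInstruction: {instruction}\nHistory: {ledger}")
        ledger.append((name.strip(), instruction.strip(), result))
    return "No answer within the round budget."
```

Every round here costs several model calls, which is where the inference-time compute goes.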