I am a PhD student in computer science at the University of Waterloo, supervised by Professor Ming Li and advised by Professor Marcus Hutter.
My current research concerns applications of algorithmic probability to sequential decision theory (universal artificial intelligence). Recently I have been trying to start a dialogue between the computational cognitive science and UAI communities. Sometimes I build robots, professionally or otherwise. Another hobby (and a personal favorite among my posts here) is the Sherlockian abduction master list, a crowdsourced project seeking to make "Sherlock Holmes" style inference feasible by compiling observational cues. Give it a read and see if you can contribute!
See my personal website colewyeth.com for an overview of my interests and work.
I do ~two types of writing: academic publications and (LessWrong) posts. With the former I try to be careful enough that I can stand by ~all (strong/central) claims in 10 years, usually by presenting theorems with rigorous proofs alongside only more conservative intuitive speculation. With the latter, I try to learn enough by writing that I have changed my mind by the time I'm finished - and though I usually include an "epistemic status" to suggest my (final) degree of confidence before posting, the ensuing discussion often changes my mind again. As of mid-2025, I think the chance of AGI in the next few years is high enough (though still <50%) that it's best to focus on disseminating safety-relevant research as rapidly as possible, so I'm focusing less on long-term goals like academic success and the associated incentives. That means most of my work will appear online in an unpolished form long before it is published.
It looks like AI 2027 was posted on April 3rd, 2025?
In that case, August was about 4 months away, which means late September is already 20-25% slower than projected, and we are still a few percentage points short of the predicted scores. It seems reasonable to expect those scores sometime in October or November, but that would put us, say, 40-50% over their predicted timeline.
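For concreteness, here is that arithmetic as a quick Python sketch. The projected target date (I'm assuming the predicted scores were implied for mid-to-late August) and the comparison dates are my own guesses, so the exact percentages shift with whatever you plug in:

```python
# Back-of-envelope slippage check. The projected target date (mid-to-late
# August) and the comparison dates are my own assumptions, not AI 2027's
# exact claims, so treat the percentages as rough.
from datetime import date

posted = date(2025, 4, 3)        # AI 2027 publication date
projected = date(2025, 8, 20)    # assumed date the predicted scores were due
horizon = (projected - posted).days

for label, actual in [
    ("late September", date(2025, 9, 25)),
    ("mid October", date(2025, 10, 15)),
    ("early November", date(2025, 11, 5)),
]:
    slip = (actual - posted).days / horizon - 1
    print(f"{label}: {slip:+.0%} over the projected timeline")
```

With an assumed target of August 20, this prints roughly +26%, +40%, and +55% - the ballpark behind the 20-25% and 40-50% figures above.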
The authors have emphasized repeatedly that AI 2027 was and is faster than their modal scenario, which makes this kind of evaluation annoying to do, but I would have to say that things look significantly behind the specific story in that piece. I say this because it is a bit of an overstatement to praise their predictive accuracy on mid-2025 predictions made in early-to-mid 2025 when that accuracy is off on the scale of a month or two - and their predictions for 2025 were not viewed as particularly radical or unexpected at the time, as far as I remember. It seems to me that even a hardcore skeptic of AI 2027 would have been unlikely to predict a much larger error.
(I believe I myself left a comment like "I expect this to start not happening right away," but specified in a follow-up conversation that I was not talking about 2025.)
Still, I appreciate that you are checking in on the accuracy of their story with real numbers.
Hmm, I just imagine Saturday and Sunday on top, with the other days in a lower row.
So that was your idea! Aram Ebtekar and I have been going around suggesting this but couldn’t remember who initially proposed it at Iliad 2.
Any more details on Pokémon performance?
I kind of expected an improvement after hearing Anthropic's (unverified) claims that it could work on SWE tasks for 30 hours, etc. Is zero-shotting games just not as closely connected to performing other long-term tasks as I thought? If it can't beat Pokémon, it's hard for me to believe it can have a very long (METR) task length score. It seems that multiple-hour projects start to require some serious planning and online learning (and maybe even perception eventually - but maybe perception is the big difference?).
That is… an interesting question.
Are you implying they have over a 4-hour task length? I'm confused about what you're updating on.
I think that's reasonable, but Anthropic's claim would still be a little misleading if it can only do tasks that take humans 2 hours. Like, if you claim that your model can work for 30 hours, people don't have much of a way to conceptualize what that means except in terms of human task lengths. After all, I can make GPT-4 remain coherent for 30 hours by running it very slowly. So what does this number actually mean?
Well, dividing 30 hours by <7 would give >4 hours, which is on trend ;)
Actually, that would still be somewhat ahead of trend.
I mean, if true, or even close to true, it would totally change my model of AI progress.
If the task length is like 4 hours, then I'll just update towards not believing Anthropic about this stuff.
I expect this to start not happening right away.
So at least we’ll see who’s right soon.