Cole Wyeth

I am a PhD student in computer science at the University of Waterloo, supervised by Professor Ming Li and advised by Professor Marcus Hutter.

My current research is related to applications of algorithmic probability to sequential decision theory (universal artificial intelligence). Recently I have been trying to start a dialogue between the computational cognitive science and UAI communities. Sometimes I build robots, professionally or otherwise. Another hobby (and a personal favorite of my posts here) is the Sherlockian abduction master list, which is a crowdsourced project seeking to make "Sherlock Holmes" style inference feasible by compiling observational cues. Give it a read and see if you can contribute!

See my personal website colewyeth.com for an overview of my interests and work.

I do ~two types of writing: academic publications and (LessWrong) posts. With the former, I try to be careful enough that I can stand by ~all (strong/central) claims in 10 years, usually by presenting a combination of theorems with rigorous proofs and only more conservative intuitive speculation. With the latter, I try to learn enough by writing that I have changed my mind by the time I'm finished - and though I usually include an "epistemic status" to suggest my (final) degree of confidence before posting, the ensuing discussion often changes my mind again. As of mid-2025, I think the chance of AGI in the next few years is high enough (though still <50%) that it's best to focus on disseminating safety-relevant research as rapidly as possible, so I'm focusing less on long-term goals like academic success and the associated incentives. That means most of my work will appear online in an unpolished form long before it is published.

Sequences

I recklessly speculate about timelines
Meta-theory of rationality
AIXI Agent foundations
Deliberative Algorithms as Scaffolding

Comments

Cole Wyeth's Shortform
Cole Wyeth · 4d · 22

Semantics; it’s obviously not equivalent to physical violence. 

AI 2027: What Superintelligence Looks Like
Cole Wyeth · 6mo · 68

I expect this to start not happening right away.

So at least we’ll see who’s right soon.

tdko's Shortform
Cole Wyeth · 7h · 50

The graph is slightly more informative: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/

I think it's fair to say I predicted this - I expected exponential growth in task length to become a sigmoid in the short term: 

In particular, I expected that Claude's decreased performance on Pokémon with Sonnet 4.5 indicated that its task length would not be very high - certainly not 30 hours. I understand that Anthropic did not claim 30 hours of human-equivalent work, but I still find their claim of 30 hours of continuous software engineering dubious - what exactly does that number mean, if it does not indicate even 2 hours of human-equivalent autonomy? For any N >= 3, I can write a program that "remains coherent" while "working continuously" for 30 hours simply by throttling GPT-N to a low tokens/hour rate. This result decreases my trust in Anthropic's PR machine (which was already pretty low).
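To make the throttling point concrete, here is a minimal sketch (hypothetical: call_model stands in for any chat-completion API, and the numbers are arbitrary). The wall-clock length of a "continuous" run is set entirely by the throttle, not by the model's autonomy.

```python
import time

TOKENS_PER_HOUR = 500          # throttle chosen to stretch the run
CHUNK_TOKENS = 100             # tokens produced per call
TOTAL_TOKEN_BUDGET = 15_000    # whatever the task actually needs

def call_model(prompt: str, max_tokens: int) -> str:
    """Placeholder for any chat-completion API; returns a dummy chunk here."""
    return f"[{max_tokens} tokens of output]"

def run_continuously(task: str) -> list[str]:
    transcript: list[str] = []
    tokens_used = 0
    while tokens_used < TOTAL_TOKEN_BUDGET:
        # Carry the whole transcript forward so the run stays "coherent".
        chunk = call_model(task + "\n".join(transcript), max_tokens=CHUNK_TOKENS)
        transcript.append(chunk)
        tokens_used += CHUNK_TOKENS
        # Sleep so throughput never exceeds TOKENS_PER_HOUR.
        time.sleep(CHUNK_TOKENS / TOKENS_PER_HOUR * 3600)
    return transcript

# 15,000 tokens at 500 tokens/hour is ~30 hours of "continuous work",
# independent of how capable the underlying model actually is.
```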

To be clear, this is only one data point and I may well be proven wrong very soon.

However, I think we can say that the “faster exponential” for inference scaling which some people expected did not hold up.

Daniel Kokotajlo's Shortform
Cole Wyeth · 1d · 20

That’s interesting - I used to play with my brother’s cube but never thought of building a cube as part of the game. 

Daniel Kokotajlo's Shortform
Cole Wyeth · 1d · 30

The only Magic format that I found consistently fun was draft. I think the game is just not very well optimized for expected fun. Like, you have to carefully balance LANDS with your real cards, and most of the lands are nearly forced to be boring, and even if you get the balance right, sometimes you don't get to play because the lands don't come up. I guess this is partially a skill issue, but I played for years and this happened to me and all of my usual opponents regularly, as far as I remember. Why would you design a game so that you can get unlucky and not get to play, like, at all? And then on top of that, often someone just gets a wild combo and stomps, and it's not fun to get stomped and it's not that fun to stomp either.

Cole Wyeth's Shortform
Cole Wyeth · 2d · 42

GPT-5's insight on Scott Aaronson's research problem (which I posted about) seems to be a weaker signal than I believed; see the update:

https://www.lesswrong.com/posts/RnKmRusmFpw7MhPYw/cole-wyeth-s-shortform?commentId=E5QCkGcs4eYoJNwhn
 

Cole Wyeth's Shortform
Cole Wyeth · 2d · 60

Has any LLM with fixed scaffolding beaten Pokémon end to end with no hints?

abramdemski's Shortform
Cole Wyeth · 3d · 20

That's an interesting point.

Cole Wyeth's Shortform
Cole Wyeth · 3d · 50

I don't think it's hyperbolic at all; I think this is in fact a central instance of the category I'm gesturing at as "epistemic violence," which also includes p-hacking, lying, manipulation, misleading data, etc. If you don't think that category is meaningful, or you dislike my name for it, can you be more specific about why? Or about why this is not an instance? Another commenter, @Guive, objected to my usage of the word violence here because "words can't be violence," which I think is a small skirmish in a wider culture war that I am really not trying to talk about.

To be explicit (again): I do not in any way want to imply that a person using an LLM without disclosing it justifies physical violence against them. I also don't think it's intentionally an aggression. But depending on the case, it CAN BE seriously negligent towards the truth and towards community truth-seeking norms, and in that careless negligence it can damage the epistemics of others, when a simple disclaimer / "epistemic status" / source would have been VERY low effort to add. I have to admit I hesitate a bit to say this so explicitly, because many people I respect use LLMs extensively, and I am not categorically against this, and I feel slightly bad about potentially burdening or just insulting them - generally speaking, I feel some degree of social pressure against saying this. And as a result, I hesitate to back down from my framing without a better reason than that it feels uncomfortable and some people don't like it.

Do Things for as Many Reasons as Possible
Cole Wyeth · 4d · 60

I also use this heuristic. I originally read it here: 

https://www.goodreads.com/quotes/8668072-never-do-something-for-just-one-reason
 
Obviously this form is too strong.

Posts

5 · Cole Wyeth's Shortform (Ω, 1y, 250 comments)
63 · Alignment as uploading with more steps (Ω, 26d, 33 comments)
16 · Sleeping Experts in the (reflective) Solomonoff Prior (Ω, 1mo, 0 comments)
53 · New Paper on Reflective Oracles & Grain of Truth Problem (Ω, 1mo, 0 comments)
46 · Launching new AIXI research community website + reading group(s) (2mo, 2 comments)
26 · Pitfalls of Building UDT Agents (Ω, 2mo, 5 comments)
16 · Explaining your life with self-reflective AIXI (an interlude) (Ω, 3mo, 0 comments)
29 · Unbounded Embedded Agency: AEDT w.r.t. rOSI (Ω, 3mo, 0 comments)
19 · A simple explanation of incomplete models (Ω, 3mo, 1 comment)
65 · Paradigms for computation (Ω, 3mo, 10 comments)
31 · LLM in-context learning as (approximating) Solomonoff induction (Ω, 4mo, 3 comments)

Wikitag Contributions

AIXI · 9 months ago · (+11/-174)
Anvil Problem · a year ago · (+119)