Cole Wyeth

I am a PhD student in computer science at the University of Waterloo, supervised by Professor Ming Li and advised by Professor Marcus Hutter.

My current research is related to applications of algorithmic probability to sequential decision theory (universal artificial intelligence). Recently I have been trying to start a dialogue between the computational cognitive science and UAI communities. Sometimes I build robots, professionally or otherwise. Another hobby (and a personal favorite of my posts here) is the Sherlockian abduction master list, which is a crowdsourced project seeking to make "Sherlock Holmes" style inference feasible by compiling observational cues. Give it a read and see if you can contribute!

See my personal website colewyeth.com for an overview of my interests and work.

I do ~two types of writing: academic publications and (LessWrong) posts. With the former I try to be careful enough that I can stand by ~all (strong/central) claims in 10 years, usually by presenting a combination of theorems with rigorous proofs and only more conservative intuitive speculation. With the latter, I try to learn enough by writing that I have changed my mind by the time I'm finished - and though I usually include an "epistemic status" to suggest my (final) degree of confidence before posting, the ensuing discussion often changes my mind again. As of mid-2025, I think that the chances of AGI in the next few years are high enough (though still <50%) that it’s best to focus on disseminating safety-relevant research as rapidly as possible, so I’m focusing less on long-term goals like academic success and the associated incentives. That means most of my work will appear online in an unpolished form long before it is published.

Sequences

I recklessly speculate about timelines
Meta-theory of rationality
AIXI Agent foundations
Deliberative Algorithms as Scaffolding

5 · Cole Wyeth's Shortform · 1y · 192 comments

Comments

AI 2027: What Superintelligence Looks Like
Cole Wyeth · 5mo

I expect this to start not happening right away.

So at least we’ll see who’s right soon.

What, if not agency?
Cole Wyeth · 19m

I think I agree with your take on this, Abram.

The most extreme version of an AI not being self-defensive seems like the Greg Egan “Permutation City” scenario, where shutting down a simulation doesn’t even harm anyone inside - that computation just picks some other substrate “out of the dust.”

By the way, this post dovetails interestingly with my latest on alignment as uploading with more steps.

A brief argument against utilitarianism
Cole Wyeth · 2h

We don't confuse metrics for utility functions.

Kabir Kumar's Shortform
Cole Wyeth · 17h

What kind of knowledge specifically are these lawyers looking for?

Alignment as uploading with more steps
Cole Wyeth · 20h

I don’t necessarily disagree that these guesses are plausible, but I don’t think it’s possible to predict exactly what emulation world ends up looking like, and even your high level description of the dynamics looks very likely to be wrong.

The goal is to become one of the early emulations and shape the culture, regulations, technology etc. into a positive and stable form - or at least, into carefully chosen initial conditions.

Alignment as uploading with more steps
Cole Wyeth · 20h

There’s a difference between building a model of a person and using that model as a core element of your decision making algorithm. So what you’re describing seems even weaker than weak necessity.

However, I agree that some of the ideas I’ve sketched are pretty loose. I’m only trying to provide a conceptual frame and work out some of its implications.

Alignment as uploading with more steps
Cole Wyeth · 1d

Skill issue, past me endorses current me. 

Alignment as uploading with more steps
Cole Wyeth · 1d

Yes, I think what you’re describing is basically CIRL? This can potentially achieve incremental uploading. I just see it as technically more challenging than pure imitation learning. However, it seems conceivable that something like CIRL is needed during some kind of “takeoff” phase, when the (imitation learned) agent tries to actively learn how it should generalize by interacting with the original over longer time scales and while operating in the world. That seems pretty hard to get right. 

Alignment as uploading with more steps
Cole Wyeth · 1d

This is interesting, but I again caution that fine-tuning a foundation model is unlikely to result in an emulation which generalizes properly. Same (but worse) for prompting.

Alignment as uploading with more steps
Cole Wyeth · 1d

Maybe, but we usually endorse the way that our values change over time, so this isn’t necessarily a bad thing.

Also, I find it hard to imagine hating my past self so much that I would want to kill him or allow him to be killed. I feel a certain protectiveness and affection for my self of 10 or 15 years ago. So I feel like at least weak upload sufficiency should hold - do you disagree?

Wikitag Contributions

AIXI · 8 months ago · (+11/-174)
Anvil Problem · a year ago · (+119)
Posts (karma · title · age · comment count)

57 · Alignment as uploading with more steps · 1d · 17 comments
16 · Sleeping Experts in the (reflective) Solomonoff Prior (Ω) · 15d · 0 comments
53 · New Paper on Reflective Oracles & Grain of Truth Problem (Ω) · 21d · 0 comments
46 · Launching new AIXI research community website + reading group(s) · 1mo · 2 comments
26 · Pitfalls of Building UDT Agents (Ω) · 2mo · 5 comments
16 · Explaining your life with self-reflective AIXI (an interlude) (Ω) · 2mo · 0 comments
29 · Unbounded Embedded Agency: AEDT w.r.t. rOSI (Ω) · 2mo · 0 comments
19 · A simple explanation of incomplete models (Ω) · 2mo · 1 comment
64 · Paradigms for computation (Ω) · 3mo · 10 comments
31 · LLM in-context learning as (approximating) Solomonoff induction (Ω) · 3mo · 3 comments