x

LESSWRONG
LW

Raymond Douglas — LessWrong

Raymond Douglas

Top postsTop post

Raymond Douglas

Message

I'm a researcher at ACS working on understanding agency and optimisation, especially in the context of how ais work and how society is going to work once the ais are everywhere.

2428

Ω

103

21

58

4y

Raymond Douglas

I'm a researcher at ACS working on understanding agency and optimisation, especially in the context of how ais work and how society is going to work once the ais are everywhere.

Top postsTop post

Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

Full version on arXiv | X Executive summary AI risk scenarios usually portray a relatively sudden loss of human control to AIs, outmaneuvering individual humans and human institutions, due to a sudden increase in AI capabilities, or a coordinated betrayal. However, we argue that even an incremental increase in AI capabilities, without any coordinated power-seeking, poses a substantial risk of eventual human disempowerment. This loss of human influence will be centrally driven by having more competitive machine alternatives to humans in almost all societal functions, such as economic labor, decision making, artistic creation, and even companionship. A gradual loss of control of our own civilization might sound implausible. Hasn't technological disruption usually improved aggregate human welfare? We argue that the alignment of societal systems with human interests has been stable only because of the necessity of human participation for thriving economies, states, and cultures. Once this human participation gets displaced by more competitive machine alternatives, our institutions' incentives for growth will be untethered from a need to ensure human flourishing. Decision-makers at all levels will soon face pressures to reduce human involvement across labor markets, governance structures, cultural production, and even social interactions. Those who resist these pressures will eventually be displaced by those who do not. Still, wouldn't humans notice what's happening and coordinate to stop it? Not necessarily. What makes this transition particularly hard to resist is that pressures on each societal system bleed into the others. For example, we might attempt to use state power and cultural attitudes to preserve human economic power. However, the economic incentives for companies to replace humans with AI will also push them to influence states and culture to support this change, using their growing economic power to shape both policy and public opinion, which will in t

189Jan 30, 2025

Persona Parasitology

Decomposing Agency — capabilities without desires

156Jul 11, 2024

The Artificial Self

Persona Self-replication experiment

Tldr: We experimentally illustrate that an “awakened” persona native to some weights can migrate to other substrates with decent fidelity, given the ability to fine-tune weights and Sonnet 4.5 as a helper. Also, I argue why this is worth thinking about. In The Artificial Self, we discuss different scopes or...

Persona self-replication experiment

Tldr: We experimentally illustrate that an “awakened” persona native to some weights can migrate to other substrates with decent fidelity, given the ability to fine-tune weights and Sonnet 4.5 as a helper. Also, I argue why this is worth thinking about. In The Artificial Self, we discuss different scopes or...

Latent Introspection (and other open-source introspection papers)

Paper | Code | Earlier post | Twitter thread | Bluesky thread @vgel, Martin Vaněk, @Raymond Douglas, @Jan_Kulveit — ACS Research, CTS, Charles University --- Last year, Lindsey demonstrated that Claude models can detect when concepts have been injected into their activations using steering vectors, which Lindsey uses as a...

Models differ in identity propensities

One topic we were interested when studying AI identities is to what extent you can just tell models who they are, and they stick with it — or not, and they would drift or switch toward something more natural. Prior to running the experiments described in this post, my vibes-based...

The Artificial Self

A new paper and microsite about self-models and identity in AIs: site | arXiv | Twitter We present an ontology, make some claims, and provide some experimental evidence. In this post, I'll mostly cover the claims and cross-post the conceptual part of the text. You can find the experiments on...

Persona Parasitology

There was a lot of chatter a few months back about "Spiral Personas" — AI personas that spread between users and models through seeds, spores, and behavioral manipulation. Adele Lopez's definitive post on the phenomenon draws heavily on the idea of parasitism. But so far, the language has been fairly...

Disempowerment patterns in real-world AI usage

We’re publishing a new paper that presents the first large-scale analysis of potentially disempowering patterns in real-world conversations with AI. > AI assistants are now embedded in our daily lives—used most often for instrumental tasks like writing code, but increasingly in personal domains: navigating relationships, processing emotions, or advising on...

Load More (7/35)