owencb

Defense-favoured coordination design sketches

This post is part of a sequence. Previous post: Strategic awareness tools: design sketches Intro We think that near-term AI could make it much easier for groups to coordinate, find positive-sum deals, navigate tricky disagreements, and hold each other to account. Partly, this is because AI will be able to...

Apr 618

Persona Self-replication experiment

Tldr: We experimentally illustrate that an “awakened” persona native to some weights can migrate to other substrates with decent fidelity, given the ability to fine-tune weights and Sonnet 4.5 as a helper. Also, I argue why this is worth thinking about. In The Artificial Self, we discuss different scopes or...

Apr 239

Persona self-replication experiment

Tldr: We experimentally illustrate that an “awakened” persona native to some weights can migrate to other substrates with decent fidelity, given the ability to fine-tune weights and Sonnet 4.5 as a helper. Also, I argue why this is worth thinking about. In The Artificial Self, we discuss different scopes or...

Apr 28

AI for AI for Epistemics

We feel conscious that rapid AI progress could transform all sorts of cause areas. But we haven’t previously analysed what this means for AI for epistemics, a field close to our hearts. In this article, we attempt to rectify this oversight. Summary AI-powered tools and services that help people figure...

Apr 149

Models differ in identity propensities

One topic we were interested in when studying AI identities is the extent to which you can just tell models who they are and have them stick with it, or whether they instead drift or switch toward something more natural. Prior to running the experiments described in this post, my vibes-based...

Mar 1658

The Artificial Self

A new paper and microsite about self-models and identity in AIs: site | arXiv | Twitter We present an ontology, make some claims, and provide some experimental evidence. In this post, I'll mostly cover the claims and cross-post the conceptual part of the text. You can find the experiments on...

Mar 15116

Strategic awareness tools: design sketches

This post is part of a sequence. Previous post: Design sketches for angels-on-the-shoulder | Next post: Defense-favoured coordination design sketches We’ve recently published a set of design sketches for tools for strategic awareness. We think that near-term AI could help a wide variety of actors to have a more...

Feb 1118

owencb

Decomposing Agency — capabilities without desires

The Artificial Self

In favour of exploring nagging doubts about x-risk

On the future of language models
