Strategic awareness tools: design sketches

rosehadshar, owencb, Lizka, Oliver Sourbut

This post is part of a sequence. Previous post: Design sketches for angels-on-the-shoulder

We’ve recently published a set of design sketches for tools for strategic awareness.

We think that near-term AI could help a wide variety of actors to have a more grounded and accurate perspective on their situation, and that this could be quite important:

- Tools for strategic awareness could make individuals more epistemically empowered and better able to make decisions in their own best interests.
- Better strategic awareness could help humanity to handle some of the big challenges that are heading towards us as we transition to more advanced AI systems.

We’re excited for people to build tools that help this happen, and hope that...

Design sketches for a more sensible world

owencb, Lizka, Oliver Sourbut, rosehadshar

We don’t think that humanity knows what it’s doing when it comes to AI progress. More and more people are working on developing better systems and trying to understand what their impacts will be — but our foresight is just very limited, and things are getting faster and faster.

Imagine a world where this continues to be the state of play. We fumble our way to creating the most important technology humanity will ever create, an epoch-defining technology, and we’re basically making it up as we go along. Of course, this might pan out OK (after all, in the past we’ve often muddled through), but facing the challenges advanced AI will bring...

Design sketches for angels-on-the-shoulder

owencb, Lizka, Oliver Sourbut, rosehadshar

This post is part of a sequence. Previous post: Design sketches: collective epistemics | Next post: Strategic awareness tools: design sketches

We’ve recently published a set of design sketches for technological analogues to ‘angels-on-the-shoulder’: customized tools that leverage near-term AI systems to help people better navigate their environments and handle tricky situations in ways they’ll feel good about later.

We think that these tools could be quite important:

- In general, we expect angels-on-the-shoulder to mean more endorsed decisions, and fewer unforced errors.
- In the context of the transition to more advanced AI systems that we’re faced with, this could be a huge deal. We think that people who are better informed, more situationally aware, more in

...

Replying to How (and why) to read Drexler on AI

owencb 21d*

I take this to be pretty strong evidence that this is not a good article for people reading Drexler to start with! (FWIW I valued reading it, but I'm now realising that the value I got was largely in understanding a bit better how Eric's sweep of ideas connect, and perhaps that wouldn't have been available to me if I hadn't had the background context.)

Edit: I edited the original post to change the recommendation there slightly.

Replying to How (and why) to read Drexler on AI

owencb 22d

I feel like trying properly to explain it would veer more into speculating-about-his-psychology than I really want to. But it doesn't seem totally inexplicable to me, and I'd imagine that an explanation might look something like:

Eric doesn't think it's his comparative advantage to answer these questions; he also sometimes experiences people raising them as distracting from the core messages he is trying to convey.

(To be clear, I'm not claiming that this is what is happening; I'm just trying to explain why it doesn't feel in-principle inexplicable to me.)

How (and why) to read Drexler on AI

owencb

22d

I have been reading Eric Drexler’s writing on the future of AI for more than a decade at this point. I love it, but I also think it can be tricky or frustrating.

More than anyone else I know, Eric seems to tap into a deep vision for how the future of technology may work — and having once tuned into this, I find many other perspectives can feel hollow. (This reminds me of how, once I had enough of a feel for how economies work, I found a lot of science fiction felt hollow, if the world presented made too little sense in terms of what was implied for off-screen variables.)

One cornerstone...

Replying to On green

owencb 1mo

I think it's easy to locally adopt bits of Greenish perspective when one can see how they would be instrumentally useful.

The claim I'm making is that it's often a good idea to adopt bits of Greenish perspective even when you can't see how they would be instrumentally useful -- because a reasonable chunk of the time they will be instrumentally useful and you just can't see it yet.

I don't think that requires adopting Green's justifications as terminal, but it does require you to adopt some generator-of-Greenish-perspective that isn't just "Blue led me to a Greenish conclusion in this particular case".

Replying to On green

owencb 1mo

I feel like you're rejecting the instrumental value of Green based on a particular story you've invented about why green might be instrumentally valuable.

But ... IDK, it seems to me like a lot of the value of Green relates to being a boundedly rational actor, in a world with other minds. When I envision a world with a bunch of actors who appreciate Green and try to stay connected to that in their actions, I think they're less likely to fuck it up than a world with the same actors who otherwise disregard green. I think they're more likely to respect Chesterton's fences (and not cause unilateralist's curse type catastrophes), and they're...

Replying to Acting Wholesomely

owencb 2mo

This makes a lot of sense! I do find that the way my brain wants to fit your experience in with my conception of wholesomeness is that you were perhaps not attending enough to the part of the whole that was your own internal experience and needs?

Replying to Acting Wholesomely

owencb 2mo

Oh nice, I kind of vibe with "meditation on a theme" as a description of what this post is doing and failing to do.

Replying to The Choice Transition

owencb 2mo | Review for 2024 Review

Overall I'm really happy with this post.

It crystallized a bunch of thoughts I'd had for a while before this, and has been useful as a conceptual building block that's fed into my general thinking about the situation with AI, and the value of accelerating tools to improve epistemics and coordination. I often find myself wanting to link people to it.

Possible weaknesses:

- While I think the basic analysis looks directionally correct, I wonder if in places it's a bit oversimplified (like maybe you could usefully unpack the concept of "deliberate steering")
  - Seems fine for early work on this?
- Empirically, the concept doesn't seem to have been catchy
  - I think that we've done more to spread the general

owencb 2mo | Review for 2024 Review

AI, centralization, and the One Ring

This was written a few months after Situational Awareness. I felt like there was kind of a missing mood in x-risk discourse around that piece, and this was an attempt to convey both the mood and something of the generators of the mood.

Since then, the mood has shifted, to something that feels healthier to me. 80,000 Hours has a problem profile on extreme power concentration. At this point I mostly wouldn't link back to this post (preferring to link e.g. to more substantive research), although I might if I just really wanted to convey the mood to someone. I'm not really sure whether my article had any counterfactual responsibility for the research people have done in the interim.

Replying to Caring about excellence

owencb 2mo | Review for 2024 Review

I'm happy with this post. I think it captures something meta-level which is important in orienting to doing a good job of all sorts of work, and I occasionally want to point people to this.

Most of the thoughts probably aren't super original, but for something this important I am surprised that there isn't much more explicit discussion -- it seems like it's often just talked about at the level of a few sentences, and regarded as a matter of taste, or something. For people who aspire to do valuable work, I guess it's generally worth spending a few hours a year explicitly thinking about the tradeoffs here and how to navigate them in particular situations -- and then probably worth at least a bit of scaffolding or general thinking about the topic.

Replying to Decomposing Agency — capabilities without desires

owencb 2mo | Review for 2024 Review

I like this post and am glad that we wrote it.

Despite that, I feel keenly aware that it's asking a lot more questions than it's answering. I don't think I've got massively further in the intervening year in having good answers to those questions. The way this thinking seems to me to be most helpful is as a background model to help avoid confused assumptions when thinking about the future of AI. I do think this has impacted the way I think about AI risk, but I haven't managed to articulate that well yet (maybe in 2026 ...).

Human Dignity: a review

owencb

2mo

I have in my possession a short document purporting to be a manifesto from the future.

That’s obviously absurd, but never mind that. It covers some interesting ground, and the second half is pretty punchy. Let’s discuss it.

Principles for Human Dignity in the Age of AI
Humanity is approaching a threshold. The development of artificial intelligence promises extraordinary abundance — the end of material poverty, liberation from disease, tools that amplify human potential beyond current imagination. But it also challenges the fundamental assumptions of human existence and meaning. When machines surpass us in all domains, where will we find our purpose? When our choices can be predicted and shaped by systems we do not

...

Embedded Altruism [slides]

owencb

7mo

How should we think about doing good, when we're a part of a world which is too complex for us to fully understand?

Pragmatically, I think the answer should look less like "figure out the best thing to do and then do that", and more like a combination of working out how to act well in the roles we have assumed, and strategically thinking about which areas to move into.

I'm afraid this is sort of half-baked stuff! But it's been half-baked for several years, and in the last few months I've found myself a few times linking people to my old slides, so it seemed like it could be useful to just make public, even though it's still imperfect.

I'd love to hear pushback / refinements / requests for more concrete examples, etc.

The crucible — how I think about the situation with AI

owencb

9mo

The basic situation

The world is wild and terrible and wonderful and rushing forwards so so fast.

Modern economies are tremendous things, allowing crazy amounts of coordination. People have got really very good at producing stuff. Long-term trends are towards more affluence, and less violence.

The Enlightenment was pretty fantastic not just for bringing us better tech, but also more truthseeking, better values, etc.

People, on the whole, are basically good — they want good things for others, and they want to be liked, and they want the truth to come out. This is some mix of innate and socially conditioned. (It isn’t universal.) But they also often are put in a tight spot and end...

Disempowerment spirals as a likely mechanism for existential catastrophe

Raymond Douglas, owencb

10mo

When complex systems fail, it is often because they have succumbed to what we call "disempowerment spirals" — self-reinforcing feedback loops where an initial threat progressively undermines the system's capacity to respond, leading to accelerating vulnerability and potential collapse.

Consider a city gradually falling under the control of organized crime. The criminal organization doesn't simply overpower existing institutions through sheer force. Rather, it systematically weakens the city's response mechanisms: intimidating witnesses, corrupting law enforcement, and cultivating a reputation that silences opposition. With each incremental weakening of response capacity, the criminal faction acquires more power to further dismantle resistance, creating a downward spiral that can eventually reach a point of no return.

This basic pattern...

Knowledge, Reasoning, and Superintelligence

owencb

11mo

What makes people good at solving novel problems? In part, it’s having relevant knowledge. And in part, it’s being able to think flexibly to apply that knowledge. So intelligence isn't a scalar: there are at least two dimensions — often called crystallized intelligence and fluid intelligence.

This is also broadly true of AI. We can see the distinction between language models regurgitating facts and reasoning models tracing out implications. We can see the difference between the performance of AlphaGo’s policy network, and what it achieves with Monte Carlo tree search. And we can imagine this distinction even for superintelligent systems.

This is a conceptual discussion. But the motivation is pragmatic. If we have a better...

AI Tools for Existential Security

Lizka, owencb

Rapid AI progress is the greatest driver of existential risk in the world today. But — if handled correctly — it could also empower humanity to face these challenges.

Executive summary

1. Some AI applications will be powerful tools for navigating existential risks

Three clusters of applications are especially promising:

- Epistemic applications to help us anticipate and plan for emerging challenges
  - e.g. high-quality AI assistants could prevent catastrophic decisions by helping us make sense of rapidly evolving situations
- Coordination-enabling applications to help diverse groups work together towards shared goals
  - e.g. automated negotiation could help labs and nations to find and commit to mutually desirable alternatives to racing
- Risk-targeted applications to address specific challenges
  - e.g. automating alignment research could make the difference between “It’s functionally impossible to bring

...

Just a prompt to say that if you've been kicking around an idea of possible relevance to the essay competition on the automation of wisdom and philosophy, now might be the moment to consider writing it up -- entries are due in three weeks.

LESSWRONG
LW

owencb

Decomposing Agency — capabilities without desires

In favour of exploring nagging doubts about x-risk

On the future of language models

Essay competition on the Automation of Wisdom and Philosophy — $25k in prizes

owencb

Strategic awareness tools: design sketches

Design sketches for a more sensible world

Design sketches for angels-on-the-shoulder

How (and why) to read Drexler on AI

Human Dignity: a review

Embedded Altruism [slides]

The crucible — how I think about the situation with AI

On Wholesomeness

