Lizka

LESSWRONG
LW

Lizka — LessWrong

Strategic awareness tools: design sketches

rosehadshar, owencb, Lizka, Oliver Sourbut

This post is part of a sequence. Previous post: Design sketches for angels-on-the-shoulder

We’ve recently published a set of design sketches for tools for strategic awareness.

We think that near-term AI could help a wide variety of actors to have a more grounded and accurate perspective on their situation, and that this could be quite important:

Tools for strategic awareness could make individuals more epistemically empowered and better able to make decisions in their own best interests.
Better strategic awareness could help humanity to handle some of the big challenges that are heading towards us as we transition to more advanced AI systems.

We’re excited for people to build tools that help this happen, and hope that...

Design sketches for a more sensible world

owencb

owencb, Lizka, Oliver Sourbut, rosehadshar

We don’t think that humanity knows what it’s doing when it comes to AI progress. More and more people are working on developing better systems and trying to understand what their impacts will be — but our foresight is just very limited, and things are getting faster and faster.

Imagine a world where this continues to be the state of play. We fumble our way to creating the most important technology humanity will ever create, an epoch-defining technology, and we’re basically making it up as we go along. Of course, this might pan out OK (after all, in the past we’ve often muddled through), but facing the challenges advanced AI will bring...

Design sketches for angels-on-the-shoulder

owencb

owencb, Lizka, Oliver Sourbut, rosehadshar

This post is part of a sequence. Previous post: Design sketches: collective epistemics | Next post: Strategic awareness tools: design sketches

We’ve recently published a set of design sketches for technological analogues to ‘angels-on-the-shoulder’: customized tools that leverage near-term AI systems to help people better navigate their environments and handle tricky situations in ways they’ll feel good about later.

We think that these tools could be quite important:

In general, we expect angels-on-the-shoulder to mean more endorsed decisions, and fewer unforced errors.
In the context of the transition to more advanced AI systems that we’re faced with, this could be a huge deal. We think that people who are better informed, more situationally aware, more in

...

AI benchmarking has a Y-axis problem

Lizka

11d

TLDR: People plot benchmark scores over time and then do math on them, looking for speed-ups & inflection points, interpreting slopes, or extending apparent trends. But that math doesn’t actually tell you anything real unless the scores have natural units. Most don’t.

Think of benchmark scores as funhouse-mirror projections of “true” capability-space, which stretch some regions and compress others by assigning warped scores for how much accomplishing that task counts in units of “AI progress”. A plot on axes without canonical units will look very different depending on how much weight we assign to different bits of progress.^[1]
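A minimal sketch of the point above, under loud assumptions: the accuracy numbers are made up for illustration, and the log-odds (logit) transform is just one arbitrary order-preserving re-scaling, not something the post prescribes. The same series reads as "slowing down" on the raw axis and "speeding up" on the re-scaled one.

```python
import math


def logit(p):
    """Map a score in (0, 1) to log-odds; order-preserving, but it stretches scores near 1."""
    return math.log(p / (1.0 - p))


def second_differences(values):
    """Discrete curvature of an evenly spaced series: negative = slowing, positive = speeding up."""
    return [values[i + 1] - 2 * values[i] + values[i - 1]
            for i in range(1, len(values) - 1)]


# Hypothetical benchmark accuracies at equally spaced dates (illustrative, not real data).
scores = [0.50, 0.65, 0.78, 0.89, 0.98]

raw_curvature = second_differences(scores)
logit_curvature = second_differences([logit(s) for s in scores])

print("raw axis slowing down:   ", all(d < 0 for d in raw_curvature))
print("logit axis speeding up:  ", all(d > 0 for d in logit_curvature))
```

Neither reading is "the" trend: both axes rank the models identically, and only the choice of units decides whether the curve bends up or down.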

Epistemic status: I haven’t vetted this post carefully, and have no real background in benchmarking or...

The first type of transformative AI?

Lizka

1mo

AI risk discussion often seems to assume that the AI we most want to prepare for will emerge in a “normal” world — one that hasn’t really been transformed by earlier AI systems.

I think betting on this assumption could be a big mistake. If it turns out to be wrong, most of our preparation for advanced AI could end up ~worthless, or at least far less effective than it could have been. We might find ourselves wishing that we’d laid the groundwork for leveraging enormous political will, prepared for government inadequacy, figured out how to run large-scale automated research projects, and so on. Moreover, if earlier systems do change the background situation, influencing how...

Lizka's Shortform

Lizka

2mo

This is a special post for quick takes (aka "shortform"). Only the owner can create top-level comments.

Lizka

2mo

Quick Take

When thinking about the impacts of AI, I’ve found it useful to distinguish between different reasons why automation in some area might be slow. In brief:

raw performance issues
trust bottlenecks
intrinsic premiums for “the human factor”
adoption lag
motivated/active protectionism towards humans

I’m posting this mainly because I’ve wanted to link to this a few times now when discussing questions like "how should we update on the shape of AI diffusion based on...?". Not sure how helpful it will be on its own! (Crossposted from the EA Forum.)

In a bit more detail:

(1) Raw performance issues

There’s a task that I want an AI system to do. An AI system might be able to do it in the...

Lizka

3mo

For what it's worth:

If I ask ChatGPT to illustrate how *I* might be feeling

If I ask it to illustrate a random person's feelings

I was also curious if the "don't worry about what I do or don't want to see" bit was doing work here & tried again without it; don't think it made much of a difference:

(I also asked a follow-up here, and found it interesting.)

If I ask it to illustrate Claude's feelings

If I prompt it to consider less human/English ways of thinking about the question / expressing itself

If I tell it to illustrate what inner experiences it expects to have in the future

If I ask it to illustrate how I expect it to feel

A personal take on why you should work at Forethought (maybe)

Lizka

4mo

Basic facts:

Forethought is hiring; apply for a research role by 2 November.
The two open positions are for (i) “Senior Research Fellows” — people who can lead their own research directions, and (ii) “Research Fellows” — people who aren’t ready to lead an agenda yet, but who could work with others and develop their worldviews and research taste. Forethought is also open to hiring “visiting fellows” who would join for a 3-12-month stint.
And you can refer people to get a bounty of up to £10,000.

In the rest of this post, I sketch out more of my personal take on this area, how Forethought fits in, and why you might or might not want to do this...

ITN 201: pitfalls in ITN BOTECs

Lizka

6mo

The fact that the ITN framework^[1] can help us prioritize between problems feels almost magical to me.

But when I see ITN BOTECs in the wild, I’m often very skeptical. It seems really easy for these estimates to be inadvertent “conclusion-laundering” — regurgitating the author’s opinions in a quantitative or more robust-seeming form without actually providing any independent signal. And even when that’s not the case, the bottom-line estimates can seem so noisy and ungrounded that I trust them less than my fuzzy, un-BOTECed intuitions.

So I'm^[2] sharing notes on some pitfalls that seem especially pernicious and common to (i) help BOTEC authors avoid these issues, (ii) nudge readers to be somewhat cautious about deferring to such...

Notes on dynamism, power, & virtue

Lizka

9mo

This is very rough — it's functionally a collection of links/notes/excerpts that feel related. I don’t think what I’m sharing is in a great format; if I had more mental energy, I would have chosen a more-linear structure to look at this tangle of ideas. But publishing in the current form^[1] seemed better than not sharing at all.

I sometimes feel like two fuzzy perspective-clusters^[2] are pulling me in opposite directions:

Cluster 1: “Getting stuff done, building capacity, being willing to make tradeoffs & high-stakes decisions, etc.”

There are very real, urgent problems in the world. Some decision-makers are incredibly irresponsible — perhaps ~evil — and I wish other people had more influence. Pretending like we’re helpless

...

Podcast on “AI tools for existential security” — transcript

Lizka

Lizka, fin

10mo

About two months ago, @finm and I recorded a discussion about my now-published piece on “AI tools for Existential Security”^[1] (for the podcast Fin has been running at Forethought). I’d never done something like this before, and liked the result a lot more than I thought I would!^[2]

I'm sharing a quick transcript here in case anyone appreciates section headers, hyperlinks, etc. Please note that this is very rough; it's an unpolished, not-double-checked “MVP” edit of an AI-generated transcript, and will definitely have mistakes.^[3]

Here’s the episode itself; you should also be able to find it on most podcast apps. The discussion loosely follows the structure of the piece it focuses on.

My thinking on some of what we discuss here has...

Replying to Preparing for the Intelligence Explosion

Lizka

11mo

FYI: the paper is now out.

See also the LW linkpost: METR: Measuring AI Ability to Complete Long Tasks, and a summary on Twitter.

(IMO this is a really cool paper — very grateful to @Thomas Kwa et al. I'm looking forward to digging into the details.)
