I'd consider those to be "in-scope" for the database, so it would include any such estimates that I was aware of and that weren't too private to share.
If I recall correctly, some estimates in the database are decently related to that, e.g. are framed as "What % of the total possible moral value of the future will be realized?" or "What % of the total possible moral value of the future is lost in expectation due to AI risk?"
But I haven't seen many estimates of that type, and I don't remember seeing any that were explici...
...and while I hopefully have your attention: My team is currently hiring for a Research Manager! If you might be interested in managing one or more researchers working on a diverse set of issues relevant to mitigating extreme risks from the development and deployment of AI, please check out the job ad!
The application form should take <2 hours. The deadline is the end of the day on March 21. The role is remote and we're able to hire in most countries.
People with a wide range of backgrounds could turn out to be the best fit for the role. As such, if you'...
I found this thread interesting and useful, but I feel a key point has been omitted thus far (from what I've read):
Personally I haven't thought about how strong the analogy to GoF (gain-of-function) research is, but another thing that feels worth noting is that there may be a bunch of other cases where the analogy is similarly strong and where major government efforts aimed at risk-reduction have occurred. And my rough sense is that that's indeed the case, e.g. some of the examples here.
In general, at least for important questions worth spending time on, it seems very weird to say "You think X will happen, but we should be very confident it won't because in analogous case Y it didn't", without al...
Cool!
Two questions:
(Disclaimer: I only skimmed this post, having landed here from Habryka's comment on "It could be useful if someone ran a copyediting service". Apologies i...
Thanks for this post! This seems like good advice to me.
I made an Anki card on your three "principles that stand out" so I can retain those ideas. (Mainly for potentially suggesting to people I manage or other people I know - I think I already have roughly the sort of mindset this post encourages, but I think many people don't, and that my occasionally suggesting these techniques could be helpful.)
...It's not sufficient to argue that taking over the world will improve prediction accuracy. You also need to argue that during the training process (in which taking over the world wasn't possible), the agent acquired a set of motivations and skills which will later lead it to take over the world. And I think that depends a lot on the training process.
[...] if during training the agent is asked questions about the internet, but has no ability to edit the internet, then maybe it will have the goal of "predicting the world", but maybe it will have the goal of "
Thanks for this series! I found it very useful and clear, and am very likely to recommend it to various people.
Minor comment: I think "latter" and "former" are the wrong way around in the following passage?
...By contrast, I think the AI takeover scenarios that this report focuses on have received much more scrutiny - but still, as discussed previously, have big question marks surrounding some of the key premises. However, it’s important to distinguish the question of how likely it is that the second species argument is correct, from the question of how seriou
FWIW, I feel that this entry doesn't capture all/most of how I see "meta-level" used.
Here's my attempted description, which I wrote for another purpose. Feel free to draw on it here and/or to suggest ways it could be improved.
Thanks for writing this. The summary table is pretty blurry / hard to read for me - do you think you could upload a higher resolution version? Or if for some reason that doesn't work on LessWrong, could you link to a higher resolution version stored elsewhere?
This section of a new post may be more practically useful than this post was: https://forum.effectivealtruism.org/posts/4T887bLdupiNyHH6f/six-takeaways-from-ea-global-and-ea-retreats#Takeaway__2__Take_more__calculated__risks
My Anki cards
Nanda broadly sees there as being 5 main types of approach to alignment research.
...Addressing threat models: We keep a specific threat model in mind for how AGI causes an existential catastrophe, and focus our work on things that we expect will help address the threat model.
Agendas to build safe AGI: Let’s make specific plans for how to actually build safe AGI, and then try to test, implement, and understand the limitations of these plans. With an emphasis on understanding how to build AGI safely, rather than
Thanks for this! I found it interesting and useful.
I don't have much specific feedback, partly because I listened to this via Nonlinear Library while doing other things rather than reading it, but I'll share some thoughts anyway since you indicated being very keen for feedback.
Adam Binks replied to this list on the EA Forum with:
To add to your list - Subjective Logic represents opinions with three values: degree of belief, degree of disbelief, and degree of uncertainty. One interpretation of this is as a form of second-order uncertainty. It's used for modelling trust. A nice summary here with interactive tools for visualising opinions and a trust network.
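In case a concrete sketch helps other readers: here's a minimal illustration of a binomial Subjective Logic opinion, based on my understanding of the formalism. The class, field names, and numbers are just illustrative, not taken from Adam's comment or any particular library.

```python
from dataclasses import dataclass

@dataclass
class Opinion:
    """A binomial Subjective Logic opinion about a single proposition."""
    belief: float        # evidence-backed support for the proposition
    disbelief: float     # evidence-backed support against it
    uncertainty: float   # mass not committed either way (second-order uncertainty)
    base_rate: float = 0.5  # prior probability used to fill in the uncertain mass

    def __post_init__(self):
        total = self.belief + self.disbelief + self.uncertainty
        assert abs(total - 1.0) < 1e-9, "belief + disbelief + uncertainty must sum to 1"

    def expected_probability(self) -> float:
        """Collapse the opinion to a single probability: E = b + a*u."""
        return self.belief + self.base_rate * self.uncertainty

# Example: weak evidence for a claim, so most of the mass sits in 'uncertainty'.
weak_evidence = Opinion(belief=0.2, disbelief=0.1, uncertainty=0.7)
print(weak_evidence.expected_probability())  # 0.55
```

The extra uncertainty component is what lets the formalism distinguish "50% because I have lots of conflicting evidence" from "50% because I have almost no evidence", which is part of why it's used for modelling trust.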
Not sure what you mean by that being unverifiable? The question says:
...This question resolves as the total number of nuclear weapons (fission or thermonuclear) reported to be possessed across all states on December 31, 2022. This includes deployed, reserve/nondeployed, and retired (but still intact) warheads, and both strategic and nonstrategic weapons.
Resolution criteria will come from the Federation of American Scientists (FAS). If they cease publishing such numbers before resolution, resolution will come from the Arms Control Association or any other sim
That makes sense to me.
But it seems like you're just saying the issue I'm gesturing at shouldn't cause mis-calibration or overconfidence, rather than that it won't reduce the resolution/accuracy or the practical usefulness of a system based on X predicting what Y will think?
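In case it's useful to spell out the distinction I have in mind: roughly, calibration is about stated probabilities matching observed frequencies, while resolution is about the forecasts actually separating the cases that happen from the ones that don't. Here's a minimal sketch using the standard Murphy decomposition of the Brier score, with made-up numbers purely for illustration:

```python
from collections import defaultdict

def murphy_decomposition(forecasts, outcomes):
    """Decompose the Brier score of binary-outcome forecasts into
    reliability (calibration), resolution, and uncertainty:
    Brier = reliability - resolution + uncertainty."""
    n = len(forecasts)
    base_rate = sum(outcomes) / n

    # Group outcomes by the stated forecast probability.
    bins = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        bins[f].append(o)

    reliability = sum(len(os) * (f - sum(os) / len(os)) ** 2
                      for f, os in bins.items()) / n
    resolution = sum(len(os) * (sum(os) / len(os) - base_rate) ** 2
                     for os in bins.values()) / n
    uncertainty = base_rate * (1 - base_rate)
    return reliability, resolution, uncertainty

# Toy case: always answering 50% on questions that resolve 50/50 overall is
# perfectly calibrated (reliability = 0) but has zero resolution -- the
# forecasts never separate the questions that resolve "yes" from the rest.
forecasts = [0.5] * 8
outcomes = [1, 0, 1, 0, 1, 0, 1, 0]
print(murphy_decomposition(forecasts, outcomes))  # (0.0, 0.0, 0.25)
```

I.e. a system could stay well-calibrated while still losing resolution, which is the kind of reduced usefulness I was gesturing at.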
I think it's good that a page like this exists; I'd want to be able to use it as a go-to link when suggesting people engage with or post on LessWrong, e.g. in my post on Notes on EA-related research, writing, testing fit, learning, and the Forum.
Unfortunately, it seems to me that this page isn't well suited to that purpose. Here are some things that seem like key issues to me (maybe other people would disagree):
Authoritarian closed societies probably have an advantage at covert racing, at devoting a larger proportion of their economic pie to racing suddenly, and at artificially lowering prices to do so. Open societies have probably a greater advantage at discovery/the cutting edge and have a bigger pie in the first place (though better private sector opportunities compete up the cost of defense engineering talent).
These are interesting points which I hadn't considered - thanks!
(Your other point also seems interesting and plausible, but I feel I lack the relevant knowledge to immediately evaluate it well myself.)
Interesting post.
You or other readers might also find the idea of epistemic security interesting, as discussed in the report "Tackling threats to informed decisionmaking in democratic societies: Promoting epistemic security in a technologically-advanced world". The report is by researchers at CSER and some other institutions. I've only read the executive summary myself.
There's also a BBC Future article on the topic by some of the same authors.
While I am not sure I agree fully with the panel, an implication to be drawn from their arguments is that from an equilibrium of treaty compliance, maintaining the ability to race can disincentivize the other side from treaty violation: it increases the cost to the other side of gaining advantage, and that can be especially decisive if your side has an economic advantage.
This is an idea/argument I hadn't encountered before, and it seems plausible, so it seems valuable that you shared it.
But it seems to me that there's probably an effect pushing in the opposit...
Thanks for this thought-provoking post. I found the discussion of how political warfare may have influenced nuclear weapons activism particularly interesting.
Since large yield weapons can loft dust straight to the stratosphere, they don’t even have to produce firestorms to start contributing to nuclear winter: once you get particles that block sunlight to an altitude that heating by the sun can keep them lofted, you’ll block sunlight a very long time and start harming crop yields.
I think it's true that this could "contribute" to nuclear winter, but I don't...
Note that:
Dando says ___ used biological weapons in WW1, but seemingly only against ___.
the Germans and perha
A final thought that came to mind, regarding the following passage:
It seems possible for person X to predict a fair number of a more epistemically competent person Y’s beliefs -- even before person X is as epistemically competent as Y. And in that case, doing so is evidence that person X is moving in the right direction.
I think that that's a good and interesting point.
But I imagine there would also be many cases in which X develops an intuitive ability to predict Y's beliefs quite well in a given set of domains, but in which that ability doesn...
Here's a second thought that came to mind, which again doesn't seem especially critical to this post's aims...
You write:
Someone who can both predict my beliefs and disagrees with me is someone I should listen to carefully. They seem to both understand my model and still reject it, and this suggests they know something I don’t.
I think I understand the rationale for this statement (though I didn't read the linked Science article), and I think it will sometimes be true and important. But I think that those sentences might overstate the point. In par...
Thanks for this and its companion post; I found the two posts very interesting, and I think they'll usefully inform some future work for me.
A few thoughts came to mind as I read, some of which can sort-of be seen as pushing back against some claims, but in ways that I think aren't very important and that I expect you've already thought about. I'll split these into separate comments.
Firstly, as you note, what you're measuring is how well predictions match a proxy for the truth (the proxy being Elizabeth's judgement), rather than the truth itself. Something ...
Good idea! I didn't know about that feature.
I've now edited the post to use spoiler-blocks (though a bit messily, as I wanted to do it quickly), and will use them for future lazy-Anki-card-notes-posts as well.
I didn't add that tag; some other reader did.
And any reader can indeed downvote any tag, so if you feel that that tag shouldn't be there, you could just downvote it.
Unless you feel that the tag shouldn't be there but aren't very confident about that, and thus wanted to just gently suggest that maybe the tag should be removed - like putting in a 0.5 vote rather than a full one. But that doesn't seem to match the tone of your comment.
That said, it actually does seem to me that this post fairly clearly does match the description for that tag; the ...
Yeah, I definitely agree that that's a good idea with any initialisms that won't already be known to the vast majority of one's readers (e.g., I wouldn't bother with US or UK, but would with APA). In this case, I just copied and pasted the post from the EA Forum, where I do think the vast majority of readers would know what "EA" means - but I should've used the expanded form "effective altruism" the first time in the LessWrong version. I've now edited that.
Here's a comment I wrote on the EA Forum version of this post, which I'm copying here as I'd be interested in people's thoughts on the equivalent questions in the context of LessWrong:
Meta: Does this sort of post seem useful? Should there be more posts like this?
I previously asked Should pretty much all content that's EA-relevant and/or created by EAs be (link)posted to the Forum? I found Aaron Gertler's response interesting and useful. Among other things, he said:
...Eventually, we'd like it to be the case that almost all well-written EA content exists on the
Note: If you found this post interesting, you may also be interested in my Notes on "The Bomb: Presidents, Generals, and the Secret History of Nuclear War" (2020), or (less likely) Notes on The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous. (The latter book has a very different topic; I just mention it as the style of post is the same.)
To your first point...
My impression is that there is indeed substantially less literature on misuse risk and structural risk, compared to accident risk, in relation to AI x-risk. (I'm less confident when it comes to a broader set of negative outcomes, not just x-risks, but that's also less relevant here and less important to me.) I do think that that might make the sort of work this post does less interesting if done in relation to those less-discussed types of risks, since fewer disagreements have been revealed there, so there's less to analyse and summarise...
Thanks for this post; this does seem like a risk worth highlighting.
I've just started reading Thomas Schelling's 1960 book The Strategy of Conflict, and noticed a lot of ideas in chapter 2 that reminded me of many of the core ideas in this post. My guess is that that sentence is an uninteresting, obvious observation, and that Daniel and most readers were already aware (a) that many of the core ideas here were well-trodden territory in game theory and (b) that this post's objectives were to:
List(s) of relevant problems
It occurs to me that all of the hypotheses, arguments, and approaches mentioned here (though not necessarily the scenarios) seem to be about the “technical” side of things. There are two main things I mean by that statement:
First, this post seems to be limited to explaining something along the lines of “x-risks from AI accidents”, rather than “x-risks from misuse of AI”, or “x-risk from AI as a risk factor” (e.g., how AI could potentially increase risks of nuclear war).
I do think it makes sense to limit the scope that way, because:
Thanks for this post! This seems like a really great way of visually representing how these different hypotheses, arguments, approaches, and scenarios interconnect. (I also think it’d be cool to see posts on other topics which use a similar approach!)
It seems that AGI timelines aren’t explicitly discussed here. (“Discontinuity to AGI” is mentioned, but I believe that's a somewhat distinct matter.) Was that a deliberate choice?
It does seem like several of the hypotheses/arguments mentioned here would feed into or relate to beliefs about timelines - in parti...
Thanks for this post; I found it useful.
The US policy has never ruled out the possibility of escalation to full countervalue targeting and is unlikely to do so.
But the 2013 DoD report says "The United States will not intentionally target civilian populations or civilian objects". That of course doesn't prove that the US actually wouldn't engage in countervalue targeting, but doesn't it indicate that US policy at that time ruled out engaging in countervalue targeting?
This is a genuine rather than rhetorical question. I feel I might be just missing som...
If I had to choose between a AW treaty and some treaty governing powerful AI, the latter (if it made sense) is clearly more important. I really doubt there is such a choice and that one helps with the other, but I could be wrong here. [emphasis added]
Did you mean something like "and in fact I think that one helps with the other"?
I don't think I know of any person who's demonstrated this who thinks risk is under, say, 10%
If you mean risk of extinction or existential catastrophe from AI at the time AI is developed, it seems really hard to say, as I think that that's been estimated even less often than other aspects of AI risk (e.g. risk this century) or x-risk as a whole.
I think the only people (maybe excluding commenters who don't work on this professionally) who've clearly given a greater than 10% estimate for this are:
Mostly I only start paying attention to people's opinions on these things once they've demonstrated that they can reason seriously about weird futures
[tl;dr This is an understandable thing to do, but does seem to result in biasing one's sample towards higher x-risk estimates]
I can see the appeal of that principle. I partly apply such a principle myself (though in the form of giving less weight to some opinions, not ruling them out).
But what if it turns out the future won't be weird in the ways you're thinking of? Or what if it turns out that, even if it wi...
I'm not sure which of these estimates are conditional on superintelligence being invented. To the extent that they're not, and to the extent that people think superintelligence may not be invented, that means they understate the conditional probability that I'm using here.
Good point. I'd overlooked that.
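To make that concrete with made-up numbers: suppose someone estimates a 5% unconditional chance of existential catastrophe from superintelligence, but also thinks there's only a 50% chance superintelligence is ever developed. Then the conditional probability relevant here is roughly double their stated figure:

$$P(\text{catastrophe} \mid \text{SI developed}) = \frac{P(\text{catastrophe via SI})}{P(\text{SI developed})} = \frac{0.05}{0.5} = 0.10$$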
I think lowish estimates of disaster risks might be more visible than high estimates because of something like social desirability, but who knows.
(I think it's good to be cautious about bias arguments, so take the following with a grain of salt, and note th...
That does seem interesting and concerning.
Minor: The link didn’t work for me; in case others have the same problem, here is (I believe) the correct link.
Yeah, totally agreed.
I also think it's easier to forecast extinction in general, partly because it's a much clearer threshold, whereas there are some scenarios that some people might count as an "existential catastrophe" and others might not. (E.g., Bostrom's "plateauing — progress flattens out at a level perhaps somewhat higher than the present level but far below technological maturity".)
Conventional risks are events that already have a background chance of happening (as of 2020 or so) and does not include future technologies.
Yeah, that aligns with how I'd interpret the term. I asked about advanced biotech because I noticed it was absent from your answer unless it was included in "super pandemic", so I was wondering whether you were counting it as a conventional risk (which seemed odd) or excluding it from your analysis (which also seems odd to me, personally, but at least now I understand your short-AI-timelines-based reasoning for ...
The overall risk was 9.2% for the community forecast (with 7.3% for AI risk). To convert this to a forecast for existential risk (100% dead), I assumed 6% risk from AI, 1% from nuclear war, and 0.4% from biological risk
I think this implies you think:
Does that sound right to you? And if so, what was your reasoning?
I ask...
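(For what it's worth, the arithmetic I take the quoted conversion to involve, if the per-cause extinction risks are simply added, is:

$$0.06 + 0.01 + 0.004 = 0.074 = 7.4\%$$

or about 7.3% if the three risks are instead treated as independent, i.e. $1 - (1 - 0.06)(1 - 0.01)(1 - 0.004) \approx 0.073$. Either way the implied extinction total is about 7.3-7.4%, versus the 9.2% forecast for catastrophic risk overall.)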
Very interesting, thanks for sharing! This seems like a nice example of combining various existing predictions to answer a new question.
a forecast for existential risk (100% dead)
It seems worth highlighting that extinction risk (risk of 100% dead) is a (big) subset of existential risk (risk of permanent and drastic destruction of humanity's potential), rather than those two terms being synonymous. If your forecast was for extinction risk only, then the total existential risk should presumably be at least slightly higher, due to risks of unrecoverable colla...
Thanks for those responses :)
MIRI people and Wei Dai for pessimism (though I'm not sure it's their view that it's worse than 50/50), Paul Christiano and other researchers for optimism.
It does seem odd to me that, if you aimed to do something like average over these people's views (or maybe taking a weighted average, weighting based on the perceived reasonableness of their arguments), you'd end up with a 50% credence on existential catastrophe from AI. (Although now I notice you actually just said "weight it by the probability that it turns out badly ...
Glad to hear that!
I do feel excited about this being used as a sort of "201 level" overview of AI strategy and what work it might be useful to do. And I'm aware of the report being included in the reading lists / curricula for two training programs for people getting into AI governance or related work, which was gratifying.
Unfortunately we did this survey before ChatGPT and various other events since then, which have majorly changed the landscape of AI governance work to be done, e.g. opening various policy windows. So I imagine people reading this report ...