Raemon

LessWrong team member / moderator. I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.

Comments

Raemon's Shortform
Raemon · 10h

I feel so happy that "what's your crux?" / "is that cruxy?" is common parlance on LW now; it's a meaningful improvement over the prior discourse. Thank you to CFAR and whoever was part of the generation story of that.

New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence
Raemon · 10h

Subcruxes of mine here:

  • I think, by the time any kind of international deal goes through, we will basically have already reached the frontier of what was safe, so it feels like splitting hairs to discuss whether the regime should want more capabilities in the immediate future.
    • (Surely it's going to take at least a year, which is a pretty long time; it's probably not going to happen at all; and three years to even get started is more like what I imagine when I imagine a very successful Overton-smashing campaign.)
  • I think there's tons of research augmentation you can do with 1-3-years-from-now AI that is more about leveraging existing capabilities than getting fundamentally smarter.
  • I don't buy that there's a way to get end-to-end research, or "fundamentally smarter" research assistants, that aren't unacceptably dangerous at scale. (i.e. I believe you can train on more specific ). (man I have no idea what I meant by that sentence fragment. sorry, person who reacted with "?")

Do those feel like subcruxes for you, or are there other ones?

New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence
Raemon · 10h

Is there a decent chance an AI takeover is relatively nice? 

> This is an existential catastrophe IMO and should be desperately avoided, even if they do leave us a solar system or w/e.

Actually, I think this maybe wasn't cruxy for anyone. I think @ryan_greenblatt said he agreed it didn't change the strategic picture; it just changed some background expectations.

(I maybe don't believe him that he doesn't think it affects the strategic picture? It seemed like his view was fairly sensitive to various things being like 30% likely instead of like 5% or <1%, and it feels like it's part of an overall optimistic package that adds up to being more willing to roll the dice on current proposals? But, I'd probably believe him if he reads this paragraph and is like "I have thought about whether this is a (maybe subconscious) motivation/crux and am confident it isn't.")

New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence
Raemon · 10h

If the international governing body starts approving AI development, then aren't we basically just back in the Plan A regime?

I think MIRI's plan is clearly meant to eventually build superintelligence, given that they've stated various times it'd be an existential catastrophe if this never happened – they just think it should happen after a lot of augmentation and carefulness.

A lot of my point here is I just don't really see much difference between Plan A and Shutdown except for "once you've established some real control over AI racing, what outcome are you shooting for nearterm?", and I'm confused why Plan A advocates see it as substantially different. 

(Or, I think the actual differences are more about "how you expect it to play out in practice, esp. if MIRI-style folk end up being a significant political force." Which is maybe fair, but, it's not about the core proposal IMO.)

"We wouldn't want to pause 30 years, and then do a takeoff very quickly – it's probably better to do a smoother takeoff."

> huh, this one seems kinda relevant to me. 

Do you understand why I don't understand why you think that? Like, the MIRI plan is clearly aimed at eventually building superintelligence (I realize the literal treaty doesn't emphasize that, but, it's clear from very public writing in IABIED that it's part of the goal), and I think it's pretty agnostic over exactly how that shakes out.

Daniel Kokotajlo's Shortform
Raemon · 11h

You... could publish it as a top-level linkpost!

New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence
Raemon · 12h

Here's an attempt to recap the previous discussion about "Global Shutdown" vs "Plan A/Controlled Takeoff", trying to skip ahead to the part where we're moving the conversation forward rather than rehashing stuff.

Cruxes that seemed particularly significant (phrased the way they made most sense to me, which is hopefully reasonably ITT-passing):

...

How bad is Chinese Superintelligence? For some people, it's a serious crux whether a superintelligence run by China would lead to dramatically worse outcomes than one run by a democratic country.

...

"The gameboard could change in all kinds of bad ways over 30 years." Nations or companies could suddenly pull out in a disastrous way. If things go down in the near future there's fewer actors to make deals with and it's easier to plan things out.

...

Can we leverage useful work out of significantly-more-powerful-but-nonsuperhuman AIs? Especially since "the gameboard might change a lot", it's useful to get lots of safety research done quickly, and it's easier to do that with more powerful AIs. So, it's useful to continue to scale up until we've got the most powerful AIs we can confidently control. (Whereas Controlled Takeoff skeptics tend to think AI that is capable of taking on the hard parts of AI safety research will already be too dangerous and untrustworthy.)

...

Is there a decent chance an AI takeover is relatively nice? Giving the humans the Earth/solar system is just incredibly cheap from a percentage-of-resources standpoint. This does require the AI to genuinely care about and respect our agency in a sort of complete way. But it only has to care about us a pretty teeny amount.

...

And then, the usual "how doomed are current alignment plans?". My impression is "Plan A" advocates usually expect a pretty good chance things go pretty well if humanity makes a reasonably good-faith attempt at controlled takeoff, whereas Controlled Takeoff skeptics are typically imagining "by default this just goes really poorly, and you can tell because everyone seems to keep sliding off understanding or caring about the hard parts of the problem".

...

All of those seem like reasonable things for smart, thoughtful people to disagree on. I do think some disagreement about them feels fishy/sus to me, and I have my takes on them, but, I can see where you're coming from.

Three cruxes I still just don't really buy as decision-relevant:

  1. "We wouldn't want to pause 30 years, and then do a takeoff very quickly – it's probably better to do a smoother takeoff." Yep, I agree. But, if you're in a position to decide-on-purpose how smooth your takeoff is, you can still just do the slower one later. (Modulo "the gameboard could change in 30 years", which makes more sense to me as a crux). I don't see this as really arguing at all against what I imagined the Treaty to be about.
     
  2. "We need some kind of exit plan, the MIRI Treaty doesn't have one." I currently don't really buy that Plan A has more of one than the the MIRI Treaty. The MIRI treaty establishes an international governing body that makes decisions about how to change the regulations, and it's pretty straightforward for such an org to make judgment calls once people have started producing credible safety cases. I think imagining anything more specific than this feels pretty fake to me – that's a decision that makes more sense to punt to people who are more informed than us.
     
  3. Shutdown is more politically intractable than Controlled Takeoff. I don't currently buy that this is true in practice. I don't think anyone is expecting to immediately jump to either a full-fledged version of Plan A or a Global Shutdown. Obviously, for the near future, you try for whatever level of national and international cooperation you can get, build momentum, do the easy sells first, etc. I don't expect Shutdown, in practice, to be different from "you did all of Plan A, and then took like 2-3 more steps", and by the time you've implemented Plan A in its entirety, it seems crazy to me to assume the next 2-3 steps are particularly intractable.
    1. I totally buy "we won't even get to a fully fledged version of Plan A", but, that's not an argument for Plan A over Shutdown.
    2. It feels like people are imagining "a naive, poorly politically executed version of Shutdown, vs some savvily executed version of Plan A." I think there are reasonable reasons to think the people advocating Shutdown will not be savvy. But those reasons don't extend to "insofar as you thought you could savvily advocate for Plan A, you shouldn't be setting your sights on Shutdown."
New Report: An International Agreement to Prevent the Premature Creation of Artificial Superintelligence
Raemon · 13h

Note: you can get a nice-to-read version of the Treaty at https://www.ifanyonebuildsit.com/treaty. I'm not sure if there are any notable differences between that and the paper, but I'm guessing it's mostly the same.

Do things in small batches
Raemon · 16h

I'm curious about how the specifics worked in your case.

What are your impossible problems?
Raemon · 1d

I think it's been surprisingly close to 50/50: when I specifically flinch away from an idea because it felt impossible, it's turned out to be "actually pretty impossible" vs "okay, actually sort of straightforward if I were trying all the obvious things" about equally often.

Obviously, if I systematically list out impossible things, there will be way more actually-pretty-impossible things. But somehow, when it actually comes up (sampled from "things I actively wanted to do"), the split is closer to even. Maybe if I got better at dreaming impossible thoughts, more of them would turn out to actually be impossible.

Mediators: a different route through conflict
Raemon · 1d

I'm curious about the origin story of you using the Bosnia example here – is that a history bit you already knew and randomly realized was analogous to your situation? Did you learn the Bosnia example recently and then go "oh, that was kinda like situation X"? Or did you want to write about situation X and then poke around for a real-world example?

(I just find myself wondering because I don't recall you using that class of example very often and was intrigued by the process; no particular followup implication or questions.)

Also, nice post! I think this is one of my favorite Inkhaven posts that I've read.

Sequences
Step by Step Metacognition
Feedbackloop-First Rationality
The Coordination Frontier
Privacy Practices
Keep your beliefs cruxy and your frames explicit
LW Open Source Guide
Tensions in Truthseeking
Project Hufflepuff
Rational Ritual

Posts
What are your impossible problems? · 23 karma · 4d · 23 comments
Orient Speed in the 21st Century · 49 karma · 5d · 8 comments
One Shot Singalonging is an attitude, not a skill or a song-difficulty-level* · 53 karma · 10d · 11 comments
Solstice Season 2025: Ritual Roundup & Megameetups · 44 karma · 12d · 8 comments
Being "Usefully Concrete" · 42 karma · 15d · 4 comments
"What's hard about this? What can I do about that?" · 59 karma · 16d · 0 comments
Re-rolling environment · 130 karma · 18d · 2 comments
Mottes and Baileys in AI discourse · 50 karma · 22d · 9 comments
Early stage goal-directedness · 20 karma · 1mo · 8 comments
"Intelligence" -> "Relentless, Creative Resourcefulness" · 77 karma · 1mo · 28 comments

Wikitag Contributions
AI Consciousness · 3 months ago
AI Auditing · 3 months ago · (+25)
AI Auditing · 3 months ago
Guide to the LessWrong Editor · 7 months ago
Guide to the LessWrong Editor · 7 months ago
Guide to the LessWrong Editor · 7 months ago
Guide to the LessWrong Editor · 7 months ago · (+317)
Sandbagging (AI) · 8 months ago
Sandbagging (AI) · 8 months ago · (+88)
AI "Agent" Scaffolds · 8 months ago