So8res

Comments (sorted by newest)
Safety researchers should take a public stance
So8res · 8h

The thing I'm imagining is more like mentioning, almost as an aside, in a friendly tone, that ofc you think the whole situation is ridiculous and that stopping would be better (before & after having whatever other convo you were gonna have about technical alignment ideas or w/e). In a sort of "Carthago delenda est" fashion.

I agree that a host company could reasonably get annoyed if their researchers went on many different podcasts to talk for two hours about how the whole industry is sick. But if casually reminding people "the status quo is insane and we should do something else" at the beginning/end is a fireable offense, in a world where lab heads & Turing award winners & Nobel laureate godfathers of the field are saying this is all ridiculously dangerous, then I think that's real sketchy and that contributing to a lab like that is substantially worse than the next best opportunity. (And similarly if it's an offense that gets you sidelined or disempowered inside the company, even if not exactly fired.)

The title is reasonable
So8res · 11h

(To answer your direct Q, re: "Have you ever seen someone prominent pushing a case for "optimism" on the basis of causal trade with aliens / acausal trade?", I have heard "well I don't think it will actually kill everyone because of acausal trade arguments" enough times that I assumed the people discussing those cases thought the argument was substantial. I'd be a bit surprised if none of the ECLW folks thought it was a substantial reason for optimism. My impression from the discussions was that you & others of similar prominence were in that camp. I'm heartened to hear that you think it's insubstantial. I'm a little confused why there's been so much discussion around it if everyone agrees it's insubstantial, but have updated towards it just being a case of people who don't notice/buy that it's washed out by sale to Hubble-volume aliens and who are into pedantry. Sorry for falsely implying that you & others of similar prominence thought the argument was substantial; I update.)

Safety researchers should take a public stance
So8res · 11h

I am personally squeamish about AI alignment researchers staying in their positions in the case where they're allowed to go on podcasts & keep their jobs only if they never say "this is an insane situation and I wish Earth would stop instead (even as I expect it won't and try to make things better)", if that's what they believe. That starts to feel to me like misleading the Earth in support of the mad scientists who are gambling with all our lives. If that's the price of staying at one of the labs, I start to feel like exiting and giving that as the public reason is a much better option.

In part this is because I think it'd make all sorts of news stories in a way that would shift the Overton window and make it more possible for other researchers later to speak their mind (and shift the internal culture and thus shift the policymaker understanding, etc.), as evidenced by e.g. the case of Daniel Kokotajlo. And in part because I think you'd be able to do similarly good or better work outside of a lab like that. (At a minimum, my guess is you'd be able to continue work at Anthropic, e.g. b/c Evan can apparently say it and continue working there.)

The title is reasonable
So8res · 17h

Ty! For the record, my reason for thinking it's fine to say "if anyone builds it, everyone dies" despite some chance of survival is mostly spelled out here. Relative to the beliefs you spell out above, I think the difference is a combination of (a) it sounds like I find the survival scenarios less likely than you do; (b) it sounds like I'm willing to classify more things as "death" than you are.

For examples of (b): I'm pretty happy to describe as "death" cases where the AI makes things that are to humans what dogs are to wolves, or (more likely) makes some other strange optimized thing that has some distorted relationship to humanity, or cases where digitized backups of humanity are sold to aliens, etc. I feel pretty good about describing many exotic scenarios as "we'd die" to a broad audience, especially in a setting with extreme length constraints (like a book title). If I were to caveat with "except maybe backups of us will be sold to aliens", I'd expect most people to be confused and frustrated about me bringing that point up. It looks to me like most of the least-exotic scenarios are ones that route through things that lay audience members would pretty squarely call "death".

It looks to me like the even more exotic scenarios (where modern individuals get "afterlives") are in the rough ballpark of quantum immortality / anthropic immortality arguments. AI definitely complicates things and makes some of that stuff more plausible (b/c there's an entity around that can make trades and has a record of your mind), but it still looks like a very small factor to me (washed out e.g. by alien sales) and feels kinda weird and bad to bring it up in a lay conversation, similar to how it'd be weird and bad to bring up quantum immortality if we were trying to stop a car speeding towards a cliff.

FWIW, insofar as people feel like they can't literally support the title because they think that backups of humans will be sold to aliens, I encourage them to say as much in plain language (whenever they're critiquing the title). Like: insofar as folks think the title is causing lay audiences to miss important nuance, I think it's an important second-degree nuance that the allegedly-missing nuance is "maybe we'll be sold to aliens", rather than something less exotic than that.

Safety researchers should take a public stance
So8res · 19h

Oh yeah, I agree that (earnest and courageous) attempts to shift the internal culture are probably even better than saying your views publicly (if you're a low-profile researcher).

I still think there's an additional boost from consistently reminding people of your "this is crazy and Earth should do something else" views whenever you are (e.g.) on a podcast or otherwise talking about your alignment hopes. Otherwise I think you give off a false impression that the scientists have things under control and think that the race is okay. (I think most listeners to most alignment podcasts or w/e hear lots of cheerful optimism and none of the horror that is rightly associated with a >5% risk of destroying the whole human endeavor, and that this contributes to the culture being stuck in a bad state across many orgs.)

FWIW, it's not a crux for me whether a stop is especially feasible or the best hope to be pursuing. On my model, the world is much more likely to respond in marginally saner ways the more that decision-makers understand the problem. Saying "I think a stop would be better than what we're currently doing, and I beg the world to shut down everyone including us", if you believe it, helps communicate your beliefs (and thus the truth, insofar as you're good at believing), even if the exact policy proposal doesn't happen. I think the equilibrium where lots and lots of people understand the gravity of the situation is probably better than the current equilibrium in lots of hard-to-articulate and hard-to-predict ways, even if the better equilibrium would not be able to pull off a full stop.

(For an intuition pump: perhaps such a world could pull off "every nation sabotages every other nation's ASI projects for fear of their own lives", as an illustration of how more understanding could help even w/out a treaty.)

Safety researchers should take a public stance
So8res · 1d

Quick take: I agree it might be hard to get above 50 today. I think that even 12 respected people inside one lab today would have an effect on the Overton window inside labs, which I think would have an effect over time (aided primarily by the fact that the arguments are fairly clearly on the side of a global stop being better; it's harder to keep true things out of the Overton window). I expect it's easier to shift culture inside labs first, rather than inside policy shops, b/c labs at least don't have the dismissals of "they clearly don't actually believe that" and "if they did believe it they'd act differently" ready to go. There are ofc many other factors that make it hard for a lab culture to fully adopt the "nobody should be doing this, not even us" stance, but it seems plausible that that could at least be brought into the Overton window of the labs, and that that'd be a big improvement (towards, e.g., lab heads becoming able to say it).

Safety researchers should take a public stance
So8res · 2d

I think there's a huge difference between labs saying "there's lots of risk" and labs saying "no seriously, please shut everyone down including me, I'm only doing this because others are allowed to and would rather we all stopped". The latter is consistent with the view; its absence is conspicuous. Here is an example of someone noticing in the wild; I have also heard that sort of response from multiple elected officials. If Dario could say it that'd be better, but lots of researchers in the labs saying it would be a start. And might even make it more possible for lab leaders to come out and say it themselves!

Safety researchers should take a public stance
So8res · 2d

It seems to me that most people who pay attention to AI (and especially policymakers) are confused about whether the race to superintelligence is real, and whether the dangers are real. I think "people at the labs never say the world would be better without the race (eg because they think the world won't actually stop)" is one factor contributing to that confusion. I think the argument "I can have more of an impact by hiding my real views so that I can have more influence inside the labs that are gambling with everyone's lives; can people outside the labs speak up instead?" is not necessarily wrong, but it seems really sketchy to me. I think it contributes to a self-fulfilling prophecy where the world never responds appropriately because the places where world leaders looked for signals never managed to signal the danger.

From my perspective, it's not about "costly signaling", it's about sending the signal at all. I suspect you're underestimating how much the world would want to change course if it understood the situation, and underestimating how much you could participate in shifting to an equilibrium where the labs are reliably sending a saner signal (and underestimating how much credibility this would build in worlds that eventually cotton on).

And even if the tradeoffs come out that way for you, I'm very skeptical that they come out that way for everyone. I think a world where everyone at the labs pretends (to policymakers) that what they're doing is business-as-usual and fine is a pretty messed-up world.

The title is reasonable
So8res · 2d

Ty. Is this a summary of a more-concrete reason you have for hope? (Have you got alternative more-concrete summaries you'd prefer?)

"Maybe huge amounts of human-directed weak intelligent labor will be used to unlock a new AI paradigm that produces more comprehensible AIs that humans can actually understand, which would be a different and more-hopeful situation."

(Separately: I acknowledge that if there's one story for how the playing field might change for the better, then there might be a bunch more stories too, which would make "things are gonna change" an argument that supports the claim that the future will have a much better chance than we'd have if ChatGPT-6 were all it took.)

The title is reasonable
So8res · 2d

I think the online resources touch on that in the "more on making AIs solve the problem" subsection here, with the main thrust being: I'm skeptical that you can stack lots of dumb labor into an alignment solution, skeptical that identifying issues will allow you to fix them, and skeptical that humans can tell when something is on the right track. (All of which is one branch of a larger disjunctive argument, with the two disjuncts mentioned above, "the world doesn't work like that" and "the plan won't survive the gap between Before and After on the first try", also applying in force, on my view.)

(Tbc, I'm not trying to insinuate that everyone should've read all of the online resources already; they're long. And I'm not trying to say y'all should agree; the online resources are geared more towards newcomers than to LWers. I'm not even saying that I'm getting especially close to your latest vision; if I had more hope in your neck of the woods I'd probably investigate harder and try to pass your ITT better. From my perspective, there are quite a lot of hopes and copes to cover, mostly from places that aren't particularly Redwoodish in their starting assumptions. I am merely trying to evidence my attempts to reply to what I understand to be the counterarguments, subject to constraints of targeting this mostly towards newcomers.)

Posts

The Problem (1mo)
A case for courage, when speaking of AI danger (3mo)
Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies (4mo)
LessWrong: After Dark, a new side of LessWrong (1y)
Ronny and Nate discuss what sorts of minds humanity is likely to find by Machine Learning (2y)
Quick takes on "AI is easy to control" (2y)
Apocalypse insurance, and the hardline libertarian take on AI risk (2y)
Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense (2y)
How much to update on recent AI governance moves? (2y)
Thoughts on the AI Safety Summit company policy requests and responses (2y)