With such a cast of characters, I've done a full voiced ElevenLabs narration for this post:
https://open.substack.com/pub/askwhocastsai/p/read-the-roon-by-zvi-mowshowitz
I would feel better if I knew Ilya was back working at Superalignment.
Same here...
I wonder what is known about that.
It seems that as of mid-January, OpenAI and Ilya were still discussing what was going to happen, and I have not seen any further information since then. There are plenty of Ilya-related Manifold markets, and they also reflect high uncertainty and no additional information.
Actually, we do have a bit of new information today: Ilya is listed as one of the authors of OpenAI's new blog post commenting on Elon's lawsuit.
But yes, it would be really nice to find out more...
March 8 update, from the Zoom call with reporters: when Altman was asked about Sutskever’s status, he said there were no updates to share. “I love Ilya... I hope we work together for the rest of our careers, my career, whatever,” Altman said. “Nothing to announce today.”
So, it seems this is still where it was in mid-January (up in the air, with the eventual resolution uncertain).
Damn, reading Connor's letter to Roon had a psychoactive effect on me; I got Ayahuasca flashbacks. There are some terrifying and deep truths lurking there.
political culture encourages us to think that generalized anxiety is equivalent to civic duty.
This is a wonderful way to put it!
As I see it, the "generalized anxiety" is essentially costly signaling. By being anxious, you signal "my reactions to the latest political thing are important for the fate of the world", i.e. that you are important.
But if this were the entire story, you would be able to notice it and feel ashamed that you fell for such a simple trick. Reframing it as a civic duty keeps you in the trap, because it provides a high-status answer to why you are doing it, distracting you from realizing what you are doing.
(Metaphorically: "The proper way to signal high status is to keep pushing this button all the time." "I keep pushing the button and I feel great about it, but also my finger is starting to hurt a lot." "Don't worry; the fact that your finger hurts proves that you are a good and wise person." "Wow, now I keep pushing the button and my finger hurts like hell, and I feel great about it!")
What would be a better framing?
I talk about something related in self and no-self; the outward-flowing 'attempt to control' and the inward-flowing 'attempt to perceive' are simultaneously in conflict (something being still makes it easier to see where it is, but also makes it harder to move it to where it should be) and mutually reinforcing (being able to tell where something is makes it easier to move it precisely where it needs to be).
Similarly, you can make an argument that control without understanding is impossible, and that getting AI systems to do what we want is one task instead of two. I think I agree the "two progress bars" frame is incorrect, but the typical AGI developer at a lab is not grappling with the philosophical problems behind alignment difficulties, and is trying to make something that 'works at all' rather than 'works understandably' in the sort of way that would actually lead to understanding, which would enable control.
Roon, a member of OpenAI’s technical staff, is one of the few candidates for a Worthy Opponent when discussing questions of AI capabilities development, AI existential risk and what we should do about it. Roon is alive. Roon is thinking. Roon clearly values good things over bad things. Roon is engaging with the actual questions, rather than denying or hiding from them, and is unafraid to call all sorts of idiots idiots. As his profile once said, he believes the spice must flow, that we should just go ahead, and he makes a mixture of arguments for that, some good, some bad and many absurd. Also, his account is fun as hell.
Thus, when he came out as strongly as he seemed to recently, attention was paid, and we got to have a relatively good discussion of key questions. While I attempt to contribute here, this post is largely aimed at preserving that discussion.
The Initial Statement
As you would expect, Roon’s statement last week, that AGI was inevitable and nothing could stop it so you should essentially spend your final days with your loved ones and hope it all works out, led to some strong reactions.
Many pointed out that AGI has to be built, at very large cost, by highly talented, hardworking humans, in ways that seem entirely plausible to prevent or redirect if we decided to do so.
Sounds like we should take action to get some control, then. This seems like the kind of thing we should want to be able to control.
Saying no one has any control so why try to do anything to get control back seems like the opposite of what is needed here.
The Doubling Down
Roon’s reaction:
Roon’s point on idle anxiety is indeed a good one. If you are not one of those trying to gain or assert some of that control, as most people on Earth are not and should not be, then of course I agree that idle anxiety is not useful. However, Roon then attempted to extend this to claim that all anxiety about AGI is idle, that no one has any control. That is where there is strong disagreement, and what is causing the reaction.
Connor Leahy Gives it a Shot
This is a very good attempt to identify key elements of the elephant I grasp when I notice that being in San Francisco very much does not agree with me. I always have excellent conversations during visits because the city has abducted so many of the best people, and I always get excited by them, but the place feels alien, as if I am being constantly attacked by paradox spirits, visiting a deeply hostile and alien culture that has inverted many of my most sacred values and wants to eat absolutely everything. Whereas here, in New York City, I feel very much at home.
Meanwhile, back in the thread:
Roon Responds to Connor
Roon responds quite well:
This is a very good response. He is pointing out that yes, some people such as Connor can influence what happens, and they in particular should try to model and influence events.
Roon is also saying that he himself is doing his best to influence events. Roon realizes that those at OpenAI matter and that what they do matters.
Roon reached out to leadership on several occasions with safety concerns. When he says he was ‘wrong to worry,’ I presume he means that the situation worked out and was handled. I am confident that expressing his concerns was the output of the best available decision algorithm; you want most such concerns you express to turn out fine.
Roon also worked, in the wake of events at OpenAI, to remind people of the importance of alignment work, that they should not toss it out based on those events. Which is a scary thing for him to report having to do, but expected, and it is good that he did so. I would feel better if I knew Ilya was back working at Superalignment.
And of course, Roon is constantly active on Twitter, saying things that impact the discourse, often for the better. He seems keenly aware that his actions matter, whether or not he could meaningfully slow down AGI. I actually think he perhaps could, if he put his mind to it.
The contrast here versus the original post is important. The good message is ‘do not waste time worrying too much over things you do not impact.’ The bad message is ‘no one can impact this.’
Connor Goes Deep
Then Connor goes deep and it gets weirder. This long post has 450k views and is aimed largely at trying to get through to Roon in particular, but there are many others in a similar spot, so some of them should read this as well. Many of you, however, should skip it.
I like to think I got most of that, but how would I know if I was wrong?
Focusing on the one aspect of this: One must hold both concepts in one’s head at the same time.
These are both ‘obviously’ true. You are in the shadow of the Elder Gods up against Cthulhu (well, technically Azathoth), the odds are against you and the situation is grim, and if we are to survive you are going to have to punch them out in the end, which means figuring out how to do that and you won’t be doing it alone.
A Question of Agency
Meanwhile, some more wise words:
Also see:
And also from this week:
What would be a better framing? The issue is that all alignment work is likely to also be capabilities work, and much of capabilities work can help with alignment.
One can and should still ask the question: does applying my agency to differentially advancing this particular thing make it more likely we will get good outcomes versus bad outcomes? Will it grow our ability to control and understand what AI does relatively faster than it grows the ability of AIs to do more things? What paths does this help us walk down?
Yes, collectively we absolutely have control over these questions. We can coordinate to choose a different path, and each individual can help steer towards better paths. If necessary, we can take strong collective action, including regulatory and legal action, to stop the future from wiping us out. Pointless anxiety or worry about such outcomes is indeed pointless and should be minimized; have only the amount required to figure out and take the most useful actions.
What that implies about the best actions for a given person to take will vary widely. I am certainly not claiming to have all the answers here. I like to think Roon would agree that both of us, and many but far from all of you reading this, are in the group that can help improve the odds.