In the past few weeks I've noticed a significant change in the Overton window of what seems possible to talk about. I think the broad strokes of this article seem basically right, and I agree with most of the details.
I don't expect this to immediately cause AI labs or world governments to join hands and execute a sensibly-executed-moratorium. But I'm hopeful about it paving the way for the next steps towards it. I like that this article, while making an extremely huge ask of the world, spells out exactly how huge an ask is actually needed.
Many people on Hacker News seemed suspicious of the FLI Open Letter because it looks superficially like the losers in a race trying to gain a local political advantage. I like that Eliezer's piece makes it clearer that it's not about that.
I do still plan to sign the FLI Open Letter. If a better open letter comes along, making an ask that is more complete and concrete, I'd sign that as well. I think it's okay to sign open letters that aren't exactly the thing you want, to help build momentum and common knowledge of what people think. (I think not-signing-the-letter while arguing for what a better letter should be written, similar to what Eliez...
A concise and impactful description of the difficulty we face.
I expect that the message in this article will not truly land with a wider audience (it still doesn't seem to land with all of the LW audience...), but I'm glad to see someone trying.
I would be interested in hearing the initial reactions and questions that readers who were previously unfamiliar with AI x-risk have after reading this article. I'll keep an eye on Twitter, I suppose.
I just want to say that this is very clear argumentation and great rhetoric. Eliezer's writing at its best.
And it does seem to have got a bit of traction. A very non-technical friend just sent me the link, on the basis that she knows "I've always been a bit worried about that sort of thing."
I disagree with AI doomers, not in the sense that I consider it a non-issue, but in that my assessment of the risk of ruin is something like 1%, not 10%, let alone the 50%+ that Yudkowsky et al. believe. Moreover, restrictive AI regimes threaten to produce a lot of bad outcomes, possibly including: the devolution of AI control into a cult (we have a close analogue in post-1950s public opinion towards civilian applications of nuclear power and explosions, which robbed us of Orion Drives amongst other things); what may well be a delay in life extension timelines by years if not decades, resulting in 100Ms-1Bs of avoidable deaths (this is not just my supposition, but that of Aubrey de Grey as well, who has recently commented on Twitter that AI is already bringing LEV timelines forward); and even outright technological stagnation (nobody has yet canceled secular dysgenic trends in genomic IQ). I leave unmentioned the extreme geopolitical risks from "GPU imperialism".
While I am quite irrelevant, this is not a marginal viewpoint - it's probably pretty mainstream within e/acc, for instance - and one that has to be countered if Yudkowsky's extreme and far-reaching proposals are to have a...
Couple of points:
It's ultimately a question of probabilities, isn't it? If the risk is ~1%, we mostly all agree Yudkowsky's proposals are deranged. If 50%+, we all become Butlerian Jihadists.
My point is that I and people like me need to be convinced it's closer to 50% than to 1%, or, failing that, we at least need to be "bribed" in a really big way.
I'm somewhat more pessimistic than you on civilizational prospects without AI. As you point out, bioethicists and various ideologues have some chance of tabooing technological eugenics. (I don't understand your point about assortative mating; yes, there's more of it, but does it now cancel out regression to the mean?) Meanwhile, in a post-Malthusian economy such as ours, selection for natalism will be ultra-competitive. The combination of these factors would logically result in centuries of technological stagnation and a population explosion that brings the world population back up to the limits of the industrial world economy, until Malthusian constraints reassert themselves in what will probably be quite a grisly way (pandemics, dearth, etc.) and Clarkian selection for thrift and intelligence reasserts itself. It will also, needless to say, be a few centuries in which other forms of existential risk will remain at play.
PS. Somewhat of an aside, but I don't think it's a great idea to throw terms like "grifter" around, especially when the most globally famous EA representative is a crypto crook (who literally stole some of my money; a small % of my portfolio, but nonetheless, no e/acc person has stolen anything from me).
It's ultimately a question of probabilities, isn't it? If the risk is ~1%, we mostly all agree Yudkowsky's proposals are deranged. If 50%+, we all become Butlerian Jihadists.
Uhh... No, we don't? 1% of 8 billion people is 80 million people, and AI risk involves even more at stake if you loop in the whole "no more new children" thing. I'm not saying that "it's a small chance of a very bad thing happening so we should work on it anyways" is a good argument, but if we're taking as a premise that the chance of failure is 1%, that'd be sufficient to justify several decades of safety research. At least IMO.
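(For concreteness, here is a minimal sketch of the arithmetic being invoked, assuming a world population of roughly 8 billion and ignoring the future generations that, as noted above, raise the stakes further.)

```python
# Rough expected-deaths arithmetic (illustrative only; assumes ~8 billion people
# alive today and deliberately ignores future generations).
WORLD_POPULATION = 8_000_000_000

for p_doom in (0.01, 0.10, 0.50):
    expected_deaths = p_doom * WORLD_POPULATION
    print(f"P(doom) = {p_doom:.0%}: expected deaths ~ {expected_deaths:,.0f}")
```

Even the 1% figure corresponds to roughly 80 million expected deaths, which is the point being made about how much safety research such a premise would justify.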
I don't understand your point about assortative mating; yes, there's more of it, but does it now cancel out regression to the mean?
https://en.wikipedia.org/wiki/Coming_Apart_(book)
AI research is pushed mostly by people at the tails of intelligence, not by lots of small contributions from people with average intelligence. It's true that currently smarter people have slightly fewer children, but now more than ever smarter people are having children with each other, and so the number of very smart people is probably increasing over time, at least by Charles Murray's analysis. Whatever ...
Heritability is measured in a way that rules that out. See e.g. Judith Harris or Bryan Caplan for popular expositions about the relevant methodologies & fine print.
I totally get where you're coming from, and if I thought the chance of doom was 1% I'd say "full speed ahead!"
As it is, at fifty-three years old, I'm one of the corpses I'm prepared to throw on the pile to stop AI.
The "bribe" I require is several OOMs more money invested into radical life extension research.
Hell yes. That's been needed rather urgently for a while now.
I think there's an important meta-level point to notice about this article.
This is the discussion that the AI research and AI alignment communities have been having for years. Some agree, some disagree, but the 'agree' camp is not exactly small. Until this week, all of this was unknown to most of the general public, and unknown to anyone who could plausibly claim to be a world leader.
When I say it was unknown, I don't mean that they disagreed. To disagree with something, at the very least you have to know that there is something out there to disagree with. In fact they had no idea this debate existed. Because it's very hard to notice the implications of upcoming technology when you're a 65-year-old politician in DC rather than a 25-year-old software engineer in SF. But also because many people and many orgs made the explicit decision to not do public outreach, to not try to make the situation legible to laypeople, to not look like people playing with the stakes we have in fact been playing with.
I do not think lies were told, exactly, but I think the world was deceived. I think the FLI open letter was phrased so as to continue that deception, and that the phra...
Until this week, all of this was [...] unknown to anyone who could plausibly claim to be a world leader.
I don't think this is known to be true.
In fact they had no idea this debate existed.
That seems too strong. Some data points:
1. There's been lots of AI risk press over the last decade. (E.g., Musk and Bostrom in 2014, Gates in 2015, Kissinger in 2018.)
2. Obama had a conversation with WIRED regarding Bostrom's Superintelligence in 2016, and his administration cited papers by MIRI and FHI in a report on AI the same year. Quoting that report:
...General AI (sometimes called Artificial General Intelligence, or AGI) refers to a notional future AI system that exhibits apparently intelligent behavior at least as advanced as a person across the full range of cognitive tasks. A broad chasm seems to separate today’s Narrow AI from the much more difficult challenge of General AI. Attempts to reach General AI by expanding Narrow AI solutions have made little headway over many decades of research. The current consensus of the private-sector expert community, with which the NSTC Committee on Technology concurs, is that General AI will not be achieved for at least decades.[14]
People have long specul
There simply don't exist arguments with the level of rigor needed to justify a claim such as this one without any accompanying uncertainty:
If we go ahead on this everyone will die, including children who did not choose this and did not do anything wrong.
I think this passage, meanwhile, rather misrepresents the situation to a typical reader:
When the insider conversation is about the grief of seeing your daughter lose her first tooth, and thinking she’s not going to get a chance to grow up, I believe we are past the point of playing political chess about a six-month moratorium.
This isn't "the insider conversation". It's (the partner of) one particular insider, who exists on the absolute extreme end of what insiders think, especially if we restrict ourselves to those actively engaged with research in the last several years. A typical reader could easily come away from that passage thinking otherwise.
Would you say the same thing about the negations of that claim? If you saw e.g. various tech companies and politicians talking about how they're going to build AGI and then [something that implies that people will still be alive afterwards], would you call them out and say they need to qualify their claim with uncertainty, or else they are being unreasonable?
Re: the insider conversation: Yeah, I guess it depends on what you mean by 'the insider conversation' and whether you think the impression random members of the public will get from these passages brings them closer or farther away from understanding what's happening. My guess is that it brings them closer to understanding what's happening; people just do not realize how seriously experts take the possibility that AGI will literally happen and literally kill literally everyone. It's a serious possibility. I'd even dare to guess that the majority of people building AGI (weighted by how much they are contributing) think it's a serious possibility, which maybe we can quantify as >5% or so, despite the massive psychological pressure of motivated cognition / self-serving rationalization to think otherwise. And the public does not realize this yet, I think.
Also, on a more personal level, I've felt exactly the same way about my own daughter for the past two years or so, ever since my timelines shortened.
The negation of the claim would not be "There is definitely nothing to worry about re AI x-risk." It would be something much more mundane-sounding, like "It's not the case that if we go ahead with building AGI soon, we all die."
That said, yay -- insofar as you aren't just applying a double standard here, I'll agree with you. It would have been better if Yud had added some uncertainty disclaimers.
"But yeah, I wish this hadn't happened."
Who else is gonna write the article? My sense is that no one (including me) is starkly stating publicly the seriousness of the situation.
"Yudkowsky is obnoxious, arrogant, and most importantly, disliked, so the more he intertwines himself with the idea of AI x-risk in the public imagination, the less likely it is that the public will take those ideas seriously"
I'm worried about people making character attacks on Yudkowsky (or other alignment researchers) like this. I think the people who believe they can probably solve alignment by just going full-speed ahead and winging it are the arrogant ones; Yudkowsky's arrogant-sounding comments about how we need to be very careful and slow are negligible in comparison. I'm guessing you agree with this (not sure), and we should be able to criticise him for his communication style, but I am a little worried about people publicly undermining Yudkowsky's reputation in that context. This seems like not what we would do if we were trying to coordinate well.
takes a deep breath
(Epistemic status: vague, ill-formed first impressions.)
So that's what we're doing, huh? I suppose EY/MIRI has reached the point where worrying about memetics / optics has become largely a non-concern, in favor of BROADCASTING TO THE WORLD JUST HOW FUCKED WE ARE
I have... complicated thoughts about this. My object-level read of the likely consequences is that I have no idea what the object-level consequences are likely to be, other than that this basically seems to be an attempt at heaving a gigantic rock through the Overton window, for good or for ill. (Maybe AI alignment becomes politicized as a result of this? But perhaps it already has been! And even if not, maybe politicizing it will at least raise awareness, so that it might become a cause area with similar notoriety as e.g. global warming—which appears to have at least succeeded in making token efforts to reduce greenhouse emissions?)
I just don't know. This seems like a very off-distribution move from Eliezer—which I suspect is in large part the point: when your model predicts doom by default, you go off-distribution in search of higher-variance regions of outcome space. So I suppose from his viewpoint, this action does make some sense; I am (however) vaguely annoyed on behalf of other alignment teams, whose jobs I at least mildly predict will get harder as a result of this.
This seems like a very off-distribution move from Eliezer—which I suspect is in large part the point: when your model predicts doom by default, you go off-distribution in search of higher-variance regions of outcome space.
That's not how I read it. To me it's an attempt at the simple, obvious strategy of telling people ~all the truth he can about a subject they care a lot about and where he and they have common interests. This doesn't seem like an attempt to be clever or explore high-variance tails. More like an attempt to explore the obvious strategy, or to follow the obvious bits of common-sense ethics, now that lots of allegedly clever 4-dimensional chess has turned out stupid.
I don't think what you say, Anna, contradicts what dxu said. The obvious simple strategy is now being tried because the galaxy-brained strategies don't seem like they are working; the galaxy-brained strategies seemed lower-variance and more sensible in general at the time, but now they seem less sensible, so EY is switching to the higher-variance, less-galaxy-brained strategy.
"For instance, personally I think the reason so few people take AI alignment seriously is that we haven't actually seen anything all that scary yet. "
And if this "actually scary" thing happens, people will know that Yudkowsky wrote the article beforehand, and they will know who the people are that mocked it.
The average person on the street is even further away from this, I think.
This contradicts the existing polls, which appear to say that everyone outside of your subculture is much more concerned about AGI killing everyone. It looks like if it came to a vote, delaying AGI in some vague way would win by a landslide, and even Eliezer's proposal might win easily.
People like Ezra Klein are hearing Eliezer and rolling his position into their own more palatable takes. I really don't think it's necessary for everyone to play that game, it seems really good to have someone out there just speaking honestly, even if they're far on the pessimistic tail, so others can see what's possible. 4D chess here seems likely to fail.
https://steno.ai/the-ezra-klein-show/my-view-on-ai
Also, there's the sentiment going around that normies who hear this are actually way more open to the simple AI Safety case than you'd expect; we've been extrapolating too much from current critics. Tech people have had years to formulate rationalizations and reassure one another they are clever skeptics for dismissing this stuff. Meanwhile, regular folks will often spout off casual proclamations that the world is likely ending due to climate change or social decay or whatever; they seem to err on the side of doomerism as often as the opposite. The fact that Eliezer got published in TIME is already a huge point in favor of his strategy working.
EDIT: Case in point! Met a person tonight, a completely offline, rural, anti-vax, astrology, doesn't-follow-the-news type of person. I said the word AI and immediately she said she thinks "robots will eventually take over". I understand this might not be the level of sophistication we'd desire, but at least be aware that the raw material is out there. No idea how it'll play out, but 4d chess still seems like a mistake; let Yud speak his truth.
I think that Eliezer (and many others including myself!) may be susceptible to "living in the should-universe"
That's a new one!
More seriously: Yep, it's possible to be making this error on a particular dimension, even if you're a pessimist on some other dimensions. My current guess would be that Eliezer isn't making that mistake here, though.
For one thing, the situation is more like "Eliezer thinks he tried the option you're proposing for a long time and it didn't work, so now he's trying something different" (and he's observed many others trying other things and also failing), rather than "it's never occurred to Eliezer that LWers are different from non-LWers".
I think it's totally possible that Eliezer and I are missing important facts about an important demographic, but from your description I think you're misunderstanding the TIME article as more naive and less based-on-an-underlying-complicated-model than is actually the case.
I just don't know. This seems like a very off-distribution move from Eliezer—which I suspect is in large part the point: when your model predicts doom by default, you go off-distribution in search of higher-variance regions of outcome space. So I suppose from his viewpoint, this action does make some sense; I am (however) vaguely annoyed on behalf of other alignment teams, whose jobs I at least mildly predict will get harder as a result of this.
Personally, I think Eliezer's article is actually just great for trying to get real policy change to happen here. It's not clear to me why Eliezer saying this would make anything harder for other policy proposals. (Not that I agree with everything he said, I just think it was good that he said it.)
I am much more conflicted about the FLI letter; its particular policy prescription seems not great to me, and I worry it makes us look pretty bad if we try approximately the same thing again with a better policy prescription after this one fails, which is approximately what I expect we'll need to do.
(Though to be fair this is as someone who's also very much on the pessimistic side and so tends to like variance.)
I think this is probably right. When all hope is gone, try just telling people the truth and see what happens. I don't expect it will work, I don't expect Eliezer expects it to work, but it may be our last chance to stop it.
One quote I expect to be potentially inflammatory / controversial:
Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.
I'll remark that this is not in any way a call for violence or even military escalation.
Multinational treaties (about nukes, chemical weapons, national borders, whatever), with clear boundaries and understanding of how they will be enforced on all sides, are generally understood as a good way of decreasing the likelihood of conflicts over these issues escalating to actual shooting.
Of course, potential treaty violations should be interpreted charitably, but enforced firmly according to their terms, if you want your treaties to actually mean anything. This has not always happened for historical treaties, but my gut sense is that on the balance, the existence of multinational treaties has been a net-positive in reducing global conflict.
It is absolutely a call for violence.
He says if a "country outside the agreement" builds a GPU cluster, then some country should be willing to destroy that cluster by airstrike. That is not about enforcing agreements. That means enforcing one's will unilaterally on a non-treaty nation -- someone not a party to a multinational treaty.
"Hey bro, we decided if you collect more than 10 H100s we'll bomb you" is about as clearly violence as "Your money or your life."
Say you think violence is justified, if that's what you think. Don't give me this "nah, airstrikes aren't violence" garbage.
Strictly speaking it is a (conditional) "call for violence", but we often reserve that phrase for atypical or extreme cases rather than the normal tools of international relations. It is no more a "call for violence" than treaties banning the use of chemical weapons (which the mainstream is okay with), for example.
Yeah, this comment seems technically true but misleading with regard to how people actually use words.
It is advocating that we treat this as the class of treaty we consider nuclear treaties to be, and yes, that involves violence, but "calls for violence" just means something else.
The use of violence in case of violations of the NPT treaty has been fairly limited and highly questionable in international law. And, in fact, calls for such violence are very much frowned upon because of fear they have a tendency to lead to full scale war.
No one has ever seriously suggested violence as a response to potential violation of the various other nuclear arms control treaties.
No one has ever seriously suggested running a risk of nuclear exchange to prevent a potential treaty violation. So, what Yudkowsky is suggesting is very different than how treaty violations are usually handled.
Given Yudkowsky's view that the continued development of AI has an essentially 100% probability of killing all human beings, his view makes total sense - but he is explicitly advocating for violence up to and including acts of war. (His objections to individual violence mostly appear to relate to such violence being ineffective.)
It's a call for preemptive war; or rather, it's a call to establish unprecedented norms that would likely lead to a preemptive war if other nations don't like the terms of the agreement. I think advocating a preemptive war is well-described as "a call for violence" even if it's common for mainstream people to make such calls. For example, I think calling for an invasion of Iraq in 2003 was unambiguously a call for violence, even though it was done under the justification of preemptive self-defense.
Also, there is a big difference between "Calling for violence", and "calling for the establishment of an international treaty, which is to be enforced by violence if necessary". I don't understand why so many people are muddling this distinction.
It seems like this makes any proposed criminalization of an activity punishable by the death penalty a call for violence?
Yes! Particularly if it's an activity people currently do. Promoting the death penalty for women who get abortions is calling for violence against women; promoting the death penalty for apostasy from Islam is calling for violence against apostates. I think if a country is contemplating passing a law to kill rapists, and someone says "yeah, that would be a great fuckin law", they are calling for violence against rapists, whether or not it is justified.
I don't really care whether something occurs beneath the auspices of supposed international law. Saying "this coordinated violence is good and worthy" is still saying "this violence is good and worthy." If you call for a droning in Pakistan, and a droning in Pakistan occurs and kills someone, what were you calling for, if not violence?
Meh, we all agree on what's going on here, in terms of concrete acts being advocated and I hate arguments over denotation. If "calling for violence" is objectionable, "Yud wants states to coordinate to destroy large GPU clusters, potentially killing people and risking retaliatory killing up to the point of nuclear war killing millions, if other states don't obey the will of the more powerful states, because he thinks even killing some millions of people is a worthwhile trade to save mankind from being killed by AI down the line" is, I think, very literally what is going on. When I read that it sounds like calling for violence, but, like, dunno.
The thing I’m pretty worried about here is people running around saying ‘Eliezer advocated violence’, and people hearing ‘unilaterally bomb data centers’ rather than ‘build an international coalition that enforces a treaty similar to how we treat nuclear weapons and bioweapons, and enforce it.’
I hear you saying (and agree with) “guys, you should not be oblivious to the fact that this involves willingness to use nuclear weapons.” Yes, I agree very much that it’s important to stare that in the face.
But “a call for willingness to use violence by state actors” is just pretty different from “a call for violence”. Simpler messages move faster than more nuanced messages. Going out of your way to accelerate simple and wrong conceptions of what’s going on doesn’t seem like it’s helping anyone.
people hearing ‘unilaterally bomb data centers’ rather than ‘build an international coalition that enforces a treaty similar to how we treat nuclear weapons and bioweapons, and enforce it.’
It is rare to start wars over arms treaty violations. The proposal considered here -- if taken seriously -- would not be an ordinary enforcement action but rather a significant breach of sovereignty almost without precedent within this context. I think it's reasonable to consider calls for preemptive war extremely seriously, and treat it very differently than if one had proposed e.g. an ordinary federal law.
I'm specifically talking about the reference class of nuclear and bioweapons, which do sometimes involve invasion or threat-of-invasion of sovereign states. I agree that's really rare, something we should not do lightly.
But I don't think you even need Eliezer-levels-of-P(doom) to think the situation warrants that sort of treatment. The most optimistic people I know of who seem to understand the core arguments say things like "10% x-risk this century", which I think is greater than x-risk likelihood from nuclear war.
I agree with this. I find it very weird to imagine that "10% x-risk this century" versus "90% x-risk this century" could be a crux here. (And maybe it's not, and people with those two views in fact mostly agree about governance questions like this.)
Something I wouldn't find weird is if specific causal models of "how do we get out of this mess" predict more vs. less utility for state interference. E.g., maybe you think 10% risk is scarily high and a sane world would respond to large ML training runs way more aggressively than it responds to nascent nuclear programs, but you also note that the world is not sane, and you suspect that government involvement will just make the situation even worse in expectation.
If nuclear war occurs over alignment, then in the future people are likely to think about "alignment" much, much worse than people currently think about words like "eugenics," for reasons actually even better than the ones for which people currently dislike "eugenics." Additionally, I don't think it will get easier to coordinate post nuclear war, in general; I think it probably takes us closer to a post-dream-time setting, in the Hansonian sense. So -- obviously predicting the aftermath of nuclear war is super chaotic, but my estimate of the % of the future light-cone utilized goes down -- and if alignment caused the nuclear war, it should go down even further on models which judge alignment to be important!
This is a complex / chaotic / somewhat impossible calculation of course. But people seem to be talking about nuclear war like it's a P(doom)-from-AI-risk reset button, and not realizing that there's an implicit judgement about future probabilities that they are making. Nuclear war isn't the end of history but another event whose consequences you can keep thinking about.
(Also, we aren't gods, and EV is by fucking golly the wrong way to model this, but, different convo)
It makes me... surprised? fe...
I agree pretty strongly with your points here, especially the complete lack of good predictions from EY/MIRI about the current Cambrian explosion of intelligence, and how any sane agent using a sane updating strategy (like mixture of experts or, equivalently, Solomonoff weighting) should more or less now discount/disavow much of their world model.
However I nonetheless agree that AI is by far the dominant x-risk. My doom probability is closer to ~5% perhaps, but the difference between 5% and 50% doesn't cash out to much policy difference at this point.
So really my disagreement is more on alignment strategy. A problem with this site is that it overweights EY/MIRI classic old alignment literature and arguments by about 100x what it should be, and is arguably doing more harm than good by overpromoting those ideas vs alternate ideas flowing from those who actually did make reasonably good predictions about the current Cambrian explosion - in advance.
If there was another site that was a nexus for AI/risk/alignment/etc with similar features but with most of the EY/MIRI legacy cultish stuff removed, I would naturally jump there. But it doesn't seem to exist yet.
So really my disagreement is more on alignment strategy. A problem with this site is that it overweights EY/MIRI classic old alignment literature and arguments by about 100x what it should be
I don't think there are many people with alignment strategies and research that they're working on. Eliezer has a hugely important perspective; Scott Garrabrant, Paul Christiano, John Wentworth, Steve Byrnes, and more all have approaches and perspectives they're working on full-time, too. If you're working on this full-time and any of your particular ideas check out as plausible, I think there's space for you to post here and get some engagement and respect (if you post in a readable style that isn't that of obfuscatory academia). If you've got work you're doing on it full-time, I think you can probably post here semi-regularly and eventually find collaborators, people you're interested in feedback from, and eventually funding. You might not get super high karma all the time, but that's okay; I think a few well-received posts is enough to not have to worry about a bunch of low-karma posts.
The main thing that I think makes space for a perspective here is (a) someone is seriously committ...
So really my disagreement is more on alignment strategy. A problem with this site is that it overweights EY/MIRI classic old alignment literature and arguments by about 100x what it should be
I don't think there are many people with alignment strategies and research that they're working on.
I agree that's a problem - but causally downstream of the problem I mention. Whereas Bostrom deserves credit for raising awareness of AI-risk in academia, EY/MIRI deserves credit for awakening many young techies to the issue - but also some blame.
Whether intentionally or not, the EY/MIRI worldview aligned itself against DL and its proponents, leading to an antagonistic dynamic that you may not have experienced if you haven't spent much time on r/MachineLearning or similar. Many people in ML truly hate anything associated with EY/MIRI/LW. Part of that is perhaps just the natural result of someone sounding an alarm that your life's work could literally kill everyone. But it really really doesn't help if you then look into their technical arguments and reach the conclusion that they don't know what they are talking about.
I otherwise agree with much of your comment. I think this site is l...
"But I don't think you even need Eliezer-levels-of-P(doom) to think the situation warrants that sort of treatment."
Agreed. If a new state develops nuclear weapons, this isn't even close to creating a 10% x-risk, yet the idea of airstrikes on nuclear enrichment facilities, even though it is very controversial, has for a long time very much been an option on the table.
You're assuming I agree with the premise. I don't. I don't think that bombing GPU clusters in other countries will help much to advance AI safety, so I don't think the conclusion follows from the premise.
I agree with the principle that if X is overwhelmingly important and Y achieves X, then we should do Y, but the weak point of the argument is that Y achieves X. I do not think it does. You should respond to the argument I'm actually making.
You are muddling the meaning of "pre-emptive war", or even "war". I'm not trying to diminish the gravity of Yudkowsky's proposal, but a missile strike on a specific compound known to contain WMD-developing technology is not a "pre-emptive war" or "war". Again I'm not trying to diminish the gravity, but this seems like an incorrect use of the term.
Say you think violence is justified, if that's what you think. Don't give me this "nah, airstrikes aren't violence" garbage.
I think (this kind of) violence is justified. Most people support some degree of state violence. I don't think it's breaching any reasonable deontology for governments to try to prevent a rogue dictator from building something, in violation of a clear international treaty, that might kill many more people than the actual airstrike would kill. It's not evil (IMO) when Israel airstrikes Iranian enrichment facilities, for example.
I think the scenario is that all the big powers agree to this, and agree to enforce it on everyone else.
OK, I guess I was projecting how I would imagine such a scenario working, i.e. through the UN Security Council, thanks to a consensus among the big powers. The Nuclear Non-Proliferation Treaty seems to be the main precedent, except that the NNPT allows for the permanent members to keep their nuclear weapons for now, whereas an AGI Prevention Treaty would have to include a compact among the enforcing powers to not develop AGI themselves.
UN engagement with the topic of AI seems slender, and the idea that AI is a threat to the survival of the human race does not seem to be on their radar at all. Maybe the G-20's weirdly named "supreme audit institution" is another place where the topic could first gain traction at the official inter-governmental level.
Fox News’ Peter Doocy uses all his time at the White House press briefing to ask about an assessment that “literally everyone on Earth will die” because of artificial intelligence: “It sounds crazy, but is it?”
See if this one resonates with you: https://www.cold-takes.com/ai-could-defeat-all-of-us-combined/
Here's a comment from r/controlproblem with feedback on this article (plus tips for outreach in general) that I thought was very helpful.
Where's the lie?
More generally, if this is the least radical policy that Eliezer thinks would actually work, then this is the policy that he and others who believe the same thing should be advocating for in public circles, and they should refuse to moderate a single step. You don't dramatically widen the Overton window in <5 years by arguing incrementally inside of it.
Is this now on the radar of national security agencies and the UN Security Council? Is it being properly discussed inside the US government? If not, are meetings being set up? Would be good if someone in the know could give an indication (I hope Yudkowsky is busy talking to lots of important people!)
Jeff Bezos has now followed Eliezer on Twitter: https://twitter.com/bigtechalert/status/1641659849539833856?s=46&t=YyfxSdhuFYbTafD4D1cE9A
[EDIT: fallenpegasus points out that there's a low bar to entry to this corner of TIME's website. I have to say I should have been confused that even now they let Eliezer write in his own idiom.]
The Eliezer of 2010 had no shot of being directly published (instead of featured in an interview that at best paints him as a curiosity) in TIME of 2010. I'm not sure about 2020.
I wonder at what point the threshold of "admitting it's at least okay to discuss Eliezer's viewpoint at face value" was crossed for the editors of TIME. I fear the answer is "last month".
Public attention is rare, and safety measures are even more rare unless there's real-world damage. This is a known pattern in engineering, product design, and project planning, so I fear there will be little public attention and even less legislation until someone gets hurt by AI. That could take the form of a hot-coffee-type incident or a Chernobyl-type incident. The threshold won't be discussing Eliezer's point of view (we've been doing that for a long time) but losing sleep over it. I appreciate Yudkowsky's use in the article of the think-of-the-children stance, which has a great track record for sparking legislation.
Eliezer had a response on Twitter to the criticism of "calling for violence":
...The great political writers who also aspired to be good human beings, from George Orwell on the left to Robert Heinlein on the right, taught me to acknowledge in my writing that politics rests on force.
George Orwell considered it a tactic of totalitarianism, that bullet-riddled bodies and mass graves were often described in vague euphemisms; that in this way brutal policies gained public support without their prices being justified, by hiding those prices.
Robert Heinlein thought it beneath a citizen's dignity to pretend that, if they bore no gun, they were morally superior to the police officers and soldiers who bore guns to defend their law and their peace; Heinlein, both metaphorically and literally, thought that if you eat meat—and he was not a vegetarian—you ought to be willing to visit a farm and try personally slaughtering a chicken.
When you pass a law, it means that people who defy the law go to jail; and if they try to escape jail they'll be shot. When you advocate an international treaty, if you want that treaty to be effective, it may mean sanctions that will starve families, or
I'm getting reports that Time Magazine's website is paywalled for some people e.g. in certain states or countries or something. Here is the full text of the article:
...An open letter published today calls for “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.”
This 6-month moratorium would be better than no moratorium. I have respect for everyone who stepped up and signed it. It’s an improvement on the margin.
I refrained from signing because I think the letter is understating the seriousness of the situation and asking for too little to solve it.
The key issue is not “human-competitive” intelligence (as the open letter puts it); it’s what happens after AI gets to smarter-than-human intelligence. Key thresholds there may not be obvious, we definitely can’t calculate in advance what happens when, and it currently seems imaginable that a research lab would cross critical lines without noticing.
Many researchers steeped in these issues, including myself, expect that the most likely result of building a superhumanly smart AI, under anything remotely like the current circumstances, is that literally everyone on Earth will die. Not as in
I'll note (because some commenters seem to miss this) that Eliezer is writing in a convincing style for a non-technical audience. Obviously the debates he would have with technical AI safety people are different than what is most useful to say to the general population.
What are examples that could help one see this tie more clearly? Procedures that work similarly enough that we could say "we do X when planning and building a bridge, and if we do X in AI building...". Does there even exist such an X that can be applied both to engineering a bridge and to engineering an AI?
Eliezer's repeated claim that we have literally no idea about what goes on in AI because they're inscrutable piles of numbers is untrue and he must know that. There have been a number of papers and LW posts giving at least partial analysis of neural networks, learning how they work and how to control them at a fine grained level, etc. That he keeps on saying this without caveat casts doubt on his ability or willingness to update on new evidence on this issue.
I struggle to recall another piece of technology that humans have built and yet understand less than AI models trained by deep learning. The statement that we have "no idea" seems completely appropriate. And I don't think he's trying to say that interpretability researchers are wasting their time by noticing that current state of affairs; the not knowing is why interpretability research is necessary in the first place.
Eliezer has clear beliefs about interpretability and bets on it: https://manifold.markets/EliezerYudkowsky/by-the-end-of-2026-will-we-have-tra
Doesn't the prisoner's dilemma (esp. in the military context) inevitably lead us to further development of AI? If so, it would seem that focusing attention and effort on developing AI as safely as possible is a more practical and worthwhile issue than any attempt to halt such development altogether.
I’ve seen pretty uniform praise from rationalist audiences, so I thought it worth mentioning that the prevailing response I’ve seen from within a leading lab working on AGI is that Eliezer came off as an unhinged lunatic.
For lack of a better way of saying it, folks not enmeshed within the rat tradition—i.e., normies—do not typically respond well to calls to drop bombs on things, even if such a call is a perfectly rational deduction from the underlying premises of the argument. Eliezer either knew that the entire response to the essay would be dominated by...
I think the harsh truth is that no one cared about nuclear weapons until Hiroshima was bombed. The concept of one nation "disarming" its AI will never be appreciated until somebody gets burned.
Do you remember the end of Watchmen?
...To visualize a hostile superhuman AI, don’t imagine a lifeless book-smart thinker dwelling inside the internet and sending ill-intentioned emails. Visualize an entire alien civilization, thinking at millions of times human speeds, initially confined to computers—in a world of creatures that are, from its perspective, very stupid and very slow. A sufficiently intelligent AI won’t stay confined to computers for long. In today’s world you can email DNA strings to laboratories that will produce proteins on demand, allo
That's not an "article in Time". That's a "TIME Ideas" contribution. It has less weight and less vetting than any given popular substack blog.
I don't know how most articles get into that section, but I know, from direct communication with a Time staff writer, that Time reached out and asked for Eliezer to write something for them.
Time appears to have commissioned a graphic for the article (the animated gif with red background and yellow circuits forming a mushroom cloud, captioned "Illustration for TIME by Lon Tweeten", with nothing suggesting it to be a stock photo), so there appears to be some level of editorial spotlighting. The article currently also appears on time.com in a section titled "Editor's picks" in a list of 4 articles, where the other 3 are not "Ideas" articles.
"The moratorium on new large training runs needs to be indefinite and worldwide."
Here lies the crux of the problem. Classical prisoners' dilemma, where individuals receive the greatest payoffs if they betray the group rather than cooperate. In this case, a bad actor will have the time to leapfrog the competition and be the first to cross the line to super-intelligence. Which, in hindsight, would be an even worse outcome.
The genie is out of the bottle. Given how (relatively) easy it is to train large language models, it is safe to assume that this whole fie...
Capabilities Researcher: *repeatedly shooting himself in the foot, reloading his gun, shooting again* "Wow, it sure is a shame that my selfish incentives aren't aligned with the collective good!" *reloads gun, shoots again*
You know what... I read the article, then your comments here... and I gotta say - there is absolutely not a chance in hell that this will come even remotely close to being considered, let alone executed. Well - at least not until something goes very wrong... and this something need not be "We're all gonna die" but more like, say, an AI system that melts down the monetary system... or is used (either deliberately, but perhaps especially if accidentally) to very negatively impact a substantial part of a population. An example could be that it ends up destroy...
If I had infinite freedom to write laws, I might carve out a single exception for AIs being trained solely to solve problems in biology and biotechnology, not trained on text from the internet, and not to the level where they start talking or planning; but if that was remotely complicating the issue I would immediately jettison that proposal and say to just shut it all down.
I thought this was interesting. Wouldn't an AI solving problems in biology pick up Darwinian habits and be just as dangerous as one trained on text? Why is training on text from the int...
Market:
https://manifold.markets/tailcalled/will-the-time-article-and-the-open
I suppose even if this market resolves YES, it may be worth the loss of social capital for safety reasons. Though I'm not convinced by shutting down AI research without an actual plan of how to proceed.
Also even if the market resolves YES and it turns out strategically bad, it may be worth it for honesty reasons.
For someone so good at getting a lot of attention, he sure has no idea what the second-order effects of his actions on capabilities will be.
edit: also, dang, anyone who thinks he did a bad job at PR is sure getting heavily downvoted here
>The likely result of humanity facing down an opposed superhuman intelligence is a total loss. Valid metaphors include “a 10-year-old trying to play chess against Stockfish 15”, “the 11th century trying to fight the 21st century,” and “Australopithecus trying to fight Homo sapiens“.
But obviously these metaphors are not very apt, since humanity kinda has a massive incumbent advantage that would need to be overcome. Rome Sweet Rome is a fun story not because 21st century soldiers and Roman legionnaires are intrinsically equals but because the technologica...
I just want to be clear I understand your "plan".
We are going to build a powerful self-improving system, and then let it try to end humanity with some p(doom) < 1 (hopefully), and then do that iteratively?
My gut reaction to a plan like that looks like this "Eff you. You want to play Russian roulette, fine sure do that on your own. But leave me and everyone else out of it"
AI will be able to invent highly-potent weapons very quickly and without risk of detection, but it seems at least pretty plausible that...... this is just too difficult
You lack imagination; it's painfully easy, and the cost and required IQ have been dropping steadily every year.
And no there is zero chance I will elaborate on any of the possible ways humanity purposefully could be wiped out.
[Reposting from a Facebook thread discussing the article because my thoughts may be of interest]
I woke to see this shared by Timnit Gebru on my Linkedin and getting 100s of engagements. https://twitter.com/xriskology/status/1642155518570512384
It draws a lot of attention to the airstrikes comment which is unfortunate.
Stressful to read
A quick comment on changes that I would probably make to the article:
Make the message less about EY so it is harder to attack the messenger and undermine the message.
Reference other supporting authorities and sources of eviden...
Yud keeps asserting the near-certainty of human extinction if superhuman AGI is developed before we do a massive amount of work on alignment. But he never provides anything close to a justification for this belief. That makes his podcast appearances and articles unconvincing: the most surprising and crucial part of his argument is left unsupported. Why has he made the decision to present his argument this way? Does he think there is no normie-friendly argument for the near-certainty of extinction? If so, it's kind of a black pill with regard to his argumen...
The point isn't that I'm unaware of the orthogonality thesis; it's that Yudkowsky doesn't present it in his recent popular articles and podcast appearances[0]. So, he asserts that the creation of superhuman AGI will almost certainly lead to human extinction (until massive amounts of alignment research have been successfully carried out), but he doesn't present an argument for why that is the case. Why doesn't he? Is it because he thinks normies cannot comprehend the argument? Is this not a black pill? IIRC he did assert that superhuman AGI would likely decide to use our atoms on the Bankless podcast, but he didn't present a convincing argument in favour of that position.
[0] see the following: https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/ ,
This letter makes me think that only a large-scale nuclear war, specifically targeting AI-related infrastructure like electric power plants, chip factories, and data centers, could be a plausible alternative to the creation of non-aligned AI. And I don't like this alternative.
Obviously, we cannot figure out how to put a leash on an intellect that will be ahead of us by many orders of magnitude and will develop instantly. We may miss the moment of the singularity.
People regularly hack almost every defense we come up with; an intelligence so superior to us would do so all the more easily.
But if it is so easy to make such a strong AI (I mean the speed of its creation, and our position as an advanced civilization within the time period of the existence of the universe), surely someone has already created it, and we are either in a simulation, or we just h...
Could we take from Eliezer's message the need to redirect more efforts into AI policy and into widening the Overton window to try, in any way we can, to give AI safety research the time it needs? As Raemon said, the Overton window might be widening already, making more ideas "acceptable" for discussion, but it doesn't seem enough. I would say the typical response from the overwhelming majority of the population and world leaders to misaligned AGI concerns is still to treat them as a panicky sci-fi dystopia rather than to say "maybe we should stop every...
In December 2022, awash in recent AI achievements, it concerned me that much of the technology had become very synergistic during the previous couple of years. Essentially: AI-type-X (e.g. Stable Diffusion) can help improve AI-type-Y (e.g. Tesla self-driving) across many, many pairs of X and Y. And now, not even 4 months after that, we have papers released on GPT4's ability to self-reflect and self-improve. Given that it is widely known how badly human minds predict geometric progression, I have started to feel like we are already past the AI singularity "...
Tl;dr - We must enlist and educate professional politicians, reporters and policymakers to talk about alignment.
This interaction between Peter Doocy and Karine Jean-Pierre (YouTube link below) is representative of how EY’s time article has been received in many circles.
I see a few broad fronts of concern in LLMs.
Of these, alignment is like...
How about we augment human intelligence in a massive way and get augmented humans to solve the AGI problem? If we can make AGI, should we not be close enough to being able to augment human intelligence as well?
Whatever the rationality of a ban, it won't go far; it won't happen, because the geopolitical game between superpowers, which already carries present existential risks, is right now bigger than any future risks that could emerge from advanced AI development. And if you do not have sufficiently developed AI technology at some point in that geopolitical game, you may well face an existential risk bigger than that of not having nuclear weapons.
So, there you go: do you think risks that are at present almost nonexistent (albeit plausibly quite close in the future) can outweigh the multiple other, hotter, present existential risks?
I can't imagine such a proposal working well in the United States. I can imagine some countries e.g. China potentially being on board with proposals like these. Because the United Nations is a body chiefly concerned with enforcing international treaties, I imagine it would be incentivized to support arguments in favor of increasing its own scope and powers. I do predict that AI will be an issue it will eventually decide to weigh in on and possibly act on in a significant way.
However, that creates a kind of bi-polar geopolitical scenario for the remainder o...
There's no proof that superintelligence is even possible. The idea of a self-updating AI that will rewrite itself to godlike intelligence isn't supported.
There is just so much hand-wavey magical thinking going on in regard to the supposed superintelligent AI takeover.
The fact is that manufacturing networks are damn fragile. Power networks too. Some bad AI is still limited by these physical things. Oh, it's going to start making its own drones? Cool, so it is running thirty mines, and various shops, plus refining the oil and all the rest of the network's requ...
New article in Time Ideas by Eliezer Yudkowsky.
Here are some selected quotes.
In reference to the letter that just came out (discussion here):