Thanks Nicholas for raising this issue. I think your framing overcomplicates the crux:
the root cause of an inspiring future with AI won't be international coordination, but national self-interest.
Once the US and Chinese leadership serves their self-interest by preventing uncontr...
The only winning move is “agreement”, not “not to play”. Those are quite different things.
But how do we find an agreement when so many parties are involved? Treaty-making has been failing miserably for nuclear weapons and climate. So we need a much better treaty-making process, perhaps that of the open intergovernmental constituent assembly?
Salut Boghdan!
I'm not sure this line of reasoning has the force some people seem to assume. What would you expect the results of similar hypothetical referendums to have been, e.g. ones held before the industrial revolution or the agricultural revolution, on those changes?
I'm somewhat horrified by this comment. This hypothetical referendum is about replacing all biological humans by machines, whereas the agricultural and industrial revolutions did no such thing. If you believe in democracy, then why would you allow a tiny minority to decide to kill...
Right, Tamsin: so reasonable safety standards would presumably ban fully unrestricted superassistants too, but allow more limited assistants that could still be incredibly helpful. I'm curious what AI safety standards you'd propose – it's not a hypothetical question, since many politicians would like to know.
Thanks Noosphere89 for your long and thoughtful comment! I don't have time to respond to everything before putting my 1-year-old to bed, but here are some brief comments.
1) Although I appreciate that you wrote out a proposed AGI alignment plan, I think you'll agree that it contains no theorems or proofs, or even quantitative risk bounds. Since we insist on quantitative risk bounds before allowing much less dangerous technologies such as airplanes and nuclear reactors, my view is that it would be crazy to launch AGI without quantitative risk bounds - es...
Excellent question, Gordon! I defined tool AI specifically as controllable, so AI without a quantitative guarantee that it's controllable (or "safe", as you write) wouldn't meet the safety standards and its release would be prohibited. I think it's crucial that, just as for aviation and pharma, the onus is on the companies rather than the regulators to demonstrate that products meet the safety standards. For controllable tools with great potential for harm (say plastic explosives), we already have regulatory approaches for limiting who can use them and how. Analogously, there's discussion at the UNGA this week about creating a treaty on lethal autonomous weapons, which I support.
I defined tool AI specifically as controllable, so AI without a quantitative guarantee that it's controllable (or "safe", as you write) wouldn't meet the safety standards and its release would be prohibited.
If your stated definition is really all you mean by tool AI, then you've defined tool AI in a very nonstandard way that will confuse your readers.
When most people hear "tool AI", I expect them to think of AI like hammers: tools they can use to help them achieve a goal, but aren't agentic and won't do anything on their own they weren't directly asked to ...
Important clarification: Neither here nor in the Twitter post did I advocate appeasement or giving in to blackmail. In the Venn diagram of possible actions, there's certainly a non-empty intersection of "de-escalation" and "appeasement", but they're not the same set, and there are de-escalation strategies that don't involve appeasement but might nonetheless reduce nuclear war risk. I'm curious: do you agree that halting (and condemning) the following strategies can reduce escalation and help cool things down without giving in to blackmail?
The more items on the list of nuclear near-misses, the more convinced you should be that de-escalation works, no matter how close we get to nuclear war.
That's an interesting argument, but it ignores the selection effect of survivor bias. If you play Russian roulette many times and survive, that doesn't mean that the risk you took was small. Similarly, if you go with the Xia et al. estimate that nuclear winter kills 99% of Americans and Europeans, the fact that we find ourselves in that demographic in 2022 doesn't mean that the past risks we took w...
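As a toy illustration of that selection effect (a minimal sketch, not a model of nuclear risk): if you simulate many Russian-roulette players and then condition on survival, the survivors observe no disasters at all, even though the per-pull risk they faced was 1/6 and most players don't make it:

```python
import random

# Toy illustration of survivor bias (not a model of nuclear risk).
# Each "player" pulls the trigger of a 6-chamber revolver 10 times;
# we then look only at those who survived every pull.
random.seed(0)
N_PLAYERS, N_PULLS, P_DEATH = 100_000, 10, 1 / 6

survivors = sum(
    all(random.random() > P_DEATH for _ in range(N_PULLS))
    for _ in range(N_PLAYERS)
)

# The survivors witnessed zero disasters, yet the risk they took on each
# pull was still 1/6, and most players did not survive: (5/6)**10 ≈ 16%.
print(f"Survival fraction: {survivors / N_PLAYERS:.1%}  (theory: {(5/6)**N_PULLS:.1%})")
```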
Algon, please provide references to peer-reviewed journals supporting your claims that smoke predictions are overblown, etc. Since there's a steady stream of peer-reviewed papers quantifying nuclear winter in serious science journals, I find myself unconvinced by criticism that appears only on blogs and without the detailed data, GitHub code, etc. that tends to accompany peer-reviewed research. Thanks!
The Reisner et al. paper (and the back-and-forth between Robock's group and Reisner's group) casts doubt on this:
Ege, if you find the framework helpful, I'd love to hear your estimates for the factor probabilities 30%, 70%, 80%. I'd also be very interested in seeing alternative endpoint classifications and alternative frameworks. I sense that we both agree that it's valuable to estimate the nuclear war risk, and to base the estimate on a model that decomposes into pieces that can be debated separately, rather than on just gazing into our belly-buttons and tossing out a single probability that feels right.
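For concreteness, here's a minimal sketch of the kind of decomposition I mean, assuming the endpoint probability is simply the product of the conditional factor probabilities along a single path; the factor labels below are hypothetical placeholders, and 30%/70%/80% are just my current guesses, each of which can be debated separately:

```python
# Minimal sketch of a decomposed risk estimate (illustrative only).
# Assumption: the endpoint probability is the product of conditional
# factor probabilities along one path; the factor names are placeholders.
factors = {
    "conventional conflict escalates":   0.30,  # debate this factor on its own
    "nuclear weapon used given conflict": 0.70,
    "large-scale exchange given use":     0.80,
}

p_endpoint = 1.0
for name, p in factors.items():
    p_endpoint *= p

print(f"Combined endpoint probability: {p_endpoint:.1%}")  # 16.8%
```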
Russia also wanted the withdrawal of US troops from the Baltic states, which is also a nonstarter.
Yeah, that was clearly a non-starter, and perhaps a deliberate one they could drop later to save face and claim they'd won a compromise. My point was simply that since the West didn't even offer a promise not to let Ukraine into NATO, I don't think they'd ever agree to a "Kosovo".
Thanks David and Ege for these excellent points! You're giving me too much credit by calling it a "thesis"; it was simply part of my reasoning behind the 30% number. Yeah, I did consider the Gulf War as an important counterexample. I'll definitely consider revising my 30% number downward in my next update, but there are also interesting examples on the other side:
Responding to your examples:
I agree with the Falklands War being a good example of your thesis; I forgot about it while making my list. No arguments there.
I did consider the Yom Kippur War, but I noticed as you did that the national leaders didn't lose power and it was not clear to me whether we should say the Arab forces were defeated in the war. It seems like Egypt achieved at least some limited objectives as a result of the war, even if it fell far short of what they might have wanted to achieve. So I'm not sure if we should consider this as a "su
Thanks Wei for these interesting comments. Whether humans can "solve" ontological crises clearly depends on one's definition of "solve". Although there's arguably a clear best solution for de Blanc's corridor example, it's far from clear that there is any behavior that deserves being called a "solution" if the ontological update causes the entire worldview of the rational agent to crumble, revealing the goal to have been fundamentally confused and undefined beyond repair. That's what I was getting at with my souls example.
As to what Nick's views are, I plan to ask him about this when I see him tomorrow.
Thanks Eliezer for your encouraging words and for all these interesting comments! I agree with your points, and we clearly agree on the bottom line as well: 1) Building FAI is hard and we’re far from there yet. Sorting out “final goal” issues is part of the challenge. 2) It’s therefore important to further research these questions now, before it’s too late. :-)
Thanks Akash! As I mentioned in my reply to Nicholas, I view it as flawed to think that China or the US would only abstain from AGI because of a Sino-US agreement. Rather, they'd each unilaterally do it out of national self-interest.
Once the US and Chinese leadership serves...