All of Max Tegmark's Comments + Replies

Thanks Akash! As I mentioned in my reply to Nicholas, I view it as flawed to think that China or the US would only abstain from AGI because of a Sino-US agreement. Rather, they'd each unilaterally do it out of  national self-interest.

  • It's not in the US self-interest to disempower itself and all its current power centers by allowing a US company to build uncontrollable AGI.
  • It's not in the interest of the Chinese Communist Party to disempower itself by allowing a Chinese company to build uncontrollable AGI. 

Once the US and Chinese leadership serves... (read more)

2Nathan Helm-Burger
I think that China and the US would definitely agree to pause if and only if they can confirm the other is also committing to a pause. Unfortunately, this is a really hard thing to confirm, much harder than with nuclear weapons. Thus, I propose that we are trapped in this suicide race unless we can come up with better coordination mechanisms: lower counterfactual cost to participation, lower entry cost, higher reliability, less reliance on a centralized authority... Decentralized AI-powered privacy-preserving safety inspections and real-time monitoring. Components include: if you opt in to the monitoring of a particular risk a, then you get to view the reports of my monitor on myself for risk a. Worried about a? Me too. Let's both monitor and report to each other. Not worried about b? Fine, then I'll only tell my fellow b-monitoring participants about the reports from my b monitors.
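A minimal sketch of the opt-in, per-risk mutual monitoring described above (the class and method names here are hypothetical illustrations, not anything proposed in the comment): participants who opt in to a given risk category both publish their own monitor reports for that risk and receive everyone else's reports for it.

```python
from collections import defaultdict


class MutualMonitoring:
    """Opt-in, per-risk report sharing: only participants monitoring risk `r`
    see each other's self-monitoring reports for `r`."""

    def __init__(self):
        self._subscribers = defaultdict(set)  # risk -> participants opted in

    def opt_in(self, participant: str, risk: str) -> None:
        self._subscribers[risk].add(participant)

    def publish(self, reporter: str, risk: str, report: str) -> dict:
        """Return {recipient: report} for the reporter's self-report on `risk`."""
        if reporter not in self._subscribers[risk]:
            raise PermissionError("opt in to a risk before reporting on it")
        return {peer: report for peer in self._subscribers[risk] if peer != reporter}


# Example: A and B both worry about risk "a"; C only cares about "b",
# so only B receives A's self-report on "a".
net = MutualMonitoring()
net.opt_in("A", "a"); net.opt_in("B", "a"); net.opt_in("C", "b")
assert net.publish("A", "a", "no violations found") == {"B": "no violations found"}
```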

Thanks Nicholas for raising this issue. I think your framing overcomplicates the crux:
the root cause of an inspiring future with AI won't be international coordination, but national self-interest.

  • It's not in the US self-interest to disempower itself and all its current power centers by allowing a US company to build uncontrollable AGI.
  • It's not in the interest of the Chinese Communist Party to disempower itself by allowing a Chinese company to build uncontrollable AGI. 

Once the US and Chinese leadership serves their self-interest by preventing uncontr... (read more)

Wow – I'd never seen that chillingly prophetic passage! Moloch for the win. 
"The only winning move is not to play."
A military-AGI-industrial-complex suicide race has been my worst nightmare since my teens.
But I didn't expect "the good guys" in the Anthropic leadership to pour gasoline on it.

The only winning move is “agreement”, not “not to play”. There is quite a difference.

But how do we find an agreement when so many parties are involved? Treaty-making has been failing miserably for nuclear and climate. So we need much better treaty-making, perhaps that of the open intergovernmental constituent assembly?

Salut Bogdan!

I'm not sure this line of reasoning has the force some people seem to assume. What would you expect the results of hypothetical, similar referendums to have been, e.g. before the industrial revolution and before the agricultural revolution, on those changes?

I'm somewhat horrified by this comment. This hypothetical referendum is about replacing all biological humans by machines, whereas the agricultural and industrial revolutions did no such thing. If you believe in democracy, then why would you allow a tiny minority to decide to kill... (read more)

8Bogdan Ionut Cirstea
Salut Max! To clarify, I wouldn't personally condone 'replacing all biological humans by machines' and I have found related e/acc suggestions quite inappropriate/repulsive. I don't think there are easy answers here, to be honest. On the one hand, yes, allowing tiny minorities to take risks for all of [including future] humanity doesn't seem right. On the other, I'm not sure it would have necessarily been right either to e.g. stop the industrial revolution if a global referendum in the 17th century had come with that answer. This is what I was trying to get at. I don't think 'lackadaisical support for democratic ideals' is what's going on here (FWIW, I feel incredibly grateful to have been living in liberal democracies, knowing the past tragedies of undemocratic regimes, including in my home country not-so-long-ago), nor am I (necessarily) advocating for a rush to AGI. I just think it's complicated, and it will probably take nuanced cost-benefit analyses based on (ideally quantitative) risk estimates. If I could have it my way, my preferred global policy would probably look something like a coordinated, international pause during which a lot of automated safety research can be produced safely, combined with something like Paretotopian Goal Alignment. (Even beyond the vagueness) I'm not sure how tractable this mix is, though, and how it might trade-off e.g. extinction risk from AI vs. risks from (potentially global, stable) authoritarianism. Which is why I think it's not that obvious.
1Seth Herd
I don't think that's what Bogdan meant. I think if we took a referendum on AI replacing humans entirely, the population would be 99.99% against - far higher than the consensus that might've voted against the industrial revolution (and actually I suspect that referendum might've been in favor - job loss only affected minorities of the population at any one point, I think). Even the e/acc people accused of wanting to replace humanity with machines mostly don't want that, when they're read in detail. I did this with "Beff Jezos's" writings since he's commonly accused of being anti-human. He's really not - he thinks humans will be preserved, or else machines will carry on human values. There are definitely a few people who actually think intelligence is the most important thing to preserve (Sutton), but they're very rare compared to those who want humans to persist. Most of those like Jezos who say it's fine to be replaced by machines are still thinking those machines would be a lot like humans, including having a lot of our values. And even those are quite rare. For the most part, e/acc, d/acc, and doomers all share a love of humanity and its positive potential. We just disagree on how to get there. And given how new and complex this discussion is, I hold hope that we can mostly converge as we sort through the complex logic and evidence.

Right, Tamsin: so reasonable safety standards would presumably ban fully unrestricted superassistants too, but allow more limited assistants that could still be incredibly helpful. I'm curious what AI safety standards you'd propose – it's not a hypothetical question, since many politicians would like to know. 

Thanks Noosphere89 for your long and thoughtful comment! I don't have time to respond to everything before putting my 1-year-old to bed, but here are some brief comments. 

1) Although I appreciate that you wrote out a proposed AGI alignment plan, I think you'll agree that it contains no theorems or proofs, or even quantitative risk bounds. Since we insist on quantitative risk bounds before allowing much less dangerous technologies such as airplanes and nuclear reactors, my view is that it would be crazy to launch AGI without quantitative risk bounds - es... (read more)

6Noosphere89
I can wait for your response, so don't take this as meaning you need to respond immediately, but I do have some comments. After you are done with everything, I invite you to respond to this comment. In response to 1, I definitely didn't show quantitative risk bounds for my proposal, for a couple of reasons. First, my alignment proposal would require a lot more work and concreteness than I was able to provide; my goal was to make an alignment proposal that was concrete enough for other people to fill in the details of how it could actually be done. Then again, that's why they are paid the big bucks and not me. Second, I am both much more skeptical of formal proof/verification for AGI safety than you are, and I also believe that it is unnecessary to do formal proofs to get high confidence in an alignment plan working (though I do think that formal proof may, emphasis on may, be useful for AI control metastrategies). For example, I currently consider the Provably Safe AI agenda by Steve Omohundro and Ben Goldhaber to be far too ambitious at this time; the biggest issue IMO is that the things they are promising rely on being able to quantify over all higher-order behaviors that a system doesn't have, which is out of the range of currently extrapolated formalization techniques. Relatedly, Zach Hatfield-Dodds and Ben Goldhaber have bet on whether 3 locks that couldn't be illegitimately unlocked can be designed by formal proof (Hatfield-Dodds bet no, Goldhaber yes; the bet will resolve in 2027). See these links for more: https://www.lesswrong.com/posts/B2bg677TaS4cmDPzL/limitations-on-formal-verification-for-ai-safety#kPRnieFrEEifZjksa https://www.lesswrong.com/posts/P8XcbnYi7ooB2KR2j/provably-safe-ai-worldview-and-projects#Ku3X4QDBSyZhrtxkM https://www.lesswrong.com/posts/P8XcbnYi7ooB2KR2j/provably-safe-ai-worldview-and-projects#jjFsFmLbKNtMRyttK https://www.lesswrong.com/posts/P8XcbnYi7ooB2KR2j/provably-safe-ai-worldview-and-projects

Excellent question, Gordon! I defined tool AI specifically as controllable, so AI without a quantitative guarantee that it's controllable (or "safe", as you write) wouldn't meet the safety standards and its release would be prohibited.  I think it's crucial that, just as for aviation and pharma, the onus is on the companies rather than the regulators to demonstrate that products meet the safety standards. For controllable tools with great potential for harm (say plastic explosives), we already have regulatory approaches for limiting who can use them and how.  Analogously, there's discussion at the UNGA this week about creating a treaty on lethal autonomous weapons, which I support.

I defined tool AI specifically as controllable, so AI without a quantitative guarantee that it's controllable (or "safe", as you write) wouldn't meet the safety standards and its release would be prohibited.

If your stated definition is really all you mean by tool AI, then you've defined tool AI in a very nonstandard way that will confuse your readers.

When most people hear "tool AI", I expect them to think of AI like hammers: tools they can use to help them achieve a goal, but aren't agentic and won't do anything on their own they weren't directly asked to ... (read more)

Even if tool AI is controllable, tool AI can be used to assist in building non-tool AI. A benign superassistant is one query away from outputting world-ending code.

I indeed meant only "worst so far", in the sense that it would probably kill more people than any previous disaster.

2Petter
Thanks, that makes sense given your assumptions and results.

Important clarification: Neither here nor in the twitter post did I advocate appeasement or giving in to blackmail. In the Venn diagram of possible actions, there's certainly a non-empty intersection of "de-escalation" and "appeasement", but they're not the same set, and there are de-escalation strategies that don't involve appeasement but might nonetheless reduce nuclear war risk. I'm curious: do you agree that halting (and condemning) the following strategies can reduce escalation and help cool things down without giving in to blackmail?

  1. nuclear threats
  2. at
... (read more)
3Tomasz Darmetko
All 1. to 7. have been condemned by some or all of the Western countries in multiple forms and in multiple forums. Strong words unsupported by actions will not change the situation. To be more precise, I think there is ~0% chance that condemnation from Western countries would reduce my prediction of a 10% chance that Russia may use nuclear weapons to 5% or less. This excludes all situations where weapons supplies to Ukraine are significantly limited. (I'm ranked 18th on Metaculus and I really mean that ~0%.) This also follows from your model, where "David winning" is a first step towards nuclear use. According to that model we would need to reduce Ukraine's chances of winning in order to reduce the chances of nuclear use. Condemnations do not affect Ukraine's chances of winning. Western weapons supplies do. The crushing vote against Russia in the UN General Assembly on resolution A/ES-11/L.1 "Aggression against Ukraine" did not change anything. The only countries opposed to that resolution were the Russian Federation, Belarus, the Democratic People's Republic of Korea, the Syrian Arab Republic and Eritrea. In fact, recent questions and very weak condemnation from India and China were followed by escalation from Russia: Russia annexed the southern and eastern territories of Ukraine two weeks later.

The more items on the list of nuclear near-misses, the more convinced you should be that de-escalation works, no matter how close we get to nuclear war.

That's an interesting argument, but it ignores the selection effect of survivor bias. If you play Russian roulette many times and survive, that doesn't mean that the risk you took was small. Similarly, if you go with the Xia et al estimate that nuclear winter kills 99% of Americans and Europeans, the fact that we find ourselves in that demographic in 2022 doesn't mean that the past risks we took w... (read more)
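To make the Russian roulette analogy concrete (a worked example, assuming one bullet in six chambers and independent spins), the probability of surviving $n$ pulls is

$$P(\text{survive } n \text{ rounds}) = (5/6)^n, \qquad (5/6)^{10} \approx 0.16,$$

so surviving ten rounds is entirely compatible with a full 1-in-6 risk on every pull; survival by itself is weak evidence that the per-round risk was small.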

1Henrik Karlsson
Where is the 99% coming from? I can't see it in the paper.
3DirectedEvolution
If we live to 2080 and, in that time, double the total number of nuclear near-misses, would you feel like that was evidence that baseline nuclear risk in any single incident is on average higher or lower than you currently think?

Algon, please provide references to peer-reviewed journals supporting your claims that smoke predictions are overblown, etc. Since there's a steady stream of peer-reviewed papers quantifying nuclear winter in serious science journals, I find myself unconvinced by criticism that appears only on blogs and without the detailed data, GitHub code, etc. that tend to accompany peer-reviewed research. Thanks!

3Algon
Uh, the blog posts I linked to do reference peer-reviewed articles which criticize soot production models or the evidence for them[1]. Here is a post on the EA Forum that does provide a model and the data used to generate it, along with plenty of references, and that has incorporated the critiques of one researcher who studies nuclear winter. The article's conclusion is in the same direction as the blog posts I linked to. 1. ^ I was just presenting the gist of it for people who don't like clicking on links. 

The Reisner et al paper (and the back and forth between Robock's group and Reisner's group) casts doubt on this:

https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2017JD027331?fbclid=IwAR0SlQ_naiKY5k27PL0XlY-3jsocG3lomUXGf3J1g8GunDV8DPNd7birz1w

Ege, if you find the framework helpful, I'd love to hear your estimates for the factor probabilities 30%, 70%, 80%. I'd also be very interested in seeing alternative endpoint classifications and alternative frameworks. I sense that we both agree that it's valuable to estimate the nuclear war risk, and to base the estimate on a model that decomposes into pieces that can be debated separately, rather than just gazing into our belly-buttons and tossing out a single probability that feels right.

8Ege Erdil
I'd probably estimate the three factors at ~10%, ~50% and ~10% respectively, so my probability of all-out nuclear war between Russia and the US is like ~0.5%. Overall, I think I still roughly endorse my reasoning in the following Metaculus comment I wrote in early March: The military situation changing substantially means (1) is now more likely than I had thought in March, so maybe I would now update it to something closer to 5%, but even in this situation I can't really endorse a risk of all-out nuclear war that's significantly greater than 1%.
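For concreteness (assuming the three factors chain as conditional probabilities and multiply, which is how the ~0.5% figure above appears to be obtained), the two sets of estimates work out to

$$0.30 \times 0.70 \times 0.80 \approx 0.17 \qquad \text{vs.} \qquad 0.10 \times 0.50 \times 0.10 = 0.005 = 0.5\%,$$

so the disagreement is driven almost entirely by the individual factor estimates rather than by the structure of the framework.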

Russia also wanted the withdrawal of US troops from the Baltic states, which is also a nonstarter. 

Yeah, that was clearly a non-starter, and perhaps a deliberate one they could drop later to save face and claim they'd won a compromise. My point was simply that since the West didn't even offer a promise not to let Ukraine into NATO, I don't think they'd ever agree to a "Kosovo". 

4ChristianKl
Huge concessions are usually not publicly announced, so it's hard to know what was actually on offer. The West would easily agree to giving Russian speakers in those Ukrainian regions where the majority identifies as primarily Russian-speaking the kind of minority rights that the French-speaking population of Quebec has. On the other hand, Ukraine does not want to give its Russian speakers those kinds of rights. I don't believe that Ukraine would have the votes in parliament to change the Ukrainian constitution in the necessary way, as it seems unwilling even to give Russian speakers the rights for which the EU asks as a precondition to joining the EU. The population of Russia believed that the minority rights of the Russian populations in Ukraine were being violated and that the Russian government needed to act on that front. Getting a concession on NATO but not on minority rights just wouldn't have been enough domestically. 

Thanks David and Ege for these excellent points! You're giving me too much credit by calling it a "thesis"; it was simply part of my reasoning behind the 30% number. Yeah, I did consider the Gulf War as an important counterexample. I'll definitely consider revising my 30% number downward in my next update, but there are also interesting examples on the other side:

  • The Falklands War: The Argentinian military junta's 1982 invasion of the British Falkland Islands was humiliatingly defeated. This became the final nail in the coffin for a dictatorship facing a c
... (read more)
2Douglas_Knight
Another example of a dictator driven from power by losing a war is the Greek Junta. They instigated a coup in Cyprus, triggering an invasion by Turkey, and then lost power at home. But Bruce Bueno de Mesquita claims that dictators are much better at cutting their losses and surviving, whereas democracies double down and escalate to total war.

Responding to your examples:

  • I agree with the Falklands War being a good example of your thesis; I forgot about it while making my list. No arguments there.

  • I did consider the Yom Kippur War, but I noticed as you did that the national leaders didn't lose power and it was not clear to me whether we should say the Arab forces were defeated in the war. It seems like Egypt achieved at least some limited objectives as a result of the war, even if it fell far short of what they might have wanted to achieve. So I'm not sure if we should consider this as a "su

... (read more)

Thanks Wei for these interesting comments. Whether humans can "solve" ontological crises clearly depends on one's definition of "solve". Although there's arguably a clear best solution for de Blanc's corridor example, it's far from clear that there is any behavior that deserves being called a "solution" if the ontological update causes the entire worldview of the rational agent to crumble, revealing the goal to have been fundamentally confused and undefined beyond repair. That's what I was getting at with my souls example.

As to what Nick's views are, I plan to ask him about this when I see him tomorrow.

Thanks Eliezer for your encouraging words and for all these interesting comments! I agree with your points, and we clearly agree on the bottom line as well: 1) Building FAI is hard and we’re far from there yet. Sorting out “final goal” issues is part of the challenge. 2) It’s therefore important to further research these questions now, before it’s too late. :-)

This should be awesome, except for the 2-minute introduction that will be given by this annoying Swedish guy (me). Be there or be square! ;-)