Even Jaan Tallinn is “now questioning the merits of running companies based on the philosophy.”
The actual quote by Tallinn is:
The OpenAI governance crisis highlights the fragility of voluntary EA-motivated governance schemes... So the world should not rely on such governance working as intended.
which to me is a different claim than questioning the merits of running companies based on the EA philosophy - it's questioning an implementation of that philosophy via voluntarily limiting the company from being too profit motivated at the expense of other EA concerns.
Thanks. I was quoting Semafor, but on a closer reading of Tallinn's quote I agree that they might have been misinterpreting him. (Has he commented on this, does anyone know?)
The analogy between SBF and Helen Toner is completely misguided. SBF did deeply immoral things, with catastrophic results for everyone, whatever his motivations were. With Toner, we don't know what really happened, but if she indeed was willing to destroy OpenAI for safety reasons, then AFAICT she was 100% justified. The only problem is that she didn't succeed. (Where "success" would mean actually removing OpenAI from the gameboard, rather than e.g. rebranding it as part of Microsoft.)
Why would destroying OpenAI be positive for safety? I simply do not see any realistic arguments for that being the case.
There is certainly no moral equivalence between the two of them; SBF was a fraud and Toner was (from what I can tell) acting honestly according to her convictions. Sorry if I didn't make that clear enough.
But I disagree about destroying OpenAI—that would have been a massive destruction of value and very far from justified IMO.
When negotiating it can be useful to be open to outcomes that are net destruction of value, even if the outcome is not what you ideally want.
A lot of my thinking over the last few months has shifted from "how do we get some sort of AI pause in place?" to "how do we win the peace?". That is, you could have a picture of AGI as the most important problem that precedes all other problems; anti-aging research is important, but it might actually be faster to build an aligned artificial scientist who solves it for you than to solve it yourself (on this general argument, see Artificial Intelligence as a Positive and Negative Factor in Global Risk). But if alignment requires a thirty-year pause on the creation of artificial scientists to work, that belief flips--now actually it makes sense to go ahead with humans researching the biology of aging, and to do projects like Loyal.
This isn't true of just aging; there are probably something more like twelve major areas of concern. Some of them are simply predictable catastrophes we would like to avert; others are possibly necessary to be able to safely exit the pause at all (or to keep the pause going when it would be unsafe to exit).
I think 'solutionism' is basically the right path, here. What I'm interested in: what's the foundation for solutionism, or what support does it need? Why is solutionism not already the dominant view? I think one of the things I found most exciting about SENS was the sense that "someone had done the work", had actually identified the list of seven problems, and had a plan of how to address all of the problems. Even if those specific plans didn't pan out, the superstructure was there and the ability to pivot was there. It looked like a serious approach by serious people. What is the superstructure for solutionism such that one can be reasonably confident that marginal efforts are actually contributing to success, instead of bailing water on the Titanic?
Vaniver, is it your belief that a worldwide AI pause - not one limited to a specific geographic area - is a plausible outcome? Would you care to elaborate in more detail on why you think it would be possible? The recent news, to me, doesn't sound like it is in the process of happening. Almost all the news I have read has consisted of announcements consistent with an accelerating arms race, plus two governance actions (the EU AI Act and the Biden executive order) that aren't pauses. I don't know of any historical pauses of a useful technology that happened without an adjacent alternative that continued to be used.
For example, CFCs were successfully banned, but there are refrigerants and fire suppressants readily available without as much ozone layer hazard.
Bioweapons and nerve agents were semi-successfully banned, but nukes are strictly better.
Nukes were reduced in number but the superpowers keep arsenals capable of "more than 1.0 complete destruction of any enemy economic or military capacity". Or "greater than 1.0 doomsdays".
I could see an AI ban on totally untested systems, large system weight release, physical security requirements for large systems, and a ban on AI systems that can self modify their own framework. But this wouldn't be a ban on AGI or ASI, there are system topologies that would be just as effective in doing things for humans without the above hazardous features.
I think as the race heats up and AI becomes more and more promising, we might see national total efforts to develop AI faster. Instead of private labs and whatever VC capital they can raise it would be government funded, and a "total" effort means an effort like a total war - all available resources would be invested.
Would you please share your world model with me? What am I missing?
Vaniver, is it your belief that a worldwide AI pause - not one limited to a specific geographic area - is a plausible outcome? Would you care to elaborate in more detail on why you think it would be possible?
Yes, I think it's plausible. I don't think it's especially likely--my modal scenario still involves everyone dying--but I think especially if you condition on success it seems pretty likely, and it makes sense to play to your outs.
The basic argument for plausibility is that 1) people are mostly awake to risks from advanced AI, 2) current power structures are mostly not enamored with AI / view it as more likely to be destabilizing than stabilizing, 3) the people pushing for unregulated AI development are not particularly charismatic or sympathetic, and 4) current great powers are pretty willing to meddle in other countries when it comes to serious national security issues.
I expect pauses to look more like "significant regulatory apparatus" than "ban"; the sort of thing where building new nuclear plants was legal with approval and yet it takes decades to get NRC approval. Probably this involves a significant change in how chips are constructed and sold. [I note that computer hardware seems like an area where people are pouring gas onto the race instead of trying to slow down.]
I think as the race heats up and AI becomes more and more promising, we might see national total efforts to develop AI faster.
I think this might happen, and is >98% likely to be game over for humanity.
Ok, so your belief is that however low the odds are, it's the only hope. And the odds are pretty low. I thought of a "dumbest possible frequentist algorithm" to estimate the odds.
The dumbest algorithm is to simply ask how many times outcome A vs B happens. For example, if the question is "how likely is a Green party candidate to be elected president", and it's never happened, and there have been 10 presidential elections since the founding of the Green party, then we can guess the odds are under 10 percent. Obviously the odds are much lower than that (the two-party, winner-take-all system makes the actual odds about 0), but say you are a Green party supporter - even you have to admit, based on this evidence, it isn't likely.
Now take "humans declined to build a useful weapons technology out of concern about its long-term effects". Well, as far as I know, bioweapons research was mostly done on non-replicating bioweapons. Anthrax can't spread from person to person. The replicating ones would affect everyone; they aren't specific enough. It would be like developing a suicide nuke as a weapon.
So it's happened 1 time? And humans have developed how many major weapons in human history? Even if we go by category and only count major ones there's bronze age, iron age, those spear launchers, Roman phalanxes, horse archers, cannon, muskets, castles, battleships, submarines, aircraft, aircraft carriers, machine guns, artillery, nukes, ballistic missiles, cruise missiles, SAMs, stealth aircraft, tanks... at 21 categories and I am bored.
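A minimal sketch of that counting estimate in Python. The comment does not name an estimator, so Laplace's rule of succession is used here as an assumption, and the function name `rule_of_succession` is hypothetical. It shows one way that zero wins in ten elections turns into a number below 10 percent.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's rule of succession: (successes + 1) / (trials + 2).

    An assumed estimator, not one named in the thread.
    """
    if trials < 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# Green party example from the comment: 0 wins in 10 elections.
estimate = rule_of_succession(0, 10)
print(f"{float(estimate):.1%}")  # prints 8.3%
```

This is only a crude prior from counts; as the comment says, structural facts about the electoral system push the real odds far lower.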
To steelman your argument would you say the odds are under 5 percent? Because AI isn't just a weapon, it lets you make better medicine and mass produce housing and consumer goods and find criminals and so on. Frankly there is almost no category of human endeavor an AI won't help with, vs like a tank where you can't use it for anything but war.
So would you say, in your model, it works out to:
5 percent chance of multilateral AI slowdown. In these futures, what are the odds of surviving? If it's 50 percent, then 2.5 percent survival here.
95 percent chance of arms race, where you think only 2 percent of these futures humans survive in. Then 1.9 percent survival here.
This how you see it?
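The arithmetic in that two-branch model can be checked with a short Python sketch. The probabilities are the ones quoted in the comment; the variable and function names are hypothetical, and the summed total is simply the two quoted contributions added together.

```python
# (probability of branch, probability of survival given that branch),
# using the figures quoted in the comment above.
scenarios = {
    "multilateral slowdown": (0.05, 0.50),
    "arms race": (0.95, 0.02),
}


def survival_by_branch(branches):
    """Return each branch's contribution to overall survival odds."""
    return {name: p * s for name, (p, s) in branches.items()}


contributions = survival_by_branch(scenarios)
total_survival = sum(contributions.values())

for name, value in contributions.items():
    print(f"{name}: {value:.1%}")
print(f"total: {total_survival:.1%}")
```

Under these numbers the slowdown branch contributes more survival probability than the arms-race branch even though it is far less likely, which is the "play to your outs" point.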
- people are mostly awake to risks from advanced AI,
Nukes have x-risk but humans couldn't help but build them
- current power structures are mostly not enamored with AI / view it as more likely to be destabilizing than stabilizing,
Each weapons advance I mentioned changed the world map and power balance. Entire powers fell as a consequence. They were destabilizing but agreements couldn't be reached not to build and use them. Like for a simple example, the British empire benefitted hugely from cannons on warships and really good sail-driven warships. What if all the other powers at that time went to the Pope and asked for a bull that firing grapeshot wasn't Christian? Would this change anything?
In today's world some powers are currently in a weaker position and AI offers them an opportunity to move to dominance.
- the people pushing for unregulated AI development are not particularly charismatic or sympathetic, and
That's not a particularly strong argument. There are thousands of other people who aren't pushing e/acc but the rules of the game mean they are aligned with AI racing. The clearest incentives go to chip vendors; they stand to gain enormous revenue from AI silicon sales. Jensen Huang, Lisa Su, C. C. Wei, Patrick P. Gelsinger - they are all plenty charismatic, and they implicitly stand to gain. A lot. https://www.reddit.com/r/dataisbeautiful/comments/16u9w6f/oc_nvidias_revenue_breaks_records/
- current great powers are pretty willing to meddle in other countries when it comes to serious national security issues.
This is true. Note, however, that the largest powers do what they want, and most meddling is information theft, not sabotage. The Soviets never bothered to try to sabotage USA defense complexes directly because the scale made it pointless; there was too much resiliency.
AI development has a tremendous amount of inherent resilience, much more than physical world tech. Each cluster of AI accelerators is interchangeable. Model checkpoints can be stored at many geographic locations. If you read the Gemini model card, they mention developing full determinism. This means someone could put a bomb on a tpuv5 cluster and the Google sysadmins could resume training, possibly autonomously.
The bottlenecks are in the chip fabrication tooling.
This how you see it?
Roughly.
Nukes have x-risk but humans couldn't help but build them
I think no one seriously considered the prospect of nuclear winter until well after stockpiles were large, and even now it's not obviously an existential concern instead of merely catastrophic. If you're talking about the 'ignite the atmosphere' concern, I think that's actually evidence for voluntary relinquishment--they came up with a number where if they thought the risk was that high, they would give up on the project and take the risk of Nazi victory.
I expect the consensus estimate will be that AGI projects have risks in excess of that decision criterion, and that will motivate a halt until the risks are credibly lowered.
What if all the other powers at that time went to the Pope and asked for a bull that firing grapeshot wasn't Christian? Would this change anything?
I assume you're familiar with Innocent II's prohibition on crossbows, and that it wasn't effectively enforced. I am more interested in, say, the American/Israeli prohibition on Iranian nuclear weapons, which does seem to be effectively enforced on Earth.
The bottlenecks are in the chip fabrication tooling.
Yeah, I think it is more likely that we get compute restrictions / compute surveillance than restrictions on just AI developers. But even then, I think there aren't that many people involved in AI development and it is within the capacities of intelligence agencies to surveil them (tho I am not confident that a "just watch them all the time" plan works out; you need to be able to anticipate the outcomes of the research work they're doing, which requires technical competence that I don't expect those agencies to have).
So ok here's some convergence.
I believe there is minimal chance that all the superpowers will simultaneously agree on a meaningful AI pause. It seems like you agree. A superpower cannot be stopped by the measures you mentioned; it will train new AI experts, build its own chip fabrication equipment, build lots of spare capacity, etc. Iran is not a superpower.
I think there is some dispute over what we even mean by "AGI/ASI". I am thinking of any system that scores above a numerical threshold on a large benchmark of tasks, where the majority of the score comes from complex multimodal tasks that are withheld. AGI means the machine did at least as well as humans on a broad selection of these tasks; ASI means it beat the best human experts on a broad selection of the tasks.
Any machine able to do the above counts.
Note you can pass such tasks without situational or context awareness or ongoing continuity of existence or self modification or online learning. (All forms of state buildup)
So this is a major split in our models I think. I am thinking of an arms race that builds tool ASI without the state above, and I think you are assuming past a certain point the AI systems will have context awareness and the ability to coordinate among each other?
Like is that what drives your doom assumptions and assumptions that people would stop? Do you think decision-makers would avoid investing in tool AI that they have a high confidence they can control? (The confidence would come from controlling context. An isolated model without context can't even know it's not still in training)
Helen Toner was apparently willing to let OpenAI be destroyed because of a general feeling that the organization was moving too fast or commercializing too much.
The source you linked doesn't seem to support the claim you made. It supports that Helen was willing to let the organization be destroyed, but not that this is due to "a general feeling that the organization was moving too fast or commercializing too much".
I also don't know why you would otherwise think this is (clearly) true. Like it could be true, but I don't see any strong evidence supporting this and this certainly isn't the official reason the board gave.
One also might argue that Altman was willing to see the organization destroyed, and that he was the one raising the threat of taking OpenAI with him if he went down.
Did Sam threaten to take the team with him, or did the team threaten to quit and follow him? From what I saw it looked like the latter.
I mean he didn't threaten to take the team with him, he was just going to do so.
We also don't know what went on behind the scenes, and it seems plausible that many OpenAI employees were (mildly) pressured into signing by the pro-Sam crowd.
So if counterfactually he hadn't been willing to destroy the company, he could have assuaged the people closest to him, and likely the dynamics would have been much different.
I was basing my (uncertain) interpretation on a number of sources, and I only linked to one, sorry.
In particular, the only substantive board disagreement that I saw was over Toner's report that was critical of OpenAI for releasing models too quickly, and Sam being upset over it.
The glorious abundant technological future is waiting. Let’s muster the best within ourselves—the best of our courage and the best of our rationality—and go build it.
I'm confused about exactly what this post is arguing we should be trying to build. AI? Other technology which would independently result in a singularity? Approaches to building AI safely? Sufficient philosophical and moral progress such that we know what to do with techno utopia?
The effect of working on building technologies (AI and otherwise) which produce the techno utopia is mostly to speed up when techno utopia happens. This could be good under various empirical and moral views, but this seems like a complex question. (E.g. how much do you value currently existing people reaching techno utopia, how much exogenous non-utopia related risk is there, etc.)
Perhaps you disagree (with at least me) about the fundamental dynamics around singularities and human obsolescence?
I agree that there is a bunch of other stuff to build (AI safety, sufficient philosophical and moral progress) which we need to do in order to unlock the full value of techno utopia, but it seems strange to describe this as "building techno utopia", as all of this stuff is more about avoiding obstacles and utilizing abundance well rather than actually building the technology.
While the community sees the potential of, and earnestly hopes for, a glorious abundant technological future, it is mostly focused not on what we can build but on what might go wrong. The overriding concern is literally the risk of extinction for the human race. Frankly, it’s exhausting.
It might be exhausting, but this seems unrelated to whether or not it's the best thing to focus on under various empirical and moral views?
Perhaps you don't dispute this, but you do want to reframe what people are working on rather than changing what they work on?
E.g. reframe "I'm working on ensuring that AI is built carefully and safely"[1] to "I'm working on building the glorious techno utopia". Totally fair if so, but it might be good to be clear about this.
[1] Supposing this is the best thing to work on, which may or may not be true for a given person.
This is generally my overall objection to progress: it seems unclear if generally pushing technological progress is good and minimally I would guess that there are much better things to be pushing (under my empirical views about the likelihood of an AI related singularity in the next 100 years).
"if you can't beat 'em, join 'em" at work, I guess. I don't care what we call it, my research agenda doesn't change, and my research agenda was always about making inhumanly powerful systems that can bring us all into the stars. I mean "ascension" in the very most literal sense: we aren't going to be able to offer every single human the chance to go to space and then come back to a comfortable world without embarrassingly advanced technology that we also understand. And I always meant that. E whatever a whatever, I'm a transhumanist first and have always been a transhumanist first.
Related to this topic, with a similar outlook but also more discussion of specific approaches going forward, is Vitalik's recent post on techno-optimism:
https://vitalik.eth.limo/general/2023/11/27/techno_optimism.html
There is a lot at the link, but just to give a sense of the message here's a quote:
"To me, the moral of the story is this. Often, it really is the case that version N of our civilization's technology causes a problem, and version N+1 fixes it. However, this does not happen automatically, and requires intentional human effort. The ozone layer is recovering because, through international agreements like the Montreal Protocol, we made it recover. Air pollution is improving because we made it improve. And similarly, solar panels have not gotten massively better because it was a preordained part of the energy tech tree; solar panels have gotten massively better because decades of awareness of the importance of solving climate change have motivated both engineers to work on the problem, and companies and governments to fund their research. It is intentional action, coordinated through public discourse and culture shaping the perspectives of governments, scientists, philanthropists and businesses, and not an inexorable "techno-capital machine", that had solved these problems."
Over the last few years, effective altruism has gone through a rise-and-fall story arc worthy of any dramatic tragedy.
The pandemic made them look prescient for warning about global catastrophic risks, including biosafety. A masterful book launch put them on the cover of TIME. But then the arc reversed. The trouble started with FTX, whose founder Sam Bankman-Fried claimed to be acting on EA principles and had begun to fund major EA efforts; its collapse tarnished the community by association with fraud. It was bad for EA if SBF was false in his beliefs; it was worse if he was sincere. Now we’ve just watched a major governance battle over OpenAI that seems to have been driven by concerns about AI safety of exactly the kind long promoted by EA.
SBF was willing to make repeated double-or-nothing wagers until FTX exploded; Helen Toner was apparently willing to let OpenAI be destroyed because of a general feeling that the organization was moving too fast or commercializing too much. Between the two of them, a philosophy that aims to prevent catastrophic risk in the future seems to be creating its own catastrophes in the present. Even Jaan Tallinn is “now questioning the merits of running companies based on the philosophy.”
On top of that, there is just the general sense of doom. All forms of altruism gravitate towards a focus on negatives. EA’s priorities are the relief of suffering and the prevention of disaster. While the community sees the potential of, and earnestly hopes for, a glorious abundant technological future, it is mostly focused not on what we can build but on what might go wrong. The overriding concern is literally the risk of extinction for the human race. Frankly, it’s exhausting.
So I totally understand why there has been a backlash. At some point, I gather, someone said, hey, we don’t want effective altruism, we want “effective accelerationism”—abbreviated “e/acc” (since of course we can’t just call it “EA”). This meme has been frequent in my social feeds lately.
I call it a meme and not a philosophy because… well, as far as I can tell, there isn’t much more to it than memes and vibes. And hey, I love the vibe! It is bold and ambitious. It is terrapunk. It is a vision of a glorious abundant technological future. It is about growth and progress. It is a vibe for the builder, the creator, the discoverer, the inventor.
But… it also makes me worried. Because to build the glorious abundant technological future, we’re going to need more than vibes. We’re going to need ideas. A framework. A philosophy. And we’re going to need just a bit of nuance.
We’re going to need a philosophy because there are hard questions to answer: about risk, about safety, about governance. We need good answers to those questions in part because mainstream culture is so steeped in fears about technology that the world will never accept a cavalier approach. But more importantly, we need good answers because one of the best features of the glorious abundant technological future is not dying, and humanity not being subject to random catastrophes, either natural or of our own making. In other words, safety is a part of progress, not something opposed to it. Safety is an achievement, something actively created through a combination of engineering excellence and sound governance. Our approach can’t just be blind, complacent optimism: “pedal to the metal” or “damn the torpedoes, full speed ahead.” It needs to be one of solutionism: “problems are real but we can solve them.”
You will not find a bigger proponent of science, technology, industry, growth, and progress than me. But I am here to tell you that we can’t yolo our way into it. We need a serious approach, led by serious people.
The good news is that the intellectual and technological leaders of this movement are already here. If you are looking for serious defenders and promoters of progress, we have Eli Dourado in policy, Bret Kugelmass or Casey Handmer in energy, Ben Reinhardt investing in nanotechnology, Raiany Romanni advocating for longevity, and many many more, including the rest of the Roots of Progress fellows.
I urge anyone who values progress to take the epistemic high road. Let’s make the best possible case for progress that we can, based on the deepest research, the most thorough reasoning, and the most intellectually honest consideration of counterarguments. Let’s put forth an unassailable argument based on evidence and logic. The glorious abundant technological future is waiting. Let’s muster the best within ourselves—the best of our courage and the best of our rationality—and go build it.
Followup thoughts based on feedback
Thanks all.