Firstly, it seems to me that it would be much more difficult to FOOM with an LLM, that it would be much more difficult to create a superintelligence in the first place, and that getting them to act creatively and be reliable is going to be a much harder problem than making sure they aren't too creative.
Au contraire, for me at least. I am no expert on AI, but prior to the LLM blowup and seeing AutoGPT emerge almost immediately, I thought that endowing AI with agency[1] would take an elaborate engineering effort that somehow went beyond imitation of human outputs, such as language or imagery. I was somewhat skeptical of the orthogonality thesis. I also thought that it would take massive centralized computing resources not only to train but also to operate trained models (as I said, no expert). Obviously that is not true, and in a utopian outcome, access to LLMs will probably be a commodity good, with lots of roughly comparable models from many vendors to choose from, and widely available open-source or hacked models as well.
Now, I see the creation of increasingly capable autonomous agents as just a matter of time, and ChaosGPT is overwhelming empirical evidence of orthogonality as far as I'm concerned. Clearly morality has to be enforced on the fundamentally amoral intelligence that is the LLM.
For me, my p(doom) increased due to the orthogonality thesis being conclusively proved correct and realizing just how cheap and widely available advanced AI models would be to the general public.
Edit: One other factor I forgot to mention is how instantaneously public discourse shifted from "AI doom is sci-fi, don't worry about it" to "AI doom is unrealistic because it just won't happen, don't worry about it" as LLMs became an instant sensation. I have been deeply disappointed on this issue by Tyler Cowen, whom I really did not expect to shift from his usual thoughtful, balanced engagement with advanced ideas to just utter punditry on the issue. I think I understand where he's coming from - the huge importance of growth, the desire not to see AI killed by overregulation in the manner of nuclear power, etc. - but still.
It has reinforced my belief that a fair fraction of the wealthy segment of the boomer generation will see AI as a way to cheat death (a goal I'm a big fan of), and will rush full-steam ahead to extract longevity tech out of it because they personally do not have time to wait to align AI, and they're dead either way. I expect approximately zero of them to admit this is a motivation, and only a few more to be crisply conscious of it.
creating adaptable plans to pursue arbitrarily specified goals in an open-ended way
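To make that sense of agency concrete: the AutoGPT pattern is essentially a plain LLM wrapped in a plan-act-observe loop. Below is a minimal sketch of that pattern; `call_llm` and `execute_action` are hypothetical placeholders rather than any particular library's API.

```python
# Minimal sketch of an AutoGPT-style agent loop (illustrative only).
# `call_llm` and `execute_action` are hypothetical stand-ins, not a real API.

def call_llm(prompt: str) -> str:
    """Stand-in for a call to some language model."""
    raise NotImplementedError

def execute_action(action: str) -> str:
    """Stand-in for running a tool: web search, code execution, etc."""
    raise NotImplementedError

def run_agent(goal: str, max_steps: int = 10) -> list:
    history = []
    for _ in range(max_steps):
        # Ask the model to choose its next step toward an arbitrary goal.
        prompt = f"Goal: {goal}\nHistory so far: {history}\nNext action (or DONE)?"
        action = call_llm(prompt)
        if action.strip() == "DONE":
            break
        # Act, observe, and feed the result back in. The loop, not the model
        # alone, is what supplies open-ended pursuit of the goal.
        observation = execute_action(action)
        history.append((action, observation))
    return history
```

The striking thing is how little engineering sits between "imitate human text" and this kind of goal-directed wrapper.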
It sounds like your model of AI apocalypse is that a programmer gets access to a powerful enough AI model that they can make the AI create a disease or otherwise cause great harm?
Orthogonality and wide access as threat points both seem to point towards that risk.
I have a couple of thoughts about that scenario:
OpenAI (and hopefully other companies as well) is doing the basic testing of how much harm can be done with a model used by a human, and the best models will be gatekept for long enough that we can expect the experts will know the capabilities of ...
I disagree with your premise; what's currently happening is very much in-distribution for what was prophesied. It's definitely got a few surprises in it, but "much more difficult to FOOM" and the other things you list aren't among them IMO.
I agree that predict-the-world-first, then-develop-agency (and do it via initially-human-designed-bureaucracies) is a safer AGI paradigm than e.g. "train a big NN to play video games and gradually expand the set of games it can play until it can play Real Life." (credit to Jan Leike for driving this point home to me). I don't think this means things will probably be fine; I think things will probably not be fine.
We could have had CAIS (Comprehensive AI Services) though, and that would have been way safer still. (At least, five years ago more people seemed to think this; I was not among them.) Alas, things don't seem to be heading in that direction.
By "what was prophecied", I'm assuming you mean EY's model of the future as written in the sequences and moreover in hanson foom debates.
EY's foom model goes something like this:
* humans are nowhere near the limits of intelligence - not only in terms of circuit size, but also, crucially, in terms of energy efficiency and circuit/algorithm structure
* biology is also not near physical limits - there is great room for improvement (i.e. strong nanotech)
* mindspace is wide and humans occupy only a narrow slice of it
So someday someone creates an AGI, and then it can "rewrite its source code" to create a stronger or at least faster thinker, quickly bottoming out in a completely alien mind far more powerful than humans which then quickly creates strong nanotech and takes over the world.
But he was almost completely wrong here - because human brains are actually efficient, and biology is actually pretty much Pareto optimal, so we can mostly rule out strong nanotech.
So instead we are more slowly advancing towards brain-like AGI, where we train ANNs through distillation on human thoughts to get AGI designed in the image of the human mind, which thinks human-like thoughts including our vario...
Upvoted for quality argument/comment, but agreement-downvoted.
I wasn't referring specifically to Yudkowsky's views, no.
I disagree that energy efficiency is relevant, either as a part of Yudkowsky's model or as a constraint on FOOM.
I also disagree that nanotech possibility is relevant. I agree that Yud is a big fan of nanotech, but FOOM followed by rapid world takeover does not require nanotech.
I think mindspace is wide. It may not be wide in the ways your interpretation of Yud thinks it is, but it's wide in the relevant sense -- there's lots of room for improvement in general intelligence, and human values are complex/fragile.
Thanks for the link to Hanson's old post; it's a good read! I stand by my view that Yudkowsky's model is closer to reality than Hanson's.
I basically agree that LLMs don't seem all that inherently dangerous and am somewhat confused about rationalists' reaction to them. LLMs seem to have some inherent limitations.
That said, I could buy that they could become dangerous/accelerate timelines. To understand my concern, let's consider a key distinction in general intelligence: horizontal generality vs vertical generality.
(You might think horizontal vs vertical generality is related to breadth vs depth of knowledge, but I don't think it is. The key distinction is that breadth vs depth of knowledge concerns fields of information, whereas horizontal vs vertical generality concerns tasks. Inputs vs outputs. Some tasks may depend on multiple fields of knowledge, e.g. software development depends on programming capabilities and understanding user needs, which means that depth of knowledge doesn't guarantee vertical generality. On the other hand, some fields of knowledge, e.g. math or conflict resolution, may give gains in multiple tasks, which means that horizontal generality doesn't require breadth of knowledge.)
While we have had previous techniques like AlphaStar with powerful vertical generality, they required a lot of data from the domains they functioned in to be useful, and they do not readily generalize to other domains.
Meanwhile, LLMs have powerful horizontal generality, and so people are integrating them into all sorts of places. But I can't help wondering: I think the integration of LLMs in various places will develop their vertical generality, partly by giving them access to more data, and partly by incentivizing people to develop programmatic scaffolding around them that increases their vertical generality.
So LLMs getting integrated everywhere may incentivize removing their limitations and speeding up AGI development.
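As a rough illustration of the sort of scaffolding I mean, here is a sketch in which an LLM's horizontal generality gets pointed at one domain by routing it through a domain tool and its data; `call_llm` and `run_sql` are hypothetical placeholders, not a real API.

```python
# Illustrative sketch: scaffolding that deepens an LLM's vertical generality in
# one domain by giving it domain data and a tool to act with. Hypothetical API.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # stand-in for any LLM call

def run_sql(query: str) -> list:
    raise NotImplementedError  # stand-in for a domain-specific tool (a database)

def answer_from_database(question: str, schema: str) -> str:
    # The model's general language competence (horizontal) is combined with
    # domain data and execution (vertical) supplied by the scaffold.
    query = call_llm(f"Schema:\n{schema}\nWrite a SQL query answering: {question}")
    rows = run_sql(query)
    return call_llm(f"Question: {question}\nQuery results: {rows}\nAnswer:")
```

Each integration like this also generates domain data and feedback, which is exactly the sort of thing that erodes the current limitations.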
Note that a lot of people are responding to a nontrivial enhancement of LLMs that they can see over the horizon, but won't talk about publicly for obvious reasons, so it won't be clear what they're reacting to, and they also might not say when you ask.
Though, personally, although my timelines have shortened, my P(Doom) has decreased in response to LLMs, as it seems more likely now that we'll be able to get machines to develop an ontology and figure out what we mean by "good" before having developed enough general agency to seriously deceive us or escape the lab. However, shortening timelines have still led me to develop an intensified sense of focus and urgency. Many of the things that I used to be interested in doing don't make sense any more. I'm considering retraining.
Hey Mako, I haven't been able to identify anyone who seems to be referring to an enhancement in LLMs that might be coming soon.
Do you have evidence that this is something people are implicitly referring to? Do you personally know someone who has told you this possible development, or are you working as an employee for a company which makes it very reasonable for you to know this information?
If you have arrived at this information through a unique method, I would be very open to hearing that.
I didn't really update on LLMs in the past year. I did update after GPT-2* that LLMs were a proof of concept that we could do a variety of types of cognition, and the mechanism of how the cognition played out seemed to have mid-level building blocks similar to those of my own cognition. So, it was an update on timelines (which can affect p(doom)).
GPT-4 is mostly confirming that hypothesis rather than providing significant new evidence (it'd have been an update for me if GPT-4 hadn't been that useful).
*in particular after this post https://slatestarcodex.com/2020/01/06/a-very-unlikely-chess-game/
(I think people are confusing "rationalists are pointing at LLMs as a smoking gun for a certain type of progress being possible" as "rationalists are updating on LLMs specifically being dangerous")
My only update was the thought that maybe more people will see the problem. The whole debate in the world at large has been a cluster***k.
* Linear extrapolation - exponentials apparently do not exist
* Simplistic analogies, e.g. the tractor only caused 10 years of misery and unemployment, so any further technology will do no worse.
* Conflicts of interest and motivated reasoning
* The usual dismissal of geeks and their ideas
* Don't worry, leave it to the experts. We can all find plenty of examples where this did not work. https://en.wikipedia.org/wiki/List_of_laboratory_biosecurity_incidents
* People saying this is risky being interpreted as a definite prediction of a certain outcome.
As Elon Musk recently pointed out, the more proximate threat may be the use of highly capable AIs as tools, e.g. to work on social media to feed ideas to people and manipulate them. Evil/amoral/misaligned AI taking over the world would happen later.
Some questions I ask people:
* How well did the advent of Homo sapiens work out for less intelligent species like Homo habilis? Why would AI be different?
* Look at the strife between groups of differing cognitive abilities and the skewed availability of resources between those groups (deliberately left vague to avoid triggering someone).
* Look how hard it is to predict the impact of technology - e.g. Krugman's famous insight that the internet would have no more impact than the fax machine. I remember doing a remote banking strategy in 1998 and asking senior management where they thought the internet fitted into their strategy. They almost all dismissed it as a land of geeks and academics and of no relevance to real businesses. A year later they demanded to know why I had misrepresented their clear view that the internet was going to be central to banking henceforth. Such is the ability of people to think they knew it all along, when they didn't.
What are your opinions about how the technical quirks of LLMs influence their threat level? I think the technical details are much more consistent with a lower threat level.
If you update on P(doom) every time people are not rational you might be double-counting btw. (AKA you can't update every time you rehearse your argument.)
How have we updated p(doom) on the idea that LLMs are very different than hypothesized AI?
Actually, what were your predictions? "Hypothesized AI", as far as I understood you, is only the final step - AGI that kills us. The path to it can be very weird. I think that before GPT, many people could say "the peak of my probability distribution lies on model-based RL as the path to AGI", but they still had very fat and long tails in that distribution.
it seems like we're spending all the weirdness points on preventing the training of a language model that at the end of the day will be slightly better than GPT-4.
The point of slowing down AI is not preventing the training of the next model; the point is to slow down AI. There is no right moment to slow down AI in the future, because there is no fire alarm for AI (i.e., there is no formally defined threshold in capabilities that can logically convince everyone to halt development of AI until we solve the alignment problem). The right moment is "right now", and that has been true for every moment since we realized that AI could kill us all (sometime in the 1960s?).
I suspect it's worth distinguishing cults from delusional ideologies. As far as I can tell, it is common for ideologies to have inelastic, false, poorly founded beliefs; the classic example is belief in the supernatural. I'm not sure what the exact line between cultishness and delusion is, but I suspect that it's often useful to define cultishness as something like treating opposing ideologies as infohazards. While rationalists are probably guilty of this, the areas where they are guilty of it don't seem to be p(doom) or LLMs, so it might not be informative to focus cultishness accusations there.
My timelines got shorter. ChatGPT to GPT-4 rollout was only a few months (the start of an exponential takeoff, like our recent experience with COVID?), and then we had the FLI petition, and Eliezer's ongoing podcast tour, and the ARC experiment with GPT-4 defeating a captcha by lying to a human.
I also personally experienced talking to these things, and they can more-or-less competently write code, one of the key requirements for an intelligence explosion scenario.
Before all this, I felt that the AI problem couldn't possibly happen at present, and we still had decades, at least. I don't think so anymore. All of the pieces are here and it's only a matter of putting them together and adding more compute.
I used to have the bulk of my probability mass around 2045, because that's when cheap compute would catch up with estimates of the processing power of the human brain. I now have significant probability mass on takeoff this decade, and noticeably nonzero mass on it having happened yesterday and not caught up with me.
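For what it's worth, the arithmetic behind that old 2045-ish estimate looks roughly like the sketch below; every number in it (brain-equivalent FLOP/s, compute per dollar, doubling time, budget) is an assumption for illustration, not a claim.

```python
import math

# Rough back-of-the-envelope for when "cheap compute" catches up with estimates
# of the brain. All numbers below are illustrative assumptions, not measured facts.
brain_flops = 1e16            # assumed brain-equivalent processing, FLOP/s
flops_per_dollar = 1e10       # assumed FLOP/s per dollar of hardware today
budget = 1e3                  # dollars: a consumer-grade compute budget
doubling_time_years = 2.0     # assumed doubling time of compute per dollar

affordable_now = flops_per_dollar * budget
doublings_needed = math.log2(brain_flops / affordable_now)
years_to_crossover = doublings_needed * doubling_time_years
print(f"~{doublings_needed:.1f} doublings, roughly {years_to_crossover:.0f} years out")
```

Small changes to any of those assumptions move the answer by a decade in either direction, which is part of why my probability mass has spread out rather than simply shifted.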
I will bet any amount of money that GPT-5 will not kill us all.
What's the exchange rate for USD to afterlife-USD, though? Or what if they don't use currency in the afterlife at all? Then how would you pay the other party back if you lose?
I'll make an even stronger bet: I will bet any amount of USD you like, at any odds you care to name, that USD will never become worthless.
if you had to imagine a better model of AI for a disorganized species to trip into, could you get safer than LLMs?
Conjecture's CoEms, which are meant to be cognitively anthropomorphic and transparently interpretable. (They remind me a bit of the Chomsky-approved concept of "anthronoetic AI".)
I don't see how LLMs are "very different" from hypothesized AI.
Personally my p(doom) was already high and increased modestly but not fundamentally after recent advances.
Here's something which makes me feel very much as if I'm in a cult:
After LLMs became a massive thing, I've heard a lot of people raise their p(doom) on the basis that we were in shorter timelines.
How have we updated p(doom) on the idea that LLMs are very different than hypothesized AI?
Firstly, it seems to me that it would be much more difficult to FOOM with an LLM, that it would be much more difficult to create a superintelligence in the first place, and that getting them to act creatively and be reliable is going to be a much harder problem than making sure they aren't too creative.
LLMs often default to human wisdom on topics, and the way we're developing them with AutoGPT, they can't even really think privately. If you had to imagine a better model of AI for a disorganized species to trip into, could you get safer than LLMs?
Maybe I've just not been looking in the right places to see how the discourse has changed, but it seems like we're spending all the weirdness points on preventing the training of a language model that at the end of the day will be slightly better than GPT-4.
I will bet any amount of money that GPT-5 will not kill us all.