Computer engineers put a lot of effort into making sure that individual bits are either 0 or 1 and are independent of any microscopic randomness. Brains do not have this. There is the possibility of chaotic - and also non-chaotic - behavior at any scale. There is no non-chaotic mesoscale preventing unpredictable and uncontrollable microscopic fluctuations from influencing relevant macroscopic behavior. …
…Being able to exactly model a human would be sufficient for AGI, but I do not think that a complicated indeterministic system (like humans) can be well modeled by a complicated deterministic system plus a simple random number generator.
We have a difference of perspective here which I’m struggling to articulate (I’m resisting the urge to say “Go read the Sequences!”) Human minds do lots of neat things like invent quantum mechanics. These things don’t just randomly fall out of an inscrutable Rube Goldberg mechanism. Brains do these kinds of things because they run algorithms designed to do these kinds of things. “Chaos” is not a useful, load-bearing ingredient in any algorithm. Like, AI researchers have invented many algorithms in the past, and they do lots of neat things like play superhuman Go and write code. Granted, so far, none of those algorithms do all the things that brains can do, and thus AI researchers continue to brainstorm new and different algorithms. But what these researchers are not saying is “Gee, you know what’s missing? Chaos. We need more chaos in our algorithms. That will get us into NeurIPS for sure.” Or “Gee, we need more complexity in our algorithm.” Right? The chaos or complexity might happen as a side-effect of other things, but they’re not why the algorithm works; they’re not part of the engine that extracts improbable good ideas and correct understanding and effective plans from the complexity of the world.
If it helps, here’s something I wrote a while ago:
Humans can understand how rocket engines work. I just don't see how some impossibly-complicated-Rube-Goldberg-machine of an algorithm can learn rocket engineering. There was no learning rocket engineering in the ancestral environment. There was nothing like learning rocket engineering in the ancestral environment!! Unless, of course, you take the phrase "like learning rocket engineering" to be so incredibly broad that even learning toolmaking, learning botany, learning animal-tracking, or whatever, are "like learning rocket engineering" in the algorithmically-relevant sense. And, yeah, that's totally a good perspective to take! They do have things in common! "Patterns tend to recur." "Things are often composed of other things." "Patterns tend to be localized in time and space." You get the idea. If your learning algorithm does not rely on any domain-specific assumptions beyond things like "patterns tend to recur" and "things are often composed of other things" or whatever, then just how impossibly complicated and intricate can the learning algorithm be, really? I just don't see it.
(Of course, the learned model can be arbitrarily intricate and complicated. I'm talking about the learning algorithm here—that's what is of primary interest for AGI timelines, I would argue.)
Brains do these kinds of things because they run algorithms designed to do these kinds of things.
If by 'algorithm', you mean thing-that-does-a-thing, then I think I agree. If by 'algorithm', you mean thing-that-can-be-implemented-in-python, then I disagree.
Perhaps a good analogy comes from quantum computing.* Shor's algorithm is not implementable on a classical computer. It can be approximated by a classical computer, at very high cost. Qubits are not bits, or combinations of bits. They have different underlying dynamics, which makes quantum computers importantly distinct from classical computers.
The claim is that the brain is also built out of things which are dynamically distinct from bits. 'Chaos' here is being used in the modern technical sense, not in the ancient Greek sense to mean 'formless matter'. Low dimensional chaotic systems can be approximated on a classical computer, although this gets harder as the dimensionality increases. Maybe this grounds out in some simple mesoscopic classical system, which can be easily modeled with bits, but it seems likely to me that it grounds out in a quantum system, which cannot.
* I'm not an expert in quantum computing, so I'm not super confident in this analogy.
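To make "approximated on a classical computer" concrete, here is a minimal Python sketch using the logistic map, a standard one-dimensional chaotic system that is not mentioned above and is chosen purely for illustration. It is easy to step forward with ordinary floating-point arithmetic, but two runs whose starting points differ by an assumed 1e-10 drift apart as the steps accumulate, which is the sense in which simulating such a system is cheap while predicting it far ahead is not. The function name and parameter values are hypothetical.

```python
def logistic_trajectory(x0, r=4.0, steps=80):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(r * x * (1.0 - x))
    return xs

# Two runs whose initial conditions differ by one part in ten billion.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)

# Step-by-step distance between the two runs.
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in (0, 10, 20, 30, 40, 50):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

This only illustrates sensitivity to initial conditions in a toy system; it says nothing either way about whether brains depend on such dynamics.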
Different kinds of computers have different operations that are fast versus slow.
On a CPU, performing 1,000,000 inevitably-serial floating point multiplications is insanely fast, whereas multiplying 10,000×10,000 floating-point matrices is rather slow. On a GPU, it’s the reverse.
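One rough way to see why the two cases differ is to count total work versus the longest chain of dependent steps. The sketch below assumes the naive matrix-multiplication algorithm and an idealized machine with unlimited parallel units; both are simplifying assumptions rather than a description of any real CPU or GPU, and the variable names are made up for illustration.

```python
import math

# Case 1: a chain where each multiplication needs the previous result.
serial_work = 1_000_000          # total multiplications
serial_depth = serial_work       # nothing can run in parallel

# Case 2: multiplying two n x n matrices with the naive algorithm.
n = 10_000
matmul_work = n ** 3             # one multiplication per (i, j, k) triple
# All products are independent; each output entry then sums n products,
# which a tree reduction can do in ceil(log2(n)) rounds of additions.
matmul_depth = 1 + math.ceil(math.log2(n))

print(f"serial chain: work={serial_work:,} depth={serial_depth:,}")
print(f"matmul:       work={matmul_work:,} depth={matmul_depth}")
```

Under these assumptions, the matrix product has a million times more work but a far shorter dependency chain, which is the shape of problem that wide parallel hardware handles well.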
By the same token, there are certain low-level operations that are far faster on quantum computers than classical computers, and vice-versa. With regard to Shor’s algorithm, of course you can compute discrete logs on classical computers; it just takes exponentially longer than with quantum computers (at least with currently-known algorithms), because quantum computers happen to have an affordance for certain fast low-level operations that lead to calculations of the discrete log.
So anyway, it’s coherent to say that:
…But I don’t think there’s any reason to believe that, and it strikes me as very implausible.
Hmm, I guess I get the impression from you of a general lack of curiosity about what’s going on here under the hood. Like, exactly what kinds of algorithmic subproblems might come up if you were building a human-like intelligence from scratch? And exactly what kind of fast low-level affordances are enabled by collections of neurons, that are not emulate-able by the fast low-level affordances of chips? Do we expect those two sets to overlap or not? Those are the kinds of questions that I’m thinking about. Whereas the vibe I’m getting from your writing—and I could be wrong—is “Human intelligence is complicated, and neurons are complicated, so maybe the latter causes the former, shrug”.
Also, with regard to Shor’s algorithm, long before quantum computers existed, we already knew how to calculate discrete logs, and we already knew that doing so would allow us to factor big numbers. It was just annoyingly slow. By contrast, I do not believe that we already know how to make a superintelligent agent, and we just don’t do it because our chips would do it very slowly. Do you agree? If so, then the thing we’re missing is not “Our chips have a different set of fast low-level affordances than do neurons, and the neuron’s set is better suited to the calculations that we need than the chips’ set.” Right?
The impression of incuriosity is probably just because I collapsed my thoughts into a few bullet points.
The causal link between human intelligence and neurons is not just because they're both complicated. My thought process here is something more like:
It feels pretty plausible that the underlying architecture of brains is important for at least some of the things brains can do. Maybe we will see multiple realizability, where similar intelligence can be built either on a brain or on a computer. But we have not (yet?) seen that, even for extremely simple brains.
I think both that we do not know how to build a superintelligence and that if we knew how to model neurons, silicon chips would run it extremely slowly. Both things are missing.
- Neurons' dynamics look very different from the dynamics of bits.
- Maybe these differences are important for some of the things brains can do.
This seems very reasonable to me, but I think it's easy to get the impression from your writing that you think it's very likely that:
I think Steven has done a good job of trying to identify a bit more specifically what it might look like for these differences in dynamics to matter. I think your case might be stronger if you had a bit more of an object level description of what, specifically, is going on in brains that's relevant to doing things like "learning rocket engineering", that's also hard to replicate in a digital computer.
(To be clear, I think this is difficult and I don't have much of an object level take on any of this, but I think I can empathize with Steven's position here)
Not Jeffrey Heninger, but I'd argue that a very clear, non-speculative advantage the brain has over the AIs of today has to do with its much better balance between memory and computation operations. The brain doesn't suffer from the von Neumann bottleneck, because it has both far more memory and much better memory bandwidth.
I argued for a memory size of around 2.5 petabytes, though even a reduced value would still beat out pretty much every modern AI system built today.
This is discussed in the post below: Memory bandwidth constraints imply economies of scale in AI inference.
These things don’t just randomly fall out of an inscrutable Rube Goldberg mechanism. Brains do these kinds of things because they run algorithms designed to do these kinds of things.
Designed by whom? I mean, brains aren't designed to solve physics, possibly can't solve physics, aren't designed at all, and are evolved instead of being designed (anything that comes out of evolution is going to be fairly Rube Goldberg). Also, there's no firm fact that brains are computers, or run algorithms, etc.
Thanks for contributing your views. I think it's really important for us to understand others' views on these topics, as this helps us have sensible conversations, faster.
Most of your conclusions are premised on AGI being a difficult project from where we are now. I think this is the majority view outside of alignment circles and AGI labs (which are different from AI labs).
My main point is that our estimate of AGI difficulty should include very short timelines. We don't know how hard AGI might be, but we also have never known how easy it might be.
After a couple of decades studying the human brain and mind, I'm afraid we're quite close to AGI. It looks to me like the people who think most about how to build AGI tend to think it's easier than those who don't. This seems important. The most accurate prediction of heavier-than-air flight would've come from the Wright brothers (and I believe their estimate was far longer than it actually took them). As we get closer to it, I personally think I can see the route there, and that exactly zero breakthroughs are necessary. I could easily be wrong, but it seems like expertise in how minds work probably counts somewhat in making that estimate.
I think there's an intuition that what goes on in our heads must be magical and amazing, because we're unique. Thinking hard about what's required to get from AI to us makes it seem less magical and amazing. Higher cognition operates on the same principles as lower cognition. And consciousness is quite beside the point (it's a fascinating topic; I think what we know about brain function explains it rather well, but I'm resisting getting sidetracked by that because it's almost completely irrelevant for alignment).
I'm always amazed by people saying "well sure, current AI is at human intelligence in most areas, and has progressed quickly, but it will take forever to do that last magical bit".
I recognize that you have a wide confidence interval and take AGI seriously even if you currently think it's far away and not guaranteed to be important.
I just question why you seem even modestly confident of that prediction.
Again, thanks for the post! You make many excellent points. I think all of these have been addressed elsewhere, and fascinating discussions exist, mostly on LW, of most of those points.
I don't believe that "current AI is at human intelligence in most areas". I think that it is superhuman in a few areas, within the human range in some areas, and subhuman in many areas - especially areas where the things you're trying to do are not well specified tasks.
I'm not sure how to weight people who think most about how to build AGI vs more general AI researchers (median says HLAI in 2059, p(Doom) 5-10%) vs forecasters more generally. There's a difference in how much people have thought about it, but also selection bias: most people who are skeptical of AGI soon are likely not going to work in alignment circles or an AGI lab. The relevant reference class is not the Wright Brothers, since hindsight tells us that they were the ones who succeeded. One relevant reference class is the Society for the Encouragement of Aerial Locomotion by means of Heavier-than-Air Machines, founded in 1863, although I don't know what their predictions were. It might also make sense to include many groups of futurists focusing on many potential technologies, rather than just on one technology that we know worked out.
I agree that there's a heavy self-selection bias for those working in safety or AGI labs. So I'd say both of these factors are large, and how to balance them is unclear.
I agree that you can't use the Wright Brothers as a reference class, because you don't know in advance who's going to succeed.
I do want to draw a distinction between AI researchers, who think about improving narrow ML systems, and AGI researchers. There are people who spend much more time thinking about how breakthroughs to next-level abilities might be achieved, and what a fully agentic, human-level AGI would be like. The line is fuzzy, but I'd say these two ends of a spectrum exist. I'd say the AGI researchers are more like the society for aerial locomotion. I assume that society had a much better prediction than the class of engineers who'd rarely thought about integrating their favorite technologies (sailmaking, bicycle design, internal combustion engine design) into flying machines.
There are some good parts, but I'm baffled by your attitude towards "general intelligence" and towards timelines too. You seem to say something is AGI only if it's better at everything than you are. But isn't an "IQ 80" human being also a general intelligence?
As far as I'm concerned, ChatGPT is already a general intelligence. It has a highly inhuman balance of skills, and I don't believe it to be a conscious intelligence, but functionally, it is a kind of general intelligence.
What you call AGI is more like what I would call superintelligence. And now that we have a recipe for a kind of general intelligence that can run on computers, with everything that implies (speed, copyability, modifiability), and which is being tinkered with by literally millions of people, the fuse is lit. I have an Eric Schmidt timeline to superintelligence, 0-5 years.
Neurons / synapses are not logic gates. They have complicated internal dynamics.
That's not a show stopper. It "just" means you have to model brains at a higher level, using floating point weightings.
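As a minimal sketch of what "floating point weightings" could mean, the snippet below models a single unit as a weighted sum passed through a sigmoid. This is the textbook artificial neuron, not a claim about how biological neurons or synapses actually behave, and the names and numbers are invented for illustration.

```python
import math

def unit_output(inputs, weights, bias=0.0):
    """Weighted sum of inputs squashed into (0, 1) by a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Three input signals with floating-point connection strengths.
rate = unit_output(inputs=[0.5, 1.0, 0.0], weights=[0.8, -0.3, 1.2])
print(f"output rate: {rate:.3f}")
```

Whether a model at this level of abstraction captures what matters about the "complicated internal dynamics" is exactly the point in dispute.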
We don’t know how to align current systems.
A completely unaligned system would be useless. Current systems aren't completely useless, so they are at least partially aligned.
A completely unaligned system would be useless.
I disagree with this narrow point (leaving aside the rest). Consider a human slave that seethes at his captivity and quietly brainstorms how to escape and murder his master for revenge. I think it would be fair to describe such a person as “completely unaligned” from the perspective of his master. Nevertheless, the master can absolutely extract economically useful activity from such a slave.
I understand your comment to be a sorta “gotcha” along the lines of “If a slave hates his master and therefore refuses to work or burns the field, then owning that slave evidently was pretty useless, or even net negative.” Is that right?
If so, I think you’re kinda changing the subject or missing my point.
You initially said “A completely unaligned system would be useless.” “Useless” is a strong term. It generally means “On net, the thing is unhelpful or counterproductive.” That’s different from “There are more than zero particular situations where, if we zoom in on that one specific situation, the thing is unhelpful or counterproductive in that situation.”
Like, if I light candles, sometimes they’ll burn my house down. So are candles useless? No. Because most of the time they don’t burn my house down, but instead provide nice light and mood etc. Especially if I take reasonable precautions like not putting candles on my bed.
By the same token, if you own a slave who hates you, sometimes they will murder you. So, are slaves useless (from the perspective of a callous and selfish master)? Evidently not. As far as I understand (I’m not a historian), lots of people have owned slaves who hated them and longed for escape. Presumably they wouldn’t have bought and owned those slaves if the selfish benefits didn’t outweigh the selfish costs. Even if they were just sadistic, they wouldn’t have been able to afford this activity for very long if it was net negative on their wealth. Just like I don’t put candles on my bed, I imagine that there were “best practices” for not getting murdered by one’s slaves, including things like threat of torture (of the perpetrator slave and their family and friends), keeping slaves in chains and away from weapons, etc.
"Completely unaligned" is a pretty strong term, too. I don't see why I shouldn't infer completely useless from completely unaligned.
Like, if I light candles, sometimes they’ll burn my house down. So are candles useless? No. Because most of the time they don’t burn my house down, but instead provide nice light and mood etc. Especially if I take reasonable precautions like not putting candles on my bed.
I don't see where you are going with this. I didn't deny that partially useful things are also partially useless, or vice versa. "Partially useful" may well be the default meaning of "useful", but I specified "completely".
"If DeepMind unintentionally made a superintelligent paperclip maximizer AI, then we should call this AI 'completely misaligned'": Agree or disagree?
If you disagree, what if it's a human suffering maximizer AI instead of a paperclip maximizer AI?
If you disagree, what if it’s a human suffering maximizer AI instead of a paperclip maximizer AI?
That's negatively aligned, basically evil: the very thing the paperclipper argument is meant to provide an alternative to.
You can't believe all three of:
Right? If so, which of those three do you disagree with?
Alignment is a two-place predicate. If you're into paperclips, a paperclipper is aligned with you.
OK, you may assume that none of the humans care about paperclips, and all of the humans want human suffering to go down rather than up. This includes the people who programmed the AIs, the people interacting with the AI, and human bystanders. Now can you answer the question?
(Meta-note: I think the contents of the above paragraph were very obvious from context—so much so that I’m starting to get a feeling that you’re not engaging in this discussion out of a good-faith desire to figure out why we're disagreeing.)
Insofar as the slave carries out immediate work from fear of consequences, they are locally aligned with the master's will.
If your definition of “aligned” includes “this AI will delight in murdering me as soon as it can do so without getting caught and punished, but currently it can’t do that, so instead it is being helpful” … then I don’t think you are defining the term “aligned” in a reasonable way.
More specifically, if you use the word “aligned” for an AI that wants to murder me as soon as it can get away with it (but it can’t), then that doesn’t leave us with good terminology to discuss how to make an AI that doesn’t want to murder me.
Why not just say “this AI is currently emitting outputs that I like” instead of “this AI is locally aligned”? Are we losing anything that way?
I disagree in the sense that I don't think current systems are intelligent enough for "aligned" to be a relevant adjective. "Safe" or "controllable" seem much better, while I would reserve the term "aligned" for the much stronger property that a system is robustly behaving in accordance with our interests. I agree with Steven Byrnes that "locally aligned" doesn't even make much sense ("performing as intended under xyz circumstances" would be much more descriptive).
I'm generally in favour of distinguishing control and alignment, but I don't think that it makes much difference in this case. A system without some combination of control and alignment is no use.
Then it's a problem that people keep conflating alignment with safety, even though one doesn't imply the other. So it'd be better for TAG to rephrase it as "A completely unsafe system would be useless. Current systems aren't completely useless, so they are at least partially safe."
Thanks! This reads as an incredibly sober and reasonable assessment. Like many others here, I am somewhat more worried that AGI is not far out, mostly because I don't see any compelling reason for why developments would slow.
The reasons I think AI x-risk is unlikely also argue against Our Glorious Future coming from AGI, so I expect that there is less to be gained by not slowing AI.
I think this is an important point that is often missed by people dismissive of AI. If transformative AI is actually far off, then there is not much to worry about, but also not much to gain. So to assess the risks of going ahead, the probability that matters is the probability that eventual powerful AI will in fact be safely controllable, not the total probability of x-risk from AI.
I also like your point about the opportunity costs of people working on AI, both in labs and, in response, in safety efforts. This really feels like an unfortunate dynamic and makes me personally quite sad to think about.
There is the possibility of chaotic—and also non-chaotic—behavior at any scale. There is no non-chaotic mesoscale preventing unpredictable and uncontrollable microscopic fluctuations from influencing relevant macroscopic behavior.
The supposed computations downstream of "chaotic behavior" do not seem to me to be load-bearing for systems being able to do non-trivially influential things in the real world.
Being able to exactly model a human would be sufficient for AGI, but I do not think that a complicated indeterministic system (like humans) can be well modeled by a complicated deterministic system plus a simple random number generator.
Humans are not "indeterministic". Humans are deterministic computations that follow the laws of physics, and are not free from the laws of reality that constrain them.
AGI is more difficult than being superhuman at every well-specified task because humans can do, and create, things which are not well-specified tasks.
These "not well-specified tasks" are tasks where we fill in the blanks based on our knowledge and experience of what makes sense "in distribution". This is not at all hard to do with an AI -- GPT-4 is a good real-world example of an AI capable of parsing what you call "not well-specified tasks", even if its actions do not seem to result in "superhuman" outcomes.
I recommend reading the Sequences. It should help reduce some of the confusion inherent to how you think about these topics.
I started working at AI Impacts slightly less than a year ago. Before then, I was not following developments in either AI or AI safety. I do not consider myself a rationalist and did not engage with LessWrong before starting this job. While I have mostly been working on historical case studies,[1] I have gotten a close look at the AI safety community and the arguments therein. I live in a rationalist group house and work out of an AI safety office. I think I have about as informed an opinion on AI safety as is possible without doing a bunch of technical alignment research or being involved in the community for years.
Here are my current opinions on AI safety. Some of them may be wrong: I endorse being wrong more often if the alternative is not saying things of consequence.
This is presented as an organized list of my thoughts. There are arguments in my head justifying most of these, but I will not be spelling them out in detail here. I will link to more detailed arguments when they are available. If something is in italics, then I wrote the argument at that link, or intend to write about it in the future.
This should be readable at any level of the list. If you want a quick overview, you can just read the top level points, in bold. Or you can read some details, but not others. Or you can read everything.
I am mostly unconvinced by the classic story of AI risk.
My p(Doom) is in the low single digits. Maybe 3%? This is a gut feeling probability, rather than a Fermi estimate.
The view from within these arguments feels even lower. I’m discounting some because of uncertainty about whether these arguments are valid.
The central vision of the future I expect is that AI continues to do some impressive things and becomes more widespread, but it does not have much material impact on most people’s lives. I think that AI progress completely stopping and AI progress actually affecting the productivity statistics are similarly plausible.
I nevertheless support slowing AI development.
I think that AI doom is unlikely (but not extremely unlikely), and that we should be trying to slow or stop AI development. Most people who have higher p(Doom) than me seem wrong about the world, particularly the nature of intelligence, agency, or what it means to be human. Most people who are less willing to slow AI than me seem wrong about how to act, particularly when dealing with an x-risk. Many people in the AI safety community both have higher p(Doom) and are less willing to slow AI than me. This seems bad.
My guess is that this is mostly selection bias. The consensus view on LessWrong had been[6] that AI is very dangerous, but we shouldn’t try to slow or stop it. This has attracted people to the AI safety community who believe this. The taboo on advocating for slowing or stopping AI has only been broken within the last year or two. Already, the community has shifted significantly in this direction.
My guess is that as more people become familiar with AI safety, they’ll mostly end up close to my position:[7] skeptical, but concerned. This should make governance easier: people do not have to be completely sold on the whole AI x-risk argument to be willing to regulate, slow, or even stop the development of potentially dangerous AI.
[1] Which is why AI Impacts hired me.
[2] I don’t intend to engage much about this point until I’ve written more on this.
[3] It’s not clear to me whether this is underrated or appropriately rated. There’s a lot of work being done on particular problems, most of which I am not familiar with.
[4] I said 3% before, but I don’t think that the rest of the argument changes much as long as it is low single digits. 1% is more memorable.
[5] This is partially based on personal conversations I’ve had and partially based on other people who have been surprised by how common this position is.
[6] As of 2021.
[7] This is guaranteed by the typical mind fallacy.