As a reductionist, I view the universe as nothing more than particles/forces/quantum fields/static event graph. Everything that is or was comes from simple rules down at the bottom. I agree with Eliezer regarding many-worlds versus copenhagen.
With this as my frame of reference, Searle's argument is trivially bogus, as every person (including myself) is obviously a Chinese Room. If a person can be considered 'conscious', then so can some running algorithm on a Turing machine of sufficient size. If no Turing machine program exists that can be considered conscious when run, then people aren't conscious either.
I've never needed more than this, and I find the Chinese Room argument to be one of those areas where philosophy is an unambiguously 'diseased discipline'.
The real issue with the Chinese Room argument is that it misidentifies the reason we can't make Giant Look-Up Tables. The reason is that conservation of energy and the speed of light mean we can't create new energy out of nothing, and this matters: giant look-up tables grow fast as they learn new things. By the time you have emulated a human mind, you have used the resources of countless observable universes to build the table, which you can't do.
That's why look-up tables can't be used for intelligence beyond simple tasks, and why GPT can't be a look-up table.
To be fair, even if what you're referring to above is true (I don't believe it is - lookup table compression is a thing), it's an implementation detail. It doesn't matter that a naive implementation might not fit in our current observable universe; it need merely be able to exist in some universe for the argument to hold.
And in a way, this is my core problem with Searle's argument. I believe you can fully emulate a human both with sufficiently large lookup tables, and with pretty small lookup tables combined with some table expansion/generation code running on an organic substrate. I don't challenge the argument based on the technical feasibility of the table implementation. I challenge it on the basis that the author mistakenly believes that the implementation of any given table (static lookup table versus algorithmic lookup) somehow determines consciousness.
While I agree with your argument against Searle, it matters whether the room is at all feasible: if it isn't, then Searle's argument has no real relation to AI today or in the future, and we can't use it to argue that such systems lack intelligence/consciousness.
To be clear, I agree with your argument. I just want to note that physical impossibilities are being used to argue that today's AI aren't intelligent or conscious.
Instructions to carry out paperwork: you read a process off from the book, not a single symbol.
If you are allowed to make (guided by the book) notes then you have memory that persists between "lookups".
If you have new paper available and there are recursive instructions in the book, it might be quite a while, writing symbols for "internal consumption", before you produce any symbol that is put in the output slot.
It might be that the original idea was less fully specified, but I think it points in the same direction as the notion of an effective method.
Its instructions need only to be followed rigorously to succeed. In other words, it requires no ingenuity to succeed
With "you do not need to know what you are doing" meaning that "sticking to the book" is sufficient ie no ingenuity.
But brainless action still involves more stuff than just writing a single (character/phrase) to the output slot.
With "you do not need to know what you are doing"
This is basically black box intelligence, and there's no reason to make the assumption that black box methods cannot work. Indeed black boxes are already used for AI today.
It may be nice for it to be white box, but I see no reason for black boxes not to be intelligent or conscious.
It is about the human knowing how to read the book and not misunderstanding it. If you were in the Chinese room with a book written in English and you did not know English, you would not know how to operate the room. It is a "no box" in that the human does not need to bring anything to the table (the book does it all).
If you knew Chinese already, being unable to read a book written in English would be no obstacle. But knowing English, whether or not you also know Chinese, makes no difference to understanding the Chinese.
So the critical disagreement: assuming we can add enough energy to learn everything in the book with no priors, plus arbitrarily large memory capacity, then this is equivalent to actually knowing Chinese, since you can store arbitrarily large memory in your head, which includes a vast but finite ruleset for Chinese. Of course, to learn new languages you will have to expend that energy again, which rapidly spirals into an uncontrollable energy cost. That is why Chinese Rooms can't actually be built in real life, and why the success of GPT refutes Searle's and Gary Marcus's thesis that such systems are just Chinese Rooms.
No, I don't think I'm making a claim about energy usage.
We are only allowed to look at a single page/sentence at a time, which is quite possible with a finite read-head, and we need not remember pages we have turned away from.
You can google to benefit from the whole internet without needing to download the whole internet. You can work a 2 TB hard drive while only having 64 KiB of L1 cache. You can run arbitrary Python programs on a CPU core whose instruction set is small, finite, and cannot be extended.
You can manage a 2-hour paperwork session with a 6-second memory buffer.
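The finite-working-memory point can be sketched in code (a toy illustration of my own, not anything from the thought experiment): a process can handle arbitrarily long input while its state stays a fixed size, like a read-head that sees one symbol at a time.

```python
# Toy illustration: a fixed-size "read head" over an arbitrarily long stream.
# The working state never grows, no matter how much input passes through.
def checksum(stream):
    state = 0  # bounded working memory, independent of input length
    for symbol in stream:
        # fold each symbol into the state; order matters, like history
        state = (state * 31 + symbol) % 2**32
    return state

# The same code handles ten items or ten billion (fed as a generator),
# with the same tiny memory footprint.
print(checksum(range(10)))
```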
I guess the human needs to bring English in their head rather than having a literally empty head. But having English in your head is not, energy-wise, a miracle.
then it is equivalent to actually knowing Chinese,
Then it is equivalent to who/what knowing Chinese?
Specifically, once it has memorized the book, it doesn't need to use the book anymore, and can rely on its own memory.
So the operator can manage without the book, but the book can't manage without the operator...?
You can read the paper yourself. It doesn't exactly say it is a LUT, and it doesn't exactly say it isn't.
The passage quoted in the OP says
Imagine a native English speaker who knows no Chinese locked in a room full of boxes of Chinese symbols (a data base) together with a book of instructions for manipulating the symbols (the program).
...But says nothing about the complexity of the "program". Philosophybear, like so many others, jumps to the conclusion that it must be a LUT.
The problem is that the limiting case of pure memorization is a look up table of some form. If we drop the requirement for pure memorization and instead allow world models or heuristics, then we don't have to use a look up table.
More generally, this is important, since if AI could only memorize like the Chinese Room, GPT-3 would not be at all possible without expending vastly more energy than was actually done, so we should strongly update against the pure memorization hypothesis.
The other complaint I think Searle is making is that black box intelligence/consciousness, whether a look up table or something else, is not real intelligence/consciousness. And that's wrong, since there are in fact black box intelligences we can't interpret well like Evolution Strategies or Model Free RL, so the fact that a Turing machine emulates Chinese in a black box format is equivalent to learning it in a white box format.
Searle might be trying to tease apart intelligence and consciousness. If you take a feeling, breathing human being and make it perform a complex task, you didn't add feeling at any point; it just got passively preserved. So if you start from nothing, add the performance of a complex task, and end up with a feeling, breathing human being, it is a bit weird where the breathing came from.
If some task just is breathing, then it is plausible that starting from nothing can work, since the task involves breathing. If it can't be done, then "taskness" is not sufficient.
Searle might be trying to tease apart intelligence and consciousness. If you take a feeling, breathing human being and make it perform a complex task, you didn't add feeling at any point; it just got passively preserved. So if you start from nothing, add the performance of a complex task, and end up with a feeling, breathing human being, it is a bit weird where the breathing came from.
It's already covered by conservation of energy, i.e., new feelings must pay an energy cost to feel something new, so this is double-counting.
And conservation of energy does not always hold in mathematics (though such a cosmology would allow time travel.) Thus this is a contingent fact of our universe, not something that has to be universally true.
Paying an energy cost would be a task and would operate on the intelligence side of it. Typically feelings come with cognitively discernible aspects.
Think of a pinball machine that moves a ball in complex ways and is entirely built and defined as moving operations. Then you put in a red ball and the pinball machine spits out a blue ball. This is surprising, as no operation should color the ball. One can't explain with kinetic energy conservation that color conversion should be impossible.
If you had some part of the flipper that read the color of the ball and, conditional on that, kicked the ball, then one could tell the color of the ball from its trajectory. If no such thing exists, then the point of even talking about color might fade, and somebody might think that color is "epiphenomena", not an actual thing that happens. A claim that the machine recolors the ball is not a claim about trajectories.
I made a bad analogy in using breathing, as that can sound like a trajectory term. Somebody who believes in P-zombies might ask whether a particular actor is a P-zombie. You start with something that is definitely not functional and definitely is a zombie. Then you put it through an upgrade/process that makes it functional. You would expect to have produced a P-zombie. But if you actually receive a human non-P-zombie, you might start suspecting that the process somehow also works in the zombie dimension. A P-zombie, compared to a usual human, is similar in intelligence and dissimilar in consciousness. Conflating the two and taking them to be synonymous makes talking about this hard and/or impossible.
Edit: a typo that has a chance of being quite meaningful. I did not mean "turn energy to color" but rather that "energy stays stable, color is unstable".
Think of a pinball machine that moves a ball in complex ways and is entirely built and defined as moving operations. Then you put in a red ball and the pinball machine spits out a blue ball. This is surprising, as no operation should color the ball. One can't explain with kinetic energy conservation that color conversion should be impossible.
The key to solve this problem is that extra energy was applied somewhere else, like a color machine, and this is an open system, so conservation of energy does not hold here.
If it is open, how do we know how much energy there is supposed to be, to determine that there is extra?
If we have built the machine ourselves and are therefore quite sure that there are no surprise coloring machines, then the placement of the various parts cannot be the detail where we screwed up.
Why would applying a surprise jolt of energy to the ball change its color? Imagine you find a particular bumper before which the ball is red and after which it is blue. Why would calling this bumper a "coloring machine" explain anything? Would the blue ball leaving with a tiny speed deficit explain why the ball got colored, or why it is this specific bumper rather than all the others that came from the same assembly line?
I was of course talking about a coloring machine, though one important point is that with enough energy directed in the right way, you can do things like simulate a coloring machine and then add new color.
Energy, properly applied and in sufficient quantity, can do nearly everything, like changing the color.
I will grant that the bumper became a superpowered arcane user that can produce unknown effects with minuscule energy.
It would still be interesting that this ended up happening after taking a factory-standard bumper part and setting out to build a non-magical pinball machine. That is, you were not trying to make a coloring machine happen. You have no reason to believe any unknown high-tech spy would be interested in gaslighting you. As long as you can be quite sure you saw it blue, quite sure you saw it red, and sure you were not trying to make it happen, you have reason to believe you do not understand what you ended up building.
Maybe you look a bit more into it and try a Geiger counter on the ball. Before the machine it doesn't radiate; after the machine it radiates. You could feel good and go "I invented an irradiator!", or you could try to hunt down an irradiator bumper in the pinball machine. But you still do not know how you did it.
There could be any number of properties you could learn to check your balls for. The claim is not that you need to learn all these properties to master predicting the kinetics of the pinball. The claim is not that the new properties would be a source of infinite or absurd amounts of kinetic energy. The kinetics works as predicted and is closed with regard to all these other things. Learning about new properties does not change the fact that you previously saw the ball bounce around the machine. The claim is that your mastery of kinetics cannot explain the machine turning out to be a converter for property number 56.
Maybe you think that kinetics cannot be closed with regard to other properties: "kinetics is everything". Then, when you blueprint the pinball machine in meticulous detail, you should be able to predict all the other properties. Then, shown what kind of ball went in and what kind of ball came out, you should be able to determine with 100% certainty whether it was put through your unmeddled machine or some other machine.
But meticulousness is hard, and in your non-omniscience you can only do that to a practical limit. So you learn about new property number 12. With the machine unassembled, you test each bumper separately for what kind of effect it has on the ball. Your friends, knowing you always master the latest discovered property first, ask whether you have done so for number 12 yet. You feel confident. Then you assemble the pinball machine and put a ball through it. If you are surprised about whether it is a property-12 converter, then you have learned that your mastery is not at the level of what is practically possible to construct and measure.
So claims of the form "you need to keep track of property number 9 in order to practically predict what happens in practically doable measurements of property 10" do not accept unknown unknowns as an excuse.
Claims of the form "you can mix up properties 8 and 7 and it will make no practical or observable difference", combined with not being able to close kinetics off from those properties, mean that the existence of such properties is a settleable question.
If intelligence and consciousness are separate properties and we can contain an argument to be about intelligence only, it cannot inform us about consciousness.
If intelligence and consciousness are connected properties, we cannot claim that an argument about one of them is irrelevant to the other.
The problem is that there are a bunch of intermediate positions between LUT and strong AI. The doubt about an LLM is whether it has a world model, when it is only designed as a word model.
Nothing is fundamentally a black box.
Nothing is fundamentally a black box.
That claim is unjustified and unjustifiable. Everything is fundamentally a black box until proven otherwise. And we will never find any conclusive proof. (I want to tell you to look up Hume's problem of induction and Karl Popper's solution, although I feel that making such a remark would be insulting your intelligence.) Our ability to imagine systems behaving in ways that are 100% predictable and our ability to test systems so as to ensure that they behave predictably does not change the fact that everything is always fundamentally a black box.
That claim is unjustified and unjustifiable
Nothing complex is a black box, because it has components, which can potentially be understood.
Nothing artificial is a black box to the person who built it.
An LLM is, of course, complex and artificial.
Everything is fundamentally a black box until proven otherwise.
What justifies that claim?
Our ability to imagine systems behaving in ways that are 100% predictable and our ability to test systems so as to ensure that they behave predictably
I wasn't arguing on that basis.
Nothing is fundamentally a black box.
This might be a key crux: I think that white box is a rare state, and interpretability is not the default case here.
Indeed I see the opposite: Black box strategies like Model Free RL work a lot better currently than white box strategies, and I think black boxes are the default scenario. You have to do a lot of work to interpret things enough to make it white box.
Any system of sufficient complexity is incomprehensible to an observer of insufficient intelligence. But that's not fundamental.
I have a better argument now, and the answer is that the argument fails in the conclusion.
The issue is that, conditional on assuming a computer program (speaking very generally here) is able to give a correct response to every input of Chinese characters, and that it knows the rules of Chinese completely, it must know/understand Chinese in order to do the things Searle claims it is doing; in that case we'd say it does understand/decide Chinese for all purposes.
Basically, I'm claiming that the premises lead to a different, opposite conclusion.
These premises:
“Imagine a native English speaker who knows no Chinese locked in a room full of boxes of Chinese symbols (a data base) together with a book of instructions for manipulating the symbols (the program). Imagine that people outside the room send in other Chinese symbols which, unknown to the person in the room, are questions in Chinese (the input). And imagine that by following the instructions in the program the man in the room is able to pass out Chinese symbols which are correct answers to the questions (the output).
assuming that every input has in fact been used, contradicts this conclusion:
The program enables the person in the room to pass the Turing Test for understanding Chinese but he does not understand a word of Chinese.”
The correct conclusion, including all assumptions is that they do understand/decide Chinese completely.
The one-sentence slogan is "Look-up table programs are a valid form of intelligence/understanding, albeit the most inefficient form of intelligence/understanding."
What it does say is that, without any restrictions on how the program computes Chinese (or any problem) other than that it must give a correct answer to every input, the answer to the question "Is it intelligent on this specific problem / does it understand this specific problem?" is always yes; to leave open the possibility of a no, you need to add more restrictions than that.
The look-up table idea can be quite misleading. In essence, you need something other than a look-up table, or your look-up table needs to be properly huge.
The room should pass a challenge like
U: "Hello"
R: "Hello"
U: "Blue"
R: "Excuse me?"
U: "What did I just say?"
R: "Blue"
Doing this with a fixed book that you can't write in is quite challenging. It can be done with a book that has instructions like "if the conversation history is [...], give 'blue' to the user"; such a book is so huge that keeping it from collapsing under black-hole weight is hard even with minimal energy use per instruction (not that we are concerned with physics in thought-experiment land).
However, I think it is better to imagine that the room operator has access to a big pile of blank paper, can copy symbols onto it as directed by the instructions, and that rules can condition on a symbol being drawn on a piece of paper. To be convinced that this "expansion" adds no relevant capabilities, the setup should be pinned down; I claim that, without loss of generality, one can think of the filled papers as an ordered pile of which only the top paper is visible, with the book containing orders like "put the top paper at the bottom of the pile", etc.
Then an echo request becomes a rather simple paper dance instead of 30,000 different commands, spanning the whole Chinese vocabulary, for each possible content of the echo request.
A related claim is that if you have a specification of a Turing machine, running it requires absolutely zero idea of what the symbols mean. And a Turing machine with a fixed symbol-replacement table can exhibit quite dynamic, variable-like behaviour. The room operator's book is the symbol-replacement table, not a table of inputs-to-outputs (although, given such an IO table, constructing a paper-shuffle table that realises it should consistently be possible).
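As a toy sketch (my own invention; the rules and replies are made up for the example), the difference between a pure input-to-output table and rules that may write to a paper pile looks like this; only the latter can pass the echo challenge above:

```python
# Toy contrast: a stateless lookup table vs. rules with a scratch "pile".
# All rules and replies here are invented for the illustration.
lookup_table = {"Hello": "Hello", "Blue": "Excuse me?"}

def stateless_room(utterance):
    # Same input always yields the same output, so this room
    # cannot answer "What did I just say?" correctly.
    return lookup_table.get(utterance, "Excuse me?")

def stateful_room(utterance, pile):
    # The operator may copy symbols onto new sheets of paper, and
    # rules may condition on what is written on the visible sheet.
    if utterance == "What did I just say?":
        return (pile[-1] if pile else "Excuse me?"), pile
    new_pile = pile + [utterance]  # note the utterance on a fresh sheet
    return lookup_table.get(utterance, "Excuse me?"), new_pile

pile = []
for said in ["Hello", "Blue", "What did I just say?"]:
    reply, pile = stateful_room(said, pile)
    print(said, "->", reply)
```

The stateful room answers "Blue" to the echo request with one simple rule, where a fixed book would need a separate entry for every possible prior utterance.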
To the point about being text-biased: sense organs are connected to the brain via nerves. It is possible to find a spatial surface through a human such that all cognitively relevant data crossing it is carried by nerve signaling (approximately the spine; one could argue that hormonal signals should be included). A nerve is either firing or not firing, so a bit is a faithful representation of its informational content. One can take time slices, say one slice per 10 microseconds: 100 if the firing happens early, 010 a bit later, and 001 if it is last. With 3 nerves, you can concatenate their time-slice representations into a representation of the whole: 100010010 for one nerve firing early and two others firing simultaneously after it. If you do this for all the nerves, over however much time the extended present spans, this is a sufficient representation of the data the brain works on (it is not missing anything; claiming that it is gets into ESP territory, and adrenaline for these purposes counts).
Then one can use a scheme of taking every 5 bits and mapping 00001=A, 00010=B, 00011=C, ... to render it as a string of letters (that different nerves and times mix is interesting but inessential) (chop it into chunks of 5 letters if thinking in words is preferred). This covers data coming from the eye, the nose, the inner ear, etc., and similarly commands going out to the arms and such. Thus text has to be sufficient to detail a human's experience of the world, and this is not at all dependent on what data formats the senses use or what the qualia are. If we had a text-to-picture conversion and a picture-to-text conversion, we might next ask about the possibility of smell-to-text conversions, and whether we have exhausted all known "obvious" senses. With this setup we know that any "electronic sense" gets accounted for.
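The encoding described above can be made concrete with a small sketch (the 3-slice patterns and 5-bit letter scheme follow the text; the function names and padding choice are my own invention):

```python
# Toy sketch of the nerve-to-text encoding (slice count and 5-bit letter
# scheme taken from the text; the rest is an invented illustration).
def slice_pattern(fire_slice, n_slices=3):
    # "100" = fired in the first slice, "010" = second, "001" = third,
    # "000" = did not fire in this window
    bits = ["0"] * n_slices
    if fire_slice is not None:
        bits[fire_slice] = "1"
    return "".join(bits)

def nerves_to_text(firings):
    # Concatenate each nerve's time-slice pattern, pad to a multiple
    # of 5 bits, then map 00001=A, 00010=B, 00011=C, ...
    bitstring = "".join(slice_pattern(f) for f in firings)
    bitstring += "0" * (-len(bitstring) % 5)
    chunks = [bitstring[i:i + 5] for i in range(0, len(bitstring), 5)]
    return "".join(chr(ord("A") + int(c, 2) - 1) if int(c, 2) else "?"
                   for c in chunks)

# One nerve fires early, two fire together just after: 100 010 010
print(nerves_to_text([0, 1, 1]))
```

Any pattern of firings comes out as a string of letters, which is the point: the representation is textual regardless of which sense the nerves serve.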
While words are transmitted as phonons and are more multidimensional than this, to the extent that humans can cognitively engage with them they have to appear through this nerve-representation layer.
The lookup table view is a distraction. First of all, it is completely implausible - the lookup table needed would have more entries than atoms in the universe (much more, actually). But even if the lookup table could exist, the only way it could be filled in with good entries is by having a real or simulated person specify all those entries. So while the room may not understand Chinese, the real/simulated people who filled in the lookup table entries did. When someone interacts with the room, and thinks the room understands what they say, they are basically right, but are just wrong about the location in time and space of the entity that understands.
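A back-of-the-envelope calculation makes the size point concrete (all numbers are rough, invented assumptions, not figures from the discussion):

```python
import math

# Rough illustration of why a full conversational lookup table is
# physically implausible. All numbers are crude, invented assumptions.
vocab = 50_000               # assume ~50k distinct tokens
dialogue_len = 100           # table covers every possible 100-token dialogue
atoms_in_universe = 10**80   # common rough estimate

entries = vocab ** dialogue_len   # one table entry per possible dialogue
print(entries > atoms_in_universe)              # vastly larger
print(round(dialogue_len * math.log10(vocab)))  # entry count has ~470 digits
```

Even with very conservative assumptions, the entry count dwarfs the atom count by hundreds of orders of magnitude, which is the "much more, actually" in the comment above.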
You keep treating possibilities as actualities. LaMDA might be simulating people without being programmed or prompted to, the CR might have full semantics without a single symbol being grounded, but in both cases they might not.
There isn't any fundamental doubt about what computers are doing, because they are computers. Computers can't have strongly emergent properties. You can peek inside the box and see what's going on.
The weakness of the systems reply to the CR is that it forces you to accept that a system that is nothing but a lookup table has consciousness, or that a system without a single grounded symbol has semantics. (Searle can close the loophole about encoding images by stipulation.)
Likewise, there is no reason to suppose that LaMDA is simulating a person every time it answers a request: it's not designed to do that, and it's not going to do so inexplicably, because it's a computer and you can examine what it's doing.
Preface:
I’m putting my existing work on AI on Less Wrong, and editing as I go, in preparation to publishing a collection of my works on AI in a free online volume. If this content interests you, you could always follow my Substack, it's free and also under the name Philosophy Bear.
Anyway, enjoy. Comments are appreciated as I will be rewriting parts of the essays before I put them out:
The essay:
The title is a play on “Against the Airport and its World” and is in no way intended as a slight against either of the named authors, both of whom I respect intellectually, and do not know enough about to evaluate as people.
The other day I gave an argument that the differences between whatever LaMDA is and true personhood may be more quantitative than qualitative. But there’s an old argument that no model which is based purely on processing text and outputting text can understand anything. If such models can’t understand the text they work with, then any claim they may have to personhood is tenuous at best; indeed, let us grant, at least provisionally, that it is scrapped.
That argument is the Chinese Room Argument. Gary Marcus, for example, invokes it in his 2022 article “Google’s AI is not sentient. Not even slightly”- [Edit: or I should say, at least on my reading of Marcus’s article he alludes to the Chinese Room argument although some of my readers disagree].
To be clear, Marcus, unlike Searle, does not think that no AI could be sentient, but he does think, as far as I can tell, that a pure text-in, text-out model could not be sentient for Chinese Room-related reasons. Such models merely associate text with text; they are a “giant spreadsheet” in his memorable phrase. One might say they have a purely syntactic, not semantic, character.
I will try to explain why I find the Chinese Room argument unconvincing, not just as proof that AI couldn’t be intelligent, but even as proof that a language model alone can’t be intelligent. Even though the arguments I go through here have already been hashed out by other, better philosophers, I want to revisit this issue and say something on it- even if it’s only a rehash of what other people have said- because the issue of what a model that works on a text-in-text-out basis can or cannot understand is very dear to my heart.
The Chinese Room argument, as summarised by Searle, goes:
“Imagine a native English speaker who knows no Chinese locked in a room full of boxes of Chinese symbols (a data base) together with a book of instructions for manipulating the symbols (the program). Imagine that people outside the room send in other Chinese symbols which, unknown to the person in the room, are questions in Chinese (the input). And imagine that by following the instructions in the program the man in the room is able to pass out Chinese symbols which are correct answers to the questions (the output). The program enables the person in the room to pass the Turing Test for understanding Chinese but he does not understand a word of Chinese.”
In the original thought experiment the program effectively constituted a lookup table. “Output these words in response to these inputs”.
I’ve always thought that two replies- taken jointly - capture the essence of what is wrong with the argument.
The whole room reply: It is not the individual in the room who understands Chinese, but the room itself. This reply owes to many people, too numerous to list here.
The cognitive structure reply: The problem with the Chinese room thought experiment is that it depends upon a lookup table. If the Chinese room instead used some kind of internal model of how things relate to each other in the world in order to give its replies, it would understand Chinese- and, moreover, large swathes of the world. This reply, I believe, owes to David Braddon-Mitchell and to Frank Jackson. The summary of the two replies I’ve endorsed, taken together, is:
“The Chinese Room Operator does not understand Chinese. However, if a system with a model of interrelations of things in the world were used instead, the room as a whole, but not the operator, could be said to understand Chinese.”
There need be nothing mysterious about this modeling relationship I mention here. It’s just the same kind of modeling a computer does when it predicts the weather. Roughly speaking, I think X models Y if X contains parts that are isomorphic to the parts of Y, and these parts stand in relationships with each other (especially the same or analogous causal relationships) isomorphic to those the parts of Y stand in. Also, the inputs and outputs of the system causally relate to the thing modeled in the appropriate way.
It is certainly possible in principle for a language model to contain such world models. It also seems to me likely that actually existing language models can be said to contain these kinds of models implicitly, though very likely not at a sufficient level of sophistication to count as people. Think about how even a simple feed-forward, fully connected neural network could model many things through its weights and biases, and through the relationships between its inputs, outputs and the world.
Indeed, we know that these language models contain such world models at least to a degree. We have found nodes that correspond to variables like “positive sentiment” and “negative sentiment”. The modeling relationship doesn’t have to be so crude as “one node, one concept” to count, but in some cases, it is.
The memorisation response
Let me briefly deal with one reply to the whole room argument that Searle makes- what if the operator of the Chinese room memorized the books and applied them? She could now function outside the room as if she were in it, but surely she wouldn’t understand Chinese. Now it might seem like I can dismiss this reply out of hand, because my reply to the Chinese room includes a point about functional structure: a look-up table is not good enough. Nothing obliges me to say that if the operator memorized the lookup tables, they’d understand Chinese.
But this alone doesn’t beat Searle’s counterargument because it is possible that she calculates the answer with a model representing parts of the world, but she (or at least her English-speaking half) does not understand these calculations. Imagine that instead of memorizing a lookup table, she had memorized a vast sequence of abstract relationships- perhaps represented by complex geometric shapes, which she moves around in her mind according to rules in an abstract environment to decide what she will say next in Chinese. Let’s say that the shapes in this model implicitly represent things in the real world, with relationships between each other that are isomorphic to relationships between real things, and appropriate relationships to inputs and outputs. Now Searle says “look, this operator still doesn’t understand Chinese, but she has the right cognitive processes according to you.”
But I have a reply- In this case I’d say that she’s effectively been bifurcated into two people, one of which doesn’t have semantic access to the meanings of what the other says. When she runs the program of interacting abstract shapes that tell her what to say in Chinese, she is bringing another person into being. This other person is separated from her, because it can’t interface with her mental processes in the right way [This “the operator is bifurcated” response is not new- c.f. many such as Haugeland who gives a more elegant and general version of it].
Making the conclusion intuitive
Let me try to make this conclusion more intuitive through a digression.
It is not by the redness of red that you understand the apple; it is by the relationships between different aspects of your sensory experience. The best analogy here, perhaps, is music. Unless you have perfect pitch, you wouldn’t be able to distinguish between C4 and F4 if I played them for you on a piano (separated by a sufficient period of time). You might not even be able to distinguish between C4 and C5. What you can distinguish are the relationships between notes. You will most likely be able to hear instantly the difference between me playing C4 then C#4 and me playing C4 then D4: the interval C4-C#4 will sound sinister because it is a minor interval, while the interval C4-D4 will sound harmonious because it is a major interval, and you will know that both are rising in pitch. Your understanding comes from the relationships between bits of your experience and other bits of your experience.
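To make the relational point concrete, here is a small sketch (not from the essay itself) using the standard equal-temperament tuning formula. The absolute frequencies of the notes change when you transpose, but the frequency ratio that defines an interval does not, and that invariant ratio is the "relationship" a listener actually picks up on:

```python
import math

def freq(midi_note):
    """Frequency in Hz of a MIDI note number, with A4 (note 69) = 440 Hz."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)

# MIDI numbers: C4 = 60, D4 = 62, C5 = 72, D5 = 74
ratio_c4_d4 = freq(62) / freq(60)  # a major second starting on C4
ratio_c5_d5 = freq(74) / freq(72)  # the same interval an octave higher

# The absolute pitches differ by a factor of two, but the interval's
# frequency ratio is identical either way: 2^(2/12).
assert math.isclose(ratio_c4_d4, ratio_c5_d5)
assert math.isclose(ratio_c4_d4, 2 ** (2 / 12))
```

A listener without perfect pitch cannot tell you whether a note was C4 or C5, but can reliably tell a semitone step from a whole-tone step, which is exactly the structure the ratios capture.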
I think much of the prejudice against the Chinese room comes from the fact that it receives its input in text:
Consider this judgment by Gary Marcus on claims that LaMDA possesses a kind of sentience:
“Nonsense. Neither LaMDA nor any of its cousins (GPT-3) are remotely intelligent. All they do is match patterns, drawn from massive statistical databases of human language. The patterns might be cool, but language these systems utter doesn’t actually mean anything at all. And it sure as hell doesn’t mean that these systems are sentient. Which doesn’t mean that human beings can’t be taken in. In our book Rebooting AI, Ernie Davis and I called this human tendency to be suckered by The Gullibility Gap — a pernicious, modern version of pareidolia, the anthromorphic bias that allows humans to see Mother Theresa in an image of a cinnamon bun. Indeed, someone well-known at Google, Blake LeMoine, originally charged with studying how “safe” the system is, appears to have fallen in love with LaMDA, as if it were a family member or a colleague. (Newsflash: it’s not; it’s a spreadsheet for words.)”
But all we humans do is match patterns in sensory experiences. True, we do so with inductive biases that help us to understand the world by predisposing us to see it in such ways, but LaMDA also contains inductive biases. The prejudice comes, in part, I think, from the fact that it’s patterns in texts, and not, say, pictures or sounds.
Now it’s important to remember that there really is nothing qualitatively different between a passage of text and an image, because each can easily encode the other. Consider this sentence: “The image is six hundred pixels by six hundred pixels. At point 1,1 there is red 116. At point 1,2 there is red 103…” and so on. Such a sentence conveys all the information in the image. Of course, there are quantitative reasons this won’t be feasible in many cases, but they are only quantitative.
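The point that such a sentence loses nothing can be checked mechanically. Here is a toy sketch (the pixel values are invented for illustration): a tiny “image” serialized into sentences of exactly the form above, then parsed back with no information lost:

```python
import re

image = [[116, 103], [200, 54]]  # a 2x2 grid of red-channel values

# Encode: one English sentence per pixel, as in the essay's example.
sentences = [
    f"At point {r + 1},{c + 1} there is red {image[r][c]}."
    for r in range(2) for c in range(2)
]
text = "The image is two pixels by two pixels. " + " ".join(sentences)

# Decode: recover every pixel value from the text alone.
decoded = [[0, 0], [0, 0]]
for r, c, v in re.findall(r"At point (\d+),(\d+) there is red (\d+)", text):
    decoded[int(r) - 1][int(c) - 1] = int(v)

assert decoded == image  # the text carries all of the image's information
```

The round trip is exact, which is what it means for the difference between the two formats to be merely quantitative rather than qualitative.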
I don’t see any reason in principle that you can’t build an excellent model of the world through relationships between text alone. As I wrote a long time ago [ed: in a previous essay in this anthology.]:
“In hindsight, it makes a certain sense that reams and reams of text alone can be used to build the capabilities needed to answer questions like these. A lot of people remind us that these programs are really just statistical analyses of the co-occurrence of words, however complex and glorified. However, we should not forget that the statistical relationships between words in a language are isomorphic to the relations between things in the world—that isomorphism is why language works. This is to say the patterns in language use mirror the patterns of how things are. Models are transitive—if x models y, and y models z, then x models z. The upshot of these facts is that if you have a really good statistical model of how words relate to each other, that model is also implicitly a model of the world, and so we shouldn't be surprised that such a model grants a kind of "understanding" of how the world works.”
Now that’s an oversimplification in some ways (what about false statements, deliberate or otherwise?), but in the main the point holds. Even in false narratives, things normally relate to each other the way they do in the real world; in a story, you’ll generally only start walking on the ceiling if that’s key to the plot. The relationships between things in the world are implicit in the relationships between words in text, especially over large corpora. Not only is it possible in principle for a language model to use these, I think it’s very possible that, in practice, backpropagation could arrive at them. In fact, I find it hard to imagine the alternative, especially if you’re going to produce language to answer complex questions with answers that are more than superficially plausible.
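A miniature sketch can show the mechanism at its crudest (the corpus and all word choices here are invented for illustration, and real language models use far richer representations than raw co-occurrence counts): even simple context-counting over text recovers a sliver of world structure, such as cats being more like dogs than like cars:

```python
import math
from collections import Counter

# A tiny invented corpus; "." is treated as an ordinary token.
corpus = (
    "the cat chased the mouse . the dog chased the cat . "
    "the cat ate fish . the dog ate meat . "
    "the car needs fuel . the truck needs fuel . "
    "the car has wheels . the truck has wheels ."
).split()

# Build a co-occurrence vector for each word from a +/-2 token window.
window = 2
vecs = {}
for i, w in enumerate(corpus):
    context = corpus[max(0, i - window):i] + corpus[i + 1:i + 1 + window]
    vecs.setdefault(w, Counter()).update(context)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)  # Counter returns 0 for missing keys
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

# Words that play similar roles in the world occur in similar contexts
# in text, so their vectors end up similar.
assert cosine(vecs["cat"], vecs["dog"]) > cosine(vecs["cat"], vecs["car"])
assert cosine(vecs["car"], vecs["truck"]) > cosine(vecs["car"], vecs["dog"])
```

Animals cluster with animals and vehicles with vehicles purely from the statistics of the text, which is the isomorphism the quoted passage points to, in embryo.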
Note: In this section, I have glossed over the theory-ladenness of perception and treated perception as if it were a series of discrete “sense data” that we relate statistically, but I don’t think expanding the argument to include a more realistic view of perception would create any problems for it. This approach just makes exposition easier.
What about qualia?
I think another part of the force of the Chinese room thought experiment comes from qualia. In this world of text associated with text in which the Chinese room lives, where is the redness of red? I have two responses here.
The first is that I’m not convinced that being a person requires qualia; I think that if philosophical zombies are possible, they still count as persons, and have at least some claim to ethical consideration.
The second is that qualia are poorly understood. They essentially amount to the non-functional part of experience: the redness of red that would remain even if, as in the famous inverted spectrum argument, red and green were swapped in a way that made no difference to behavior. Currently, we have no real leads on solving the hard problem. Thus who can say that there couldn’t be hypothetical language models that feel the wordiness of certain kinds of words? Maybe verbs feel sharp and adjectives feel soft. We haven’t got a theory of qualia that would rule this out. I’d urge interested readers to read more about functionalism, probably our best current theory in the philosophy of mind; I think it puts many of these problems in perspective.
Edit: An excellent study recently came to my attention showing that when GPT-2 is taught to play chess by receiving the moves of games (in text form) as input, it knows where the pieces are; that is to say, it contains a model of the board state at any given time (“Chess as a Testbed for Language Model State Tracking”, 2021). As the authors of that paper suggest, this is a toy case that gives us evidence that these word machines work by world modeling.