Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

How An Algorithm Feels From Inside

87 Post author: Eliezer_Yudkowsky 11 February 2008 02:35AM

Followup to: Neural Categories

"If a tree falls in the forest, and no one hears it, does it make a sound?"  I remember seeing an actual argument get started on this subject—a fully naive argument that went nowhere near Berkeleyan subjectivism.  Just:

"It makes a sound, just like any other falling tree!"
"But how can there be a sound that no one hears?"

The standard rationalist view would be that the first person is speaking as if "sound" means acoustic vibrations in the air; the second person is speaking as if "sound" means an auditory experience in a brain.  If you ask "Are there acoustic vibrations?" or "Are there auditory experiences?", the answer is at once obvious.  And so the argument is really about the definition of the word "sound".

I think the standard analysis is essentially correct.  So let's accept that as a premise, and ask:  Why do people get into such an argument?  What's the underlying psychology?

A key idea of the heuristics and biases program is that mistakes are often more revealing of cognition than correct answers.  Getting into a heated dispute about whether, if a tree falls in a deserted forest, it makes a sound, is traditionally considered a mistake.

So what kind of mind design corresponds to that error?

In Disguised Queries I introduced the blegg/rube classification task, in which Susan the Senior Sorter explains that your job is to sort objects coming off a conveyor belt, putting the blue eggs or "bleggs" into one bin, and the red cubes or "rubes" into the rube bin.  This, it turns out, is because bleggs contain small nuggets of vanadium ore, and rubes contain small shreds of palladium, both of which are useful industrially.

Except that around 2% of blue egg-shaped objects contain palladium instead.  So if you find a blue egg-shaped thing that contains palladium, should you call it a "rube" instead?  You're going to put it in the rube bin—why not call it a "rube"?

But when you switch off the light, nearly all bleggs glow faintly in the dark.  And blue egg-shaped objects that contain palladium are just as likely to glow in the dark as any other blue egg-shaped object.

So if you find a blue egg-shaped object that contains palladium, and you ask "Is it a blegg?", the answer depends on what you have to do with the answer:  If you ask "Which bin does the object go in?", then you choose as if the object is a rube.  But if you ask "If I turn off the light, will it glow?", you predict as if the object is a blegg.  In one case, the question "Is it a blegg?" stands in for the disguised query, "Which bin does it go in?".  In the other case, the question "Is it a blegg?" stands in for the disguised query, "Will it glow in the dark?"
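As a minimal sketch of the two disguised queries, the odd object can be run through two separate functions, one per practical question. The class name `Thing`, the field names, and the rule that glowing is predicted from appearance alone are illustrative assumptions, not part of the sorting task as stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    # Hypothetical record of three observables; field names are illustrative.
    color: str   # "blue" or "red"
    shape: str   # "egg" or "cube"
    metal: str   # "vanadium" or "palladium"


def which_bin(obj: Thing) -> str:
    # Sorting exists to separate the metals, so the metal decides the bin.
    return "blegg bin" if obj.metal == "vanadium" else "rube bin"


def predicted_to_glow(obj: Thing) -> bool:
    # Simplifying assumption: glowing is predicted from appearance only,
    # because palladium-bearing blue eggs glow as often as other blue eggs.
    return obj.color == "blue" and obj.shape == "egg"


odd_one = Thing(color="blue", shape="egg", metal="palladium")
print(which_bin(odd_one), predicted_to_glow(odd_one))
```

Neither function ever needs to decide whether `odd_one` "is a blegg"; each answers the query that the word was standing in for.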

Now suppose that you have an object that is blue and egg-shaped and contains palladium; and you have already observed that it is furred, flexible, opaque, and glows in the dark.

This answers every query, observes every observable introduced.  There's nothing left for a disguised query to stand for.

So why might someone feel an impulse to go on arguing whether the object is really a blegg?

[Figure Blegg3: diagrams of Network 1 and Network 2]

This diagram from Neural Categories shows two different neural networks that might be used to answer questions about bleggs and rubes.  Network 1 has a number of disadvantages, such as potentially oscillating/chaotic behavior or requiring O(N^2) connections, but Network 1's structure does have one major advantage over Network 2:  Every unit in the network corresponds to a testable query.  If you observe every observable, clamping every value, there are no units in the network left over.

Network 2, however, is a far better candidate for being something vaguely like how the human brain works:  It's fast, cheap, scalable—and has an extra dangling unit in the center, whose activation can still vary, even after we've observed every single one of the surrounding nodes.

Which is to say that even after you know whether an object is blue or red, egg or cube, furred or smooth, bright or dark, and whether it contains vanadium or palladium, it feels like there's a leftover, unanswered question:  But is it really a blegg?
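The structural difference between the two networks can be sketched by counting connections and by listing which units are still free after every observable is clamped. The node labels (`color`, `shape`, `texture`, `luminance`, `interior`, `category`) and the helper names are assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical labels for the five observables named in the text.
OBSERVABLES = ["color", "shape", "texture", "luminance", "interior"]


def network1_edges(nodes):
    # Network 1: every observable is wired to every other one,
    # which gives N * (N - 1) / 2 undirected connections.
    return list(combinations(nodes, 2))


def network2_edges(nodes, hub="category"):
    # Network 2: every observable is wired only to one central unit,
    # which gives N connections.
    return [(hub, node) for node in nodes]


def unclamped_units(units, observed):
    # Units whose activation is not fixed by any observation.
    return [unit for unit in units if unit not in observed]


network1_units = list(OBSERVABLES)
network2_units = OBSERVABLES + ["category"]
observed = set(OBSERVABLES)  # we have looked at everything there is to look at

print(len(network1_edges(OBSERVABLES)), unclamped_units(network1_units, observed))
print(len(network2_edges(OBSERVABLES)), unclamped_units(network2_units, observed))
```

With five observables, Network 1 needs ten connections and has nothing left to vary, while Network 2 needs only five connections but keeps one free unit that no observation pins down.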

Usually, in our daily experience, acoustic vibrations and auditory experience go together.  But a tree falling in a deserted forest unbundles this common association.  And even after you know that the falling tree creates acoustic vibrations but not auditory experience, it feels like there's a leftover question:  Did it make a sound?

We know where Pluto is, and where it's going; we know Pluto's shape, and Pluto's mass—but is it a planet?

Now remember:  When you look at Network 2, as I've laid it out here, you're seeing the algorithm from the outside.  People don't think to themselves, "Should the central unit fire, or not?" any more than you think "Should neuron #12,234,320,242 in my visual cortex fire, or not?"

It takes a deliberate effort to visualize your brain from the outside—and then you still don't see your actual brain; you imagine what you think is there, hopefully based on science, but regardless, you don't have any direct access to neural network structures from introspection.  That's why the ancient Greeks didn't invent computational neuroscience.

When you look at Network 2, you are seeing from the outside; but the way that neural network structure feels from the inside, if you yourself are a brain running that algorithm, is that even after you know every characteristic of the object, you still find yourself wondering:  "But is it a blegg, or not?"

This is a great gap to cross, and I've seen it stop people in their tracks.  Because we don't instinctively see our intuitions as "intuitions", we just see them as the world.  When you look at a green cup, you don't think of yourself as seeing a picture reconstructed in your visual cortex—although that is what you are seeing—you just see a green cup.  You think, "Why, look, this cup is green," not, "The picture in my visual cortex of this cup is green."

And in the same way, when people argue over whether the falling tree makes a sound, or whether Pluto is a planet, they don't see themselves as arguing over whether a categorization should be active in their neural networks.  It seems like either the tree makes a sound, or not.

We know where Pluto is, and where it's going; we know Pluto's shape, and Pluto's mass—but is it a planet?  And yes, there were people who said this was a fight over definitions—but even that is a Network 2 sort of perspective, because you're arguing about how the central unit ought to be wired up.  If you were a mind constructed along the lines of Network 1, you wouldn't say "It depends on how you define 'planet'," you would just say, "Given that we know Pluto's orbit and shape and mass, there is no question left to ask."  Or, rather, that's how it would feel—it would feel like there was no question left—if you were a mind constructed along the lines of Network 1.

Before you can question your intuitions, you have to realize that what your mind's eye is looking at is an intuition—some cognitive algorithm, as seen from the inside—rather than a direct perception of the Way Things Really Are.

People cling to their intuitions, I think, not so much because they believe their cognitive algorithms are perfectly reliable, but because they can't see their intuitions as the way their cognitive algorithms happen to look from the inside.

And so everything you try to say about how the native cognitive algorithm goes astray, ends up being contrasted to their direct perception of the Way Things Really Are—and discarded as obviously wrong.

 

Part of the sequence A Human's Guide to Words

Next post: "Disputing Definitions"

Previous post: "Neural Categories"

Comments (77)

Comment author: Aaron4 11 February 2008 04:11:44AM -1 points [-]

For what it's worth, I've always responded to questions such as "Is Pluto a planet?" in a manner more similar to Network 1 than Network 2. The debate strikes me as borderline nonsensical.

Comment author: Origin64 04 November 2012 09:17:03PM 2 points [-]

Analytically, I'd have to agree, but the first thing that I say when I get this question is no. I explain that it depends on the definition, that we have a definition for planet, and that we know the characteristics of Pluto. Pluto doesn't match the requirements in the definition; ergo, not a planet.

Lots easier than trying to explain to someone they don't actually know what question they're asking, although it's of course a more elegant answer.

Comment author: Scott_Aaronson 11 February 2008 04:32:00AM 30 points [-]

While "reifying the internal nodes" must indeed be counted as one of the great design flaws of the human brain, I think the recognition of this flaw and the attempt to fight it are as old as history. How many jokes, folk sayings, literary quotations, etc. are based around this one flaw? "in name only," "looks like a duck, quacks like a duck," "by their fruits shall ye know them," "a rose by any other name"... Of course, there wouldn't be all these sayings if people didn't keep confusing labels with observable attributes in the first place -- but don't the sayings suggest that recognizing this bug in oneself or others doesn't require any neural-level understanding of cognition?

Comment author: Amanojack 11 March 2010 05:00:58PM 4 points [-]

Exactly. People merely need to keep in mind that words are not the concepts they represent. This is certainly not impossible, but - like all aspects of being rational - it's harder than it sounds.

Comment author: bigjeff5 31 January 2011 09:46:20PM 11 points [-]

I think it goes beyond words.

Reality does not consist of concepts, reality is simply reality. Concepts are how we describe reality. They are like words squared, and have all the same problems as words.

Comment author: Amanojack 27 April 2011 06:58:20PM *  6 points [-]

Looking back from a year later, I should have said, "Words are not the experiences they represent."

As for "reality," well it's just a name I give to a certain set of sensations I experience. I don't even know what "concepts" are anymore - probably just a general name for a bunch of different things, so not that useful at this level of analysis.

Comment author: Doug_S. 11 February 2008 04:35:50AM 9 points [-]

Well, is "Pluto is a planet" the right password, or not? ;)

Comment author: Eliezer_Yudkowsky 11 February 2008 04:46:38AM 14 points [-]

Don't the sayings suggest that recognizing this bug in oneself or others doesn't require any neural-level understanding of cognition?

Clearly, bug-recognition at the level described in this blog post does not so require, because I have no idea what the biological circuitry that actually recognizes a tiger looks like, though I know it happens in the temporal lobe.

Comment author: JulianMorrison 11 February 2008 05:44:30AM 0 points [-]

Given that this bug relates to neural structure on an abstract, rather than biological level, I wonder if it's a cognitive universal beyond just humans? Would any pragmatic AGI built out of neurons necessarily have the same bias?

Comment author: moshez 24 December 2012 09:59:54PM 2 points [-]

The same bias to...what? From the inside, the AI might feel "conflicted" or "weirded out" by a yellow, furry, ellipsoid shaped object, but that's not necessarily a bug: maybe this feeling accumulates and eventually results in creating new sub-categories. The AI won't necessarily get into the argument about definitions, because while part of that argument comes from the neural architecture above, the other part comes from the need to win arguments -- and the evolutionary bias for humans to win arguments would not be present in most AI designs.

Comment author: Ben_Jones 11 February 2008 10:21:57AM 1 point [-]

I've always been vaguely aware of this, but never seen it laid out this clearly - good post. The more you think about it, the more ridiculous it seems. "No, we can know whether it's a planet or not! We just have to know more about it!"

Scott, you forgot 'I yam what I yam and that's all what I yam'.

Comment author: Silas 11 February 2008 02:36:50PM 10 points [-]

At risk of sounding ignorant, it's not clear to me how Network 1, or the networks in the prerequisite blog post, actually work. I know I'm supposed to already have a superficial understanding of neural networks, and I do, but it wasn't immediately obvious to me what happens in Network 1, what the algorithm is. Before you roll your eyes, yes, I looked at the Artificial Neural Network Wikipedia page, but it still doesn't help in determining what yours means.

Comment author: gmaxwell 08 September 2010 04:55:52AM *  1 point [-]

Network 1 would work just fine (ignoring how you'd go about training such a thing). Each of the N^2 edges has a weight expressing the relationship of the vertices it connects. E.g. if nodes A and B are strongly anti-correlated the weight between them might be -1. You then fix the nodes you know and then either solve the system analytically or through numerical iteration until it settles down (hopefully!) and then you have expectations for all the unknowns.

Typical networks for this sort of thing don't have cycles so stability isn't a question, but that doesn't mean that networks with cycles can't work and reach stable solutions. Some error correcting codes have graph representations that aren't much better than this. :)
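The clamp-and-iterate procedure described in this comment can be sketched in a few lines. Everything specific here is an assumption made for illustration: the node names, the hand-set weights of +1 and -1, the encoding of states as +1, -1, or 0 for unknown, and the sign-based update rule.

```python
NODES = ["blue", "egg", "furred", "glows", "palladium"]


def weight(a, b):
    # Hand-set, hypothetical weights: features that tend to co-occur get +1,
    # and "palladium" is anti-correlated with the other four, so it gets -1.
    if a == b:
        return 0.0
    return -1.0 if "palladium" in (a, b) else 1.0


def settle(clamped, max_sweeps=20):
    # States are +1 (present), -1 (absent), or 0 (no expectation yet).
    state = {node: clamped.get(node, 0.0) for node in NODES}
    for _ in range(max_sweeps):
        changed = False
        for node in NODES:
            if node in clamped:
                continue  # known nodes stay fixed
            total = sum(weight(node, other) * state[other] for other in NODES)
            new_value = 1.0 if total > 0 else -1.0 if total < 0 else 0.0
            if new_value != state[node]:
                state[node] = new_value
                changed = True
        if not changed:
            return state  # a full sweep changed nothing: the network settled
    raise RuntimeError("network did not settle within max_sweeps")


result = settle({"blue": 1.0, "egg": 1.0, "furred": 1.0})
print(result["glows"], result["palladium"])
```

With three blegg-like features clamped, the two free nodes settle to "glows" and "not palladium", which is the expectation the weights encode. A cyclic network with less tidy weights could oscillate instead, which is why the sketch gives up after a fixed number of sweeps.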

Comment author: thomblake 24 October 2011 04:43:01PM 0 points [-]

Silas, I'm sure you've seen the answer by now, but for anyone who comes later, if you think of the diagrams above as Bayes Networks then you're on the right track.

Comment author: billswift 11 February 2008 03:22:06PM 5 points [-]

Silas, the diagrams are not neural networks, and don't represent them. They are graphs of the connections between observable characteristics of bleggs and rubes.

Comment author: Adam_Safron 11 February 2008 03:42:53PM 2 points [-]

Once again, great post.

Eliezer: "We know where Pluto is, and where it's going; we know Pluto's shape, and Pluto's mass - but is it a planet? And yes, there were people who said this was a fight over definitions..."

It was a fight over definitions. Astronomers were trying to update their nomenclature to better handle new data (large bodies in the Kuiper belt). Pluto wasn't quite like the other planets but it wasn't like the other asteroids either. So they called it a dwarf-planet. Seems pretty reasonable to me. http://en.wikipedia.org/wiki/Dwarf_planet

Comment author: Silas 11 February 2008 04:39:02PM 2 points [-]

billswift: Okay, if they're not neural networks, then there's no explanation of how they work, so I don't understand how to compare them all. How was I supposed to know from the posts how they work?

Comment author: tcpkac 11 February 2008 04:48:07PM 0 points [-]

Silas, billswift, Eliezer does say, introducing his diagrams in the Neural Categories post: "Then I might design a neural network that looks something like this:"

Comment author: Cyan2 11 February 2008 05:07:32PM 0 points [-]

Silas,

The keywords you need are "Hopfield network" and "Hebbian learning". MacKay's book has a section on them, starting on page 505.

Comment author: anonymous12 11 February 2008 08:57:24PM 4 points [-]

Silas, see Naive Bayes classifier for how an "observable characteristics graph" similar to Network 2 should work in theory. It's not clear whether Hopfield or Hebbian learning can implement this, though.

To put it simply, Network 2 makes the strong assumption that the only influence on features such as color or shape is whether the object is a rube or a blegg. This is an extremely strong assumption which is often inaccurate; despite this, naive Bayes classifiers work extremely well in practice.

Comment author: Eliezer_Yudkowsky 11 February 2008 09:18:30PM 6 points [-]

I was wondering if anyone would notice that Network 2 with logistic units was exactly equivalent to Naive Bayes.

To be precise, Naive Bayes assumes that within the blegg cluster, or within the rube cluster, all remaining variance in the characteristics is independent; or to put it another way, once we know whether an object is a blegg or a rube, this screens off any other information that its shape could tell us about its color. This isn't the same as assuming that the only causal influence on a blegg's shape is its blegg-ness - in fact, there may not be anything that corresponds to blegg-ness.

But one reason that Naive Bayes does work pretty well in practice, is that a lot of objects in the real world do have causal essences, like the way that cat DNA (which doesn't mix with dog DNA) is the causal essence that gives rise to all the surface characteristics that distinguish cats from dogs.

The other reason Naive Bayes works pretty well in practice is that it often successfully chops up a probability distribution into clusters even when the real causal structure looks nothing like a central influence.
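The equivalence between a Naive Bayes classifier and a single logistic unit can be checked numerically. The sketch below assumes binary features and uses made-up conditional probabilities; the table names and function names are hypothetical. The logistic unit's weights are the differences of log-likelihood ratios, and its bias absorbs the prior log-odds plus the "feature absent" terms.

```python
import math

FEATURES = ["blue", "egg", "furred", "glows", "vanadium"]
# Made-up illustrative probabilities P(feature present | category).
P_IF_BLEGG = {"blue": 0.97, "egg": 0.95, "furred": 0.90, "glows": 0.92, "vanadium": 0.98}
P_IF_RUBE = {"blue": 0.03, "egg": 0.05, "furred": 0.10, "glows": 0.08, "vanadium": 0.02}
PRIOR_BLEGG = 0.5


def naive_bayes_posterior(obs):
    # Multiply the prior by each feature's likelihood, treating the features
    # as independent once the category is known.
    joint_blegg = PRIOR_BLEGG
    joint_rube = 1.0 - PRIOR_BLEGG
    for f in FEATURES:
        joint_blegg *= P_IF_BLEGG[f] if obs[f] else 1.0 - P_IF_BLEGG[f]
        joint_rube *= P_IF_RUBE[f] if obs[f] else 1.0 - P_IF_RUBE[f]
    return joint_blegg / (joint_blegg + joint_rube)


def logistic_unit_posterior(obs):
    # z = bias + sum(weight_i * x_i), with x_i in {0, 1}, then a sigmoid.
    z = math.log(PRIOR_BLEGG / (1.0 - PRIOR_BLEGG))  # prior log-odds
    for f in FEATURES:
        absent_llr = math.log((1.0 - P_IF_BLEGG[f]) / (1.0 - P_IF_RUBE[f]))
        present_llr = math.log(P_IF_BLEGG[f] / P_IF_RUBE[f])
        weight = present_llr - absent_llr
        x = 1 if obs[f] else 0
        z += absent_llr + weight * x  # absent_llr is this feature's share of the bias
    return 1.0 / (1.0 + math.exp(-z))


odd_one = {"blue": True, "egg": True, "furred": True, "glows": True, "vanadium": False}
print(naive_bayes_posterior(odd_one), logistic_unit_posterior(odd_one))
```

The two functions agree up to floating-point error for any observation, because the sigmoid of the summed log-odds is just Bayes' rule written another way.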

Comment author: Cyan2 20 June 2008 10:58:43PM 1 point [-]

Silas,

The essential idea is that network 1 can be trained on a target pattern, and after training, it will converge to the target when initialized with a partial or distorted version of the target. Wikipedia's article on Hopfield networks has more.

Both types of networks can be used to predict observables given other observables. Network 1, being totally connected, is slower than network 2. But network 2 has a node which corresponds to no observable thing. It can leave one with the feeling that some question has not been completely answered even though all the observables have known states.
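Training on a target pattern and then recovering it from a distorted start can be sketched with a tiny Hopfield-style network and a Hebbian weight rule. The feature order, the +1/-1 encoding, and the function names are assumptions for illustration; this is a toy, not a claim about how Network 1 was meant to be trained.

```python
# Hypothetical feature order: blue, egg, furred, glows, palladium.
BLEGG = [1, 1, 1, 1, -1]
RUBE = [-1, -1, -1, -1, 1]


def train_hebbian(patterns):
    # Hebbian rule: strengthen w[i][j] when units i and j agree in a pattern,
    # weaken it when they disagree. No unit connects to itself.
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / len(patterns)
    return w


def recall(w, start, max_sweeps=20):
    # Update one unit at a time until a full sweep changes nothing.
    state = list(start)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            total = sum(w[i][j] * state[j] for j in range(len(state)))
            new_value = state[i] if total == 0 else (1 if total > 0 else -1)
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


weights = train_hebbian([BLEGG, RUBE])
distorted = [1, 1, -1, 1, -1]   # a blegg with the "furred" unit flipped
partial = [1, 1, 0, 0, 0]       # only color and shape observed
print(recall(weights, distorted), recall(weights, partial))
```

Both starting points settle on the stored blegg pattern. Every unit here is an observable, so once all five are known there is no unit left whose value is still open.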

Comment author: HalFinney 24 June 2008 11:04:32PM 5 points [-]

Silas, let me try to give you a little more explicit answer. This is how I think it is meant to work, although I agree that the description is rather unclear.

Each dot in the diagram is an "artificial neuron". This is a little machine that has N inputs and one output, all of which are numbers. It also has an internal "threshold" value, which is also a number. The way it works is it computes a "weighted sum" of its N inputs. That means that each input has a "weight", another number. It multiplies weight 1 times input 1, plus weight 2 times input 2, plus weight 3 times input 3, and so on, to get the weighted sum. (Note that weights can also be negative, so some inputs can lower the sum.) It then compares this with the threshold value. If the sum is greater than the threshold, it outputs 1, otherwise it outputs 0. If a neuron's output is a 1 we say it is "firing" or "activated".
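The single neuron described above fits in one small function. This is a minimal sketch; the function name and the example numbers are made up for illustration.

```python
def neuron_output(inputs, weights, threshold):
    # Weighted sum of the inputs, compared against the threshold.
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    # The neuron "fires" (outputs 1) only if the sum exceeds the threshold.
    return 1 if weighted_sum > threshold else 0


# Three inputs; the third weight is negative, so that input lowers the sum.
print(neuron_output([1, 0, 1], [0.5, 0.9, -0.2], threshold=0.2))
print(neuron_output([0, 1, 1], [0.5, 0.9, -0.2], threshold=1.0))
```

In the first call the weighted sum is 0.3, which is above the threshold of 0.2, so the neuron fires. In the second the sum is 0.7, which is below the threshold of 1.0, so it stays off.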

The diagram shows how the ANs are hooked up into a network, an ANN. Each neuron in Figure 1 has 5 inputs. 4 of them come from the other 4 neurons in the circuit and are represented by the lines. The 5th comes from the particular characteristic which is assigned to that neuron, i.e. color, luminance, etc. If the object has that property, that 5th input is a 1, else a 0. All of the connections in this network are bidirectional, so that neuron 1 receives input from neuron 2, while neuron 2 receives input from neuron 1, etc.

So to think about what this network does, we imagine inputting the 5 qualities which are observed about an object to the "5th" input of each of the 5 neurons. We imagine that the current output levels of all the neurons are set to something arbitrary, let's just say zero. And perhaps initially the weights and threshold values are also quite random.

When we give the neurons this activation pattern, some of them may end up firing and some may not, depending on how the weights and thresholds are set up. And once a neuron starts firing, that feeds into one of the inputs of the other 4 neurons, which may change their own state. That feeds back through the network as well. This may lead to oscillation or an unstable state, but hopefully it will settle down into some pattern.

Now, according to various rules, we will typically adjust the weights. There are different ways to do this, but I think the concept in this example is that we will try to make the output of each neuron match its "5th input", the object characteristic assigned to that neuron. We want the luminance neuron to activate when the object is luminous, and so on. So we increase weights that will tend to move the output in that direction, decrease weights that would move it the other way, tweak the thresholds a bit. We do this repeatedly with different objects, making small changes to the weights - this is "training" the network. Eventually it hopefully settles down and does pretty much what we want it to.

Now we can give it some wrong or ambiguous inputs, and ideally it will still produce the output that is supposed to go there. If we input 4 of the characteristics of a blegg, the 5th neuron will also show the blegg-style output. It has "learned" the characteristics of bleggs and rubes.

In the case of Network 2, the setup is simpler - each edge neuron has just 2 inputs: its unique observed characteristic, and a feedback value from the center neuron. Each one performs its weighted-sum trick and sends its output to the center one, which has its own set of weights and a threshold that determines whether it activates or not. In this case we want to teach the center one to distinguish bleggs from rubes, so we would train it that way - adjusting the weights a little bit at a time until we find it firing when it is a blegg but not when it is a rube.

Anyway, I know this is a long explanation but I didn't see anyone else making it explicit. Hopefully it is mostly correct.
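The "adjust the weights a little bit at a time" step for the center unit can be sketched with the classic perceptron update rule. This swaps in a specific, well-known rule where the comment leaves the method open, and it simplifies Network 2 by letting the center unit read the five observed characteristics directly, without the feedback connections. The training examples, learning rate, and function names are all made up for illustration.

```python
def predict(weights, threshold, x):
    # Same threshold neuron as before: fire if the weighted sum exceeds the threshold.
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > threshold else 0


def train_center_unit(examples, n_inputs, rate=0.1, max_epochs=100):
    # Perceptron rule: after each mistake, nudge the weights toward the
    # correct answer and move the threshold the opposite way.
    weights = [0.0] * n_inputs
    threshold = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in examples:
            error = target - predict(weights, threshold, x)
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
                threshold -= rate * error
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights, threshold


# Hypothetical objects: (blue, egg, furred, glows, vanadium) -> 1 for blegg, 0 for rube.
EXAMPLES = [
    ((1, 1, 1, 1, 1), 1),
    ((1, 1, 1, 0, 1), 1),
    ((1, 1, 0, 1, 1), 1),
    ((0, 0, 0, 0, 0), 0),
    ((0, 0, 1, 0, 0), 0),
    ((0, 0, 0, 1, 0), 0),
]

weights, threshold = train_center_unit(EXAMPLES, n_inputs=5)
print([predict(weights, threshold, x) for x, _ in EXAMPLES])
```

After training, the center unit fires for the blegg examples and stays off for the rube examples. Its output is a learned summary of the five inputs, not a sixth observation about the object.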

Comment author: PhilGoetz 22 August 2009 03:02:19AM 3 points [-]

I think the standard analysis is essentially correct. So let's accept that as a premise, and ask: Why do people get into such an argument? What's the underlying psychology?

I think that people historically got into this argument because they didn't know what sound was. It is a philosophical appendix, a vestigial argument that no longer has any interest.

Comment author: dconrad 28 October 2009 04:17:21AM 2 points [-]

The extra node in network 2 corresponds to assigning a label, an abstract term to the thing being reasoned about. I wonder if a being with a network-1 mind would have ever evolved intelligence. Assigning names to things, creating categories, allows us to reason about much more complex things. If the price we pay for that is occasionally getting into a confusing or pointless argument about "is it a rube or a blegg?" or "does a tree falling in a deserted forest make a sound?" or "is Pluto a planet?", that seems like a fair price to pay.

Comment author: David_Gerard 07 December 2010 06:09:20PM 1 point [-]

I tend to resolve this sort of "is it really an X?" issue with the question "what's it for?" This is similar to making a belief pay rent: why do you care if it's really an X?

Comment author: lukstafi 23 January 2011 11:49:39PM *  1 point [-]

I'm a little bit lazy and already clicked here from the reductionism article, is the philosophical claim that of a non-eliminative reductionism? Or does Eliezer render a more eliminativist variant of reductionism? (I'm not implying that there is a contradiction between quoted sources, only some amount of "tension".)

Comment author: Jaeson 02 June 2011 09:23:49PM 0 points [-]

Most of this is about word-association, multiple definitions of words, or not enough words to describe the situation.

In this case, a far more complicated Network setup would be required to describe the neural activity. Not only would you need the Network you have, but you would also need a second (or intermediate) network connecting sensory perceptions with certain words, and then yet another (or extended) network connecting those words with memory and cognitive associations with those words in the past. You could go on and on, by then also including the other words linked to those cognitive associations (and then the words associated with those, etc., etc.) In truth, even then, it would probably be a far more simplistic and less connected view than what is truly occurring in the brain.

What is occurring (90% of the time) with the "Tree argument" is multiple definitions (and associations) for one word. For instance, let's say 'quot' was a well-known English word for acoustic vibrations. Being a single word, with no other definitions, no one would ever (even when thinking) mistake it with the subjective experience of sound. People wouldn't ask 'If a tree falls, when no one is there, does it make a quot', because everyone would instantly associate the word 'quot' with the vibrations that must be made, and can be proven to exist, with or without people to listen to them (unless you are one of the few who claim the vibrations (or quots) do not exist, either). People also, then, would not ask if the tree made a sound, either, because they would instantly link the word 'sound' with the subjective experience, as the word would have no competing definition any longer (unless you are someone who claims the subjective experience of sound would still exist, even without a person [I've never met such a person, but chances are, they're out there]).

As for the question of whether or not it is a blegg, this example is mostly true to what you're saying, though word-association for the colors 'blue' and 'red' would also play a role. The word 'Blegg' has three of the letters 'blue' has, and thus people would probably be inclined to call something that looks blue a 'blegg' when given the choice. As for a 'Rube', this word has three letters and would be similar in pronunciation to 'Ruby'. This, also, would cause people to be more likely to say something is a 'Rube' if it is red, rather than if it was blue.

As for the question of Pluto being a planet (besides cultural bias by people who grew up calling it one), the argument lies in not enough people knowing the true definition (or else no set definition) of the word. From my understanding, planets are defined as things big enough to move a certain amount of other things around it in space. The evidence long ago showed that Pluto could do this, so it was called a planet. But now, the evidence says that Pluto cannot do this, so it is not a planet. If people asked 'Is Pluto big enough to move things?', the debate (if you could call it that) would be much different. People have known Pluto isn't a 'planet' for years, but only when they discovered the dwarf planet 'Eris' did they decide Pluto would have to go, or else books would soon be saying our Solar System had eleven planets (two of which actually being dwarf ones).

All of that being said, I enjoyed your writing very much, and agreed with much of it.

Comment author: TheatreAddict 09 July 2011 03:26:33AM 2 points [-]

So.. is this pretty much a result of our human brains wanting to classify something? Like, if something doesn't necessarily fit into a box that we can neatly file away, our brains puzzle where to classify it, when actually it is its own classification... if that makes sense?

Comment author: potato 18 September 2013 07:30:15PM *  4 points [-]

Except that around 2% of blue egg-shaped objects contain palladium instead. So if you find a blue egg-shaped thing that contains palladium, should you call it a "rube" instead? You're going to put it in the rube bin—why not call it a "rube"?

But when you switch off the light, nearly all bleggs glow faintly in the dark. And blue egg-shaped objects that contain palladium are just as likely to glow in the dark as any other blue egg-shaped object.

So if you find a blue egg-shaped object that contains palladium, and you ask "Is it a blegg?", the answer depends on what you have to do with the answer: If you ask "Which bin does the object go in?", then you choose as if the object is a rube. But if you ask "If I turn off the light, will it glow?", you predict as if the object is a blegg. In one case, the question "Is it a blegg?" stands in for the disguised query, "Which bin does it go in?". In the other case, the question "Is it a blegg?" stands in for the disguised query, "Will it glow in the dark?"

This is amazing, but too fast. It's too important and counterintuitive to do that fast, and we absolutely, devastatingly, painfully need it in philosophy departments. Please help us. This is an S.O.S.; our ship is sinking. Write this again longer, so that I can show it to people and change their minds. People who are not lesswrong literate. It's too important to go over that fast, anyway. I also ask that you, or anyone for that matter, find a simple real world example which has roughly analogous parameters to the ones you specified, and use that as the example instead. Somebody do it, please; I'm too busy arguing with philosophy professors about it, and there are better writers on this site that could take up the endeavor. It would be useful and well liked anyway, chances are, and I'll give what rewards I can.

Comment author: erikbjare 25 May 2014 06:59:11PM *  2 points [-]

There is a good quote by Alan Watts relating to the first paragraphs.

Problems that remain persistently insoluble should always be suspected as questions asked in the wrong way.

Comment author: Elund 21 October 2014 06:39:38AM 0 points [-]

I personally prefer names to be self-explanatory. Therefore, in this example I would consider a "blegg" to be a blue egg, regardless of its other qualities, and a "rube" to be a red cube, regardless of its other qualities. I suspect many other people would have a similar intuition.

Comment author: [deleted] 09 February 2015 11:48:58PM 0 points [-]

This article argues to the effect that the node categorising an unnamed category over 'Blegg' and 'Rube' ought to be got rid of, in favour of a thought-system with only the other five nodes. This brings up the following questions. Firstly, how are we to know which categorisations are the ones we ought to get rid of, and which are the ones we ought to keep? Secondly, why is it that some categorisations ought to be got rid of, and others ought not be?

So far as I can see, the article does not attempt to directly answer the first question (correct me if I am mistaken). The article does seem to try and answer the second question through some kind of Essentialism; that 'Blegg' and 'Rube' don't pick out real "kinds", whilst the other categorisations do. Is this the correct reading of the article? And how exactly would that type of Essentialism pan out?

Comment author: Jarogers326 07 October 2015 12:35:50PM 0 points [-]

I doubt I'd be able to fully grasp this if I had not first read hpmor, so thanks for that. Also, eggs vs ovals.