The Design Space of Minds-In-General

Eliezer_Yudkowsky, 25 June 2008 06:37AM, 2 points

Followup to: The Psychological Unity of Humankind

People ask me, "What will Artificial Intelligences be like?  What will they do?  Tell us your amazing story about the future."

And lo, I say unto them, "You have asked me a trick question."

ATP synthase is a molecular machine - one of three known occasions when evolution has invented the freely rotating wheel - which is essentially the same in animal mitochondria, plant chloroplasts, and bacteria. ATP synthase has not changed significantly since the rise of eukaryotic life two billion years ago. It is something we all have in common, thanks to the way that evolution strongly conserves certain genes: once many other genes depend on a gene, a mutation to it will tend to break all the dependencies.

Any two AI designs might be less similar to each other than you are to a petunia.

Asking what "AIs" will do is a trick question because it implies that all AIs form a natural class. Humans do form a natural class because we all share the same brain architecture.  But when you say "Artificial Intelligence", you are referring to a vastly larger space of possibilities than when you say "human".  When people talk about "AIs" we are really talking about minds-in-general, or optimization processes in general.  Having a word for "AI" is like having a word for everything that isn't a duck.

Imagine a map of mind design space... this is one of my standard diagrams...

[Figure: Mindspace_2, the map of mind design space]

All humans, of course, fit into a tiny little dot - as a sexually reproducing species, we can't be too different from one another.

This tiny dot belongs to a wider ellipse, the space of transhuman mind designs - things that might be smarter than us, or much smarter than us, but which in some sense would still be people as we understand people.

This transhuman ellipse is within a still wider volume, the space of posthuman minds, which is everything that a transhuman might grow up into.

And then the rest of the sphere is the space of minds-in-general, including possible Artificial Intelligences so odd that they aren't even posthuman.

But wait - natural selection designs complex artifacts and selects among complex strategies.  So where is natural selection on this map?

So this entire map really floats in a still vaster space, the space of optimization processes.  At the bottom of this vaster space, below even humans, is natural selection as it first began in some tidal pool: mutate, replicate, and sometimes die, no sex.

Are there any powerful optimization processes, with strength comparable to a human civilization or even a self-improving AI, which we would not recognize as minds?  Arguably Marcus Hutter's AIXI should go in this category: for a mind of infinite power, it's awfully stupid - poor thing can't even recognize itself in a mirror.  But that is a topic for another time.

My primary moral is to resist the temptation to generalize over all of mind design space.

If we focus on the bounded subspace of mind design space which contains all those minds whose makeup can be specified in a trillion bits or less, then every universal generalization that you make has two to the trillionth power chances to be falsified.

Conversely, every existential generalization - "there exists at least one mind such that X" - has two to the trillionth power chances to be true.

So you want to resist the temptation to say either that all minds do something, or that no minds do something.
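To make the counting concrete, here is a minimal sketch that shrinks the space from a trillion bits to twelve, so that every specification can actually be enumerated. The names `all_specs`, `holds_for_all`, `holds_for_some` and the two toy properties are assumptions made up for illustration, and a tuple of bits is of course not a mind; the point is only the asymmetry between the two kinds of claim.

```python
from itertools import product

N_BITS = 12  # toy stand-in for the trillion bits in the text (an assumption)


def all_specs(n_bits=N_BITS):
    """Yield every possible specification as a tuple of 0/1 bits."""
    return product((0, 1), repeat=n_bits)


def holds_for_all(prop, n_bits=N_BITS):
    """Universal claim: true only if no specification is a counterexample."""
    return all(prop(spec) for spec in all_specs(n_bits))


def holds_for_some(prop, n_bits=N_BITS):
    """Existential claim: true as soon as one specification is a witness."""
    return any(prop(spec) for spec in all_specs(n_bits))


# Hypothetical properties, chosen only for illustration.
def first_bit_set(spec):
    return spec[0] == 1


def every_bit_set(spec):
    return all(bit == 1 for bit in spec)


if __name__ == "__main__":
    print("specifications:", 2 ** N_BITS)
    print("all have first bit set: ", holds_for_all(first_bit_set))
    print("some have first bit set:", holds_for_some(first_bit_set))
    print("some have every bit set:", holds_for_some(every_bit_set))
```

A single counterexample sinks the universal claim, and a single witness out of the whole space rescues the existential one, even for a property as demanding as `every_bit_set`. Each extra bit doubles the number of places where a counterexample or a witness might be hiding.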

The main reason you could find yourself thinking that you know what a fully generic mind will (won't) do, is if you put yourself in that mind's shoes - imagine what you would do in that mind's place - and get back a generally wrong, anthropomorphic answer.  (Albeit that it is true in at least one case, since you are yourself an example.)  Or if you imagine a mind doing something, and then imagine the reasons you wouldn't do it - so that you imagine that a mind of that type can't exist, that the ghost in the machine will look over the corresponding source code and hand it back.

Somewhere in mind design space is at least one mind with almost any kind of logically consistent property you care to imagine.

And this matters because it emphasizes the importance of discussing what happens, lawfully, and why, as a causal result of a mind's particular constituent makeup; somewhere in mind design space is a mind that does it differently.

Of course you could always say that anything which doesn't do it your way, is "by definition" not a mind; after all, it's obviously stupid.  I've seen people try that one too.

Comments (29)

Tiiba2, 25 June 2008 07:13:17AM, 0 points

"everything that isn't a duck"

Muggles?

Shane_Legg, 25 June 2008 08:13:32AM, 0 points

@ Eli:

"Arguably Marcus Hutter's AIXI should go in this category: for a mind of infinite power, it's awfully stupid - poor thing can't even recognize itself in a mirror."

Have you (or somebody else) mathematically proven this?

(If you have then that's great and I'd like to see the proof, and I'll pass it on to Hutter because I'm sure he will be interested. A real proof. I say this because I see endless intuitions and opinions about Solomonoff induction and AIXI on the internet. Intuitions about models of super intelligent machines like AIXI just don't cut it. In my experience they very often don't do what you think they will.)

Eliezer_Yudkowsky, 25 June 2008 08:36:05AM, 0 points

Shane, there was a discussion about this on the AGI list way back when, "breaking AIXI-tl", in which e.g. this would be one of the more technical posts. I think I proved this at least as formally as you proved that FAI was impossible, in the proof that I refuted.

But of course this subject is going to take a separate post.

Shane_Legg, 25 June 2008 09:20:22AM, 0 points

@ Eli:

Yeah, my guess is that AIXI-tl can be broken. But AIXI? I'm pretty sure it can be broken in some senses, but whether these senses are very meaningful or significant, I don't know.

And yes, my "proof" that FAI would fail failed. But it also wasn't a formal proof. Kind of a lesson in that, don't you think?

So until I see a proof, I'll take your statement about AIXI being "awfully stupid" as just an opinion. It will be interesting to see if you can prove yourself to be smarter than AIXI (I assume you don't view yourself as below awfully stupid).

Roko, 25 June 2008 09:51:17AM, 0 points

I might pitch in with an intuition about whether AIXI can recognize itself in a mirror. If I understand the algorithm correctly, it would depend on the rewards you gave it, and the computational and time cost would depend on what sensory inputs and motor outputs you connected it to.

For example, if you ran AIXI on a computer connected to a webcam with a mirror in front of it, and rewarded it if and only if it printed "I recognize myself" on the screen, it would eventually learn to do this all the time. The time cost might be large, though.

Tim_Tyler, 25 June 2008 10:42:50AM, 0 points

What if AIXI accidentally bashed its own brains out before it tried that?

Will_Pearson, 25 June 2008 11:09:27AM, 0 points

Where would cats fit in the space? I would assume that they would be near humans, sharing as they do an amygdala, prefrontal cortex and cerebellum, with neurons that (I assume) fire at the same speed. Not sure about the abstract planning. Could you have done the psychological unity of the mammals for your previous article?

Silas, 25 June 2008 01:19:01PM, 0 points

Does anyone know the ratio of discussion of implementations of AIXI/-tl, to discussion of its theoretical properties? I've calculated it at about zero.

Shane_Legg, 25 June 2008 01:38:46PM, 0 points

@ Silas:

Given that AIXI is uncomputable, how is somebody going to discuss implementing it?

An approximation, sure, but an actual implementation?

bambi, 25 June 2008 02:32:11PM, 0 points

What do you mean by a mind?

All you have given us is that a mind is an optimization process. And: what a human brain does counts as a mind. Evolution does not count as a mind. AIXI may or may not count as a mind (?!).

I understand your desire not to "generalize", but can't we do better than this? Must we rely on Eliezer-sub-28-hunches to distinguish minds from non-minds?

Is the FAI you want to build a mind? That might sound like a dumb question, but why should it be a "mind", given what we want from it?

bambi, 25 June 2008 02:35:48PM, 0 points

Perhaps "mind" should just be tabooed. It doesn't seem to offer anything helpful, and leads to vast fuzzy confusion.

Roko, 25 June 2008 02:48:32PM, 0 points

@Tim Tyler: yeah, this is always an issue. And there is the issue that AIXI might kill the person giving it the rewards. [I'm being sloppy here: you can't implement an uncomputable algorithm on a physically real computer, so we should be talking about some kind of computable approximation to the algorithm being able to recognize "itself" in a mirror.]

Silas, 25 June 2008 03:42:44PM, 0 points

Shane_Legg: factoring in approximations, it's still about zero. I googled a lot hoping to find someone actually using some version of it, but only found the SIAI's blog's Python implementation of Solomonoff induction, which doesn't even compile on Windows.

poke, 25 June 2008 03:44:47PM, 0 points

So is the reason I should believe this space of minds-in-general exists at all going to come in a later post?

Shane_Legg, 25 June 2008 04:00:35PM, 1 point

@ Silas:

I assume you mean "doesn't run" (Python isn't normally a compiled language).

Regarding approximations of Solomonoff induction: it depends how broadly you want to interpret this statement. If we use a computable prior rather than the Solomonoff mixture, we recover normal Bayesian inference. If we define our prior to be uniform, for example by assuming that all models have the same complexity, then the result is maximum a posteriori (MAP) estimation, which in turn is related to maximum likelihood (ML) estimation. Relations can also be established to Minimum Message Length (MML), Minimum Description Length (MDL), and Maximum Entropy (ME) based prediction (see Chapter 5 of An Introduction to Kolmogorov Complexity and Its Applications by Li and Vitanyi, 1997).

In short, much of statistics and machine learning can be viewed as being computable approximations of Solomonoff induction.
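The uniform-prior case mentioned above can be sketched in a few lines. This is only a toy under stated assumptions: the hypothesis class is five made-up coin biases, the data are ten invented flips, and `ml_estimate`, `map_estimate` and both priors are hypothetical names, not anything taken from Li and Vitanyi or from an AIXI implementation.

```python
from math import log

# Hypothetical hypothesis class: candidate probabilities of heads.
HYPOTHESES = [0.1, 0.3, 0.5, 0.7, 0.9]


def log_likelihood(theta, heads, tails):
    """Log-probability of the observed flips if the coin has bias theta."""
    return heads * log(theta) + tails * log(1 - theta)


def ml_estimate(heads, tails):
    """Maximum likelihood: pick the hypothesis that best explains the data."""
    return max(HYPOTHESES, key=lambda th: log_likelihood(th, heads, tails))


def map_estimate(log_prior, heads, tails):
    """Maximum a posteriori: weigh the likelihood by a prior over hypotheses."""
    return max(
        HYPOTHESES,
        key=lambda th: log_prior[th] + log_likelihood(th, heads, tails),
    )


# Uniform prior: every hypothesis is treated as equally complex.
UNIFORM = {th: log(1 / len(HYPOTHESES)) for th in HYPOTHESES}

# Made-up non-uniform prior that strongly favors the fair coin.
SKEWED = {th: log(0.96 if th == 0.5 else 0.01) for th in HYPOTHESES}

if __name__ == "__main__":
    heads, tails = 7, 3  # invented data
    print("ML:            ", ml_estimate(heads, tails))
    print("MAP (uniform): ", map_estimate(UNIFORM, heads, tails))
    print("MAP (skewed):  ", map_estimate(SKEWED, heads, tails))
```

Under the uniform prior the prior term is the same constant for every hypothesis, so MAP selects the same hypothesis as ML. A prior that encodes a preference among hypotheses, as a complexity-weighted prior would, can pull the answer elsewhere.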

Phil_Goetz4, 25 June 2008 04:23:22PM, 0 points

The larger point, that the space of possible minds is very large, is correct.

The argument used involving ATP synthase is invalid. ATP synthase is a building block. Life on earth is all built using roughly the same set of Legos. But Legos are very versatile.

Here is an analogous argument that is obviously incorrect:

People ask me, "What is world literature like? What desires and ambitions, and comedies and tragedies, do people write about in other languages?"

And lo, I say unto them, "You have asked me a trick question."

"the" is a determiner which is identical in English poems, novels, and legal documents. It has not changed significantly since the rise of modern English in the 17th century. It's is something that every English document has in common.

Any two works of literature from different countries might be less similar to each other than Hamlet is to a restaurant menu.

Caledonian2, 25 June 2008 05:01:13PM, 0 points

I would point out, Mr. Goetz, that some languages do not have a "the".

It is not clear how this changes the content of things people say or write in those languages. Whorf-Sapir, while disproven in the technical sense, is surprisingly difficult to abolish.

Cyan2, 25 June 2008 05:18:31PM, 0 points

Phil, I'm not really sure what your criticism has to do with what Eliezer wrote. He's saying that evolution is contingent -- bits that work can get locked into place because other bits rely on them. Eliezer asserts that AI design is not contingent in this manner, so the space of possible AI designs does not form a natural class, unlike the space of realized Earth-based lifeforms. Your objection is... what, precisely?

bambi, 25 June 2008 05:24:56PM, 0 points

Silas: you might find this paper of some interest:

http://www.agiri.org/docs/ComputationalApproximation.pdf

Lincoln_Cannon, 25 June 2008 05:45:57PM, 0 points

Eliezer, do you intend your use of "artificial intelligence" to be understood as always referencing something with human origins? What does it mean to you to place some artificial intelligences outside the scope of posthuman mindspace? Do you trust that human origins are capable of producing all possible artificial intelligences?

Unknown, 25 June 2008 06:07:08PM, 0 points

Phil Goetz was not saying that all languages have the word "the." He said that the word "the" is something every ENGLISH document has in common. His criticism is that this does not mean that Hamlet is more similar to an English restaurant menu than an English novel is to a Russian novel. Likewise, Eliezer's argument does not show that we are more like petunias than like an AI.

komponisto2, 25 June 2008 06:54:07PM, 0 points

Caledonian, Sapir-Whorf becomes trivial to abolish once you regard language in the correct way: as an evolved tool for inducing thoughts in others' minds, rather than a sort of Platonic structure in terms of which thought is necessarily organized.

Phil, I don't see how the argument is obviously incorrect. Why can't two works of literature from different cultures be as different from each other as Hamlet is from a restaurant menu?

Cyan2, 25 June 2008 06:58:17PM, 0 points

Unknown, okay, I see it now. Thanks.

"The word 'the' is something every English document has in common."

... except for Gadsby, A Void, and other lipograms. ;-)

Phil_Goetz4, 25 June 2008 10:18:13PM, 0 points

"Phil, I don't see how the argument is obviously incorrect. Why can't two works of literature from different cultures be as different from each other as Hamlet is from a restaurant menu?"

They could be, but usually aren't. "World literature" is a valid category.

Fabio_Franco, 25 June 2008 11:01:42PM, 0 points

This discussion reminds me of Frithjof Schuon's "The Transcendent Unity of Religions", in which he argues that a metaphysical unity exists which transcends the manifest world and which can be "univocally described by none and concretely apprehended by few".

Nick_Tarleton, 25 June 2008 11:26:05PM, 0 points

poke: what are you trying to say? It "exists" in the same sense as the set of all integers, i.e. it's a natural and useful abstraction, regardless of what you think of it ontologically.

Psy-Kosh, 26 June 2008 03:38:22PM, 0 points

Sapir-Whorf is disproven? *Blinks* I thought only the strong form is disproven and that the weak form has significant support. (But, on the other hand, this isn't a field I'm familiar with at all, so go ahead and correct me...)

Unknown, 26 June 2008 04:24:04PM, 0 points

In regard to AIXI: One should consider more carefully the fact that any self-modifying AI can be exactly modeled by a non-self-modifying AI.

One should also consider the fact that no intelligent being can predict its own actions-- this is one of those extremely rare universals. But this doesn't mean that it can't recognize itself in a mirror, despite its inability to predict its actions.

jmmcd, 06 January 2009 05:44:46PM, 0 points

"If we focus on the bounded subspace of mind design space which contains all those minds whose makeup can be specified in a trillion bits or less, then every universal generalization that you make has two to the trillionth power chances to be falsified.

"Conversely, every existential generalization - 'there exists at least one mind such that X' - has two to the trillionth power chances to be true.

"So you want to resist the temptation to say either that all minds do something, or that no minds do something."

This is fine where X is a property which has a one-to-one correspondence with a particular bit in the mind's specification. For higher-level properties (perhaps emergent ones -- yes, I said it) this probabilistic argument is not convincing.

Consider the minds of specification-size 1 trillion. We can happily make the generalisation that none of them will be able to predict whether a given Turing machine halts. Yes, there are 2^trillion chances for this generalisation to be falsified, but we know it never will be.

But this generalisation is true of everything, not just "minds", so we haven't added to our knowledge. Well, let's try this generalisation instead: no mind's state will remain unchanged by a non-null input. This is not true of rocks, but is true of minds. Perhaps there are some other, more useful, things we can say about minds.

Apologies for resurrecting a months-old post. I'm new here.