Aiming at the Target

Eliezer Yudkowsky

26th Oct 2008

Previously in series: Belief in Intelligence

Previously, I spoke of that very strange epistemic position one can occupy, wherein you don't know exactly where Kasparov will move on the chessboard, and yet your state of knowledge about the game is very different than if you faced a random move-generator with the same subjective probability distribution - in particular, you expect Kasparov to win. I have beliefs about where Kasparov wants to steer the future, and beliefs about his power to do so.

Well, and how do I describe this knowledge, exactly?

In the case of chess, there's a simple function that classifies chess positions into wins for black, wins for white, and drawn games. If I know which side Kasparov is playing, I know the class of chess positions Kasparov is aiming for. (If I don't know which side Kasparov is playing, I can't predict whether black or white will win - which is not the same as confidently predicting a drawn game.)

More generally, I can describe motivations using a preference ordering. When I consider two potential outcomes, X and Y, I can say that I prefer X to Y; prefer Y to X; or find myself indifferent between them. I would write these relations as X > Y; X < Y; and X ~ Y.

Suppose that you have the ordering A < B ~ C < D ~ E. Then you like B more than A, and C more than A. {B, C}, belonging to the same class, seem equally desirable to you; you are indifferent between which of {B, C} you receive, though you would rather have either than A, and you would rather have something from the class {D, E} than {B, C}.
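As a minimal sketch, the ordering A < B ~ C < D ~ E can be encoded by giving each outcome an integer rank, with equal ranks standing for indifference. The names `RANK`, `compare`, and `indifference_classes` are illustrative assumptions, not notation used elsewhere in this series.

```python
# Hypothetical encoding of the ordering A < B ~ C < D ~ E:
# each outcome maps to a rank, and equal ranks mean indifference.
RANK = {"A": 0, "B": 1, "C": 1, "D": 2, "E": 2}


def compare(x: str, y: str) -> str:
    """Return '>', '<', or '~' for the relation between outcomes x and y."""
    if RANK[x] > RANK[y]:
        return ">"
    if RANK[x] < RANK[y]:
        return "<"
    return "~"


def indifference_classes() -> list[set[str]]:
    """Group outcomes into classes of equal rank, least preferred first."""
    classes: dict[int, set[str]] = {}
    for outcome, rank in RANK.items():
        classes.setdefault(rank, set()).add(outcome)
    return [classes[r] for r in sorted(classes)]


print("B", compare("B", "A"), "A")
print("B", compare("B", "C"), "C")
print(indifference_classes())
```

Only the order of the ranks matters here; the gaps between the numbers carry no meaning.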

When I think you're a powerful intelligence, and I think I know something about your preferences, then I'll predict that you'll steer reality into regions that are higher in your preference ordering.

Think of a huge circle containing all possible outcomes, such that outcomes higher in your preference ordering appear to be closer to the center. Outcomes between which you are indifferent are the same distance from the center - imagine concentric rings of outcomes that are all equally preferred. If you aim your actions and strike a consequence close to the center - an outcome that ranks high in your preference ordering - then I'll think better of your ability to aim.

The more intelligent I believe you are, the more probability I'll concentrate into outcomes that I believe are higher in your preference ordering - that is, the more I'll expect you to achieve a good outcome, and the better I'll expect the outcome to be. Even if a powerful enemy opposes you, so that I expect the final outcome to be one that is low in your preference ordering, I'll still expect you to lose less badly if I think you're more intelligent.

What about expected utilities as opposed to preference orderings? To talk about these, you have to attribute a probability distribution to the actor, or to the environment - you can't just observe the outcome. If you have one of these probability distributions, then your knowledge of a utility function can let you guess at preferences between gambles (stochastic outcomes) and not just preferences between the outcomes themselves.
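To illustrate why a utility function says more than an ordering, here is a small sketch comparing a sure outcome against a coinflip gamble. The two utility assignments below are invented for illustration; both respect the ordering A < B ~ C < D ~ E, yet they disagree about the gamble.

```python
# Two hypothetical utility functions. Both respect A < B ~ C < D ~ E,
# but they differ in how much better D is than B.
UTILITY_1 = {"A": 0.0, "B": 1.0, "C": 1.0, "D": 3.0, "E": 3.0}
UTILITY_2 = {"A": 0.0, "B": 1.0, "C": 1.0, "D": 1.5, "E": 1.5}


def expected_utility(gamble: dict[str, float], utility: dict[str, float]) -> float:
    """Probability-weighted average utility of a gamble over outcomes."""
    if abs(sum(gamble.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * utility[outcome] for outcome, p in gamble.items())


sure_b = {"B": 1.0}                  # receive B for certain
coin_a_or_d = {"A": 0.5, "D": 0.5}   # fair coin between A and D

for name, utility in (("utility 1", UTILITY_1), ("utility 2", UTILITY_2)):
    sure = expected_utility(sure_b, utility)
    coin = expected_utility(coin_a_or_d, utility)
    choice = "the coinflip" if coin > sure else "the sure B"
    print(f"{name}: sure B = {sure}, coinflip = {coin}, prefers {choice}")
```

Observing which outcomes an agent ranks above others cannot distinguish these two agents; observing their choices between gambles can.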

The "aiming at the target" metaphor - and the notion of measuring how closely we hit - extends beyond just terminal outcomes, to the forms of instrumental devices and instrumental plans.

Consider a car - say, a Toyota Corolla. The Toyota Corolla is made up of some number of atoms - say, on the (very) rough order of ten to the twenty-ninth. If you consider all the possible ways we could arrange those 10²⁹ atoms, it's clear that only an infinitesimally tiny fraction of possible configurations would qualify as a working car. If you picked a random configuration of 10²⁹ atoms once per Planck time, many ages of the universe would pass before you hit on a wheeled wagon, let alone an internal combustion engine.

(When I talk about this in front of a popular audience, someone usually asks: "But isn't this what the creationists argue? That if you took a bunch of atoms and put them in a box and shook them up, it would be astonishingly improbable for a fully functioning rabbit to fall out?" But the logical flaw in the creationists' argument is not the claim that randomly reconfiguring molecules would be astonishingly unlikely to assemble a rabbit by pure chance; that much is true. The logical flaw is that the argument ignores a process, natural selection, which, through the non-chance retention of chance mutations, selectively accumulates complexity, until a few billion years later it produces a rabbit. Only the very first replicator in the history of time needed to pop out of the random shaking of molecules - perhaps a short RNA string, though there are more sophisticated hypotheses about autocatalytic hypercycles of chemistry.)

Even restricting our attention to running vehicles, there is an astronomically huge design space of possible vehicles that could be composed of the same atoms as the Corolla, and most of them, from the perspective of a human user, won't work quite as well. We could take the parts in the Corolla's air conditioner, and mix them up in thousands of possible configurations; nearly all these configurations would result in a vehicle lower in our preference ordering, still recognizable as a car but lacking a working air conditioner.

So there are many more configurations corresponding to nonvehicles, or vehicles lower in our preference ranking, than vehicles ranked greater than or equal to the Corolla.

A tiny fraction of the design space does describe vehicles that we would recognize as faster, more efficient, and safer than the Corolla. Thus the Corolla is not optimal under our preferences, nor under the designer's own goals. The Corolla is, however, optimized, because the designer had to hit an infinitesimal target in design space just to create a working car, let alone a car of Corolla-equivalent quality. The subspace of working vehicles is dwarfed by the space of all possible molecular configurations for the same atoms. You cannot build so much as an effective wagon by sawing boards into random shapes and nailing them together according to coinflips. To hit such a tiny target in configuration space requires a powerful optimization process. The better the car you want, the more optimization pressure you have to exert - though you need a huge optimization pressure just to get a car at all.

This whole discussion assumes implicitly that the designer of the Corolla was trying to produce a "vehicle", a means of travel. This assumption deserves to be made explicit, but it is not wrong, and it is highly useful in understanding the Corolla.

Planning also involves hitting tiny targets in a huge search space. On a 19-by-19 Go board there are roughly 2 × 10¹⁷⁰ legal positions (not counting superkos). In the early positions of a Go game there are more than 300 legal moves per turn. The search space explodes, and nearly all moves are foolish ones if your goal is to win the game. From all the vast space of Go possibilities, a Go player seeks out the infinitesimal fraction of plans which have a decent chance of winning.
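A rough piece of arithmetic shows how fast the space of Go plans grows. The sketch below counts opening move sequences under a simplifying assumption that is mine, not part of the rules of Go: no captures, ko, or suicide restrictions arise in the first few moves, so the k-th move has 361 - (k - 1) empty points to choose from. The function name `opening_sequences` is hypothetical.

```python
import math

BOARD_POINTS = 19 * 19  # 361 intersections on a 19-by-19 board


def opening_sequences(plies: int) -> int:
    """Count ordered ways to place `plies` stones on distinct points.

    Assumption: no captures, ko, or suicide restrictions this early, so
    the k-th move has 361 - (k - 1) choices. This is an approximation.
    """
    return math.perm(BOARD_POINTS, plies)


for plies in (1, 2, 4, 8, 16):
    count = opening_sequences(plies)
    # Report only the order of magnitude (digit count minus one).
    print(f"{plies:>2} plies: about 10^{len(str(count)) - 1} sequences")
```

Sixteen plies is still only the opening of a game, and the count already runs past anything that could be enumerated move by move.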

You cannot even drive to the supermarket without planning - it will take you a long, long time to arrive if you make random turns at each intersection. The set of turn sequences that will take you to the supermarket is a tiny subset of the space of turn sequences. Note that the subset of turn sequences we're seeking is defined by its consequence - the target - the destination. Within that subset, we care about other things, like the driving distance. (There are plans that would take us to the supermarket in a huge pointless loop-the-loop.)
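The supermarket example can be sketched on a toy street grid. Everything here is an illustrative assumption: the city is an unbounded grid, each move is one block north, east, south, or west, and the supermarket sits two blocks east and one block north of home. The code enumerates every move sequence up to five blocks long and keeps the ones defined by their consequence, namely ending at the supermarket.

```python
from itertools import product

# Hypothetical toy city: an unbounded street grid, home at (0, 0),
# supermarket two blocks east and one block north.
MOVES = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
HOME = (0, 0)
SUPERMARKET = (2, 1)


def destination(plan: tuple[str, ...]) -> tuple[int, int]:
    """Follow a sequence of one-block moves from HOME; return the endpoint."""
    x, y = HOME
    for move in plan:
        dx, dy = MOVES[move]
        x, y = x + dx, y + dy
    return (x, y)


def all_plans(max_len: int):
    """Yield every move sequence of length 1..max_len."""
    for n in range(1, max_len + 1):
        yield from product(MOVES, repeat=n)


def plans_reaching_target(max_len: int) -> list[tuple[str, ...]]:
    """Keep only the plans whose consequence is ending at SUPERMARKET."""
    return [p for p in all_plans(max_len) if destination(p) == SUPERMARKET]


MAX_LEN = 5
total = sum(len(MOVES) ** n for n in range(1, MAX_LEN + 1))
hits = plans_reaching_target(MAX_LEN)
shortest = min(hits, key=len)  # within the target subset, prefer less driving

print(f"{len(hits)} of {total} plans end at the supermarket")
print("a shortest plan:", "".join(shortest))
```

Even in this tiny city only a few percent of plans hit the target, and most of those that do include a pointless detour; the fraction shrinks further as the destination moves farther away.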

In general, as you live your life, you try to steer reality into a particular region of possible futures. When you buy a Corolla, you do it because you want to drive to the supermarket. You drive to the supermarket to buy food, which is a step in a larger strategy to avoid starving. All else being equal, you prefer possible futures in which you are alive, rather than dead of starvation.

When you drive to the supermarket, you aren't really aiming for the supermarket; you're aiming for a region of possible futures in which you don't starve. Each turn at each intersection doesn't carry you toward the supermarket; it carries you out of the region of possible futures where you lie helplessly starving in your apartment. If you knew the supermarket was empty, you wouldn't bother driving there. An empty supermarket would occupy exactly the same place on your map of the city, but it wouldn't occupy the same role in your map of possible futures. It is not a location within the city that you are really aiming at, when you drive.

Human intelligence is one kind of powerful optimization process, capable of winning a game of Go or turning sand into digital computers. Natural selection is much slower than human intelligence; but over geological time, cumulative selection pressure qualifies as a powerful optimization process.

Once upon a time, human beings anthropomorphized stars, saw constellations in the sky and battles between constellations. But though stars burn longer and brighter than any craft of biology or human artifice, stars are neither optimization processes, nor products of strong optimization pressures. The stars are not gods; there is no true power in them.
