
"However many ways there may be of being alive, it is certain that there are vastly more ways of being dead."
        -- Richard Dawkins

In the coming days, I expect to be asked:  "Ah, but what do you mean by 'intelligence'?"  By way of untangling some of my dependency network for future posts, I here summarize some of my notions of "optimization".

Consider a car; say, a Toyota Corolla.  The Corolla is made up of some number of atoms; say, on the rough order of 10^29.  If you consider all possible ways to arrange 10^29 atoms, only an infinitesimally tiny fraction of possible configurations would qualify as a car; if you picked one random configuration per Planck interval, many ages of the universe would pass before you hit on a wheeled wagon, let alone an internal combustion engine.

Even restricting our attention to running vehicles, there is an astronomically huge design space of possible vehicles that could be composed of the same atoms as the Corolla, and most of them, from the perspective of a human user, won't work quite as well.  We could take the parts in the Corolla's air conditioner, and mix them up in thousands of possible configurations; nearly all these configurations would result in a vehicle lower in our preference ordering, still recognizable as a car but lacking a working air conditioner.

So there are many more configurations corresponding to nonvehicles, or vehicles lower in our preference ranking, than vehicles ranked greater than or equal to the Corolla.

Similarly with the problem of planning, which also involves hitting tiny targets in a huge search space.  Consider the number of possible legal chess moves versus the number of winning moves.

Which suggests one theoretical way to measure optimization - to quantify the power of a mind or mindlike process:

Put a measure on the state space - if it's discrete, you can just count.  Then collect all the states which are equal to or greater than the observed outcome, in that optimization process's implicit or explicit preference ordering.  Sum or integrate over the total size of all such states.  Divide by the total volume of the state space.  This gives you the power of the optimization process measured in terms of the improbabilities that it can produce - that is, improbability of a random selection producing an equally good result, relative to a measure and a preference ordering.

If you prefer, you can take the reciprocal of this improbability (1/1000 becomes 1000) and then take the logarithm base 2.  This gives you the power of the optimization process in bits.  An optimizer that exerts 20 bits of power can hit a target that's one in a million.
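To make the arithmetic concrete, here is a minimal Python sketch of the counting-measure version described above. The function name and the toy numbers are illustrative rather than anything from the post; it assumes a discrete state space where you can simply count the states at least as good as the observed outcome.

```python
import math

def optimization_power_bits(total_states, states_at_least_as_good):
    """Bits of optimization exerted: log2 of the reciprocal of the
    improbability that a random selection does at least as well.

    Assumes a discrete state space (counting measure) and a known
    preference ordering over outcomes.
    """
    improbability = states_at_least_as_good / total_states
    return math.log2(1.0 / improbability)

# Hitting a one-in-a-million target corresponds to roughly 20 bits:
print(optimization_power_bits(1_000_000, 1))  # ~19.93 bits
```

For a continuous state space, you would replace the two counts with the measure of the preferred region and the measure of the whole space.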

When I think you're a powerful intelligence, and I think I know something about your preferences, then I'll predict that you'll steer reality into regions that are higher in your preference ordering.   The more intelligent I believe you are, the more probability I'll concentrate into outcomes that I believe are higher in your preference ordering.

There are a number of subtleties here, some less obvious than others.  I'll return to this whole topic in a later sequence.  Meanwhile:

* A tiny fraction of the design space does describe vehicles that we would recognize as faster, more fuel-efficient, safer than the Corolla, so the Corolla is not optimal.  The Corolla is, however, optimized, because the human designer had to hit an infinitesimal target in design space just to create a working car, let alone a car of Corolla-equivalent quality.  This is not to be taken as praise of the Corolla, as such; you could say the same of the Hillman Minx. You can't build so much as a wooden wagon by sawing boards into random shapes and nailing them together according to coinflips.

* When I talk to a popular audience on this topic, someone usually says:  "But isn't this what the creationists argue?  That if you took a bunch of atoms and put them in a box and shook them up, it would be astonishingly improbable for a fully functioning rabbit to fall out?"  But the logical flaw in the creationists' argument is not that randomly reconfiguring molecules would by pure chance assemble a rabbit.  The logical flaw is that there is a process, natural selection, which, through the non-chance retention of chance mutations, selectively accumulates complexity, until a few billion years later it produces a rabbit.

* I once heard a senior mainstream AI type suggest that we might try to quantify the intelligence of an AI system in terms of its RAM, processing power, and sensory input bandwidth.  This at once reminded me of a quote from Dijkstra:  "If we wish to count lines of code, we should not regard them as 'lines produced' but as 'lines spent': the current conventional wisdom is so foolish as to book that count on the wrong side of the ledger."  If you want to measure the intelligence of a system, I would suggest measuring its optimization power as before, but then dividing by the resources used.  Or you might measure the degree of prior cognitive optimization required to achieve the same result using equal or fewer resources.  Intelligence, in other words, is efficient optimization.  This is why I say that evolution is stupid by human standards, even though we can't yet build a butterfly:  Human engineers use vastly less time/material resources than a global ecosystem of millions of species proceeding through biological evolution, and so we're catching up fast.

* The notion of a "powerful optimization process" is necessary and sufficient to a discussion about an Artificial Intelligence that could harm or benefit humanity on a global scale.  If you say that an AI is mechanical and therefore "not really intelligent", and it outputs an action sequence that hacks into the Internet, constructs molecular nanotechnology and wipes the solar system clean of human(e) intelligence, you are still dead.  Conversely, an AI that has only a very weak ability to steer the future into regions high in its preference ordering will not be able to much benefit or much harm humanity.

* How do you know a mind's preference ordering?  If this can't be taken for granted, then you use some of your evidence to infer the mind's preference ordering, and then use the inferred preferences to infer the mind's power, then use those two beliefs to testably predict future outcomes.  Or you can use the Minimum Message Length formulation of Occam's Razor: if you send me a message telling me what a mind wants and how powerful it is, then this should enable you to compress your description of future events and observations, so that the total message is shorter.  Otherwise there is no predictive benefit to viewing a system as an optimization process.

* In general, it is useful to think of a process as "optimizing" when it is easier to predict by thinking about its goals, than by trying to predict its exact internal state and exact actions.  If you're playing chess against Deep Blue, you will find it much easier to predict that Deep Blue will win (that is, the final board position will occupy the class of states previously labeled "wins for Deep Blue") than to predict the exact final board position or Deep Blue's exact sequence of moves.  Normally, it is not possible to predict, say, the final state of a billiards table after a shot, without extrapolating all the events along the way.

* Although the human cognitive architecture uses the same label "good" to reflect judgments about terminal values and instrumental values, this doesn't mean that all sufficiently powerful optimization processes share the same preference ordering.  Some possible minds will be steering the future into regions that are not good.

* If you came across alien machinery in space, then you might be able to infer the presence of optimization (and hence presumably powerful optimization processes standing behind it as a cause) without inferring the aliens' final goals, by way of noticing the fulfillment of convergent instrumental values.  You can look at cables through which large electrical currents are running, and be astonished to realize that the cables are flexible high-temperature high-amperage superconductors; an amazingly good solution to the subproblem of transporting electricity that is generated in a central location and used distantly.  You can assess this, even if you have no idea what the electricity is being used for.

* If you want to take probabilistic outcomes into account in judging a mind's wisdom, then you have to know or infer a utility function for the mind, not just a preference ranking for the optimization process.  Then you can ask how many possible plans would have equal or greater expected utility.  This assumes that you have some probability distribution, which you believe to be true; but if the other mind is smarter than you, it may have a better probability distribution, in which case you will underestimate its optimization power.  The chief sign of this would be if the mind consistently achieves higher average utility than the average expected utility you assign to its plans.  (A sketch of this expected-utility calculation follows this list.)

* When an optimization process seems to have an inconsistent preference ranking - for example, it's quite possible in evolutionary biology for allele A to beat out allele B, which beats allele C, which beats allele A - then you can't interpret the system as performing optimization as it churns through its cycles.  Intelligence is efficient optimization; churning through preference cycles is stupid, unless the interim states of churning have high terminal utility.

* For domains outside the small and formal, it is not possible to exactly measure optimization, just as it is not possible to do exact Bayesian updates or to perfectly maximize expected utility.  Nonetheless, optimization can be a useful concept, just like the concept of Bayesian probability or expected utility - it describes the ideal you're trying to approximate with other measures.
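As a companion to the bullet on probabilistic outcomes above, here is a hedged Python sketch of the expected-utility variant: count how many available plans have expected utility at least as high as the chosen plan, under your own probability distribution. The coin-bias setup, the plan set, and the utility function are all made-up illustrations, not anything from the post.

```python
import math

# Toy setup (purely illustrative): a "plan" is choosing the bias of a coin,
# the outcomes are heads/tails, and utility rewards heads.
plans = [i / 100 for i in range(101)]   # P(heads) in {0.00, 0.01, ..., 1.00}
outcomes = ("heads", "tails")

def prob(outcome, plan):
    return plan if outcome == "heads" else 1.0 - plan

def utility(outcome):
    return 1.0 if outcome == "heads" else 0.0

def expected_utility(plan):
    # Expected utility under *your* probability model; a smarter mind with a
    # better model may be underestimated by this calculation.
    return sum(prob(o, plan) * utility(o) for o in outcomes)

def optimization_power_bits(chosen_plan):
    """log2(#plans / #plans with expected utility >= the chosen plan's)."""
    target = expected_utility(chosen_plan)
    at_least_as_good = sum(1 for p in plans if expected_utility(p) >= target)
    return math.log2(len(plans) / at_least_as_good)

print(optimization_power_bits(1.00))  # only 1 of 101 plans does as well: ~6.7 bits
print(optimization_power_bits(0.50))  # 51 of 101 plans do at least as well: ~1.0 bit
```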


Without a narrow enough prior on the space of possible preferences, we might explain any behavior as the optimization of some preferences. It is interesting that this doesn't seem much of a problem in practice.

Robin, the circumstances under which a Bayesian will come to believe that a system is optimizing, are the same circumstances under which a message-length minimizer will send a message describing the system's "preferences": namely, when your beliefs about its preferences are capable of making ante facto predictions - or at least being more surprised by some outcomes than by others.

Most of the things we find it useful to describe as optimizers, have preferences that are stable over a longer timescale than the repeated observations we make of them. An inductive suspicion of such stability is enough of a prior to deal with aliens (or evolution). Also, different things of the same class often have similar preferences, like multiple humans.

Without a narrow enough prior on the space of possible preferences,

But we've already developed complex models of what 'intelligent' beings 'should' prefer - based on our own preferences. Try to explore beyond that intuitive model, and we become very uncomfortable.

When speculating about real intelligences, alien or artificial, it becomes very important that we justify the restrictions we place on the set of potential preferences we consider they might have.

Do you think it is possible to capture the idealized utility function of humans merely from factual knowledge about the world dynamics? Is it possible to find the concept of "right" without specifying even minimal instructions on how to find it in humans, by merely making the AI search for a simple utility function underlying the factual observations (given that it won't be remotely simple)? I started from this position several months back, but gradually shifted to requiring at least some anthropomorphic directions initiating the extraction of "right", establishing a path by which the representation of "right" in humans translates into the representation of utility computation in the AI. One of the stronger problems for me is that it's hard to distinguish between observing terminal and instrumental optimization performed in the current circumstances, and you can't start iterating towards volition without knowing even the approximate direction of transformations to the agent (or the whole environment, if the agent is initially undefined). It would be interesting if that's unnecessary, although of course plenty of safeguards are still in order, if there is an overarching mechanism to wash them out eventually.

If you attempt to quantify the "power" of an optimisation process - without any attempt to factor in the number of evaluations required, the time taken, or the resources used - the "best" algorithm is usually an exhaustive search.

Eli: I think that your analysis here, and the longer analysis presented in "knowability of FAI", misses a very important point. The singularity is a fundamentally different process than playing chess or building a saloon car. The important distinction is that in building a car, the car-maker's ontology is perfectly capable of representing all of the high-level properties of the desired state, but the instigators of the singularity are, by definition, lacking a sufficiently complex representation system to represent any of the important properties of the desired state: post-singularity Earth. You have had the insight required to see this: you posted about "dreams of XML in a universe of quantum mechanics" a couple of posts back. I posted about this on my blog, "ontologies, approximations and fundamentalists", too.

It suffices to say that an optimization process which takes place with respect to a fixed background ontology or set of states is fundamentally different to a process which I might call vari-optimization, where optimization and ontology change happen at the same time. The singularity (whether an AI singularity or non AI) will be of the latter type.

Roko, see "Optimization and the Singularity" and "DNA Consequentialism and Protein Reinforcement" on the concept of cross-domain optimization. Yes, this is part of what distinguishes human general intelligence and natural selection, from modern-day narrow-domain Artificial Intelligences and nonprimate organisms. However, this doesn't present any challenge to my criterion of optimization, except for the possibility of needing to reframe your terminal values for a new ontology (which I've also discussed in my metaethics, under "Unnatural Categories"). Instrumentally speaking, and assuming away the problem of unnatural categories, you're just finding new paths to the same goals, leading through portions of reality that you may not have suspected existed. Sort of like humans digging miles under the ground to obtain geothermal energy in order to light their homes - they're steering the future into the same region as did the maker of candles, but wending through parts of reality and laws of physics that ancient candle-makers didn't suspect existed.

Eliezer, this particular point you made is of concern to me: "* When an optimization process seems to have an inconsistent preference ranking - for example, it's quite possible in evolutionary biology for allele A to beat out allele B, which beats allele C, which beats allele A - then you can't interpret the system as performing optimization as it churns through its cycles. Intelligence is efficient optimization; churning through preference cycles is stupid, unless the interim states of churning have high terminal utility."

You see, it seems quite likely to me that humans evaluate utility in such a circular way under many circumstances, and therefore aren't performing any optimizations. Ask middle school girls to rank boyfriend preference and you find Billy beats Joey who beats Micky who beats Billy... Now, when you ask an AI to carry out an optimization of human utility based on observing how people optimize their own utility as evidence, what do you suppose will happen? Certainly humans optimize some things, sometimes, but optimizations of some things are at odds with others. Think how some people want both security and adventure. A man might have one (say security), be happy for a time, get bored, then move on to the other and repeat the cycle. Is optimization a flux of the two states? Or the one that gives the most utility over the other? I suppose you could take an integral of utility over time and find which set of states = max utility over time. How are we going to begin to define utility? "We like it! But it has to be real, no wire-heading." Now throw in the complication of different people having utility functions at odds with each other. Not everyone can be king of the world, no matter how much utility they will derive from this position. Now ask the machine to be efficient - do it as easily as possible, so that easier solutions are favored over more difficult "expensive" ones.

Even if we avoid all the pitfalls of 'misunderstanding' the initial command to 'optimize utility', what gives you reason to assume you or I or any of the small, small subsegment of the population that reads this blog is going to like what the vector sum of all human preferences, utilities, etc. coughs up?

Lara Foster, I would be interested to hear a really solid example of nontransitive utility preferences, if you can think of one.

@Lara_Foster: You see, it seems quite likely to me that humans evaluate utility in such a circular way under many circumstances, and therefore aren't performing any optimizations.

Eliezer touches on that issue in "Optimization and the Singularity":

Natural selection prefers more efficient replicators. Human intelligences have more complex preferences. Neither evolution nor humans have consistent utility functions, so viewing them as "optimization processes" is understood to be an approximation.

By the way: "Ask middle school girls to rank boyfriend preference and you find Billy beats Joey who beats Micky who beats Billy..."

Would you mind peeking into your mind and explaining why that arises? :-) Is it just a special case of the phenomenon you described in the rest of your post?

Those prone to envy bias tend to think that "the grass is greener on the other side of the fence". That idea can lead to circular preferences. Such preferences should not be very common - evolution should weed them out.

Does it make sense to speak of a really powerful optimisation process under this definition? Consider the man building the car: how good the car will be depends not just on him, but also on the general society around him. Have a person with the same genetics grow up in the early 1900s and today, and they will be able to hit radically different points in optimisation space - not due to their own ability, but due to what has been discovered and the information and economy around them.

Another example: give me a workshop, the internet and a junkyard, and I might be able to whip up a motor vehicle; strand me on a desert island, and the best I could come up with is probably a litter. Maybe a wooden bicycle.

Similarly for ideas: Newton's oft-quoted comment about standing on the shoulders of giants indicates that relying on other optimisation processes is also very useful.

I wouldn't assume a process seeming to churn through preference cycles to have an inconsistent preference ranking; it could be efficiently optimizing if each state provides diminishing returns. If every few hours a jailer offers either food, water or a good book, you don't pick the same choice each time!
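A quick illustrative sketch of this point (the goods, the values, and the 1/(k+1) discount are all made up): an agent with diminishing marginal utility for each good will rotate among the jailer's offers even though its underlying preference ranking is perfectly consistent.

```python
# Illustrative only: marginal utility of the k-th unit of a good is value / (k + 1).
base_value = {"food": 10.0, "water": 9.0, "book": 6.0}
held = {"food": 0, "water": 0, "book": 0}

def marginal_utility(good):
    return base_value[good] / (held[good] + 1)

choices = []
for _ in range(6):                                # six offers from the jailer
    pick = max(base_value, key=marginal_utility)  # greedy, consistent utility maximizer
    held[pick] += 1
    choices.append(pick)

print(choices)  # ['food', 'water', 'book', 'food', 'water', 'food'] -- rotation without circular preferences
```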

Will, it's for that reason, and also for the sake of describing natural selection, that I speak of "optimization processes" rather than just saying "optimizer". For humans to produce a nuclear power plant requires more than one human and more than one crew of humans; it takes an economy and a history of science, though a good deal of extra effort is concentrated at the end of the problem.

So can we ever standardise matters to say, for example, whether it requires 'better' or 'more' optimisation processes to build a wagon in 1850 than a car in 1950?

If you go around a cycle - and you are worse off than you would have been if you had stayed still - you may have circular preferences. If it happens repeatedly, a review is definitely in order.

I am suspicious of attempts to define intelligence for the following reason. Too often, they lead the definer down a narrow and ultimately fruitless path. If you define intelligence as the ability to perform some function XYZ, then you can sit down and start trying to hack together a system that does XYZ. Almost invariably this will result in a system that achieves some superficial imitation of XYZ and very little else.

Rather than attempting to define intelligence and move in a determined path toward that goal, we should look around for novel insights and explore their implications.

Imagine if Newton had followed the approach of "define physics and then move toward it". He may have decided that physics is the ability to build large structures (certainly an understanding of physics is helpful or required for this). He might then have spent all his time investigating the material properties of various kinds of stone - useful perhaps, but misses the big picture. Instead he looked around in the most unlikely places to find something interesting that had very little immediate practical application. That should be our mindset in pursuing AI: the scientist's, rather than the engineer's, approach.

A question: would you consider computers part of the dominant optimisation processes on earth already?

It is just that you often compare neurons to silicon as one of the things that will be very different for AI. But we already use silicon as part of our optimisation processes (modelling protein folding, weather systems, data mining), so the jump to pure silicon optimisation processes might not be as huge as you suggest with the comparison of neuron firing rates to processor speeds.

I suppose I am having trouble localising an optimisation process and rating its "power". Consider two identical computer systems, each running software that on the surface would enable it to be an RPOP, both locked underground on the moon; one has sufficient energy (chemical and heat gradients) to bootstrap to fusion or otherwise get itself out of the moon, and the other doesn't. They obviously will have a different ability to affect the future. Should I say that the energy reserves are part of the optimisation process rather than separating them away, so I can still say they are equally powerful? How much of the universe do you need to consider part of the process so that powerfulness(process) gives a unique answer?

One last question, if you create friendly AI, should I consider you as powerful an optimisation process as it?

I don't think characterising the power of an optimiser by using the size of the target region relative to the size of the total space is enough. A tiny target in a gigantic space is trivial to find if the space has a very simple structure with respect to your preferences. For example, a large smooth space with a gradient that points towards the optimum. Conversely, a bigger target on a smaller space can be practically impossible to find if there is little structure, or if the structure is deceptive.

@Eliezer: I think your definition of "optimization process" is a very good one, I just don't think that the technological singularity [will necessarily be]/[ought to be] an instance of it.

Eliezer: " ... you're just finding new paths to the same goals, leading through portions of reality that you may not have suspected existed."

This may be a point that we disagree on quite fundamentally. Has it occurred to you that one might introduce terminal values in a new, richer ontology which were not even possible to state in the old one? Surely: you're aware that most of the things that an adult human considers to be of terminal value are not stateable in the ontology of a 3 year old ("Loyalty to Libertarianism", "Mathematical Elegance", "Fine Literature", "Sexuality", ...); that most things a human considers to be of terminal value are not stateable in the ontology of a chimp.

I think that it is the possibility of finding new value-states which were simply unimaginable to an earlier version of oneself that attracts me to the transhumanist cause; if you cast the singularity as an optimization process you rule out this possibility from the start. An "optimization process"-based version of the singularity will land us in something like Iain M Banks' Culture, where human drives and desires are supersaturated by advanced technology, but nothing really new is done.

Furthermore, my desire to experience value-states that are simply unimaginable to the current version of me is simply not stateable as an optimization problem: for optimization always takes place over a known set of states (as you explained well in the OP).

Shane, that seems distantly like trying to compare two positions in different coordinate systems, without transforming one into the other first. Surely there is a transform that would convert the "hard" space into the terms of the "easy" space, so that the size of the targets could be compared apples to apples.

Will P: A question, would you consider computers as part of the dominant optimisation processes on earth already?

Will, to the extent that you can draw a category around discrete optimisation processes, I'd say a qualified 'yes'. Computers, insofar as they are built and used to approach human terminal values, are a part of the human optimisation process. Humanity is far more efficient in its processes than a hundred years back. The vast majority of this has something to do with silicon.

In response to your computers-on-the-moon (great counterfactual, by the way), I think I'd end up judging optimisation by its results. That said, I suppose how you measure efficiency depends on what that query is disguising. Intelligence/g? Reproductive speed? TIT-FOR-TAT-ness?

I read recently that the neanderthals, despite having the larger brain cavities, may well have gone under because our own ancestors simply bred faster. Who had the better optimisation process there?

The question of whether 'computer-optimisation' will ever be a separate process from 'human-optimisation' depends largely on your point of view. It seems as though a human-built computer should never spontaneously dream up its own terminal value. However, feel free to disagree with me when your DNA is being broken down to create molecular smiley faces.

Andy:

Sure, you can transform a problem in a hard coordinate space into an easy one. For example, simply order the points in terms of their desirability. That makes finding the optimum trivial: just point at the first element! The problem is that once you have transformed the hard problem into an easy one, you've essentially already solved the optimisation problem and thus it no longer tests the power of the optimiser.

Surely there is a transform that would convert the "hard" space into the terms of the "easy" space, so that the size of the targets could be compared apples to apples.

But isn't this the same as computing a different measure (i.e. not the counting measure) on the "hard" space? If so, you could normalize this to a probability measure, and then compute its Kullback-Leibler divergence to obtain a measure of information gain.
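A minimal sketch of the calculation this comment suggests, under the assumption that the comparison distribution is the uniform (counting) measure on a discrete space; the 4-state space and the weights are invented for illustration.

```python
import math

def kl_divergence_bits(p, q):
    """D_KL(p || q) in bits, for discrete distributions given as sequences of probabilities."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# An unnormalized measure on a 4-state "hard" space (made-up weights),
# normalized and compared against the uniform counting measure.
weights = [8.0, 4.0, 2.0, 2.0]
total = sum(weights)
p = [w / total for w in weights]          # normalized to a probability measure
q = [1.0 / len(weights)] * len(weights)   # uniform / counting measure

print(kl_divergence_bits(p, q))  # 0.25 bits of information gain over uniform
```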

That should be our mindset in pursuing AI: the scientist's, rather than the engineer's, approach.
Scientists do not predefine the results they wish to find. Predefining is what Eliezer's Friendly AI is all about.

Shane, I was basically agreeing with you with regard to problem spaces: normalizing space size isn't enough, you've also got to normalize whatever else makes them incomparable. However, let's not confuse problem space with state space. Eliezer focuses on the latter, which I think is pretty trivial compared to what you're alluding to.

It strikes me as odd to define intelligence in terms of ability to shape the world; among other things, this implies that if you amputate a man's limbs, he immediately becomes much less intelligent.

Nominull, I distinguished between optimization power and intelligence. Optimization power is ability to shape the world. Intelligence is efficiently using time/CPU/RAM/sensory/motor resources to optimize. Cut off a person's limbs and they become less powerful; but if they can manipulate the world to their bidding without their limbs, why, they must be very intelligent.

John Maxwell - I thought the security/adventure example was good, but that the way I portrayed it might make it seem that ever-alternating IS the answer. Here goes: A man lives as a bohemian out on the street, nomadically solving his day-to-day problems of how to get food and shelter. It seems to him that he would be better off looking for a secure life, and thus he gets a job to make money. Working for money for a secure life is difficult and tiring, and it seems to him that he will be better off once he has the money and is secure. Now he's worked a long time and has the money and is secure, which he now finds is boring both in comparison to working and to living a bohemian life with uncertainty in it. People do value uncertainty and 'authenticity' to a very high degree. Thus Being Secure is > Working to be secure > Not being secure > being secure.

Now, Eliezer would appropriately point out that the man only got trapped in this loop, because he didn't actually know what would make him happiest, but assumed without having the experience. But, that being said, do we think this fellow would have been satisfied being told to start with 'Don't bother working son, this is better for you, trust me!' There's no obvious reason to me why the fAI will allow people the autonomy they so desire to pursue their own mistakes unless the final calculation of human utility determines that it wins out, and this is dubious... I'm saying that I don't care if what in truth maximizes utility is for everyone to believe they're 19th century god-fearing farmers, or to be on a circular magic quest the memory of the earliest day of which disappears each day, such that it replays for eternity, or whatever simulation the fAI decides on for post-singularity humanity, I think I'd rather be free of it to fuck up my own life. Me and many others.

I guess this goes to another more important problem than human nonlinear preference- Why should we trust an AI that maximizes human utility, even if it understands what that means? Why should we, from where we sit now, like what human volition (a collection of non-linear preferences) extrapolates to, and what value do we place on our own autonomy?

Optimization power really ought to have something to do with the ability to solve optimization problems. The possible subsequent impact of those solutions on the world seems like a side issue.

Thus Being Secure is > Working to be secure > Not being secure > being secure.

As judged at different times, under different circumstances (having less or more money, being less or more burned out). This doesn't sound like a "real" intransitive preference.

whatever simulation the fAI decides on for post-singularity humanity, I think I'd rather be free of it to fuck up my own life. Me and many others.... Why should we trust an AI that maximizes human utility, even if it understands what that means?

But then, your freedom is a factor in deciding what's best for you. It sounds like you're thinking of an FAI as a well-intentioned but extremely arrogant human, who can't resist the temptation to meddle where it rationally shouldn't.

It's not about resisting temptation to meddle, but about what will, in fact, maximize human utility. The AI will not care whether utility is maximized by us or by it, as long as it is maximized (unless you want to program in 'autonomy' as an axiom, but I'm sure there are other problems with that). I think there is a high probability that, given its power, the fAI will determine that it can best maximize human utility by taking away human autonomy. It might give humans the illusion of autonomy in some circumstances, and lo and behold these people will be 'happier' than non-delusional people would be. Heck, what's to keep it from putting everyone in their own individual simulation? I was assuming some axiom that stated, 'no wire-heading', but it's very hard for me to even know what that means in a post-singularity context. I'm very skeptical of handing over control of my life to any dictatorial source of power, no matter how 'friendly' it's programmed to be. Now, if Eliezer is convinced it's a choice between his creation as dictator vs someone else's destroying the universe, then it is understandable why he is working towards the best dictator he can surmise... But I would rather not have either.

Oh, come on, Lara, did you really think I hadn't thought of that? One of the reasons why Friendly AI isn't trivial is that you need to describe human values like autonomy - "I want to optimize my own life, not have you do it for me" - whose decision-structure is nontrivial, e.g., you wouldn't want an AI choosing the exact life-course for you that maximized your autonomy.

What I always wonder is why we need to preserve human values like autonomy if we could produce better results without it? For example, if an AI could compute the absolute best way to perform a specific action then why is it a good thing to be able to choose a different way to perform the action?

'Better' is exactly why we'd want to retain (some) autonomy! I personally wouldn't want an AI to tell me exactly how to live my life.

Oh, come on, Eliezer, of course you thought of it. ;) However, it might not have been something that bothered you, as in- A) You didn't believe actually having autonomy mattered as long as people feel like they do (ie a Matrix/Nexus situation). I have heard this argued. Would it matter to you if you found out your whole life was a simulation? Some say no. I say yes. Matter of taste perhaps?

B) OR You find it self evident that 'real' autonomy would be extrapolated by the AI as something essential to human happiness, such that an intelligence observing people and maximizing our utility wouldn't need to be told 'allow autonomy.' This I would disagree with.

C) OR You recognize that this is a problem with a non-obvious solution to an AI, and thus intend to deal with it somehow in code ahead of time, before starting the volition-extrapolating AI. Your response indicates you feel this way. However, I am concerned even beyond setting an axiomatic function for 'allow autonomy' in a program. There are probably an infinite number of ways that an AI can find to carry out its stated function that will somehow 'game' our own system and lead to suboptimal or outright repugnant results (i.e. everyone being trapped in a permanent quest - maybe the AI avoids the problem of 'it has to be real' by actually creating a magic ring that needs to be thrown into a volcano every 6 years or so). You don't need me telling you that! Maximizing utility while deluding us about reality is only one. It seems impossible that we could axiomatically safeguard against all possibilities. Asimov was a pretty smart cookie, and his '3 laws' are certainly not sufficient. 'Eliezer's million lines of code' might cover a much larger range of AI failures, but how could you ever be sure? The whole project just seems insanely dangerous. Or are you going to address safety concerns in another post in this series?

What Morpheus and his crew gave Neo in the Matrix movie is not more autonomy, IMHO, but rather a much more complete model of reality.

Shane, this is an advanced topic; it would be covered under the topic of trying to compute the degree of optimization of the optimizer, and the topic of choosing a measure on the state space.

First, if you look at parts of the problem in a particular order according to your search process, that's somewhat like having a measure that gives large chunks of mass to the first options you search. If you were looking for your keys, then, all else being equal, you would search first in the places where you thought you were most likely to find your keys (or the easiest places to check, probability divided by cost, but forget that for the moment) - so there's something like a measure, in this case a probability measure, that corresponds to where you look first. Think of turning it the other way around, and saying that the points of largest measure correspond to the first places you search, whether because the solution is most likely to be there, or because the cost is lowest. These are the solutions we call "obvious" or "straightforward" as if they had high probability.

Second, suppose you were smarter than you are now and a better programmer, transhumanly so. Then for you, creating a chess program like Deep Blue (or one of the modern more efficient programs) might be as easy as computing the Fibonacci sequence. But the chess program would still be just as powerful as Deep Blue. It would be just as powerful an optimizer. Only to you, it would seem like an "obvious solution" so you wouldn't give it much credit, any more than you credit gradient descent on a problem with a global minimum - though that might seem much harder to Archimedes than to you; the Newton-Raphson method was a brilliant innovation, once upon a time.

If you see a way to solve an optimization problem using a very simple program, then it will seem to you like the difficulty of the problem is only the difficulty of writing that program. But it may be wiser to draw a distinction between the object level and the meta level. Kasparov exerted continuous power to win multiple chess games. The programmers of Deep Blue exerted a constant amount of effort to build it, and then they could win as many chess games as they liked by pressing a button. It is a mistake to compare the effort exerted by Kasparov to the effort exerted by the programmers; you should compare Kasparov to Deep Blue, and say that Deep Blue was a more powerful optimizer than Kasparov. The programmers you would only compare to natural selection, say, and maybe you should include in that the economy behind them that built the computing hardware.

But this just goes to show that what we consider difficulty isn't always the same as object-level optimization power. Once the programmers built Deep Blue, it would have been just a press of the button for them to turn Deep Blue on, but when Deep Blue was running, it would still have been exerting optimization power. And you don't find it difficult to regulate your body's breathing and heartbeat and other properties, but you've got a whole medulla and any number of gene regulatory networks contributing to a continuous optimization of your body. So what we perceive as difficulty is not the same as optimization-power-in-the-world - that's more a function of what humans consider obvious or effortless, versus what they have to think about and examine multiple options in order to do.

We could also describe the optimizer in less concrete and more probabilistic terms, so that if the environment is not certain, the optimizer has to obtain its end under multiple conditions. Indeed, if this is not the case, then we might as well model the system by thinking in terms of a single linear chain of cause and effect, which would not arrive at the same destination if perturbed anywhere along its way - so then there is no point in describing the system as having a goal.

We could say that optimization isn't really interesting until it has to cross multiple domains or unknown domains, the way we consider human intelligence and natural selection as more interesting optimizations than beavers building a dam. These may also be reasons why you feel that simple problems don't reflect much difficulty, or that the kind of optimization performed isn't commensurate with the work your intelligence perceives as "work".

Even so, I would maintain the view of an optimization process as something that squeezes the future into a particular region, across a range of starting conditions, so that it's simpler to understand the destination than the pathway. Even if the program that does this seems really straightforward to a human AI researcher, the program itself is still squeezing the future - it's working even if you aren't. Or maybe you want to substitute a different measure on the state space than the equiprobable one - but at that point you're bringing your own intelligence into the problem. There's a lot of problems that look simple to humans, but it isn't always easy to make an AI solve them.

Being Secure is > Working to be secure > Not being secure > being secure

It is not much of a loop unless you repeat it. If the re-Bohemian becomes a re-office worker, we have a loop. Otherwise, we have an experiment that did not work. The actual preferences described sounded more like: (Inaccurate) expected value of being secure > Not being secure > (Actual value of) being secure > Working to be secure, where the Bohemian was willing to endure the lowest point to reach the highest point, only to discover his incorrect expectations. I expect, however, that the re-Bohemian period shortly after liquidating everything from the security period will be a lot of fun.

Eli, most of what you say above isn't new to me -- I've already encountered these things in my work on defining machine intelligence. Moreover, none of this has much impact on the fact that measuring the power of an optimiser simply in terms of the relative size of a target subspace to the search space doesn't work: sometimes tiny targets in massive spaces are trivial to solve, and sometimes bigger targets in moderate spaces are practically impossible. The simple number-of-bits-of-optimisation-power method you describe in this post doesn't take this into account. As far as I can see, the only way you could deny this is if you were a strong NFL theorem believer.

Moreover, none of this has much impact on the fact that measuring the power of an optimiser simply in terms of the relative size of a target subspace to the search space doesn't work: sometimes tiny targets in massive spaces are trivial to solve, and sometimes bigger targets in moderate spaces are practically impossible.

I thought I described my attitude toward this above: The concept of a problem's difficulty, the amount of mental effort it feels like you need to exert to solve it, should not be confused with the optimization power exerted in solving it, which should not be confused with the intelligence used in solving it. What is "trivial to solve" depends on how intelligent you start out; "how much you can accomplish" depends on the resources you start with; and whether I bother to describe something as an "optimization process" at all will depend on whether it achieves the "same goal" across multiple occasions, conditions, and starting points.

Eliezer - Consider maximizing y in the search space y = - vector_length(x). You can make this space as large as you like, by increasing the range or the dimensionality of x. But it does not get any more difficult, whether you measure by difficulty, power needed, or intelligence needed.
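A hedged sketch of the point in the comment above: on this objective the space can be made as large as you like, in volume and in dimension, yet a cheap generic hill climber still lands essentially at the optimum, because the landscape is a single smooth peak. The function names and the adaptation constants are illustrative choices, not anything proposed in the thread.

```python
import random

def y(x):
    """The objective from the comment: y = -vector_length(x), maximized at x = 0."""
    return -sum(xi * xi for xi in x) ** 0.5

def hill_climb(dim=10, bound=1e6, iters=5000):
    """Naive (1+1)-style hill climbing with multiplicative step-size adaptation.
    Increasing `dim` or `bound` blows up the size of the search space without
    making the problem meaningfully harder."""
    x = [random.uniform(-bound, bound) for _ in range(dim)]
    step = bound
    for _ in range(iters):
        candidate = [xi + random.gauss(0.0, step) for xi in x]
        if y(candidate) > y(x):
            x, step = candidate, step * 1.5   # success: search more boldly
        else:
            step *= 0.9                       # failure: search more locally
    return x

print(y(hill_climb()))  # essentially 0: a huge space, but a trivially easy one
```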

Re: optimization power

The problem I see is that - if you are making up terminology - it would be nice if the name reflected what was being measured.

Optimisation power suggests something useful - but the proposed metric contains no reference to the number of trials, the number of trials in series on the critical path - or most of the other common ways of measuring the worth of optimisation processes. It seems to be more a function of the size of the problem space than anything else - in which case, why "power" and not, say, "factor"?

Before christening the notion, there are some basic questions: Is the proposed metric any use? What is the point of it?

Eli, you propose this number of bits metric as a way "to quantify the power of a mind". Surely then, something with a very high value in your metric should be a "powerful mind"?

It's easy to come up with a wide range of optimisation problems, as Phil Goetz did above, where a very simple algorithm on very modest hardware would achieve massive scores with respect to your mind power metric. And yet, this is clearly not a "powerful mind" in any reasonable sense.

Re: I would rather not have either.

Choose no superintelligence, then - but don't expect reality to pay your wishes much attention.