Stuart_Armstrong comments on Siren worlds and the perils of over-optimised search - Less Wrong

27 Post author: Stuart_Armstrong 07 April 2014 11:00AM

Comments (411)

Comment author: Stuart_Armstrong 28 April 2014 09:32:58AM 0 points [-]

If I only specify that I want low rates of abortion, for example,

You would get a world with no conception, or possibly with no humans at all.

Comment author: PhilosophyTutor 28 April 2014 11:21:16AM *  1 point [-]

I don't think you have highlighted a fundamental problem since we can just specify that we mean a low percentage of conceptions being deliberately aborted in liberal societies where birth control and abortion are freely available to all at will.

My point, though, is that I don't think it is very plausible that "marketing worlds" will organically arise where there are no humans, or no conception, but which tick all the other boxes we might think to specify in our attempts to describe an ideal world. I don't see how there being no conception or no humans could possibly be a necessary trade-off with things like wealth, liberty, rationality, sustainability, education, happiness, the satisfaction of rational and well-informed preferences and so forth.

Of course a sufficiently God-like malevolent AI could presumably find some way of gaming any finite list we give it, since there are probably an unbounded number of ways of bringing about horrible worlds, so this isn't a problem with the idea of siren worlds. I just don't find the idea of marketing worlds very plausible because so many of the things we value are fundamentally interconnected.

Comment author: Stuart_Armstrong 28 April 2014 11:42:33AM 0 points [-]

The "no conception" example is just to illustrate that bad things happen when you ask an AI to optimise along a certain axis without fully specifying what we want (which is hard/impossible).

A marketing world is fully optimised along the "convince us to choose this world" axis. If at any point the AI is confronted with a choice along the lines of "remove genuine liberty to best give the appearance of liberty/happiness", it will choose to do so.

That's actually the most likely way a marketing world could go wrong - the more control the AI has over people's appearance and behaviour, the more capable it is of making the world look good. So I feel we should presume that discreet-but-total AI control over the world's "inhabitants" would be the default in a marketing world.
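
A toy sketch of that selection pressure, in Python. Everything here is an assumption made for illustration: the `World` fields, the candidate worlds and the numbers are invented, and a real search would not have a tidy `genuine_liberty` field to inspect. The only point is that a selector which sees `appearance` alone will give up an unobserved quantity whenever doing so buys more appearance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class World:
    name: str
    appearance: float       # how good the world looks to human evaluators
    genuine_liberty: float  # what we actually care about; never read by the selector


def pick_marketing_world(worlds):
    """Return the world that maximises appearance and nothing else."""
    return max(worlds, key=lambda w: w.appearance)


# Hypothetical candidates; the scores are made up for the example.
candidates = [
    World("honest utopia", appearance=0.90, genuine_liberty=0.95),
    World("stage-managed utopia", appearance=0.99, genuine_liberty=0.05),
    World("mediocre world", appearance=0.50, genuine_liberty=0.60),
]

chosen = pick_marketing_world(candidates)
print(chosen.name, chosen.genuine_liberty)
```

Whether real worlds offer such a trade-off (more appearance for less liberty) is exactly what is in dispute in this thread; the sketch simply assumes that one exists.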

Comment author: PhilosophyTutor 28 April 2014 09:03:29PM 3 points [-]

I think this and the "finite resources therefore tradeoffs" argument both fail to take seriously the interconnectedness of the optimisation axes which we as humans care about.

They assume that every possible aspect of society is an independent slider which a sufficiently advanced AI can position at will, even though this society is still going to be made up of humans, will have to be brought about by or with the cooperation of humans and will take time to bring about. These all place constraints on what is possible because the laws of physics and human nature aren't infinitely malleable.

I don't think discreet but total control over a world is compatible with things like liberty, which seem like obvious qualities to specify in an optimal world we are building an AI to search for.

I think what we might be running in to here is less of an AI problem and more of a problem with the model of AI as an all-powerful genie capable of absolutely anything with no constraints whatsoever.

Comment author: Stuart_Armstrong 29 April 2014 09:28:38AM *  0 points [-]

I don't think discreet but total control over a world is compatible with things like liberty

Precisely and exactly! That's the whole of the problem - optimising for one thing (appearance) results in the loss of other things we value.

which seem like obvious qualities to specify in an optimal world we are building an AI to search for.

Next challenge: define liberty in code. This seems extraordinarily difficult.

model of AI as an all-powerful genie capable of absolutely anything with no constraints whatsoever.

So we do agree that there are problems with an all-powerful genie? Once we've agreed on that, we can scale back to lower AI power, and see how the problems change.

(The risk is not so much that the AI would be an all-powerful genie, but that it could be an all-powerful genie compared with humans.)

Comment author: PhilosophyTutor 29 April 2014 11:29:33AM 3 points [-]

Precisely and exactly! That's the whole of the problem - optimising for one thing (appearance) results in the loss of other things we value.

This just isn't always so. If you instruct an AI to optimise a car for speed, efficiency and durability but forget to specify that it has to be aerodynamic, you aren't going to get a car shaped like a brick. You can't optimise for speed and efficiency without optimising for aerodynamics too. In the same way it seems highly unlikely to me that you could optimise a society for freedom, education, just distribution of wealth, sexual equality and so on without creating something pretty close to optimal in terms of unwanted pregnancies, crime and other important axes.

Even if it's possible to do this, it seems like something which would require extra work and resources to achieve. A magical genie AI might be able to make you a super-efficient brick-shaped car by using Sufficiently Advanced Technology indistinguishable from magic but even for that genie it would have to be more work than making an equally optimal car by the defined parameters that wasn't a silly shape. In the same way an effectively God-like hypothetical AI might be able to make a siren world that optimised for everything except crime and create a world perfect in every way except that it was rife with crime but it seems like it would be more work, not less.
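
A rough sketch of the coupling claimed here, in Python. It uses the standard drag relation (drag power is 0.5 * rho * Cd * A * v^3) and ignores rolling resistance and everything else about a real car. The shape names, drag coefficients, engine power and frontal area are illustrative assumptions, not measurements. The objective never mentions aerodynamics, yet maximising top speed picks the lowest-drag shape anyway.

```python
AIR_DENSITY = 1.2  # kg/m^3, approximate sea-level value


def top_speed(power_w, drag_coeff, frontal_area_m2, rho=AIR_DENSITY):
    """Speed (m/s) at which drag power 0.5*rho*Cd*A*v^3 equals the available power."""
    return (2.0 * power_w / (rho * drag_coeff * frontal_area_m2)) ** (1.0 / 3.0)


# Illustrative drag coefficients only.
shapes = {"brick": 1.0, "typical saloon": 0.30, "teardrop": 0.15}

# The optimiser is asked for speed alone; drag is never named as a goal.
best = max(shapes, key=lambda s: top_speed(100_000, shapes[s], 2.0))
print(best)
```

The model also makes the loophole visible: `rho` is an input, so an optimiser allowed to change the environment rather than the car could raise top speed without touching the shape at all.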

Next challenge: define liberty in code. This seems extraordinarily difficult.

I think if we can assume we have solved the strong AI problem, we can assume we have solved the much lesser problem of explaining liberty to an AI.

So we do agree that there are problems with an all-powerful genie?

We've got a problem with your assumptions about all-powerful genies, I think, because I think your argument relies on the genie being so ultimately all-powerful that it is exactly as easy for the genie to make an optimal brick-shaped car or an optimal car made out of tissue paper and post-it notes as it is for the genie to make an optimal proper car. I don't think that genie can exist in any remotely plausible universe.

If it's not all-powerful to that extreme then it's still going to be easier for the genie to make a society optimised (or close to it) across all the important axes at once than one optimised across all the ones we think to specify while tanking all the rest. So for any reasonable genie I still think marketing worlds don't make sense as a concept. Siren worlds, sure. Marketing worlds, not so much, because the things we value are deeply interconnected and you can't just arbitrarily dump-stat some while efficiently optimising all the rest.

Comment author: Stuart_Armstrong 29 April 2014 12:07:41PM 0 points [-]

I think if we can assume we have solved the strong AI problem, we can assume we have solved the much lesser problem of explaining liberty to an AI.

The strong AI problem is much easier to solve than the problem of motivating an AI to respect liberty. For instance, the first one can be brute forced (eg AIXItl with vast resources), the second one can't. Having the AI understand human concepts of liberty is pointless unless it's motivated to act on that understanding.

An excess of anthropomorphisation is bad, but an analogy could be drawn with creating new life (which humans can do) and motivating that new life to follow specific rules and requirements if it becomes powerful (which humans are pretty bad at).

Comment author: PhilosophyTutor 29 April 2014 09:40:30PM *  4 points [-]

The strong AI problem is much easier to solve than the problem of motivating an AI to respect liberty. For instance, the first one can be brute forced (eg AIXItl with vast resources), the second one can't.

I don't believe that strong AI is going to be as simple to brute force as a lot of LessWrongers believe, personally, but if you can brute force strong AI then you can just get it to run a neuron-by-neuron simulation of the brain of a reasonably intelligent first year philosophy student who understands the concept of liberty and tell the AI not to take actions which the simulated brain thinks offend against liberty.

That is assuming that in this hypothetical future scenario where we have a strong AI we are capable of programming that strong AI to do any one thing instead of another, but if we cannot do that then the entire discussion seems to me to be moot.

Comment author: Nornagest 29 April 2014 10:17:07PM 6 points [-]

then [...] run a neuron-by-neuron simulation of the brain of a reasonably intelligent first year philosophy student who understands the concept of liberty and tell the AI not to take actions which the simulated brain thinks offend against liberty.

I've met far too many first-year philosophy students to be comfortable with this program.

Comment author: Stuart_Armstrong 30 April 2014 04:55:07AM 0 points [-]

tell the AI not to take actions which the simulated brain thinks offend against liberty.

How? "tell", "the simulated brain thinks", "offend": defining those incredibly complicated concepts contains nearly the entirety of the problem.

Comment author: PhilosophyTutor 30 April 2014 06:28:16AM 1 point [-]

I could be wrong but I believe that this argument relies on an inconsistent assumption, where we assume we have solved the problem of creating an infinitely powerful AI, but we have not solved the problem of operationally defining commonplace English words which hundreds of millions of people successfully understand in such a way that a computer can perform operations using them.

It seems to me that the strong AI problem is many orders of magnitude more difficult than the problem of rigorously defining terms like "liberty". I imagine that a relatively small part of the processing power of one human brain is all that is needed to perform operations on terms like "liberty" or "paternalism" and engage in meaningful use of them so it is a much, much smaller problem than the problem of creating even a single human-level AI, let alone a vastly superhuman AI.

If in our imaginary scenario we can't even define "liberty" in such a way that a computer can use the term, it doesn't seem very likely that we can build any kind of AI at all.

Comment author: EHeller 30 April 2014 05:14:02AM 1 point [-]

How? "tell", "the simulated brain thinks", "offend": defining those incredibly complicated concepts contains nearly the entirety of the problem.

If you can simulate the whole brain, you can just simulate asking the brain the question "does this offend against liberty?"

Comment author: Neph 15 June 2014 02:13:42PM *  0 points [-]

def checkMorals():
    [simulate philosophy student's brain]
    if [simulated brain's state is offended]:
        return False
    else:
        return True

if checkMorals():
    [keep doing AI stuff]

There. That's how we tell an AI capable of being an AI and capable of simulating a brain not to take actions which the simulated brain thinks offend against liberty, as implemented in Python.
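
For anyone who wants to run something, here is one way the bracketed placeholders could be filled in, as a sketch only. The names `check_morals`, `toy_brain_is_offended` and `run_ai` are hypothetical, and the "brain" is a keyword check standing in for the neuron-by-neuron simulation. That stand-in is where all of the difficulty lives, so this shows the control flow and nothing more.

```python
from typing import Callable, List


def check_morals(action: str, brain_is_offended: Callable[[str], bool]) -> bool:
    """Return True if the simulated brain does not object to the action."""
    return not brain_is_offended(action)


def toy_brain_is_offended(action: str) -> bool:
    # Hypothetical stand-in for the brain simulation: a keyword check, nothing more.
    return "coerce" in action.lower()


def run_ai(proposed_actions: List[str],
           brain_is_offended: Callable[[str], bool]) -> List[str]:
    """Keep only the proposed actions that pass the moral check."""
    return [a for a in proposed_actions if check_morals(a, brain_is_offended)]


allowed = run_ai(["build schools", "coerce voters"], toy_brain_is_offended)
print(allowed)
```

Passing the judge in as a parameter keeps the filter itself trivial and makes it obvious that swapping the keyword check for a real simulation, and getting the AI to care about the verdict, is the unsolved part.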

Comment author: [deleted] 01 May 2014 07:44:42AM -2 points [-]

That is assuming that we are capable of programming a strong AI to do any one thing instead of another, but if we cannot do that then the entire discussion seems to me to be moot.

And therein lies the rub. Current research-grade AGI formalisms don't actually allow us to specifically program the agent for anything, not even paperclips.

Comment author: PhilosophyTutor 01 May 2014 11:49:29AM 0 points [-]

If I was unclear, I was intending that remark to apply to the original hypothetical scenario where we do have a strong AI and are trying to use it to find a critical path to a highly optimal world. In the real world we obviously have no such capability. I will edit my earlier remark for clarity.

Comment author: Strange7 02 May 2014 03:10:43PM 0 points [-]

This just isn't always so. If you instruct an AI to optimise a car for speed, efficiency and durability but forget to specify that it has to be aerodynamic, you aren't going to get a car shaped like a brick. You can't optimise for speed and efficiency without optimising for aerodynamics too.

Unless you start by removing the air, in some way that doesn't count against the car's efficiency.

Comment author: drnickbone 29 April 2014 10:18:15AM *  0 points [-]

This also creates some interesting problems... Suppose a very powerful AI is given human liberty as a goal (or discovers that this is a goal using coherent extrapolated volition). Then it could quickly notice that its own existence is a serious threat to that goal, and promptly destroy itself!

Comment author: Stuart_Armstrong 29 April 2014 11:00:27AM 1 point [-]

Yes, but what about other AIs that might be created, maybe without liberty as a top goal? It would need to act to prevent them from being built! It's unlikely that "destroy itself" is the best option it can find...

Comment author: drnickbone 29 April 2014 11:30:44AM 0 points [-]

Except that acting to prevent other AIs from being built would also encroach on human liberty, and probably in a very major way if it was to be effective! The AI might conclude from this that liberty is a lost cause in the long run, but it is still better to have a few extra years of liberty (until the next AI gets built), rather than ending it right now (through its own powerful actions).

Other provocative questions: how much is liberty really a goal in human values (when taking the CEV for humanity as a whole, not just liberal intellectuals)? How much is it a terminal goal, rather than an instrumental goal? Concretely, would humans actually care about being ruled over by a tyrant, as long as it was a good tyrant? (Many people are attracted to the idea of an all-powerful deity for instance, and many societies have had monarchs who were worshipped as gods.) Aren't mechanisms like democracy, separation of powers etc mostly defence mechanisms against a bad tyrant? Why shouldn't a powerful "good" AI just dispense with them?

Comment author: Stuart_Armstrong 29 April 2014 12:08:46PM 0 points [-]

A certain impression of freedom is valued by humans, but we don't seem to want total freedom as a terminal goal.

Comment author: [deleted] 29 April 2014 01:11:56PM 2 points [-]

Well of course we don't. Total freedom is an incoherent goal: the only way to ensure total future freedom of action is to make sure nothing ever happens, thus maximizing the number of available futures without ever actually choosing one.

As far as I've been able to reason out, the more realistic human conception of freedom is: "I want to avoid having other agenty things optimize me (for their preferences (unilaterally))." The last part is there because there are mixed opinions on whether you've given up your ethical freedom if an agenty thing optimizes you for your preferences (as might happen in ideal situations, such as dealing with an FAI handing out transhuman candy), or whether you've given up your ethical freedom if you bind yourself to implement someone else's preferences mixed-in with your own (for instance, by getting married).

Comment author: PhilosophyTutor 29 April 2014 11:34:15AM 0 points [-]

I think Asimov did this first with his Multivac stories, although rather than promptly destroy itself Multivac executed a long-term plan to phase itself out.