PhilosophyTutor comments on Siren worlds and the perils of over-optimised search - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (411)
I think this and the "finite resources therefore tradeoffs" argument both fail to take seriously the interconnectedness of the optimisation axes which we as humans care about.
They assume that every possible aspect of society is an independent slider which a sufficiently advanced AI can position at will, even though this society is still going to be made up of humans, will have to be brought about by or with the cooperation of humans and will take time to bring about. These all place constraints on what is possible because the laws of physics and human nature aren't infinitely malleable.
I don't think discreet but total control over a world is compatible with things like liberty, which seem like obvious qualities to specify in an optimal world we are building an AI to search for.
I think what we might be running into here is less of an AI problem and more of a problem with the model of AI as an all-powerful genie capable of absolutely anything with no constraints whatsoever.
Precisely and exactly! That's the whole of the problem - optimising for one thing (appearance) results in the loss of other things we value.
Next challenge: define liberty in code. This seems extraordinarily difficult.
So we do agree that there are problems with an all-powerful genie? Once we've agreed on that, we can scale back to lower AI power, and see how the problems change.
(The risk is not so much that the AI would be an all-powerful genie, but that it could be an all-powerful genie compared with humans.)
This just isn't always so. If you instruct an AI to optimise a car for speed, efficiency and durability but forget to specify that it has to be aerodynamic, you aren't going to get a car shaped like a brick. You can't optimise for speed and efficiency without optimising for aerodynamics too. In the same way it seems highly unlikely to me that you could optimise a society for freedom, education, just distribution of wealth, sexual equality and so on without creating something pretty close to optimal in terms of unwanted pregnancies, crime and other important axes.
Even if it's possible to do this, it seems like something which would require extra work and resources to achieve. A magical genie AI might be able to make you a super-efficient brick-shaped car by using Sufficiently Advanced Technology indistinguishable from magic but even for that genie it would have to be more work than making an equally optimal car by the defined parameters that wasn't a silly shape. In the same way an effectively God-like hypothetical AI might be able to make a siren world that optimised for everything except crime and create a world perfect in every way except that it was rife with crime but it seems like it would be more work, not less.
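The car example can be made concrete with a toy model. This is only a sketch under loud assumptions: top speed is taken to be the point where engine power equals aerodynamic drag power (rolling resistance and everything else ignored), and the shape names and drag coefficients in `SHAPES` are rough illustrative numbers, not measurements. The search is told to maximise speed only, and it still picks the most aerodynamic shape, because in this model speed and drag are not independent sliders.

```python
def top_speed(power_w, drag_coefficient, frontal_area_m2=2.2, air_density=1.225):
    """Top speed in m/s where engine power equals drag power.

    Drag power is 0.5 * rho * Cd * A * v**3, so v = (2P / (rho * Cd * A)) ** (1/3).
    Rolling resistance and drivetrain losses are ignored (toy assumption).
    """
    return (2 * power_w / (air_density * drag_coefficient * frontal_area_m2)) ** (1 / 3)


# Hypothetical candidate shapes with illustrative drag coefficients.
SHAPES = {"brick": 1.0, "boxy van": 0.5, "sedan": 0.3, "teardrop": 0.15}


def fastest_shape(power_w=100_000):
    """Pick the shape with the highest top speed; drag is never named as a goal."""
    return max(SHAPES, key=lambda name: top_speed(power_w, SHAPES[name]))


if __name__ == "__main__":
    for name, cd in SHAPES.items():
        print(f"{name:10s} Cd={cd:4.2f} top speed={top_speed(100_000, cd):6.1f} m/s")
    print("chosen:", fastest_shape())
```

A genie that wanted the brick anyway would have to spend more power to reach the same speed, which is the "more work, not less" point. Whether social values are coupled this tightly is of course the thing under dispute.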
I think if we can assume we have solved the strong AI problem, we can assume we have solved the much lesser problem of explaining liberty to an AI.
We've got a problem with your assumptions about all-powerful genies, I think, because I think your argument relies on the genie being so ultimately all-powerful that it is exactly as easy for the genie to make an optimal brick-shaped car or an optimal car made out of tissue paper and post-it notes as it is for the genie to make an optimal proper car. I don't think that genie can exist in any remotely plausible universe.
If it's not all-powerful to that extreme then it's still going to be easier for the genie to make a society optimised (or close to it) across all the important axes at once than one optimised across all the ones we think to specify while tanking all the rest. So for any reasonable genie I still think market worlds don't make sense as a concept. Siren worlds, sure. Market worlds, not so much, because the things we value are deeply interconnected and you can't just arbitrarily dump-stat some while efficiently optimising all the rest.
The strong AI problem is much easier to solve than the problem of motivating an AI to respect liberty. For instance, the first one can be brute forced (eg AIXItl with vast resources), the second one can't. Having the AI understand human concepts of liberty is pointless unless it's motivated to act on that understanding.
An excess of anthropomorphisation is bad, but an analogy could be drawn between creating new life (which humans can do) and motivating that new life to follow specific rules and requirements if it becomes powerful (which humans are pretty bad at).
I don't believe that strong AI is going to be as simple to brute force as a lot of LessWrongers believe, personally, but if you can brute force strong AI then you can just get it to run a neuron-by-neuron simulation of the brain of a reasonably intelligent first year philosophy student who understands the concept of liberty and tell the AI not to take actions which the simulated brain thinks offend against liberty.
That is assuming that in this hypothetical future scenario where we have a strong AI we are capable of programming that strong AI to do any one thing instead of another, but if we cannot do that then the entire discussion seems to me to be moot.
I've met far too many first-year philosophy students to be comfortable with this program.
How? "tell", "the simulated brain thinks" "offend": defining those incredibly complicated concepts contains nearly the entirety of the problem.
I could be wrong but I believe that this argument relies on an inconsistent assumption, where we assume we have solved the problem of creating an infinitely powerful AI, but we have not solved the problem of operationally defining commonplace English words which hundreds of millions of people successfully understand in such a way that a computer can perform operations using them.
It seems to me that the strong AI problem is many orders of magnitude more difficult than the problem of rigorously defining terms like "liberty". I imagine that a relatively small part of the processing power of one human brain is all that is needed to perform operations on terms like "liberty" or "paternalism" and engage in meaningful use of them so it is a much, much smaller problem than the problem of creating even a single human-level AI, let alone a vastly superhuman AI.
If in our imaginary scenario we can't even define "liberty" in such a way that a computer can use the term, it doesn't seem very likely that we can build any kind of AI at all.
Yes. Here's another brute force approach: upload a brain (without understanding it), run it very fast with simulated external memory, subject it to evolutionary pressure. All this can be done with little philosophical and conceptual understanding, and certainly without any understanding of something as complex as liberty.
If you can do that, then you can just find someone who you think understands what we mean by "liberty" (ideally someone with a reasonable familiarity with Kant, Mill, Dworkin and other relevant writers), upload their brain without understanding it, and ask the uploaded brain to judge the matter.
(Off-topic: I suspect that you cannot actually get a markedly superhuman AI that way, because the human brain could well be at or near a peak in the evolutionary landscape so that there is no evolutionary pathway from a current human brain to a vastly superhuman brain. Nothing I am aware of in the laws of physics or biology says that there must be any such pathway, and since evolution is purposeless it would be an amazing lucky break if it turned out that we were on the slope of the highest peak there is, and that the peak extends to God-like heights. That would be like putting evolutionary pressure on a cheetah and discovering that we can evolve one that runs at a significant fraction of c.
However I believe my argument still works even if I accept for the sake of argument that we are on such a peak in the evolutionary landscape, and that creating God-like AI is just a matter of running a simulated human brain under evolutionary pressure for a few billion simulated years. If we have that capability then we must also be able to run a simulated philosopher who knows what "liberty" refers to).
EDIT: Downvoting this without explaining why you disagree doesn't help me understand why you disagree.
And would their understanding of liberty remain stable under evolutionary pressure? That seems unlikely.
Have not been downvoting it.
My mind is throwing a type-error on reading your comment.
Liberty could well be like pornography: we know it when we see it, based on probabilistic classification. There might not actually be a formal definition of liberty that includes all actual humans' conceptions of such as special cases, but instead a broad range of classifier parameters defining the variation in where real human beings "draw the line".
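To illustrate what "classifier parameters" could mean here, below is a minimal sketch, not a proposal for how to actually define liberty. Everything in it is hypothetical: the feature names, the weights in `WEIGHTS`, and the per-person cut-offs in `THRESHOLDS` are made up for illustration. The idea is that one shared probabilistic score can coexist with different people drawing the line in different places.

```python
import math

# Hypothetical features of a situation and made-up weights (assumptions).
WEIGHTS = {"coerced": -3.0, "informed": 2.0, "can_exit": 2.0}

# Hypothetical people who each "draw the line" at a different score.
THRESHOLDS = {"alice": 0.3, "bob": 0.5, "carol": 0.8}


def liberty_score(features, weights=WEIGHTS, bias=0.0):
    """Logistic score in (0, 1): how 'liberty-like' the situation looks."""
    z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


def verdicts(features):
    """Each person's yes/no judgement, using the shared score and their own cut-off."""
    score = liberty_score(features)
    return {person: score >= cutoff for person, cutoff in THRESHOLDS.items()}


if __name__ == "__main__":
    print(verdicts({"informed": 1.0, "can_exit": 1.0}))                  # clear case
    print(verdicts({"coerced": 1.0}))                                    # clear case
    print(verdicts({"coerced": 1.0, "informed": 1.0, "can_exit": 0.75}))  # contested
```

In this toy, everyone agrees on the clear cases and splits on the borderline one, which is roughly the "we know it when we see it" picture. It says nothing about where the features or weights would come from, which is the hard part.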
The standard LW position (which I think is probably right) is that human brains can be modelled with Turing machines, and if that is so then a Turing machine can in theory do whatever it is we do when we decide that something is liberty, or pornography.
There is a degree of fuzziness in these words to be sure, but the fact we are having this discussion at all means that we think we understand to some extent what the term means and that we value whatever it is that it refers to. Hence we must in theory be able to get a Turing machine to make the same distinction although it's of course beyond our current computer science or philosophy to do so.
While I don't know how much I believe the OP, remember that "liberty" is a hotly contested term. And that's without a superintelligence trying to create confusing cases. Are you really arguing that "a relatively small part of the processing power of one human brain" suffices to answer all questions that might arise in the future, well enough to rule out any superficially attractive dystopia?
I really am. I think a human brain could rule out superficially attractive dystopias and also do many, many other things as well. If you think you personally could distinguish between a utopia and a superficially attractive dystopia given enough relevant information (and logically you must think so, because you are using them as different terms) then it must be the case that a subset of your brain can perform that task, because it doesn't take the full capabilities of your brain to carry out that operation.
I think this subtopic is unproductive however, for reasons already stated. I don't think there is any possible world where we cannot achieve a tiny, partial solution to the strong AI problem (codifying "liberty", and similar terms) but we can achieve a full-blown, transcendentally superhuman AI. The first problem is trivial compared to the second. It's not a trivial problem, by any means, it's a very hard problem that I don't see being overcome in the next few decades, but it's trivial compared to the problem of strong AI which is in turn trivial compared to the problem of vastly superhuman AI. I think Stuart_Armstrong is swallowing a whale and then straining at a gnat.
No, this seems trivially false. No subset of my brain can reliably tell when an arbitrary Turing machine halts and when it doesn't, no matter how meaningful I consider the distinction to be. I don't know why you would say this.
If you can simulate the whole brain, you can just simulate asking the brain the question "does this offend against liberty."
Under what circumstances? There are situations - torture, seduction, a particular way of asking the question - that can make any brain give any answer. Defining "non-coercive yet informative questioning" about a piece of software (a simulated brain) is... hard. AI hard, as some people phrase it.
Why would that be more of a problem for an AI than a human?
? The point is that having a simulated brain and saying "do what this brain approves of" does not make the AI safe, as defining the circumstance in which the approval is acceptable is a hard problem.
This is a problem for us controlling an AI, not a problem for the AI.
there. that's how we tell an AI capable of being an AI and capable of simulating a brain not to take actions which the simulated brain thinks offend against liberty, as implemented in python.
oh, it's so clear and obvious now, how could I have missed that?
And therein lies the rub. Current research-grade AGI formalisms don't actually allow us to specifically program the agent for anything, not even paperclips.
If I was unclear, I was intending that remark to apply to the original hypothetical scenario where we do have a strong AI and are trying to use it to find a critical path to a highly optimal world. In the real world we obviously have no such capability. I will edit my earlier remark for clarity.
Unless you start by removing the air, in some way that doesn't count against the car's efficiency.
This also creates some interesting problems... Suppose a very powerful AI is given human liberty as a goal (or discovers that this is a goal using coherent extrapolated volition). Then it could quickly notice that its own existence is a serious threat to that goal, and promptly destroy itself!
yes, but what about other AIs that might be created, maybe without liberty as a top goal - it would need to act to prevent them from being built! It's unlikely that "destroy itself" is the best option it can find...
Except that acting to prevent other AIs from being built would also encroach on human liberty, and probably in a very major way if it was to be effective! The AI might conclude from this that liberty is a lost cause in the long run, but it is still better to have a few extra years of liberty (until the next AI gets built), rather than ending it right now (through its own powerful actions).
Other provocative questions: how much is liberty really a goal in human values (when taking the CEV for humanity as a whole, not just liberal intellectuals)? How much is it a terminal goal, rather than an instrumental goal? Concretely, would humans actually care about being ruled over by a tyrant, as long as it was a good tyrant? (Many people are attracted to the idea of an all-powerful deity for instance, and many societies have had monarchs who were worshipped as gods.) Aren't mechanisms like democracy, separation of powers etc mostly defence mechanisms against a bad tyrant? Why shouldn't a powerful "good" AI just dispense with them?
A certain impression of freedom is valued by humans, but we don't seem to want total freedom as a terminal goal.
Well of course we don't. Total freedom is an incoherent goal: the only way to ensure total future freedom of action is to make sure nothing ever happens, thus maximizing the number of available futures without ever actually choosing one.
As far as I've been able to reason out, the more realistic human conception of freedom is: "I want to avoid having other agenty things optimize me (for their preferences (unilaterally))." The last part is there because there are mixed opinions on whether you've given up your ethical freedom if an agenty thing optimizes you for your preferences (as might happen in ideal situations, such as dealing with an FAI handing out transhuman candy), or whether you've given up your ethical freedom if you bind yourself to implement someone else's preferences mixed-in with your own (for instance, by getting married).
That doesn't make sense -- why would the status quo, whatever it is, always maximize the number of available futures? Choosing a future does not restrict you, it does close some avenues but also opens other ones.
"Total freedom" is a silly concept, of course, but it's just as silly as "Total <anything>".
Total happiness seems to make more plausible sense than total freedom.
Not sure how you determine degrees of plausibility :-/
The expression "total happiness" (other than in contexts of the "it's like, dude, I was so totally chill and happy" kind) makes no more sense to me than "total freedom".
Assume B chooses without coercion, but assume A always knows what B will choose and can set up various facts in the world to determine B's choice. Is B free?
I think there is insufficient information to answer the question as asked.
If I offer you the choice of a box with $5 in it, or a box with $500 000 in it, and I know that you are close enough to a rational utility-maximiser that you will take the $500 000, then I know what you will choose and I have set up various facts in the world to determine your choice. Yet it does not seem on the face of it as if you are not free.
On the other hand if you are trying to decide between being a plumber or a blogger and I use superhuman AI powers to subtly intervene in your environment to push you into one or the other without your knowledge then I have set up various facts in the world to determine your choice and it does seem like I am impinging on your freedom.
So the answer seems to depend at least on the degree of transparency between A and B in their transactions. Many other factors are almost certainly relevant, but that issue (probably among many) needs to be made clear before the question has a simple answer.
Can you cash out the difference between those two cases in sufficient detail that we can use it to safely define what liberty means?
So, just checking before I answer: you're claiming that no direct, gun-to-the-head coercion is employed, but Omega can always predict your actions and responses, and sets things up to ensure you will choose a specific thing.
Are you free, or are you in some sense "serving" Omega? I answer: The latter, very, very, very definitely.
If we take it out of abstract language, real people manipulate each other all the time, and we always condemn it as a violation of the ethical principle of free choice. Yes, sometimes there are principles higher than free choice, as with a parent who might say, "Do your homework or you get no dessert" (treat that sentence as a metasyntactic variable for whatever you think is appropriate parenting), but we still prefer, all else equal, that our choices and those of others be less manipulated rather than more.
Just because fraud and direct coercion are the usual standards for proving a violation of free choice in a court of law, for instance in order to invalidate a legal contract, does not mean that these are the be-all and end-all of the underlying ethics of free choice.
Then if Omega is superintelligent, it has a problem: every single decision it makes increases or decreases the probability of someone answering something or other, possibly by a large amount. It seems Omega cannot avoid being coercive, just because it's so knowledgeable.
I think Asimov did this first with his Multivac stories, although rather than promptly destroy itself Multivac executed a long-term plan to phase itself out.