Raw_Power comments on The Blue-Minimizing Robot - Less Wrong

162 Post author: Yvain 04 July 2011 10:26PM

Comments (159)

Comment author: [deleted] 06 July 2011 04:04:01PM 28 points

The conclusion I'd draw from this essay is that one can't necessarily derive a "goal" or a "utility function" from all possible behavior patterns. If you ask "What is the robot's goal?", the answer is, "it doesn't have one," because it doesn't assign a total preference ordering to states of the world. At best, you could say that it prefers state [I SEE BLUE AND I SHOOT] to state [I SEE BLUE AND I DON'T SHOOT]. But that's all.
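To make the distinction concrete, here is a minimal Python sketch, not anything from the original essay. The toy world (a state that is just a count of blue objects), the `PAINT_BLUE` and `WAIT` actions, and all function names are hypothetical and chosen only for illustration. The first function is a bare condition-action rule; the second chooses actions by ranking predicted world states with a utility function.

```python
from typing import Callable, Sequence

def reflex_robot(percept: str) -> str:
    """Condition-action rule: it never consults any ranking of world states."""
    return "SHOOT" if percept == "BLUE" else "WAIT"

def utility_agent(
    state: int,
    actions: Sequence[str],
    transition: Callable[[int, str], int],
    utility: Callable[[int], float],
) -> str:
    """Pick the action whose predicted resulting state has the highest utility."""
    return max(actions, key=lambda action: utility(transition(state, action)))

# Hypothetical toy world: the state is simply the number of blue objects.
def transition(blue_count: int, action: str) -> int:
    if action == "SHOOT":
        return max(blue_count - 1, 0)
    if action == "PAINT_BLUE":
        return blue_count + 1
    return blue_count  # WAIT changes nothing

def less_blue_is_better(blue_count: int) -> float:
    """An assumed utility function that ranks every state by how little blue it has."""
    return -float(blue_count)

ACTIONS = ["SHOOT", "PAINT_BLUE", "WAIT"]

if __name__ == "__main__":
    print(reflex_robot("BLUE"))
    print(utility_agent(3, ACTIONS, transition, less_blue_is_better))
```

In this sketch both agents shoot when there is blue around, but only `utility_agent` has anything resembling a preference ordering over states: swap in a different utility function and its behavior changes everywhere at once, whereas `reflex_robot` has nothing to swap.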

This has some implications for AI, I think. First of all, not every computer program has a goal or a utility function. There is no danger that your TurboTax software will take over the world and destroy all human life, because it doesn't have a general goal to maximize the number of completed tax forms. Even rather sophisticated algorithms can completely lack goals of this kind -- they aren't designed to maximize some variable over all possible states of the universe. It seems that the unfriendly-AI scenario is a risk only if an AI has a true goal function, and many useful advances in artificial intelligence (defined in the broad sense) carry no risk of this kind.

Do humans have goals? I don't know; it's plausible that we have goals that are complex and hard to define succinctly, and it's also plausible that we don't have goals at all, just sets of instructions like "SHOOT AT BLUE." The test would seem to be whether a human goal of "PROMOTE VALUE X" continues to imply behaviors in strange and unfamiliar circumstances, or whether we only have rules of behavior for a few common situations. If you can think clearly about ethics (or preferences) in the far future, or the distant past, or regarding unfamiliar kinds of beings, and your opinions have some consistency, then maybe those ethical beliefs or preferences are goals. But probably many kinds of human behavior are more like sets of instructions than goals.

Comment author: Raw_Power 06 July 2011 04:16:25PM 3 points

This is a very awesome post. Thumbs up.