
TheOtherDave comments on Don't plan for the future - Less Wrong Discussion

Post author: PhilGoetz 23 January 2011 10:46PM


Comment author: TheOtherDave 24 January 2011 04:11:53PM 0 points

This seems to be conflating a Friendly intelligence (that is, one constrained by its creators' terminal values) with a friendly one (that is, one that effectively signals the intent to engage in mutually beneficial social exchange).

As I said below, the reasoning used elsewhere on this site seems to conclude that a Friendly intelligence with nonhuman creators will not be Friendly to humans, since there's no reason to expect a nonhuman's terminal values to align with our own.

(Conversely, if there is some reason to expect nonhuman terminal values to align with human ones, then it may be worth clarifying that reason, as the same forces that make such convergence likely for natural-selection-generated NIs might also apply to human-generated AIs.)

Comment author: Desrtopa 24 January 2011 04:53:18PM 0 points

I think that an AI whose values aligned perfectly with our own (or at least, my own) would have to assign value in its utility function to other intelligent beings. Supposing I created an AI that established a utopia for humans, but when it encountered extraterrestrial intelligences, subjected them to something they considered a fate worse than death, I would consider that to be a failing of my design.
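To make the distinction concrete, here is a minimal toy sketch (not from the original comment) of what it means for a utility function to "assign value to other intelligent beings." The Agent class, the scalar welfare measure, the weights, and the example numbers are all hypothetical illustrations, not a proposal for how a real Friendly AI would be specified.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    is_human: bool
    welfare: float  # hypothetical scalar measure of how well this agent is doing

def human_only_utility(agents):
    """Counts only human welfare; nonhuman intelligences count for nothing."""
    return sum(a.welfare for a in agents if a.is_human)

def inclusive_utility(agents, nonhuman_weight=1.0):
    """Counts every intelligent being's welfare, human or not."""
    return sum(a.welfare if a.is_human else nonhuman_weight * a.welfare
               for a in agents)

population = [
    Agent("human colonist", True, 10.0),
    Agent("extraterrestrial", False, -50.0),  # an outcome it considers worse than death
]

print(human_only_utility(population))  # 10.0  -- blind to the nonhuman's suffering
print(inclusive_utility(population))   # -40.0 -- registers that outcome as a failure
```

On this toy picture, Desrtopa's objection is that an AI maximizing something like human_only_utility could produce a human utopia while still doing something monstrous to extraterrestrials, whereas values that aligned perfectly with ours would look more like inclusive_utility.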

Perfectly Friendly AI might deserve a category of its own, since by its nature it seems to be an even harder problem to solve than ordinary Friendly AI.