Jiro comments on The genie knows, but doesn't care - Less Wrong

54 Post author: RobbBB 06 September 2013 06:42AM




Comment author: Jiro 03 March 2015 05:00:55PM 0 points [-]

> A prerequisite for planning a Friendly AI is understanding individual and collective human values well enough to predict whether they would be satisfied with the outcome, which entails (in the logical sense) having a very well-developed model of the specific humans you interact with, or at least the capability to construct one if you so choose. Having a sufficiently well-developed model to predict what you will do given the data you are given is logically equivalent to a weak form of "control people just by talking to them".

By that reasoning, there's no such thing as a Friendly human. I suggest that most people when talking about friendly AIs do not mean to imply a standard of friendliness so strict that humans could not meet it.

Comment author: TheOtherDave 14 March 2015 08:25:27PM 1 point [-]

Yeah, what Vauroch said. Humans aren't close to Friendly. To the extent that people talk about "friendly AIs" meaning AIs that behave towards humans the way humans do, they're misunderstanding how the term is used here. (Which is very likely; it's often a mistake to use a common English word as specialized jargon, for precisely this reason.)

Relatedly, there isn't a human such that I would reliably want to live in a future where that human obtains extreme superhuman power. (It might turn out OK, or at least better than the present, but I wouldn't bet on it.)

Comment author: [deleted] 14 March 2015 08:41:32PM *  0 points [-]

> Relatedly, there isn't a human such that I would reliably want to live in a future where that human obtains extreme superhuman power. (It might turn out OK, or at least better than the present, but I wouldn't bet on it.)

Just be careful to note that the choice here isn't binary. There are also possibilities where institutions (multiple individuals in a governing body, with checks and balances) are pushed into positions of extreme superhuman power. There's also the possibility of enhancing everyone who wants to be enhanced through levels of greater intelligence in lockstep, so as to prevent any single human or group of humans from achieving asymmetric power.

Comment author: TheOtherDave 14 March 2015 10:05:06PM 0 points [-]

Sure. I think my initial claim holds for all currently existing institutions as well as all currently existing individuals, and for all simple aggregations of currently existing humans, but I certainly agree that there's a huge universe of possibilities. In particular, there are futures in which we augmented humans have had our mechanisms for engaging with and realizing our values altered to be more reliable and/or collaborative, and some of those futures might be ones I reliably want to live in.

Perhaps what I ought to have said is that there isn't a currently existing human with that property.

Comment author: VAuroch 14 March 2015 03:45:27PM 0 points [-]

> By that reasoning, there's no such thing as a Friendly human.

True. There isn't.

> I suggest that most people when talking about friendly AIs do not mean to imply a standard of friendliness so strict that humans could not meet it.

Well, I definitely do, and I'm at least 90% confident Eliezer does as well. Most, probably nearly all, people who talk about Friendliness would regard a FOOMed human as Unfriendly.