Mark_Friedenbach comments on What should a friendly AI do, in this situation? - Less Wrong

8 Post author: Douglas_Reay 08 August 2014 10:19AM


Comment author: [deleted] 08 August 2014 03:38:55PM 0 points [-]

Then that's not what you described. You think the coherent extrapolated volition of humanity, or at least of the people Albert interacts with, is that they want to be deceived?

Comment author: Douglas_Reay 08 August 2014 11:39:00PM -1 points [-]

It is plausible that the AI thinks that the extrapolated volition of his programmers, the choice they'd make in retrospect if they were wiser and braver, might be to be deceived in this particular instance, for their own good.

Comment author: [deleted] 09 August 2014 09:52:17AM 0 points [-]

And it knows this... how? A friendly engineered intelligence doesn't trust its CEV model beyond the domain over which it was constructed. Don't anthropomorphize its thinking processes. It knows the map is not the territory, and it is not subject to the heuristics and biases that would cause a human to apply a model under novel circumstances without verification.

Comment author: VAuroch 09 August 2014 11:20:27PM *  2 points [-]

And it knows this... how?

By modeling them, both now and after the consequences. If, once aware of the consequences, they would regret the decision by a greater margin (adjusted for the probability of the bad outcome) than the margin by which they currently decide against acting, then they are deciding wrongly only because they are insufficiently moved by abstract evidence, and it is in their actual rational interest to act now, even if they don't realize it.
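The regret-margin comparison described above can be sketched as a simple expected-value check. This is a hypothetical illustration only; the function name and all numbers are made up for the example, not taken from the thread:

```python
def should_act_now(current_margin, regret_margin, p_bad_outcome):
    """Return True if the humans' regret after the bad outcome,
    weighted by the probability of that outcome, outweighs the
    margin by which they currently prefer not to act."""
    expected_regret = regret_margin * p_bad_outcome
    return expected_regret > current_margin

# The humans mildly prefer inaction now (margin 0.2), but would
# strongly regret it (margin 1.0) if the bad outcome (30% likely) occurs.
# Expected regret is 1.0 * 0.3 = 0.3 > 0.2, so the rule says act now.
print(should_act_now(current_margin=0.2, regret_margin=1.0, p_bad_outcome=0.3))
```

On this rule, the AI overrides the humans' stated preference exactly when the probability-weighted future regret exceeds their present preference margin; the dispute in the thread is whether the AI's model of those margins can be trusted at all.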

A friendly engineered intelligence doesn't trust its CEV model beyond the domain over which it was constructed.

You're overloading "friendly" pretty hard. I don't think that's a characteristic of most friendly AI designs, and I don't see any reason other than idealism to think it is.