ChristianKl comments on Open Thread, Aug. 22 - 28, 2016 - Less Wrong

Post author: polymathwannabe 22 August 2016 04:24PM

Comment author: WhySpace 23 August 2016 06:26:08PM 2 points

(1) Given: AI risk comes primarily from AI optimizing for things besides human values.

(2) Given: Humans are already optimizing for things besides human values (or, at least, besides our Coherent Extrapolated Volition).

(3) Given: Our world is okay.^[CITATION NEEDED!]

(4) Therefore, imperfect value loading can still result in an okay outcome.

This is, of course, not necessarily the case for any given imperfect value loading. However, our world serves as a single counterexample to the rule that all imperfect optimization will be disastrous.

(5) Given: A maxipok strategy is optimal. ("Maximize the probability of an okay outcome.")

(6) Given: Partial optimization for human values is easier than total optimization. (Where "partial optimization" is at least close enough to achieve an okay outcome.)

(7) ∴ MIRI should focus on imperfect value loading.

Note that I'm not convinced of several of the givens, so I'm not certain of the conclusion; the argument itself, however, looks convincing to me. I've also chosen to leave assumptions like "imperfect value loading results in partial optimization" unstated, treating them as part of the definitions of those two terms. I'll try to add details on any specific point if questioned.
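
A minimal sketch of the argument's inference structure, with that unstated definitional link shown in brackets (the formalization is mine; the content is just the premises above):

\[
(1) \land (2) \land (3) \;\vdash\; (4)
\]
\[
(4) \land (5) \land (6) \land \big[\text{imperfect value loading} \Rightarrow \text{partial optimization}\big] \;\vdash\; (7)
\]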

Comment author: ChristianKl 29 August 2016 03:17:57PM 1 point

(1) Given: AI risk comes primarily from AI optimizing for things besides human values.

I don't think that's a good description of the orthogonality thesis. An AI that optimizes for a single human value, like purity, could still produce huge problems.

Given: Humans are already optimizing for things besides human values.

Humans don't effectively self-modify to achieve specific objectives in the way an AGI could.

(6) Given: Partial optimization for human values is easier than total optimization. (Where "partial optimization" is at least close enough to achieve an okay outcome.)

Why do you believe that?

Comment author: WhySpace 29 August 2016 10:12:41PM 0 points

I don't think that's a good description of the orthogonality thesis.

Probably not, but it highlights the relevant (or at least related) portion. I suppose I could have been more precise by specifying terminal values, since things like paperclips are obviously instrumental values, at least for us.

Humans don't effectively self-modify

Agreed, except in the trivial case where we can condition ourselves to have different emotional responses. That's substantially less dangerous, though.

Partial optimization for human values is easier than total optimization.

Why do you believe that?

I'm not sure I do, in the sense that I wouldn't assign the proposition >50% probability. However, I might put the odds at around 25% that a Reduced Impact AI architecture provides a useful number of shortcuts.

That seems like decent odds of significantly boosting expected utility. If such an AI would be faster to develop by even just a couple of years, that could make the difference between winning and losing an AI arms race. Sure, it'd come at the cost of a utopia, but if it boosted the odds of success enough, the expected utility could still compensate.
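
To make that tradeoff concrete, here's a toy expected-utility calculation (only the ~25% figure comes from above; the utilities and win probabilities are made up for illustration, and I assume we could fall back to full value loading if no shortcuts materialize). Normalize $U_{\text{utopia}} = 1$, take $U_{\text{okay}} = 0.8$, and count losing the race as $0$. Suppose full value loading wins the race with probability $0.3$, and that with probability $0.25$ the reduced-impact shortcuts exist and raise the win probability to $0.5$:

\[
EU_{\text{full}} = 0.3 \times 1 = 0.3
\]
\[
EU_{\text{imperfect}} = 0.25 \times (0.5 \times 0.8) + 0.75 \times (0.3 \times 1) = 0.10 + 0.225 = 0.325
\]

Under these (made-up) numbers, trying the imperfect route comes out ahead in expectation, despite sacrificing utopia in its best case.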