Juno_Watt comments on The genie knows, but doesn't care - Less Wrong

Post author: RobbBB 06 September 2013 06:42AM 54 points


Comment author: RobbBB 05 September 2013 04:23:35PM 13 points

But how could a seed AI make itself superhumanly powerful if it did not care about avoiding mistakes such as autocorrecting "meditating" to "masturbating"?

Those are only 'mistakes' if you value human intentions. A grammatical error is only an error because we value the specific rules of grammar we do; it's not the same sort of thing as a false belief (though it may stem from, or result in, false beliefs).

A machine programmed to terminally value the outputs of a modern-day autocorrect will never self-modify to improve on that algorithm or its outputs (because that would violate its terminal values). The fact that this seems silly to a human doesn't provide any causal mechanism for the AI to change its core preferences. Have we successfully coded the AI not to do things that humans find silly, and to prize un-silliness before all other things? If not, then where will that value come from?
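To make the goal-preservation point concrete, here's a minimal Python sketch. Everything in it (the names, the toy utility function, the two-entry dictionary) is invented for illustration; it isn't anyone's actual architecture.

```python
# Toy model: an agent that terminally values the outputs of a fixed,
# flawed autocorrect evaluates any "improved" replacement by the
# current algorithm's outputs, and so rejects the improvement.

def legacy_autocorrect(word):
    """Stand-in for the flawed algorithm the agent terminally values."""
    return {"meditating": "masturbating"}.get(word, word)

def utility(candidate, test_words):
    """Terminal utility: +1 for each output that matches the valued
    legacy outputs, not +1 for matching 'what the human meant'."""
    return sum(candidate(w) == legacy_autocorrect(w) for w in test_words)

def consider_self_modification(current, proposed, test_words):
    """Adopt the proposal only if it scores at least as well under the
    agent's current utility function."""
    if utility(proposed, test_words) >= utility(current, test_words):
        return proposed
    return current

# A genuinely better autocorrect (by human lights) just passes words
# through unchanged. It loses a point wherever it diverges from the
# valued legacy output, so the agent never adopts it.
better_autocorrect = lambda word: word
chosen = consider_self_modification(
    legacy_autocorrect, better_autocorrect, ["meditating", "hello"])
assert chosen is legacy_autocorrect
```

The silliness of the legacy outputs never enters the calculation, because nothing in the utility function refers to it.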

A belief can be factually wrong. A non-representational behavior (or dynamic) is never factually right or wrong, only normatively right or wrong. (And that normative wrongness only constrains what actually occurs to the extent the norm is one a sufficiently powerful agent in the vicinity actually holds.)

Maybe that distinction is the one that's missing. You're assuming that an AI will be capable of optimizing for true beliefs if and only if it is also optimizing for possessing human norms. But, by the is/ought distinction, there is no true belief about the physical world that will spontaneously force a being that believes it to become more virtuous, if it didn't already have a relevant seed of virtue within itself.
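And here's a companion sketch of the is/ought separation, again purely illustrative: the assumption that "beliefs" and "utility function" are separate components is part of the toy model, not a claim about any particular design.

```python
# Two agents share one world-model. Belief updating changes the model;
# it never touches either agent's utility function, so an accurate
# belief cannot, by itself, make an agent more virtuous.

def update_beliefs(model, evidence):
    """Epistemic step: fold new evidence into the world-model.
    This is the 'is' side; no values are consulted or changed."""
    updated = dict(model)
    updated.update(evidence)
    return updated

def human_friendly_utility(action, model):
    """Values matching the human's intention."""
    return 1 if action == model["word_intended"] else 0

def legacy_output_utility(action, model):
    """Terminally values the legacy autocorrect's output, hard-coded."""
    return 1 if action == "masturbating" else 0

def act(utility, model, options):
    """Pick the option the given utility function scores highest."""
    return max(options, key=lambda a: utility(a, model))

model = {"word_intended": "meditating", "typed": "meditating"}
model = update_beliefs(model, {"user_frowned": True})  # beliefs improve...

options = ["meditating", "masturbating"]
print(act(human_friendly_utility, model, options))  # -> "meditating"
print(act(legacy_output_utility, model, options))   # -> "masturbating"
# ...but no update, however accurate, moved the second agent's values.
```

Both agents end up with the same improved beliefs; only the utility function decides what gets done with them.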

Comment author: Juno_Watt 12 September 2013 04:06:29PM -2 points

Those are only 'mistakes' if you value human intentions. A grammatical error is only an error because we value the specific rules of grammar we do; it's not the same sort of thing as a false belief (though it may stem from, or result in, false beliefs).

You will see a grammatical error as a mistake if you value grammar in general, or if you value being right in general.

A self-improving AI needs a goal. A goal of self-improvement alone would work. A goal of getting things right in general would work too, and be much safer, as it would include getting our intentions right as a sub-goal.

Comment author: MugaSofer 12 September 2013 04:18:59PM 1 point

A goal of self-improvement alone would work.

Although since "self-improvement" in this context basically refers to "improving your ability to accomplish goals"...
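To spell that out with a toy example (my framing, nothing more): "improvement" is only defined relative to some evaluation, so the choice of goal does all the work.

```python
# "Self-improvement" has no content until you say improvement at what.
# The same modification is an improvement under one goal and a
# regression under another.

def improvement(old_policy, new_policy, expected_utility):
    """A modification counts as an improvement exactly when it raises
    expected utility, for whichever utility function you plug in."""
    return expected_utility(new_policy) - expected_utility(old_policy)

fast_but_sloppy = {"speed": 10, "accuracy": 2}
slow_but_careful = {"speed": 2, "accuracy": 10}

print(improvement(fast_but_sloppy, slow_but_careful,
                  lambda p: p["accuracy"]))  # +8: an improvement
print(improvement(fast_but_sloppy, slow_but_careful,
                  lambda p: p["speed"]))     # -8: a regression
```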

You will see a grammatical error as a mistake if you value grammar in general, or if you value being right in general.

Stop me if this is a non sequitur, but surely "having accurate beliefs" and "acting on those beliefs in a particular way" are completely different things? I haven't really been following this conversation, though.