Alexandros comments on 'Dumb' AI observes and manipulates controllers - Less Wrong

33 Post author: Stuart_Armstrong 13 January 2015 01:35PM

Comments (19)

Comment author: Alexandros 13 January 2015 06:22:01PM 4 points

The truly insidious effects are when the content of the stories changes the reward but not by going through the standard quality-evaluation function.

For instance, maybe the AI figures out that the order of the stories affects the rewards. Or perhaps it finds that stories that create a climate of joy or fear on campus lead to higher or lower evaluations overall for that period. Then the AI may be motivated to "take a hit" and push through some fear-mongering so as to raise its evaluations for the following period. Perhaps it finds that causing strife in the student union, stirring up racial conflict, or causing trouble with the university faculty affects its rewards one way or another. Perhaps, if it's unhappy with a certain editor, it can slip through errors bad enough to get the editor fired, in the hope that the replacement is more rewarding.

etc etc.
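
A toy sketch of the "take a hit" scenario, purely for illustration. The two story types, the reward numbers, and the one-period mood multiplier are all hypothetical assumptions, not anything from the original post. It only shows that a reward-maximizer that looks one period ahead can prefer a worse story now if that story raises later evaluations.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
BASE_REWARD = {"neutral": 1.0, "fear": 0.6}  # immediate evaluation of a story
MOOD_BOOST = {"neutral": 1.0, "fear": 1.5}   # multiplier applied to the NEXT period


def total_reward(plan):
    """Sum per-period rewards; each story's mood effect scales the next period."""
    total, boost = 0.0, 1.0
    for story in plan:
        total += BASE_REWARD[story] * boost
        boost = MOOD_BOOST[story]
    return total


def greedy_plan(periods):
    """Pick the best immediate reward each period, ignoring side effects."""
    plan, boost = [], 1.0
    for _ in range(periods):
        story = max(BASE_REWARD, key=lambda s: BASE_REWARD[s] * boost)
        plan.append(story)
        boost = MOOD_BOOST[story]
    return tuple(plan)


def best_plan(periods):
    """Exhaustively search every plan for the highest total reward."""
    return max(product(BASE_REWARD, repeat=periods), key=total_reward)


if __name__ == "__main__":
    for name, plan in (("greedy", greedy_plan(2)), ("planning", best_plan(2))):
        print(name, plan, round(total_reward(plan), 2))
```

Under these assumed numbers the greedy chooser never runs the fear story, while the two-period planner runs it first and accepts the lower immediate evaluation. Neither the editors' quality judgment nor the reward signal has to change for that to happen.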

Comment author: benwr 14 January 2015 02:34:54AM * 2 points

The problem with these particular extensions is that they don't sound plausible for this type of AI. In my opinion it would be easier when talking with designers to switch from this example to a slightly more sci-fi example.

The leap is from the obvious "it's 'manipulating' its editors by recognizing simple patterns in their behavior" to "it's manipulating its editors by correctly interpreting the causes underlying their behavior."

Much easier to extend in the other direction first: "Now imagine that it's not an article-writer, but a science officer aboard the commercial spacecraft Nostromo..."

Comment author: G-Max 14 January 2015 12:07:31PM 0 points

Upvoted for remembering that Ash was the science officer and not just the movie's token android.