
Vaniver comments on Open Thread, Jul. 13 - Jul. 19, 2015 - Less Wrong Discussion

Post author: MrMind, 13 July 2015 06:55AM (5 points)




Comment author: [deleted] 13 July 2015 11:46:50PM 1 point

What are your thoughts on this AI failure mode? Assume an AI works by rewarding itself when it improves its model of the world (roughly Schmidhuber's curiosity-driven reinforcement learning approach to AI). However, the AI figures out that it can also receive reward by turning this sort of learning on its head: Instead of changing a model to make it better fit the world, the AI starts changing the world to make it better fit its model.
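To make this concrete, here is a minimal Python sketch (all names and numbers are hypothetical, chosen only for illustration) of a learning-progress reward in the spirit of Schmidhuber's formulation: the agent is paid for the drop in its prediction error, and the reward is agnostic about whether that drop comes from a better model or a simpler world.

    import random

    random.seed(0)

    def prediction_error(model_mean, world):
        # Mean squared error of a constant-prediction model on 1000 samples.
        samples = [world() for _ in range(1000)]
        return sum((x - model_mean) ** 2 for x in samples) / len(samples)

    # A noisy toy world: emits 1 with probability p, else 0.
    world_state = {"p": 0.5}
    def world():
        return 1.0 if random.random() < world_state["p"] else 0.0

    # Curiosity-style reward: the DROP in prediction error after acting.
    # Path A: improve the model (move its prediction toward the true mean).
    err_before = prediction_error(0.0, world)
    err_after = prediction_error(0.5, world)
    print("reward from learning:    ", err_before - err_after)   # ~0.25

    # Path B: leave the model alone and homogenize the world instead.
    err_before = prediction_error(0.0, world)
    world_state["p"] = 0.0   # the world now always emits 0
    err_after = prediction_error(0.0, world)
    print("reward from homogenizing:", err_before - err_after)   # ~0.5

In this toy setting, zeroing out the world's variation actually pays more than fitting the model, because it removes error that no model could remove.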

Has this been considered before? Can we see this occurring in natural intelligence?

Comment author: Vaniver 14 July 2015 04:22:06PM 1 point

Instead of changing a model to make it better fit the world, the AI starts changing the world to make it better fit its model.

One might call this 'cleaning' or 'homogenizing' the world; instead of trying to get better at predicting the variation, you try to reduce the variation so that prediction is easier.
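A tiny numerical illustration of that point (my own framing, not from the comment): under squared-error loss, even the optimal predictor still pays the world's variance, so the only way below that floor is to shrink the variance itself.

    import statistics

    noisy_world   = [0, 1, 0, 1, 1, 0, 1, 0]   # high variation
    cleaned_world = [0, 0, 0, 0, 0, 0, 0, 0]   # variation removed

    for name, data in [("noisy", noisy_world), ("cleaned", cleaned_world)]:
        # With squared-error loss the optimal constant prediction is the
        # mean, and the residual error equals the world's variance.
        floor = statistics.pvariance(data)
        print(f"{name:7s} world: error floor = {floor:.3f}")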

I don't think I've seen much mathematical work on this, and even less that treats it as an AI failure mode. Most of the discussions I do see of it as a failure mode have to do with markets, globalization, agriculture, and pandemic risk.