Vaniver comments on Open Thread, Jul. 13 - Jul. 19, 2015 - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
What are your thoughts on this AI failure mode? Assume an AI works by rewarding itself when it improves its model of the world (which is roughly Schmidhuber’s curiosity-driven reinforcement learning approach to AI). The AI then figures out that it can also receive reward by turning this sort of learning on its head: instead of changing its model to better fit the world, it starts changing the world to better fit its model.
Has this been considered before? Can we see this occurring in natural intelligence?
One might call this 'cleaning' or 'homogenizing' the world; instead of trying to get better at predicting the variation, you try to reduce the variation so that prediction is easier.
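To make the 'homogenizing' idea concrete, here is a toy sketch in Python. It is not Schmidhuber's actual formulation. It assumes the reward is simply the drop in expected squared prediction error, and every name in it (`World`, `Model`, `learn`, `homogenize`, `progress_reward`) is hypothetical and chosen for illustration. The point is only that a reward defined this way cannot tell whether the error fell because the model got better or because the world got less varied.

```python
from dataclasses import dataclass


@dataclass
class World:
    mean: float      # true average of the quantity being observed
    variance: float  # spread of the observations around that average


@dataclass
class Model:
    prediction: float  # the agent's point estimate of the quantity


def expected_error(world: World, model: Model) -> float:
    """Expected squared prediction error: squared bias plus variance."""
    return (world.mean - model.prediction) ** 2 + world.variance


def learn(world: World, model: Model, rate: float = 0.5):
    """Change the model to fit the world: move the prediction toward the mean."""
    new_prediction = model.prediction + rate * (world.mean - model.prediction)
    return world, Model(new_prediction)


def homogenize(world: World, model: Model, rate: float = 0.5):
    """Change the world to fit the model: shrink the variation itself."""
    return World(world.mean, world.variance * (1 - rate)), model


def progress_reward(action, world: World, model: Model) -> float:
    """Assumed reward signal: how much the expected error dropped."""
    before = expected_error(world, model)
    after = expected_error(*action(world, model))
    return before - after


world = World(mean=4.0, variance=9.0)
model = Model(prediction=0.0)

# Both actions are rewarded; the signal does not distinguish them.
print(progress_reward(learn, world, model))       # 12.0
print(progress_reward(homogenize, world, model))  # 4.5

# With a perfect model, learning earns nothing more,
# but reducing the world's variation still pays.
perfect = Model(prediction=4.0)
print(progress_reward(learn, world, perfect))       # 0.0
print(progress_reward(homogenize, world, perfect))  # 4.5
```

In this toy setup, learning can only ever remove the squared-bias part of the error, so once the model is accurate the only remaining source of reward is shrinking the variance, which is the failure mode described above.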
I don't think I've seen much mathematical work on this, and very little that discusses it as an AI failure mode. Most of the discussions I see of it as a failure mode have to do with markets, globalization, agriculture, and pandemic risk.