
whpearson comments on Drive-less AIs and experimentation - Less Wrong Discussion

Post author: whpearson 17 June 2011 02:33PM 4 points



Comment author: whpearson 17 June 2011 08:54:11PM 1 point

I suspect "avoiding changing the world" actually has some human-values baked into it.

See my reply to Vladimir Nesov.

This seems to be trying to box an AI with its own goal system, which I think puts it in the tricky-wish category.

Do you count CEV to be in the same category?

Comment author: JamesAndrix 18 June 2011 01:20:19AM 0 points

With two differences: CEV tries to correct any mistakes in the initial formulation of the wish (aiming for an attractor), and it doesn't force the designers to specify details like whether making bacteria is OK or not.

It's the difference between painting a specific scene and building an auto-focus camera.

I do currently think it is possible to create a powerful cross-domain optimizer that is not a person and will not create persons, unbox itself, look at our universe, tile the universe with anything, or make AI that doesn't comply with these constraints. But I approach this line of thought with extreme caution, and really only as a way to accelerate whatever it takes to get to CEV, because an AI can't safely make changes to the real world without some knowledge of human volition, even if it wants to.

What if I missed something that's on the scale of the nonperson predicate? My AI works, creatively paints the apple, but somehow its solution is morally awful. Even staying within pure math could be bad for unforeseen reasons.