ChristianKl comments on Boxing an AI? - Less Wrong
Comments (39)
The difficult thing isn't to have the AI act sensibly in the medium term. The difficult thing is to have its values stay stable under self-modification and to get complex problems right, like not wireheading everyone.
This would definitely let you test whether values stay stable under self-modification. Just plonk the AI in an environment where it can self-modify and keep track of its values. Since this is not dependent on morality, you can just give it easily measurable values.
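To make the proposed test concrete, here is a toy sketch in Python. Everything in it is a hypothetical stand-in: the "AI" is just a number it is trying to hit, `run_self_modification_trial` and the `drift` parameter are invented names, and "self-modification" is modelled as the agent rewriting its own target each step. It only illustrates the bookkeeping (record the easily measurable value after every modification and check how far it wandered), not anything a real system would look like.

```python
import random


def run_self_modification_trial(steps=100, drift=0.0, seed=0):
    """Return the largest deviation of a toy agent's value from its starting value.

    The agent's only 'value' is a single measurable number (its target).
    Each step it 'self-modifies' by rewriting that target; drift=0.0 models
    an agent that preserves its value perfectly across modifications.
    """
    rng = random.Random(seed)  # seeded so the trial is repeatable
    initial_target = 10.0
    target = initial_target
    history = [target]

    for _ in range(steps):
        # Hypothetical self-modification step: the rewrite may perturb the value.
        target += rng.uniform(-drift, drift)
        history.append(target)  # keep track of the value after every rewrite

    # Stability measure: how far did the value ever stray from where it started?
    return max(abs(t - initial_target) for t in history)


if __name__ == "__main__":
    print("stable agent deviation:", run_self_modification_trial(drift=0.0))
    print("drifting agent deviation:", run_self_modification_trial(drift=0.5))
```

The point of using a plain number is the one made above: because the check doesn't depend on morality, the value being tracked can be anything you can measure cheaply.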