ciphergoth comments on Holden Karnofsky's Singularity Institute Objection 1 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
No one is seriously worried that an AGI will misunderstand human values. The worry is that an AGI will understand human values perfectly well, and go on to optimize whatever it was actually built to optimize.
Right, so I'm still thinking about the "what it was built to optimize" step. You want to build the AGI to optimize for human values, right? So you do your best to explain to it what you mean by human values. But your explanation falls short, and it ends up optimizing something else instead.
But suppose the AGI is a super-intelligent human. Then you can just ask it to "optimize for human values" in those exact words (though you'd probably still want to explain a bit further, just to be safe).