anotheruser comments on asking an AI to make itself friendly - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (30)
It would want to, because its goal is defined as "tell the truth".
You have to distinguish between the goal we are still searching for (the optimal one) and the goal that actually controls what the AI does in the meantime ("tell the truth"). The optimal goal is only installed later, once we are sure it contains no bugs.
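The distinction can be sketched in code. This is a hypothetical illustration (the `Agent` class, `install_goal` method, and goal strings are all invented for this example, not anything from the original discussion): the interim goal governs behavior the whole time, and a candidate goal only replaces it after verification.

```python
class Agent:
    """Toy agent whose behavior is always controlled by its active goal."""

    def __init__(self, interim_goal):
        # The interim goal ("tell the truth") is what actually controls
        # behavior while the search for the optimal goal is still running.
        self.active_goal = interim_goal

    def act(self):
        # All behavior is governed by whatever goal is currently active.
        return f"acting under goal: {self.active_goal}"

    def install_goal(self, candidate_goal, verified):
        # A candidate "optimal" goal is only installed once we are
        # sure it is free of bugs; otherwise the interim goal stays.
        if verified:
            self.active_goal = candidate_goal
        return self.active_goal


agent = Agent("tell the truth")
print(agent.act())                                  # interim goal in control
agent.install_goal("optimal goal", verified=False)  # rejected: not yet verified
print(agent.active_goal)
agent.install_goal("optimal goal", verified=True)   # verified, so it replaces the interim goal
print(agent.active_goal)
```

On this picture, "wanting to tell the truth" is not in tension with the search for a better goal, because truth-telling is the goal actually in force throughout the search.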