whpearson comments on The Urgent Meta-Ethics of Friendly Artificial Intelligence - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (249)
Something like:
Run simulations of agents that choose randomly from the same set of actions available to the agent. Look for regularities in the world state that occur more or less frequently with the sensible agent than with the random agent. Those regularities could be said to be what it likes and dislikes, respectively.
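A toy sketch of that comparison in Python, in case it helps make the idea concrete. Everything here is made up for illustration: the five-cell world, the `sensible_policy` that always heads right, the two features, and the `margin` threshold are all assumptions, not anything specified above.

```python
import random
from collections import Counter

ACTIONS = (-1, 0, 1)  # hypothetical action set shared by both agents
SIZE = 5              # hypothetical world: positions 0..4 on a line


def step(pos, action):
    # Move and clamp to the ends of the line.
    return max(0, min(SIZE - 1, pos + action))


def sensible_policy(pos, rng):
    # Toy stand-in for the agent under study: it always heads right.
    return 1


def random_policy(pos, rng):
    # Baseline agent: picks uniformly from the same actions.
    return rng.choice(ACTIONS)


def features(pos):
    # Hypothetical "regularities in the world state" to look for.
    return {"at_left_edge": pos == 0, "at_right_edge": pos == SIZE - 1}


def feature_frequencies(policy, episodes=200, steps=20, seed=0):
    # Fraction of visited states in which each feature holds.
    rng = random.Random(seed)
    counts = Counter()
    total = 0
    for _ in range(episodes):
        pos = SIZE // 2
        for _ in range(steps):
            pos = step(pos, policy(pos, rng))
            for name, present in features(pos).items():
                counts[name] += present
            total += 1
    return {name: counts[name] / total for name in features(0)}


def infer_preferences(agent_policy, margin=0.05):
    # Likes: more frequent than under the random baseline; dislikes: less.
    agent = feature_frequencies(agent_policy)
    baseline = feature_frequencies(random_policy)
    likes = [f for f in agent if agent[f] - baseline[f] > margin]
    dislikes = [f for f in agent if baseline[f] - agent[f] > margin]
    return likes, dislikes


likes, dislikes = infer_preferences(sensible_policy)
```

In this toy world the right edge turns up far more often for the sensible agent than for the random one, and the left edge far less often, so they land in `likes` and `dislikes` respectively. A real version would need some way of discovering candidate features rather than having them handed over.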
To distinguish terminal from instrumental values, look at the decision tree and see which of the states gets chosen when a choice is forced.
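One possible reading of the forced-choice test, again as a rough Python sketch. It flattens the decision tree into pairwise trade-offs between states that each keep only one liked feature, which is a simplification I am assuming, and `hidden_agent_choice` is a hypothetical agent that cares only about food.

```python
from itertools import combinations


def hidden_agent_choice(state_a, state_b):
    # Hypothetical agent under study: food is what it actually wants;
    # money and tools matter to it only as routes to food.
    def score(state):
        return 1 if "food" in state else 0

    # Ties go to the first option, so the order among non-food
    # features below is arbitrary.
    return state_a if score(state_a) >= score(state_b) else state_b


def forced_choice_ranking(liked_features, choose):
    # Force a choice between every pair of liked features and count wins.
    wins = {f: 0 for f in liked_features}
    for a, b in combinations(liked_features, 2):
        # Each offered state keeps exactly one of the two features.
        picked = choose(frozenset([a]), frozenset([b]))
        wins[next(iter(picked))] += 1
    # Features kept most often under forced choice come first.
    return sorted(liked_features, key=lambda f: wins[f], reverse=True)


ranking = forced_choice_ranking(["money", "tools", "food"], hidden_agent_choice)
```

Here `food` wins every forced choice it appears in and comes out on top, which marks it as the candidate terminal value. The features it beats look instrumental, at least relative to it.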
Thanks. Come to think of it, that's exactly the right answer.