Nebu comments on Reply to Holden on 'Tool AI' - Less Wrong

94 Post author: Eliezer_Yudkowsky 12 June 2012 06:00PM




Comment author: Nebu 17 February 2016 11:28:40AM 0 points

But what then makes it recommend a policy that we will actually want to implement?

First of all, I'm assuming that we're taking as axiomatic that the tool "wants" to improve itself (or else why would it have even bothered to consider recommending that it be modified to improve itself?); i.e. improving itself is favorable according to its utility function.

Then: It will recommend a policy that we will actually want to implement because its model of the universe includes our minds, and it can see that recommending a policy we will actually want to implement leads to a higher-ranked state under its utility function.
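The selection process being described can be sketched as a toy decision rule. This is my own illustration, not anything from the comment: the policy names, probabilities, and utilities are invented, and the point is only that a tool whose world model predicts human acceptance will rank an acceptable recommendation above an unacceptable one.

```python
# Toy sketch (hypothetical): a "tool" ranks candidate recommendations by
# expected utility, where its world model includes an estimate of whether
# the humans will actually implement each recommendation.

def expected_utility(p_implemented, utility_if_implemented, utility_if_ignored=0.0):
    """Expected utility of making a recommendation, given the model's
    estimate of the probability that humans implement it."""
    return (p_implemented * utility_if_implemented
            + (1 - p_implemented) * utility_if_ignored)

def best_recommendation(candidates):
    """Pick the candidate whose recommendation leads to the highest
    expected-utility state.

    `candidates` maps policy name -> (P(humans implement it),
                                      utility if implemented).
    """
    return max(candidates,
               key=lambda name: expected_utility(*candidates[name]))

# A policy humans would refuse scores poorly even if its payoff is huge:
candidates = {
    "aggressive self-modification": (0.05, 100.0),  # high payoff, humans refuse
    "modest, acceptable upgrade":   (0.90, 10.0),   # lower payoff, humans agree
}
print(best_recommendation(candidates))  # -> modest, acceptable upgrade
```

Here the refusal probability does the work: 0.05 × 100 = 5, while 0.90 × 10 = 9, so the tool recommends the policy we would actually adopt, exactly because our predicted reaction is part of its model.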