William_S comments on Superintelligence 24: Morality models and "do what I mean" - Less Wrong

Post author: KatjaGrace 24 February 2015 02:00AM


Comment author: William_S 24 February 2015 08:51:35PM 0 points

After this section, it feels like the "do what I mean"/"do what I want" instruction pretty much solves the problem of what we want the AI to value. If the creator of the AI doesn't want things that lead to a good future, then it seems unlikely they would succeed in specifying a good future through any other means either. On the other hand, if the creator wants the right thing, then DWIM seems to avoid all perverse instantiations. Additionally, the only technical requirement seems to be that the AI can follow natural language instructions (perhaps supplemented with some simpler definition of value for the AI to use while it is still learning). Overall, my impression is that this area doesn't require nearly as much work as other parts of superintelligence design (such as getting an AI to value goals described in natural language in the first place).