William_S comments on Superintelligence 24: Morality models and "do what I mean" - Less Wrong

Post author: KatjaGrace 24 February 2015 02:00AM



Comment author: William_S 24 February 2015 08:44:03PM

Suppose we have a bunch of short natural-language descriptions of what we would want the AI to value. Could we simply give the AI this list and tell it to maximize all of these values under some kind of equal weighting? It seems to me that, much more than in other areas of superintelligence design, each description we come up with is likely to at least roughly point at what we want, so aggregating many such descriptions is more likely to lead to what we want than picking any one description individually. Does it seem like this would work? In what ways could it go wrong?
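To make the equal-weighting idea concrete, here is a minimal toy sketch (not a serious alignment proposal): each candidate value description is stood in for by a hypothetical scoring function, and the agent picks the action maximizing the unweighted average score. All function and key names here are illustrative assumptions, not anything from the original discussion.

```python
# Toy illustration of equal-weighted value aggregation.
# The value functions below are hypothetical stand-ins for
# "short natural-language descriptions of what we want".

def aggregate_value(action, value_fns):
    """Equal-weighted average of the candidate value functions."""
    return sum(v(action) for v in value_fns) / len(value_fns)

def best_action(actions, value_fns):
    """Pick the action that maximizes the aggregated value."""
    return max(actions, key=lambda a: aggregate_value(a, value_fns))

# Hypothetical candidate values (illustrative only):
value_fns = [
    lambda a: a.get("human_welfare", 0),
    lambda a: a.get("honesty", 0),
    lambda a: a.get("autonomy", 0),
]

# Two candidate actions, scored on each value dimension:
actions = [
    {"human_welfare": 0.9, "honesty": 0.1, "autonomy": 0.2},
    {"human_welfare": 0.5, "honesty": 0.6, "autonomy": 0.6},
]

# The second action wins on the equal-weighted average
# (0.567 vs 0.4), even though the first scores highest on
# any single dimension's favorite.
print(best_action(actions, value_fns))
```

One visible failure mode even in this toy: an action scoring extremely high on one value function and low on the rest can still dominate the average, so equal weighting does not by itself prevent a single mis-specified description from steering the outcome.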