Let's assume such an AI could be created perfectly.
Wouldn't there be a danger of permanently freezing human values at those of the society that created it?
Imagine that the Victorians (via steampunk or whatever) somehow managed to build such an AI, and that it would forever enforce their values. Would you be happy with every single value it enforced?
So in this formulation, human values are explicitly considered to be dynamic, constantly changing as people accumulate new experiences and as their environment changes. Say that the Victorians invent a steampunk version of the Internet: that's going to give them new kinds of experiences, which will in turn change their values.
Both individuals and societies also have lots of different value conflicts that they will want to resolve; see e.g. the last three paragraphs of this comment. Resolving those conflicts and helping people find the most ...
MIRI recently blogged about the workshop paper that I presented at AAAI.
My abstract:
Their summary:
I'd be curious to hear whether anyone here has any thoughts. This is basically a "putting rough ideas together and seeing if they make sense" kind of paper, aimed at clarifying the hypothesis and checking whether others can find any obvious holes in it, rather than being a serious scientific theory at this stage.