I think my reply to Lurker, above, might clarify some things. To answer your question:
Making an operating system is easy. Deciding which operating system should be used is harder. This is true even though an operating system's performance on most criteria can easily be assessed: assessing whether an operating system is fast is easier than assessing whether a universe is "just", for example. Also, choosing one operating system from a set of preexisting options is much easier than choosing one future from the unimaginably many possibilities that could be created. Finally, there are far fewer tradeoffs involved in the choice of an operating system than in the choice of satisficing criteria. You might trade money for speed, or speed for usability, and guess correctly a reasonably high percentage of the time. But few other tradeoffs exist, so the situation is comparatively simple. When comparing two AIs with different satisficing protocols, it would be very hard to guess which AI did a better job just by looking at the results they created, unless one failed spectacularly. But the fact that a tradeoff is difficult to judge doesn't mean it doesn't exist. Because I think human values are complicated, I think such tradeoffs are very likely to exist.
We are used to imagining spectacular failures resulting from tiny mistakes, on LW. An AI that's designed to make people smile creates a monstrous stitching of fleshy carpet. But we are not used to imagining failures that are less obvious, but still awful. I think that's a problem.
I think people understand that this is a danger, but by nature, one can't spend a lot of time imagining these situations. Also, UGC has little to do with this problem.
I am not a computer scientist and do not know much about complexity theory. However, it's a field that interests me, so I occasionally browse some articles on the subject. I was brought to https://www.simonsfoundation.org/mathematics-and-physical-science/approximately-hard-the-unique-games-conjecture/ by a link on Scott Aaronson's blog, and read the article to reacquaint myself with the Unique Games Conjecture, which I had partially forgotten about. If you are not familiar with the UGC, that article will explain it to you better than I can.
One phrase in the article stuck out to me: "there is some number of colors k for which it is NP-hard (that is, effectively impossible) to distinguish between networks in which it is possible to satisfy at least 99% of the constraints and networks in which it is possible to satisfy at most 1% of the constraints". I think this sentence is concerning for those interested in the possibility of creating FAI.
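To make the quoted statement a little more concrete, here is a minimal sketch of a toy unique-games instance. It is my own illustration, not something taken from the article: the function names, the tuple encoding of constraints, and the example networks are all assumptions. Each constraint says that the color of one node determines the color of another through a fixed permutation, and the quantity of interest is the largest fraction of constraints any coloring can satisfy.

```python
from itertools import product

def satisfied_fraction(coloring, constraints):
    """Fraction of constraints (u, v, perm) with coloring[v] == perm[coloring[u]]."""
    hits = sum(1 for u, v, perm in constraints if coloring[v] == perm[coloring[u]])
    return hits / len(constraints)

def best_fraction(num_nodes, k, constraints):
    """Try all k**num_nodes colorings; only feasible for tiny instances."""
    return max(
        satisfied_fraction(coloring, constraints)
        for coloring in product(range(k), repeat=num_nodes)
    )

# Hypothetical toy networks with 3 nodes and k = 2 colors.
IDENTITY = (0, 1)  # the neighbor must have the same color
SWAP = (1, 0)      # the neighbor must have the other color

# A triangle where every edge demands different colors: no coloring satisfies all three.
odd_cycle = [(0, 1, SWAP), (1, 2, SWAP), (0, 2, SWAP)]

# The same triangle with one edge relaxed: fully satisfiable.
relaxed = [(0, 1, SWAP), (1, 2, SWAP), (0, 2, IDENTITY)]

print(best_fraction(3, 2, odd_cycle))
print(best_fraction(3, 2, relaxed))
```

The exhaustive search here takes time exponential in the number of nodes, which is why it only works on toy cases. The conjecture concerns large networks, where it claims that even telling the "almost everything can be satisfied" case apart from the "almost nothing can be satisfied" case is NP-hard.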
It is impossible to perfectly satisfy human values, as matter and energy are limited, and so will be the capabilities of even an enormously powerful AI. Thus, in trying to maximize human happiness, we are dealing with a problem that's essentially isomorphic to the UGC's coloring problem. Additionally, our values themselves are ill-formed. Human values are numerous, ambiguous, even contradictory. Given the complexities of human value systems, I think it's safe to say we're dealing with a particularly nasty variation of the problem, worse than what computer scientists studying it have dealt with.
Not all specific instances of complex optimization problems are subject to the UGC and thus NP-hard, of course. So this does not in itself mean that building an FAI is impossible. Also, even if maximizing human values is NP-hard (or maximizing the probability of maximizing human values, or maximizing the probability of maximizing the probability of maximizing human values), we can still assess a machine's code and actions heuristically. However, even the best heuristics are limited, as the UGC itself demonstrates. At bottom, all heuristics must rely on inflexible assumptions of some sort.