I am not a computer scientist and do not know much about complexity theory. However, it's a field that interests me, so I occasionally browse some articles on the subject. I was brought to https://www.simonsfoundation.org/mathematics-and-physical-science/approximately-hard-the-unique-games-conjecture/ by a link on Scott Aaronson's blog, and read the article to reacquaint myself with the Unique Games Conjecture, which I had partially forgotten about. If you are not familiar with the UGC, that article will explain it to you better than I can.
One phrase in the article stuck out to me: "there is some number of colors k for which it is NP-hard (that is, effectively impossible) to distinguish between networks in which it is possible to satisfy at least 99% of the constraints and networks in which it is possible to satisfy at most 1% of the constraints". I think this sentence is concerning for those interested in the possibility of creating FAI.
It is impossible to satisfy human values perfectly: matter and energy are limited, and so, therefore, are the capabilities of even an enormously powerful AI. Thus, in trying to maximize human happiness, we are dealing with a problem that is essentially isomorphic to the UGC's coloring problem. Additionally, our values themselves are ill-formed. Human values are numerous, ambiguous, and even contradictory. Given the complexities of human value systems, I think it's safe to say we're dealing with a particularly nasty variation of the problem, worse than anything the computer scientists studying it have dealt with.
Not all specific instances of complex optimization problems are subject to the UGC and thus NP-hard, of course, so this does not in itself mean that building an FAI is impossible. Also, even if maximizing human values is NP-hard (or maximizing the probability of maximizing human values, or maximizing the probability of maximizing the probability of maximizing human values), we can still assess a machine's code and actions heuristically. However, even the best heuristics are limited, as the UGC itself demonstrates. At bottom, all heuristics must rely on inflexible assumptions of some sort.
There's a basic argument that if P != NP then we should expect recursive self-improvement to be hard because many of the problems that would be involved in self-improvement (e.g. memory management, circuit design, protein folding) are NP-hard or NP-complete.
In this form, the argument suffers from two problems:
First, it isn't clear what the constants in question are. There could be an exponential-time algorithm for solving your favorite NP-complete problem that is nonetheless so efficient on instances of any size that matters that the hardness doesn't end up impacting things. It isn't completely clear how serious this problem is. On the one hand, as of right now, "has a polynomial-time algorithm if and only if it has a practical algorithm" looks like a decent heuristic, even though it has exceptions. On the other hand, we've seen, even in the last few years, important examples where careful improvements to algorithms drastically improve practical running speed. One major example is linear programming.
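To make the point about constants concrete, here is a toy illustration (the base and exponent are made up for the sake of the example, not taken from any real problem): a running time of roughly 1.001^n is technically exponential, yet it stays below a modest polynomial like n^4 until n is well into the tens of thousands, so on every instance size that matters it could be the faster algorithm.

```python
# Toy comparison of a polynomial and a mildly exponential cost function.
# The specific exponent and base are invented purely for illustration.
def poly_cost(n):
    return n ** 4          # a "fast" polynomial-time algorithm

def exp_cost(n):
    return 1.001 ** n      # an "intractable" exponential-time algorithm

for n in [100, 1_000, 10_000, 40_000, 100_000]:
    print(f"n={n:>7}  poly={poly_cost(n):.3g}  exp={exp_cost(n):.3g}")

# The "exponential" cost stays smaller until n is in the tens of thousands;
# only on very large instances does the asymptotic behaviour take over.
```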
Second, most of the NP-complete problems we want to solve in practice are optimization problems. So one might hope that even if a problem is NP-hard, getting close to an optimal solution will still be doable efficiently. Maybe for most of these problems, even as the instances get large, we can efficiently find solutions that are very close to the best possible.
UGC addresses both of these points: the first in a weak way, and the second in a strong way. Prior to the Unique Games Conjecture, there was a major theorem called the PCP theorem. It is one of the few unambiguously deep results in computational complexity, and it has important consequences showing that in many cases approximation is tough. If UGC is true, approximation becomes tougher still. Consider, for example, the problem of minimal vertex cover (which shows up in a bunch of practical contexts and is NP-complete). There is a simple algorithm that approximates minimal vertex cover within a factor of 2 or so, and it isn't a bad idea to try to work it out. PCP implies that no polynomial-time algorithm can do better than about 1.3 times the best possible, but that leaves a gap between 2 and 1.3. It turns out that UGC implies that a factor of 2 really is the best possible unless P=NP (more formally, that approximating minimal vertex cover within any constant factor better than 2 is NP-hard). This is discussed in some detail (with spoilers for the above recommended exercise) here. This is one of many problems where UGC implies that some fairly straightforward algorithm really is the best possible unless P=NP. More precisely, if UGC is true, then in a wide variety of circumstances, beating the approximations one gets from semidefinite programming is not doable.
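For concreteness (and as a spoiler for the exercise just mentioned), here is a minimal sketch of the standard matching-based 2-approximation: repeatedly pick an edge neither of whose endpoints is covered yet and add both endpoints. The chosen edges form a matching, and any cover must contain at least one endpoint of each of them, so the result is at most twice the optimum.

```python
# Sketch of the classic matching-based 2-approximation for vertex cover.
def vertex_cover_2approx(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            # Edge is still uncovered: take both endpoints.
            cover.add(u)
            cover.add(v)
    return cover

# Example: on the path 0-1-2-3 the optimal cover is {1, 2} (size 2),
# but this algorithm can return all four vertices, hitting the factor of 2.
print(vertex_cover_2approx([(0, 1), (1, 2), (2, 3)]))   # {0, 1, 2, 3}
```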
So, if one believes UGC, then optimization is tough. This primarily addresses issue 2. How does it address issue 1? Well, as a rough heuristic, if all these approximation problems are NP-hard even for very weak approximations, and P != NP, then one should expect that tightening one's demand for closeness of approximation even a tiny amount will cause the constants (if not the asymptotics as well) to shoot up.
This isn't the entire story. For example, it ignores the possibility that quantum computing may change things. Right now, the general consensus is that it is very unlikely that a quantum computer will be able to solve NP-complete problems in polynomial time. Formally, we believe that NP is not contained in BQP. Since BQP contains P, this is a much stronger claim than P != NP. On the other hand, if one believes UGC, then one also gets that if one believes NP is not contained in BQP, one should also believe that many approximate optimization problems cannot be done substantially better on a quantum computer than they can on a classical computer. Some people have also raised the possibility that even where quantum computers don't improve the asymptotics for a problem, they may improve the constants. There's no intrinsic, theoretical, or strong empirical reason to believe this (with the exception of Grover's algorithm), but it seems like enough of a concern that I consider the idea of letting any nascent AI have access to a functioning quantum computer to be a Really Bad Idea (TM).
It is also worth noting that even if P=NP in some strong sense, this doesn't mean that recursive self-improvement can necessarily happen, although it does make it more likely: physical limits and other issues may still come into play. It is worth keeping in mind that even if one has really fast optimization algorithms, one cannot optimize beyond what is actually possible. To put it another way, if you have a bunch of needles in a haystack, then even if you can search haystacks really fast, you won't find any gold needles if none of the needles are made of gold.