One possible answer to the argument "attempting to build FAI based on Eliezer's ideas seems infeasible and increases the risk of UFAI without helping much to increase the probability of a good outcome, and therefore we should try to achieve a positive Singularity by other means" is that it's too early to decide this. Even if our best current estimate is that trying to build such an FAI increases risk, there is still a reasonable chance that this estimate will turn out to be wrong after further investigation. Therefore, the counter-argument goes, we ought to mount a serious investigation into the feasibility and safety of Eliezer's design (as well as other possible FAI approaches), before deciding to either move forward or give up.
(I've been given to understand that this is a standard belief within SI, except possibly for Eliezer, which makes me wonder why nobody gave this counter-argument in response to my post linked above. ETA: Carl Shulman did subsequently give me a version of this argument here.)
This answer makes sense to me, except for the concern that even seriously investigating the feasibility of FAI is risky if the team doing so isn't fully rational. For example, they may be overconfident about their abilities and thereby overestimate the feasibility and safety, or commit the sunk cost fallacy once they have developed lots of FAI-relevant theory in the attempt to study feasibility, or become too attached to their status and identity as FAI researchers, or some team members may disagree with a consensus of "give up", leave to form their own AGI teams, and take the dangerous knowledge developed with them.
So the question comes down to: how rational is such an FAI feasibility team likely to be, and is that enough for the benefits to exceed the costs? I don't have a lot of good ideas about how to answer this, but the question seems really important to bring up. I'm hoping this post will prompt SI people to tell us their thoughts, and maybe other LWers have ideas they can share.
Speaking for myself...
I don't think I have anything special or insightful to say about this. Basically, I hope this all becomes clearer as the situation develops. Right now MIRI is still years away from successfully recruiting an "FAI team," let alone "building FAI."
John Maxwell, philh, and you all presented reasonable ideas, and I expect additional opportunities to present themselves as time goes on. Shulman's concept of "caged AGIs/WBEs working on small pieces of the problem, including critiquing each other's results" is another idea.
Do you think the question in the OP is significantly and immediately policy-relevant? As explained previously, I'm less confident than (e.g.) Eliezer that MIRI should eventually try to build FAI itself, but in the meantime, it looks really useful to collect a bunch of young people with top cognitive ability and turn their attention to concrete research problems in FAI theory, x-risk strategy, effective altruism, etc. It also looks pretty useful to directly attack the problem of FAI, because I expect strategic information from the exercise, because philosophy is best done from within a science, etc.
Yes, because it implies that building FAI is even harder than it looked before. All the "reasonable ideas" presented so far require doubling the needed resources or multiplying them several times, and they further slow down FAI progress relative to people unconcerned with Friendliness. Together with the apparent difficulty of solving the FAI-specific problem over and above the AGI problems, and the outside-view conclusion that there's no way to become confident...