In a comment on my last interview with Yudkowsky, Eric Jordan wrote:
John, it would be great if you could follow up at some point with your thoughts and responses to what Eliezer said here. He’s got a pretty firm view that environmentalism would be a waste of your talents, and it’s obvious where he’d like to see you turn your thoughts instead. I’m especially curious to hear what you think of his argument that there are already millions of bright people working for the environment, so your personal contribution wouldn’t be as important as it would be in a less crowded field.
I’ve been thinking about this a lot.
[...]
This is a big question. It’s a bit self-indulgent to discuss it publicly… or maybe not. It is, after all, a question we all face. I’ll talk about me, because I’m not up to tackling this question in its universal abstract form. But it could be you asking this, too.
[...]
I’ll admit I’d be happy to sit back and let everyone else deal with these problems. But the more I study them, the more that seems untenable… especially since so many people are doing just that: sitting back and letting everyone else deal with them.
[...]
I think so far the Azimuth Project is proceeding in a sufficiently unconventional way that, while it may fall flat on its face, it’s at least trying something new.
[...]
The most visible here is the network theory project, which is a step towards the kind of math I think we need to understand a wide variety of complex systems.
[...]
I don’t feel satisfied, though. I’m happy enough—that’s never a problem these days—but once you start trying to do things to help the world, instead of just having fun, it’s very tricky to determine the best way to proceed.
Link: johncarlosbaez.wordpress.com/2011/04/24/what-to-do/
His answer, as far as I can tell, seems to be that his Azimuth Project trumps both working directly on friendly AI and supporting it indirectly by earning and contributing money.
It seems that he, and other people who understand all the arguments in favor of friendly AI yet decide to ignore it or dismiss it as infeasible, are rationalizing.
I myself took a different route: I tried to prove to myself that the whole idea of AI going FOOM is somehow flawed, rather than coming up with justifications for why it would be better to work on something else.
I still have some doubts, though. Is it really enough to observe that the arguments in favor of AI going FOOM are logically valid? When should one disregard tiny probabilities of vast utilities and wait for empirical evidence? Yet I think that, compared to the alternatives, the arguments in favor of friendly AI are watertight.
The reason why I and other people seem reluctant to accept that it is rational to support friendly AI research is that the consequences are unbearable. Robin Hanson recently described the problem:
Reading the novel Lolita while listening to Winston’s Summer, thinking of a fond friend’s companionship, and sitting next to my son, all on a plane traveling home, I realized how vulnerable I am to needing such things. I’d like to think that while I enjoy such things, I could take them or leave them. But that’s probably not true. I like to think I’d give them all up if needed to face and speak important truths, but well, that seems unlikely too. If some opinion of mine seriously threatened to deprive me of key things, my subconscious would probably find a way to see the reasonableness of the other side.
So if my interests became strongly at stake, and those interests deviated from honesty, I’d likely not be reliable in estimating truth.
I believe that people like me feel that to fully accept the importance of friendly AI research would deprive us of the things we value and need.
I feel that I wouldn't be able to justify what I value on the grounds of needing such things. It feels as though I could, and should, overcome everything that isn't either a direct contribution to FAI research or a way to earn more money that I could contribute.
Some of us value and need things that consume a lot of time... that's the problem.
Well, let's start with the conditional probability, assuming humans don't find some other way to kill ourselves or end civilization before it comes to this. Eliezer seems to argue the following:
A. Given we survive long enough, we'll find a way to write a self-modifying program that has, or can develop, human-level intelligence. (The capacity for self-modification follows from 'artificial human intelligence,' but since we've just seen links to writers ignoring that fact I thought I'd state it explicitly.) This necessarily gives the AI the potential for greater-than-human intelligence due to our known flaws. (I don't know how we'd give it all of our disadvantages even if we wanted to. If we did, then someone else could and eventually would build an AI without such limits.)
B. Given A, the intelligence would improve itself to the point where we could no longer predict its actions in any detail.
C. Given B, the AI could escape from any box we put it in. (IIRC this excludes certain forms of encryption, but I see no remotely credible scenario in which we sufficiently encrypt every self-modifying AI forever.)
D. Given B and C, the AI could wipe out humanity if it 'wanted' to do so.
E. Given D, if you tell the AI to kill you, it will try to kill you.
My estimates for the probabilities of some of these fluctuate from day to day, but I tend to give them all high numbers. Claim A in particular seems almost undeniable given the evidence of our own existence. (I only listed it separately so that people who want to argue can do so more precisely.) And when it comes to Claim E, which says that if you tell a computer to kill you it will try to kill you, I don't think the alternative has enough evidence to even consider. So I find it hard to imagine anyone rationally getting a total lower than 12%, or just under 1/8.
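To make the arithmetic behind that "total" explicit (the credences below are purely illustrative, not numbers given anywhere above): the total is the product of the conditional probabilities assigned to claims A through E, for example

$$
P(A)\cdot P(B\mid A)\cdot P(C\mid B)\cdot P(D\mid B,C)\cdot P(E\mid D) \approx 0.95 \times 0.80 \times 0.70 \times 0.60 \times 0.40 \approx 0.13,
$$

which comes out just over 1/8; because the factors multiply, lowering any single claim's probability pulls the whole estimate down proportionally.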
Now, that all applies to the conditional probability (if human technological civilization lives that long). I don't know how to evaluate the timescale involved or the chance of us killing ourselves before the issue would come up. The latter certainly feels like less than 11/12.
The question would grow in importance if we found out that we needed to convince a nationally important number of people to pay attention to the issue before someone creates a theory of Friendly AI including AI goal stability. I really hope that doesn't apply, because I suspect that if it does we're screwed.