Edit: Some people have misunderstood my intentions here. I do not in any way expect this to be the NEXT GREAT IDEA. I just couldn't see anything wrong with this, which almost certainly meant there were gaps in my knowledge. I thought the fastest way to see where I went wrong would be to post my idea here and see what people say. I apologise for any confusion I caused. I'll try to be more clear next time.
(I really can't think of any major problems in this, so I'd be very grateful if you guys could tell me what I've done wrong).
So, a while back I was listening to a discussion about the difficulty of making an FAI. One of the suggested ways to circumvent this was to go down the route of programming an AGI to solve FAI. Someone else pointed out the problems with this: amongst other things, one would have no idea what the AI would do in pursuit of its primary goal. Furthermore, it would already be a monumental task to program an AI whose primary goal is to solve the FAI problem, though doing this would still be easier than solving FAI outright, I should think.
So, I started to think about this for a little while, and I thought 'how could you make this safer?' Well, first off, you don't want an AI that completely outclasses humanity in terms of intellect. If things went Wrong, you'd have little chance of stopping it. So, you want to limit the AI's intellect to genius level, so that if something did go Wrong, the AI would not be unstoppable. It might do quite a bit of damage, but a large group of intelligent people with a lot of resources on their hands could stop it.
Therefore, the AI must be prevented from modifying parts of its own source code; you must try to stop an intelligence explosion from taking off. So: limited access to its source code, and a limit on how much computing power it has on hand. This is problematic, though, because the AI would not be able to solve FAI very quickly. After all, we have a few genius-level people trying to solve FAI, and they're struggling with it, so why should a genius-level computer do any better? Well, an AI would have fewer biases, and could accumulate much more expertise relevant to the task at hand. It would be about as capable of solving FAI as the most capable human could possibly be; perhaps even more so. Essentially, you'd get someone like Turing, von Neumann, Newton and others all rolled into one working on FAI.
But there's still another problem. The AI, if left working on FAI for 20 years, let's say, would have accumulated enough skills that it would be able to cause major problems if something went wrong. Sure, it would be as intelligent as Newton, but it would be far more skilled. Humanity fighting against it would be like sending a young Miyamoto Musashi against his future self at his zenith, i.e. completely one-sided.
What must be done, then, is that the AI has a time limit of a few years (or less), and after that time has passed, it is put to sleep. We look at what it accomplished, see what worked and what didn't, boot up a fresh version of the AI with any required modifications, and tell it what the old AI did. Repeat the process for a few years, and we should end up with FAI solved.
After that, we just make an FAI, and wake up the originals, since there's no point in killing them off at this point.
But there are still some problems. One: time. Why try this when we could solve FAI ourselves? Well, I would only try to implement something like this if it were clear that AGI will be solved before FAI is. A backup plan, if you will. Second, what if FAI is just too much for people at our current level? Sure, we have guys who are one in ten thousand and better working on this, but what if we need someone who's one in a hundred billion? Someone who represents the peak of human ability? We shouldn't just wait around for them, since some idiot would probably just make an AGI thinking it would love us all anyway.
So, what do you guys think? As a plan, is this reasonable? Or have I just overlooked something completely obvious? I'm not saying that this would be easy in any way, but it would be easier than solving FAI.
I'm not convinced that the solution you propose is in fact easier than solving FAI. The following problems occur to me:
1) How do we explain to the creator AI what an FAI is?
2) How do we allow the creator AI to learn "properly" without letting it self-modify in ways that we would find objectionable?
3) In the case of an unfriendly creator AI, how do we stop it from "sabotaging" its work in a way that would make the resulting "FAI" friendly to the creator AI and not to us?
In general, I feel like the approach you outline just passes the issue up one level, requiring us to make the creator AI itself friendly.
On the other hand, if you limit your idea to something somewhat less ambitious, e.g. how do we make a safe AI to solve [difficult mathematical problem useful for FAI], then I think you may be right.
You may have a point there. But I think that the problems you've outlined are ones that we could circumvent.
With 1), we don't know exactly how to describe what an FAI should do, or be like, so we might present an AI with the challenge of 'what would an FAI be like for humanity?' and then use that as a goal for FAI research.
2) I should think that it's technically possible to construct it in such a way that it can't just become a super-intellect, whilst still allowing it to grow in pursuit of its goal. I would have to think for a while to present a decen...