Edit: Some people have misunderstood my intentions here. I do not in any way expect this to be the NEXT GREAT IDEA. I just couldn't see anything wrong with this, which almost certainly meant there were gaps in my knowledge. I thought the fastest way to see where I went wrong would be to post my idea here and see what people say. I apologise for any confusion I caused. I'll try to be more clear next time.
(I really can't think of any major problems in this, so I'd be very grateful if you guys could tell me what I've done wrong).
So, a while back I was listening to a discussion about the difficulty of making an FAI. One of the ways suggested to circumvent this was to program an AGI whose goal is to solve FAI. Someone else pointed out the problems with this: amongst other things, one would have no idea what the AI would do in pursuit of its primary goal. Furthermore, it would already be a monumental task to program an AI whose primary goal is to solve the FAI problem, though doing so should still be easier than solving FAI directly, I should think.
So, I started to think about this for a little while, and I thought 'how could you make this safer?' Well, first off, you don't want an AI that completely outclasses humanity in terms of intellect. If things went Wrong, you'd have little chance of stopping it. So, you want to limit the AI's intellect to genius level; that way, if something did go Wrong, the AI would not be unstoppable. It might do quite a bit of damage, but a large group of intelligent people with a lot of resources on their hands could stop it.
Therefore, the AI must be prevented from modifying parts of its own source code. You must try to stop an intelligence explosion from taking off. So: limited access to its source code, and a limit on how much computing power it can have on hand. This is problematic, though, because the AI would not be able to solve FAI very quickly. After all, we have a few genius-level people trying to solve FAI and they're struggling with it, so why should a genius-level computer do any better? Well, an AI would have fewer biases, and could accumulate much more expertise relevant to the task at hand. It would be about as capable at solving FAI as the most capable human could possibly be; perhaps even more so. Essentially, you'd get someone like Turing, von Neumann, Newton, and others all rolled into one working on FAI.
But there's still another problem. The AI, if left working on FAI for 20 years, let's say, would have accumulated enough skills to cause major problems if something went wrong. Sure, it would be as intelligent as Newton, but it would be far more skilled. Humanity fighting against it would be like sending a young Miyamoto Musashi against his future self at his zenith, i.e. completely one-sided.
What must be done, then, is to give the AI a time limit of a few years (or less); after that time has passed, it is put to sleep. We look at what it accomplished, see what worked and what didn't, boot up a fresh version of the AI with any required modifications, and tell it what the old AI did. Repeat the process for a few years, and we should end up with FAI solved.
After that, we just make an FAI, and wake up the originals, since there's no point in killing them off at this point.
But there are still some problems. First, time. Why try this when we could solve FAI ourselves? Well, I would only implement something like this if it were clear that AGI will be solved before FAI is. A backup plan, if you will. Second, what if FAI is just too much for people at our current level? Sure, we have people who are one in ten thousand and better working on this, but what if we need someone who's one in a hundred billion? Someone who represents the peak of human ability? We shouldn't just wait around for them, since some idiot would probably make an AGI in the meantime, thinking it would love us all anyway.
So, what do you guys think? As a plan, is this reasonable? Or have I just overlooked something completely obvious? I'm not saying this would be easy in any way, but it would be easier than solving FAI.
I have not read the materials yet, but there is something fundamental I don't understand about the superintelligence problem.
Are there really serious reasons to think that intelligence is such a hugely useful thing that a 1000-IQ being would acquire superpowers? Somehow I never had an intuitive trust in the importance of intelligence (my own was more often a hindrance than an asset, suppressing my instincts). A superintelligence could figure out how to do anything, but there is a huge gap between figuring things out and actually doing them. Today, many of the most intelligent people alive basically do nothing but play chess (Polgar, Kasparov); Marilyn vos Savant runs a column entertaining readers by solving their toy logic puzzles; Rick Rosner is a TV writer; and James Woods became an actor, giving up an academic career for it. They are all over IQ 180.
My point is, what are the reasons to think a superintelligent AI will actually exercise power to change the world, instead of just entertaining itself with chess puzzles, collecting jazz, writing fan fiction, or similar "savant" hobbies?
What are the chances of a no-fucks-given superintelligence? Was this even considered, or is it just assumed ab ovo that intelligence must be a fearsomely powerful thing?
I suspect a Silicon Valley bias here. You guys in the Bay Area are very much used to people using their intelligence to change the world. But that does not seem to be the default. It seems more common for savants to care only about e.g. chess and basically withdraw from the world. If anything, the Valley is an exception. Outside it, in most of the world, intelligence is more of a hindrance, suppressing instincts and making people unhappy in the menial jobs they are given. Why assume a superintelligent AI would both have a Valley-type personality, i.e. actually be interested in using reasoning to change the world, and be put into an environment where it has the resources to do so? I could easily imagine an AI being kind of depressed because it has to do menial tasks, and entertaining itself with chess puzzles. I mean, this is how intelligence most often works in the world. Most often it is not combined with ambition, motivation, and lucky circumstances.
In my opinion, intelligence, rationality, is like aiming an arrow with a bow. It is very useful to aim accurately, but the difference between a nanometer and a picometer of inaccuracy is negligible, so you quickly hit diminishing marginal returns there, and then other things matter much more: how strong your bow is, how many arrows you have, how many targets you have, and all that.
Am I missing something? I am simply looking at what difference intelligence makes in the world-changing ability of humans and extrapolating from that. Most savants simply don't care about changing the world; some of those who do realize that skills other than intelligence are needed; and most are not put into a highly meritocratic Silicon Valley environment but into more stratified ones, where the 190-IQ son of a waiter is probably a cook. Why would AI be different? Any good reasons?
I'm not sure how useful or relevant a point this is, but I was just thinking about this when I saw the comment: IQ is defined within the range of human ability, where an arbitrarily large IQ just means being the smartest human in an arbitrarily large human population. "IQ 1000" and "IQ 180" might both be close enough to the asymptote of the upper limit of natural human ability that the difference is indiscernible. Quantum probabilities of humans being born with really weird "superhuman" brain architectures notwithstanding, a truly superintelligent being might have a true IQ of "infinity" or "N/A", which sounds much less likely to stick to the expectations we have of human savants.
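The asymptote point can be made concrete with a back-of-the-envelope calculation. A rough sketch, assuming the usual normal model of IQ (mean 100, SD 15; an idealisation that certainly breaks down this far out in the tail), of how large a population you'd need before a given IQ is expected to appear roughly once:

```python
import math

def log10_population(iq, mean=100.0, sd=15.0):
    """log10 of the population size in which someone of this IQ is
    expected to appear about once, under a normal-curve idealisation."""
    z = (iq - mean) / sd
    if z < 25:
        # erfc is still accurate at moderate z
        tail = 0.5 * math.erfc(z / math.sqrt(2))
        return -math.log10(tail)
    # For huge z, erfc underflows to 0.0 in double precision, so use
    # the asymptotic tail: P(Z > z) ~ exp(-z^2 / 2) / (z * sqrt(2*pi))
    return (z * z / 2 + math.log(z * math.sqrt(2 * math.pi))) / math.log(10)

print(round(log10_population(180), 1))   # ~7.3: IQ 180 is about 1 in 20 million
print(round(log10_population(1000), 1))  # ~784: "IQ 1000" would be 1 in 10^784
```

So on a naive reading, "IQ 1000" requires a population of about 10^784 humans, vastly more than the ~10^80 atoms in the observable universe, which is one way of seeing that human-referenced IQ simply stops being a meaningful scale for superintelligence.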