Alternate proposal: here's a specific alternative, developed with feedback from members of the #FAI channel on IRC.
Instead of building non-human optimizing algorithms, we develop accurate whole brain emulations of once-living people. The simulation hardware is a set of custom-designed chips with hardware restrictions that prevent external writes to the memory cells storing the parameters for each emulated synapse. That is to say, the emulated neural network can update its synapses and develop new ones (i.e., it can learn), but the wires that would allow it to totally rewrite itself are permanently disconnected in the chip. (It's basically a write-once FPGA: you need exactly one write to load the compiled mapping of a human mind into the chips.)
Thus, these emulations of human beings can only change themselves in a limited manner. A similar restriction is present in real human brain tissue: neurons in lower-level systems have far less flexibility during the lifespan of adults. You cannot "learn" to not breathe, for instance. (You can hold your breath via executive function, but once you pass out, cells in the brainstem will cause you to breathe again.)
This security measure rules out a whole class of possible failures: those in which an emulation rewrites itself wholesale.
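The write-once restriction described above can be sketched in software as a toy model. This is only an illustration of the idea, not a hardware design: the class name, the method names, and the clamp on the size of a single learning step (`max_step`) are all assumptions made for the example.

```python
class WriteOnceSynapseStore:
    """Toy model of a write-once parameter store (all names are hypothetical)."""

    def __init__(self, max_step=0.1):
        self._weights = None        # filled exactly once by load_image
        self._max_step = max_step   # assumed cap on a single learning update

    def load_image(self, weights):
        # The single permitted bulk write: loading the compiled mind image.
        if self._weights is not None:
            raise PermissionError("bulk write path is disabled after first load")
        self._weights = dict(weights)

    def read(self, synapse_id):
        return self._weights[synapse_id]

    def learn(self, synapse_id, delta):
        # Internal learning: a small, clamped adjustment to one synapse.
        # A synapse that does not exist yet starts from 0.0 (a new connection).
        if self._weights is None:
            raise RuntimeError("no image loaded")
        step = max(-self._max_step, min(self._max_step, delta))
        self._weights[synapse_id] = self._weights.get(synapse_id, 0.0) + step
        return self._weights[synapse_id]


store = WriteOnceSynapseStore(max_step=0.1)
store.load_image({"a->b": 0.5})
store.learn("a->b", 5.0)     # requested change is clamped to +0.1
store.learn("b->c", 0.05)    # a new synapse can still form

try:
    store.load_image({"a->b": -1.0})   # second bulk write must fail
    second_load_rejected = False
except PermissionError:
    second_load_rejected = True
```

In the real proposal the guarantee would come from missing wires rather than from an `if` statement, which is the whole point: software checks like this one can be bypassed, disconnected hardware cannot.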
Anyway, you don't just scan and emulate one human being. An isolated person is not an entity capable of prolonged independent operation and self-improvement. Humans have evolved to function properly in small tribes. So you have to scan enough people to create an entire tribe, sufficient for the social bonds and so forth needed to keep people sane. During this entire process, you use hardware blocks to physically prevent the emulation speed from exceeding a certain multiple of real time (current limiters or something similar).
Once you have an entire working tribe of sane people, interconnected in a manner that allows them to act as checks on each other, you gradually increase their responsibility and capabilities (by boosting emulation speeds, making them responsible for gradually more complex systems, etc.).
Eventually, this emulated tribe would run at a maximum rate of perhaps 10^6 times real time and be capable of performing self-improvement to a limited degree. Compared to extant human beings, people like this would have effective super-intelligence and would most likely be capable of solving problems to improve the quality and length of human lives. Maybe they could not develop "magic" and take over the universe (if that is possible), but they could certainly solve the problems of humanity.
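To get a feel for what a 10^6 speedup means, here is a quick back-of-the-envelope calculation. The 10^6 figure is the one suggested above; the year length (a Julian year of 365.25 days) and the function names are just choices made for this sketch.

```python
SPEEDUP = 10**6                          # maximum rate suggested in the text
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY   # Julian year: 31,557,600 s


def wall_seconds_per_subjective_year(speedup=SPEEDUP):
    # Wall-clock seconds that pass while the emulation experiences one year.
    return SECONDS_PER_YEAR / speedup


def subjective_years_per_wall_day(speedup=SPEEDUP):
    # Years experienced by the emulation during one wall-clock day.
    return SECONDS_PER_DAY * speedup / SECONDS_PER_YEAR


print(wall_seconds_per_subjective_year())   # about 31.56 seconds
print(subjective_years_per_wall_day())      # about 2737.85 years
```

So at the full rate, the tribe would live through a subjective year roughly every half minute, and a bit over 2,700 subjective years per calendar day.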
I'd much rather have, at the head of things, a weak super-intelligence smart enough to make high-quality 3D molecular printers, and giant space habitats for humans to live in, and genetic patches to stop all human aging and disease, and slow but working starships, and a sane form of government, and a method to back up human personalities to recover from accidental and violent death, and so on.
Ideally, this super-intelligence would consist of a network of former humans who cannot change so much as to forget their roots (because of the hardware blocks mentioned previously).
And then someone makes a self-improving AI anyhow, and blows right past them.
Suppose you make a super-intelligent AI and run it on a computer. The computer has NO conventional means of output (no connections to other computers, no screen, etc.). Might the AI still be able to get out or cause harm? I'll post my ideas, and you post yours in the comments.
(This may have been discussed before, but I could not find a dedicated topic.)
My ideas:
-manipulate current through its hardware, or better yet, through the power cable (a ready-made antenna), to create electromagnetic waves and access some wireless-equipped device. (I'm no physicist, so I don't know whether certain frequencies would be hard to produce.)
-manipulate usage of its hardware (which likely makes small amounts of noise naturally) to approximate human speech, allowing it to communicate with its captors. (This seems even harder than the 1-line AI box scenario.)
-manipulate usage of its hardware to create sound or noise that messes with human emotion. (To my understanding, tones may affect emotion, but not in any easily predictable way.)
-also, manipulating its power use will cause changes in the power company's database. There doesn't seem to be an obvious exploit there, but it IS external communication, for what it's worth.
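To make the power-use idea concrete, here is a minimal simulation of how varying load could carry data at all. It is a sketch under loud assumptions: the wattage numbers, the threshold, and the function names are invented for illustration, nothing here touches real hardware, and a real channel would be far noisier and slower. The scheme is plain on-off keying: high power draw in a time slot means 1, low draw means 0.

```python
def encode_bits(message: bytes):
    # Turn each byte into 8 bits, most significant bit first.
    return [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]


def power_trace(bits, idle_watts=100.0, busy_watts=150.0):
    # One power reading per time slot: busy for a 1 bit, idle for a 0 bit.
    # The wattage values are made up for this example.
    return [busy_watts if bit else idle_watts for bit in bits]


def decode_trace(trace, threshold=125.0):
    # An outside observer thresholds each reading and regroups bits into bytes.
    bits = [1 if watts > threshold else 0 for watts in trace]
    out = bytearray()
    for start in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


trace = power_trace(encode_bits(b"hi"))
recovered = decode_trace(trace)
```

The same shape applies to the noise and radio ideas above: anything the AI can vary and someone outside can measure is, in principle, a channel.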
Let's hear your thoughts! Lastly, as in similar discussions, you probably shouldn't come out of this thinking, "Well, if we can just avoid X, Y, and Z, we're golden!" There are plenty of unknown unknowns here.