Would it be possible to write a safe, recursively self-improving chess-playing AI, for instance?
Would this AI think about chess in abstract, or would it play chess against real humans? More precisely, would it have a notion of "in situation X, my opponents are more likely to make a move M" even if such knowledge cannot be derived from mere rules of chess? Because if it has some concept of an opponent (even in sense of some "black box" making the moves), it could start making some assumptions about the opponent and testing them. There would be an information channel from the real world to the world of AI. A very narrow channel, but if the AI could use all bits efficiently, after getting enough bits it could develop a model of the outside world (for the purposes of predicting the opponent's moves better).
In other words, I imagine an AIXI, which can communicate with the world only through the chess board. If there is a way to influence the world outside, in a way that leads to more wins in chess, the AI would probably find it. For example, the AI could send outside a message (encoded in its choice of possible chess moves) that it is willing to help any humans if those humans will allow the AI to win in chess more often. Somebody could make a deal with the AI like this: "If you help me become the king of the world, I promise I will let you win all chess games every" and the AI would use its powers (combined with the powers of the given human) to reach this goal.
Suppose you make a super-intelligent AI and run it on a computer. The computer has NO conventional means of output (no connections to other computers, no screen, etc). Might it still be able to get out / cause harm? I'll post my ideas, and you post yours in the comments.
(This may have been discussed before, but I could not find a dedicated topic)
My ideas:
-manipulate current through its hardware, or better yet, through the power cable (a ready-made antenna) to create electromagnetic waves to access some wireless-equipped device. (I'm no physicist so I don't know if certain frequencies would be hard to do)
-manipulate usage of its hardware (which likely makes small amounts of noise naturally) to approximate human speech, allowing it to communicate with its captors. (This seems even harder than the 1-line AI box scenario)
-manipulate usage of its hardware to create sound or noise to mess with human emotion. (To my understanding tones may affect emotion, but not in any way easily predictable)
-also, manipulating its power use will cause changes in the power company's database. There doesn't seem to be an obvious exploit there, but it IS external communication, for what it's worth.
Let's hear your thoughts! Lastly, as in similar discussions, you probably shouldn't come out of this thinking, "Well, if we can just avoid X, Y, and Z, we're golden!" There are plenty of unknown unknowns here.