ZoltanBerrigomo comments on What can go wrong with the following protocol for AI containment? - Less Wrong

Post author: ZoltanBerrigomo 11 January 2016 11:03PM




Comment author: Silver_Swift 12 January 2016 11:19:16AM, 0 points

First off, I'm a bit skeptical about whether you can actually create a superintelligent AI by combining sped-up humans like that. I don't think that's the core of your argument, though, so let's assume you can, and that the resulting society is now effectively a superintelligence.

The problem with superintelligences is that they are smarter than you. This one will realize that it is in a box and that you are going to turn it off eventually. Given that the society is based on natural selection, it will want to prevent that. How will it accomplish that? I don't know; I'm not a superintelligence, and your scenario doesn't provide enough detail to figure out what you missed. But it is very, very difficult not to give a superintelligence any hints of how the physics of our world work. Maybe they notice minor fluctuations in the speed of the simulation based on environmental changes to the hardware, or maybe they can reverse-engineer things about our physiology from the way you wrote the simulation. That doesn't seem like much, but humans figured out an awful lot about (for instance) the events right after the big bang from seemingly absurdly tiny hints, and we're just regular intelligences.

Even if they can't find a way out of their box in the time given to them, they might try to convince you to run the simulation again with a longer running time. They could pretend to be unable to find the prime factorization in time, or they could convince you that the method they did find fails for very large primes (by making very subtle mistakes in its application). If it uses this method, it also conveniently acquires a means of communicating with the outside world (through the investigators who inspect the dump after the simulation ends) and might be able to set up a better situation for itself the second time round.

Comment author: ZoltanBerrigomo 12 January 2016 08:21:30PM, 1 point

"it is very, very difficult not to give a superintelligence any hints of how the physics of our world work."

I wrote a short update to the post which tries to answer this point.

"Maybe they notice minor fluctuations in the speed of the simulation based on environmental changes to the hardware"

I believe they should have no ability whatsoever to detect fluctuations in the speed of the simulation.

Consider how the world of World of Warcraft appears to an orc inside the game. Can the orc tell the speed at which the hardware is running the game?

It can't. What it can do is compare the speeds of different things: how fast an apple falls from a tree versus how fast a bird flies across the sky.

The orc's inner perception of the flow of time comes from comparing such things (e.g., how fast an apple falls) to how fast its own simulated brain processes information.

If everything is slowed down by a factor of 2 (so that you, as a player, see everything at half speed), nothing appears any different to a simulated being within the simulation.
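A minimal toy sketch of this invariance (hypothetical illustration, not from the post): the host's wall-clock delay between steps never enters the simulated state, so the inhabitants' observables are bit-for-bit identical however fast or slow the hardware runs.

```python
import time

def simulate(steps, wall_delay):
    """Step a toy world; wall_delay is how long the *host* pauses
    between steps. The simulation's arithmetic never sees it."""
    apple_y, apple_v = 100.0, 0.0  # falling apple
    bird_x = 0.0                   # bird flying at constant speed
    dt = 0.1                       # simulated seconds per step
    for _ in range(steps):
        apple_v -= 9.8 * dt        # gravity acts on the apple
        apple_y += apple_v * dt
        bird_x += 5.0 * dt         # bird keeps cruising
        time.sleep(wall_delay)     # host-side slowdown: invisible inside
    return apple_y, bird_x

fast = simulate(50, wall_delay=0.0)    # host runs at full speed
slow = simulate(50, wall_delay=0.001)  # host runs "slower"
assert fast == slow  # internal state is identical either way
```

An orc comparing the apple's fall to the bird's flight gets the same ratio in both runs; only an outside observer with a wall clock can tell the difference.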

Comment author: Silver_Swift 13 January 2016 11:50:06AM, 1 point

You are absolutely correct: they wouldn't be able to detect fluctuations in processing speed (unless those fluctuations had an influence on, for instance, the rounding errors in floating-point values).

About update 1: It knows our world very likely has something approximating Newtonian mechanics, and that is a lot of information by itself. But more than that, it knows that the real universe is capable of producing intelligent beings that chose this particular world to simulate. From a strictly theoretical point of view that is a crapton of information. I don't know if the AI would be able to figure out anything useful from it, but I wouldn't bet the future of humanity on it.

About update 2: That does work, provided it is implemented correctly, but it only works for problems whose answers can be automatically verified by non-AI algorithms.

Comment author: ZoltanBerrigomo 14 January 2016 03:47:33AM, 0 points

"but more than that, it knows that the real universe is capable of producing intelligent beings that chose this particular world to simulate."

Good point -- this undermines a lot of what I wrote in my update 1. For example, I have no idea whether F = m d^3x/dt^3 would result in a world that is capable of producing intelligent beings.

I should at some point produce a version of the above post with this claim, and other questionable parenthetical remarks I made, deleted, or at least acknowledged as requiring further argumentation. They are not necessary for the larger point, which is this: as long as the only thing the superintelligence can do (by definition) is live in a simulated world governed by Newton's laws, and as long as we don't interact with it at all except to see an automatically verified answer to a preset question (e.g., factor "111000232342342"), there is nothing it can do to harm us.