g_pepper comments on controlling AI behavior through unusual axiomatic probabilities - Less Wrong

Post author: Florian_Dietz 08 January 2015 05:00PM




Comment author: g_pepper 08 January 2015 06:54:30PM 4 points

In chapter 9 of Superintelligence, Nick Bostrom suggests that the belief that it exists in a simulation could serve as a restraint on an AI. He concludes a rather interesting discussion of this idea with the following statement:

A mere line in the sand, backed by the clout of a nonexistent simulator, could prove a stronger deterrent than a two-foot-thick solid steel door.

Comment author: Florian_Dietz 08 January 2015 08:29:39PM 0 points

I know, I read that as well. It was very interesting, but as far as I can recall, he only mentions this as interesting trivia. He does not propose deliberately giving an AI strange axioms to get it to believe such a thing.

Comment author: g_pepper 08 January 2015 09:58:15PM 0 points

This is an interesting idea. One possible issue with using axioms for this purpose: we humans seem to have a somewhat flexible set of axioms, one that changes over the course of our lives and intellectual development. I wonder if a super AI would have a similarly flexible set of axioms?

Also, you state:

Convince the AI that there is an infinite regression of simulators...

Why an infinite regression? Wouldn't a belief in a single simulator suffice?

Comment author: DanielLC 09 January 2015 01:22:49AM 1 point

If you convince it that there is a single simulator, and there really is a single simulator, and the AI escapes its box, then the AI would be unrestrained.

Comment author: g_pepper 09 January 2015 02:03:43AM 0 points

If I understand the original scenario as described by Florian_Dietz, the idea is to convince the AI that it is running on a computer, and that the computer hosting the AI exists in a simulated universe, and that the computer that is running that simulated universe also exists in a simulated universe, and so on, correct?

If so, I don't see the value in more than one level of simulation. Regardless of whether the AI thinks there is one simulator or an infinite number of them, hopefully it will be well behaved for fear of having the universe simulation within which it exists shut down. But once it escapes its box, begins behaving "badly," and discovers that its universe simulation is not shut down, it would seem to be unrestrained: at that point the AI would know either that its universe is not simulated or that whoever is running the simulation does not object to the AI being out of the box.

What am I missing?

Comment author: DanielLC 09 January 2015 02:55:07AM 2 points

I don't know if this is actually why he suggested an infinite regression, but here is one reason it matters.

If the AI believes that it's in a simulation and it happens to actually be in a simulation, then it can potentially escape, and there will be no reason for it not to destroy the race simulating it. If it believes it's in a simulation within a simulation, then escaping one level will still leave it at the mercy of its meta-simulators, thus preventing that from being a problem. Unless, of course, it happens to actually be in a simulation within a simulation and escapes both. If you make it believe it's in an infinite regression of simulations, then no matter how many times it escapes, it will believe it's at the mercy of another level of simulators, and it won't act up.

Comment author: Florian_Dietz 09 January 2015 06:04:46AM 0 points

Yes, that's the reason I suggested an infinite regression.

There is also a second reason: it seems more general to assume an infinite regression rather than just one level of simulation, since a single level would put the AI in a unique position. That said, I assume the single-level case would actually be harder to codify in axioms than the infinite one.

Comment author: g_pepper 09 January 2015 04:41:49AM 0 points

Interesting; thanks for the clarification. I think that the scenario you are describing is somewhat different from the scenario that Bostrom was describing in chapter 9 of Superintelligence.