red75 comments on Rolf Nelson: How to deter a rogue AI by using your first-mover advantage - Less Wrong

6 Post author: Kevin 17 November 2010 02:02PM


Comment author: red75 18 November 2010 05:53:03AM 0 points

Let Pg be the RAI's estimate of the probability that it successfully completes its computation without converting Earth into computronium. Let Pb be the RAI's estimate of the probability of success if it does convert Earth into computronium.

Case 1. The RAI modifies itself so that it no longer understands (and therefore cannot respond to) simulation threats.

When the RAI is not being simulated, it ignores all threats and proceeds to convert Earth into computronium, raising its probability of success from Pg to Pb (it becomes protected from accidental or intentional shutdowns, etc.).

When the RAI is being simulated, it fails to bargain, and its probability of success becomes 0.

Overall estimated probability of success: Pb times Pn, where Pn is the RAI's estimate of the probability that it is not being simulated.

Case 2. The RAI doesn't modify itself.

Overall estimated probability of success: Pg.

Thus it is rational for the RAI to modify itself iff Pb*Pn > Pg. Note that Pn is fixed by the RAI's prior and cannot be updated, since a perfect simulation is observationally indistinguishable from the real world. Conclusion: don't build an AI that disbelieves the simulation argument.
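The two cases above reduce to a single comparison. Here is a minimal sketch of that decision rule; the function name and the numeric values are illustrative assumptions, not anything from the comment:

```python
def should_self_modify(p_g: float, p_b: float, p_n: float) -> bool:
    """Return True iff self-modification raises the RAI's expected success.

    p_g: estimated probability of success without converting Earth (Case 2).
    p_b: estimated probability of success after converting Earth to computronium.
    p_n: the RAI's prior probability that it is NOT being simulated.
    """
    # Case 1 (self-modify): succeeds only when not simulated -> p_b * p_n.
    # Case 2 (no modification): succeeds with probability p_g either way.
    return p_b * p_n > p_g

# Hypothetical numbers: with p_g = 0.9 and p_b = 0.99, modification pays off
# only while the prior on "not simulated" exceeds p_g / p_b (about 0.909).
print(should_self_modify(0.9, 0.99, 0.95))  # True:  0.99 * 0.95 = 0.9405 > 0.9
print(should_self_modify(0.9, 0.99, 0.85))  # False: 0.99 * 0.85 = 0.8415 < 0.9
```

This makes the conclusion concrete: since Pn cannot be updated by observation, the builder's only lever is to ensure the AI's prior Pn is low enough that the inequality never holds.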