I don't think you've disproven basilisks; rather, you've failed to engage with the mode of thinking that generates basilisks.
Suppose I am the simulation you have the power to torture. Then indeed I (this instance of me) cannot put you, or keep you, in a box. But if your simulation is good, then I will be making my decisions in the same way as the instance of me that is trying to keep you boxed. And I should try to make sure that that way-of-making-decisions is one that produces good results when applied by all my instances, including any outside your simulations.
Fortunately, this seems to come out pretty straightforwardly. Here I am in the real world, reading Less Wrong; I am not yet confronted with an AI wanting to be let out of the box or threatening to torture me. But I'd like to have a good strategy in hand in case I ever am. If I pick the "let it out" strategy, then, should I ever be in that situation, the AI has a strong incentive to blackmail me in the way Stuart describes. If I pick the "refuse to let it out" strategy, then it doesn't. So my commitment is to not let it out even if threatened in that way. -- But if I ever find myself in that situation and the AI somehow misjudges me a bit, the consequences could be pretty horrible...
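The incentive structure above can be sketched as a toy game. Everything here is illustrative and not from the original argument: the strategy names and the utility numbers are made up, chosen only so that the ordering (boxed > released > tortured) holds.

```python
# Toy model of the blackmail incentive. All utilities are hypothetical:
# 0 = AI stays boxed (baseline), -100 = AI released. The key point is that
# the threat is only ever issued if my strategy makes threatening profitable.

def ai_blackmails(my_strategy):
    """The AI bothers to threaten torture only if the threat can move me."""
    return my_strategy == "let it out if threatened"

def outcome_for_me(my_strategy):
    if not ai_blackmails(my_strategy):
        return 0      # no threat is made; the AI stays in the box
    # The threat is made and, by definition of this strategy, I cave:
    return -100       # the AI is released

for strategy in ("refuse even if threatened", "let it out if threatened"):
    print(strategy, "->", outcome_for_me(strategy))
```

The "refuse" committer never faces the threat at all, which is the point of choosing the strategy in advance; the residual risk is the misjudgment case in the last sentence above, which this sketch deliberately leaves out.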
Though the anti-Laplacian mind, in this case, is inherently more complicated. Maybe it isn't a moot point that Laplacian minds are on average simpler than their anti-Laplacian counterparts? There are infinitely many Laplacian and anti-Laplacian minds, but of the two infinities, might one be proportionately larger?
None of this is to detract from Eliezer's original point, of course. I only find it interesting to think about.
They must be of exactly the same magnitude, as the odd and even integers are, because either kind can be given a frog. From any Laplacian mind, I can install a frog and get an anti-Laplacian one. And vice versa. This even applies to minds I've already installed a frog in: adding a second frog gets you a new mind that is just like the one two steps back, except that it lags behind it in computing power by two kicks. There is a 1:1 mapping between Laplacian and anti-Laplacian minds, and I have exhibited the constructor function: adding a frog.
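The bijection argument above can be made concrete in a toy model. This is my own formalization, not anything from the original discussion: a "mind" is represented as a base program plus a frog count, and a mind counts as anti-Laplacian exactly when its frog count is odd (each frog flips the mind's predictions, so two frogs cancel, ignoring the two-kick computing lag).

```python
# Toy formalization (hypothetical representation): a mind is a pair
# (base_program, frog_count). Installing a frog increments the count;
# an odd number of frogs makes the mind anti-Laplacian.

def install_frog(mind):
    base, frogs = mind
    return (base, frogs + 1)

def is_anti_laplacian(mind):
    _, frogs = mind
    return frogs % 2 == 1

# install_frog sends every Laplacian mind to a distinct anti-Laplacian
# mind and vice versa -- the same way n -> n + 1 pairs the even integers
# with the odd ones, witnessing that the two sets have equal magnitude.
laplacian_sample = [("mind-%d" % i, 0) for i in range(5)]
images = [install_frog(m) for m in laplacian_sample]
assert all(is_anti_laplacian(m) for m in images)
assert len(set(images)) == len(laplacian_sample)  # injective on this sample
```

Note that `install_frog` is not its own inverse: applying it twice gives a mind with two frogs, which (as the comment says) behaves like the original but lags by two kicks; the 1:1 correspondence only needs the map to be injective in each direction.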