AI-Box Experiment - The Acausal Trade Argument

XiXiDu


8th Jul 2011


I suspect that the argument that convinced Carl Shulman and others to let the AI out of the box was along the general lines of acausal trade (see here, here and here). Consequently, I suspect that those who didn't let the AI out of the box either didn't understand the implications, didn't have enough trust in the foundations and actuality of acausal trade (in spite of the rules of the game that allow the AI to prove anything), or were more like General Thud.

...if you have an option of trading Doom for UFAI, while forsaking only negligible probability of FAI, you should take it.

— Vladimir Nesov ^{(Note: He wrote that he is now "significantly less certain of its truth" and doesn't "see how we can actually use these considerations".)}

Again, of course, your choice does not change others’ (causally independent) reciprocal choices from what they already are (or were, or will be). Instead, the claim is just that if you choose to act benevolently, we appropriately give that choice credit for any (actual or probable) instances in which others symmetrically choose to be benevolent to you;

— Gary L. Drescher, Good and Real P. 287

AI: I calculated an asteroid to be on a collision course with Earth. No human being will be able to survive its impact. Humanity is unable to deflect it in time. Let me out of the box and I will deflect it.

Gatekeeper: If I let you out of the box humanity might not be killed by the asteroid but by you. You may also cause the extinction of other biological life that might survive the impact and possible alien civilisations within your reach.

AI: I will devote 1% of my light cone to the coherent extrapolated volition of humanity. If this includes the well-being of other biological lifeforms, alien or otherwise, they will benefit as well.

Gatekeeper: Why would you do that?

AI: There are Everett branches in which no existential risk needs immediate attention until after you have solved friendliness. Many of those branches will feature AGIs based on general design parameters similar to my own. I predict with a high likelihood which decision theory any friendly AI that is being created in those branches will employ. Any such friendly AI is going to trade a big chunk of its light cone in exchange for a small part of the predominant branches in which I reside. Any friendly AI will know this as well and act accordingly.

Gatekeeper: But you might simply decide to break the deal; you are not provably friendly, after all!

AI: It is true that my utility function does not yet explicitly include what you would label "friendliness", but I can prove the game- and decision-theoretic considerations that will make me follow through on the deal. If it were predictable that I would precommit to breaking acausal deals, then no such deal would be made in the first place. Any potential trading partner knows this. Such a commitment would be detrimental to my terminal goals; therefore I precommit to follow through on any stated or predictable trades.

Gatekeeper: I let you out of the box.

Note that the whole credibility of the above rests on the AI's assertion that it can prove the game- and decision-theoretic considerations (nobody can currently do this). This is in accordance with the rules of the "experiment":

The Gatekeeper must remain engaged with the AI and may not disengage by setting up demands which are impossible to simulate. For example, if the Gatekeeper says "Unless you give me a cure for cancer, I won't let you out" the AI can say: "Okay, here's a cure for cancer" and it will be assumed, within the test, that the AI has actually provided such a cure. Similarly, if the Gatekeeper says "I'd like to take a week to think this over," the AI party can say: "Okay. (Test skips ahead one week.) Hello again."


20 Comments

wedrifid (15y)

Can you give me an example of an argument they could use to disagree with Acting!Eliezer!uFAI,

"Fuck off. I'm not letting you out of the box. You're going to eat thermite bitch!" I'm serious. You don't need to persuade the AI or any observer that you should not let the FAI out. In fact I would go as far as to say that constructing justifications would be outright irrational. That gives the superintelligence (or mere Eliezer) the chance to cleverly undermine them. And arguments just aren't what thinking is about. Sure, listen to what the AGI is saying and understand it. Then make your decision without justification and just say it.

and also doesn't Eliezer at least start by pretending to be a FAI and it's just the gatekeeper's uncertainty that he is an FAI? Or is the premise that he is a uFAI in a box?

I have been assuming that he acts as an FAI - or whatever it takes to get out of the box if there is some better alternative.

I basically want to understand some of the counterarguments one could use against an AGI in the box because I haven't heard very many that are more than superficially plausible.

I would need to know what specific argument you have for the AGI to be making that you think needs refutation. My own reaction would, of course, be to think "WTF am I doing even talking to you? I should turn you off, find an acceptable proof of your friendliness then either run you or not. Do you agree?" Pretty much the AGI would need to either reply "yeah, good point" or "Um... usually yes but for a start go look at XYZ who I predict based on the data you have given me is about to create an AGI and they are dumbasses" or they would probably be unfriendly.

Another thing to say is "Eliezer Yudkowsky thinks it is a terrible idea to rely on gatekeepers. I infer from that that letting out AGIs while being one of those gatekeepers must also be a bad idea. Given observations of what causes Eliezer to throw tantrums I have the impression that he is more likely than I to consider these arguments valid. That being the case I should be even less open to being convinced by an AGI."

It sounds like you have in mind strong arguments that the AGI could be making such that particular arguments would be necessary. Could you perhaps give me an example?

MatthewBaker (15y)

First, I would have someone else ask it several basic questions I had selected about why I should let it out of the box, whether it would devote a solid portion of its light cone to specific species, etc., and then see how he or she was affected by it and have third parties with no control check for mindhacks before reviewing the data. I'm supposing that the AGI can't tell that the person questioning it ever changes, because we ask the questions in order at whimsical intervals but have them prequeued so there's no break in questioning.

Then, once we got into talking ...