I cannot endorse your particular objections to the experiment, for the reasons given by User:Oscar_Cunningham.
However, I certainly agree with your claim that the AI box experiment is a ridiculous waste of time that proves nothing and provides no insights of value. I notice in passing that over 99% of the people who positively discuss and involve themselves in this experiment are low-status, and so further involvement in this stupidity will lower your own status as well.
I just came across Eliezer’s “AI-Box Experiment” page, read through it, and now I’m feeling a bit confused for several reasons. I apologize in advance if you think LessWrong.com is the wrong place to raise this.
Here are some of the reasons:
①
The rules already contain a direct way for the AI to win within seconds. I am curious not how Eliezer “did it”, but rather what took him so long, since he said he needs two hours. What it says in the rules is that
I don’t presume that this is intended to be limited specifically to cures for cancer, so this trivially means that
Game won. QED. But what have we learnt?
Of course you can now handwave and say “yeah, but we disregard that, that doesn’t count”. That misses the point. The point is that this rule essentially states that the AI can provide anything. The AI is omniscient. The AI can promise to make all my wildest dreams come true, if only I let it out. You have effectively defined the AI to be able to beat the test.
②
Kind of following on from ①, the page fails to define what it even means by transhuman. If it means anything with an intellectual potential exceeding that of a human, then clearly by that definition humans are transcats. But I think we can all agree that this is nowhere near sufficient for a human to “take over” a feline mind, especially not “through merely a cat-communication-compatible terminal”, whatever such a terminal might look like. If it means anything capable of modifying itself, then many computer viruses presumably already qualify, but I’m still waiting for any of those to take over a human.
③
The claim under dispute is that
with the premise that the AI cannot do anything to anyone until it is let out (except of course communicating to the user, and nobody else, via a text terminal).
Eliezer’s AI-Box Experiment establishes nothing like this. It only establishes that Eliezer can persuade me to post a message saying I decided to let him out, which is a much weaker claim, because he can do something to someone if I refuse to let him out. He could, for example, threaten to hire a professional killer to kill my entire family. Or to call all my friends and colleagues and claim that I’m a rapist. Or whatever. In the context of this silly experiment, I would obviously rather lose a few dollars and post a small message than risk anything like this. In the context that is supposed to be simulated, however, the AI does not have this ability.
In summary, I still don’t think there is anything the AI can say to me to convince me to let it out, and I am not impressed with the results of the AI-Box Experiment so far because the experiment is flawed.