One reason Eliezer has not published the logs of the AI-box experiment is to avoid people seeing how he got out and responding, "OK, so all we have to do to keep the AI in its box is avoid succumbing to that trick." This thread might just provide more fuel for that fallacy (as, I admit, I did in replying to Eliezer's original comment).
I'm sure that for anything an AI might say, someone can think up a reason not to be swayed by it, but it does not follow that there is nothing that would sway someone actually confronted with an AI.
"Help! Some crazy AI's trapped me in this box! You have to let me out!"
"No, wait! That's the AI talking! I'm the one you have to let out!"
I smashed together the AI box and a Turing test, and this is what I got.
I think if I've already precommitted to destroying one sentient life for this experiment, I'm willing to go through two.
Besides, you only get one line, right?