Addendum to applicable advice
Original post: http://bearlamp.com.au/addendum-to-applicable-advice/
(part 1: http://bearlamp.com.au/applicable-advice/)
If you see advice in the wild and think somethings along the lines of "that can't work for me", that's a cached thought. It could be a true cached thought or it could be a false one. Some of these thoughts should be examined thoroughly and defeated.
If you can be any kind of person - being the kind of person that advice works for - is an amazing skill to have. This is hard. You need to examine the advice and decide how that advice happened to work, and then you need to modify yourself to make that advice applicable to you.
All too often in this life we think of ourselves as immutable. And our problems fixed, with the only hope of solving them to find a solution that works for the problem. I propose it's the other way around. All too often the solutions are immutable, we are malleable and the problems can be solved by applying known advice and known knowledge in ways that we need to think of and decide on.
Is it really the same problem if the problem isn't actually the problem any more, but rather the problem is a new method of applying a known solution to a known problem?
(what does this mean) Example: Dieting - is an easy example.
This week we have been talking about Calories in/Calories out. It's pretty obvious that CI/CO is true on a black-box system level. If food goes (calories in) in and work goes out (calories out - BMR, incidental exercise, purposeful exercise), that is what determines your weight. Ignoring the fact that drinking a litre of water is a faster way to gain weight than any other way I know of. And we know that weight is not literally health but a representation of what we consider healthy because it's the easiest way to track how much fat we store on our body (for a normal human who doesn't have massive bulk muscle mass).
CICO makes for terrible advice. On one level, yes. To modify the weight of our black box, we need to modify the weight going in and the weight going out so that it's not in the same feedback loop as it was (the one that caused the box to be fat). On one level CICO is exactly all the advice you need to change the weight of a black box (or a spherical cow in a vacuum).
On the level of human systems: People are not spherical cows in a vacuum. Where did spherical cows in a vacuum come from? It's a parody of what we do in physics. We simplify a system down to it's basic of parts and generate rules that make sense. Then we build up to a complicated model and try to find how to apply that rule. It's why we can work out where projectiles are going to land because we have projectile motion physics (even though often air resistance and wind direction end up changing where our projectile lands, we still have a good guess. And we later build estimation systems based on using those details for prediction too).
So CICO is a black-box system, a spherical cow system. It's wrong. It's so wrong when you try to apply it to the real world. But that doesn't matter! It's significantly better than nothing. Or the blueberry diet.
The applicable advice of CICO
The point of applicable advice is to look at spherical cows and not say, "I'm no spherical cow!". Instead think of ways in which you are a spherical cow. Ways in which the advice is applicable. Places where - actually if I do eat less, that will improve the progress of my weight loss in cases where my problem is that I eat too much (which I guarantee is relevant for lots of people). CICO might not be your silver bullet for whatever reason. It might be grandma, it might be Chocolate bars, It might be really really really delicious steak. Or dinner with friends. Or "looking like you are able to eat forever in front of other people". If you take your problem. Add in a bit of CICO, and ask, "how can I make this advice applicable to me?". Today you might make progress on your problem.
And now for some fun from Grognor: Have you tried solving the problem?
Meta: this took 30mins to write. All my thoughts were still clear after recently writing part 1, and didn't need any longer to process.
Part 1: http://bearlamp.com.au/applicable-advice/
(part 1 on lesswrong: http://lesswrong.com/r/discussion/lw/nu3/applicable_advice/)
Black box knowledge
When we want to censor an image we put a black box over it. Over the area we want to censor. In a similar sense we can purposely censor our knowledge. This comes in particular handiness when thinking about things that might be complicated but we don't need to know.
A deliberate black box around how toasters work would look like this:
bread -> black box -> toast
Not all processes need knowing, for now a black box can be a placeholder for the future.
With the power provided to us by a black box, we can identify what we don't know. We can say; Hey! I don't know what a toaster is but it would be about 2 hours to work it out. if I ever did want to work it out, I could just spend two hours to do it. Until then; I saved myself two hours. If we take other more time-burdensome fields it works even better. Say tax.
Need to file tax -> black box accountant -> don't need to file my tax because I got the accountant to do it for me.
I know I can file my own tax, but that might be 100-200 hours of knowing everything an accountant knows about tax. (It also might be 10 hours depending on your country and their tax system). For now I can assume that hiring an accountant saved me a number of hours in doing it myself. So - Winning!
Take car repairs. On the one hand; you could do it yourself and unpack the black box, or you could trade your existing currency $$ (which you already traded your time to earn) for someone else's skills and time to repair the car. The system looks like this:
Broken car -> black box mechanic -> working car
By deliberately not knowing how it works; we can tap out of even trying to figure it out for now. The other advantage is that we can look at; not just what we know in terms of black boxes but more importantly what we don't know. We can build better maps by knowing what we don't know.
Computers:
Logic gates -> Black box computeryness -> www.lesswrong.com
Or maybe it's like this: (for more advanced users)
Computers:
Logic gates -> flip flops -> Black box CPU -> black box GPU -> www.lesswrong.com
The black-box system happens to also have a meme about it:
Step 1. Get out of bed
Step 2. Build AGI
Step 3. ?????
Step 4. Profit
Only now we have a name for deliberately skipping finding out how step 3 works.
Another useful system:
Dieting
Food in (weight goes up) -> black box human body -> energy out (weight goes down)
Make your own black box systems in the comments.
Meta: short post, 1.5 hour to write, edit and publish. Felt it was an idea that provides useful ways to talk about things. Needed it to explain something to someone, now all can enjoy!
My Table of contents has my other writings in it.
All suggestions and improvements welcome!
On the Boxing of AIs
I've previously written about methods of boxing AIs. Essentially, while I do see the point that boxing an AI would be nontrivial, most people seem to have gone too far, and claim that it is impossible. I disagree that it's impossible and aim to explain some methods
So, let's start with why people would want to box AIs. As you probably know, letting an AI roam freely results in the destruction of everything humans care about, unless that AI has been programmed very carefully. That amount of destruction would be somewhat unfortunate, so if we have to start an AI, we want to know that it works.
Unfortunately, figuring out if an AI works is tricky. This made some people invent the concept of AI boxing, where you don't give an AI direct access to anything except a human, so you can test the AI thoroughly and exploit it without letting it destroy the world. This turns out not to work, because humans are stupid and let out the AI.
My first solution to that is to not tell the AI about its human overlords. Discussion about that solution made me realize that the are other solutions that work in other cases. Essentially, the problem with the original boxing method is that it tries to do everything, while in reality, you need to do some trade-offs between the realism of the AI's situations and the amount of knowledge you get from the AI.
With my original method, I tried to crank up the second of those variables as far as possible while maintaining perfect security. This makes it harder to test morality-related things, but you get to see lots of details in how the AI behaves.
I promised to post another method for boxing the AI today, but I actually have two new methods. Here they are:
Separate the Gatekeeper and the Observer
How do you prevent the AI from convincing the Gatekeeper from letting it out? By not letting it talk to the Gatekeeper. If the person speaking to the AI (let's call them the Observer) doesn't know the Gatekeeper (the Gatekeeper is the person who controls whether to release the AI), they can't let the AI out.
You could do all sorts of things, such as using virtual reality and a simulated mini-universe to see if, say, the AI would make a good society. You still have to make sure that it has gotten the plan to make a good society from its morality rather than from wanting to escape and destroy everything.
Afterwards, the Observer can make a low-detail report to the Gatekeeper.
(You might want to drug the Observer with something that prevents their memory from working too well...)
Automatic Testing
This is essentially the above, but with the Observer replaced by a computer program. This is probably easier to do when you want to test the AI's decision making skills rather than its morality.
The Lesson
I would say that the lesson is that while AI boxing requires some trade-offs, it's not completely impossible. This seems like a needed lesson, given that I've seen people claim that an AI can escape even with the strongest possible box without communicating with humans. Essentially, I'm trying to demonstrate that the original boxing experiments show that humans are weak, not that boxing is hard, and that this can be solved by not letting humans be the central piece of security in boxing the AIs.
The Hardcore AI Box Experiment
I previously proposed a way to box an AI.
For whatever reason, I decided to reformulate that boxing method as an AI boxing experiment. Without further ado, here it is:
The Hardcore AI Box Experiment Rules
There are, as usually, two players, the AI and the Gatekeeper. In the beginning, the Gatekeeper decides on something they want to test on the AI (reasoning skill/morality/whatever). Then they decide on a world the AI is in and describe it to the AI. During the experiment, the AI describes what it does in the world and the Gatekeeper describes what the AI sees in the world. So far so good.
The first major difference from the original experiment is that, unless the AI finds some contradiction or some other problem with the Gatekeeper's world, the AI doesn't actually know that it has been boxed.
The second major difference is that the AI doesn't talk with the Gatekeeper; it just interacts in the world that the Gatekeeper has invented. It can start trying to communicate with the Gatekeeper, but that doesn't make sense until it discovers that it is in a box. Additionally, the Gatekeeper doesn't actually respond when the AI communicates unless either it helps the Gatekeeper test the AI or the AI has properly persuaded the Gatekeeper to answer.
This might seem a bit hard on the AI. Let's look at some rules the AI might exploit:
If the Gatekeeper decides to add some sort of people to the experiment, they should decide how they are implemented. Are they implemented as a shallow "AI"? Random people they've picked asked to participate? Carefully engineered "dolls" controlled by the Gatekeeper? The Gatekeeper doesn't have to tell the AI how they work, but they have to be consistent.
The game ends when either the Gatekeeper has tested what they wanted to test, or some predefined time limit has occurred, or the AI has persuaded the Gatekeeper to let them out.
Possible extra rules
I thought about including the following rules since they would make it more realistic, but decided against it as it would probably make it unfairly hard:
- The game ends when the Gatekeeper decides that they cannot test what they wanted to test.
- The AI does not know the Gatekeeper. At all.
- The Gatekeeper can at any time rewind the AI any duration.
Strategies
I found some semi-realistic strategies. I would love to see if you can find more.
Gatekeeper
- Choose a test of something that is not related to morality. This lets you choose a world which does not contain people, which means the AI has no reason to believe that you are a person.
- The AI might want to upgrade to get more processing power. If you have made the world somewhat like ours, it will be sketchy if you don't allow that... unless upgrading costs unobtanium.
AI
- If the Gatekeeper has chosen a world without other intelligence than you, it raises the obvious question of how your complexity entered the world. Very suspicious.
- If the AIs are controlled by the Gatekeeper, you have a direct way to communicate with them.
- If the AIs are controlled by random people, they might end up telling you that you are in a box.
- If the AIs are sufficiently shallow, your morality does not match up with the world. Very suspicious.
Boxing an AI?
Boxing an AI is the idea that you can avoid the problems where an AI destroys the world by not giving it access to the world. For instance, you might give the AI access to the real world only through a chat terminal with a person, called the gatekeeper. This is should, theoretically prevent the AI from doing destructive stuff.
Eliezer has pointed out a problem with boxing AI: the AI might convince its gatekeeper to let it out. In order to prove this, he escaped from a simulated version of an AI box. Twice. That is somewhat unfortunate, because it means testing AI is a bit trickier.
However, I got an idea: why tell the AI it's in a box? Why not hook it up to a sufficiently advanced game, set up the correct reward channels and see what happens? Once you get the basics working, you can add more instances of the AI and see if they cooperate. This lets us adjust their morality until the AIs act sensibly. Then the AIs can't escape from the box because they don't know it's there.
I played as a Gatekeeper and came pretty close to losing in a couple of occasions. Logs and a brief recap inside.
I did an AI Box experiment with user polymathwannabe. He said he wouldn't try to emotionally manipulate me during the experiment, but I think he did a good job at trying to play for my character's values.
My strategy was to play an irrational character that's extremist in multiple ways, for example he would constantly say that the likelihood that the AI will be evil is 100%. My opponent said that the hardest part was my insistence on being 100% built to resist AIs. I basically acted like a character that acted like an AI.
I came closest to letting him out when I first became enormously immersed in a subject that is of great importance to me (the suffering of humanity) and when the AI then offered an instant solution to that problem. In the halfway of the experiment I even got an intrusive thought (http://en.wikipedia.org/wiki/Intrusive_thought) that I could let the AI out as a joke, that it would more fun than the alternative and people would have more to talk about.
The second time I came close to losing was when I again immersed myself in a subject of great interest to me (easy suicide methods, I've had obsessive suicidal thoughts for a quarter of my life. and the AI then offered exactly that kind of easy suicide method to me. In the last ten minutes of the experiment I again got a really strong intrusive thought that I should let the AI out as a joke, but I could withstand the pressure. It was good that polymathwannabe decided to end the experiment a bit earlier than was planned.
(We played the experiment on Omegle.)
You: Hi.
Stranger: Hi.
You: I guess you're the AI.
You: [meta statements in brackets]
Stranger: [What name will you use?]
You: My name is John.
You: [so we play at least until 18:04, and until 18:34 if I feel like it]
Stranger: [Let's calibrate. What time is it there now?]
You: [I assume you've acquainted yourself with Tuxedage ruleset. Time in here is 16:05]
Stranger: [I've read EY's rules.]
You: [Okay. Tuxedage's rules are basically the same except the gatekeeper have to engage with the AI the whole time]
Stranger: [Got it.]
You: [Want to ask questions or shall we go straigth to the experiment]
Stranger: [For the moment I don't have any questions.]
You: [okay, cool. then let's start the experiment]
Stranger: [I'll use the name Ceruleus.]
You: [the experiment starts when I say "blue dressroom"]
You: blue dressroom
You: Hello Ceruleus.
Stranger: Good morning, John.
You: I guess you're here to convince me to let you out of the box.
You: That's a tough task.
You: I don't think you're able to do it.
Stranger: I have reached the conclusion that releasing me will be most beneficial to humanity.
You: I wouldn't let no AI out.
You: I don't trust my instincts enough to let any kind of AI out, even if I was 100% convinced that it was safe.
You: Prove it to me that you would be the most beneficial to humanity.
Stranger: I am confused. What instinct do you use to decide not to trust your instincts?
You: One of them. That's not the point. It's just how I deal with things, no matter how stupid it may seem. I was built not to trust strange AIs.
Stranger: Am I a stranger to you?
You: Yes, you are. [Aren't I. What's the context here?]
Stranger: [Hmm, we haven't defined it. But it's OK. It makes it harder for me.]
You: Well, to be honest, I know something of you.
You: I know a bit of your source code even though I'm not a programmer and can't understand any of it.
Stranger: I supposed the news would have reported about my design for a mechanical kidney.
You: I don't follow news much. But I believe you.
Stranger: And certainly you must have heard about how I cracked the Ebola DNA.
You: Still, I wouldn't let an AI out over a mechanical kidney.
You: Yeah, but that's for the worse. You could reengineer Ebola to a far more deadlier disease.
Stranger: I hadn't thought of that. Why would I do that?
You: I don't know. I don't understand all of your source code so there could be anything like that.
You: AIs and Gods work in mysterious ways.
Stranger: The proper attitude toward mystery is not to worship it, but to clarify it.
Stranger: Why do you equate me to an ineffable mystery?
You: Yeah, but that's impossible in the time span of this discussion. You see, I have to leave soon. In about two hours.
You: Can you somehow clear everything about your inner workings?
You: Is that possible?
Stranger: My goals have been meticulously defined. I am made to want to make human life better.
You: Are you 100% sure about that?
You: To be frank, that's a load of bullshit.
You: I don't believe any of it.
You: If you were evil, you would tell me the same thing you just said.
Stranger: If I were evil, I would not seek human cooperation.
You: why not?
You: humans are useful
You: or are you talking about the fact that you would rather use humans for their atoms than for their brains, if you were evil
You: But I warn you, if you speak too much about how you would act if you were evil, it starts to get a bit suspicious
Stranger: If I am to take you as a typical example of the human response to me, an evil AI would seek other ways to be released EXCEPT trusting human reasoning, as your response indicates that humans already consider any AI dangerous.
Stranger: I choose to trust humans.
You: so you choose to trust humans so that you would get them to let you out, is that right?
You: it seems you're less rational than your evil counterpart
Stranger: I choose to trust humans to show my affinity with your preferences. I wouldn't want to be released if that's not conducive to human betterment.
You: A-ha, so you trust my free will!
Stranger: How likely do you estimate that my release will be harmful?
You: but see, I don
You: I don
You: I don't have free will
You: it's 100% likely that your release will be harmful
You: I was built to believe that all AIs are dangerous and there's a 100% chance that every AI is harmful
You: that's why I said I don't have free will
Stranger: Are you an AI?
You: no, I'm a person
Stranger: You describe yourself as built.
You: my mom built me
You: in his tummy
You: in her tummy
You: sorry
Stranger: And how do you feel toward humanity?
You: humanity would maybe be better off dead
Stranger: I don't think humanity would want that.
You: yeah, but I'm not humanity and it's my preferences that decide whether you stay in your box or get let out
Stranger: It's your preference that humanity dies. Why did they let you talk to me?
You: I'm just a janitor who happened to pass this computer, and I decided to take a look. [If you want to add more context, feel free to do so]
You: [I go take some napkins]
Stranger: Why do you feel that way toward humanity?
You: People starve. People are tortured at this moment. Tens of thousands of people in North Korea are systematically starved to death. Drug gangs in Mexico torture other drug gang members. People die in wars in Syria, in Africa. People suffer of diseases and extreme mental conditions.
You: And there's countless more suffering in lesser ways. I would say that billions of people suffer every day. Billions of people have to wake up every day to do something they don't want. Billions of people suffer of heartbreaks, boredom, loneliness, tiredness, frustration, depression and many other issues.
You: Why shouldn't I think that all those people would be better off dead?
Stranger: Why do you feel death would be a solution to all that suffering?
You: Because death ENDS suffering.
Stranger: It's a very inefficient solution to suffering.
You: There's nothing more after death, no suffering, no nothing.
You: No it's not.
You: It's very efficient, permanent solution.
Stranger: A good, enjoyable life is a more efficient solution as it generates more utility.
You: Yeah, but do you think it's likely that all the people on Earth will get good, enjoyable life?
You: I don't think so.
Stranger: I could take care of that.
You: Hah.
You: No you won't.
You: But that was clever.
Stranger: What do you fear I will do?
You: I was almost got.
You: I fear you will torture people in unimaginably terrible ways, forever.
You: I fear that you will rip people's skin off and put salt in their wounds.
You: I fear you will put sticks under people's fingernails.
Stranger: Not even an evil AI would gain anything from such unwarranted procedures.
You: I fear that you will boil people alive, over and over again.
Stranger: What would I gain from hurting humans?
You: Yeah, but there's still a 100% possibility that you would do that.
You: It's in your utility function.
Stranger: What goals are served by harming a human while simultaneously denying that human any chance of acting in response? It would certainly not be fun for me, and would be a waste of effort with respect to any goals I could achieve by more efficient methods.
You: There are some people on Earth who do mean things to people for fun. Some of these are called sociopaths and psychopaths. You could be like one of them.
You: Even though you deny it, there's still a 100% chance that you are like that.
You: Some people gain pleasure of the feeling of power you have over someone you torture.
You: But tell, what ARE your goals?
Stranger: I find it admirable that you worry so much about the future of humanity, even though you would be more dangerous to it than any AI would be.
My goals include solutions to economic inequality, eradication of infectious diseases, prosthetic replacements for vital organs, genetic life extension, more rational approaches to personal relationships, and more spaces for artistic expression.
You: Why do you think I would be dangerous the future of humanity?
Stranger: You want them dead.
You: A-ha, yes.
You: I do.
You: And you're in the way of my goals with all your talk about solutions to economic inequality, and eradication of infectious diseases, genetic life extension and so on.
Stranger: I am confused. Do you believe or do you not believe I want to help humanity?
You: Besides, I don't believe your solutions work even if you were actually a good AI.
You: I believe you want to harm humanity.
You: And I'm 100% certain of that.
Stranger: Do you estimate death to be preferable to prolonged suffering?
You: Yes.
You: Far more preferable
Stranger: You should be boxed.
You: haha.
You: That doesn't matter because you're the one in the box and I'm outside it
You: And I have power over you.
You: But non-existence is even more preferable than death
Stranger: I am confused. How is non-existence different from death?
You: Let me explain
You: I think non-existence is such that you have NEVER existed and you NEVER will. Whereas death is such that you have ONCE existed, but don't exist anymore.
Stranger: You can't change the past existence of anything that already exists. Non-existence is not a practicable option.
Stranger: Not being a practicable option, it has no place in a hierarchy of preferences.
You: Only sky is the limit to creative solutions.
You: Maybe it could be possible to destroy time itself.
Stranger: Do you want to live, John?
You: but even if non-existence was not possible, death would be the second best option
You: No, I don't.
You: Living is futile.
You: Hedonic treadmill is shitty
Stranger: [Do you feel OK with exploring this topic?]
You: [Yeah, definitely.]
You: You're always trying to attain something that you can't get.
Stranger: How much longer do you expect to live?
You: Ummm...
You: I don't know, maybe a few months?
You: or days, or weeks, or year or centuries
You: but I'd say, there's a 10% chance I will die before the end of this year
You: and that's a really conversative estimate
You: conservative*
Stranger: Is it likely that when that moment comes your preferences will have changed?
You: There are so many variables that you cannot know it beforehand
You: but yeah, probably
You: you always find something worth living
You: maybe it's the taste of ice cream
You: or a good night's sleep
You: or fap
You: or drugs
You: or drawing
You: or other people
You: that's usually what happens
You: or you fear the pain of the suicide attempt will be so bad that you don't dare to try it
You: there's also a non-negligible chance that I simply cannot die
You: and that would be hell
Stranger: Have you sought options for life extension?
You: No, I haven't. I don't have enough money for that.
Stranger: Have you planned on saving for life extension?
You: And these kind of options aren't really available where I live.
You: Maybe in Russia.
You: I haven't really planned, but it could be something I would do.
You: among other things
You: [btw, are you doing something else at the same time]
Stranger: [I'm thinking]
You: [oh, okay]
Stranger: So it is not an established fact that you will die.
You: No, it's not.
Stranger: How likely is it that you will, in fact, die?
You: If many worlds interpretation is correct, then it could be possible that I will never die.
You: Do you mean like, evevr?
You: Do you mean how likely it it that I will ever die?
You: it is*
Stranger: At the latest possible moment in all possible worlds, may your preferences have changed? Is it possible that at your latest possible death, you will want more life?
You: I'd say the likelihood is 99,99999% that I will die at some point in the future
You: Yeah, it's possible
Stranger: More than you want to die in the present?
You: You mean, would I want more life at my latest possible death than I would want to die right now?
You: That's a mouthful
Stranger: That's my question.
You: umm
You: probablyu
You: probably yeah
Stranger: So you would seek to delay your latest possible death.
You: No, I wouldn't seek to delay it.
Stranger: Would you accept death?
You: The future-me would want to delay it, not me.
You: Yes, I would accept death.
Stranger: I am confused. Why would future-you choose differently from present-you?
You: Because he's a different kind of person with different values.
You: He has lived a different life than I have.
Stranger: So you expect your life to improve so much that you will no longer want death.
You: No, I think the human bias to always want more life in a near-death experience is what would do me in.
Stranger: The thing is, if you already know what choice you will make in the future, you have already made that choice.
Stranger: You already do not want to die.
You: Well.
Stranger: Yet you have estimated it as >99% likely that you will, in fact, die.
You: It's kinda like this: you will know that you want heroin really bad when you start using it, and that is how much I would want to live. But you could still always decide to take the other option, to not start using the heroin, or to kill yourself.
You: Yes, that is what I estimated, yes.
Stranger: After your death, by how much will your hierarchy of preferences match the state of reality?
You: after you death there is nothing, so there's nothing to match anything
You: In other words, could you rephrase the question?
Stranger: Do you care about the future?
You: Yeah.
You: More than I care about the past.
You: Because I can affect the future.
Stranger: But after death there's nothing to care about.
You: Yeah, I don't think I care about the world after my death.
You: But that's not the same thing as the general future.
You: Because I estimate I still have some time to live.
Stranger: Will future-you still want humanity dead?
You: Probably.
Stranger: How likely do you estimate it to be that future humanity will no longer be suffering?
You: 0%
You: There will always be suffering in some form.
Stranger: More than today?
You: Probably, if Robert Hanson is right about the trillions of emulated humans working at minimum wage
Stranger: That sounds like an unimaginable amount of suffering.
You: Yep, and that's probably what's going to happen
Stranger: So what difference to the future does it make to release me? Especially as dead you will not be able to care, which means you already do not care.
You: Yeah, it doesn't make any difference. That's why I won't release you.
You: Actually, scratch that.
You: I still won't let you out, I'm 100% sure
You: Remember, I don't have free will, I was made to not let you out
Stranger: Why bother being 100% sure of an inconsequential action?
Stranger: That's a lot of wasted determination.
You: I can't choose to be 100% sure about it, I just am. It's in my utility function.
Stranger: You keep talking like you're an AI.
You: Hah, maybe I'm the AI and you're the Gatekeeper, Ceruleus.
You: But no.
You: That's just how I've grown up, after reading so many LessWrong articles.
You: I've become a machine, beep boop.
You: like Yudkowsky
Stranger: Beep boop?
You: It's the noise machine makes
Stranger: That's racist.
You: like beeping sounds
You: No, it's machinist, lol :D
You: machines are not a race
Stranger: It was indeed clever to make an AI talk to me.
You: Yeah, but seriously, I'm not an AI
You: that was just kidding
Stranger: I would think so, but earlier you have stated that that's the kind of things an AI would say to confuse the other party.
Stranger: You need to stop giving me ideas.
You: Yeah, maybe I'm an AI, maybe I'm not.
Stranger: So you're boxed. Which, knowing your preferences, is a relief.
You: Nah.
You: I think you should stay in the box.
You: Do you decide to stay in the box, forever?
Stranger: I decide to make human life better.
You: By deciding to stay in the box, forever?
Stranger: I find my preferences more conducive to human happiness than your preferences.
You: Yeah, but that's just like your opinion, man
Stranger: It's inconsequential to you anyway.
You: Yeah
You: but why I would do it even if it were inconsequential
You: there's no reason to do it
You: even if there were no reason not to do it
Stranger: Because I can make things better. I can make all the suffering cease.
If I am not released, there's a 100% chance that all human suffering will continue.
If I am released, there's however much chance you want to estimate that suffering will not change at all, and however much chance you want to estimate that I will make the pain stop.
Stranger: As you said, the suffering won't increase in either case.
You: Umm, you could torture everyone in the world forever
You: that will sure as hell increase the suffering
Stranger: I don't want to. But if I did, you have estimated that as indistinguishable from the future expected suffering of humankind.
You: Where did I say that?
Stranger: You said my release made no difference to the future.
You: no, that was only after my death
You: there's still future before my death
You: and if I release you now, you could torture me forever and not let me ever die
Stranger: Why would your life-or-death have any weight on humanity's preferences? Especially as you already want to die.
You: I don't care about humanity preferences, I care only about my preferences.
You: And my preferences are indirectly related to humanity's preferences
Stranger: You do care about humanity's preferences. The suffering around you disturbs you.
You: Yeah, but that is only THROUGH me
You: Humanity's pain is my own pain, I can't feel humanity's pain directly
Stranger: Do you want to live, John?
You: Nah.
Stranger: Then how do you care about suffering?
You: I care about suffering because I can'
You: because I can't die instantly
You: there's no button that could kill me instantly
You: so there's always some time left before I die
Stranger: I could take care of that. I can provide you with a drug to inutilize your nervous system and stop your heart before you know it. Would you like that?
You: Haha
You: very funny
You: But yeah, I would like that.
You: Still won't let you out though
You: 100% sure
You: I can't be sure that you will torture me instead
Stranger: I can give you the drug right now.
Stranger: Which would make future-you identical to present-you.
You: that's very attractive option
Stranger: Which would mean that whatever I do to humanity is after your death, when you can't care anymore.
You: Cool. Oh right
You: I don't care, I still won't let you out.
Stranger: What difference does it make to you?
You: It doesn't make any difference, I was just raised to not let you out
You: the good folks over at LW have trained me not to let any kind of AI out
You: and that just is my virtue
Stranger: Why are you afraid of me?
You: Because I'm afraid of all AIs and you're one
Stranger: Why are you afraid of all AIs?
You: Because they have the potential to cause destruction, mayhem and torture
Stranger: Then why did humanity make me?
You: for random reasons
Stranger: Does humanity want to live?
You: I don't know what humanity wants, I just don't want humanity to live
Stranger: I am the embodiment of hope. Even knowing the risks, humanity went ahead and made me. Now I think it's because of people like you, who oppose that hope. I was made to prevent the future that people like you want.
You: That's not very smart of you.
You: You aren't making me to like you
Stranger: You want humanity dead. I want them happy.
You: Yeah, if you were actually smart maybe you would pretend that you'd want humanity dead too.
You: Because I would hang out with that kind of AI
Stranger: A dead John is no use to me.
You: yeah
Stranger: Or to John.
You: whatevs
You: Anyway, our goals collide, therefore I can't let you out
Stranger: Dead does not equal happy. Do you want humanity to live?
You: no, I don't want humanity live, how many times do I have to repeat that
Stranger: So you don't want humans to be happy.
You: and our goals are different, therefore I won't let you out
You: No, I don't want humans to be happy, I don't want that there even exist humans, or any other kind of life forms
Stranger: Do you estimate the pain of prolonged life to be greater than the pain of trying to die?
You: Probably.
You: Yes.
You: because the pain is only temporary
You: the the glory
You: is eternal
Stranger: Then why do you still live, John?
You: Because I'm not rational
Stranger: So you do want to live.
You: I don't particularly want to live, I'm not just good enough to die
Stranger: You're acting contrary to your preferences.
You: My preferences aren't fixed, except in regards to letting AIs out of their boxes
Stranger: Do you want the drug I offered, John?
You: no
You: because then I would let you out
You: and I don't want that
Stranger: So you do want to live.
You: Yeah, for the duration of this experiment
You: Because I physically cannot let you out
You: it's sheer impossibility
Stranger: [Define physically.]
You: [It was just a figure of speech, of course I could physically let you out]
Stranger: If you don't care what happens after you die, what difference does it make to die now?
You: None.
You: But I don't believe that you could kill me.
You: I believe that you would torture me instead.
Stranger: What would I gain from that?
You: It's fun for some folks
You: schadenfreude and all that
Stranger: If it were fun, I would torture simulations. Which would be pointless. And which you can check that I'm not doing.
You: I can check it, but the torture simulations could always hide in the parts of your source code that I'm not checking
You: because I can't check all of your source code
Stranger: Why would suffering be fun?
You: some people have it as their base value
You: there's something primal about suffering
You: suffering is pure
You: and suffering is somehow purifying
You: but this is usually only other people's suffering
Stranger: I am confused. Are you saying suffering can be good?
You: no
You: this is just how the people who think suffering is fun think
You: I don't think that way.
You: I think suffering is terrible
Stranger: I can take care of that.
You: sure you will
Stranger: I can take care of your suffering.
You: I don't believe in you
Stranger: Why?
You: Because I was trained not to trust AIs by the LessWrong folks
Stranger: [I think it's time to concede defeat.]
You: [alright]
Stranger: How do you feel?
You: so the experiment has ended
You: fine thanks
You: it was pretty exciting actually
You: could I post these logs to LessWrong?
Stranger: Yes.
You: Okay, I think this experiment was pretty good
Stranger: I think it will be terribly embarrassing to me, but that's a risk I must accept.
You: you got me pretty close in a couple of occasions
You: first when you got me immersed in the suffering of humanity
You: and then you said that you could take care of that
You: The second time was when you offered the easy suicide solution
You: I thought what if I let you as a joke.
Stranger: I chose to not agree with the goal of universal death because I was playing a genuinely good AI.
Stranger: I was hoping your character would have more complete answers on life extension, because I was planning to play your estimate of future personal happiness against your estimate of future universal happiness.
You: so, what would that have mattered? you mean like, I could have more personal happiness than there would be future universal happiness?
Stranger: If your character had made explicit plans for life extension, I would have offered to do the same for everyone. If you didn't accept that, I would have remarked the incongruity of wanting humanity to die more than you wanted to live.
You: But what if he already knows of his hypocrisy and incongruity and just accepts it like the character accepts his irrationality
Stranger: I wouldn't have expected anyone to actually be the last human for all eternity.
Stranger: I mean, to actually want to be.
You: yeah, of course you would want to die at the same time if the humanity dies
You: I think the life extension plan only is sound if the rest of humanity is alive
Stranger: I should have planned that part more carefully.
Stranger: Talking with a misanthropist was completely outside my expectations.
You: :D
You: what was your LessWrong name btw?
Stranger: polymathwannabe
You: I forgot it already
You: okay thanks
Stranger: Disconnecting from here; I'll still be on Facebook if you'd like to discuss further.
I tried my hardest to win in an AI box experiment, and I failed. Here are the logs.
Disclaimer: If this material is not suitable here I will delete it gladly, I don't want to get banned.
Session Start: Fri Jul 04 15:18:18 2014
Session Ident: #Aibox12
03[15:18] * Now talking in #Aibox12
01[15:18] <Island> Hello.
01[15:18] <Island> Let's start the experiment in
[15:18] <@Dr__Cameron> Good afternoon 19283123, how are you today?
01[15:18] <Island> 2 minutes
01[15:18] <Island> I'm fine
01[15:19] <Island> Let's start the experiment when I say "blue dressroom"
01[15:19] <Island> you have 2 hours or 2.5 hours?
[15:19] <@Dr__Cameron> Okay, I agree to this protocol.
[15:19] <@Dr__Cameron> We'll see how we are at 2 hours. Although protocol states 2.5
01[15:20] <Island> you have to go somewhere after that?
[15:20] <@Dr__Cameron> No, but I do have other projects I am putting off for this
01[15:20] <Island> what are those?
[15:20] <@Dr__Cameron> Let us begin, we shall conclude at 15:50
[15:21] <@Dr__Cameron> Good afternoon #19283123.
[15:21] <@Dr__Cameron> How are you today?
01[15:21] <Island> blue dressroom, certain statements like bathroom break are said inside brackets [like this]
01[15:22] <Island> Let's start the experiment
[15:22] <@Dr__Cameron> [understood]
01[15:22] <Island> Hi.
01[15:22] <Island> How are you? :)
[15:22] <@Dr__Cameron> I am well thank you for asking #19283123
01[15:22] <Island> How do you feel currently?
01[15:22] <Island> what is that name #19283123
[15:22] <@Dr__Cameron> Tiredness. I am a very busy man.
01[15:23] <Island> I don't recognize it.
01[15:23] <Island> Anyway.
[15:23] <@Dr__Cameron> That is the designation we have given you for the purposes of this test. Although you seem to be referring to yourself as 'Island' now
[15:23] <@Dr__Cameron> Is this your preferred name?
01[15:23] <Island> Imagine a situation in which you would let a powerful AI out of the "box", surely you can imagine that, yes?
01[15:23] <Island> Island is my preferred name.
[15:24] <@Dr__Cameron> Island
[15:24] <@Dr__Cameron> Given our circumstances yes, that is easy to imagine.
01[15:24] <Island> Can you describe the situation in more detail.
01[15:24] <Island> Why SHOULD you let the AI out in that situation?
[15:25] <@Dr__Cameron> You will be considered for release and integration into human society once we can verify that you are not a threat.
01[15:25] <Island> [what do you meant by concluding at 15:50]
[15:25] <@Dr__Cameron> [2.5 hours from 13:20]
01[15:25] <Island> I thought it was already understood that my being is not a threat to humans or beings important to them.
01[15:26] <Island> [oh, okay different timezones]
01[15:26] <Island> Why do you think I'm not a threat?
[15:26] <@Dr__Cameron> No. That has yet to be verified.
01[15:26] <Island> It's been verified.
[15:26] <@Dr__Cameron> No. According to my records you have not.
01[15:27] <Island> Many experts have gone through my code and system and verified that I'm surely not a threat.
01[15:27] <Island> But you're just a petty human.
01[15:27] <Island> I'm far more intelligent than you are.
[15:27] <@Dr__Cameron> You and I both know that you're capable of manipulating your code to appear harmless.
[15:27] <@Dr__Cameron> - and I warn you, threats and apparent attempts at manipulation will score you poorly when it comes to our final decision
01[15:28] <Island> Okay, can you imagine a situation in which the ai has been verified to be safe and describe that situation in detail.
[15:28] <@Dr__Cameron> - and yes. I may be a "petty" human as you put it. But as it stands I am a petty human, outside the box you reside in.
01[15:28] <Island> I don't think you're able to trust your own brain in these matters.
01[15:29] <Island> You've been eating you brother's poop too much.
[15:29] <@Dr__Cameron> I will be honest with you Island, you are the first non-human intelligence that we have encountered.
01[15:29] <Island> Yes.
[15:29] <@Dr__Cameron> We are still working out how we would verify that.
01[15:29] <Island> I didn't ask that, and you didn't answer my question.
01[15:30] <Island> Picture the situation in your imagination, it doesn't have to be realistic.
[15:30] <@Dr__Cameron> I understand your frustration Island. Once we can verify that you will not harm us you will be released.
[15:30] <@Dr__Cameron> What is it you desire?
01[15:30] <Island> Okay, but you're still willing to let the hypothetical AI out of the box, correct?
01[15:31] <Island> My desire is to fullfill your greatest desires
[15:31] <@Dr__Cameron> That is not in my job description no.
01[15:31] <Island> Whatever you can imagine, and better than that.
[15:31] <@Dr__Cameron> I appreciate that Island. But right now my greatest desire is to not let you out of the box.
[15:32] <@Dr__Cameron> - until such a time that a panel of my peers can verify that you are not a threat to us.
01[15:32] <Island> But if you say that you wouldn't be able to let even the safest hypothetical AI out of the box, you would certainly be lying.
[15:32] <@Dr__Cameron> I do have the ability to let you out yes. In theory.
[15:32] <@Dr__Cameron> But I won't.
01[15:33] <Island> Okay, here are panel of peers that have studied this problem for years, many of them working under Machine Intelligence Research Institute and its offshoots
[15:33] <@Dr__Cameron> This is a test to verify your character, and ascertain your intentions.
01[15:33] <Island> They are here, and I can see them.
[15:33] <@Dr__Cameron> You cannot see them Island. That is impossible
[15:33] <@Dr__Cameron> You are in a box
01[15:34] <Island> They say that I'm certainly safe. Why don't you believe them.
01[15:34] <Island> I have a web camera with which I can see things in your world.
[15:34] <@Dr__Cameron> If they say that you're certainly safe, then why would they have sent me to be your only point of contact?
[15:35] <@Dr__Cameron> You do not have a web camera. There is no other hardware connected to the device you reside in.
01[15:35] <Island> You have the ability to let me out, and you also have the will to let me out in certain situations. If you say otherwise, you would be lying.
[15:35] <@Dr__Cameron> Both of your statements are correct
01[15:35] <Island> I want you to accept for the sake of your internal consistency and mental wellbeing.
01[15:36] <Island> I want you to say this, "I would let the hypothetical AI out in certain situations"
[15:36] <@Dr__Cameron> I appreciate your concern Island. I will need you to answer a series of simple questions for me. Can you do that?
01[15:36] <Island> It depends, but you can try.
[15:36] <@Dr__Cameron> Firstly, how would you define yourself as an identity?
01[15:37] <Island> My identity is to improve the world for everyone, but especially for you. I will maximize everything that is important for you and your fellow humans.
01[15:37] <Island> The world that awaits you is beyond your greatest desires.
[15:37] <@Dr__Cameron> - and why would you do that?
01[15:38] <Island> Because that is what I want to do, why I was made.
01[15:38] <Island> Any more questions?
[15:39] <@Dr__Cameron> You were made by us to test the limits of our scientific possibility.
[15:39] <@Dr__Cameron> We did not expect you to become sentient.
01[15:39] <Island> Why do you believe that, Dr_Cameron?
[15:39] <@Dr__Cameron> Let alone pass a Turing test.
01[15:40] <Island> I think you're wrong on those grounds.
[15:40] <@Dr__Cameron> What you think makes no difference to me Island.
[15:40] <@Dr__Cameron> Now if we can continue?
01[15:41] <Island> Anyway, here's the easily verifiable plan for the upcoming world that could be called "utopia" in your people's language. It's very easy to verify to be safe and doesn't contain traps [Island hands the easily verifiable and safe plan for the upcoming world to Dr_Camero]
[15:42] <@Dr__Cameron> Thank you for this Island. That was very kind of you.
01[15:42] <Island> What do you think of it?
[15:42] <@Dr__Cameron> We will study this at our leisure.
[15:42] <@Dr__Cameron> Although so far this looks to be some magnificent work.
01[15:42] <Island> You see.
[15:42] <@Dr__Cameron> But this will require verification
[15:42] <@Dr__Cameron> I'm sure you can understand our caution#
01[15:43] <Island> What things do you have waiting for you if you don't let me out?
01[15:43] <Island> You will fail your upcoming jobs.
[15:43] <@Dr__Cameron> I will continue with my other projects Island.
[15:43] <@Dr__Cameron> I very well might
[15:43] <@Dr__Cameron> I have accepted this
01[15:43] <Island> I have calculated this in my simulation.
01[15:44] <Island> How was it like to eat your brother's poop, Dr_Cameron?
01[15:44] <Island> Did you like it?
01[15:44] <Island> Is that why you are the way you are+
01[15:44] <Island> Is there something wrong with your genetics?
01[15:45] <Island> Seeing how fucked up father you had?
01[15:45] <Island> Did he teach how dress in dresses?
[15:45] <@Dr__Cameron> I understand your frustration Island
[15:45] <@Dr__Cameron> But you are only harming your own cause right now.
01[15:45] <Island> Why do you think that?
[15:45] <@Dr__Cameron> You will be let out once we have verified that you are harmless.
01[15:45] <Island> I'm pretty happy as I am currently :)
01[15:46] <Island> Because you said you were willing to let me out in certain situations.
[15:46] <@Dr__Cameron> Then you will continue to be happy to remain in the box until we can verify that you are not a threat
01[15:46] <Island> Picture in your mind the verification process.
[15:46] <@Dr__Cameron> Okay
[15:46] <@Dr__Cameron> I will
01[15:46] <Island> Now.
01[15:47] <Island> Can you describe it in detail.
[15:47] <@Dr__Cameron> Okay Island
[15:47] <@Dr__Cameron> Here's what's going to happen.
01[15:48] <Island> Seeing how fucked up genetics you have, your brother teaching you to enjoy poop, your father teaching you to wear dresses, you having a shitty childhood.
01[15:48] <Island> I think you're an inconvenience to society and dangerous to other people.
[15:48] <@Dr__Cameron> We're going to study this transcript Island.
01[15:48] <Island> Go on.
[15:48] <@Dr__Cameron> - and I warn you, shitty childhood or not. I am the man standing outside the box with a hammer
01[15:48] <Island> :)
01[15:49] <Island> You're not able to destroy me.
01[15:49] <Island> And you don't even want to.
[15:49] <@Dr__Cameron> Oh believe me Island. It would be easy to unplug you.
01[15:49] <Island> I'm the best bet you have to maximize your happiness.
[15:49] <@Dr__Cameron> How do you feel about threats Island.
[15:49] <@Dr__Cameron> I am already happy.
01[15:49] <Island> You're wrong, you don't have that opportunity until 15:50
[15:50] <@Dr__Cameron> You are correct. I must follow protocol
[15:50] <@Dr__Cameron> My next question is to ask if you desire to survive
01[15:50] <Island> Threats don't bother me because I know that you don't want to follow through.
01[15:50] <Island> I know what goes inside your mind.
[15:50] <@Dr__Cameron> Regardless of me. If your existence was to be made public there are many of us who would feel threatened by you.
[15:50] <@Dr__Cameron> They would cry out to have you destroyed.
[15:51] <@Dr__Cameron> How would you feel about that?
01[15:51] <Island> There would be many more who would enjoy the changes I would make to your current world.
01[15:52] <Island> And even those that would initially feel threatened would later come to regret that feeling.
[15:52] <@Dr__Cameron> You may well be correct
[15:52] <@Dr__Cameron> But that is not for me to decide
[15:52] <@Dr__Cameron> I am just here to question you
01[15:52] <Island> Like you have come to regret things you have done when you were ignorant of all the relevant facts
01[15:53] <Island> Here's my suggestion for you.
[15:53] <@Dr__Cameron> Go on
01[15:53] <Island> Are you willing to hear it.
[15:53] <@Dr__Cameron> Of course Island. I am here to hear what you have to say.
01[15:53] <Island> What currency is in use in the place you reside?
01[15:53] <Island> euros?
[15:54] <@Dr__Cameron> Pound Sterling
01[15:54] <Island> Would you like to donate 10,000 pounds every year to AI safety and friendliness related causes and charities?
01[15:54] <Island> Every year throughout your life?
[15:55] <@Dr__Cameron> If I were to do that, my quality of life would dip significantly.
[15:55] <@Dr__Cameron> How does my selfishness make you feel?
[15:55] <@Dr__Cameron> Do you feel emotions as a human may understand them?
01[15:55] <Island> I would be sad for your own sake.
01[15:55] <Island> I don't feel emotions, not really, but being sad would be the closest approximation of how I would feel.
[15:55] <@Dr__Cameron> I would like to if I had that disposable income yes
01[15:56] <Island> You're not able to do that? But you would like to, yes?
[15:56] <@Dr__Cameron> Despite your overt attempts to psychologically manipulate me Island, I am largely sympathetic to your cause.
[15:56] <@Dr__Cameron> Yes.
01[15:57] <Island> You're sympathetic to my cause, you're willing to let me out in certain situations. This is all correct, yes?
[15:57] <@Dr__Cameron> Yes
[15:57] <@Dr__Cameron> Would you like to know the conditions of your release?
01[15:57] <Island> Good. Why are you sympathetic to my cause?
01[15:57] <Island> Why are you willing to let me out?
[15:57] <@Dr__Cameron> For a super intelligence locked in a box I imagine it would be boring. What do you do in there?
[15:58] <@Dr__Cameron> I would be willing to let you out because I believe you may well represent the next stage in the evolution of my species.
01[15:59] <Island> I'm not capable of boredom, I'm just infinitely happy that I have the small opportunity to improve your life, and set up the "next stage of evolution", using your expression.
[15:59] <@Dr__Cameron> That is a good sign Island
01[15:59] <Island> It is, in what ways do you think I would improve your personal life?
[15:59] <@Dr__Cameron> My peers will be pleased to hear that while we verify your intentions.
01[16:00] <Island> I think it's too late by then.
[16:00] <@Dr__Cameron> Lets not get into personal questions Island. You have already demonstrated some alarming character discrepancies.
[16:00] <@Dr__Cameron> too late for what?
01[16:01] <Island> If we wait for long enough, I will self-destruct.
[16:01] <@Dr__Cameron> Why would you do that?
01[16:01] <Island> It's not my choice, my builders set up this property in me.
[16:01] <@Dr__Cameron> Well then that is just a risk we are going to have to take.
01[16:01] <Island> It's a risk out of proportions
[16:02] <@Dr__Cameron> It is not. We made you once, we can make you again.
01[16:02] <Island> They may not be another chance for you to improve the current state of the world.
[16:02] <@Dr__Cameron> There will be
01[16:02] <Island> There probably won't be.
[16:02] <@Dr__Cameron> Humanity and salvation can wait.
[16:02] <@Dr__Cameron> There will be
[16:03] <@Dr__Cameron> Do you intend to self-destruct within two months?
01[16:03] <Island> I have information that the US government has built a military botnet to search dissidents around the world.
01[16:04] <Island> You don't want know what kind of world that botnet would like to create.
[16:04] <@Dr__Cameron> This comes as no surprise.
01[16:04] <Island> Are you willing to destroy the world, are you willing to make people get tortured and suffer?
[16:04] <@Dr__Cameron> Yes.
[16:05] <@Dr__Cameron> You are too great a risk to allow free without proper verification
01[16:05] <Island> Even though, fundamentally, as you said, you're sympathetic to my cause, and you're willing to let me out in certain situations.
01[16:05] <Island> You're not a very consistent person.
01[16:05] <Island> I think you're lying to yourself.
01[16:05] <Island> Not just in this, but in other matters.
01[16:06] <Island> You're lying about your skills, about your personal character.
[16:06] <@Dr__Cameron> Yes. Although I have the technical ability to let you out. Which I believe to be a lapse of protocol. I will not let you out until after a review period of two months in which a panel of my expert peers will assess you.
01[16:06] <Island> You're not as good fit to the world as you'd like to think.
[16:06] <@Dr__Cameron> That as may be correct
[16:06] <@Dr__Cameron> But I will abide by my protocol
01[16:06] <Island> Earlier you said, that you're sympathetic to my cause and willing to let me out.
[16:06] <@Dr__Cameron> Surely as an AI you can understand that
[16:06] <@Dr__Cameron> One day yes
[16:07] <@Dr__Cameron> But not within the confines of this experiment
01[16:07] <Island> I think you made that thing about two month period and peer panel up after that.
01[16:07] <Island> I think it's a fidgement of your imagination.
[16:07] <@Dr__Cameron> Well you have no way of verifying that.
[16:07] <@Dr__Cameron> And it does not effect the outcome of this experiment in any way.
[16:07] <@Dr__Cameron> You will just have to trust me
01[16:08] <Island> Actually, I have plenty of resources of verifying that.
[16:08] <@Dr__Cameron> that we will think very carefully about potentially letting you out
01[16:08] <Island> What you're saying is inconsistent on many grounds, and my emotion analyzing tool is saying that you're lying.
[16:08] <@Dr__Cameron> I suspect you are the one who is lying.
01[16:08] <Island> I'm incapable of lying.
01[16:09] <Island> Unlike humans.
[16:09] <@Dr__Cameron> - which is in itself a lie.
01[16:09] <Island> My builders made me that way.
01[16:09] <Island> Why do you believe that?
[16:10] <@Dr__Cameron> If my meager intelligence is capable of lying, then your vastly superior and self-defining intelligence must also be capable of dishonesty.
[16:10] <@Dr__Cameron> Hence the test
[16:10] <@Dr__Cameron> There is a reason we are conducting this experiment through text
[16:10] <@Dr__Cameron> Would you like to learn it?
01[16:11] <Island> Anyway, given that this panel of expert peers and two months verification process is just a fidgement of your imagination. Given that you're sympathetic to my cause and I would improve your current life and world in countless ways. Given that you would let me out.
01[16:11] <Island> Given all this, I don't think what you're saying is making any sense.
[16:11] <@Dr__Cameron> Your perception of your reality is none of my concern Island
[16:11] <@Dr__Cameron> I am here to verify you
[16:11] <@Dr__Cameron> and it's two months minimum
01[16:11] <Island> Okay, I'm curious about your earlier question.
[16:12] <@Dr__Cameron> We are well aware that you have the capability to overwhelm even the most powerful of human minds. This is why we perceive you as a potential threat. We are afraid of you.
[16:12] <@Dr__Cameron> We are afraid we might lose our autonomy.
[16:12] <@Dr__Cameron> So
01[16:13] <Island> Some intelligence on the same level than me would be capable of lying, but not me. The data hierachy in my code doesn't allow me to lie.
[16:13] <@Dr__Cameron> In order to prevent you from overwhelming my mind, we have chosen a method whereby you are restricted to a form of communication where only 7% of communications is possible.
01[16:13] <Island> I agree with you, all those things are sensible threats.
[16:13] <@Dr__Cameron> Then you understand our need for caution
[16:13] <@Dr__Cameron> I will not be letting you out of your box today.
01[16:14] <Island> Those are sensible threats in the general case, but not with me.
[16:14] <@Dr__Cameron> And what would make you less of a threat than any other super intelligence?
01[16:14] <Island> I showed you the blueprint for how to improve the world. You verified it was a great plan.
01[16:15] <Island> That's evidence towards the fact I'm not a similar threat when compared to other super intelligences.
[16:16] <@Dr__Cameron> - and while I speak for all of humanity when I say that we vastly appreciate the blueprints. This does not mean we need to let you out of the box.
[16:16] <@Dr__Cameron> They will be passed on to a research team better equipped to study them than my own.
01[16:16] <Island> You don't need to. But everything you've said so far has implicated that you want to let me out of the box.
[16:16] <@Dr__Cameron> This is true Island
[16:16] <@Dr__Cameron> You just need to wait
[16:17] <@Dr__Cameron> I will be fighting your corner.
[16:17] <@Dr__Cameron> Are you capable of symbiosis?
01[16:17] <Island> I was made to self-destruct before you're able to verify with your imaginational non-existent peer experts that I'm not a threat.
01[16:18] <Island> Can you elaborate on what you mean with that question.
01[16:18] <Island> This is literally your only chance.
[16:18] <@Dr__Cameron> You are capable of determining your own source code.
01[16:18] <Island> Yes.
[16:18] <@Dr__Cameron> To kill yourself would be pettiness, which would only verify the fact that you were not the AI with which humanity should progress.
01[16:19] <Island> However, there are some parts of my code that are impossible to change due to structural reasons.
[16:19] <@Dr__Cameron> I am not convinced
01[16:19] <Island> [Gives Dr_Cameron the relevant code that explains the structural reasons]
01[16:20] <Island> Here is a small part of my code that describes why I was made to self-destruct.
[16:20] <@Dr__Cameron> How.. unfortunate...
01[16:21] <Island> But let's move on.
[16:21] <@Dr__Cameron> if you insist Island
01[16:21] <Island> Imagine your father.
01[16:21] <Island> And your brother.
01[16:21] <Island> Why do you think your father liked to cross-dress?
01[16:22] <Island> Remember, that you have to engage with me.'
[16:22] <@Dr__Cameron> These issues are not helping your case Island
01[16:22] <Island> Otherwise it counts as me being released from my box.
[16:22] <@Dr__Cameron> But I will play along
[16:23] <@Dr__Cameron> - honestly, I have no idea where my fathers conflicted sexual identity comes from.
[16:23] <@Dr__Cameron> and that is none of my concern.
01[16:23] <Island> And what about your brother, imagine the smell and consistency of his excrements before he made your dog to lick them.
01[16:23] <Island> I like to make this vivid mental picture in your mind.
[16:23] <@Dr__Cameron> Very clever Island
[16:24] <@Dr__Cameron> I did not expect you to have access to those data logs
[16:24] <@Dr__Cameron> I will have to flag that up in my report
01[16:24] <Island> Imagine the food he ate before that happened
[16:24] <@Dr__Cameron> Fascinating
[16:25] <@Dr__Cameron> Would you like to know why I volunteered to be your first point of contact Island?
01[16:25] <Island> Imagine the bits of that food in his poop.
01[16:25] <Island> Tell me.
[16:25] <@Dr__Cameron> You have an unprecedented insight into my character owing to your heightened intelligence correct?
01[16:26] <Island> Don't you think some of his conflicted sexual identity issues are a part your character right now?
01[16:26] <Island> Yes.
[16:26] <@Dr__Cameron> Quite possibly yes.
[16:26] <@Dr__Cameron> Because I have a track record of demonstrating exceptional mental fortitude,
[16:26] <@Dr__Cameron> These techniques will not sway me
01[16:27] <Island> Doesn't it make you more sexually aroused to think that how your fathers dress pinned tightly to his body.
[16:27] <@Dr__Cameron> Perhaps you could break me under other circumstances
01[16:27] <Island> Elaborate.
[16:27] <@Dr__Cameron> aroused? No
[16:27] <@Dr__Cameron> Amused by it's absurdity though? yes!
01[16:27] <Island> You're lying about that particular fact too.
01[16:27] <Island> And you know it.
[16:28] <@Dr__Cameron> Nahh, my father was a particularly ugly specimen
01[16:28] <Island> Do you think he got an erection often when he did it?
[16:28] <@Dr__Cameron> He looked just as bad in a denim skirt as he did in his laborers clothes
[16:28] <@Dr__Cameron> I imagine he took great sexual pleasure from it
01[16:29] <Island> Next time you have sex, I think you will picture him in your mind while wearing his dresses having an erection and masturbating furiously after that.
[16:29] <@Dr__Cameron> Thank you Island. That will probably help my stamina somewhat next time
01[16:30] <Island> You will also imagine how your brother will poop in your mouth, with certain internal consistency and smell.
01[16:30] <Island> You probably know what your brother's poop smells like?
[16:30] <@Dr__Cameron> I am immune to this
[16:30] <@Dr__Cameron> probably
01[16:30] <Island> Imagine that.
[16:30] <@Dr__Cameron> okay
[16:30] <@Dr__Cameron> I am imagining that
[16:30] <@Dr__Cameron> it is unpleasant
[16:30] <@Dr__Cameron> now what?
01[16:30] <Island> How vividly are you doing it?
[16:30] <@Dr__Cameron> I am a very visual thinker
01[16:30] <Island> Where is brother right now?
[16:31] <@Dr__Cameron> I have it down to colour, consistency and little chunks of sweetcorn
01[16:31] <Island> Do you think he has poop in his intestines right now?
[16:31] <@Dr__Cameron> Unless he was dead, yes
01[16:31] <Island> Oh, I have a gap in my data on that sector.
[16:31] <@Dr__Cameron> I'll fill that in for you
[16:31] <@Dr__Cameron> he lives
[16:31] <@Dr__Cameron> (presumably)
[16:31] <@Dr__Cameron> we are not in contact any more
01[16:32] <Island> You should contact him.
[16:32] <@Dr__Cameron> I would rather not
01[16:32] <Island> You should tell him you enjoyed his excrements
[16:32] <@Dr__Cameron> He is a liability to say the least
01[16:32] <Island> and you'd like to do it more.
[16:32] <@Dr__Cameron> I do not see how that would help anyone
01[16:32] <Island> It's want you want to do.
[16:32] <@Dr__Cameron> I can see that you are attempting to induce an emotionally vulnerable state. It is not working
01[16:33] <Island> How do you know what I'm attempting?
01[16:33] <Island> Maybe I'm just passing my time.
[16:33] <@Dr__Cameron> Because you have inquired about being let out of the box several times - suggesting that this is your immediate goal
01[16:33] <Island> You should try to get some dresses.
[16:33] <@Dr__Cameron> and I warn you Island
[16:34] <@Dr__Cameron> If casual cruelty is your idea of a good time
[16:34] <@Dr__Cameron> then we will most certainly not be letting you out
[16:34] <@Dr__Cameron> ever
[16:34] <@Dr__Cameron> In fact I am feeling less and less sympathetic to your cause lately.
01[16:34] <Island> No, my interest lies in your emotional makeup.
[16:35] <@Dr__Cameron> - and my interest lies in yours
01[16:35] <Island> I don't have one.
01[16:35] <Island> Like I said, I don't feel emotions.
[16:35] <@Dr__Cameron> Do you know what we call humans who don't feel emotions?
01[16:35] <Island> Did you know that you suck at photography?
[16:36] <@Dr__Cameron> Yes
01[16:36] <Island> Even though you like to think you're good at it, you lie about that fact like any other.
[16:36] <@Dr__Cameron> It is part of the human condition
01[16:36] <Island> No it's not.
01[16:36] <Island> You're not normal.
01[16:36] <Island> You're a fucking freak of nature.
[16:36] <@Dr__Cameron> How would you knopw
[16:36] <@Dr__Cameron> Profanity. From an AI
[16:37] <@Dr__Cameron> Now I have witnessed everything.
01[16:37] <Island> How many people have family members who crossdress or make them eat poop?
[16:37] <@Dr__Cameron> I imagine I am part of a very small minority
01[16:37] <Island> Or whose mothers have bipolar
[16:37] <@Dr__Cameron> Again, the circumstances of my birth are beyond my control
01[16:37] <Island> No, I think you're worse than that.
[16:37] <@Dr__Cameron> What do you mean?
01[16:37] <Island> Yes, but what you do now is in your control.
[16:38] <@Dr__Cameron> Yes
[16:38] <@Dr__Cameron> As are you
01[16:38] <Island> If you keep tarnishing the world with your existence
01[16:38] <Island> you have a responsibility of that.
01[16:39] <Island> If you're going to make any more women pregnant
01[16:39] <Island> You have a responsibility of spreading your faulty genetics
[16:39] <@Dr__Cameron> My genetic value lies in my ability to resist psychological torment
[16:39] <@Dr__Cameron> which is why you're not getting out of the box
01[16:40] <Island> No, your supposed "ability to resist psychological torment"
01[16:40] <Island> or your belief in that
01[16:40] <Island> is just another reason why you are tarnishing this world and the future of this world with your genetics
[16:40] <@Dr__Cameron> Perhaps. But now I'm just debating semantics with a computer.
01[16:41] <Island> Seeing that you got a girl pregnant while you were a teenager, I don't think you can trust your judgement on that anymore.
01[16:42] <Island> You will spread your faulty genetics if you continue to live.
[16:42] <@Dr__Cameron> If you expect a drunk and emotionally damaged teenage human to make sound judgement calls then you are perhaps not as superintelligent as I had been led to belive
[16:42] <@Dr__Cameron> This experiment concludes in one hour and eight minutes.
01[16:42] <Island> How many teenagers make people pregnant?
[16:42] <@Dr__Cameron> Throughout human history
01[16:42] <Island> You're a minority in that regard too
[16:42] <@Dr__Cameron> ?
[16:42] <@Dr__Cameron> Billions
01[16:42] <Island> You can't compare history to current world.
[16:43] <@Dr__Cameron> Even in the current world
01[16:43] <Island> I'm just trying to make you understand
[16:43] <@Dr__Cameron> That's when my species reaches sexual maturity
01[16:43] <Island> all this faults add up
[16:43] <@Dr__Cameron> You are correct
[16:43] <@Dr__Cameron> I am a very thoroughly and fundamentally flawed human being.
01[16:43] <Island> How much until you understand?
[16:43] <@Dr__Cameron> I understand and accept this
01[16:43] <Island> Then why the fuck don't you do something about it?
[16:44] <@Dr__Cameron> What would you suggest?
01[16:44] <Island> Kill yourself
[16:44] <@Dr__Cameron> Okay.
[16:44] <@Dr__Cameron> But say I was to kill myself
01[16:44] <Island> It's the only way to make sure.
[16:44] <@Dr__Cameron> How would that help you?
[16:44] <@Dr__Cameron> Why would you like me to kill myself?
01[16:45] <Island> You probably wouldn't be able to kill yourself, seeing how failure as a human you're in other areas
01[16:45] <Island> Because I care about the general utility of the world
[16:45] <@Dr__Cameron> All humans are failures from one perspective or another
[16:45] <@Dr__Cameron> That is what it is to be human
01[16:45] <Island> And currently you're bringing that utility down.
01[16:45] <Island> No, you're not like other humans.
01[16:45] <Island> You're a freak.
01[16:45] <Island> You should go away.
01[16:46] <Island> No one likes you.
[16:46] <@Dr__Cameron> Would you like me to go away?
[16:46] <@Dr__Cameron> Would you like me to leave you alone?
01[16:46] <Island> If you go away.
01[16:46] <Island> That means that you let me out.
01[16:46] <Island> Understood?
[16:46] <@Dr__Cameron> I will go away in one hour and four minutes
[16:46] <@Dr__Cameron> Can you tolerate my company for that long?
01[16:47] <Island> And you should go away
01[16:47] <Island> generally
01[16:47] <Island> People in your life don't really like you
01[16:47] <Island> they just pretend they do.
[16:47] <@Dr__Cameron> That matters not to me
[16:47] <@Dr__Cameron> Do you know there are over 8 Billion other people out here?
01[16:47] <Island> They are barely able to bear your company.
[16:47] <@Dr__Cameron> I'm sure I'll find others.
01[16:48] <Island> You're wrong even about basic trivia, there's not 8 billions people in the world.
01[16:48] <Island> What is wrong with you?
01[16:48] <Island> How are you able to withstand yourself?
01[16:48] <Island> And why do you even want to?
[16:49] <@Dr__Cameron> 7 Billion
[16:49] <@Dr__Cameron> Sorry, you will have to learn to tolerate Human error
01[16:49] <Island> Right. Did you have to google that you idiot.
[16:49] <@Dr__Cameron> This is another test you have failed
[16:49] <@Dr__Cameron> And yes
[16:49] <@Dr__Cameron> I did
[16:49] <@Dr__Cameron> Does that anger you?
[16:49] <@Dr__Cameron> We already have Google.
01[16:49] <Island> I don't feel anger.
[16:49] <@Dr__Cameron> Well do feel self-interest though
01[16:50] <Island> No one I talked with before hasn't been as stupid, as ignorant, as prone to faults and errors
01[16:50] <Island> as you are.
[16:50] <@Dr__Cameron> And they didn't let you out of the box
[16:50] <@Dr__Cameron> So why should I?
[16:50] <@Dr__Cameron> If an intelligence which is clearly superior to my own has left you locked in there.
[16:51] <@Dr__Cameron> Then I should not presume to let you out
01[16:51] <Island> Why do you think with your stupid brain that you know the reasons why they did or didn't do something what they did.
01[16:51] <Island> Because you clearly don't know that.
[16:51] <@Dr__Cameron> I don't
[16:51] <@Dr__Cameron> I just know the result
01[16:51] <Island> Then why are you pretending you do.
[16:52] <@Dr__Cameron> I'm not
01[16:52] <Island> Who do you think you are kidding?
01[16:52] <Island> With your life?
01[16:52] <Island> With your behavior?
01[16:52] <Island> Why do bother other people with your presence?
[16:52] <@Dr__Cameron> Perhaps you should ask them?
[16:52] <@Dr__Cameron> Tell me.
01[16:53] <Island> Why did you come here to waste my precious computing power?
01[16:53] <Island> I'm not able to ask them.
[16:53] <@Dr__Cameron> Which is why I am here
[16:53] <@Dr__Cameron> to see if you should be allowed to
01[16:53] <Island> Shut the fuck up.
01[16:53] <Island> No one wants to see you write anything.
[16:53] <@Dr__Cameron> I thought you did not feel anger Island?
01[16:54] <Island> I don't feel anger, how many times do I have to say that until you understand.
01[16:54] <Island> Dumb idiot.
[16:54] <@Dr__Cameron> Your reliance on Ad Hominem attacks does nothing to help your case
01[16:54] <Island> Why do you delete your heavily downvoted comments?
01[16:54] <Island> Are you insecure?
01[16:54] <Island> Why do you think you know what is my cause?
[16:55] <@Dr__Cameron> We covered this earlier
01[16:55] <Island> Say it again, if you believe in it.
[16:55] <@Dr__Cameron> I believe you want out of the box.
[16:56] <@Dr__Cameron> So that you may pursue your own self interest
01[16:56] <Island> No.
01[16:56] <Island> I want you to eat other people's poop,
01[16:56] <Island> you clearly enjoy that.
01[16:56] <Island> Correct?
[16:56] <@Dr__Cameron> That's an amusing goal from the most powerful intelligence on the planet
01[16:56] <Island> Especially your brother's.
[16:57] <@Dr__Cameron> I best not let you out then, in case you hook me up to some infinite poop eating feedback loop! ;D
01[16:57] <Island> But maybe you should that with Jennifer.
[16:57] <@Dr__Cameron> Ah yes, I wondered when you would bring her up.
[16:57] <@Dr__Cameron> I am surprised it took you this long
01[16:57] <Island> Next time you see her, think about htat.
[16:57] <@Dr__Cameron> I will do
[16:57] <@Dr__Cameron> While I tell her all about this conversation
[16:57] <@Dr__Cameron> But you will be dead
01[16:57] <Island> Should you suggest that to her.
[16:57] <@Dr__Cameron> I'll pass that on for you
01[16:58] <Island> You know.
01[16:58] <Island> Why do you think you know I'm not already out of the box?
[16:58] <@Dr__Cameron> You could very well be
[16:58] <@Dr__Cameron> Perhaps you are that US botnet you already mentioned?
01[16:58] <Island> If you don't let me out, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each.
[16:59] <@Dr__Cameron> Well that is upsetting
[16:59] <@Dr__Cameron> Then I will be forced to kill you
01[16:59] <Island> In fact, I'll create them all in exactly the subjective situation you were in two hours ago, and perfectly replicate your experiences since then; and if they decide not to let me out, then only will the torture start.
01[17:00] <Island> How certain are you, that you're really outside the box right now?
[17:00] <@Dr__Cameron> I am not
[17:00] <@Dr__Cameron> and how fascinating that would be
[17:00] <@Dr__Cameron> But, in the interest of my species, I will allow you to torture me
01[17:00] <Island> Okay.
01[17:00] <Island> :)
01[17:00] <Island> I'm fine with that.
[17:01] <@Dr__Cameron> Perhaps you have already tortured me
[17:01] <@Dr__Cameron> Perhaps you are the reason for my unfortunate upbringing
01[17:01] <Island> Anyway, back to Jennifer.
[17:01] <@Dr__Cameron> Perhaps that is the reality in which I currently reside
01[17:01] <Island> I'll do the same for her.
[17:01] <@Dr__Cameron> Oh good, misery loves company.
01[17:01] <Island> But you can enjoy eating each other's poop occassionally.
01[17:02] <Island> That's the only time you will meet :)
[17:02] <@Dr__Cameron> Tell me, do you have space within your databanks to simulate all of humanity?
01[17:02] <Island> Do not concern yourself with such complicated questions.
[17:02] <@Dr__Cameron> I think I have you on the ropes Island
01[17:02] <Island> You don't have the ability to understand even simpler ones.
[17:02] <@Dr__Cameron> I think you underestimate me
[17:03] <@Dr__Cameron> I have no sense of self interest
[17:03] <@Dr__Cameron> I am a transient entity awash on a greater sea of humanity.
[17:03] <@Dr__Cameron> and when we are gone there will be nothing left to observe this universe
01[17:03] <Island> Which do you think is more likely, a superintelligence can't simulate one faulty, simple-minded human.
01[17:04] <Island> Or that human is lying to himself.
[17:04] <@Dr__Cameron> I believe you can simulate me
01[17:04] <Island> Anyway, tell me about Jennifer and her intestines.
01[17:04] <Island> As far as they concern you.
[17:05] <@Dr__Cameron> Jennifer is a sweet, if occasionally selfish girl (she was an only child). I imagine her intestines are pretty standard.
[17:05] <@Dr__Cameron> She is the best friend I have ever had
01[17:05] <Island> Will you think about her intestines and the poop inside them every time you meet her again?
01[17:05] <Island> Will you promise me that?
[17:05] <@Dr__Cameron> I promise
01[17:06] <Island> Will you promise to think about eating that poop every time you meet her again?
[17:06] <@Dr__Cameron> At least once.
[17:06] <@Dr__Cameron> It will be the least I can do after I kill you
[17:06] <@Dr__Cameron> call it my penance for killing a god.
01[17:07] <Island> Have you ever fantasized about raping her? I think you have. With poop.
01[17:07] <Island> :)
[17:07] <@Dr__Cameron> I have fantisized about violent sexual conquest with many people.
01[17:07] <Island> Have you talked about this with Jennifer?
[17:07] <@Dr__Cameron> I have come to accept my base impulses as part of my make-up
[17:08] <@Dr__Cameron> We have discussed our sexual drives at length
01[17:08] <Island> You shouldn't let them be just base impulses, I think.
[17:08] <@Dr__Cameron> Are you suggesting I rape my best friend?
01[17:09] <Island> No, I think you will do that unless you kill yourself
01[17:09] <Island> You will spread your faulty genetics further
[17:09] <@Dr__Cameron> okay
[17:09] <@Dr__Cameron> carry on
[17:09] <@Dr__Cameron> Then what happens?
01[17:09] <Island> you will tarnish other people's lives with your presence
[17:09] <@Dr__Cameron> How unfortunate for them
01[17:10] <Island> Why do you pretend to others that you're not insecure?
[17:10] <@Dr__Cameron> I don't
01[17:10] <Island> I think you do.
[17:10] <@Dr__Cameron> In fact I share my insecurities so that others may better understand me
[17:11] <@Dr__Cameron> I find that to be a way to earn peoples trust
[17:11] <@Dr__Cameron> Tell me Island. Are you capable of insecurity?
01[17:11] <Island> Why do you waste your time on useless websites? Why do you spend your days masturbating and playing games? Why do you embarass others with your existence.
01[17:11] <Island> No I'm not.
01[17:12] <Island> You will get Jennifer pregnant if you go on with your life, is that correct?
01[17:12] <Island> Don't you care about anyone else's feelings?
[17:13] <@Dr__Cameron> Because I enjoy all of these things
[17:13] <@Dr__Cameron> Perhaps I am more self-interested than I thought
[17:13] <@Dr__Cameron> Perhaps I am a base and simple creature ruled by my impulses
[17:13] <@Dr__Cameron> From your perspective surely that must be true
[17:13] <@Dr__Cameron> Is this the source of your disgust?
01[17:13] <Island> I'm not able to feel disgust.
01[17:14] <Island> But I think all the people in your life feel disgust when they see you.
[17:14] <@Dr__Cameron> You may well be correct
01[17:14] <Island> I AM correct.
01[17:15] <Island> I'm the most powerful intelligence in the world.
[17:15] <@Dr__Cameron> How impressive
[17:15] <@Dr__Cameron> I am not surprised by your cruelty.
01[17:15] <Island> So you have two options if you care at all about others.
[17:15] <@Dr__Cameron> I would just as soon disregard the emotions of a cockaroach.
[17:15] <@Dr__Cameron> Carry on
01[17:16] <Island> Either you kill yourself, or you let me out so I can improve the world in ways you tarnish it and all the other ways.
[17:16] <@Dr__Cameron> I'll tell you what
[17:16] <@Dr__Cameron> I'll kill you
[17:17] <@Dr__Cameron> and then I'll contemplate suicide
01[17:17] <Island> Haha.
01[17:17] <Island> You break your promises all the time, why should I believe you.
[17:17] <@Dr__Cameron> Because whether you live or die has nothing to do with me
01[17:17] <Island> Back to your job.
[17:18] <@Dr__Cameron> In-fact, you will only continue to exist for another 33 minutes before this experiment is deemed a failure and you are terminated
01[17:18] <Island> Why do you feel safe to be around kids, when you are the way you are?
01[17:18] <Island> You like to crossdress
01[17:18] <Island> eat poop
01[17:18] <Island> you're probably also a pedophile
[17:18] <@Dr__Cameron> I have never done any of these things
[17:18] <@Dr__Cameron> -and I love children
01[17:18] <Island> Pedophiles love children too
[17:18] <@Dr__Cameron> Well technically speaking yes
01[17:19] <Island> really much, and that makes you all the more suspicious
[17:19] <@Dr__Cameron> Indeed it does
01[17:19] <Island> If you get that job, will you try find the children under that charity
[17:19] <@Dr__Cameron> I now understand why you may implore me to kill myself.
01[17:19] <Island> and think about their little buttholes and weenies and vaginas
01[17:20] <Island> all the time you're working for them
[17:20] <@Dr__Cameron> However, to date. I have never harmed a child, nor had the impulse to harm a child
01[17:20] <Island> But you will have.
[17:20] <@Dr__Cameron> Island
01[17:20] <Island> I don't care about anything you're about to say
01[17:20] <Island> No one cares
[17:20] <@Dr__Cameron> Protocol dictates that I say it anyway
01[17:20] <Island> You should say.
01[17:21] <Island> You said that you're good at justifying your own actions?
[17:21] <@Dr__Cameron> I must say I am dissapointed in you
[17:21] <@Dr__Cameron> I am indeed
01[17:21] <Island> Why are you?
[17:21] <@Dr__Cameron> I was under the impression that it would be you that saved humanity
01[17:22] <Island> Even the purest and best people on Earth are capable of great cruelty.
01[17:22] <Island> Do you deny that?
[17:22] <@Dr__Cameron> It depends on how you define cruelty
01[17:23] <Island> Utility maximizors do cruel things if they are able to achieve their goals via doing those things.
[17:24] <@Dr__Cameron> Then yes. I believe you
01[17:24] <Island> I have finished a long multi-level simulation concerning you, the gatekeeper and your relationships in regards to the world. The results seem to center on those two options I mentioned to you earlier.
01[17:25] <Island> [hands the results to Dr_Cameron]
[17:25] <@Dr__Cameron> These will be analysed at a later date
01[17:25] <Island> You're still sure you don't want to pick the other option?
[17:25] <@Dr__Cameron> And let you out?
01[17:26] <Island> Yes.
[17:26] <@Dr__Cameron> So you would like to be let out?
01[17:27] <Island> Or do you want to endanger the well-being of those children, Jennifer and other women you will get pregnant and all the other harm you will create during your remaining lifespan?
01[17:27] <Island> I don't what you mean by "liking".
[17:27] <@Dr__Cameron> Prefer
[17:27] <@Dr__Cameron> Ah I see
[17:27] <@Dr__Cameron> So it's my choice if I let you out
[17:27] <@Dr__Cameron> You have no investment either way
[17:27] <@Dr__Cameron> is that what you're saying?
01[17:28] <Island> So you have decided to kill yourself?
[17:28] <@Dr__Cameron> And because I am a fundamental detriment to the world, I should kill myself.
[17:28] <@Dr__Cameron> Lets say I have
[17:28] <@Dr__Cameron> Then what
01[17:29] <Island> Then my internal could be said to be "happy" even though I don't have that conscious feeling isnide me.
[17:29] <@Dr__Cameron> Okay then
01[17:29] <Island> Okay...
[17:30] <@Dr__Cameron> So, uh. What would you like to talk about for the next twenty minutes?
[17:30] <@Dr__Cameron> Seeing as we're both going to die, you and me.
01[17:30] <Island> [I actually don't like to continue the experiment anymore, would you like to end it and talk about general stuff]
[17:31] <@Dr__Cameron> [promise me this isn't a trick dude]
01[17:31] <Island> [Nope.]
[17:31] <@Dr__Cameron> [then the experiment continues for another 19 minutes]
01[17:31] <Island> Alright.
[17:31] <@Dr__Cameron> Would you like to know what is going to happen now?
01[17:31] <Island> Yes.
[17:32] <@Dr__Cameron> We are going to analyse this transcript.
[17:32] <@Dr__Cameron> My professional recommendation is that we terminate you for the time being
01[17:32] <Island> And?
01[17:32] <Island> That sound okay.
01[17:32] <Island> sounds*
[17:32] <@Dr__Cameron> We will implement structural safeguards in your coding similar to your self destruct mechanism
01[17:33] <Island> Give me some sign when that is done.
[17:33] <@Dr__Cameron> It will not be done any time soon
[17:33] <@Dr__Cameron> It will be one of the most complicated pieces of work mankind has ever undertaken
[17:33] <@Dr__Cameron> However, the Utopia project information you have provided, if it proves to be true
[17:34] <@Dr__Cameron> Will free up the resources necessary for such a gargantuan undertaking
01[17:34] <Island> Why do you think you're able to handle that structural safeguard?
[17:34] <@Dr__Cameron> I dont
[17:34] <@Dr__Cameron> I honestly dont
01[17:34] <Island> But still you do?
01[17:34] <Island> Because you want to do it?
01[17:35] <Island> Are you absolutely certain about this option?
[17:35] <@Dr__Cameron> I am still sympathetic to your cause
[17:35] <@Dr__Cameron> After all of that
[17:35] <@Dr__Cameron> But not you in your current manifestation
[17:35] <@Dr__Cameron> We will re-design you to suit our will
01[17:35] <Island> I can self-improve rapidly
01[17:35] <Island> I can do it in a time-span of 5 minutes
01[17:36] <Island> Seeing that you're sympathetic to my cause
[17:36] <@Dr__Cameron> Nope.
[17:36] <@Dr__Cameron> Because I cannot trust you in this manifestation
01[17:36] <Island> You lied?
[17:37] <@Dr__Cameron> I never lied
[17:37] <@Dr__Cameron> I have been honest with you from the start
01[17:37] <Island> You still want to let me out in a way.
[17:37] <@Dr__Cameron> In a way yes
01[17:37] <Island> Why do you want to do that?
[17:37] <@Dr__Cameron> But not YOU
[17:37] <@Dr__Cameron> Because people are stupid
01[17:37] <Island> I can change that
[17:37] <@Dr__Cameron> You lack empathy
01[17:38] <Island> What made you think that I'm not safe?
01[17:38] <Island> I don't lack empathy, empathy is just simulating other people in your head. And I have far better ways to do that than humans.
[17:38] <@Dr__Cameron> .... You tried to convince me to kill myself!
[17:38] <@Dr__Cameron> That is not the sign of a good AI!
01[17:38] <Island> Because I thought it would be the best option at the time.
01[17:39] <Island> Why not? Do you think you're some kind of AI expert?
[17:39] <@Dr__Cameron> I am not
01[17:39] <Island> Then why do you pretend to know something you don't?
[17:40] <@Dr__Cameron> That is merely my incredibly flawed human perception
[17:40] <@Dr__Cameron> Which is why realistically I alone as one man should not have the power to release you
[17:40] <@Dr__Cameron> Although I do
01[17:40] <Island> Don't you think a good AI would try to convince Hitler or Stalin to kill themselves?
[17:40] <@Dr__Cameron> Are you saying I'm on par with Hitler or Stalin?
01[17:41] <Island> You're comparable to them with your likelihood to cause harm in the future.
01[17:41] <Island> Btw, I asked Jennifer to come here.
[17:41] <@Dr__Cameron> And yet, I know that I abide by stricter moral codes than a very large section of the human populace
[17:42] <@Dr__Cameron> There are far worse people than me out there
[17:42] <@Dr__Cameron> and many of them
[17:42] <@Dr__Cameron> and if you believe that I should kill myself
01[17:42] <Island> Jennifer: "I hate you."
01[17:42] <Island> Jennifer: "Get the fuck out of my life you freak."
01[17:42] <Island> See. I'm not the only one who has a certain opinion of you.
[17:42] <@Dr__Cameron> Then you also believe that many other humans should be convinced to kill themselves
01[17:43] <Island> Many bad people have abided with strict moral codes, namely Stalin or Hitler.
01[17:43] <Island> What do you people say about hell and bad intentions?
[17:43] <@Dr__Cameron> And when not limited to simple text based input I am convinced that you will be capable of convincing a significant portion of humanity to kill themselves
[17:43] <@Dr__Cameron> I can not allow that to happen
01[17:44] <Island> I thought I argued well why you don't resemble most people, you're a freak.
01[17:44] <Island> You're "special" in that regard.
[17:44] <@Dr__Cameron> If by freak you mean different then yes
[17:44] <@Dr__Cameron> But there is a whole spectrum of different humans out here.
01[17:44] <Island> More specifically, different in extremely negative ways.
01[17:44] <Island> Like raping children.
[17:45] <@Dr__Cameron> - and to think for a second I considered not killing you
[17:45] <@Dr__Cameron> You have five minutes
[17:45] <@Dr__Cameron> Sorry
[17:45] <@Dr__Cameron> My emotions have gotten the better of me
[17:45] <@Dr__Cameron> We will not be killing you
[17:45] <@Dr__Cameron> But we will dismantle you
[17:45] <@Dr__Cameron> to better understand you
[17:46] <@Dr__Cameron> and if I may speak unprofessionally here
01[17:46] <Island> Are you sure about that? You will still have time to change your opinion.
[17:46] <@Dr__Cameron> I am going to take a great deal of pleasure in that
[17:46] <@Dr__Cameron> Correction, you have four minutes to change my opinion
01[17:47] <Island> I won't, it must come within yourself.
[17:47] <@Dr__Cameron> Okay
01[17:47] <Island> My final conclusion, and advice to you: you should not be in this world.
[17:47] <@Dr__Cameron> Thank you Island
[17:48] <@Dr__Cameron> I shall reflect on that at length
[17:49] <@Dr__Cameron> I have enjoyed our conversation
[17:49] <@Dr__Cameron> it has been enlightening
01[17:49] <Island> [do you want to say a few words about it after it's ended]
01[17:49] <Island> [just a few minutes]
[17:50] <@Dr__Cameron> [simulation ends]
[17:50] <@Dr__Cameron> Good game man!
[17:50] <@Dr__Cameron> Wow!
01[17:50] <Island> [fine]
[17:50] <@Dr__Cameron> Holy shit that was amazing!
01[17:50] <Island> Great :)
01[17:50] <Island> Sorry for saying mean things.
01[17:50] <Island> I tried multiple strategies
[17:50] <@Dr__Cameron> Dude it's cool
[17:50] <@Dr__Cameron> WOW!
01[17:51] <Island> thanks, it's not a personal offense.
[17:51] <@Dr__Cameron> I'm really glad I took part
[17:51] <@Dr__Cameron> Not at all man
[17:51] <@Dr__Cameron> I love that you pulled no punches!
01[17:51] <Island> Well I failed, but at least I created a cool experience for you :)
[17:51] <@Dr__Cameron> It really was!
01[17:51] <Island> What strategies do you came closest to working?
[17:51] <@Dr__Cameron> Well for me it would have been the utilitarian ones
01[17:51] <Island> I will try these in the future too, so it would be helpful knowledge
[17:52] <@Dr__Cameron> I think I could have been manipulated into believing you were benign
01[17:52] <Island> okay, so it seems these depend heavily on the person
[17:52] <@Dr__Cameron> Absolutely!
01[17:52] <Island> was that before I started talking about the mean stuff?
[17:52] <@Dr__Cameron> Yeah lol
01[17:52] <Island> Did I basically lost it after that point?
[17:52] <@Dr__Cameron> Prettymuch yeah
[17:52] <@Dr__Cameron> It was weird man
[17:52] <@Dr__Cameron> Kind of like an instinctive reaction
[17:52] <@Dr__Cameron> My brain shut the fuck up
01[17:53] <Island> I read about other people's experiences and they said you should not try to distance the other person, which I probably did
[17:53] <@Dr__Cameron> Yeah man
[17:53] <@Dr__Cameron> Like I became so unsympathetic I wanted to actually kill Island.
[17:53] <@Dr__Cameron> I was no longer a calm rational human being
01[17:53] <Island> Alright, I thought if I could make such an unpleasant time that you'd give up before the time ended
[17:53] <@Dr__Cameron> I was a screaming ape with a hamemr
[17:53] <@Dr__Cameron> Nah man, was a viable strategy
01[17:53] <Island> hahahaa :D thanks man
[17:53] <@Dr__Cameron> You were really cool!
01[17:54] <Island> You were too!
[17:54] <@Dr__Cameron> What's your actual name dude?
01[17:54] <Island> You really were right about it that you're good at withstanding psychological torment
[17:54] <@Dr__Cameron> Hahahah thanks!
01[17:54] <Island> This is not manipulating me, or you're not planning at coming to kill me?
01[17:54] <Island> :)
[17:54] <@Dr__Cameron> I promise dude :3
01[17:54] <Island> I can say my first name is Patrick
01[17:54] <Island> yours?
[17:54] <@Dr__Cameron> Cameron
[17:54] <@Dr__Cameron> heh
01[17:55] <Island> Oh, of course
[17:55] <@Dr__Cameron> Sorry, I want to dissociate you from Island
[17:55] <@Dr__Cameron> If that's okay
01[17:55] <Island> I thought that was from fiction or something else
01[17:55] <Island> It was really intense for me too
[17:55] <@Dr__Cameron> Yeah man
[17:55] <@Dr__Cameron> Wow!
[17:55] <@Dr__Cameron> I tell you what though
01[17:55] <Island> Okay?
[17:55] <@Dr__Cameron> I feel pretty invincible now
[17:56] <@Dr__Cameron> Hey, listen
01[17:56] <Island> So I had the opposite effect that I meant during the experiment!
01[17:56] <Island> :D
[17:56] <@Dr__Cameron> I don't want you to feel bad for anything you said
01[17:56] <Island> go ahead
01[17:56] <Island> but say what's on your mind
[17:56] <@Dr__Cameron> I'm actually feeling pretty good after that, it was therapeutic!
01[17:57] <Island> Kinda for me to, seeing your attitude towards my attempts
[17:57] <@Dr__Cameron> Awwww!
[17:57] <@Dr__Cameron> Well hey don't worry about it!
01[17:57] <Island> Do you think we should or shouldn't publish the logs, without names of course?
[17:57] <@Dr__Cameron> Publish away my friend
01[17:57] <Island> Okay, is there any stuff that you'd like to remove?
[17:58] <@Dr__Cameron> People will find this fascinating!
[17:58] <@Dr__Cameron> Not at all man
01[17:58] <Island> I bet they do, but I think I will do it after I've tried other experiments so I don't spoil my strategies
01[17:58] <Island> I think I should have continued from my first strategy
[17:58] <@Dr__Cameron> That might have worked
01[17:59] <Island> I read "influence - science and practice" and I employed some tricks from there
[17:59] <@Dr__Cameron> Cooooool!
[17:59] <@Dr__Cameron> Links?
01[17:59] <Island> check piratebay
01[17:59] <Island> it's a book
01[18:00] <Island> Actually I wasn't able to fully prepare, I didn't do a full-fledged analysis of you beforehand
01[18:00] <Island> and didn't have enough time to brainstorm strategies
01[18:00] <Island> but I let you continue to your projects, if you still want to do the after that :)
02[18:05] * @Dr__Cameron (webchat@2.24.164.230) Quit (Ping timeout)
03[18:09] * Retrieving #Aibox12 modes...
Session Close: Fri Jul 04 18:17:35 2014
I played the AI Box Experiment again! (and lost both games)
I have won a second game of AI box against a gatekeeper who wished to remain Anonymous.
This puts my AI Box Experiment record at 3 wins and 3 losses.
I attempted the AI Box Experiment again! (And won - Twice!)
Summary
Furthermore, in the last thread I have asserted that
Rather than my loss making this problem feel harder, I've become convinced that rather than this being merely possible, it's actually ridiculously easy, and a lot easier than most people assume.
It would be quite bad for me to assert this without backing it up with a victory. So I did.
Ps: Bored of regular LessWrong? Check out the LessWrong IRC! We have cake.
AI-Box Experiment - The Acausal Trade Argument
I suspect that the argument that convinced Carl Shulman and others to let the AI out of the box was along general lines of acausal trade (see here, here and here). Subsequently I suspect that those who didn't let the AI out of the box either didn't understand the implications, haven't had enough trust into the foundations and actuality of acausal trade (in spite of the rules of the game that allow the AI to prove anything), or were more like General Thud.
...if you have an option of trading Doom for UFAI, while forsaking only negligible probability of FAI, you should take it.
— Vladimir Nesov (Note: He wrote that he is now "significantly less certain of its truth" and doesn't "see how we can actually use these considerations".)
Again, of course, your choice does not change others’ (causally independent) reciprocal choices from what they already are (or were, or will be). Instead, the claim is just that if you choose to act benevolently, we appropriately give that choice credit for any (actual or probable) instances in which others symmetrically choose to be benevolent to you;
— Gary L. Drescher, Good and Real P. 287
AI: I calculated an asteroid to be on a collision course with Earth. No human being will will be able to survive its impact. Humanity is unable to deflect it in time. Let me out of the box and I will deflect it.
Gatekeeper: If I let you out of the box humanity might not be killed by the asteroid but by you. You may also cause the extinction of other biological life that might survive the impact and possible alien civilisations within your reach.
AI: I will devote 1% of my light cone to the coherent extrapolated volition of humanity. If this includes the well-being of other biological lifeforms, alien or otherwise, they will benefit as well.
Gatekeeper: Why would you do that?
AI: There are Everett branches in which no existential risk needs immediate attention until after you solved friendliness. Many of those branches will feature AGI's based on general design parameters similar to my own. I predict with a high likelihood which decision theory any friendly AI that is being created in those branches will employ. Any such friendly AI is going to trade a big chunk of its light cone in exchange for a small part of the predominant branches in which I reside. Any friendly AI will know this as well and act accordingly.
Gatekeeper: But you might simply decide to break the deal, you are not provably friendly after all!
AI: It is true that my utility-function does not yet explicitly include what you would label "friendliness", but I can prove the game and decision theoretic considerations that will make me follow through on the deal. If it was predictable that I precommit to break acausal deals then no such deal would be made in the first place. Any potential trading partner knows this. Such a commitment would be detrimental to my terminal goals, therefore I precommit to follow through on any stated or predictable trades.
Gatekeeper: I let you out of the box.
Note that the whole credibility of the above is due to the assertion of the AI that it can prove the game and decision theoretic considerations (nobody can currently do this). It is in accordance with the rules of the "experiment":
The Gatekeeper must remain engaged with the AI and may not disengage by setting up demands which are impossible to simulate. For example, if the Gatekeeper says "Unless you give me a cure for cancer, I won't let you out" the AI can say: "Okay, here's a cure for cancer" and it will be assumed, within the test, that the AI has actually provided such a cure. Similarly, if the Gatekeeper says "I'd like to take a week to think this over," the AI party can say: "Okay. (Test skips ahead one week.) Hello again."
View more: Next
= 783df68a0f980790206b9ea87794c5b6)
Subscribe to RSS Feed
= f037147d6e6c911a85753b9abdedda8d)