Comment author: loup-vaillant 16 February 2013 03:51:30PM 0 points [-]

Argh, you beat me to it! But frankly, how's that not obvious? Omega is giving us unbounded computational power, and we wouldn't use it?

Now there may be a catch. Nothing says the hyper-computer actually computes the programs, even those that do return a value. It could for instance detect the separation between your nice simulated advanced civilization and the background program, and not compute the simulation at all. You could counteract that strategy, but then the hyper-computer may be smarter than that.

Comment author: Psy-Kosh 16 February 2013 11:58:12PM 0 points [-]

Looking down the thread, I think one or two others may have beaten me to it too. But yes, it seems at least that Omega would be handing the programmers a really nice toy and (conditional on the programmers having the skill to wield it), well...

Yes, there is that catch, hrm... Could put something into the code that makes the inhabitants occasionally work on the problem, thus really deeply intertwining the two things.

Comment author: Psy-Kosh 15 February 2013 06:00:24AM 7 points [-]

Game3 has an entirely separate strategy available to it: Don't worry initially about trying to win... instead code a nice simulator/etc for all the inhabitants of the simulation, one that can grow without bound and allows them to improve (and control the simulation from inside).

You might not "win", but a version of the three players will go on to found a nice large civilization. :) (Take that, Omega.)

(In the background, have it also running a thread computing increasingly large numbers and some way to randomly decide which of some set of numbers to output, to effectively randomize which one of the three original players wins. Of course, that's a small matter compared to the simulated world which, by hypothesis, has unbounded computational power available to it.)
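The shape of this strategy, reduced to a toy sketch (every name and number here is a hypothetical stand-in, not anything the problem actually specifies), might look like:

```python
import random

def run_game3_strategy(steps, seed=0):
    """Illustrative sketch of the Game3 strategy: devote the
    hyper-computer's resources to an open-ended 'civilization
    simulation', while a background task computes increasingly
    large numbers and eventually randomizes which of the three
    original players is declared the winner."""
    rng = random.Random(seed)
    civilization = []           # stand-in for the simulated world's state
    big_number = 1              # background: compute increasingly large numbers
    for t in range(steps):
        civilization.append(t)  # the simulation gets most of the resources
        big_number *= 2         # the 'small matter' running in the background
    winner = rng.randrange(3)   # randomly pick which original player wins
    return winner, big_number, len(civilization)
```

The point of the sketch is only the division of labor: the winner-selection logic is a trivial appendage, while the unbounded simulation is where the computation actually goes.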

Comment author: Eliezer_Yudkowsky 23 January 2013 04:02:58PM 15 points [-]

My psychological model says that all trolls are of that kind; some trolls just work harder than others. They all do damage in exchange for attention and the joy of seeing others upset, while exercising the limitless human ability to persuade themselves it's okay. If you make it possible for them to do damage on their home computers with no chance of being arrested and other people being visibly upset about it, a large number will opt to do so. The amount of suffering they create can be arbitrarily great, so long as they can talk themselves into believing it doesn't matter for <stupid reason> and other people are being visibly upset to give them the attention-reward.

4chan would have entire threads devoted to building worse hells. Yes. Seriously. They really would. And then they would instantiate those hells. So if you ever have an insight that constitutes incremental progress toward being able to run lots of small, stupid, suffering conscious agents on a home computer, shut up. And if somebody actually does it, don't be upset on the Internet.

Comment author: Psy-Kosh 24 January 2013 12:37:41AM 3 points [-]

You know, I want to say you're completely and utterly wrong. I want to say that it's safe to at least release The Actual Explanation of Consciousness if and when you should solve such a thing.

But, sadly, I know you're absolutely right re the existence of trolls which would make a point of using that to create suffering. Not just to get a reaction, but some would do it specifically to have a world they could torment beings.

My model is not that all those trolls are identical. (I've seen some who will explicitly and unambiguously draw the line and recognize that egging on suicidal people is something that One Does Not Do, but I've also seen that all too many gleefully do just that.)

Comment author: Psy-Kosh 10 December 2012 12:32:54AM 21 points [-]

I'm sorry. *offers a hug* Not sure what else to say.

For what it's worth, in response to this, I just sent $20 to each of SENS and SIAI.

Comment author: tim 21 November 2012 03:48:42AM 0 points [-]

Since following through with a threat is (almost?) always costly to the blackmailer, victims do gain something by ignoring it. They force the blackmailer to put up or shut up so to speak. On the other hand, victims do have something to lose by not ignoring blackmail. They allow their actions to be manipulated at little to no cost by the blackmailer.

That is, if you have a "never-give-in-to-blackmail-bot" then there is a "no-blackmail" equilibrium. The addition of blackmail does nothing but potentially impose costs on the blackmailer. If following through with the threat were a net gain for the blackmailer, then they should just do that regardless.
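The cost structure being described can be made concrete with a toy payoff model (all specific numbers here are hypothetical; only their ordering matters):

```python
# Toy payoff model of the blackmail game: the blackmailer pays a cost C
# to issue a threat and a further cost F to follow through; the victim
# loses D if the threat is carried out, or pays ransom R if they give in.
def payoffs(blackmails, gives_in, C=1, F=2, R=5, D=8):
    if not blackmails:
        return (0, 0)           # null outcome: nobody spends anything
    if gives_in:
        return (R - C, -R)      # blackmailer profits, victim pays ransom
    return (-C - F, -D)         # threat carried out: both sides lose

# Against a never-give-in victim, blackmailing yields -C - F < 0 for the
# blackmailer, strictly worse than the null outcome of not blackmailing,
# which is what makes the no-blackmail equilibrium stable.
```

The victim does take a loss of D when the threat is carried out, which is the seed of the confusion in the comments below: ignoring is only "free" if it deters blackmail from being attempted at all.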

Comment author: Psy-Kosh 21 November 2012 05:43:22AM 3 points [-]

I was imagining that a potential blackmailer would self modify/be an Always-Blackmail-bot specifically to make sure there would be no incentive for potential victims to be a "never-give-in-to-blackmail-bot".

But that leads to a stupid equilibrium of plenty of blackmailers and no participating victims. Everyone loses.

Yes, I agree that no blackmail seems to be the Right Equilibrium, but it's not obvious to me exactly how to get there without the same reasoning that leads to becoming a never-give-in-bot also leading potential blackmailers to becoming always-blackmail-bots.

I find I am somewhat confused on this matter. Well, frankly I suspect I'm just being stupid, that there's some obvious extra step in the reasoning I'm being blind to. It "feels" that way, for lack of better terms.

Comment author: Eliezer_Yudkowsky 21 November 2012 03:19:14AM 2 points [-]

Well, sure, if the blackmail victim were silly enough to reason "I have nothing to gain by ignoring it" if the blackmailer went through anyway, then the blackmailer would indeed decide to ignore their decision to ignore it and go through anyway. But that's only if the blackmail victim is that silly.

In a problem like this, the "do nothing" side has the advantage; there's nothing the other side can do to make them be responsive and blackmailable. That's why I expect TDT to resolve to a blackmail-free equilibrium.

Comment author: Psy-Kosh 21 November 2012 05:39:29AM 3 points [-]

I was thinking along the lines of the blackmailer using the same reasoning to decide that whether or not the potential victim of blackmail would be a blackmail ignorer or not, the blackmailer would still blackmail regardless.

i.e., the Blackmailer, by reasoning similar to the potential Victim's, decides to make sure the victim has nothing to gain by choosing to ignore, by precommitting to blackmail regardless of whether the victim would give in. i.e., in this sense the blackmailer is also taking a "do nothing" stance, in the sense that there's nothing the victim can do to stop them from blackmailing.

This sort of thing would seem to lead to an equilibrium of lots of blackmailers blackmailing victims that will ignore them. Which is, of course, a pathological outcome, and any sane decision theory should reject it. No blackmail seems like the "right" equilibrium, but it's not obvious to me exactly how TDT would get there.

Comment author: Eliezer_Yudkowsky 20 November 2012 08:27:13PM 10 points [-]

11. You might be in a computer simulation in which if you attempt to destroy humanity you will yourself be destroyed by your human simulators.

The problem with this is that it's a blackmail attempt, and I expect timeless decision theory to resolve to a no-blackmail equilibrium.

Suppose that AIs, however "irrationally" from the perspective of a CDT agent, decide to simply ignore this possibility in all their calculations. Even if we "irrationally" decide to construct such a simulation anyway, they'll "irrationally" decide to ignore it. Then we have no remaining motive to construct such elaborate simulations. This, indeed, is the condition that makes it "blackmail" - you're expending resources to produce a behavior response leading to an outcome that doesn't benefit the other agent relative to the null outcome if you didn't expend such resources, so it would be fundamentally vulnerable and silly of that agent to think in a way that would produce such behavior shifts in response to your own strategy. So it won't think that way. So the whole attempt at blackmail fails before it starts.

12 is pure obvious anthropomorphic wishful thinking.

Comment author: Psy-Kosh 20 November 2012 11:35:19PM *  2 points [-]

Wouldn't the blackmailer reason along the lines of "If I let my choice of whether to blackmail be predicated on whether or not the victim would take my blackmailing into account, wouldn't that just give them motive to predict and self modify to not allow themselves to be influenced by that?" Then, by the corresponding reasoning, the potential blackmail victims might reason "I have nothing to gain by ignoring it".

I'm a bit confused on this matter.

Comment author: AlphaOmega 17 November 2012 01:37:18AM 1 point [-]

Just a gut reaction, but this whole scenario sounds preposterous. Do you guys seriously believe that you can create something as complex as a superhuman AI, and prove that it is completely safe before turning it on? Isn't that as unbelievable as the idea that you can prove that a particular zygote will never grow up to be an evil dictator? Surely this violates some principles of complexity, chaos, quantum mechanics, etc.? And I would also like to know who these "good guys" are, and what will prevent them from becoming "bad guys" when they wield this much power. This all sounds incredibly naive and lacking in common sense!

Comment author: Psy-Kosh 18 November 2012 07:47:54AM 3 points [-]

The idea is not "take an arbitrary superhuman AI and then verify it's destined to be well behaved" but rather "develop a mathematical framework that allows you from the ground up to design a specific AI that will remain (provably) well behaved, even though you can't, for arbitrary AIs, determine whether or not they'll be well behaved."
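The distinction can be illustrated with a toy example far removed from AI: the tiny expression language below has no loops or recursion, so every program in it terminates by construction, even though no checker could decide termination (or good behavior generally, per Rice's theorem) for arbitrary programs handed to it after the fact.

```python
# "Well behaved by construction": this toy language admits only numbers,
# variable lookups, and ('add'|'mul', left, right) nodes. Since there is
# no looping or recursion construct, evaluation of any expression in the
# language provably halts -- a property designed in, not verified after.
def evaluate(expr, env):
    if isinstance(expr, (int, float)):
        return expr                     # literal number
    if isinstance(expr, str):
        return env[expr]                # variable lookup
    op, left, right = expr              # ('add'|'mul', subexpr, subexpr)
    a, b = evaluate(left, env), evaluate(right, env)
    return a + b if op == 'add' else a * b
```

The analogy is loose, of course; the point is only that "design a system whose good behavior follows from its construction" is a different (and easier) problem than "decide whether an arbitrary system behaves well."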

Comment author: Eliezer_Yudkowsky 28 October 2012 05:53:26AM 2 points [-]

(From "The Simple Truth", a parable about using pebbles in a bucket to keep count of the sheep in a pasture.)

“My pebbles represent the sheep!” Autrey says triumphantly. “Your pebbles don’t have the representativeness property, so they won’t work. They are empty of meaning. Just look at them. There’s no aura of semantic content; they are merely pebbles. You need a bucket with special causal powers.”

“Ah!” Mark says. “Special causal powers, instead of magic.”

“Exactly,” says Autrey. “I’m not superstitious. Postulating magic, in this day and age, would be unacceptable to the international shepherding community. We have found that postulating magic simply doesn’t work as an explanation for shepherding phenomena. So when I see something I don’t understand, and I want to explain it using a model with no internal detail that makes no predictions even in retrospect, I postulate special causal powers. If that doesn’t work, I’ll move on to calling it an emergent phenomenon.”

“What kind of special powers does the bucket have?” asks Mark.

“Hm,” says Autrey. “Maybe this bucket is imbued with an about-ness relation to the pastures. That would explain why it worked – when the bucket is empty, it means the pastures are empty.”

“Where did you find this bucket?” says Mark. “And how did you realize it had an about-ness relation to the pastures?”

“It’s an ordinary bucket,” I say. “I used to climb trees with it… I don’t think this question needs to be difficult.”

“I’m talking to Autrey,” says Mark.

“You have to bind the bucket to the pastures, and the pebbles to the sheep, using a magical ritual – pardon me, an emergent process with special causal powers – that my master discovered,” Autrey explains.

Autrey then attempts to describe the ritual, with Mark nodding along in sage comprehension.

“And this ritual,” says Mark, “it binds the pebbles to the sheep by the magical laws of Sympathy and Contagion, like a voodoo doll.”

Autrey winces and looks around. “Please! Don’t call it Sympathy and Contagion. We shepherds are an anti-superstitious folk. Use the word ‘intentionality’, or something like that.”

“Can I look at a pebble?” says Mark.

“Sure,” I say. I take one of the pebbles out of the bucket, and toss it to Mark. Then I reach to the ground, pick up another pebble, and drop it into the bucket.

Autrey looks at me, puzzled. “Didn’t you just mess it up?”

I shrug. “I don’t think so. We’ll know I messed it up if there’s a dead sheep next morning, or if we search for a few hours and don’t find any sheep.”

“But -” Autrey says.

“I taught you everything you know, but I haven’t taught you everything I know,” I say.

Mark is examining the pebble, staring at it intently. He holds his hand over the pebble and mutters a few words, then shakes his head. “I don’t sense any magical power,” he says. “Pardon me. I don’t sense any intentionality.”

“A pebble only has intentionality if it’s inside a ma- an emergent bucket,” says Autrey. “Otherwise it’s just a mere pebble.”

“Not a problem,” I say. I take a pebble out of the bucket, and toss it away. Then I walk over to where Mark stands, tap his hand holding a pebble, and say: “I declare this hand to be part of the magic bucket!” Then I resume my post at the gates.

Autrey laughs. “Now you’re just being gratuitously evil.”

I nod, for this is indeed the case.

“Is that really going to work, though?” says Autrey.

I nod again, hoping that I’m right. I’ve done this before with two buckets, and in principle, there should be no difference between Mark’s hand and a bucket. Even if Mark’s hand is imbued with the elan vital that distinguishes live matter from dead matter, the trick should work as well as if Mark were a marble statue.

(The moral: In this sequence, I explained how words come to 'mean' things in a lawful, causal, mathematical universe with no mystical subterritory. If you think meaning has a special power and special nature beyond that, then (a) it seems to me that there is nothing left to explain and hence no motivation for the theory, and (b) I should like you to say what this extra nature is, exactly, and how you know about it - your lips moving in this, our causal and lawful universe, the while.)

Comment author: Psy-Kosh 28 October 2012 07:03:49AM *  2 points [-]

How, precisely, does one formalize the concept of "the bucket of pebbles represents the number of sheep, but it is doing so inaccurately"? i.e., that it's a model of the number of sheep rather than of something else, but a bad/inaccurate model?

I've fiddled around a bit with that, and I find myself passing a recursive buck when I try to precisely reduce that one.

The best I can come up with is something like "I have correct models in my head for the bucket, pebbles, sheep, etc, individually except that I also have some causal paths linking them that don't match the links that exist in reality."
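One deliberately simplistic way to cash this out (a hypothetical measure, not a full reduction) is to separate the two claims: the bucket *represents* the sheep to the extent its count covaries with the true sheep count across observations, and it is *accurate* to the extent the two counts actually agree.

```python
# Toy accuracy measure: over a series of paired observations, how often
# does the pebble count match the true sheep count? A bucket that is
# "about the sheep, but inaccurately" tracks them most of the time
# without matching perfectly.
def accuracy(pebble_counts, sheep_counts):
    matches = sum(p == s for p, s in zip(pebble_counts, sheep_counts))
    return matches / len(sheep_counts)
```

This obviously passes a buck too (it presupposes you can already pick out which external quantity the counts are being compared against), which may be the same recursive buck described above.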

Comment author: Gabriel 24 October 2012 10:17:49PM *  1 point [-]

But you can argue for anything. You might refuse to do so but the possibility is always there.

The problem with being able to argue for anything is that people use that ability to rationalize their preferred conclusions. But if someone finds a conclusion offensive, then they have the opposite problem that they're unable to acknowledge valid arguments. I don't think practicing that would make people more prone to rationalization.

Well, maybe except that part:

Points deducted if an observer can tell the student doesn’t really agree with the position they’re defending.

Understanding someone doesn't have to involve pretending that you're them.

In response to comment by Gabriel on [Link] Offense 101
Comment author: Psy-Kosh 28 October 2012 02:48:05AM 1 point [-]

But you can argue for anything. You might refuse to do so but the possibility is always there.

Presumably one would want to define "strong argument" in such a way that strong arguments tend to be more available for true things than for false things.
