Comment author: Sebastian_Hagen2 04 September 2008 07:35:14PM 1 point

Do you really truly think that the rational thing for both parties to do, is steadily defect against each other for the next 100 rounds?

No. That seems obviously wrong, even if I can't figure out where the error lies.
We only get a reversion to the (D,D) case if we know with a high degree of confidence that the other party doesn't use naive Tit for Tat, and they know that we don't. That seems like an iffy assumption to me. If we knew the *exact* algorithm the other side uses, it would be trivial to find a winning strategy; so how do we know it isn't naive Tit for Tat? If there's a sufficiently high chance the other side is using naive Tit for Tat, it might well be optimal to repeat their choices until the second-to-last round.
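
To make the last point concrete, here is a minimal Python sketch that compares "repeat their choices, then defect in the final round" against steady defection over 100 rounds. The payoff numbers (5, 3, 1, 0), the function names, and the simplification that the other side is either naive Tit for Tat (with probability p_tft) or an unconditional defector are all assumptions made for illustration; none of them come from the original post.

```python
# Payoff to "us" for (our_move, their_move). These values are assumed,
# chosen only to satisfy the usual Prisoner's Dilemma ordering.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ROUNDS = 100


def tit_for_tat(own_history, other_history, rounds):
    # Naive Tit for Tat: cooperate first, then copy the other side's last move.
    return "C" if not other_history else other_history[-1]


def always_defect(own_history, other_history, rounds):
    return "D"


def mirror_then_defect(own_history, other_history, rounds):
    # Repeat the other side's choices, but defect in the final round.
    if len(own_history) == rounds - 1:
        return "D"
    return "C" if not other_history else other_history[-1]


def play(us, them, rounds=ROUNDS):
    # Total payoff to `us` over a fixed number of rounds.
    ours, theirs, total = [], [], 0
    for _ in range(rounds):
        a = us(ours, theirs, rounds)
        b = them(theirs, ours, rounds)
        ours.append(a)
        theirs.append(b)
        total += PAYOFF[(a, b)]
    return total


def expected_score(us, p_tft, rounds=ROUNDS):
    # Opponent is Tit for Tat with probability p_tft, else always defects.
    return (p_tft * play(us, tit_for_tat, rounds)
            + (1 - p_tft) * play(us, always_defect, rounds))


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.5):
        print(p,
              expected_score(mirror_then_defect, p),
              expected_score(always_defect, p))
```

Under these assumed payoffs, mirroring comes out ahead of steady defection once p_tft exceeds 1/199, roughly 0.5%. The exact threshold depends entirely on the payoff values and on the two-opponent simplification.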

Comment author: Sebastian_Hagen2 04 September 2008 12:34:52AM 6 points

Definitely defect. Cooperation only makes sense in the iterated version of the PD. This isn't the iterated case, and there's no prior communication, hence no chance to negotiate for mutual cooperation (though even if there were, meaningful negotiation may well be impossible depending on specific details of the situation). Superrationality be damned, humanity's choice doesn't have any causal influence on the paperclip maximizer's choice. Defection is the right move.

In response to Unnatural Categories
Comment author: Sebastian_Hagen2 24 August 2008 11:35:28AM 0 points

Nitpicking your poison category:

What is a poison? ... Carrots, water, and oxygen are "not poison". ... (... You're really asking about fatality from metabolic disruption, after administering doses small enough to avoid mechanical damage and blockage, at room temperature, at low velocity.)

If I understand that last definition correctly, it should classify water as a poison.

Comment author: Sebastian_Hagen2 18 August 2008 03:40:40PM 3 points

Doug S.:

What character is ◻?

That's U+25FB ('WHITE MEDIUM SQUARE').

Eliezer Yudkowsky:

Larry, interpret the smiley face as saying:

PA + (◻C -> C) |-

I'm still struggling to completely understand this. Are you also changing the meaning of ◻ from 'derivable from PA' to 'derivable from PA + (◻C -> C)'? If so, are you additionally changing L to use provability in PA + (◻C -> C) instead of provability in PA?

Comment author: Sebastian_Hagen2 11 August 2008 03:06:36PM 0 points

Quick correction: s/abstract rational reasoning/abstract moral reasoning/

Comment author: Sebastian_Hagen2 11 August 2008 03:04:03PM 0 points

Jadagul:

But my moral code does include such statements as "you have no fundamental obligation to help other people." I help people because I like to.

While I consider myself an altruist in principle (I have serious akrasia problems in practice), I do agree with this statement. Altruists don't have any obligation to help people; it just often makes sense for them to do so. Sometimes it doesn't, and then the proper thing for them is not to do it.

Roko:

In the modern world, people have to make moral choices using their general intelligence, because there aren't enough "yuck" and "yum" factors around to give guidance on every question. As such, we shouldn't expect much more moral agreement from humans than from rational (or approximately rational) AIs.

There might not be enough "yuck" and "yum" factors around to offer direct guidance on every question, but they're still the basis for abstract rational reasoning. Do you think "paperclip optimizer"-type AIs are impossible? If so, why? There's nothing incoherent about a "maximize the number of paperclips over time" optimization criterion; if anything, it's a lot simpler than those in use by humans.

Eliezer Yudkowsky:

If I have a value judgment that would not be interpersonally compelling to a supermajority of humankind even if they were fully informed, then it is proper for me to personally fight for and advocate that value judgment, but not proper for me to preemptively build an AI that enforces that value judgment upon the rest of humanity.

I don't understand this at all. How is building a superintelligent AI not just a (highly effective, if you do it right) special method of personally fighting for your value judgement? Are you saying it's ok to fight for it, as long as you don't do it too effectively?

Comment author: Sebastian_Hagen2 11 August 2008 02:24:59AM 0 points

I think my highest goal in life is to make myself happy. Because I'm not a sociopath, making myself happy tends to involve having friends and making them happy. But the ultimate goal is me.

If you had a chance to take a pill which would cause you to stop caring about your friends by permanently maxing out that part of your happiness function regardless of whether you had any friends, would you take it?
Do non-psychopaths who, given the chance, would self-modify into psychopaths fall into the same moral reference frame as stable psychopaths?

In response to The Meaning of Right
Comment author: Sebastian_Hagen2 30 July 2008 08:17:00PM 0 points

After all, if the humans have something worth treating as spoils, then the humans are productive and so might be even more useful alive.

Humans depend on matter to survive, and increase entropy by doing so. Matter can be used for storage and computronium, negentropy for fueling computation. Both are limited and valuable resources (assuming physics doesn't allow for infinite-resource cheats).

I read stuff like this and immediately my mind thinks, "comparative advantage." The point is that it can be (and probably is) worthwhile for Bob and Bill to trade with each other even if Bob is better at absolutely everything than Bill.

Comparative advantage doesn't matter for powerful AIs at massively different power levels. It exists between some groups of humans because humans don't differ in intelligence all that much when you consider all of mind design space, and because humans don't have the means to easily build subservient-to-them minds which are equal in power to them.
What about a situation where Bob can defeat Bill very quickly, take all its resources, and use them to implement a totally-subservient-to-Bob mind which is by itself better at everything Bob cares about than Bill was? Resolving the conflict takes some resources, but leaving Bill to use them a) inefficiently and b) for not-exactly-Bob's goals might waste (from Bob's perspective) even more of them in the long run. Also, eliminating Bill means Bob has to worry about one less potential threat that it would otherwise need to keep in check indefinitely.

The FAI may be an unsolvable problem, if by FAI we mean an AI into which certain limits are baked.

You don't want to build an AI with certain goals and then add on hard-coded rules that prevent it from fulfilling those goals with maximum efficiency. If you pit your own mind against that of the AI, a sufficiently powerful AI will always win that contest. The basic idea behind FAI is to build an AI that genuinely wants good things to happen; you can't control it after it takes off, so you put your conception of "good" (or an algorithm to compute it) into the original design, and define the AI's terminal values based on that. Doing this right is an extremely tough technical problem, but why do you believe it may be impossible?

In response to The Meaning of Right
Comment author: Sebastian_Hagen2 30 July 2008 05:05:00PM 0 points

Constant [sorry for getting the attribution wrong in my previous reply] wrote:

We do not know very well how the human mind does anything at all. But that the human mind comes to have preferences that it did not have initially, cannot be doubted.

I do not know whether those changes in opinion indicate changes in terminal values, but it doesn't really matter for the purposes of this discussion, since humans aren't (capital-F) Friendly. You definitely don't want an FAI to unpredictably change its terminal values. Figuring out how to reliably prevent this kind of thing from happening, even in a strongly self-modifying mind (which humans aren't), is one of the sub-problems of the FAI problem.
Creating a society of AIs, in the hope that they'll prevent each other from doing too much damage, isn't a viable solution to the FAI problem, even in the rudimentary "doesn't kill all humans" sense. There are various problems with the idea, among them:
1. Any two AIs are likely to have a much vaster difference in effective intelligence than you could ever find between two humans (for one thing, their hardware might be much more different than any two working human brains). This likelihood increases further if (at least) some subset of them is capable of strong self-improvement. With enough difference in power, cooperation becomes a losing strategy for the more powerful party.
2. The AIs might agree that they'd all be better off if they took the matter currently in use by humans for themselves, dividing the spoils among each other.

In response to The Meaning of Right
Comment author: Sebastian_Hagen2 30 July 2008 12:32:00PM 0 points

TGGP wrote:

We've been told that a General AI will have power beyond any despot known to history.

Unknown replied:

If that will be then we are doomed. Power corrupts. In theory an AI, not being human, might resist the corruption, but I wouldn't bet on that. I do not think it is a mere peculiarity of humanity that we are vulnerable to corruption.

A tendency to become corrupt when placed into positions of power is a feature of some minds. Evolutionary psychology explains nicely why humans have evolved this tendency. It also allows you to predict that other intelligent organisms, evolved in a sufficiently similar way, would be likely to have a similar feature.
That humans have this kind of tendency is a predictable result of what their design was optimized to do, and as such it doesn't imply much for minds from a completely different part of mind design space.
What makes you think a human-designed AI would be vulnerable to this kind of corruption?
