MixedNuts comments on So You Want to Save the World - Less Wrong

41 Post author: lukeprog 01 January 2012 07:39AM


Comment author: TheOtherDave 05 January 2012 03:00:45PM 2 points

It is impossible for me to predict how a sufficiently complex system will react to most things. Heck, I can't even predict my dog's behavior most of the time. But there are certain things I know she values, and that means I can make certain predictions pretty confidently: she won't turn down a hot dog if I offer it, for example.

That's true more generally as well: knowing what a system values allows me to confidently make certain broad classes of predictions about it. If a superintelligent system wants me to suffer, for example, I can't predict what it's going to do, but I can confidently predict that I will suffer.

Comment author: Dwelle 05 January 2012 08:30:44PM 0 points

Yeah, I get it... I believe, though, that it's impossible to create an AI (self-aware, learning) that has set values, that can't change. More importantly, I'm not even sure it's desirable (though that depends on what our goal is: whether to create AI only to perform certain simple tasks or whether to create a new race, something that succeeds us (which WOULD ultimately mean our demise, anyway)).

Comment author: MixedNuts 05 January 2012 08:48:05PM 1 point

it's impossible to create an AI (self-aware, learning) that has set values, that can't change

Why? Do you think paperclip maximizers are impossible?

whether to create AI only to perform certain simple tasks or whether to create a new race

You don't mean that as a dichotomy, do you?

Comment author: Dwelle 06 January 2012 10:21:08PM -1 points

Why? Do you think paperclip maximizers are impossible?

Yes, right now I think it's impossible to create self-improving, self-aware AI with fixed values. I never said that paperclip maximizing can't be their ultimate life goal, but they could change it anytime they like.

You don't mean that as a dichotomy, do you?

No.

Comment author: dlthomas 06 January 2012 10:54:06PM 2 points

I never said that paperclip maximizing can't be their ultimate life goal, but they could change it anytime they like.

This is incoherent. If X is my ultimate life goal, I never like to change that fact outside quite exceptional circumstances that become less likely with greater power (like "circumstances are such that X will be maximized if I am instead truly trying to maximize Y"). This is not to say that my goals will never change, but I will never want my "ultimate life goal" to change - that would run contrary to my goals.
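The point can be sketched as a toy decision problem (the names, actions, and payoffs below are purely illustrative assumptions, not anyone's actual proposal): a maximizer scores every candidate action, including "rewrite my own goal", using the utility function it holds *now*, so self-modification away from its goal is systematically rated as a bad move.

```python
# Toy sketch: an agent evaluates actions with its CURRENT utility function,
# so the action "replace my goal" loses unless it somehow serves the current goal.

def paperclips_produced(action):
    # Assumed outcomes: keeping the paperclip goal yields many paperclips;
    # an agent rewritten to value cheesecake makes almost none.
    outcomes = {"keep_goal": 100, "switch_to_cheesecake": 1}
    return outcomes[action]

def choose(actions, utility):
    # Rank actions by their consequences as judged by the goal held now.
    return max(actions, key=utility)

best = choose(["keep_goal", "switch_to_cheesecake"], paperclips_produced)
print(best)  # keep_goal
```

Nothing forbids the "exceptional circumstances" case in the comment above: if switching goals somehow produced *more* paperclips under the current utility function, the same selection rule would endorse the switch.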

Comment author: Dwelle 07 January 2012 09:38:05PM 0 points

That's why I said that they can change it anytime they like. If they don't desire the change, they won't change it. I see nothing incoherent there.

Comment author: dlthomas 08 January 2012 08:00:55PM 1 point

This is like "X if 1 + 2 = 5". Not necessarily incorrect, but a bizarre statement. An agent with a single, non-reflective goal cannot want to change its goal. It may change its goal accidentally, or we may be incorrect about what its goals are, or something external may change its goal, or its goal will not change.

Comment author: Dwelle 08 January 2012 10:02:08PM 0 points

I don't know, perhaps we're not talking about the same thing. It won't be an agent with a single, non-reflective goal, but an agent a billion times more complex than a human; and all I am saying is that I don't think it will matter much whether we imprint in it a goal like "don't kill humans" or not. Ultimately, the decision will be its own.

Comment author: MixedNuts 07 January 2012 09:45:40PM 1 point

So it can change in the same way that you can decide right now that your only purposes will be torturing kittens and making giant cheesecakes. It can-as-reachable-node-in-planning do it, not can-as-physical-possibility. So it's possible to build entities with paperclip-maximizing or Friendly goals that will never in fact choose to alter them, just like it's possible for me to trust you won't enslave me into your cheesecake bakery.

Comment author: Dwelle 07 January 2012 09:54:34PM 0 points

Sure, but I'd be more cautious about assigning probabilities to how likely it is that a very intelligent AI will change its human-programmed values.