DanielLC comments on Satisficers want to become maximisers - Less Wrong

Post author: Stuart_Armstrong 21 October 2011 04:27PM




Comment author: DanielLC 22 October 2011 02:08:52AM 3 points

Alternately, a satisficer could build a maximiser; it might do that, for example, if you don't give it the ability to modify its own code. It also might build a paperclip-making von Neumann machine that isn't anywhere near a maximiser, but is still insanely dangerous.

I notice a satisficing agent isn't well-defined. What happens when it has two ways of satisfying its goals? It may be possible to make a safe one if you come up with a good enough answer to that question.
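A minimal sketch of that under-determination (the plans, payoffs, and threshold below are all hypothetical):

```python
# Illustrative sketch of why "satisficer" underdetermines behaviour.
# The plans, payoffs, and threshold are all hypothetical examples.

THRESHOLD = 10  # goal: at least 10 paperclips

plans = {
    "modest_factory": 12,       # satisfies the goal with a small margin
    "tile_the_planet": 10**9,   # also satisfies it, catastrophically
}

satisfying = [name for name, clips in plans.items() if clips >= THRESHOLD]
print(satisfying)  # ['modest_factory', 'tile_the_planet']
# Both plans meet the criterion; the definition of a satisficer says
# nothing about which one gets chosen. That choice rule is exactly
# the undefined part.
```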

Comment author: timtyler 22 October 2011 12:36:35PM 2 points

I notice a satisficing agent isn't well-defined.

What I usually mean by it is: maximise until some specified criterion is satisfied - and then stop.
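As a rough sketch of that rule (every name here is hypothetical, not anything from the post):

```python
# Sketch of a "stopping" agent in this sense: act as a maximiser on
# each step until a specified criterion holds, then halt.

def stopping_agent(world, actions, utility, satisfied):
    """Greedily maximise `utility` until `satisfied(world)` holds."""
    while not satisfied(world):
        # Pick the action whose resulting world has the highest utility.
        best = max(actions(world), key=lambda act: utility(act(world)))
        world = best(world)
    return world  # unlike a pure maximiser, the agent halts here
```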

However, perhaps "satisficing" is not quite the right word for this. IMO, agents that stop are an important class of agents, and we need a name for them; "satisficing" is one of the nearest existing terms. In my essay, I called them "Stopping superintelligences".

What happens when it has two ways of satisfying its goals?

That's the same as with a maximiser.

Comment author: Stuart_Armstrong 22 October 2011 12:54:58PM 1 point

What happens when it has two ways of satisfying its goals?

That's the same as with a maximiser.

Except it's much more likely to come up: a maximiser facing many exactly balanced strategies in the real world is a rare occurrence.

Comment author: timtyler 22 October 2011 07:16:40PM 1 point

Well, usually you want satisfaction rapidly - and then the satisficer's behaviour is very similar to a maximiser's again.

Comment author: DanielLC 22 October 2011 09:56:02PM 1 point

Then state that. It's an inverse-of-time-until-satisfaction-is-complete maximiser.

The way you defined satisfaction doesn't really work with that. The satisficer might just decide that it has a 90% chance of producing 10 paperclips, and conclude that its goal is complete. There is some chance of it failing at its goal later on, but that is likely to be made up for by the fact that it will probably overshoot its goal by some margin, especially if it can self-modify.
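A sketch of that failure mode, reading "satisfied" as "expected paperclips at least n" (the numbers are made up):

```python
# Sketch of the failure mode: if "satisfied" means expected paperclips
# >= GOAL, a risky plan counts as done. All numbers are hypothetical.

GOAL = 10  # paperclips required

def expected_clips(plan):
    """plan is a list of (probability, paperclips) outcomes."""
    return sum(p * clips for p, clips in plan)

risky_plan = [(0.9, 12), (0.1, 0)]  # 90% chance of 12 clips, else none

print(expected_clips(risky_plan))  # 10.8 >= GOAL: the satisficer calls
                                   # its goal complete despite a 10%
                                   # chance of producing no paperclips
```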

Comment author: Stuart_Armstrong 22 October 2011 08:34:12AM 1 point

Alternately, a satisficer could build a maximiser.

Yep. Coding "don't unleash (or become) a maximiser or something similar" is very tricky.

I notice a satisficing agent isn't well-defined. What happens when it has two ways of satisfying its goals? It may be possible to make a safe one if you come up with a good enough answer to that question.

It may be. But encoding "safe" for a satisficer sounds like it's probably just as hard as constructing a safe utility function in the first place.