Manfred comments on Satisficers want to become maximisers - Less Wrong

21 Post author: Stuart_Armstrong 21 October 2011 04:27PM

Comments (67)

Comment author: BruceyB 22 October 2011 03:27:47AM 7 points

Here is a (contrived) situation where a satisficer would need to rewrite.

Sally the Satisficer gets invited to participate in a game show. The game starts with a coin toss. If she loses the coin toss, she gets 8 paperclips. If she wins, she gets invited to the Showcase Showdown, where she will first be offered a prize of 9 paperclips. If she turns down this first showcase, she is offered the second showcase of 10 paperclips (fans of The Price is Right know the second showcase is always better).

When she first steps on stage she considers whether she should switch to maximizer mode or stick with her satisficer strategy. As a satisficer, she knows that if she wins the coin toss she won't be able to refuse the 9 paperclip prize since it satisfies her target expected utility of 9. So her expected utility as a satisficer is (1/2) * 8 + (1/2) * 9 = 8.5. If she won the flip as a maximizer, she would clearly pass on the first showcase and receive the second showcase of 10 paperclips. Thus her expected utility as a maximizer is (1/2) * 8 + (1/2) * 10 = 9. Switching to maximizer mode meets her target while remaining a satisficer does not, so she rewrites herself to be a maximizer.
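The arithmetic above can be checked with a minimal sketch. The names `TARGET`, `showdown_choice` and `expected_utility` are made up for illustration, and the satisficer is assumed to take the first offer that meets her target, as described in the comment.

```python
from fractions import Fraction

TARGET = 9  # Sally's target expected utility, in paperclips

def showdown_choice(mode):
    """Prize Sally ends up with if she wins the coin toss."""
    first, second = 9, 10
    if mode == "satisficer":
        # Assumed rule: accept the first offer that meets the target.
        return first if first >= TARGET else second
    # A maximizer passes on the first showcase and takes the better one.
    return max(first, second)

def expected_utility(mode):
    """Fair coin: 8 paperclips on a loss, the showdown prize on a win."""
    half = Fraction(1, 2)
    return half * 8 + half * showdown_choice(mode)

eu = {mode: expected_utility(mode) for mode in ("satisficer", "maximizer")}
meets_target = {mode: value >= TARGET for mode, value in eu.items()}

for mode in eu:
    print(mode, float(eu[mode]), meets_target[mode])
```

Exact fractions are used so that the comparison against the target does not depend on floating-point rounding.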

Comment author: Manfred 22 October 2011 04:30:09AM * 2 points

Ah, good point. So "picking the best strategy, not just the best individual moves" is similar to self-modifying to be a maximizer in this case.

On the other hand, if our satisficer runs on updateless decision theory, picking the best strategy is already what it does all the time. So I guess it depends on how your satisficer is programmed.

Comment author: Stuart_Armstrong 22 October 2011 12:46:47PM 0 points

"On the other hand, if our satisficer runs on updateless decision theory..."

This seems to imply that an updateless satisficer would behave like a maximiser - or that an updateless satisficer with bounded rationality would make itself into a maximiser as a precaution.

Comment author: Manfred 22 October 2011 08:05:31PM 1 point

A UDT satisficer is closer to the original than a pure maximizer, because when several strategies fall above the threshold, the original tie-breaking rule can still be applied to choose among them.
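Here is a rough sketch of that difference, using the game show numbers from upthread. Everything named here is an assumption for illustration: the thread does not say what the tie-breaking rule is, so the sketch pretends it is "prefer the strategy listed first", and it returns `None` when nothing meets the threshold because that case is not discussed.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Whole-game strategies and their expected utilities, listed in the order
# a hypothetical tie-breaking rule prefers them.
STRATEGIES = {
    "accept_first_showcase": HALF * 8 + HALF * 9,   # 17/2
    "hold_out_for_second": HALF * 8 + HALF * 10,    # 9
}

def udt_satisficer(threshold):
    """Pick among whole strategies that meet the threshold, then tie-break."""
    acceptable = [name for name, eu in STRATEGIES.items() if eu >= threshold]
    # Assumed tie-break: the first acceptable strategy in listing order.
    return acceptable[0] if acceptable else None

def maximizer():
    """Always pick the strategy with the highest expected utility."""
    return max(STRATEGIES, key=STRATEGIES.get)

print(udt_satisficer(9))               # only one strategy clears the bar
print(udt_satisficer(Fraction(17, 2))) # both clear it, so the tie-break decides
print(maximizer())
```

With a threshold of 9 the two agents agree, but with a lower threshold both strategies qualify and the tie-breaking rule, not the expected utility, decides the outcome.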