I previously posted an example of a satisficer (an agent seeking to achieve a certain level of expected utility u) transforming itself into a maximiser (an agent wanting to maximise expected u) to better achieve its satisficing goals.
But the real problem with satisficers isn't that they "want" to become maximisers; the real problem is that their behaviour is undefined. We conceive of them as agents that would do the minimum required to reach a certain goal, but we don't specify "minimum required".
For example, let A be a satisficing agent. It has a utility u that is quadratic in the number of paperclips it builds, except that after building 10^100, it gets a special extra exponential reward, until 10^1000, where the extra reward becomes logarithmic, and after 10^10000, it also gets utility in the number of human frowns divided by 3↑↑↑3 (unless someone gets tortured by dust specks for 50 years).
A's satisficing goal is a minimum expected utility of 0.5, and, in one minute, the agent can press a button to create a single paperclip.
So pressing the button is enough. In the coming minute, A could decide to transform itself into a u-maximiser (as that still ensures the button gets pressed). But it could also do a lot of other things. It could transform itself into a v-maximiser, for many different v's (generally speaking, given any v, either v or -v will result in the button being pressed). It could break out, send a subagent to transform the universe into cream cheese, and then press the button. It could rewrite itself into a dedicated button pressing agent. It could write a giant Harry Potter fanfic, force people on Reddit to come up with creative solutions for pressing the button, and then implement the best.
All these actions are possible for a satisficer, and are completely compatible with its motivations. This is why satisficers are un(der)defined, and why any behaviour we want from them - such as "minimum required" impact - has to be put in deliberately.
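To make the underdetermination concrete, here is a minimal Python sketch. Only the 0.5 threshold and the button-press option come from the example above; the policy names are paraphrased, and every expected-utility number is invented for illustration (assuming a single paperclip is worth u = 1 under the quadratic term).

```python
# Hypothetical policies with made-up expected utilities (illustrative assumptions only).
THRESHOLD = 0.5  # A's satisficing level, from the example

policies = {
    "do nothing": 0.0,
    "press the button": 1.0,
    "become a u-maximiser": 1e6,
    "cream cheese, then press the button": 1.0,
    "become a dedicated button presser": 1.0,
}

def satisficing_choices(policies, threshold):
    """Return every policy whose expected utility meets the threshold."""
    return sorted(name for name, eu in policies.items() if eu >= threshold)

def maximising_choice(policies):
    """Return the single policy with the highest expected utility."""
    return max(policies, key=policies.get)

acceptable = satisficing_choices(policies, THRESHOLD)
best = maximising_choice(policies)

print("satisficer may pick any of:", acceptable)
print("maximiser picks:", best)
```

The maximiser's choice is pinned down by the numbers, while the satisficing criterion returns a whole set and says nothing about which member to pick. Any preference for the low-impact member has to come from somewhere else.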
I've got some ideas for how to achieve this, which I'm posting here.
Any decision agent makes choices, whether it is Clippy, a maximizer, or a satisficer. All of them have utility functions that assign increasing utility to things they like. Generally, both maximizers and satisficers have declining marginal utility for things they like, but increasing absolute utility for them: U(n things) < U(n+1 things), but U(thing #n) > U(thing #n+1).
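A minimal sketch of those two properties, assuming U(n) = sqrt(n) purely as an illustrative concave utility (the comment does not name a specific function): total utility rises with every extra item, while each extra item adds less than the one before.

```python
import math

def U(n):
    """Total utility of holding n items (illustrative choice: square root)."""
    return math.sqrt(n)

def marginal(n):
    """Utility added by the n-th item alone."""
    return U(n) - U(n - 1)

totals = [U(n) for n in range(1, 6)]
marginals = [marginal(n) for n in range(1, 6)]

# Increasing absolute utility: U(n) < U(n+1)
increasing_total = all(a < b for a, b in zip(totals, totals[1:]))
# Declining marginal utility: the n-th item is worth more than the (n+1)-th
declining_marginal = all(a > b for a, b in zip(marginals, marginals[1:]))

print(increasing_total, declining_marginal)
```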
Agents have competing desires (more than one thing in their utility function). So choices they make have to weigh different things. Do I want N of x and M+1 of y, or do I want N+1 of x and M of y? This is where it gets interesting: a satisficer generally values minimizing time and hassle more than getting more of a thing than really necessary. An optimizer values minimizing time and hassle, but less so compared to getting more desirable future states.
Clippy doesn't have multiple things to balance against each other, so it doesn't matter whether its utility function has declining marginal utility, nor to what degree it declines. It has increasing absolute utility, and there's nothing else to optimize, so more clips is always better. This is a separate question from satisficers vs maximizers.
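As a rough sketch of the Clippy point, assume two hypothetical utility curves over paperclip counts, one linear and one logarithmic (neither comes from the thread), and a made-up list of achievable outcomes. With a single good and nothing to trade off, any strictly increasing utility selects the same option.

```python
import math

# Hypothetical achievable paperclip counts (illustrative only).
options = [0, 1, 10, 1000]

def linear_utility(n):
    """Constant marginal utility."""
    return float(n)

def log_utility(n):
    """Sharply declining marginal utility."""
    return math.log1p(n)

choice_linear = max(options, key=linear_utility)
choice_concave = max(options, key=log_utility)

print(choice_linear, choice_concave)
```

The shape of the curve would only start to matter once a second good competes for the same resources.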
Okay, thank you. I was focusing on the pathological case.