I previously posted an example of a satisficer (an agent seeking to achieve a certain level of expected utility u) transforming itself into a maximiser (an agent wanting to maximise expected u) to better achieve its satisficing goals.
But the real problem with satisficers isn't that they "want" to become maximisers; the real problem is that their behaviour is undefined. We conceive of them as agents that would do the minimum required to reach a certain goal, but we never specify what "minimum required" means.
For example, let A be a satisficing agent. It has a utility u that is quadratic in the number of paperclips it builds, except that after building 10^100, it gets a special extra exponential reward, until 10^1000, where the extra reward becomes logarithmic, and after 10^10000, it also gets utility in the number of human frowns divided by 3↑↑↑3 (unless someone gets tortured by dust specks for 50 years).
A's satisficing goal is a minimum expected utility of 0.5, and, in one minute, the agent can press a button to create a single paperclip.
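To make the setup concrete, here's a minimal sketch in Python (the names and the normalisation are my own illustrative assumptions, and the exotic reward terms are dropped since they only kick in past 10^100 paperclips, which is irrelevant for a one-minute agent):

```python
# Toy model of the setup above; only the small-scale quadratic regime is modelled.

SATISFICING_TARGET = 0.5

def u(paperclips: int) -> float:
    """Utility: quadratic in the number of paperclips built (small-scale regime)."""
    return float(paperclips ** 2)

def satisfies(expected_utility: float) -> bool:
    """A satisficer only requires expected utility of at least the target."""
    return expected_utility >= SATISFICING_TARGET

# Pressing the button reliably builds one paperclip: u(1) = 1 >= 0.5,
# so the satisficing goal is already met by a single press.
print(satisfies(u(1)))  # True
```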
So pressing the button is enough. In the coming minute, A could decide to transform itself into a u-maximiser (as that still ensures the button gets pressed). But it could also do a lot of other things. It could transform itself into a v-maximiser, for many different v's (generally speaking, given any v, either v or -v will result in the button being pressed). It could break out, send a subagent to transform the universe into cream cheese, and then press the button. It could rewrite itself into a dedicated button pressing agent. It could write a giant Harry Potter fanfic, force people on Reddit to come up with creative solutions for pressing the button, and then implement the best one.
All these actions are possible for a satisficer, and are completely compatible with its motivations. This is why satisficers are un(der)defined, and why any behaviour we want from them - such as "minimum required" impact - has to be put in deliberately.
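One way to see the underdetermination is to write both agents down side by side. In the hedged sketch below (continuing the toy model; the candidate policies and their payoffs are invented for illustration), the maximiser's choice is pinned down by an argmax, while the satisficer is only constrained to pick something from a large acceptable set; nothing in its definition says which element it picks.

```python
# A handful of invented policies, each with the expected utility it would
# secure for the agent (every one of them gets the button pressed).
policies = {
    "press the button and stop": 1.0,
    "self-modify into a u-maximiser": 10.0 ** 200,   # paperclips everywhere
    "cream-cheese subagent, then press the button": 1.0,
    "Harry Potter fanfic / Reddit scheme": 1.0,
}

SATISFICING_TARGET = 0.5

# A maximiser's behaviour is determined (up to ties) by argmax:
maximiser_choice = max(policies, key=policies.get)

# A satisficer's behaviour is only constrained to lie in the acceptable set:
acceptable = {name for name, eu in policies.items() if eu >= SATISFICING_TARGET}

print(maximiser_choice)   # the single highest-expected-utility policy
print(acceptable)         # every policy qualifies -- the definition picks none of them
```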
I've got some ideas for how to achieve this, which I'll be posting here.
This is a conception of maximizers that I generally like, and it holds if "cost of analysis" is part of the objective function, but it's important to note that this isn't the most generic class of maximizers, just a subset of that class. Note that any maximizer that comes up with a proof that it's found an optimal solution implicitly knows that the EV of continuing to analyze actions is lower than that of going ahead with that solution.
I think what you have in mind is more typically referred to as an "optimizer," like in "metaheuristic optimization." Tabu search isn't guaranteed to find you a globally optimal solution, but it'll get you a better solution than you started with faster than other approaches, and that's what people generally want. There's no use taking five years to produce an absolute best plan for assigning packages to trucks going out for delivery tomorrow morning.
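For readers who haven't met it, here is a minimal, assumption-laden sketch of the tabu-search pattern (greedy neighbourhood moves plus a short-term memory of recent solutions; the package-to-truck problem, the neighbourhood, and every parameter are placeholders I made up), just to show why its output is "a better solution than you started with" rather than a provably optimal one:

```python
import random

def tabu_search(init, neighbours, score, iters=1000, tabu_len=10):
    """Generic tabu search: take the best admissible local move each step,
    keeping a short memory of recent solutions to avoid cycling.  Returns
    the best solution seen -- an improvement, not a proven optimum."""
    current = best = init
    tabu = []  # recently visited solutions (short-term memory)
    for _ in range(iters):
        candidates = [n for n in neighbours(current) if n not in tabu]
        if not candidates:
            break
        current = max(candidates, key=score)   # best non-tabu neighbour
        tabu.append(current)
        if len(tabu) > tabu_len:
            tabu.pop(0)
        if score(current) > score(best):
            best = current
    return best

# Placeholder problem: assign 8 packages to 3 trucks, minimising load imbalance.
weights = [4, 8, 1, 7, 3, 6, 2, 5]

def score(assignment):            # higher is better: negative load spread
    loads = [0, 0, 0]
    for w, truck in zip(weights, assignment):
        loads[truck] += w
    return -(max(loads) - min(loads))

def neighbours(assignment):       # move one package to a different truck
    return [assignment[:i] + (t,) + assignment[i+1:]
            for i in range(len(assignment))
            for t in range(3) if t != assignment[i]]

start = tuple(random.randrange(3) for _ in weights)
print(score(start), score(tabu_search(start, neighbours, score)))
```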
But the distinction that Stuart_Armstrong cares about holds: maximizers (as I defined them, without taking analysis costs into consideration) seem easy to analyze, while optimizers seem hard to analyze. I can figure out the properties that an absolute best solution has, and there's a fairly small set of those, but I might have a much harder time figuring out the properties of a solution returned by running tabu search overnight. That might just be a perspective thing, though: I can actually run tabu search overnight a bunch of times, but I might not be able to figure out the set of absolute best solutions at all.
My intuition is telling me that resource costs are relevant to an agent whether or not they appear as a term in the objective function. Omohundro's instrumental goal of efficiency...?