I realize this is old (which is why I'm replying to a comment to draw attention), but still, the entire post seems to be predicated on a poor specification of the utility function. Remember, the utility function by definition includes/defines the full preference ordering over outcomes, and must therefore include the idea of acting "satisfied" inside it.
Here, instead, you seem to define a "fake" utility function of U = E(number of paperclips) and then say that the AI will be satisfied at a certain number of paperclips, even though it clearly won't be because that's not part of the utility function. That is, something with this purported utility function is already a pure maximiser, not a satisficer at all. Instead, the utility function you're constructing should be something like U = {9 if E(paperclips) >= 9, E(paperclips) otherwise}, in which case in the above example the satisficer really wouldn't care if it ended up with 9 or 10 paperclips and would remain a satisficer. The notion that a satisficer wants to become a maximiser arises only because you made the "satisficer's" utility function identical to a maximiser's to begin with.
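To make the contrast concrete, here is a minimal sketch of the two utility functions described above. The function names and the default threshold of 9 are illustrative assumptions taken from the paperclip example, not anything defined in the post.

```python
def linear_utility(expected_paperclips: float) -> float:
    """U = E(number of paperclips): strictly increasing, so more is always better."""
    return expected_paperclips


def capped_utility(expected_paperclips: float, threshold: float = 9.0) -> float:
    """U = 9 if E(paperclips) >= 9, E(paperclips) otherwise.

    This is the same as min(E(paperclips), threshold).
    """
    return min(expected_paperclips, threshold)


# Compare the two functions below, at, and above the threshold.
for e in (5.0, 9.0, 10.0):
    print(e, linear_utility(e), capped_utility(e))
```

Under the capped form, 9 and 10 expected paperclips score the same, so an agent with this utility gains nothing by pushing past the threshold. Under the linear form, 10 always beats 9.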
(There may be other issues with satisficers, but I don't think this is one of them. Also, sorry if that came across as confrontational - I just wanted to make my objection as clear as possible.)
Satisficing is a term for a specific type of decision making - quoting wikipedia: "a decision-making strategy that attempts to meet an acceptability threshold. This is contrasted with optimal decision-making, an approach that specifically attempts to find the best option available."
So by definition a satisficer is an agent that is content with a certain outcome, even though they might prefer a better one. Do you think my model - utility denoting the ideal preferences, and satisficing being content with a certain threshold - is a poor model of this type of agent?
(with thanks to Daniel Dewey, Owain Evans, Nick Bostrom, Toby Ord and BruceyB)
In theory, a satisficing agent has a lot to recommend it. Unlike a maximiser, which will attempt to squeeze every drop of utility it can out of the universe, a satisficer will be content when it reaches a certain level of expected utility (a satisficer that is content with a certain level of utility is simply a maximiser with a bounded utility function). For instance, a satisficer with a utility linear in paperclips and a target level of 9 will be content once it's 90% sure that it has built ten paperclips, and will not try to optimize the universe either to build more paperclips (unbounded utility) or to obsessively count the ones it already has (bounded utility).
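The arithmetic behind the paperclip example can be sketched as follows. One assumption is added for illustration: the remaining 10% of probability mass yields zero paperclips, which the example does not state. The names `TARGET`, `lottery` and `expected_utility` are likewise hypothetical.

```python
from fractions import Fraction

# Target level of expected utility from the example.
TARGET = Fraction(9)


def expected_utility(lottery):
    """Expected utility of a list of (probability, paperclips) pairs,
    with utility linear in paperclips."""
    return sum(p * u for p, u in lottery)


# 90% sure of ten paperclips; ASSUMPTION: the other 10% yields none.
lottery = [(Fraction(9, 10), 10), (Fraction(1, 10), 0)]

eu = expected_utility(lottery)
is_content = eu >= TARGET
print(eu, is_content)
```

Exact fractions are used so that the comparison against the target does not depend on floating-point rounding.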
Unfortunately, a self-improving satisficer has an extremely easy way to reach its satisficing goal: to transform itself into a maximiser. This is because, in general, if E denotes expectation,
E(U(there exists an agent A maximising U)) ≥ E(U(there exists an agent A satisficing U))
Why is this true (apart from the special case where other agents penalise you specifically for being a maximiser)? Agent A will have to make decisions, and if it is a maximiser, it will always make the decision that maximises expected utility. If it is a satisficer, it will sometimes make a different decision, leading to lower expected utility in that case.
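As a rough illustration of this inequality, consider a single decision among options with known expected utilities. The satisficing rule used here (accept the first option that meets the threshold, and fall back to the best option if none does) is only one possible model of a satisficer and is an assumption of this sketch, as are the function names and the numbers.

```python
from itertools import permutations
from typing import Sequence


def maximiser_pick(expected_utilities: Sequence[float]) -> float:
    """Always take the option with the highest expected utility."""
    return max(expected_utilities)


def satisficer_pick(expected_utilities: Sequence[float], threshold: float) -> float:
    """ASSUMED rule: take the first option that meets the threshold;
    if none does, fall back to the best available option."""
    for eu in expected_utilities:
        if eu >= threshold:
            return eu
    return max(expected_utilities)


# Hypothetical expected utilities for three available actions.
options = [3.0, 9.5, 12.0]
threshold = 9.0

# Whatever order the options are considered in, the maximiser never does worse.
for ordering in permutations(options):
    assert maximiser_pick(ordering) >= satisficer_pick(ordering, threshold)

print(maximiser_pick(options), satisficer_pick(options, threshold))
```

With these numbers the satisficer stops at the first acceptable option while the maximiser takes the best one, so the maximiser's expected utility is at least as high in every ordering.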
Hence, if there were a satisficing agent for U, and it had some strategy S to accomplish its goal, then another way to accomplish this would be to transform itself into a maximising agent and let that agent implement S. If S is complicated, and transforming itself is simple (which would be the case for a self-improving agent), then self-transforming into a maximiser is the easier way to go.
So unless we have exceedingly well programmed criteria banning the satisficer from using any variant of this technique, we should assume satisficers are likely to be as dangerous as maximisers.
Edited to clarify the argument for why a maximiser maximises better than a satisficer.
Edit: See BruceyB's comment for an example where a (non-timeless) satisficer would find rewriting itself as a maximiser to be the only good strategy. Hence timeless satisficers would behave as maximisers anyway (in many situations). Furthermore, a timeless satisficer with bounded rationality may find that rewriting itself as a maximiser is a useful precaution to take, if it is not sure that it can precalculate all the correct strategies.