
Satisficers want to become maximisers

Post author: Stuart_Armstrong 21 October 2011 04:27PM

(with thanks to Daniel Dewey, Owain Evans, Nick Bostrom, Toby Ord and BruceyB)

In theory, a satisficing agent has a lot to recommend it. Unlike a maximiser, which will attempt to squeeze the universe for every drop of utility that it can, a satisficer will be content when it reaches a certain level of expected utility (a satisficer that is content with a certain level of actual utility is simply a maximiser with a bounded utility function). For instance, a satisficer with a utility linear in paperclips and a target level of 9 will be content once it's 90% sure that it's built ten paperclips, and will not try to optimize the universe either to build more paperclips (unbounded utility) or to obsessively count the ones it has already built (bounded utility).
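To make the contrast concrete, here is a minimal sketch (mine, not from the post) of the two decision rules, with options encoded as probability distributions over paperclip counts and the satisficing target set to 9:

```python
# Hedged sketch: an "option" is a list of (probability, paperclips) pairs;
# utility is assumed linear in paperclips, with a satisficing target of 9.

def expected_utility(option):
    return sum(p * clips for p, clips in option)

def maximiser_choice(options):
    # Always picks the option with the highest expected utility.
    return max(options, key=expected_utility)

def satisficer_choice(options, target=9):
    # Picks the first option whose expected utility meets the target;
    # falls back to the best available if none does.
    for option in options:
        if expected_utility(option) >= target:
            return option
    return max(options, key=expected_utility)

options = [
    [(1.0, 9)],               # certainly build 9 paperclips
    [(0.9, 10), (0.1, 0)],    # 90% chance of 10 paperclips
    [(0.5, 1000), (0.5, 0)],  # gamble on many more
]
print(expected_utility(options[1]))  # 9.0 -> meets the target
```

The satisficer here stops at the first adequate option, while the maximiser takes the gamble; this is exactly the gap the inequality below exploits.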

Unfortunately, a self-improving satisficer has an extremely easy way to reach its satisficing goal: to transform itself into a maximiser. This is because, in general, if E denotes expectation,

E(U(there exists an agent A maximising U))  ≥  E(U(there exists an agent A satisficing U))

How is this true (apart from the special case when other agents penalise you specifically for being a maximiser)? Well, agent A will have to make decisions, and if it is a maximiser, will always make the decision that maximises expected utility. If it is a satisficer, it will sometimes not make the same decision, leading to lower expected utility in that case.

Hence if there were a satisficing agent for U with some strategy S to accomplish its goal, another way to accomplish it would be to transform itself into a maximising agent and let that agent implement S. If S is complicated and self-transformation is simple (which would be the case for a self-improving agent), then transforming itself into a maximiser is the easier way to go.

So unless we have exceedingly well programmed criteria banning the satisficer from using any variant of this technique, we should assume satisficers are likely to be as dangerous as maximisers.

Edited to clarify the argument for why a maximiser maximises better than a satisficer.

Edit: See BruceyB's comment for an example where a (non-timeless) satisficer would find rewriting itself as a maximiser to be the only good strategy. Hence timeless satisficers would behave as maximisers anyway (in many situations). Furthermore, a timeless satisficer with bounded rationality may find that rewriting itself as a maximiser would be a useful precaution to take, if it's not sure to be able to precalculate all the correct strategies.

Comments (67)

Comment author: gwern 21 October 2011 05:54:57PM 10 points [-]

If that were not the case, then the maximising agent would transform itself into a satisficing agent, but, (unless there are other agents out there penalising you for your internal processes), there is no better way of maximising the expected U than by attempting to maximise the expected U.

Is that really true? This seems to be the main and non-trivial question here, presented without proof. It seems to me that there ought to be plenty of strategies that a satisficer would prefer over a maximizer, just like risk-averse strategies differ from optimal risk-neutral strategies. E.g. buying +EV lottery tickets might be a maximizer's strategy but not a satisficer's.

Comment author: Stuart_Armstrong 21 October 2011 06:38:34PM 1 point [-]

I reworded the passage to be:

How come this is true (apart from the special case when other agents penalise you specifically for being a maximiser)? Well, agent A will have to make decisions, and if it is a maximiser, will always make the decision that maximises expected utility. If it is a satisficer, it will sometimes not make the same decision, leading to lower expected utility in that case.

Yes, the satisficer can be more risk averse than the maximiser - but it's precisely that that makes a worse expected utility maximiser.

Comment author: gwern 21 October 2011 06:55:01PM 0 points [-]

OK, that makes more sense to me.

Comment author: dlthomas 21 October 2011 08:04:54PM 8 points [-]

I don't think this follows. Consider the case where there's two choices:

1) 10% chance of no paperclips, 90% chance of 3^^^3 paperclips
2) 100% chance of 20 paperclips

The maximizer will likely pick 1, while the satisficer will definitely prefer 2.
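Putting rough numbers on this (a sketch of mine, with `HUGE` as a stand-in for 3^^^3, which is far too large to represent directly), both options clear a target expected utility of 9, while the maximiser's preferred option dominates:

```python
# Hedged illustration: HUGE stands in for 3^^^3, which cannot be
# represented directly; any astronomically large number makes the point.
HUGE = 10**100

eu_option_1 = 0.9 * HUGE + 0.1 * 0  # enormous expected utility
eu_option_2 = 1.0 * 20              # expected utility of exactly 20

print(eu_option_1 > eu_option_2)  # True: a maximiser picks option 1
print(eu_option_2 >= 9)           # True: option 2 satisfices
print(eu_option_1 >= 9)           # True: so does option 1, in expectation
```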

Comment author: timtyler 22 October 2011 07:29:18PM *  4 points [-]

I don't think this follows.

As I understand it, the actual problem in this area is not so much that "satisficers want to become maximisers" - but rather that a simple and effective means of satisficing fairly often involves constructing a maximising resource-gathering minion. Then after the satisficer is satisfied, the minion(s) may continue unsupervised - unless care is taken. I discussed this issue in my 2009 essay on the topic.

Comment author: Stuart_Armstrong 21 October 2011 08:36:59PM 3 points [-]

The expected utility of option 1 is higher than the threshold for the satisficer, so it could just as easily pick 1) as 2); it'll be indifferent between the two choices, and will need some sort of tie-breaker.

Comment author: dlthomas 21 October 2011 09:25:34PM *  3 points [-]

But inasmuch as it will want to want one over the other, it will want to want 2 which is guaranteed to continue to satisfice over 1 which has only a 90% chance of continuing to satisfice, so it should not want to become a maximizer.

Comment author: Manfred 21 October 2011 11:04:22PM *  4 points [-]

So that's actually the "bounded utility" definition, which Stuart says he isn't using. It does seem more intuitive though... I think you can get a paradox out of Stuart's definition, actually, which should not be surprising, since it isn't a utility-maximizer.

Comment author: Stuart_Armstrong 22 October 2011 08:01:32AM 0 points [-]

A satisficer is not motivated to continue to satisfice. It is motivated to take an action that is a satisficing action, and 1) and 2) are equally satisficing.

I know what you're trying to do, I think. I tried to produce a "continuously satisficing agent" or "future satisficing agent", but couldn't get it to work out.

Comment author: timtyler 22 October 2011 07:25:14PM 0 points [-]

It is motivated to take an action that is a satisficing action, and 1) and 2) are equally satisficing.

Surely option 1 has a 10% chance of failing to satisfy.

Comment author: Stuart_Armstrong 23 October 2011 12:01:57AM 2 points [-]

Option 1) already satisfies. Taking option 1) brings the expected utility up above the threshold, so the satisficer is done.

If you add the extra requirement that the AI must never let the expected utility fall below the threshold in future, then the AI will simply blind itself or turn itself off, once the satisficing level is reached; then its expected utility will never fall, as no extra information ever arrives.

Comment author: timtyler 23 October 2011 02:10:43AM 2 points [-]

Sorry - a failure to reread the question on my part :-(

Comment author: antigonus 21 October 2011 08:14:42PM *  1 point [-]

Right, the satisficer will not have an incentive to increase its expected utility by becoming a maximizer when its expected utility (by remaining a satisficer) is already over the threshold. But surely this condition would fail frequently.

Comment author: dlthomas 21 October 2011 09:26:40PM 0 points [-]

If it isn't over the threshold, it could just keep making the same decisions a maximizer would.

Comment author: JGWeissman 21 October 2011 07:36:55PM 4 points [-]

I see that a satisficer would assign higher expected utility to being a maximizer than to being a satisficer. But if the expected utility of being a satisficer were high enough, wouldn't it be satisfied to remain a satisficer?

Comment author: Stuart_Armstrong 21 October 2011 08:40:49PM 0 points [-]

The act of becoming a maximiser is an act that would, in itself, satisfy its satisficing requirement. The act of staying a satisficer might not do so (because if it did, for ever, then the satisficer will just be content with remaining a satisficer for ever, and never getting anything done).

Comment author: DanielLC 22 October 2011 02:08:52AM 3 points [-]

Alternately, a satisficer could build a maximiser. For example, if you don't give it the ability to modify its own code. It also might build a paperclip-making Von Neumann machine that isn't anywhere near a maximizer, but is still insanely dangerous.

I notice a satisficing agent isn't well-defined. What happens when it has two ways of satisfying its goals? It may be possible to make a safe one if you come up with a good enough answer to that question.

Comment author: timtyler 22 October 2011 12:36:35PM *  2 points [-]

I notice a satisficing agent isn't well-defined.

What I usually mean by it is: maximise until some specified criterion is satisfied - and then stop.

However, perhaps "satisficing" is not quite the right word for this. IMO, agents that stop are an important class of agents. I think we need a name for them - and this is one of the nearest things. In my essay, I called them "Stopping superintelligences".

What happens when it has two ways of satisfying its goals?

That's the same as with a maximiser.

Comment author: Stuart_Armstrong 22 October 2011 12:54:58PM 1 point [-]

What happens when it has two ways of satisfying its goals?

That's the same as with a maximiser.

Except much more likely to come up; a maximiser facing many exactly balanced strategies in the real world is a rare occurrence.

Comment author: timtyler 22 October 2011 07:16:40PM 1 point [-]

Well, usually you want satisfaction rapidly - and then things are very similar again.

Comment author: DanielLC 22 October 2011 09:56:02PM 1 point [-]

Then state that. It's an inverse-of-time-until-satisfaction-is-complete maximiser.

The way you defined satisfaction doesn't really work with that. The satisficer might just decide that it has a 90% chance of producing 10 paperclips, and thus its goal is complete. There is some chance of it failing in its goal later on, but this is likely to be made up by the fact that it probably will satisfy its goals with some extra. Especially if it could self-modify.

Comment author: Stuart_Armstrong 22 October 2011 08:34:12AM 1 point [-]

Alternately, a satisficer could build a maximiser.

Yep. Coding "don't unleash (or become) a maximiser or something similar" is very tricky.

I notice a satisficing agent isn't well-defined. What happens when it has two ways of satisfying its goals? It may be possible to make a safe one if you come up with a good enough answer to that question.

It may be. But encoding "safe" for a satisficer sounds like it's probably just as hard as constructing a safe utility function in the first place.

Comment author: Manfred 21 October 2011 11:15:54PM 3 points [-]

If the way to satisfice best is to act like a maximizer, then wouldn't an optimal satisficer simply act like a maximizer, no self-rewriting required?

Comment author: BruceyB 22 October 2011 03:27:47AM 7 points [-]

Here is a (contrived) situation where a satisficer would need to rewrite.

Sally the Satisficer gets invited to participate on a game show. The game starts with a coin toss. If she loses the coin toss, she gets 8 paperclips. If she wins, she gets invited to the Showcase Showdown where she will first be offered a prize of 9 paperclips. If she turns down this first showcase, she is offered the second showcase of 10 paper clips (fans of The Price is Right know the second showcase is always better).

When she first steps on stage she considers whether she should switch to maximizer mode or stick with her satisficer strategy. As a satisficer, she knows that if she wins the coin toss she won't be able to refuse the 9 paperclip prize since it satisfies her target expected utility of 9. So her expected utility as a satisficer is (1/2) * 8 + (1/2) * 9 = 8.5. If she won the flip as a maximizer, she would clearly pass on the first showcase and receive the second showcase of 10 paperclips. Thus her expected utility as a maximizer is (1/2) * 8 + (1/2) * 10 = 9. Switching to maximizer mode meets her target while remaining a satisficer does not, so she rewrites herself to be a maximizer.
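The arithmetic in this example can be checked directly; the sketch below just reproduces the comment's numbers:

```python
# Reproducing the comment's expected-utility calculation.
# Sally's target expected utility is 9.
TARGET = 9

# As a satisficer: on winning the toss she accepts the first showcase (9),
# since it already satisfices; losing the toss yields 8.
eu_satisficer = 0.5 * 8 + 0.5 * 9   # = 8.5, below the target

# As a maximiser: on winning she passes to the second showcase (10).
eu_maximiser = 0.5 * 8 + 0.5 * 10   # = 9.0, meets the target

print(eu_satisficer, eu_maximiser)  # 8.5 9.0
```

Only the maximising strategy reaches the target of 9 ex ante, which is why rewriting herself is the satisficing move.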

Comment author: Manfred 22 October 2011 04:30:09AM *  2 points [-]

Ah, good point. So "picking the best strategy, not just the best individual moves" is similar to self-modifying to be a maximizer in this case.

On the other hand, if our satisficer runs on updateless decision theory, picking the best strategy is already what it does all the time. So I guess it depends on how your satisficer is programmed.

Comment author: Stuart_Armstrong 22 October 2011 12:46:47PM 0 points [-]

On the other hand, if our satisficer runs on updateless decision theory...

This seems to imply that an updateless satisficer would behave like a maximiser - or that an updateless satisficer with bounded rationality would make themselves into a maximiser as a precaution.

Comment author: Manfred 22 October 2011 08:05:31PM 1 point [-]

A UDT satisficer is closer to the original than a pure maximizer, because where different strategies fall above the threshold the original tie-breaking rule can still be applied.

Comment author: D_Alex 24 October 2011 08:56:23AM 1 point [-]

Cool example! But your argument relies on certain vagueness in the definitions of "satisficer" and "maximiser", that between:

  • A: an agent "content when it reaches a certain level of expected utility"; and
  • B: "simply a maximiser with a bounded utility function"

(These definitions are from the OP).

Looking at the situation you presented: "A" would recognise the situation as having an expected utility of 9, and be content with it (until she loses the coin toss...). "B" would not distinguish between a utility of 9 and a utility of 10. Neither agent would see a need to self-modify.

Your argument treats Sally as (seeing itself) morphing from "A" before the coin toss to "B" after - this, IMO, invalidates your example.

Comment author: Stuart_Armstrong 22 October 2011 12:44:46PM 0 points [-]

I like this, I really do. I've added a mention of it in the post. Note that your point not only shows that a non-timeless satisficer would want to become a maximiser, but that a timeless satisficer would behave as a maximiser already.

Comment author: Elithrion 29 January 2013 05:23:35PM 0 points [-]

I realize this is old (which is why I'm replying to a comment to draw attention), but still, the entire post seems to be predicated on a poor specification of the utility function. Remember, the utility function by definition includes/defines the full preference ordering over outcomes, and must therefore include the idea of acting "satisfied" inside it.

Here, instead, you seem to define a "fake" utility function of U = E(number of paperclips) and then say that the AI will be satisfied at a certain number of paperclips, even though it clearly won't be because that's not part of the utility function. That is, something with this purported utility function is already a pure maximiser, not a satisficer at all. Instead, the utility function you're constructing should be something like U = {9 if E(paperclips) >= 9, E(paperclips) otherwise}, in which case in the above example the satisficer really wouldn't care if it ended up with 9 or 10 paperclips and would remain a satisficer. The notion that a satisficer wants to become a maximiser arises only because you made the "satisficer's" utility function identical to a maximiser's to begin with.

(There may be other issues with satisficers, but I don't think this is one of them. Also, sorry if that came across as confrontational - I just wanted to make my objection as clear as possible.)

Comment author: Stuart_Armstrong 30 January 2013 12:42:45PM 0 points [-]

Satisficing is a term for a specific type of decision making - quoting wikipedia: "a decision-making strategy that attempts to meet an acceptability threshold. This is contrasted with optimal decision-making, an approach that specifically attempts to find the best option available."

So by definition a satisficer is an agent that is content with a certain outcome, even though they might prefer a better one. Do you think my model - utility denoting the ideal preferences, and satisficing being content with a certain threshold - is a poor model of this type of agent?

Comment author: Elithrion 30 January 2013 06:20:52PM *  0 points [-]

Do you think my model - utility denoting the ideal preferences, and satisficing being content with a certain threshold - is a poor model of this type of agent?

Yes, as I said, I think any preferences of the agent, including being "satisfied", need to be internalized in the utility function. That is, satisficing should probably be content not with a certain level of utility, but with a certain level of the objective. Anything that's "outside" the utility function, as satisficing is in this case, will naturally be seen as an unnecessary imposition by the agent and ultimately ignored (if the agent is able to ignore it), regardless of what it is.

For a contrived analogy, modeling a satisficer this way is similar to modelling an honest man as someone who wants to maximize money, but who lives under the rule of law (and who is able to stop the law applying to him whenever he wants at that).

Comment author: Stuart_Armstrong 31 January 2013 11:17:59AM 0 points [-]

So I did a post saying that a satisficer would turn into an expected utility maximiser, and your point is... that any satisficer should already be an expected utility maximiser :-)

Comment author: Elithrion 31 January 2013 05:29:20PM 0 points [-]

...and your point is... that any satisficer should already be an expected utility maximiser :-)

No, only one that's modeled the way you're modeling. I think I'm somehow not being clear, sorry =( My point is that your post is tautological and does an injustice to satisficers. If you move the satisfaction condition inside the utility function, e.g. U = {9 if E(paperclips) >= 9, E(paperclips) otherwise}, so that its utility increases to 9 as it gains expected paperclips, and then stops at 9 (which is also not really an optimal definition, but an adequate one), the phenomenon of wanting to be a maximiser disappears. With that utility function, it would be indifferent between being a satisficer and a maximiser.

If you instead changed to a utility function like, let's say: U = {1 if 8 < E(paperclips) < 11, 0 otherwise}, then it would strictly prefer to remain a satisficer, since a maximiser would inevitably push it into the 0 utility area of the function. I think this is the more standard way to model a satisficer (with a resource cost thrown in as well), and it's certainly the more "steelmanned" one, as it avoids problems like the ones in this post.
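A quick sketch of the two utility functions discussed in this thread (function names are mine, not the commenter's):

```python
# Two ways to internalise "satisfaction" in the utility function itself.

def capped_utility(expected_paperclips):
    # U = 9 if E(paperclips) >= 9, else E(paperclips): equivalent to
    # min(E, 9), so the agent is indifferent above the threshold.
    return min(expected_paperclips, 9)

def band_utility(expected_paperclips):
    # U = 1 if 8 < E(paperclips) < 11, else 0: strictly prefers staying
    # inside the band, so unbounded maximising actively hurts it.
    return 1 if 8 < expected_paperclips < 11 else 0

print(capped_utility(9), capped_utility(10))  # 9 9  (indifferent)
print(band_utility(10), band_utility(10**6))  # 1 0  (maximising hurts)
```

Under the capped function the agent has no reason to become a maximiser; under the band function it would strictly prefer not to.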

Comment author: Stuart_Armstrong 01 February 2013 12:01:18PM 0 points [-]

That's just a utility maximiser with a bounded utility function.

But this has become a linguistic debate, not a conceptual one. One version of satisficers (the version I define, which some people intuitively share) will tend to become maximisers. Another version (the bounded utility maximisers that you define) are already maximisers. We both agree on these facts - so what is there to argue about but the linguistics?

Since satisficing is more intuitively than rigorously defined (multiple formal definitions on wikipedia), I don't think there's anything more to dispute?

Comment author: Elithrion 01 February 2013 06:40:43PM 1 point [-]

All right, I agree with that. It does seem like satisficers are (or quickly become) a subclass of maximisers by either definition.

Although I think the way I define them is not equivalent to a generic bounded maximiser. When I think of one of those it's something more like U = paperclips/(|paperclips|+1) than what I wrote (i.e. it still wants to maximize without bound, it's just less interested in low probabilities of high gains), which would behave rather differently. Maybe I just have unusual mental definitions of both, however.

Comment author: antigonus 21 October 2011 08:00:56PM 3 points [-]

Doesn't follow if an agent wants to satisfice multiple things, since maximizing the amount of one thing could destroy your chances of bringing about a sufficient quantity of another.

Comment author: Stuart_Armstrong 21 October 2011 08:43:26PM *  2 points [-]

Interesting idea, but I think it reduces to the single case. If you want to satisfice, say, E(U1) > N1 and E(U2) > N2, then you could set U3=min(U1-N1,U2-N2) and satisfice E(U3) > 0.
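As a sketch of the reduction (the threshold values are mine, purely illustrative):

```python
# Combining two satisficing thresholds into one: satisfice
# U3 = min(U1 - N1, U2 - N2) above 0.

N1, N2 = 9, 4  # illustrative thresholds, not from the comment

def u3(u1, u2):
    return min(u1 - N1, u2 - N2)

# An outcome meets both original thresholds exactly when U3 > 0.
print(u3(10, 5) > 0)  # True: both thresholds met
print(u3(10, 3) > 0)  # False: the U2 threshold is missed
```

Note this equivalence holds outcome by outcome; in expectation the min-combination is more conservative, since E(min(U1, U2)) ≤ min(E(U1), E(U2)).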

Comment author: antigonus 21 October 2011 09:05:47PM *  0 points [-]

Sure, but there are vastly more constraints involved in maximizing E(U3). It's easy to maximize E(U(A)) in such a way as to let E(U(B)) go down the tubes. But if C is positively correlated with either A or B, it's going to be harder to max-min A and B while letting E(U(C)) plummet. The more accounted-for sources of utility there are, the likelier a given unaccounted-for source of utility X is to be entangled with one of the former, so the harder it will be for the AI to max-min in such a way that neglects X. Perhaps it gets exponentially harder! And humans care about hundreds or thousands of things. It's not clear that a satisficer that's concerned with a significant fraction of those would be able to devise a strategy that fails utterly to bring the others up to snuff.

Comment author: Stuart_Armstrong 22 October 2011 08:04:10AM 0 points [-]

I'm simply pointing out that a multi-satisficer is the same as a single-satisficer. Putting all the utilities together is a sensible thing; but making a satisficer out of that combination isn't.

Comment author: timtyler 21 October 2011 08:39:31PM *  2 points [-]

I described this issue - and discussed some strategies for dealing with it - in 2009 here.

Comment author: shminux 21 October 2011 05:01:10PM 3 points [-]

E(U(there exists an agent A maximising U) ≥ E(U(there exists an agent A satisficing U)

It's a good idea to define your symbols and terminology in general before (or right after) using them. Presumably U is utility, but what is E? Expectation value? How do you calculate it? What is an agent? How do you calculate utility of an existential quantifier? If this is all common knowledge, at least give a relevant link. Oh, and it is also a good idea to prove or at least motivate any non-trivial formula you present.

Feel free to make your post (which apparently attempts to make an interesting point) more readable for the rest of us (i.e. newbies like me).

Comment author: Stuart_Armstrong 21 October 2011 06:47:37PM 1 point [-]

Reworded somewhat. E is expectation value, as is now stated; it does not need to be calculated, we just need to know that a maximiser will always make the decision that maximises the expected value of U, while a satisficer may sometimes make a different decision; hence the presence of a U-maximiser increases the expected value of U over the presence of an otherwise equivalent U-satisficer.

An agent is "an entity which is capable of action"; an AI or human being or collection of neurons that can do stuff. It's a general term here, so I didn't define it.

Comment author: scmbradley 17 February 2012 05:09:59PM 1 point [-]

As I understand what is meant by satisficing, this misses the mark. A satisficer will search for an action until it finds one that is good enough, then it will do that. A maximiser will search for the best action and then do that. A bounded maximiser will search for the "best" action (best according to its bounded utility function) and then do that.

So what the satisficer picks depends on what order the possible actions are presented to it in a way it doesn't for either maximiser. Now, if easier options are presented to it first then I guess your conclusion still follows, as long as we grant the premise that self-transforming will be easy.

But I don't think it's right to identify bounded maximisers and satisficers.

Comment author: Sniffnoy 22 October 2011 06:33:52AM 1 point [-]

It seems to me that a satisficer that cares about expected utility rather than actual utility is not even much of a satisficer in the first place, in that it doesn't do what we expect of satisficers (mostly ignoring small probabilities of gains much greater than its cap in favor of better probabilities of ones that just meet the cap). Whereas the usual satisficer, a maximizer with a bounded utility function (well, not just bounded - cut off), does.

Comment author: RolfAndreassen 21 October 2011 07:47:22PM 1 point [-]

What if the satisficer is also an optimiser? That is, its utility function is not only flat in the number of paperclips after 9, but actually decreasing.

Comment author: Stuart_Armstrong 22 October 2011 08:37:10AM *  0 points [-]
Comment author: [deleted] 21 October 2011 06:40:46PM *  1 point [-]

E(U(there exists an agent A maximising U) ≥ E(U(there exists an agent A satisficing U)

The reason this equation looks confusing is because (I presume) there ought to be a second closing bracket on both sides.

Anyhow, I agree that a satisficer is almost as dangerous as a maximiser. However, I've never come across the idea that a satisficing agent "has a lot to recommend it" on Less Wrong.

I thought that the vast majority of possible optimisation processes - maximisers, satisficers or anything else - are very likely to destroy humanity. That is why CEV, or in general the incorporation into an AI of at least one complete set of human values, is necessary in order for AGI not to be almost certainly uFAI.

Comment author: Stuart_Armstrong 21 October 2011 06:52:08PM *  2 points [-]

The reason this equation looks confusing is because (I presume) there ought to be a second closing bracket on both sides.

There are second closing brakets on both sides. Look closely. They have always been there. Honest, guv. No, do not look into your cache or at previous versions. They lie! I would never have forgotten to put closing brakets.

Nor would I ever misspell the word braket. Or used irony in a public place ;-)

Comment author: Stuart_Armstrong 21 October 2011 06:49:42PM 2 points [-]

This is a simple argument, that I hadn't seen before as to why satisficers are not a good way to go about things.

I've been looking at Oracles and other non-friendly AGIs that may nevertheless be survivable, so it's good to know that satisficers are not to be counted among them.

Comment author: Brian_Tomasik 11 August 2015 10:25:57PM 0 points [-]

As I understand it, your satisficing agent has essentially the utility function min(E[paperclips], 9). This means it would be fine with a 10^-100 chance of producing 10^101 paperclips. But isn't it more intuitive to think of a satisficer as optimizing the utility function E[min(paperclips, 9)]? In this case, the satisficer would reject the 10^-100 gamble described above, in favor of just producing 9 paperclips (whereas a maximizer would still take the gamble and hence would be a poor replacement for the satisficer).
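The difference between the two formalisations shows up directly on the gamble described (a sketch, using the comment's numbers):

```python
# Contrasting min(E[paperclips], 9) with E[min(paperclips, 9)] on a
# 10^-100 chance of 10^101 paperclips versus 9 paperclips for sure.
p, n = 1e-100, 1e101

# min(E[paperclips], 9): the gamble's expectation is about 10, capped
# at 9 -- the agent is indifferent between the gamble and the sure thing.
u_outside = min(p * n, 9)  # = 9

# E[min(paperclips, 9)]: the gamble almost surely yields 0 paperclips,
# so its utility is tiny; the sure 9 wins clearly.
u_inside_gamble = p * min(n, 9) + (1 - p) * min(0, 9)  # ~ 9e-100
u_inside_sure = min(9, 9)                              # = 9

print(u_outside, u_inside_gamble, u_inside_sure)
```

Only the second form rejects the near-worthless gamble, matching the intuitive picture of a satisficer.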

A satisficer might not want to take over the world, since doing that would arouse opposition and possibly lead to its defeat. Instead, the satisficer might prefer to request very modest demands that are more likely to be satisfied (whether by humans or by an ascending uncontrolled AI who wants to mollify possible opponents).

Comment author: timtyler 24 October 2011 01:42:04PM 0 points [-]

it is very easy to show that any "satisficing problem" can be formulated as an equivalent "optimization problem"

Comment author: nazgulnarsil 23 October 2011 09:29:59PM 0 points [-]

Satisficing seems a great way to describe the behavior of maximizers with multiple-term utility functions and an ordinal ranking of preference satisfaction i.e. humans. This sounds like it should have some fairly serious implications.

Comment author: moridinamael 21 October 2011 05:25:35PM 0 points [-]

Build the utility function such that excesses above the target level are penalized. If the agent is motivated to build 9 paperclips only and absolutely no more, then the idea of becoming a maximizer becomes distasteful.

This amuses me because I know actual human beings who behave as satisficers with extreme aversion to waste, far out of proportion to the objective costs of waste. For example: Friends who would buy a Toyota Corolla based on its excellent value-to-cost ratio, and who would not want a cheaper, less reliable car, but who would also turn down a much nicer car offered to them at a severe discount, on the grounds that the nicer car is "indulgent."

Comment author: RobertLumley 21 October 2011 06:29:21PM 3 points [-]

But you run into other problems then, like the certainty the OP touched on. Then the agent will spend significant resources ensuring that it has exactly 9 paperclips made, and wouldn't accept a 90% probability of making 10 paperclips, because a 99.9999% probability of making 9 paperclips would yield more utility for it.

Comment author: timtyler 22 October 2011 12:39:52PM 1 point [-]

Sooo - you would normally give such an agent time and resource-usage limits.

Comment author: RobertLumley 22 October 2011 04:15:15PM 0 points [-]

But the entire point of building FAI is to not require it to have resource usage limits, because it can't help us if it's limited. And such resource limits wouldn't necessarily be useful for "testing" whether or not an AI was friendly, because if it weren't, it would mimic the behaviour of a FAI so that it could get more resources.

Comment author: timtyler 22 October 2011 07:21:38PM *  -1 points [-]

But the entire point of building FAI is to not require it to have resource usage limits, because it can't help us if it's limited.

Machines can't cause so much damage if they have resource-usage limits. This is a prudent safety precaution. It is not true that resource-limited machines can't help us.

And such resource limits wouldn't necessarily be useful for "testing" whether or not an AI was friendly, because if it weren't, it would mimic the behaviour of a FAI so that it could get more resources.

So: the main idea is to attempt damage limitation. If the machine behaves itself, you can carry on with another session. If it does not, it is hopefully back to the drawing board, without too much damage done.

Comment author: Stuart_Armstrong 21 October 2011 06:22:45PM 6 points [-]

If the agent is motivated to build 9 paperclips only and absolutely no more, then the idea of becoming a maximizer becomes distasteful.

That is already a maximiser - its utility is maximised by building exactly 9 paperclips. It will take over the universe to build more and more sophisticated ways of checking that there are exactly 9 paperclips, and more ways of preventing itself (however it defines itself) from inadvertently building more. In fact it may take over the universe first, put all the precautions in place, and build exactly 9 paperclips just before heat-death wipes out everything remaining.

Comment author: moridinamael 21 October 2011 07:33:56PM 2 points [-]

Ah, I see. Thanks for correcting me.

So my friends are maximizers in the sense that they seek very specific targets in car-space, and the fact that those targets sit in the middle of a continuum of options is not relevant to the question at hand.

Comment author: Vaniver 23 October 2011 05:06:29PM 0 points [-]

Um, the standard AI definition of a satisficer is:

"optimization where 'all' costs, including the cost of the optimization calculations themselves and the cost of getting information for use in those calculations, are considered."

That is, a satisficer explicitly will not become a maximizer, because it is consciously aware of the costs of being a maximizer rather than a satisficer.

A maximizer might have a utility function like "p", where p is the number of paperclips, while a satisficer would have a utility function like "p-c", where p is the number of paperclips and c is the cost of the optimization process. The maximizer is potentially unbounded; the satisficer stops when marginal reward equals marginal cost (which could also be unbounded, but is less likely to be so).
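A toy sketch of this marginal-cost satisficer (the quadratic cost model is my assumption, purely illustrative):

```python
# Utility p - c: paperclip reward minus optimisation cost, where the
# cost of further optimisation is assumed to grow faster than the reward.

def reward(n):  # paperclips made
    return n

def cost(n):    # optimisation cost, assumed convex
    return 0.01 * n ** 2

def net(n):
    return reward(n) - cost(n)

# The satisficer stops where marginal reward equals marginal cost:
best_n = max(range(200), key=net)
print(best_n)  # 50: beyond this, each extra paperclip costs more than it's worth
```

Unlike the pure maximiser's "p", this agent halts on its own, which is the sense in which it does not reduce to a maximiser of paperclips alone.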

Comment author: timtyler 24 October 2011 02:09:15PM *  0 points [-]

That is, a satisficer explicitly will not become a maximizer, because it is consciously aware of the costs of being a maximizer rather than a satisficer.

According to the page you cite, satisficers are a subset of maximisers. Satisficers are just maximisers whose utility functions factor in constraints.

Comment author: Vaniver 24 October 2011 09:30:16PM 0 points [-]

Yes, for some definitions of maximizers. The article Stuart_Armstrong wrote seems to have two differing definitions: maximizers are agents that seek to get as much X as possible, and his satisficers want to get as much E(X) as possible. Then, trivially, those reduce to agents that want to get as much X as possible.

I don't see that as novel or relevant since what I would call satisficers are those that try to set marginal gain equal to marginal cost. Those generally do not reduce to agents that seek to get as much X as possible.

Comment author: PuyaSharif 22 October 2011 01:39:17PM 0 points [-]

Can you really assume the agent to have a utility function that is both linear in paperclips (which implies risk neutrality) and bounded + monotonic?

Comment author: Stuart_Armstrong 22 October 2011 05:36:48PM 0 points [-]

No; but you can assume it's linear up to some bound.