Decision theory: Why we need to reduce “could”, “would”, “should”

AnnaSalamon

2nd Sep 2009

(This is the second post in a planned sequence.)

Let’s say you’re building an artificial intelligence named Bob. You’d like Bob to sally forth and win many utilons on your behalf. How should you build him? More specifically, should you build Bob to have a world-model in which there are many different actions he “could” take, each of which “would” give him particular expected results? (Note that e.g. evolution, rivers, and thermostats do not have explicit “could”/“would”/“should” models in this sense -- and while evolution, rivers, and thermostats are all varying degrees of stupid, they all still accomplish specific sorts of world-changes. One might imagine more powerful agents that also simply take useful actions, without claimed “could”s and “would”s.)

My aim in this post is simply to draw attention to “could”, “would”, and “should”, as concepts folk intuition fails to understand, but that seem nevertheless to do something important for real-world agents. If we want to build Bob, we may well need to figure out what the concepts “could” and “would” can do for him.*

Introducing Could/Would/Should agents:

Let a Could/Would/Should Algorithm, or CSA for short, be any algorithm that chooses its actions by considering a list of alternatives, estimating the payoff it “would” get “if” it took each given action, and choosing the action from which it expects highest payoff.

That is: let us say that to specify a CSA, we need to specify:

A list of alternatives a_1, a_2, ..., a_n that are primitively labeled as actions it “could” take;
For each alternative a_1 through a_n, an expected payoff U(a_i) that is labeled as what “would” happen if the CSA takes that alternative.

To be a CSA, the algorithm must then search through the payoffs for each action, and must then trigger the agent to actually take the action a_i for which its labeled U(a_i) is maximal.

Note that we can, by this definition of “CSA”, create a CSA around any made-up list of “alternative actions” and of corresponding “expected payoffs”.
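As a minimal sketch of the definition above (the names `csa_choose` and `made_up_payoffs` are illustrative assumptions, not anything from the post), a CSA needs only a list of labeled alternatives, a labeled payoff for each, and a step that picks the maximum. Nothing in the code checks whether the labels mean anything, which is the point of the note about made-up lists:

```python
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")


def csa_choose(alternatives: Sequence[A], labeled_payoff: Callable[[A], float]) -> A:
    """Return the alternative a_i whose labeled payoff U(a_i) is maximal."""
    if not alternatives:
        raise ValueError("a CSA needs at least one alternative")
    # Search through the labeled payoffs and pick the largest one.
    return max(alternatives, key=labeled_payoff)


# An entirely made-up list of "coulds" with made-up "woulds" (hypothetical values).
made_up_payoffs = {"sing": 3.0, "dance": 7.0, "sleep": 1.5}

choice = csa_choose(list(made_up_payoffs), made_up_payoffs.__getitem__)
print(choice)
```

The selection step is trivial; everything interesting is hidden in where the alternatives and the payoff labels come from.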

The puzzle is that CSAs are common enough to suggest that they’re useful -- but it isn’t clear why CSAs are useful, or which kinds of CSAs are useful in which ways. To spell out the puzzle:

Puzzle piece 1: CSAs are common. Humans, some (though far from all) other animals, and many human-created decision-making programs (game-playing programs, scheduling software, etc.), have CSA-like structure. That is, we consider “alternatives” and act out the alternative from which we “expect” the highest payoff (at least to a first approximation). The ubiquity of approximate CSAs suggests that CSAs are in some sense useful.

Puzzle piece 2: The naïve realist model of CSAs’ nature and usefulness doesn’t work as an explanation.

That is: many people find CSAs’ usefulness unsurprising, because they imagine a Physically Irreducible Choice Point, where an agent faces Real Options; by thinking hard, and choosing the Option that looks best, naïve realists figure that you can get the best-looking option (instead of one of those other options, that you Really Could have gotten).

But CSAs, like other agents, are deterministic physical systems. Each CSA executes a single sequence of physical movements, some of which we consider “examining alternatives”, and some of which we consider “taking an action”. It isn’t clear why or in what sense such systems do better than deterministic systems built in some other way.

Puzzle piece 3: Real CSAs are presumably not built from arbitrarily labeled “coulds” and “woulds” -- presumably, the “woulds” that humans and others use, when considering e.g. which chess move to make, have useful properties. But it isn’t clear what those properties are, or how to build an algorithm to compute “woulds” with the desired properties.

Puzzle piece 4: On their face, all calculations of counterfactual payoffs (“woulds”) involve asking questions about impossible worlds. It is not clear how to interpret such questions.

Determinism notwithstanding, it is tempting to interpret CSAs’ “woulds” -- our U(a_i)s above -- as calculating what “really would” happen, if they “were” somehow able to take each given action.

But if agent X will (deterministically) choose action a_1, then when he asks what would happen “if” he takes alternative action a_2, he’s asking what would happen if something impossible happens.

If X is to calculate the payoff “if he takes action a_2” as part of a causal world-model, he’ll need to choose some particular meaning of “if he takes action a_2” -- some meaning that allows him to combine a model of himself taking action a_2 with the rest of his current picture of the world, without allowing predictions like “if I take action a_2, then the laws of physics will have been broken”.
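To make the requirement concrete, here is one candidate meaning, offered only as an illustration and not as the answer the post is looking for: overwrite the action in X's world-model and hold everything else in his current picture fixed. The toy world (`WorldState`, `payoff`, the umbrella example and its numbers) is entirely hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class WorldState:
    weather: str  # stands in for "the rest of X's current picture of the world"
    action: str   # the action X's model says X takes


def payoff(state: WorldState) -> float:
    """Hypothetical toy world-model: payoff as a function of the whole state."""
    if state.action == "umbrella":
        return 5.0 if state.weather == "rain" else 2.0
    if state.action == "no_umbrella":
        return 0.0 if state.weather == "rain" else 4.0
    raise ValueError(f"unknown action: {state.action}")


def would(current: WorldState, alternative: str) -> float:
    """One reading of 'if X takes `alternative`': change only the action field."""
    return payoff(replace(current, action=alternative))


# X will in fact (deterministically) take "umbrella"; "no_umbrella" never happens.
current = WorldState(weather="rain", action="umbrella")
woulds = {a: would(current, a) for a in ("umbrella", "no_umbrella")}
print(woulds)
```

This reading silently assumes that nothing else in the model depends on which action X takes. Whether that assumption is the right one, and what the alternatives to it are, is exactly what the questions below leave open.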

We are left with several questions:

Just what are humans, and other common CSAs, calculating when we imagine what “would” happen “if” we took actions we won’t take?
In what sense, and in what environments, are such “would” calculations useful? Or, if “would” calculations are not useful in any reasonable sense, how did CSAs come to be so common?
Is there more than one natural way to calculate these counterfactual “would”s? If so, what are the alternatives, and which alternative works best?

*A draft-reader suggested to me that this question is poorly motivated: what other kinds of agents could there be, besides “could”/“would”/“should” agents? Also, how could modeling the world in terms of “could” and “would” not be useful to the agent?

My impression is that there is a sort of gap in philosophical wariness here that is a bit difficult to bridge, but that one must bridge if one is to think well about AI design. I’ll try an analogy. In my experience, beginning math students simply expect their nice-sounding procedures to work. For example, they expect to be able to add fractions straight across. When you tell them they can’t, they demand to know why they can’t, as though most nice-sounding theorems are true, and if you want to claim that one isn’t, the burden of proof is on you. It is only after students gain considerable mathematical sophistication (or experience getting burned by expectations that don’t pan out) that they place the burden of proof on the theorems, assume theorems false or un-usable until proven true, and try to actively construct and prove their mathematical worlds.

Reaching toward AI theory is similar. If you don’t understand how to reduce a concept -- how to build circuits that compute that concept, and what exact positive results will follow from that concept and will be absent in agents which don’t implement it -- you need to keep analyzing. You need to be suspicious of anything you can’t derive for yourself, from scratch. Otherwise, even if there is something of the sort that is useful in the specific context of your head (e.g., some sort of “could”s and “would”s that do you good), your attempt to re-create something similar-looking in an AI may well lose the usefulness. You get cargo cult could/woulds.

+ Thanks to Z M Davis for the above gorgeous diagram.

Comments

ChrisHibbert

I think some concreteness might be useful here. When I write code (no pretense at AI here), I often write algorithms that take different actions depending on the circumstances. I can't recall a time when I collected possible steps, evaluated them, and executed the possibility with the highest utility. Instead I, as the programmer, attempt to divide the world into disjoint possibilities, write an evaluation procedure that will distinguish between them (if-then-else, or using OO I ensure that the right kind of object will be acting at the time), and design the code so that it will take a specific action that I expected would make sense for that context when that is the path chosen. There's little of "could" or "should" here.
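A minimal sketch of the style of code described above, assuming a hypothetical file-handling context (the names `dispatch`, `file_missing`, and so on are invented for illustration). The programmer partitions the situations ahead of time and wires each one to a fixed action; the running code never lists alternatives or compares payoffs:

```python
def dispatch(context: str) -> str:
    """Map each pre-analyzed situation directly to its pre-chosen action."""
    if context == "file_missing":
        return "create_file"
    elif context == "file_locked":
        return "retry_later"
    else:
        # Default branch: the programmer decided in advance what makes sense here.
        return "write_file"


for situation in ("file_missing", "file_locked", "file_ready"):
    print(situation, "->", dispatch(situation))
```

Any weighing of options happened in the programmer's head when the branches were written, not in the program.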

On the other hand, when I walk into the kitchen thinking thoughts of dessert, I generate possibilities based on my recollection of what's in the fridge and the cupboards or sometimes based on a search of those locations. I then think about which will taste better, which I've had more recently, which is getting old and needs to be used up, and then pick one (without justifying the choice based on the evaluations). There seems to be lots of CSA going on here, even though it seems like a simple, highly constrained problem area.

When human chess masters play, they retain more could-ness in their evaluations if they consider the possibility of not making the "optimal" move in order to psych out their opponents. I don't know whether the chess-playing automatons consider those possibilities. Without it, you could say they are constrained to make the move that leaves them in the best position according to their evaluation metric. So even though they do explicitly evaluate alternatives, they have a single metric for making a choice. The masters I just described have multiple metrics and a vague approach to combining them, but that's the essence of good game playing.

Bottom line? When I'm considering a big decision, I want to leave more variables open, to simulate more possible worlds and the consequences of my choices. When I'm on well-trodden ground, I hope for an optimized decision procedure that knows what to do and has simple rules that allow it to determine which pre-analyzed direction is the right one. The reason we want AIs to be open in this way is that we're hoping they have the breadth of awareness to tackle problems that they haven't explicitly been programmed for. I don't think you (the programmer) can leave out the could-ness unless you can enumerate the alternative actions and program in the relevant distinctions ahead of time.
