All of Plinthist's Comments + Replies

In that case, "purely observational" would describe an expectation for behavior and not the actual pattern of behavior. This is not at all what the conversion I described involves. 

Remember: I'm allowing unlimited memory, taking into account the full history of inputs and outputs (i.e. environmental information received and agent response).

In your example, the history X might be A(ab)B(bc)C(ca)A, where (pq) denotes the action that happens to cause the environment to produce state Q after state P. In this case, the behavioral function B(X) would yield (ab...
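A minimal sketch of what this encoding might look like (the Python representation and the names are purely illustrative, not from the thread):

```python
# A history interleaves environment states with the actions taken,
# e.g. A (ab) B (bc) C (ca) A from the example above.
history = ["A", "ab", "B", "bc", "C", "ca", "A"]

def behavioral_function(history):
    """A behavioral model expressed as a function of the full history.

    For the cyclic agent under discussion the next action happens to
    depend only on the last state, but nothing stops a model of this
    form from using the entire list.
    """
    last_state = history[-1]
    return {"A": "ab", "B": "bc", "C": "ca"}[last_state]

print(behavioral_function(history))  # -> "ab"
```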

Any form of generalization can itself be represented by a function on behavior, one that takes the results of past behavior and yields actions based on them - so I'm not following you here. Can you give me an example of a model of behavior that isn't purely observational, in the sense that it can't be represented as a function of the full history of actions and responses? Any model with such a representation is susceptible to a utility function that just checks whether each past action adhered to said function.
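A sketch of the construction being claimed here, under the assumption that the behavioral model is given as a function from history prefixes to actions (the function names are illustrative):

```python
def adherence_utility(history, model):
    """Assign utility 1 to histories in which every recorded action matches
    what the behavioral model prescribes for the preceding prefix, else 0.

    history: alternating list [state, action, state, action, ..., state]
    model:   function from a state-terminated history prefix to an action
    """
    for i in range(1, len(history), 2):   # indices of the recorded actions
        if history[i] != model(history[:i]):
            return 0
    return 1

# Example with the cyclic behavioral model discussed above:
cyclic = lambda prefix: {"A": "ab", "B": "bc", "C": "ca"}[prefix[-1]]
print(adherence_utility(["A", "ab", "B", "bc", "C", "ca", "A"], cyclic))  # 1
print(adherence_utility(["A", "bc", "B"], cyclic))                        # 0
```

Any agent that follows the model exactly is then, trivially, maximizing this function.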

JBlack
A purely observational model of behaviour is simply a list of actions that have actually been observed, and the histories of the universe that led to them. For example, with my trivial agent you could observe: "With the history of the universe being just the list of states [A], it performed action b leading to state B. With the list being [AB] it performed action c leading to state C. With the list being [ABC] it performed action a leading to state A."

From this model you can conclude that if the universe was somehow rewound and placed back into state A (with the history being just [A]), the agent would once again perform action b. This agent is deterministic. From these observations you can fit any utility function with U(ABCA) > U(ABC) > U(AB) > U(A). But it's useless, since the history of the universe now contains states ABCA and you can't in fact roll back the universe. In particular, you have no idea whether U(ABCAB) > U(ABCAA) or not, because your observations don't tell you.

There are infinitely many behavioural rules that are not purely observational, but are compatible with the observations. Some of them allow predictions, some of them don't. Independently of that, some of them are compatible with a utility function, some of them aren't.

The rules I gave for my agent are not purely observational - they are the actual rules that the agent uses for its actions (in a simplified, quantized universe) and not just some finite set of observations. The behavioural model corresponding to those rules is incompatible with every utility function.
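One way to picture the "purely observational" model and its limits (this table is an illustration, not JBlack's):

```python
# A purely observational model: nothing but the finitely many histories
# actually seen, with some increasing numbers fitted to them.
observed_utility = {"A": 0, "AB": 1, "ABC": 2, "ABCA": 3}

# The fit satisfies U(ABCA) > U(ABC) > U(AB) > U(A) ...
print(observed_utility["ABCA"] > observed_utility["ABC"]
      > observed_utility["AB"] > observed_utility["A"])              # True

# ... but says nothing about unobserved histories such as ABCAB vs ABCAA.
print(observed_utility.get("ABCAB"), observed_utility.get("ABCAA"))  # None None
```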

I see - so you're describing a purely input-based and momentary utility function, one that can rely only on the time-independent response from the environment. For the incomplete-information circumstances that I'm modeling, agents representable in this way would need to be ridiculously stupid, as they couldn't make any connections between their actions and the feedback they get, nor between various instances of feedback. For example, the best a paperclip maximizer of this form could do is check whether a paperclip is currently in its sensory access.
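A tiny sketch of the "input-based and momentary" utility function described here (my own illustration; the paperclip check is hypothetical):

```python
def momentary_utility(current_percept):
    """Best case for a paperclip maximizer of this restricted form:
    it can only score what is in its sensory access right now, and cannot
    relate actions to feedback or one observation to another."""
    return 1 if "paperclip" in current_percept else 0

print(momentary_utility({"paperclip", "desk"}))  # 1
print(momentary_utility({"desk"}))               # 0
```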

Do you see how, if we expand the utility function's scope to both the agent's actions and its full history, a "behavior-adherence" utility function becomes trivial?

JBlack
No, "state" here refers to the entire state of the universe including the agent's internal state. My example doesn't care about the internal state of the agent, but that's because the example is indeed a very stupid agent for simplicity, and not because this is in any way intrinsically required. Any purely observational model of behaviour can always correspond to a utility function, true. But also useless since such a model doesn't predict anything at all. As soon as you allow generalization beyond strictly what you have observed, you lose the guarantee that a utility function exists corresponding to that behaviour model.

By "utility function" here, I just mean a function encoding the preferences of an agent - one that it optimizes - based on everything available to it. So, for any behavioral model, you could construct such a function that universally prefers the agent's actions to be linked to its information by that model. 

It sounds like this may not be what you associate this word with. Could you give me an example of a behavior pattern that is not optimized by any utility function?

JBlack
A simple and familiar example is that if preferences are not transitive, then there does not exist any utility function that models them. Similar problems arise with other failures of the VNM axioms, all of which are capable of being violated by the actual behaviour model of an agent.

Simplest example of non-transitivity: in state A the agent always takes action b, which yields state B. In state B the agent always takes action c, which yields state C. In state C the agent always takes action a, yielding state A. It's a very stupid agent, but it's obviously one that can exist.

The inference of preferences from actions says that it prefers state B over A, state C over B, and state A over C. There is no utility function U such that U(A) > U(C) > U(B) > U(A).
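A quick brute-force check of that last claim (a sketch, not from the thread): since only the ordering of utilities matters, it suffices to try every ranking of the three states and observe that none satisfies the cyclic preferences.

```python
from itertools import permutations

# Look for a ranking of A, B, C with U(B) > U(A), U(C) > U(B) and U(A) > U(C).
compatible = []
for ranks in permutations(range(3)):
    U = dict(zip("ABC", ranks))
    if U["B"] > U["A"] and U["C"] > U["B"] and U["A"] > U["C"]:
        compatible.append(U)

print(compatible)  # [] -- no utility function over states is consistent with the cycle
```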