# Vladimir_Nesov comments on Towards a New Decision Theory - Less Wrong

43 13 August 2009 05:31AM

Comment author: 13 August 2009 04:10:42PM * 0 points

Notice that, in general, which instances of the agent (the one making the choice) are possible depends on the choice it makes.

Consider what is accessible if you trace the history of the agent along counterfactuals. Let's say time is discrete, and at each moment the agent is in a certain state. Going forward in time, you include both options for the agent's state after it receives a binary observation from the environment. Conversely, going backward, you include both options for the agent's state before each binary action that the agent could have taken to arrive at the current state (action and observation are dual under time reversal in reversible deterministic world dynamics). Iterating these operations, you construct a "state network" of accessible agent states. (You include the states reached by "zig-zag" as well: first a step into the past, then a step into the future along an observation other than the one that led to the state from which the tracing began. That brings you to a counterfactual state in the usual sense, but these time-forward and time-backward steps can be repeated an infinite number of times.)

Now the set of all possible states of the agent is divided into equivalence classes of states belonging to the same state network. If the agent belongs to one of the state networks, it couldn't be in any other state network (in the generalized sense of "couldn't"). But which states belong to which network depends on the agent's algorithm. In fact, the choice of the algorithm is equivalent to the choice of networks that cover the state set. I'm not really sure what to do with this construction, or whether the structure of the networks other than the one that contains the current state should matter. From the principle that observations shouldn't influence the choice of strategy, the other state networks should matter just as much, but then again they are not even counterfactual...
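
A minimal sketch of the construction, under loud assumptions: the four-state set `STATES`, the transition function `step`, and the example policies are all hypothetical toys, not part of the original idea, and the world is not required to be reversible here. Each forward step is "the agent acts according to its algorithm, then receives a binary observation"; traversing the same edge in the opposite direction stands in for the backward step. A state network is then a connected component of the resulting undirected graph.

```python
from collections import deque

STATES = range(4)        # hypothetical toy state set: two-bit agent states
OBSERVATIONS = (0, 1)    # binary observation from the environment


def step(state, action, observation):
    """Toy transition (an assumption): the observation bit flips
    the state bit selected by the action (bit 0 or bit 1)."""
    return state ^ (observation << action)


def neighbours(policy):
    """Undirected adjacency: s and t are linked if one forward step
    under `policy` leads from s to t for some observation."""
    adj = {s: set() for s in STATES}
    for s in STATES:
        for o in OBSERVATIONS:
            t = step(s, policy(s), o)
            adj[s].add(t)  # step forward in time
            adj[t].add(s)  # the same edge, traversed backward in time
    return adj


def state_networks(policy):
    """Partition STATES into networks (connected components) by
    repeated forward/backward tracing, i.e. breadth-first search."""
    adj = neighbours(policy)
    seen, networks = set(), []
    for start in STATES:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            s = queue.popleft()
            for t in adj[s]:
                if t not in component:
                    component.add(t)
                    queue.append(t)
        seen |= component
        networks.append(sorted(component))
    return networks


print(state_networks(lambda s: 0))      # [[0, 1], [2, 3]]
print(state_networks(lambda s: 1))      # [[0, 2], [1, 3]]
print(state_networks(lambda s: s & 1))  # [[0, 1, 2, 3]]
```

In this toy, the same state set is split into different networks by different algorithms, which is the sense in which choosing the algorithm amounts to choosing the networks that cover the state set.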

Comment author: 30 August 2009 08:22:51PM 0 points

Action and observation are not "intuitively" dual; on first thought, they seem invariant under time reversal. Action is a state transition of the environment, and observation is a state transition of the agent. I can see how the duality can be suggested by viewing action as a move of the agent-player and observation as a move of the environment-player. But here the duality is that a node which in one direction was a move by A (associated with arrows to the right) is, in the other direction, a move by E (associated with arrows to the left).

Comment author: 20 August 2009 11:55:24AM 0 points

Ok, I understood this on my second reading, but I don't know what to make of it either. Why did you decide to think about agents like this, or did the idea just pop into your head and you wanted to see if it has any applications?

Comment author: 20 August 2009 12:18:42PM 1 point

It's more or less a direct rendition of the idea of UDT: actions (with state transitions) depend on the state of knowledge, so what does that say about the geometry of state transitions?

More relevant to the recent discussion: where does logical dependence come from, and how can it be tracked in a sufficiently detailed representation? The source of logical dependence, besides what comes from the common algorithm, is actions and observations. In forward time, all states following a given observation become dependent on that observation, and in backward time, so do the states preceding an action. A single observation can make multiple actions depend on it, and thus make them dependent on one another.

Connection with logic: states of knowledge in the state network are programs/proofs, and actions/observations are variables parameterizing more general programs that resolve into specific states of knowledge given these actions/observations. Also related to game semantics. This is one dimension along which to compress the knowledge representation and seek further understanding.