In my last post, I stumbled across some ideas which I thought were original, but which were already contained in UDT. I suspect that was because these ideas haven’t been given much emphasis in any of the articles I’ve read about UDT, so I wanted to highlight them here.
We begin with some definitions. Some inputs in an Input-Output map will be possible for some agents to experience, but not for others. We will describe such inputs, and the situations they represent, as conditionally consistent. Given a particular agent, we will call an input/situation compatible if the agent is consistent with the corresponding situation and incompatible otherwise. Similarly, agents consistent with a conditionally consistent input/situation are called compatible, and those that aren't are called incompatible.
We note the following points:
- UDT uses an Input-Output map instead of a Situation-Output map. It is easy to miss how important this choice is. Suppose we have an input representing a conditionally consistent situation. Asking what an incompatible agent does in such a situation is problematic, or at least difficult, since by the Principle of Explosion all such situations are equivalent. On the other hand, it is much easier to ask how the agent responds to a sequence of inputs representing an incompatible situation. The agent must respond somehow to such an input, even if only by doing nothing or crashing. Situations are also modelled (via the Mathematical Intuition Function), but the point is that UDT models inputs and situations separately.
- Given the previous point, it is convenient to define an agent's counterfactual action in an incompatible situation as its response to the input representing that situation (see the sketch after this list). For all compatible situations, this produces the same action as if we'd simply asked what the agent would do in such a situation. For conditionally consistent situations the agent is incompatible with, it explains the incompatibility: any agent that would respond a certain way to particular inputs won't be put in such a situation. (There might be conditionally consistent situations where compatibility doesn't depend on responses to inputs, i.e. only agents running particular source code are placed in a particular position, but UDT isn't designed to solve these problems.)
- Similarly, UDT predictors don’t actually predict what an agent does in a situation, but what an agent does when given an input representing a situation. This is a broader concept that allows them to predict behaviours in inconsistent situations. For a more formal explanation of these ideas, see Logical Counterfactuals for Perfect Predictors.
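Here is a minimal sketch of the distinction the points above draw. The names, the toy "offer" situation, and the helper functions are my own illustrative assumptions, not anything from the UDT literature; the point is only that once an agent is modelled purely as an Input-Output map, its counterfactual action in a situation it can never reach is just its output on the input representing that situation:

```python
from typing import Callable

# Toy model: an agent is just an Input -> Output map.
Agent = Callable[[str], str]

def counterfactual_action(agent: Agent, situation_input: str) -> str:
    """Defined for every agent, compatible or not: we never ask what the
    agent does *in* the situation, only how it responds to the input
    representing it."""
    return agent(situation_input)

def is_compatible(agent: Agent, situation_input: str, presupposed_output: str) -> bool:
    """A conditionally consistent situation only admits agents whose
    response to its input matches what the situation presupposes."""
    return counterfactual_action(agent, situation_input) == presupposed_output

# A toy situation that presupposes the agent answers "yes" to the input "offer".
agreeable: Agent = lambda _inp: "yes"
stubborn: Agent = lambda _inp: "no"

print(is_compatible(agreeable, "offer", "yes"))   # True: compatible agent
print(is_compatible(stubborn, "offer", "yes"))    # False: incompatible agent...
print(counterfactual_action(stubborn, "offer"))   # ...whose counterfactual is still "no"
```

In this picture, a predictor is anything that queries counterfactual_action rather than trying to instantiate the situation itself.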
My previous post examined a specific example in detail, although UDT handles it slightly differently.
In the example of Parfit's Hitchhiker, B is an agent for which we want to calculate a counterfactual, but we run into consistency issues. For example, B could be an agent that always defects, and we might want to counterfactually calculate what B would do in town. A would be any agent that could actually arrive in town without an inconsistency.
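To make that concrete, here is a rough sketch under my own toy labelling (the observation string and action names are illustrative, not part of the original scenario): B's counterfactual in town is just B's response to the in-town input, and that response is also exactly why B never ends up there.

```python
def agent_B(observation: str) -> str:
    """Always defects: refuses to pay no matter what input it receives."""
    return "refuse to pay"

def agent_A(observation: str) -> str:
    """Pays when it observes that it has arrived in town, so it can
    actually arrive in town without inconsistency."""
    return "pay" if observation == "you are in town" else "wait"

def counterfactual_in_town(agent) -> str:
    """The counterfactual is the agent's response to the input representing
    the in-town situation, whether or not the agent can actually reach it."""
    return agent("you are in town")

def driver_gives_lift(agent) -> bool:
    """A perfect predictor only needs the agent's response to the in-town
    input, so it can predict agents that will never actually be in town."""
    return counterfactual_in_town(agent) == "pay"

print(counterfactual_in_town(agent_A), driver_gives_lift(agent_A))  # pay True
print(counterfactual_in_town(agent_B), driver_gives_lift(agent_B))  # refuse to pay False
```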
I don't really follow this question. If you want to know what the inputs are: in my last post I pretended that the agents had direct access to an oracle with details about the situation, in addition to making observations as per the scenario. UDT assumes that the agent has a Mathematical Intuition Function, so the input consists only of real observations.
The outputs are merely the actions that the agents take in particular scenarios, or the actions that they are predicted to take if they counterfactually received a particular input.
No. Agents with the same code will always produce the same decision given the same inputs.
I don't claim that it is. What did I say that made you think I might have believed this?