In my last post, I stumbled across some ideas which I thought were original, but which were already contained in UDT. I suspect that was because these ideas haven’t been given much emphasis in any of the articles I’ve read about UDT, so I wanted to highlight them here.
We begin with some definitions. Some inputs in an Input-Output map will be possible for some agents to experience, but not for others. We will describe such inputs, and the situations they represent, as conditionally consistent. Given a particular agent, we will call an input/situation compatible if the agent is consistent with the corresponding situation and incompatible otherwise. Similarly, agents consistent with a conditionally consistent input/situation are called compatible, and those that aren't are called incompatible.
We note the following points:
- UDT uses an Input-Output map instead of a Situation-Output map. It is easy to miss how important this choice is. Suppose we have an input representing a conditionally consistent situation. Asking what an incompatible agent does in such a situation is problematic, or at least difficult, since by the Principle of Explosion all such situations are equivalent. On the other hand, it is much easier to ask how the agent responds to a sequence of inputs representing an incompatible situation. The agent must respond somehow to such an input, even if only by doing nothing or crashing. Situations are also modelled (via the Mathematical Intuition Function), but the point is that UDT models inputs and situations separately.
- Given the previous point, it is convenient to define an agent's counterfactual action in an incompatible situation as its response to the input representing that situation (see the sketch after this list). For all compatible situations, this produces the same action as if we'd simply asked what the agent would do in such a situation. For conditionally consistent situations the agent is incompatible with, it explains the incompatibility: any agent that would respond a certain way to particular inputs won't be put in such a situation. (There might be conditionally consistent situations where compatibility doesn't depend on responses to inputs, i.e. only agents running particular source code are placed in a particular position, but UDT isn't designed to solve these problems.)
- Similarly, UDT predictors don’t actually predict what an agent does in a situation, but what an agent does when given an input representing a situation. This is a broader concept that allows them to predict behaviours in inconsistent situations. For a more formal explanation of these ideas, see Logical Counterfactuals for Perfect Predictors.
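Here is a minimal sketch of the distinction the points above draw. The names, the toy "offer" situation, and the helper functions are my own illustrative assumptions, not anything from the UDT literature; the point is only that once an agent is modelled purely as an Input-Output map, its counterfactual action in a situation it can never reach is just its output on the input representing that situation:

```python
from typing import Callable

# Toy model: an agent is just an Input -> Output map.
Agent = Callable[[str], str]

def counterfactual_action(agent: Agent, situation_input: str) -> str:
    """Defined for every agent, compatible or not: we never ask what the
    agent does *in* the situation, only how it responds to the input
    representing it."""
    return agent(situation_input)

def is_compatible(agent: Agent, situation_input: str, presupposed_output: str) -> bool:
    """A conditionally consistent situation only admits agents whose
    response to its input matches what the situation presupposes."""
    return counterfactual_action(agent, situation_input) == presupposed_output

# A toy situation that presupposes the agent answers "yes" to the input "offer".
agreeable: Agent = lambda _inp: "yes"
stubborn: Agent = lambda _inp: "no"

print(is_compatible(agreeable, "offer", "yes"))   # True: compatible agent
print(is_compatible(stubborn, "offer", "yes"))    # False: incompatible agent...
print(counterfactual_action(stubborn, "offer"))   # ...whose counterfactual is still "no"
```

In this picture, a predictor is anything that queries counterfactual_action rather than trying to instantiate the situation itself.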
My previous post examined a specific example in detail, although UDT handles it slightly differently.
In the example of Parfit's Hitchhiker, B is an agent for which we want to calculate a counterfactual, but we run into consistency issues. For example, B could be an agent that always defects, and we might want to counterfactually calculate what B would do in town. A would be any agent that could actually arrive in town without an inconsistency.
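To make that concrete, here is a rough sketch under my own toy labelling (the observation string and action names are illustrative, not part of the original scenario): B's counterfactual in town is just B's response to the in-town input, and that response is also exactly why B never ends up there.

```python
def agent_B(observation: str) -> str:
    """Always defects: refuses to pay no matter what input it receives."""
    return "refuse to pay"

def agent_A(observation: str) -> str:
    """Pays when it observes that it has arrived in town, so it can
    actually arrive in town without inconsistency."""
    return "pay" if observation == "you are in town" else "wait"

def counterfactual_in_town(agent) -> str:
    """The counterfactual is the agent's response to the input representing
    the in-town situation, whether or not the agent can actually reach it."""
    return agent("you are in town")

def driver_gives_lift(agent) -> bool:
    """A perfect predictor only needs the agent's response to the in-town
    input, so it can predict agents that will never actually be in town."""
    return counterfactual_in_town(agent) == "pay"

print(counterfactual_in_town(agent_A), driver_gives_lift(agent_A))  # pay True
print(counterfactual_in_town(agent_B), driver_gives_lift(agent_B))  # refuse to pay False
```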
I don't really follow this question. If you want to know what the inputs are: in my last post I pretended that the agents had direct access to an oracle with details about the situation, in addition to making observations as per the scenario. UDT assumes that the agent has a Mathematical Intuition Function, so the input consists only of real observations.
The outputs are merely the actions that the agents take in particular scenarios, or the actions that they are predicted to take if they counterfactually received a particular input.
No. Agents with the same code will always produce the same decision given the same inputs.
I don't claim that it is. What did I say that made you think I might have believed this?