Vladimir_Nesov comments on UDT agents as deontologists - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
To the agent's builders.
ETA: I make that clear later in the post, but I'll add it to the intro paragraph.
I'm not sure what you mean. What I'm describing as coded into the agent "from birth" is Wei Dai's function P, which takes an output string Y as its argument (using subscript notation in his post).
ETA: Sorry, that is not right. To be more careful, I mean the "mathematical intuition" that takes in an input X and returns such a function P. But P isn't controlled by the agent's decisions.
ETA2: Gah. I misremembered how Wei Dai used his notation. And when I went back to the post to answer your question, I skimmed too quickly and misread.
So, final answer, when I say that "the agent always cares about all possible worlds according to how probable those worlds seemed to the agent's builders when they wrote the agent's source code", I'm talking about the "preference vector" that Wei Dai denotes by "<E1, E2, . . . >" and which he says "defines its preferences on how those programs should run."
I took him to be thinking of these entries Ei as corresponding to probabilities because of his post What Are Probabilities, Anyway?, where he suggests that "probabilities represent how much I care about each world".
ETA3: Nope, this was another misreading on my part. Wei Dai does not say that <E1, E2, . . . > is a vector of preferences, or anything like that. He says that it is an input to a utility function U, and that utility function is what "defines [the agent's] preferences on how those programs should run". So, what I gather very tentatively at this point is that the probability of each possible world is baked into the utility function U.
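For concreteness, the decision procedure being discussed can be sketched in code. This is an illustrative reconstruction, not Wei Dai's actual formulation: the names `math_intuition` and `utility_U`, and the representation of an execution-history vector <E1, E2, ...> as a tuple, are my assumptions.

```python
def udt_decision(inputs_X, candidate_outputs, math_intuition, utility_U):
    """Illustrative UDT-style choice: for each candidate output Y, the
    'mathematical intuition module' supplies a distribution P_Y over
    execution-history vectors <E1, E2, ...> of the world programs
    (represented here as a dict mapping history -> probability).
    The agent returns the Y with the highest expected utility."""
    best_Y, best_eu = None, float("-inf")
    for Y in candidate_outputs:
        P_Y = math_intuition(inputs_X, Y)
        eu = sum(p * utility_U(history) for history, p in P_Y.items())
        if eu > best_eu:
            best_Y, best_eu = Y, eu
    return best_Y
```

On this reading, the "probabilities of possible worlds" enter only through whatever P_Y the intuition module produces and through U's weighting of histories; nothing in the loop updates on observations, which is the "updateless" part.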
Very very wrong. The world program P (or what it does, anyway) is the only thing that's actually controlled in this control problem statement (more generally, a list <P1, P2, P3, ...> of programs, which could equivalently be represented by one program parametrized by an integer).
Edit: I misinterpreted the way Tyrrell used "P", correction here.
Here is the relevant portion of Wei Dai's post:
If I am reading him correctly, he uses the letter "P" in two different ways. In one use, he writes Pi, where i is an integer, to denote a world program. In the other use, he writes P_Y, where Y is an output string, to denote a probability distribution.
I was referring to the second use.
Okay, the characterization of P_Y seems right. For my reaction I blame the prior.
Returning to the original argument,
P_Y is not a description of probabilities of possible worlds conceived by agent's builder, it's something produced by "mathematical intuition module" for a given output Y (or, strategy Y if you incorporate the later patch to UDT).
You are right here. Like you, I misremembered Wei Dai's notation. See my last (I hope) edit to that comment.
I would appreciate it if you edited your comment where you say that I was "very very wrong" to say that P isn't controlled by the agent's decisions.
It's easier to have a linear discussion than to try to patch everything by re-editing it from the start (just saying, you are doing this for the third time to that poor top-level comment). You got something wrong, then I got something wrong, the errors were corrected as the discussion developed; moving on. The history doesn't need to be corrected. (I insert corrections to comments this way, without breaking the sequence.)
Thank you for the edit.