AlephNeil comments on What is Wei Dai's Updateless Decision Theory? - Less Wrong

37 points. Post author: AlephNeil 19 May 2010 10:16AM

Comment author: PhilGoetz 20 May 2010 01:53:52AM 0 points [-]

What if, in addition to freely-willed 'Player' nodes and random 'Nature' nodes, there is a third kind of node where the branch followed depends on the Player's strategy for a particular information state, regardless of whether that strategy has yet been executed? In other words, what if the universe contains 'telepathic robots' (whose behaviour is totally mechanical - they're not trying to maximize a utility function) that can see inside the Player's mind before they have acted?
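(A minimal sketch of that third node type, with made-up names; evaluate, info_state, watched_state and rule are illustrative, not from the post. The 'robot' node branches on what the strategy says at some information state, whether or not play ever reaches that state.)

```python
import random

def evaluate(node, strategy):
    """Walk a game tree given the Player's full strategy: a dict mapping
    information states to actions."""
    kind = node["kind"]
    if kind == "terminal":
        return node["payoff"]
    if kind == "player":
        # Ordinary Player node: branch on the action taken at this
        # information state.
        action = strategy[node["info_state"]]
    elif kind == "nature":
        # Ordinary Nature node: branch at random (probs aligned with the
        # order of the children).
        action = random.choices(list(node["children"]),
                                weights=node["probs"])[0]
    else:  # kind == "robot"
        # 'Telepathic robot': branches on what the strategy *would* do at
        # some (possibly never-reached) information state, mechanically,
        # with no utility function of its own.
        action = node["rule"](strategy[node["watched_state"]])
    return evaluate(node["children"][action], strategy)
```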

Why is this worth considering?

Comment author: AlephNeil 20 May 2010 07:15:45AM 0 points [-]

Because this is the 'smallest generalisation' sufficient to permit newcomblike problems (such as the three I mentioned).

Btw, don't read too much into the fact that I've called these things 'robots' because in a sense everything is a robot. What I mean is something like "an agent or machine whose algorithm-governing-behaviour is 'given to us' without us having to do any decision theory". Or if we want to stick more closely to the AI context in which Wei proposed UDT, we're just talking about "another subroutine whose source code we can inspect in order to try to figure out what it does."

Comment author: drcode 20 May 2010 01:43:44PM 1 point [-]

I interpreted the question PG was asking as, "why is it worth considering newcomb-like problems?"

(Of course, any philosophical idea is worth considering, but the question is whether this line of reasoning has any practical benefits for developing AI software)

Comment author: AlephNeil 20 May 2010 02:33:21PM 2 points [-]

Ah, I see.

I'm not really qualified to give an answer (as I don't have any background in AI) but I'll try anyway: The strategies which succeed in newcomblike problems are in a certain sense 'virtuous'. By expanding the scope of their concern from the immediate indexical 'self' to the 'world as a whole' they realise that in the long run you do better if you're 'honest', and fulfil your 'obligations'. So a decision theory which can deduce and justify the 'right' choices on such problems is desirable.

UDT reminds me of Kant's categorical imperative "Act only according to that maxim whereby you can at the same time will that it should become a universal law."

I think the way in which moral behaviour gradually emerges out of 'enlightened self-interest' is profoundly relevant to anyone interested in the intersection of ethics and AI.

Comment author: Vladimir_Nesov 20 May 2010 03:28:29PM *  3 points [-]

UDT doesn't search the environment for copies of the agent, it merely accepts a problem statement where multiple locations of the agent are explicitly stated. Thus, if you don't explicitly tell UDT that those other agents are following the same decision-making process as you do, it won't notice, even if the other agents all have source code that is equal to yours.

Edit: This is not quite right. See Wei Dai's clarification and my response.

Comment author: AlephNeil 21 May 2010 01:13:05AM *  0 points [-]

So 'my version' of UDT is perhaps brushing over the distinction between "de facto copies of the agent that were not explicitly labelled as such in the problem statement" and "places where a superbeing or telepathic robot (i.e. Omega) is simulating the agent"?

The former would be subroutines of the world-program different from S but with the same source code as S, whereas the latter would be things of the form "Omega_predict(S, argument)"? (And a 'location of the agent explicitly defined as such' would just be a place where S itself is called?)

That could be quite important...

So I wonder how all this affects decision-making. If you have an alternate version of Newcomb's paradox where rather than OmegaPredict(S) we have OmegaPredict(T) for some T with the same source code as S, does UDT two-box?
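For concreteness, a minimal sketch of that variant world program; the payoff numbers and the stubbed-out bodies of S and T are placeholders rather than anything from Wei Dai's post:

```python
def S(world_input):
    # The agent: a concrete implementation of (something like) UDT1.
    return "one-box"   # placeholder decision

def T(world_input):
    # By stipulation, T's source text is identical to S's, but T is a
    # distinct subroutine of the world program.
    return "one-box"   # placeholder decision

def OmegaPredict(program, argument):
    # Omega predicts by running the program it is handed.
    return program(argument)

def world():
    prediction = OmegaPredict(T, "Newcomb")   # Omega simulates T, not S
    box_b = 1000000 if prediction == "one-box" else 0
    choice = S("Newcomb")                     # the agent's actual choice
    return box_b if choice == "one-box" else box_b + 1000
```

The question is then whether UDT notices that T's output is logically tied to S's, even though T is never labelled as a location of the agent.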

Also, how does it square with the idea that part of what it means for an agent to be following UDT is that it has a faculty of 'mathematical intuition' by which it computes the probabilities of possible execution histories (based on the premise that its own output is Y)? Is it unreasonable to suppose that 'mathematical intuition' extends as far as noticing when two programs have the same source code?

Comment author: Vladimir_Nesov 21 May 2010 06:16:05PM 0 points [-]

Is it unreasonable to suppose that 'mathematical intuition' extends as far as noticing when two programs have the same source code?

You are right. See Wei Dai's clarification and my response.

Comment author: Vladimir_Nesov 21 May 2010 09:29:21AM *  0 points [-]

Since UDT receives the environment parametrized by the agent's source code, there is no way to tell what the agent's source code is, and so no way of stating that the environment contains another instance of the agent's source code, or a program that does the same thing as the agent's program, other than by already giving that dependence explicitly. Explicit parametrization here implies an absence of information about the parameter. UDT is in the strange situation of having to compute its own source code when, philosophically, that doesn't make sense. (And it also doesn't know its own source code when, in principle, that's not a big deal.)

So the question of whether UDT is able to work with slightly different source code passed to Omega, or the same source code labeled differently, is not in the domain of UDT; it is something decided "manually" before the formal problem statement is given to UDT.

Edit: This is not quite right. See Wei Dai's clarification and my response.

Comment author: Wei_Dai 21 May 2010 04:29:23PM 5 points [-]

[I'm writing this from a hotel room in Leshan, China, as part of a 10-day 7-city self-guided tour, which may help explain my relative lack of participation in this discussion.]

Nesov, if by UDT you mean the version I gave in the article that AlephNeil linked to in this post (which for clarity I prefer to call UDT1), it was intended that the agent knows its own source code. It doesn't explicitly look for copies of itself in the environment, but is supposed to implicitly handle other copies of itself (or predictions of itself, or, more generally, other agents/objects that are logically related to itself in some way). The way it does so apparently has problems that I don't know how to solve at this point, but it was never intended that locations of the agent be explicitly provided to the agent.

I may have failed to convey this because whenever I write out a world program for UDT1, I always use "S" to represent the agent, but S is supposed to stand for the actual source code of the agent (i.e., a concrete implementation of UDT1), not a special symbol that means "a copy of the agent". And S is supposed to know its own source code via a quining-type trick.
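(A minimal illustration of that kind of quining; this is only a sketch of the general trick, not the construction from the original post. The program rebuilds its own text inside S as the string my_source, so S "knows its own source code" without being handed it from outside.)

```python
code = 'code = %r\n\ndef S(world_input):\n    my_source = code %% code\n    return ("one-box", my_source)\n'

def S(world_input):
    my_source = code % code
    return ("one-box", my_source)

# S(None)[1] reproduces the text of the two statements above, so a
# UDT1-style agent could compare my_source against subroutines appearing
# in the world program.
```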

(I'm hoping this is enough to get you and others to re-read the original post in a new light and understand what I was trying to get at. If not, I'll try to clarify more at a later time.)

Comment author: cousin_it 26 May 2010 03:09:32AM 2 points [-]

And S is supposed to know its own source code via a quining-type trick.

This phrase got me thinking in another completely irrelevant direction. If you know your own source code by quining, how do you know that it's really your source code? How does one verify such things?

Comment author: Wei_Dai 29 May 2010 04:16:25AM 1 point [-]

Here's a possibly more relevant variant of the question: we human beings don't have access to our own source code via quining, so how are we supposed to make decisions?

My thoughts on this so far are that we need to develop a method of mapping an external description of a mathematical object to what it feels like from the inside. Then we can say that the consequences of "me choosing option A" are the logical consequences of all objects with the same subjective experiences/memories as me choosing option A.

I think the quining trick may just be a stopgap solution, and the full solution even for AIs will need to involve something like the above. That's one possibility that I'm thinking about.

Comment author: Vladimir_Nesov 26 May 2010 11:25:38AM 0 points [-]

What is the "you" that is supposed to verify that? It's certainly possible if "you" already have your source code via the quine trick, so that you just compare it with the one given to you. On the other hand, if "you" are a trivial program that is not able to do that and answers "yes, it's my source code all right" unconditionally, there is nothing to be done about that. You have to assume something about the agent.
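A small sketch of the two cases, with illustrative names:

```python
def careful_agent_accepts(quined_source, claimed_source):
    # An agent that already has its own source via the quine trick can
    # verify the claim by direct comparison.
    return quined_source == claimed_source

def trivial_agent_accepts(_claimed_source):
    # A trivial agent that answers "yes, it's my source code all right"
    # unconditionally; no check it performs can detect a false claim.
    return True
```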

Comment author: Vladimir_Nesov 21 May 2010 06:14:16PM 2 points [-]

I understand now. So UDT is secretly ambient control, expressed a notch less formally (without the concept of ambient dependence). It is specifically the toy examples you considered that take the form of what I described as "explicit updateless control", where world-programs are given essentially parametrized by the agent's source code (or the agent's decisions), and I mistook this imprecise interpretation of the toy examples for the whole picture. The search for the points from which the agent controls the world is, in UDT, essentially part of the "mathematical intuition" module, so AlephNeil got that right where I failed.

Comment author: cousin_it 20 May 2010 03:36:57PM *  2 points [-]

Newcomblike problems are not required for that. The usual story says that moral behavior emerges from repeated games.

Comment author: thomblake 20 May 2010 02:38:20PM 2 points [-]

I think the way in which moral behaviour gradually emerges out of 'enlightened self-interest' is profoundly relevant to anyone interested in the intersection of ethics and AI.

I agree, with the caveat that what applies to ethics might not apply naturally to Friendliness.

Comment author: PhilGoetz 20 May 2010 05:54:35PM 0 points [-]

Do you have a justification for choosing a decision logic that produces your morals, instead of choosing the morals provided by your decision logic?