Vladimir_Nesov comments on Another attempt to explain UDT - Less Wrong
Oh, lots of open problems remain. Here's a handy list of what I have in mind right now:
1) 2TDT-1CDT.
2) "Agent simulates predictor", or ASP: if you have way more computing power than Omega, then Omega can predict that you can obtain its decision just by simulation, so you will two-box; but obviously this isn't what you want to do.
3) "The stupid winner paradox": if two superintelligences play a demand game for $10, presumably they can agree to take $5 each to avoid losing it all. But a human playing against a superintelligence can just demand $9, knowing the superintelligence will predict his decision and settle for the remaining $1.
4) "A/B/~CON": action A gets you $5, action B gets you $10. Additionally you will receive $1 if inconsistency of PA is ever proved. This way you can't write a terminating utility() function, but can still define the value of utility axiomatically. This is supposed to exemplify all the tractable cases where one action is clearly superior to the other, but total utility is uncomputable.
5) The general case of agents playing a non-zero-sum game against each other, knowing each other's source code. For example, the Prisoner's Dilemma with asymmetrized payoffs.
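To make item 3 concrete, here is a minimal sketch of the demand game. It assumes the usual rule that incompatible demands leave both players with nothing, which the list only hints at with "losing it all". The names `payoffs` and `best_response` are made up for illustration and come from none of the discussion.

```python
POT = 10  # dollars on the table, as in item 3


def payoffs(demand_a, demand_b, pot=POT):
    """Assumed rule: compatible demands are paid out, otherwise both get 0."""
    if demand_a + demand_b <= pot:
        return demand_a, demand_b
    return 0, 0


def best_response(opponent_demand, pot=POT):
    """Best whole-dollar demand against an opponent whose demand is already fixed."""
    return max(range(pot + 1), key=lambda d: payoffs(d, opponent_demand, pot)[0])


print(payoffs(5, 5))     # two superintelligences splitting evenly
print(payoffs(9, 5))     # incompatible demands
print(best_response(9))  # what is left against a committed $9 demand
```

Under these assumptions, a player who treats the other side's $9 demand as fixed can do no better than take $1, which is what makes the outcome look paradoxical.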
I could make a separate post from this list, but I've been making way too many toplevel posts lately.
How is this not resolved? (See my comment and Eliezer's comment following it; I didn't re-read the rest of the discussion.)
This basically says that the predictor is a rock that doesn't depend on the agent's decision, which makes the agent lose because of the way the problem statement argues its way into stipulating (outside of the predictor's own decision process) that this must be a two-boxing rock rather than a one-boxing rock.
Same as (2). We stipulate the weak player to be a $9 rock. Nothing to be surprised about.
Requires the ability to reason under logical uncertainty, comparing theories of consequences and not just specific possible utilities following from specific possible actions. Under any reasonable axioms for valuation of sets of consequences, action B wins.
Without a good understanding of reasoning under logical uncertainty, this one remains out of reach.
True, it doesn't "depend" on the agent's decision in the specific sense of "dependency" defined by currently-formulated UDT. The question (as with any proposed DT) is whether that's in fact the right sense of "dependency" (between action and utility) to use for making decisions. Maybe it is, but the fact that UDT itself says so is insufficient reason to agree.
[EDIT: fixed typo]
The arguments behind UDT's choice of dependence could prove strong enough to resolve this case as well. The fact that we are arguing about UDT's answer in no way disqualifies UDT's arguments.
My current position on ASP is that reasoning used in motivating it exhibits "explicit dependence bias". I'll need to (and probably will) write another top-level post on this topic to improve on what I've already written here and on the decision theory list.
About 2TDT-1CDT: Wei didn't seem to consider it 100% solved as of this August or September, if I recall right. You'll have to ask him.
About ASP: I agree with Gary that we do not yet completely understand the implications of the fact that a human like me can win in this situation, while UDT can't.
About A/B/~CON: I'd like to see some sort of mechanical reasoning procedure that leads to the answer. You do remember that Wei's "existential" patch has been shown to not work, and my previous algorithm without that patch can't handle this particular problem, right?
(For onlookers: this exchange refers to a whole lot of previous discussion on the decision-theory-workshop mailing list. Read at your own risk.)
Both outcomes are stipulated in the corresponding unrelated decision problems. This is an example of explicit dependency bias, where you consider a collection of problem statements indexed in an arbitrary way by agents' algorithms or agents' decisions. Nothing follows from there being a collection with such-and-such consequences of picking a certain element of it. The relation between the agents and the problem statements connected in such a collection is epiphenomenal to the agents' adequacy. I should probably write up a post to that effect. Only ambient consequences count, where you are already the agent that is part of (a state of knowledge about) an environment and need to figure out what to do, for example which AI to construct and submit your decision to. Otherwise you are changing the problem, not reasoning about what to do in a given problem.
You can infer that A=>U \in {5,6} and B=>U \in {10,11}. Then, instead of only recognizing moral arguments of the form A=>U=U1, you need to be able to recognize such more general arguments. It's clear which of the two to pick.
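The comparison above can be sketched as follows. This is only an illustration, not the mechanical reasoning procedure asked for earlier in the thread: it assumes the sets of possible utilities have already been inferred, and it uses strict dominance (every possible utility of one action beats every possible utility of the other) as one candidate for a "reasonable axiom". The names `possible_utilities`, `dominates` and `choose` are hypothetical.

```python
def possible_utilities(base_payoff, bonus=1):
    """Utilities an action may yield: the base payoff, plus the $1 bonus
    if an inconsistency of PA is ever proved (unknown to the agent)."""
    return {base_payoff, base_payoff + bonus}


def dominates(x, y):
    """Assumed axiom: x beats y if its worst case exceeds y's best case."""
    return min(x) > max(y)


def choose(options):
    """Return the action whose utility set dominates all others, else None."""
    for name, utils in options.items():
        others = [u for other, u in options.items() if other != name]
        if all(dominates(utils, u) for u in others):
            return name
    return None


options = {"A": possible_utilities(5), "B": possible_utilities(10)}
print(choose(options))  # B, since min {10, 11} > max {5, 6}
```

The sketch never computes the actual utility; it only compares the sets {5, 6} and {10, 11}, which is the kind of more general argument the comment describes. How an agent derives those sets in the first place is left open here.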