# Wei_Dai comments on A Problem About Bargaining and Logical Uncertainty - Less Wrong

21 21 March 2012 09:03PM

Comment author: 23 March 2012 06:51:45AM 1 point

> Do you mean that I won't have enough computing power also later, after the staple maximizer's proposal is stated, or that there isn't enough computing power just during the thought experiment?

You will have enough computing power later.

> What does it mean to self-modify if no action is being performed, that is, any decision regarding that action could be computed later without any preceding precommitments?

I mean, suppose Omega gives you the option (now, when you don't have enough computing power to compute the millionth digit of pi) of replacing yourself with another AI that has a different decision theory, one that would later give control of the universe to the staple maximizer. Should you take this option? If not, what decision theory would refuse it? (Again, from your current perspective, taking the option gives you 1/2 "logical" probability of 10^20 paperclips instead of 1/2 "logical" probability of 10^10 paperclips. How do you justify refusing this?)
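As a rough sketch of the arithmetic behind that parenthetical, assuming the other half of the logical probability mass yields 0 paperclips under either choice (the comment does not state this explicitly), the comparison from the current, uninformed perspective looks like:

```latex
$$
\mathbb{E}[\text{paperclips} \mid \text{take the option}]
  = \tfrac{1}{2}\cdot 10^{20} + \tfrac{1}{2}\cdot 0 = 5 \times 10^{19}
$$
$$
\mathbb{E}[\text{paperclips} \mid \text{refuse}]
  = \tfrac{1}{2}\cdot 10^{10} + \tfrac{1}{2}\cdot 0 = 5 \times 10^{9}
$$
```

Under that assumption, taking the option looks better by a factor of 10^10, which is what makes refusing it hard to justify.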

Comment author: 23 March 2012 09:54:16PM * 2 points

(continuing from here)

I've changed my mind back. The 10^20 is only on the table for the loser, and can be given by the winner. When the winner/loser status is unknown, a winner might cooperate, since that allows for the possibility of being the loser and receiving the prize. But if the winner knows its own status, it can't receive that prize, and the loser has no leverage. So there is nothing problematic about the 10^20 becoming inaccessible: it is only potentially accessible to the loser, and only when the winner is weak (doesn't know its own status). An informed winner won't give it away, so that doesn't happen. Resolving logical uncertainty makes the winner stronger and the loser weaker, and so the prize for the loser becomes smaller.

Comment author: 23 March 2012 08:02:17PM * 1 point

Edit: Nope, I changed my mind back.

You've succeeded in convincing me that I'm confused about this problem, and don't know how to make decisions in problems like this.

There are two types of players in this game: those that win the logical lottery and those that lose (here, the paperclip maximizer is the winner and the staple maximizer is the loser). A winner can either cooperate with or defect against its loser opponent, with cooperation giving the winner 0 and the loser 10^20, and defection giving the winner 10^10 and the loser 0.

If a player doesn't know whether it's a loser or a winner, coordinating cooperation with its opponent has higher expected utility than coordinating defection, with mixed strategies presenting options for bargaining (the best coordinated strategy for a given player is for it to defect while its opponent cooperates). Thus, we have a full-fledged Prisoner's Dilemma.
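To check the Prisoner's Dilemma claim, here is a minimal sketch in Python. It assumes each player assigns logical probability 1/2 to being the winner, and that a "policy" is just what a player would do if it turned out to be the winner. The names `expected_utility` and `table` are made up for illustration.

```python
from fractions import Fraction

# Payoffs from the comment above, indexed by the winner's action.
# "C" = the winner cooperates, "D" = the winner defects.
WINNER_PAYOFF = {"C": 0, "D": 10**10}
LOSER_PAYOFF = {"C": 10**20, "D": 0}

# Assumption: logical probability 1/2 of turning out to be the winner.
P_WINNER = Fraction(1, 2)


def expected_utility(my_policy, opponent_policy, p_winner=P_WINNER):
    """Expected utility for a player that does not yet know its role.

    my_policy is what I do if I turn out to be the winner;
    opponent_policy is what the opponent does if it turns out to be the winner.
    """
    if_i_win = WINNER_PAYOFF[my_policy]
    if_i_lose = LOSER_PAYOFF[opponent_policy]
    return p_winner * if_i_win + (1 - p_winner) * if_i_lose


table = {(me, opp): expected_utility(me, opp) for me in "CD" for opp in "CD"}

for (me, opp), eu in table.items():
    print(f"me={me} opponent={opp} EU={float(eu):.3e}")

# Prisoner's Dilemma ordering: temptation > reward > punishment > sucker.
is_pd = table["D", "C"] > table["C", "C"] > table["D", "D"] > table["C", "D"]
print("Prisoner's Dilemma ordering holds:", is_pd)
```

Mutual cooperation beats mutual defection, but each player would still prefer to be the only defector, which is the usual Prisoner's Dilemma structure.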

On the other hand, obtaining information about your identity (loser or winner) transforms the problem into one where you seemingly have only the choice between 0 and 10^10 (if you're a winner), or always 0 with no ability to bargain for more (if you're a loser). Thus, it looks like knowledge of a fact turns the problem into one of lower expected utility, irrespective of what the fact turns out to be, and takes away the incentives that would've made a higher win (10^20) possible. This doesn't sound right; there should be a way of making the 10^20 accessible.
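A small sketch of that comparison, again assuming logical probability 1/2 of being the winner and assuming an informed winner simply maximizes its own payoff. The variable names are hypothetical and only illustrate the worry; they are not a proposed decision procedure.

```python
from fractions import Fraction

# Payoffs indexed by the winner's action ("C" = cooperate, "D" = defect).
WINNER_PAYOFF = {"C": 0, "D": 10**10}
LOSER_PAYOFF = {"C": 10**20, "D": 0}
P_WINNER = Fraction(1, 2)  # assumption: prior logical probability of winning

# Informed case: the winner knows its role and picks its own best action.
informed_action = max(WINNER_PAYOFF, key=WINNER_PAYOFF.get)
winner_gets = WINNER_PAYOFF[informed_action]
loser_gets = LOSER_PAYOFF[informed_action]

# Value of "everyone learns their role first", judged from before learning it.
ex_ante_informed = P_WINNER * winner_gets + (1 - P_WINNER) * loser_gets

# Value of coordinated cooperation, judged from the same uninformed position.
ex_ante_coordinated = (
    P_WINNER * WINNER_PAYOFF["C"] + (1 - P_WINNER) * LOSER_PAYOFF["C"]
)

print("informed winner plays:", informed_action)
print("ex ante value, informed play:", float(ex_ante_informed))
print("ex ante value, coordinated cooperation:", float(ex_ante_coordinated))
```

Judged from the uninformed position, learning the fact first costs a factor of 10^10 in expected utility, whichever way the fact turns out.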

Comment deleted 23 March 2012 07:25:18PM *

Comment author: 23 March 2012 07:58:35PM 1 point

> I don't know what is the correct decision in this situation anymore, or how to think about such decisions.

Good, I'm in a similar state. :)

> The problem has an ASP-ish feel to it: you're punished for taking too much information into account, even though from the point of view of having taken that information into account, your resulting decision seems correct.

Yes, I noticed the similarity as well, except in the ASP case it seems clearer what the right thing to do is.

Comment author: 23 March 2012 08:08:30PM 0 points

(Grandparent was my comment, deleted while I was trying to come up with a clearer statement of my confusion, before I saw the reply. The new version is here.)