cousin_it comments on What a reduction of "could" could look like - Less Wrong

53 points · Post author: cousin_it 12 August 2010 05:41PM


Comment author: cousin_it 14 August 2010 06:58:49PM *  0 points

> I mean that against every opponent program for which it can prove a higher utility for defection, it will indeed defect; e.g. against the always-cooperating program and the defection rock.

Ohhh, I managed to miss that. Thank you. This is really good.
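In toy form, the quoted defection condition looks roughly like this (my own illustration, not the actual construction under discussion: real proof search is replaced by directly computing the opponent's move, which only works for opponents as simple as these two):

```python
# Toy sketch of "defect whenever defection provably yields higher utility."
# Payoffs are the standard PD values: (my move, their move) -> my utility.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0,
          ('D', 'C'): 5, ('D', 'D'): 1}

def cooperate_bot():
    """The always-cooperating program."""
    return 'C'

def defect_bot():
    """The defection rock."""
    return 'D'

def proof_based_agent(opponent):
    """Defect iff defection gets strictly higher utility against
    this opponent's (here, trivially computable) move."""
    their_move = opponent()  # stands in for "proving" what they play
    if PAYOFF[('D', their_move)] > PAYOFF[('C', their_move)]:
        return 'D'
    return 'C'
```

With these payoffs defection strictly dominates against any fixed move, so the agent defects against both bots, as the quote says.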

About bargaining, I reason like this: if you had a program good enough at maximizing utility that it could cooperate with itself in the PD, then making that program play different games against itself would give us a canonical way to calculate a "fair" outcome for every game. But such a canonical solution is very unlikely to exist, so a utility-maximizer that plays arbitrary games well is, by implication, also very unlikely.

How do your programs play the iterated PD? Do they have memory? I don't yet understand what you mean.

Comment author: Vladimir_Nesov 14 August 2010 07:37:02PM 0 points

It seems to me that bargaining will be resolved simultaneously with the problem of deciding under uncertainty (when you can't hope to find a proof of utility being precisely U).

Comment author: cousin_it 14 August 2010 07:51:57PM *  0 points

On one hand, this sounds reasonable a priori.

On the other hand, making games "fuzzy" to solve bargaining has been tried, and it's not enough.

On the third hand, I feel that some games might be genuinely indeterminate because they abstract too much, they don't include enough information from the real-world situation - information that in practice ends up determining the outcome. For example, (instantaneous) bargaining in the Rubinstein model depends on the players' (temporal) discount rates, and if you forgot to look at them, the instantaneous game seems pretty damn indeterminate.
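The Rubinstein point can be made concrete: in the alternating-offers model, the subgame-perfect equilibrium split depends only on the players' discount factors, with the first mover receiving (1 − δ₂)/(1 − δ₁δ₂) of the pie. A quick computation (my own illustration of the standard result):

```python
# Rubinstein alternating-offers bargaining: the equilibrium split of the
# (instantaneous-looking) game is pinned down by the temporal discount
# factors d1, d2 -- exactly the information an abstracted one-shot
# bargaining game would leave out.

def rubinstein_shares(d1, d2):
    """Equilibrium shares for (proposer, responder) with discount
    factors d1, d2 in (0, 1)."""
    share1 = (1 - d2) / (1 - d1 * d2)
    return share1, 1 - share1

# Equally patient players: the proposer keeps a first-mover advantage.
print(rubinstein_shares(0.9, 0.9))   # ~ (0.526, 0.474)

# A much more patient responder extracts most of the surplus.
print(rubinstein_shares(0.9, 0.99))  # ~ (0.092, 0.908)
```

Forgetting the discount factors thus discards precisely the parameters that select the outcome.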

Comment author: Benja 14 August 2010 07:21:50PM *  0 points

My proposal can't handle the iterated PD, only games where each player has two choices. I was replying to you saying there was an obvious generalization to more than two strategies -- if there is, then we can pass in the payoff matrix of the normal form of a finitely iterated PD (and if it's a sensible generalization, it should cooperate with itself on every round).
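To make "the payoff matrix of the normal form of a finitely iterated PD" concrete: each pure strategy is a full contingency plan mapping observed histories to moves, and each matrix entry is the total payoff when two such plans play all rounds. A small sketch for the two-round case (my own illustration, with standard payoffs):

```python
from itertools import product

# (my move, their move) -> (my payoff, their payoff), standard PD values.
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

# A pure strategy for a 2-round PD is a contingency plan:
# (opening move, reply if opponent opened C, reply if opponent opened D).
STRATS = list(product('CD', repeat=3))  # 8 plans per player

def play(s1, s2):
    """Total payoffs when two contingency plans play both rounds."""
    m1, m2 = s1[0], s2[0]                # round 1
    r1 = s1[1] if m2 == 'C' else s1[2]   # round 2 replies to round 1
    r2 = s2[1] if m1 == 'C' else s2[2]
    a1, b1 = PAYOFF[(m1, m2)]
    a2, b2 = PAYOFF[(r1, r2)]
    return a1 + a2, b1 + b2

# The 8x8 normal-form matrix the comment refers to:
matrix = {(s1, s2): play(s1, s2) for s1 in STRATS for s2 in STRATS}
```

A sensible many-strategy generalization fed this matrix would need to pick the mutual-cooperation profile, scoring (6, 6), over mutual defection's (2, 2).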

Your argument about bargaining makes sense, though alas I don't know enough about bargaining to have a really informed opinion. It may be that the idea only works for sufficiently PD-like games, but if we can handle a class of them without special-casing that doesn't seem so bad. It does at least handle Chicken [Edit: and Stag Hunt] correctly, AFAICS, so it's not as if it's PD-only.

Comment author: cousin_it 14 August 2010 07:28:20PM *  0 points

Um, the obvious generalization to many strategies must "privilege" one of the strategies a priori, the same way your algorithm "privileges" cooperation. Otherwise, what single statement would the proof checker be trying to prove? I don't see a way around that.

Comment author: Benja 14 August 2010 07:48:19PM 0 points

Ah, sorry, now I understand what's going on. You are saying "there's an obvious generalization, but then you'd have to pick a 'fair' strategy profile that it would privilege." I'm saying "there's no obvious generalization which preserves what's interesting about the two-strategy case." So we're in agreement already.

(I'm not entirely without hope; I have a vague idea that we could order the possible strategies somehow, and if we can prove a higher utility for strategy X than for any strategy that is below X in the ordering, then the agent can prove it will definitely choose X or a strategy that is above it in the ordering. Or something like that. But I need to look at the details much more closely.)
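The ordering idea in the parenthetical might be sketched like this (my own reading of a deliberately vague proposal; a lookup table of "proved" utilities stands in for real proof search, and `None` marks a strategy for which no proof was found):

```python
# Rough sketch: walk the strategies from the bottom of a fixed ordering,
# and move the current candidate up to X whenever a strictly higher
# utility for X than for the candidate is "provable" (lookup succeeds).
# The final choice is then always the candidate or something above it,
# mirroring the guarantee gestured at in the comment.

def choose(ordered_strategies, provable_utility):
    candidate = ordered_strategies[0]
    for x in ordered_strategies[1:]:
        u_x = provable_utility(x)
        u_c = provable_utility(candidate)
        if u_x is not None and u_c is not None and u_x > u_c:
            candidate = x
    return candidate

# Hypothetical "proved" utilities; no proof was found for 'X'.
proved = {'D': 1, 'C': 3, 'X': None}
choose(['D', 'C', 'X'], proved.get)  # -> 'C'
```

Whether this preserves what is interesting about the two-strategy case is exactly the open question in the thread; the sketch only shows the shape of the loop.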