Benja comments on What a reduction of "could" could look like - Less Wrong

53 Post author: cousin_it 12 August 2010 05:41PM

Comment author: Benja 14 August 2010 03:29:38PM 0 points [-]

The point is that I relied on the fact that if the agent can prove "If I cooperate, I get 5" and "If I defect, I get 2", then the agent will cooperate. This is a fact, but the agent cannot prove it: if its proof system were inconsistent, then it might (within maxsteps) also be able to prove "If I defect, I get 42," and thus end up defecting.

Maybe your comment suggests that the agent should choose to cooperate as soon as it has proven the first two statements above; but as far as the agent knows, if its proof system is inconsistent, it might find the proof of "If I defect, I get 42" earlier than the proof of "If I defect, I get 2." The agent cannot prove that "I will prove C -> 5 and D -> 2" implies "I will cooperate."

Comment author: Vladimir_Nesov 14 August 2010 05:01:46PM *  1 point [-]

The agent can just cooperate as soon as it proves "If I cooperate, I get X" and "If I defect, I get Y", for X>Y. This algorithm is justified, and the agent doesn't need to write its own algorithm; we are writing it.
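A toy sketch of this rule, with everything hypothetical: real proof search over PA is replaced by a stub stream of already-proved statements, and the names `decide`, `proved_statements`, and `max_steps` are made up for illustration.

```python
# Illustrative sketch only: a "proof system" is modeled as a stream of
# already-proved statements ("C", x) meaning "If I cooperate, I get x",
# and ("D", y) meaning "If I defect, I get y".

def decide(proved_statements, max_steps):
    """Cooperate as soon as we have proved 'C -> X' and 'D -> Y' with X > Y."""
    utilities = {}  # action -> utility from the *first* proof found for it
    for step, (action, utility) in enumerate(proved_statements):
        if step >= max_steps:
            break
        utilities.setdefault(action, utility)
        if "C" in utilities and "D" in utilities and utilities["C"] > utilities["D"]:
            return "C"
    return "D"  # timeout without a qualifying pair of proofs

# A consistent system proves C -> 5 and D -> 2, so the agent cooperates:
print(decide([("C", 5), ("D", 2)], max_steps=100))  # -> C

# Benja's worry: an inconsistent system might prove "D -> 42" *first*,
# in which case the qualifying pair never appears and the agent defects:
print(decide([("D", 42), ("C", 5)], max_steps=100))  # -> D
```

The second call shows why the agent itself cannot prove it will cooperate: that would require ruling out the early spurious proof, i.e. proving its own consistency.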

Comment author: cousin_it 14 August 2010 05:36:33PM *  1 point [-]

I don't completely understand your idea, but Löbian cooperation seems to require enumerating a lot of proofs without stopping early. If Bob can stop early, Alice now needs to prove that Bob will stumble on a specific proof earlier than that, rather than at all. I remember trying and failing to remove the exhaustive search for proofs from my previous post. Or maybe I'm missing something?

Maybe you could formalize your idea and give some sort of proof? If it works, it'll be definite progress.

Comment author: Benja 14 August 2010 05:47:13PM 0 points [-]

IIUC, Vladimir is arguing that we can write an algorithm specifically for the PD that will cooperate if it finds any two proofs "C -> I get X" and "D -> I get Y", for X>Y. I think this algorithm will indeed do the right thing; do you agree?

Intuitively, the point is that both in Vladimir's algorithm and your earlier proof, it is sufficient to know something of the form "there exists a proof such that..." to conclude that the program will have a certain behavior (cooperate), whereas in my purported proof, you need to know something of the form "there exists no proof such that...", aka "for all proofs, it holds that..." (which you'd also need in both Vladimir's proposal and your earlier proof to show that one of the agents defects).

Comment author: cousin_it 14 August 2010 05:59:11PM *  0 points [-]

Ah. Now I see where the actual work is being done. Vladimir, sorry, I retract my request for clarification.

The proposed algorithm will probably work for the usual Löbian reasons (don't have the energy to write a formal proof right now, though). But it is not utility maximization. This algorithm asymmetrically "privileges" cooperation over defection: finding a proof that cooperation is better will make it cooperate, but finding a proof that defection is better won't make it defect. So in spirit it's pretty similar to my old algorithm, and we probably won't find a solution to game theory on these lines.

Comment author: Vladimir_Nesov 14 August 2010 07:58:42PM *  0 points [-]

Conjecture:

If you just look for the proofs of statements of the form

(action=A1 => U=U1) AND (action=A2 => U=U2) AND (U1>=U2)

where U1 and U2 are variables, and A1 and A2 are different constants (cooperate and defect, in some order), up to some sufficient timeout (after which you act randomly), and perform A1 the moment you see the first such proof, then you'll play correctly against your near-copy (one with a different timeout or syntactic differences), and against defecting and cooperating rocks.
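A minimal sketch of the conjectured algorithm, under the same caveat as above: actual proof search in PA is replaced by a hypothetical stream `proved_conjunctions` of tuples `(A1, U1, A2, U2)`, each standing for a proved statement of the stated form; `play` and `timeout` are made-up names.

```python
import random

def play(proved_conjunctions, timeout, rng=None):
    """Perform A1 on the first proved statement
    (action=A1 => U=U1) AND (action=A2 => U=U2) AND (U1 >= U2);
    act randomly after the timeout."""
    rng = rng or random.Random(0)
    for step, (a1, u1, a2, u2) in enumerate(proved_conjunctions):
        if step >= timeout:
            break
        if u1 >= u2:
            return a1  # act the moment the first qualifying proof appears
    return rng.choice(["C", "D"])  # timeout: act randomly

# Per the conjecture, against a near-copy the Löbian proof with A1 = C
# turns up before the timeout:
print(play([("C", 3, "D", 1)], timeout=10))  # -> C
# Against a defecting rock, only the proof with A1 = D is available:
print(play([("D", 1, "C", 0)], timeout=10))  # -> D
```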

Comment author: Benja 14 August 2010 06:16:42PM *  0 points [-]

This algorithm asymmetrically "privileges" cooperation over defection

Now that I disagree with! If the algorithm finds a proof that defection is better, it will defect, because the proof system is consistent, so the algorithm can't in this case find a proof that cooperation is better. (It's true that it cannot prove that it will defect, of course.) In fact I'd say the algorithm privileges defection, because it'll defect unless it finds a reason to cooperate.

I don't see any straightforward generalization to arbitrary games, but a natural generalization to games where every player has only two choices springs to mind. Take as parameters the payoff matrix and the other players' source code; then play minimax unless you can prove that playing the other strategy will result in a higher payoff. [Edit: Er, unless the utility you can prove you get under the assumption that you play the other strategy is higher than etc. etc. -- you know what I mean.] Haven't yet looked into whether this actually does give sensible answers to non-PD games, though. *goes off to do just that*
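A sketch of this two-choice generalization, with the usual caveat: the proof search is replaced by a hypothetical dict `proved` of utilities provable under the assumption of playing each strategy, and `choose` is a made-up name.

```python
def choose(payoffs, proved):
    """
    payoffs: {my_strategy: {their_strategy: my_utility}} for a 2x2 game.
    proved:  {my_strategy: utility} -- stub for proof search: the utility
             provably obtained under the assumption of playing that
             strategy (absent if no proof was found in time).
    """
    # Security (maximin) strategy: maximize my worst-case payoff.
    maximin = max(payoffs, key=lambda s: min(payoffs[s].values()))
    other = next(s for s in payoffs if s != maximin)
    # Deviate only if a provably higher utility attaches to the other strategy.
    if maximin in proved and other in proved and proved[other] > proved[maximin]:
        return other
    return maximin

# Prisoner's Dilemma payoffs for the row player (C/D):
pd = {"C": {"C": 3, "D": 0}, "D": {"C": 5, "D": 1}}
# Maximin is D; against a near-copy, proof search finds C -> 3 and D -> 1,
# so the agent deviates to C:
print(choose(pd, {"C": 3, "D": 1}))  # -> C
# Against a defection rock, only C -> 0 and D -> 1 are provable:
print(choose(pd, {"C": 0, "D": 1}))  # -> D
```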

Comment author: cousin_it 14 August 2010 06:33:02PM *  1 point [-]

Now that I disagree with! If the algorithm finds a proof that defection is better, it will defect

What on Earth do you mean by this counterfactual? :-)

The obvious extension of your proposed algorithm to games with many players and many strategies: assume any commonly-accepted method of finding a "fair strategy profile", attempt Löbian cooperation to reach it, otherwise play minimax. This doesn't solve fairness/bargaining, and we all are quite familiar with this sort of trick already.

Essentially, what you're doing is "porting" old ideas from one enforcement method (quining) to another (Löbian cooperation), hoping that fairness will magically pop out this time. No. Fairness and enforcement are orthogonal issues. I'm really frustrated that people stubbornly fail to see fairness as the serious obstacle to "solving" game theory. There's no cheap accidental way around it.

ETA: on rereading, this comment sounded harsh. Please don't take it personally - I just feel that at this point in the discussion, it's time to forcefully state some of my background intuitions so other people can break them down more easily.

Comment author: Benja 14 August 2010 06:56:05PM *  1 point [-]

What on Earth do you mean by this counterfactual? :-)

It's not a counterfactual, it's a logical implication, so it's true! :-) No, seriously, if it wasn't clear, I mean that against every opponent program for which it can prove a higher utility for defection, it will indeed defect; e.g. against the always-cooperating program and the defection rock.

My intuition is that the algorithm above precisely does not encode a notion of a "fair strategy profile"; it just tries to maximize its own utility. I suppose that's a fuzzy notion, though. I wasn't arguing that it solves bargaining. An extension to many strategies should cooperate in the finitely iterated PD without wiring in any more notion of "fairness" than in the algorithm above, and without special-casing the IPD -- if that's indeed simple, I'm curious how it works; it's not obvious to me.

Comment author: cousin_it 14 August 2010 06:58:49PM *  0 points [-]

I mean that against every opponent program for which it can prove a higher utility for defection, it will indeed defect; e.g. against the always-cooperating program and the defection rock.

Ohhh, I managed to miss that. Thank you. This is really good.

About bargaining, I reason like this: if we had a program that's good enough at maximizing utility that it can cooperate with itself in the PD, then making this program play different games against itself would give us a canonical way to calculate a "fair" outcome for every game. But that is very unlikely, so a utility-maximizer that plays arbitrary games well is, by implication, also very unlikely.

How do your programs play the iterated PD? Do they have memory? I can't understand what you mean yet.

Comment author: Vladimir_Nesov 14 August 2010 07:37:02PM 0 points [-]

It seems to me that bargaining will be resolved simultaneously with the problem of deciding under uncertainty (when you can't hope to find a proof of utility being precisely U).

Comment author: Benja 14 August 2010 07:21:50PM *  0 points [-]

My proposal can't handle the iterated PD, only games where each player has two choices. I was replying to you saying there was an obvious generalization to more than two strategies -- if there is, then we can pass in the payoff matrix of the normal form of a finitely iterated PD (and if it's a sensible generalization, it should cooperate with itself on every round).

Your argument about bargaining makes sense, though alas I don't know enough about bargaining to have a really informed opinion. It may be that the idea only works for sufficiently PD-like games, but if we can handle a class of them without special-casing that doesn't seem so bad. It does at least handle Chicken [Edit: and Stag Hunt] correctly, AFAICS, so it's not as if it's PD-only.

Comment author: Benja 14 August 2010 08:29:30PM *  0 points [-]

Essentially, what you're doing is "porting" old ideas from one enforcement method (quining) to another (Löbian cooperation), hoping that fairness will magically pop out this time.

I'm not sure when exactly you added this, is this still what you think I'm doing? What I'm trying to do is to give an algorithm that correctly handles games "where the other agents' decisions can be viewed as functions of one argument, that argument being your own choice in that particular case" (same Eliezer quote as upthread), without special-casing for the details of particular games like you did with your solution of the PD. I.e., I'm trying to do what I at first thought your original post did, but what you explained to me it doesn't. AFAICS, if Eliezer's informal arguments are correct, these games have a clear correct solution without solving bargaining.

[The ETA's duly noted :-)]

Comment author: cousin_it 15 August 2010 02:25:13AM *  0 points [-]

Can you describe this class of games precisely? And can you define the solution precisely, without referring to our algorithms?

Comment author: Benja 15 August 2010 02:38:30PM 0 points [-]

I can describe a class of such games and its solution, but I'd hope that if a good decision theoretical agent exists that solves this class, it might also solve some wider class of problems in an intuitively correct way. -- That said, the class is symmetric games, and the solution is the strategy profile that yields the highest utility among those strategy profiles in which all players choose the same strategy. [More precisely, a good algorithm should choose this strategy when playing against itself or a similar algorithm, should maximize its utility when playing against an opponent that unconditionally plays some given strategy, and should "do something sensible" in other cases.]
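The proposed solution concept is simple enough to spell out directly; a sketch, with `symmetric_solution` a made-up name and the game given by the row player's payoff matrix:

```python
def symmetric_solution(payoff):
    """For a symmetric game given by the row player's payoff matrix
    payoff[i][j], return the strategy i maximizing payoff[i][i] --
    i.e. the best profile among those where all players play alike."""
    return max(range(len(payoff)), key=lambda i: payoff[i][i])

# PD with strategies 0 = cooperate, 1 = defect: the diagonal payoffs are
# (C,C) = 3 and (D,D) = 1, so the target profile is mutual cooperation:
print(symmetric_solution([[3, 0], [5, 1]]))  # -> 0
```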

Comment author: Vladimir_Nesov 14 August 2010 07:32:56PM *  0 points [-]

Now that I disagree with! If the algorithm finds a proof that defection is better, it will defect, because the proof system is consistent, so the algorithm can't in this case find a proof that cooperation is better.

I don't see what theory you are referring to here when you say "the proof system is consistent". We have an algorithm that proves some theorems in PA; the only theory under discussion is PA. If the algorithm finds a proof that defection is better, it defects, period.

Comment author: Benja 14 August 2010 08:00:13PM 0 points [-]

If PA is the system the agent uses, then by "the proof system," I mean PA. I was speaking generically, in case you'd prefer to use ZFC or PA + Con(PA) or whatever.

If the algorithm finds a proof that defection is better, it defects, period.

Er, are you saying that as soon as the algorithm finds such a proof, it stops searching and returns "defect"? That will not work, because then the Löbian reasoning won't go through, for the reason cousin_it gave. [Because PA cannot prove Con(PA), it also cannot prove that "there's a short proof that cooperate is better" implies "the agent cooperates" -- for all PA knows, PA might be inconsistent, and there might be an even shorter proof that defect is better.]

Or do you mean that the algorithm looks through the whole list of proofs, and if it never finds a proof that cooperate is better, it defects? Then we're in agreement; I was simply pointing out that in this case, we know that if the algorithm proves "defect is better," then it will not cooperate, because we know that PA is consistent, so if the algorithm proves "defect is better" then it will not prove "cooperate is better."

Comment author: Vladimir_Nesov 14 August 2010 08:27:28PM *  0 points [-]

Because PA cannot prove Con(PA), it also cannot prove that "there's a short proof that cooperate is better" implies "the agent cooperates"

It can enumerate all shorter proofs and see whether any of them imply a different action by the agent. (Note that PA is reflexive, that is, it can prove that any given finite set of its statements is logically consistent.)

Comment author: Benja 14 August 2010 08:57:17PM 0 points [-]

Can you give a link or citation for "reflexive"? I can't figure out what this means. It seems to me that if PA can prove every finite subset of its axioms consistent, then it can prove itself consistent: "If PA is inconsistent, then there would be a statement A and a proof of A and a proof of NOT A; but each proof only uses a finite set of axioms; the union of two finite sets is finite; so there is a finite set of axioms of PA that proves both A and NOT A; but every finite subset of PA is consistent, contradiction. Ergo, PA is consistent." Clearly, I'm misunderstanding something.

(PA can't talk about sets in general, but finite sets are fine, and "finite subset of PA" just means "finite set of propositions, all of which are in the enumeration that is PA.")

Comment author: Benja 14 August 2010 08:40:49PM 0 points [-]

I don't understand. The relevant step in the argument, as far as I understand it, requires an inference from "there exists a proof in n steps that cooperate is better" to "the agent cooperates". It seems to me that "enumerating all shorter proofs" would require knowing the precise proof (or at least its length), not just the statement "there exists such a proof". But I'm probably not following what, exactly, you have in mind; could you expand on your argument?

[I should say I was wrong to say "it ... cannot prove that" above -- all I can say is that one particular technique for showing that it can prove it doesn't go through.]