Benja comments on What a reduction of "could" could look like - Less Wrong

53 Post author: cousin_it 12 August 2010 05:41PM


Comment author: Benja 14 August 2010 02:30:40PM *  0 points

But in general, I don't believe this sort of reasoning carries over to multi-player games. If we take the Prisoner's Dilemma and tweak the payoffs slightly to make them non-symmetric, the agents are back to square one.

I don't understand. Do you mean changing the numeric values, leaving the game the same, or changing the ordering, turning it into a different game? In the first case, it seems to me that the argument from your other post should work just fine [Edit: Wrong, see cousin_it's explanation below] -- roughly:

Suppose there's a short proof that agent1()=="C" iff agent2()=="cooperate". Then there's an only slightly longer proof that agent1()=="C" implies that both agents get the C/C payoff, and that agent1()=="D" implies both get the D/D payoff. Therefore, under the supposition, agent1 will choose "C"; similarly, agent2 will choose "cooperate".

In other words: If there's a short proof that agent1()=="C" iff agent2()=="cooperate", then agent1()=="C" and agent2()=="cooperate". By the Löbian reasoning in your previous post (linked above), it follows that both agents cooperate.
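The Löbian step here is an application of Löb's theorem; schematically (ignoring the proof-length bounds that the bounded version in cousin_it's post actually requires):

```latex
% Löb's theorem, reading \Box P as "PA proves P":
\mathrm{PA} \vdash (\Box P \to P)
\quad\Longrightarrow\quad
\mathrm{PA} \vdash P
% Roughly, take P to be the statement that both agents cooperate:
% the purported argument gives \Box P \to P, and Löb's theorem yields P.
```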

Am I confused about something? Or is it just that you were in fact talking about changing the payoffs so that the game is not a PD any longer? (I think it still at least bears mentioning if your work already solves the class of problems that Eliezer has been talking about, "where the other agents' decisions can be viewed as functions of one argument, that argument being your own choice in that particular case.")

Comment author: cousin_it 14 August 2010 02:45:47PM *  1 point

This post does solve the class of problems Eliezer was talking about (I call it the "fair" class), but my previous Löbian post used a different algorithm that didn't do utility maximization. That algorithm explicitly said that it cooperates if it can find any proof that agent1()==agent2(), without comparing utilities or trying to prove other statements, and that explicitness made the proof work in the first place.

Your proof fails here:

Therefore, under the supposition, agent1 will choose "C"; similarly, agent2 will choose "cooperate".

It's extremely difficult for the agent to conclude that it will make any particular choice, because any proof of that (starting from your assumptions or any other assumptions) must also prove that the agent won't stumble on any other proofs that lead to yet higher outcomes. This amounts to the agent assuming the consistency of its own proof checker, which is a no go.
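The "no go" invoked here is Gödel's second incompleteness theorem, which in this notation reads:

```latex
% Gödel's second incompleteness theorem: if PA is consistent, then
% PA cannot prove the statement Con(PA) formalizing its own consistency.
\mathrm{PA} \nvdash \mathrm{Con}(\mathrm{PA}),
\qquad \mathrm{Con}(\mathrm{PA}) \;:=\; \neg \Box_{\mathrm{PA}} \bot
```

An agent whose proof system includes PA therefore cannot rule out, from the inside, that some spurious proof of a better outcome exists.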

As far as I can see right now, making the payoffs slightly non-symmetric breaks everything. If you think an individual utility-maximizing algorithm can work in general game-theoretic situations, you have a unique canonical solution for all equilibrium selection and bargaining problems with non-transferable utility. Judging from the literature, finding such a unique canonical solution is extremely unlikely. (The transferable utility case is easy - it is solved by the Shapley value and my Freaky Fairness algorithm.)
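As an aside on the transferable-utility case: the Shapley value cousin_it mentions is straightforward to compute for small games. A sketch, using the standard three-player "glove game" as an example (this example is mine, not from the thread):

```python
from itertools import permutations
from fractions import Fraction

def shapley(players, v):
    """Shapley value: each player's average marginal contribution
    over all orders in which the players could join the coalition."""
    values = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = set()
        for p in order:
            marginal = v(coalition | {p}) - v(coalition)
            values[p] += Fraction(marginal)
            coalition.add(p)
    return {p: values[p] / len(orders) for p in players}

def glove_value(coalition):
    # Player 1 owns a left glove, players 2 and 3 each own a right
    # glove; a coalition is worth 1 iff it can assemble a pair.
    return 1 if 1 in coalition and (2 in coalition or 3 in coalition) else 0

print(shapley([1, 2, 3], glove_value))
# player 1 gets 2/3; players 2 and 3 get 1/6 each
```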

Comment author: Vladimir_Nesov 14 August 2010 03:06:57PM *  0 points

It's extremely difficult for the agent to conclude that it will make any particular choice, because any proof of that (starting from your assumptions or any other assumptions) must also prove that the agent won't stumble on any other proofs that lead to yet higher outcomes.

When does the agent need to have that knowledge? It just acts on certain conclusions at the moment it obtains them. It can just as well act on conclusions1 about its future conclusions2; the only problem might be that it never reaches those conclusions1 (and so won't act on them).

Comment author: Benja 14 August 2010 03:29:38PM 0 points

The point is that I relied on the fact that if the agent can prove "If I cooperate, I get 5" and "If I defect, I get 2", then the agent will cooperate. This is a fact, but the agent cannot prove it: if its proof system were inconsistent, then it might (within maxsteps) also be able to prove "If I defect, I get 42," and thus end up defecting.

Maybe your comment suggests that the agent should choose to cooperate as soon as it has proven the first two statements above; but as far as the agent knows, if its proof system is inconsistent, it might find the proof of "If I defect, I get 42" earlier than the proof of "If I defect, I get 2." The agent cannot prove that "I will prove C -> 5 and D -> 2" implies "I will cooperate."

Comment author: Vladimir_Nesov 14 August 2010 05:01:46PM *  1 point

The agent can just cooperate as soon as it proves "If I cooperate, I get X" and "If I defect, I get Y", for X>Y. This algorithm is justified, and the agent doesn't need to write its own algorithm; we are writing it.
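A minimal sketch of this rule, with the proof search replaced by a stub that hands the agent "proved" statements one at a time (the enumeration order, payoff numbers, and the defect fallback are illustrative assumptions, not part of the thread):

```python
def nesov_agent(proof_stream, timeout, fallback="D"):
    """Cooperate the moment proofs of 'C -> X' and 'D -> Y' with X > Y
    have both been found; otherwise play the fallback at timeout.
    Each element of proof_stream is a pair (action, utility), standing
    for a proved statement 'if I play action, I get utility'."""
    proved = {}
    for step, (action, utility) in enumerate(proof_stream):
        if step >= timeout:
            break
        proved.setdefault(action, utility)  # act on the first proof found
        if "C" in proved and "D" in proved and proved["C"] > proved["D"]:
            return "C"
    return fallback

# With a consistent stream of proofs, the agent cooperates:
print(nesov_agent([("C", 5), ("D", 2)], timeout=10))  # "C"
# Benja's worry above: if an inconsistent system delivered
# "If I defect, I get 42" first, the cooperate branch is never taken:
print(nesov_agent([("D", 42), ("C", 5), ("D", 2)], timeout=10))  # "D"
```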

Comment author: cousin_it 14 August 2010 05:36:33PM *  1 point

I don't completely understand your idea, but Löbian cooperation seems to require enumerating a lot of proofs without stopping early. If Bob can stop early, Alice now needs to prove that Bob will stumble on a specific proof earlier than that, rather than at all. I remember trying and failing to remove the exhaustive search for proofs from my previous post. Or maybe I'm missing something?

Maybe you could formalize your idea and give some sort of proof? If it works, it'll be definite progress.

Comment author: Benja 14 August 2010 05:47:13PM 0 points

IIUC, Vladimir is arguing that we can write an algorithm specifically for the PD that will cooperate if it finds any two proofs "C -> I get X" and "D -> I get Y", for X>Y. I think this algorithm will indeed do the right thing, do you agree?

Intuitively, the point is that both in Vladimir's algorithm and your earlier proof, it is sufficient to know something of the form "there exists a proof such that..." to conclude that the program will have a certain behavior (cooperate), whereas in my purported proof, you need to know something of the form "there exists no proof such that...", aka "for all proofs, it holds that..." (which you'd also need in both Vladimir's proposal and your earlier proof to show that one of the agents defects).

Comment author: cousin_it 14 August 2010 05:59:11PM *  0 points

Ah. Now I see where the actual work is being done. Vladimir, sorry, I retract my request for clarification.

The proposed algorithm will probably work for the usual Löbian reasons (don't have the energy to write a formal proof right now, though). But it is not utility maximization. This algorithm asymmetrically "privileges" cooperation over defection: finding a proof that cooperation is better will make it cooperate, but finding a proof that defection is better won't make it defect. So in spirit it's pretty similar to my old algorithm, and we probably won't find a solution to game theory on these lines.

Comment author: Vladimir_Nesov 14 August 2010 07:58:42PM *  0 points

Conjecture:

If you just look for the proofs of statements of the form

(action=A1 => U=U1) AND (action=A2 => U=U2) AND (U1>=U2)

where U1 and U2 are variables, and A1 and A2 are different constants (cooperate and defect in some order), up to some sufficient timeout (after which you act randomly), and perform A1 the moment you see the first such proof, then you'll play correctly against your near-copy (with a different timeout or syntactic differences), and against defecting and cooperating rocks (opponents that unconditionally defect or cooperate).
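Against a rock the conjecture is easy to check concretely, since both implications are provable by direct computation; a toy rendering (the payoff numbers are the usual illustrative PD values, not from the thread):

```python
# Standard PD payoffs for the row player (illustrative numbers).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play_against_rock(rock_move):
    """Against an opponent whose move is a constant, the statements
    'action=A1 => U=U1' and 'action=A2 => U=U2' follow by computing
    the payoff table; perform A1 the moment U1 >= U2, per the conjecture."""
    for a1, a2 in [("C", "D"), ("D", "C")]:
        u1, u2 = PAYOFF[(a1, rock_move)], PAYOFF[(a2, rock_move)]
        if u1 >= u2:
            return a1
    return "C"  # timeout fallback (never reached for these payoffs)

print(play_against_rock("C"))  # "D": defects against a cooperating rock
print(play_against_rock("D"))  # "D": defects against a defecting rock
```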

Comment author: Benja 14 August 2010 06:16:42PM *  0 points

This algorithm asymmetrically "privileges" cooperation over defection

Now that I disagree with! If the algorithm finds a proof that defection is better, it will defect, because the proof system is consistent, so the algorithm can't in this case find a proof that cooperation is better. (It's true that it cannot prove that it will defect, of course.) In fact I'd say the algorithm privileges defection, because it'll defect unless it finds a reason to cooperate.

I don't see any straightforward generalization to arbitrary games, but a natural generalization to games where every player has only two choices springs to mind. Take as parameters the payoff matrix and the other players' source code; then play minimax unless you can prove that playing the other strategy will result in a higher payoff. [Edit: Er, unless the utility you can prove you get under the assumption that you play the other strategy is higher than etc etc... you know what I mean.] Haven't yet looked into whether this actually does give sensible answers to non-PD games, though. *goes off to do just that*
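For the two-choice case, the minimax default can be read as the pure security strategy: pick the action with the best worst-case payoff. A sketch under that reading (payoff numbers again illustrative):

```python
def security_action(payoff, actions=("C", "D")):
    """Pure maximin: choose the action whose worst-case payoff
    (over the opponent's choices) is largest."""
    return max(actions, key=lambda a: min(payoff[(a, b)] for b in actions))

# Usual PD payoffs for the row player (illustrative numbers):
pd = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
print(security_action(pd))  # "D": worst case 1 beats worst case 0
```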

Comment author: cousin_it 14 August 2010 06:33:02PM *  1 point

Now that I disagree with! If the algorithm finds a proof that defection is better, it will defect

What on Earth do you mean by this counterfactual? :-)

The obvious extension of your proposed algorithm to games with many players and many strategies: assume any commonly-accepted method of finding a "fair strategy profile", attempt Löbian cooperation to reach it, otherwise play minimax. This doesn't solve fairness/bargaining, and we all are quite familiar with this sort of trick already.

Essentially, what you're doing is "porting" old ideas from one enforcement method (quining) to another (Löbian cooperation), hoping that fairness will magically pop out this time. No. Fairness and enforcement are orthogonal issues. I'm really frustrated that people stubbornly fail to see fairness as the serious obstacle to "solving" game theory. There's no cheap accidental way around it.

ETA: on rereading, this comment sounded harsh. Please don't take it personally - I just feel that at this point in the discussion, it's time to forcefully state some of my background intuitions so other people can break them down more easily.

Comment author: Vladimir_Nesov 14 August 2010 07:32:56PM *  0 points

Now that I disagree with! If the algorithm finds a proof that defection is better, it will defect, because the proof system is consistent, so the algorithm can't in this case find a proof that cooperation is better.

I don't see what theory you are referring to here in saying "the proof system is consistent". We have an algorithm that proves some theorems in PA. The only theory under discussion is PA. If the algorithm finds a proof that defection is better, it defects, period.

Comment author: Benja 14 August 2010 02:55:55PM 0 points

It's extremely difficult for the agent to conclude that it will make any particular choice, because any proof of that (starting from your assumptions or any other assumptions) must also prove that the agent won't stumble on any other proofs that lead to yet higher outcomes.

I.e., that the agent won't find (contradictory) proofs that the same actions will lead to different, even higher utilities. Right, thanks.

Comment author: Vladimir_Nesov 14 August 2010 02:55:16PM 0 points

(The following sounds too confusing, but making it clearer doesn't seem straightforward. Posting just in case communication succeeds on some level.)

Payoffs are part of preference, preference is (aspect of) world program, and world program determines the outcome. By changing the payoffs, you change the world program, and where decision-making is based on logical inferences about it, you can thus fundamentally change the shape of possible inferences. "Nice" solutions mean parametrization of the solution by elements of the problem statement, but here parametrization is bound to talk about what can be logically inferred, so expect no nice solutions.

Comment author: Vladimir_Nesov 14 August 2010 05:05:37PM 0 points

Suppose there's a short proof that agent1()=="C" iff agent2()=="cooperate". [...] Therefore, under the supposition, agent1 will choose "C"; similarly, agent2 will choose "cooperate".

Sounds right (but it's a rather strong supposition).

Comment author: Benja 14 August 2010 05:52:22PM 0 points

Sorry, I wasn't being as clear as I should have been:

The above is true as a metatheorem, but uninteresting, because as you say, the supposition is rather strong.

What I was trying to argue is that in the agent's proof system (for concreteness, say PA), when assuming the supposition, the conclusion follows, so by the deduction theorem, PA proves "the supposition implies the conclusion". I wanted to use this for a proof-size-bounded Löbian argument à la cousin_it.

But this does not work because in PA, the conclusion does not follow when assuming the supposition, as pointed out by cousin_it.