cousin_it comments on What a reduction of "could" could look like - Less Wrong

Post author: cousin_it, 12 August 2010 05:41PM (53 points)


Comment author: cousin_it 13 August 2010 05:18:04AM *  0 points

The implementation of agent() given in the post is guaranteed to terminate. It enumerates proofs about itself and the world up to a specified maximum proof length; it doesn't call itself recursively.
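As a rough illustration, here is a toy Python sketch of that scheme. The proof search is stubbed out as an explicit set of "proven" (action, utility) pairs; a real implementation would run a proof checker over statements of the form "agent()==A implies world()==U" up to the length bound. All names here are illustrative, not from the post.

```python
# Toy sketch of the agent() described above: enumerate proven statements
# "agent()==A implies world()==U" (stubbed as (action, utility) pairs),
# then return the action with the highest proven utility.

def agent(provable, actions, default="D"):
    """provable: set of (action, utility) pairs standing in for proofs
    found within the length bound; actions: the legal actions."""
    best = None
    for action, utility in provable:
        if action in actions and (best is None or utility > best[1]):
            best = (action, utility)
    # If no proof was found at all, fall back to a default action
    # (cousin_it suggests a random or zero action in that case).
    return best[0] if best else default

# Toy run: suppose the checker proved C->5 and D->2.
print(agent({("C", 5), ("D", 2)}, {"C", "D"}))  # "C"
print(agent(set(), {"C", "D"}))                 # "D" (no proofs found)
```

The termination guarantee is visible in the structure: the loop is over a finite set of proofs, with no recursive call back into agent() or world().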

Comment author: Perplexed 13 August 2010 06:00:08AM 0 points

Ok, I had been mistakenly assuming that it constructed the proofs by simulating world(), which would necessarily involve recursive calls back to agent(), at least in a multi-player game.

My bad. But I am still not convinced that agent() is guaranteed to terminate. Suppose, for example, that it cannot find any proofs of the form "agent()==A implies world()==U"?

Also, in my example in which world() is supposed to compute the results of a Prisoner's Dilemma between two identical agents, is the "knowledge" that the two agents are indeed identical supposed to reside in world() or in agent()? I suspect that the right way to extend this thought experiment to the multi-player game scenario is to keep world() in the dark as to whether agent1() and agent2() are identical algorithms. Don't hard-code this fact into the agents, either. I think you have to let the two agents discover that they are identical to each other. The thing is, this "understanding" has to be achieved by proving things about proofs. So, I think that modal logic slips back in anyways.
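To make the question concrete, a toy world() for the symmetric PD might look like the sketch below. Note where the "knowledge" of identity lives: world() never asserts agent1 == agent2; the two functions merely happen to have identical source text, and any proof of that fact has to come from inspecting the program. The payoff numbers are the usual illustrative PD values, not from the thread.

```python
# Hypothetical world() for the symmetric PD: it is "in the dark" about
# whether the agents are identical; it just runs both and pays out.

PAYOFFS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def agent1():
    return "C"   # stand-in body; imagine the proof-searching agent here

def agent2():
    return "C"   # textually identical to agent1, but nothing declares that

def world():
    return PAYOFFS[(agent1(), agent2())]

print(world())  # (3, 3)
```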

Comment author: cousin_it 13 August 2010 06:07:48AM *  2 points

It's true that the program might fail to find any proofs at all. Then it could output a random action, I guess? Or just return zero?

Puzzled about your last paragraph. The "knowledge" that the agents are equivalent has to be derivable by inspecting the program as a whole. (This is a very strong condition, but it can be satisfied.) Proving things about proofs is something proof checkers with sufficiently strong axioms can do just fine, no modal logic needed.

But in general, I don't believe this sort of reasoning carries over to multi-player games. If we take the Prisoner's Dilemma and tweak the payoffs slightly to make them non-symmetric, the agents are back to square one. Game theory is inherently harder than individual rationality. I really should write another post on that, because it seems to be a Pons Asinorum that many intelligent people fail to cross.

Comment author: Perplexed 13 August 2010 03:12:17PM 1 point

> Puzzled about your last paragraph.

Seems to me you understood it and responded appropriately.

> Game theory is inherently harder than individual rationality.

Yes it is. One important difference is that game-playing agents generally use mixed strategies. So we need to distinguish reading the strategy source code of another player from reading the pseudo-random-number source code and "seed". Rational agents will want to keep secrets.

A second difference is that the "preference structure" of the agent ought to be modeled as data about the agent, rather than data about the world. And some of this is information which a rational agent will wish to conceal as well.

So, as a toy model illustrating the compatibility of determinism and free will, your example (i.e. this posting) works fine. But as a way of modeling a flavor of rationality that "solves" one-shot PD and Newcomb-like problems in a "better" way than classical game theory, I think it doesn't (yet?) work.

Comment author: cousin_it 13 August 2010 03:21:21PM *  2 points

Slightly tangential: I have a semi-worked-out notion of a "fair" class of problems which includes Newcomb's Problem because the predictor makes a decision based only on your action, not the "ritual of cognition" that preceded it (to borrow Eliezer's phrase). But the one-shot PD doesn't reside in this "fair" class because you don't cooperate by just predicting the opponent's action. You need to somehow tie the knot of infinite recursion, e.g. with quining or Löbian reasoning, and all such approaches seem to inevitably require inspecting the other guy's "ritual of cognition". This direction of research leads to all sorts of clever tricks, but ultimately it might be a dead end.

As for your main point, I completely agree. Any model that cannot accommodate limited information is bound to be useless in practice. I'm just trying to solve simple problems first, see what works and what doesn't in very idealized conditions. Don't worry, I'll never neglect game theory - it was accidentally getting my hands on Ken Binmore's "Fun and Games" that brought me back to math in the first place :-)

Comment author: Benja 14 August 2010 02:30:40PM *  0 points

> But in general, I don't believe this sort of reasoning carries over to multi-player games. If we take the Prisoner's Dilemma and tweak the payoffs slightly to make them non-symmetric, the agents are back to square one.

I don't understand. Do you mean changing the numeric values, leaving the game the same, or changing the ordering, turning it into a different game? In the first case, it seems to me that the argument from your other post should work just fine [Edit: Wrong, see cousin_it's explanation below] -- roughly:

Suppose there's a short proof that agent1()=="C" iff agent2()=="cooperate". Then there's an only slightly longer proof that agent1()=="C" implies that both agents get the C/C payoff, and that agent1()=="D" implies both get the D/D payoff. Therefore, under the supposition, agent1 will choose "C"; similarly, agent2 will choose "cooperate".

In other words: If there's a short proof that agent1()=="C" iff agent2()=="cooperate", then agent1()=="C" and agent2()=="cooperate". By the Löbian reasoning in your previous post (linked above), it follows that both agents cooperate.
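Schematically (writing $\Box$ for provability; the notation here is mine, not from the posts), the argument has the Löbian shape:

```latex
% E := (agent1()=="C" <-> agent2()=="cooperate"); C_1, C_2 the cooperation claims.
\begin{align*}
&\vdash \Box E \to (C_1 \wedge C_2) && \text{(if $E$ has a short proof, both cooperate)}\\
&\vdash (C_1 \wedge C_2) \to E      && \text{(mutual cooperation makes $E$ true)}\\
&\vdash \Box E \to E                && \text{(chaining the two)}\\
&\vdash E                           && \text{(L\"ob's theorem)}
\end{align*}
```

(As cousin_it's reply explains, it is the first line that fails: the agent cannot prove it will act on a given proof without assuming its own consistency.)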

Am I confused about something? Or is it just that you were in fact talking about changing the payoffs so that the game is not a PD any longer? (I think it still at least bears mentioning if your work already solves the class of problems that Eliezer has been talking about, "where the other agents' decisions can be viewed as functions of one argument, that argument being your own choice in that particular case.")

Comment author: cousin_it 14 August 2010 02:45:47PM *  1 point

This post does solve the class of problems Eliezer was talking about (I call it the "fair" class), but my previous Löbian post used a different algorithm that didn't do utility maximization. That algorithm explicitly said that it cooperates if it can find any proof that agent1()==agent2(), without comparing utilities or trying to prove other statements, and that explicitness made the proof work in the first place.

Your proof fails here:

> Therefore, under the supposition, agent1 will choose "C"; similarly, agent2 will choose "cooperate".

It's extremely difficult for the agent to conclude that it will make any particular choice, because any proof of that (starting from your assumptions or any other assumptions) must also prove that the agent won't stumble on any other proofs that lead to yet higher outcomes. This amounts to the agent assuming the consistency of its own proof checker, which is a no go.

As far as I can see right now, making the payoffs slightly non-symmetric breaks everything. If you think an individual utility-maximizing algorithm can work in general game-theoretic situations, you have a unique canonical solution for all equilibrium selection and bargaining problems with non-transferable utility. Judging from the literature, finding such a unique canonical solution is extremely unlikely. (The transferable utility case is easy - it is solved by the Shapley value and my Freaky Fairness algorithm.)

Comment author: Vladimir_Nesov 14 August 2010 03:06:57PM *  0 points

> It's extremely difficult for the agent to conclude that it will make any particular choice, because any proof of that (starting from your assumptions or any other assumptions) must also prove that the agent won't stumble on any other proofs that lead to yet higher outcomes.

When does the agent need to have that knowledge? It just acts on certain conclusions at the moment it obtains them. It can as well act on conclusions1 about its future conclusions2; the only problem is that it might never reach those conclusions1 (and so won't act on them).

Comment author: Benja 14 August 2010 03:29:38PM 0 points

The point is that I relied on the fact that if the agent can prove "If I cooperate, I get 5" and "If I defect, I get 2", then the agent will cooperate. This is a fact, but the agent cannot prove it: if its proof system were inconsistent, then it might (within maxsteps) also be able to prove "If I defect, I get 42," and thus end up defecting.

Maybe your comment suggests that the agent should choose to cooperate as soon as it has proven the first two statements above; but as far as the agent knows, if its proof system is inconsistent, it might find the proof of "If I defect, I get 42" earlier than the proof of "If I defect, I get 2." The agent cannot prove that "I will prove C -> 5 and D -> 2" implies "I will cooperate."

Comment author: Vladimir_Nesov 14 August 2010 05:01:46PM *  1 point

The agent can just cooperate as soon as it proves "If I cooperate, I get X" and "If I defect, I get Y", for X>Y. This algorithm is justified; the agent doesn't need to write its own algorithm, we are writing it.
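A minimal sketch of that decision rule, assuming (hypothetically) that the proof enumerator yields its results as a stream in the order found:

```python
# Sketch of the rule Vladimir_Nesov proposes: cooperate the moment the
# proof search has produced both "C -> X" and "D -> Y" with X > Y.
# proof_stream stands in for an actual proof enumerator; the (action,
# utility) pairs represent proven statements "if I play action, I get
# utility", in the order found. Names are illustrative.

def agent_pd(proof_stream):
    found = {}
    for action, utility in proof_stream:
        found.setdefault(action, utility)  # keep the first proof per action
        if "C" in found and "D" in found and found["C"] > found["D"]:
            return "C"  # act immediately; no exhaustive proof search
    return "D"

print(agent_pd([("C", 5), ("D", 2)]))   # "C"
# Benja's worry below: an inconsistent system might prove "D -> 42" first,
# in which case the early-stopping rule defects.
print(agent_pd([("D", 42), ("C", 5)]))  # "D"
```

The second run illustrates why early stopping matters in the subsequent exchange: the rule's output depends on which proof the search stumbles on first.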

Comment author: cousin_it 14 August 2010 05:36:33PM *  1 point

I don't completely understand your idea, but Löbian cooperation seems to require enumerating a lot of proofs without stopping early. If Bob can stop early, Alice now needs to prove that Bob will stumble on a specific proof earlier than that, rather than at all. I remember trying and failing to remove the exhaustive search for proofs from my previous post. Or maybe I'm missing something?

Maybe you could formalize your idea and give some sort of proof? If it works, it'll be definite progress.

Comment author: Benja 14 August 2010 05:47:13PM 0 points

IIUC, Vladimir is arguing that we can write an algorithm specifically for the PD that will cooperate if it finds any two proofs "C -> I get X" and "D -> I get Y", for X>Y. I think this algorithm will indeed do the right thing, do you agree?

Intuitively, the point is that in both Vladimir's algorithm and your earlier proof, it is sufficient to know something of the form "there exists a proof such that..." to conclude that the program will have a certain behavior (cooperate). In my purported proof, by contrast, you need to know something of the form "there exists no proof such that...", aka "for all proofs, it holds that..." (which you'd also need, in both Vladimir's proposal and your earlier proof, to show that one of the agents defects).

Comment author: Benja 14 August 2010 02:55:55PM 0 points

> It's extremely difficult for the agent to conclude that it will make any particular choice, because any proof of that (starting from your assumptions or any other assumptions) must also prove that the agent won't stumble on any other proofs that lead to yet higher outcomes.

I.e., that the agent won't find (contradictory) proofs that the same actions will lead to different, even higher utilities. Right, thanks.

Comment author: Vladimir_Nesov 14 August 2010 02:55:16PM 0 points

(The following sounds too confusing, but making it clearer doesn't seem straightforward. Posting just in case communication succeeds on some level.)

Payoffs are part of preference, preference is an aspect of the world program, and the world program determines the outcome. By changing the payoffs, you change the world program, and where decision-making is based on logical inferences about it, you can thus fundamentally change the shape of the possible inferences. "Nice" solutions mean parametrization of the solution by elements of the problem statement; but here the parametrization is bound to talk about what can be logically inferred, so expect no nice solutions.

Comment author: Vladimir_Nesov 14 August 2010 05:05:37PM 0 points

> Suppose there's a short proof that agent1()=="C" iff agent2()=="cooperate". [...] Therefore, under the supposition, agent1 will choose "C"; similarly, agent2 will choose "cooperate".

Sounds right (but it's a rather strong supposition).

Comment author: Benja 14 August 2010 05:52:22PM 0 points

Sorry, I wasn't being as clear as I should have been:

The above is true as a metatheorem, but uninteresting, because as you say, the supposition is rather strong.

What I was trying to argue is that in the agent's proof system (for concreteness, say PA), when assuming the supposition, the conclusion follows, so by the deduction theorem, PA proves "the supposition implies the conclusion". I wanted to use this for a proof-size-bounded Löbian argument à la cousin_it.

But this does not work because in PA, the conclusion does not follow when assuming the supposition, as pointed out by cousin_it.