Perplexed comments on What a reduction of "could" could look like - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (103)
I'm uncomfortable with the unbounded CPU time assumption. For example, what if world() computes the payoff in a Prisoner's Dilemma between two identical agent()s? You get non-termination due to a recursion without an escape clause.
The implementation of agent() given in the post is guaranteed to terminate. It enumerates proofs about itself and the world up to a specified max proof length, it doesn't call itself recursively.
Ok, I had been mistakenly assuming that it accomplished the proofs by simulating world(), which would necessarily involve recursive calls back to agent(), at least in a multi-player game.
My bad. But I am still not convinced that agent() is guaranteed to terminate. Suppose, for example, that it cannot find any proofs of the form "agent()==A implies world()==U"?
Also, in my example in which world() is supposed to compute the results of a Prisoner's Dilemma between two identical agents, is the "knowlege" that the two agents are indeed identical supposed to reside in world() or in agent()? I suspect that the right way to extend this thought experiment to the multi-player game scenario is to keep world() in the dark as to whether agent1() and agent2() are identical algorithms. Don't hard-code this fact into the agents, either. I think you have to let the two agents discover that they are identical to each other. The thing is, this "understanding" has to be achieved by proving things about proofs. So, I think that modal logic slips back in anyways.
It's true that the program might fail to find any proofs at all. Then it could output a random action, I guess? Or just return zero?
Puzzled about your last paragraph. The "knowledge" that the agents are equivalent has to be derivable by inspecting the program as a whole. (This is a very strong condition, but it can be satisfied.) Proving things about proofs is something proof checkers with sufficiently strong axioms can do just fine, no modal logic needed.
But in general, I don't believe this sort of reasoning carries over to multi-player games. If we take the Prisoner's Dilemma and tweak the payoffs slightly to make them non-symmetric, the agents are back to square one. Game theory is inherently harder than individual rationality. I really should write another post on that, because it seems to be a Pons Asinorum that many intelligent people fail to cross.
Seems to me you understood it and responded appropriately.
Yes it is. One important difference is that game-playing agents generally use mixed strategies. So we need to distinguish the reading the strategy source code of another player from reading the pseudo-random-number source code and "seed". Rational agents will want to keep secrets.
A second difference is that the "preference structure" of the agent ought to be modeled as data about the agent, rather than data about the world. And some of this is information which a rational agent will wish to conceal as well.
So, as a toy model illustrating the compatibility of determinism and free will, your example (i.e. this posting) works fine. But as a way of modeling a flavor of rationality that "solves" one-shot PD and Newcomb-like problems in a "better" way than classical game theory, I think it doesn't (yet?) work.
Slightly tangential: I have a semi-worked-out notion of a "fair" class of problems which includes Newcomb's Problem because the predictor makes a decision based only on your action, not the "ritual of cognition" that preceded it (to borrow Eliezer's phrase). But the one-shot PD doesn't reside in this "fair" class because you don't cooperate by just predicting the opponent's action. You need to somehow tie the knot of infinite recursion, e.g. with quining or Löbian reasoning, and all such approaches seem to inevitably require inspecting the other guy's "ritual of cognition". This direction of research leads to all sorts of clever tricks, but ultimately it might be a dead end.
As for your main point, I completely agree. Any model that cannot accommodate limited information is bound to be useless in practice. I'm just trying to solve simple problems first, see what works and what doesn't in very idealized conditions. Don't worry, I'll never neglect game theory - it was accidentally getting my hands on Ken Binmore's "Fun and Games" that brought me back to math in the first place :-)
I don't understand. Do you mean changing the numeric values, leaving the game the same, or changing the ordering, turning it into different game? In the first case, it seems to me that the argument from your other post should work just fine [Edit: Wrong, see cousin_it's explanation below] -- roughly:
Suppose there's a short proof that agent1()=="C" iff agent2()=="cooperate". Then there's an only slightly longer proof that agent1()=="C" implies that both agents get the C/C payoff, and that agent1()=="D" implies both get the D/D payoff. Therefore, under the supposition, agent1 will choose "C"; similarly, agent2 will choose "cooperate".
In other words: If there's a short proof that agent1()=="C" iff agent2()=="cooperate", then agent1()=="C" and agent2()=="cooperate". By the Löbian reasoning in your previous post (linked above), it follows that both agents cooperate.
Am I confused about something? Or is it just that you were in fact talking about changing the payoffs so that the game is not a PD any longer? (I think it still at least bears mentioning if your work already solves the class of problems that Eliezer has been talking about, "where the other agents' decisions can be viewed as functions of one argument, that argument being your own choice in that particular case.")
This post does solve the class of problems Eliezer was talking about (I call it the "fair" class), but my previous Löbian post used a different algorithm that didn't do utility maximization. That algorithm explicitly said that it cooperates if it can find any proof that agent1()==agent2(), without comparing utilities or trying to prove other statements, and that explicitness made the proof work in the first place.
Your proof fails here:
It's extremely difficult for the agent to conclude that it will make any particular choice, because any proof of that (starting from your assumptions or any other assumptions) must also prove that the agent won't stumble on any other proofs that lead to yet higher outcomes. This amounts to the agent assuming the consistency of its own proof checker, which is a no go.
As far as I can see right now, making the payoffs slightly non-symmetric breaks everything. If you think an individual utility-maximizing algorithm can work in general game-theoretic situations, you have a unique canonical solution for all equilibrium selection and bargaining problems with non-transferable utility. Judging from the literature, finding such a unique canonical solution is extremely unlikely. (The transferable utility case is easy - it is solved by the Shapley value and my Freaky Fairness algorithm.)
When does the agent need to have that knowledge? It just acts on certain conclusions, at the moment it obtains them. It can as well act on conclusions1 about its future conclusions2, the only problem might be that it never reaches those conclusions1 (and so won't act on them)
The point is that I relied on the fact that if the agent can prove "If I cooperate, I get 5" and "If I defect, I get 2", then the agent will cooperate. This is a fact, but the agent cannot prove it: if its proof system were inconsistent, then it might (within maxsteps) also be able to prove "If I defect, I get 42," and thus end up defecting.
Maybe your comment suggests that the agent should choose to cooperate as soon as it has proven the first two statements above; but as far as the agent knows, if its proof system is inconsistent, it might find the proof of "If I defect, I get 42" earlier than the proof of "If I defect, I get 2." The agent cannot prove that "I will prove C -> 5 and D -> 2" implies "I will cooperate."
The agent can just cooperate as soon as it proves "If I cooperate, I get X" and "If I defect, I get Y", for X>Y. This algorithm is justified, and the agent doesn't need to write its own algorithm, we are writing it.
I don't completely understand your idea, but Löbian cooperation seems to require enumerating a lot of proofs without stopping early. If Bob can stop early, Alice now needs to prove that Bob will stumble on a specific proof earlier than that, rather than at all. I remember trying and failing to remove the exhaustive search for proofs from my previous post. Or maybe I'm missing something?
Maybe you could formalize your idea and give some sort of proof? If it works, it'll be definite progress.
I.e., that the agent won't find (contradictory) proofs that the same actions will lead to different, even higher utilities. Right, thanks.
(The following sounds too confusing, but making it clearer doesn't seem straightforward. Posting just in case communication succeeds on some level.)
Payoffs are part of preference, preference is (aspect of) world program, and world program determines the outcome. By changing the payoffs, you change the world program, and where decision-making is based on logical inferences about it, you can thus fundamentally change the shape of possible inferences. "Nice" solutions mean parametrization of the solution by elements of the problem statement, but here parametrization is bound to talk about what can be logically inferred, so expect no nice solutions.
Sounds right (but it's a rather strong supposition).
Sorry, I wasn't being as clear as I should have been:
The above is true as a metatheorem, but uninteresting, because as you say, the supposition is rather strong.
What I was trying to argue is that in the agent's proof system (for concreteness, say PA), when assuming the supposition, the conclusion follows, so by the deduction theorem, PA proves "the supposition implies the conclusion". I wanted to use this for a proof-size-bounded Löbian argument à la cousin_it.
But this does not work because in PA, the conclusion does not follow when assuming the supposition, as pointed out by cousin_it.