I wrote this post in the course of working through Vladimir Slepnev's A model of UDT with a halting oracle. This post contains some of the ideas of Slepnev's post, with all the proofs written out. The main formal difference is that while Slepnev's post is about programs with access to a halting oracle, the "decision agents" in this post are formulas in Peano arithmetic. They are generally uncomputable and do not reason under uncertainty.
These ideas are due to Vladimir Slepnev and Vladimir Nesov. (Please let me know if I should credit anyone else.) I'm pretty sure none of this is original material on my part. It is possible that I have misinterpreted Slepnev's post or introduced errors.
We are going to define a world function $U$, a $2$-ary function1 that outputs an ordered pair of payoff values. There are functions $U_1$ and $U_2$ such that $U(x_1, x_2) = (U_1(x_1, x_2), U_2(x_1, x_2))$ for any $x_1$ and $x_2$. In fact $U_i(x_1, x_2)$ is a function in the three variables $i$, $x_1$ and $x_2$.
We are also going to define an agent function $A(u, i)$ that outputs the symbol $C$ or $D$. The argument $u$ is supposed to be the Gödel number of the world function, and the argument $i$ is some sort of indexical information: it tells the agent whether it is playing the role of player 1 or player 2.
We want to define our agent such that, checking the clauses in order,

$A(u, i) = C$ if $\Box(A(\underline{u}, \underline{i}) \neq C)$;
otherwise $A(u, i) = D$ if $\Box(A(\underline{u}, \underline{i}) \neq D)$;
otherwise $A(u, i) = C$ if $\exists a \exists b\,(a > b \wedge \Box(A(\underline{u}, \underline{i}) = C \rightarrow U_i = \underline{a}) \wedge \Box(A(\underline{u}, \underline{i}) = D \rightarrow U_i = \underline{b}))$;
otherwise $A(u, i) = D$ if $\exists a \exists b\,(a > b \wedge \Box(A(\underline{u}, \underline{i}) = D \rightarrow U_i = \underline{a}) \wedge \Box(A(\underline{u}, \underline{i}) = C \rightarrow U_i = \underline{b}))$;
otherwise $A(u, i) = D$.

Here $U$ is the function whose Gödel number is $u$, and $U_i$ abbreviates the term $U_i(A(\underline{u}, \underline{1}), A(\underline{u}, \underline{2}))$, the payoff that actually accrues to player $i$. ($\ulcorner \varphi \urcorner$ denotes the Gödel number of $\varphi$. $\Box\varphi$ means that $\varphi$ is provable in Peano arithmetic. $\underline{n}$ represents the numeral for $n$. I don't care what value $A(u, i)$ has when $u$ isn't the Gödel number of an appropriate $2$-ary function.)
There is some circularity in this tentative definition, because a formula standing for $A$ appears in the definition of $A$ itself. We get around this by using diagonalization. We'll describe how this works just this once: First define the function $f(a, u, i)$ as follows: take the clauses above and replace every occurrence of $A(\underline{u}, \underline{i})$ inside the boxes by the result of applying the formula with Gödel number $a$ to the arguments $\underline{u}$ and $\underline{i}$. This function can be defined by a formula. Then the diagonal lemma gives us a formula $A$ such that $A(u, i) = f(\ulcorner A \urcorner, u, i)$.
This is our (somewhat) rational decision agent. If it can prove it will do one thing, it does another; this is what Slepnev calls "playing chicken with the universe". If it can prove that $C$ is an optimal strategy, it chooses $C$; and otherwise it chooses $D$.
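The clause-by-clause structure can be sketched in code. This is only a toy: provability in PA is undecidable, so the two predicates below, `proves_not(x)` (standing for "PA proves $A \neq x$") and `proves_payoff(x, v)` (standing for "PA proves: if the agent plays $x$, its payoff is $v$"), are hypothetical stand-ins supplied by the caller, not real proof search.

```python
# Toy sketch of the agent's five clauses. `proves_not` and `proves_payoff`
# are caller-supplied stand-ins for PA-provability (hypothetical names).
C, D = "C", "D"

def agent(proves_not, proves_payoff, payoff_values):
    # Clauses 1-2: playing chicken with the universe.
    if proves_not(C):
        return C
    if proves_not(D):
        return D
    # Clause 3: choose C if it provably yields a strictly better payoff.
    if any(proves_payoff(C, a) and proves_payoff(D, b)
           for a in payoff_values for b in payoff_values if a > b):
        return C
    # Clause 4: choose D if it provably yields a strictly better payoff.
    if any(proves_payoff(D, a) and proves_payoff(C, b)
           for a in payoff_values for b in payoff_values if a > b):
        return D
    # Clause 5: default.
    return D

# A one-player world in the style of Proposition 2: C pays 3, D pays 1,
# and exactly those two payoff facts are "provable".
never = lambda x: False
oracle = lambda x, v: (x, v) in {(C, 3), (D, 1)}
print(agent(never, oracle, {1, 3}))  # C
```

Since the clauses are checked in order, an oracle claiming $A \neq C$ forces the output $C$ before any payoff comparison is consulted.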
First, a lemma about the causes and consequences of playing chicken:
Lemma 1. For any $u$ and $i$,

(1) $\Box(A(u, i) \neq C) \rightarrow \Box(A(u, i) = C)$;
(2) $\Box(A(u, i) \neq C) \rightarrow \neg\mathrm{Con}(\mathrm{PA})$;
(3) if $\Box(A(u, i) \neq D)$, then $A(u, i) = D$ is provable in $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$;
(4) $\Box(A(u, i) \neq D) \rightarrow \neg\mathrm{Con}(\mathrm{PA} + \mathrm{Con}(\mathrm{PA}))$.

($\Box$ is a binary-valued function such that $\Box(\ulcorner \varphi \urcorner)$ is true exactly when there is a proof of $\varphi$ in Peano arithmetic. For brevity we write $\Box\varphi$ instead of $\Box(\ulcorner \varphi \urcorner)$. $\mathrm{Con}(\mathrm{PA} + \mathrm{Con}(\mathrm{PA}))$ is the proposition that Peano arithmetic, plus the axiom that Peano arithmetic is consistent, is a consistent theory.)
Proof. (1) By definition of $A$, $\Box(A(u, i) \neq C) \rightarrow A(u, i) = C$: if the first clause fires, the agent outputs $C$. So, formalizing this argument in PA, and using the fact that $\Box(A(u, i) \neq C)$ implies $\Box\Box(A(u, i) \neq C)$, we get $\Box(A(u, i) \neq C) \rightarrow \Box(A(u, i) = C)$.

(2) By the principle of explosion, if PA proves both $A(u, i) \neq C$ and $A(u, i) = C$, then PA proves every sentence; so $\Box(A(u, i) \neq C) \rightarrow \neg\mathrm{Con}(\mathrm{PA})$.

(3) By the definition of $A$, if $\Box(A(u, i) \neq D)$ and $\neg\Box(A(u, i) \neq C)$, then the second clause fires and $A(u, i) = D$. Now suppose $\Box(A(u, i) \neq D)$. Then $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$ proves $\Box(A(u, i) \neq D)$; and by part (2), formalized, it proves $\neg\Box(A(u, i) \neq C)$. So $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$ proves $A(u, i) = D$.

(4) If $\Box(A(u, i) \neq D)$, then $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$ proves $A(u, i) = D$ by part (3), and it also proves $A(u, i) \neq D$; so $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$ is inconsistent.
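In outline, the first half of the lemma combines the first clause of the agent's definition with formalized $\Sigma_1$-completeness ($\Box\varphi \rightarrow \Box\Box\varphi$ for $\Sigma_1$ sentences $\varphi$); this is a schematic summary, not a replacement for the proof above:

```latex
\begin{align*}
\Box(A \neq C) &\rightarrow A = C
  && \text{first clause of the definition of } A \\
\Box(A \neq C) &\rightarrow \Box\Box(A \neq C)
  && \text{formalized } \Sigma_1\text{-completeness} \\
  &\rightarrow \Box(A = C)
  && \text{the first clause, formalized in PA} \\
\Box(A \neq C) \wedge \Box(A = C) &\rightarrow \neg\mathrm{Con}(\mathrm{PA})
  && \text{principle of explosion}
\end{align*}
```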
If we assume consistency of $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$ (which entails consistency of $\mathrm{PA}$), then parts (1) and (3) of Lemma 1 tell us that for any $u$ and $i$, $\neg\Box(A(u, i) \neq C)$ and $\neg\Box(A(u, i) \neq D)$. So the agent never actually plays chicken.
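The no-chicken phenomenon can be illustrated concretely: any oracle that claims to prove the agent won't play some action is immediately falsified, because the chicken clauses make the agent play exactly that action. A minimal self-contained sketch (the oracle here is a hypothetical stand-in for PA-provability):

```python
# The two chicken clauses in isolation: the agent plays any action it is
# "proved" not to play, so the oracle's claim refutes itself.
C, D = "C", "D"

def chicken_clauses(proves_not):
    if proves_not(C):
        return C
    if proves_not(D):
        return D
    return None  # neither clause fires; the later clauses would decide

# An oracle claiming "A != C" is falsified: the agent plays C anyway.
print(chicken_clauses(lambda x: x == C))  # C
```

A sound oracle therefore can never make such a claim, which is the informal content of the corollary above.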
Now let's see how our agent fares on a straightforward decision problem:
Proposition 2. Let $a$ and $b$ be rational numbers, and suppose $U(x_1, x_2) = (a, a)$ whenever $x_1 = C$, and $U(x_1, x_2) = (b, b)$ whenever $x_1 = D$. Assume consistency of $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$. Then $A(\ulcorner U \urcorner, 1) = C$ if and only if $a > b$.
Proof. If we assume consistency of $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$, then Lemma 1 tells us that the agent doesn't play chicken. So the agent will choose $C$ if and only if it determines that choosing $C$ is optimal, that is, if and only if the condition in the third clause of its definition holds. Write $A$ for $A(\ulcorner U \urcorner, \underline{1})$.

We have $\Box(A = C \rightarrow U_1 = \underline{a})$ and $\Box(A = D \rightarrow U_1 = \underline{b})$.

Suppose $a > b$. Then clearly $\exists p \exists q\,(p > q \wedge \Box(A = C \rightarrow U_1 = \underline{p}) \wedge \Box(A = D \rightarrow U_1 = \underline{q}))$, witnessed by $p = a$ and $q = b$. So $A(\ulcorner U \urcorner, 1) = C$.

As for the converse: We have $\Box(A = D \rightarrow U_1 = \underline{b})$. If also $\Box(A = D \rightarrow U_1 = \underline{q})$ and $q \neq b$, then $\Box(A \neq D)$, since PA proves $\underline{q} \neq \underline{b}$. By Lemma 1(3) and consistency of $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$, this cannot happen. So $\Box(A = D \rightarrow U_1 = \underline{q})$ implies $q = b$.

Similarly, we have $p = a$ for all $p$ such that $\Box(A = C \rightarrow U_1 = \underline{p})$, this time by Lemma 1(1). So if the third clause's condition holds with witnesses $p > q$, then $p = a$ and $q = b$, hence $a > b$. So when $a \leq b$, the agent doesn't decide that $C$ is optimal, and $A(\ulcorner U \urcorner, 1) = D$.
Now let's see how $A$ fares on a symmetric Prisoner's Dilemma with itself:

Proposition 3. Let $U$ be given by $U(C, C) = (2, 2)$, $U(C, D) = (0, 3)$, $U(D, C) = (3, 0)$, $U(D, D) = (1, 1)$. Then, assuming consistency of $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$, we have $A(\ulcorner U \urcorner, 1) = A(\ulcorner U \urcorner, 2) = C$.
Proof.

(This proof uses Löb's theorem, and that makes it confusing. Vladimir Slepnev points out that Löb's theorem is not really necessary here; a simpler proof appears in the comments.)

Write $A_i$ for $A(\ulcorner U \urcorner, \underline{i})$, and let $\psi$ denote the sentence $A_1 = C \wedge A_2 = C$. Looking at the definition of $A$, we see that each agent chooses $C$ as soon as it does not play chicken and the condition in the third clause of its definition holds.

Reasoning within PA, suppose $\Box\psi$. Then $\Box(A_1 = C \rightarrow U_1 = \underline{2})$, since provably agent 2 cooperates. By Lemma 1, (1) and (3), the chicken clauses do not produce any other provable value of $A_1$. Similarly, since $\Box(A_1 \neq D)$, the principle of explosion gives $\Box(A_1 = D \rightarrow U_1 = \underline{0})$. So, as $2 > 0$, the third clause's condition holds for agent 1, and likewise for agent 2; hence both agents choose $C$, unless some chicken clause makes an agent choose $D$, which requires $\Box(A_i \neq D)$ for some $i$. Applying Lemma 1(2) and (4), formalized, this implies $\neg\mathrm{Con}(\mathrm{PA} + \mathrm{Con}(\mathrm{PA}))$. So PA proves $\Box\psi \rightarrow (\psi \vee \neg\mathrm{Con}(\mathrm{PA} + \mathrm{Con}(\mathrm{PA})))$.

By Löb's theorem, applied to the sentence $\psi \vee \neg\mathrm{Con}(\mathrm{PA} + \mathrm{Con}(\mathrm{PA}))$, we have that PA proves $\psi \vee \neg\mathrm{Con}(\mathrm{PA} + \mathrm{Con}(\mathrm{PA}))$. So, assuming consistency of $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$, we conclude that $A(\ulcorner U \urcorner, 1) = A(\ulcorner U \urcorner, 2) = C$.
The definition of $A$ treats the choices $C$ and $D$ differently; so it is worth checking that $A$ behaves correctly in the Prisoner's Dilemma when the effects of $C$ and $D$ are switched:
Proposition 4. Let $U$ be given by $U(C, C) = (1, 1)$, $U(C, D) = (3, 0)$, $U(D, C) = (0, 3)$, $U(D, D) = (2, 2)$. Then, assuming consistency of $\mathrm{PA} + \mathrm{Con}(\mathrm{PA})$, we have $A(\ulcorner U \urcorner, 1) = A(\ulcorner U \urcorner, 2) = D$.
A proof appears in the comments.
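The two payoff matrices can be written out as small tables. The numbers here are one conventional Prisoner's Dilemma choice of mine, with the second matrix obtained from the first by swapping the roles of $C$ and $D$ for both players:

```python
# Symmetric Prisoner's Dilemma payoffs (illustrative numbers), plus the
# C/D-swapped matrix: swapping both players' actions relabels the game.
C, D = "C", "D"
U3 = {(C, C): (2, 2), (C, D): (0, 3), (D, C): (3, 0), (D, D): (1, 1)}
swap = {C: D, D: C}
U4 = {(x1, x2): U3[(swap[x1], swap[x2])] for x1 in (C, D) for x2 in (C, D)}

# U3 is symmetric: player 1's payoff at (x, y) equals player 2's at (y, x).
assert all(U3[(x, y)][0] == U3[(y, x)][1] for x in (C, D) for y in (C, D))
print(U4[(D, D)])  # (2, 2)
```

In the swapped game, mutual $D$ plays the role that mutual $C$ plays in the original, which is why the expected conclusion is that both agents choose $D$.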
There are a number of questions one can explore with this formalism: What is the correct generalization of $A$ that can choose between $n$ actions, and not just two? How about infinitely many actions? What about theories other than Peano arithmetic? How do we accommodate payoffs that are real numbers? How do we make agents that can reason under uncertainty? How do we make agents that are computable algorithms rather than arithmetic formulas? How does $A$ fare on a Prisoner's Dilemma with an asymmetric payoff matrix? In a two-person game where the payoff to player 1 is independent of the behavior of player 2, can $A$ deduce the behavior of the other player? What happens when we replace the third clause of the definition of $A$ with some other optimality condition? What is a (good) definition of "decision problem"? Is there a theorem that says that our agent is, in a certain sense, optimal?
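As a starting point for the first question, here is one possible shape for an $n$-action agent. This is my sketch, not anything from the post, and the provability predicates are again hypothetical stand-ins:

```python
# Hypothetical n-action generalization: chicken clauses first, then the
# action with the best provably-attained payoff, then a fixed default.
def agent_n(actions, proves_not, proves_payoff, payoff_values):
    for a in actions:                      # chicken, in a fixed order
        if proves_not(a):
            return a
    best_action, best_value = actions[-1], None  # default: last action
    for a in actions:
        for v in payoff_values:
            if proves_payoff(a, v) and (best_value is None or v > best_value):
                best_action, best_value = a, v
    return best_action

# Three actions; only "y pays 5" and "z pays 2" are "provable".
oracle = lambda a, v: (a, v) in {("y", 5), ("z", 2)}
print(agent_n(["x", "y", "z"], lambda a: False, oracle, {2, 5}))  # y
```

Whether this generalization inherits the no-chicken lemma, and what it does when no payoff is provable for some action, are exactly the kind of questions the list above is asking.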
1Every $n$-ary function $f$ in this article is defined by a formula $\varphi$ with $n + 1$ free variables such that $\mathrm{PA} \vdash \forall x_1 \ldots \forall x_n \exists! y\, \varphi(x_1, \ldots, x_n, y)$ and $f(x_1, \ldots, x_n) = y$ exactly when $\varphi(x_1, \ldots, x_n, y)$ is true. By a standard abuse of notation, when the name of a function like $U$ appears in a formula of arithmetic, what we really mean is the formula $\varphi$ that defines it.
Well, the agent definition contains a series of conditionals. You have as the last three lines: if "cooperating is provably better than defecting", then cooperate; else, if "defecting is provably better than cooperating", then defect; else defect. Intuitively, assuming the agent's utility function is consistent, only one antecedent clause will evaluate to true. In case the first one does, the agent will output C. Otherwise, it will move on to the next part of the conditional, and if that evaluates to true, the agent will output D. If not, then it will output D anyway. Because of this, I would go for proving that lines 3 and 4 can't both obtain.
How about this? Suppose conditions 3 and 4 both obtain. Then there exist a and b such that U(C) = #a and U(D) = #b, with a > b (switching out the underline for '#'). Also, there exist p and q such that p > q and U(D) = #p and U(C) = #q. Now U(C) = #a > #b = U(D) = #p > #q = U(C), so U(C) > U(C). I may actually be confused on some details, since you indicate that a and b are numbers rather than variables (for instance, in Proposition 2 you select a and b as rational numbers), yet you further denote numerals for them; but I'm assuming that a > b iff #a > #b. Is this a valid assumption? I feel as though I'm missing some details here, and I'm not 100% sure how to fill them in. If my assumption isn't valid, I have a couple of other ideas. I'm not sure if I should be thinking of #a as 'a' under some interpretation.
Going a little bit further, it looks like your Lemma 1 only invokes the first two lines and can easily be extended for use in Proposition 4: look at parts 1 and 3. Part 1 is one step away from being identical to part 3 with C instead of D.
~Con(PA) -> ~Con(PA + anything); just plug Con(PA) in there. Adding axioms can't make an inconsistent theory consistent. Getting the analogous lemma to use in Proposition 4 looks pretty straightforward: switching out C with D and vice versa in each of the parts yields another true four-part lemma, identical in form but for the switched D's and C's, so you can use them analogously to how you used the original Lemma 1 in the proof of Proposition 3.
Yep, this sounds about right. I think one does have to prove "#a > #b if a > b", or rather $\Box(\underline{a} > \underline{b})$ whenever $a > b$.