Douglas_Knight comments on A model of UDT with a halting oracle - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (100)
Since we like symmetry, I'm going to change notation from A and B to I and O for "I" and "opponent." (or maybe "input" and "output")
We should be careful about the definition of B. Simply saying that it cooperates if I()=O() causes it to blow up against the defectbot. Instead, consider the propositions PC: I()=C ⇒ O()=C and PD: I()=D ⇒ O()=D. We really mean that B should cooperate if S proves P=PC∧PD. What if it doesn't? There are several potential agents: B1 defects if S doesn't prove P; B2 defects if S proves ¬P, but breaks down and cries if it is undecidable; B3 breaks down if either PC and PD are undecidable, but defects they are both decidable and one is false. B3 sounds very similar to A and so I think that symmetry proves that they cooperate together. If we modified A not to require that every action had a provable utility, but only that one action had a utility provably as big as all others, then I think it would cooperate with B2.
These examples increase my assessment of the possibility that A and B1 cooperate.
(I'm ignoring the stuff about playing chicken, because the comment I'm responding to seems to say I can.)
I think your conclusions can be right, but the proofs are vague. Can you debug your reasoning like you debugged mine above?