Charlie: Your proposal to remove condition 3 works when T is a finite theory, but you cannot do that when T is infinite. Indeed, we use 3 for the infinite extension from Tn to T∞. I suspect that you cannot remove condition 3.
Is the idea that the proof necessary to use 1 is of infinite length, and you want your logic to be finitary? Hm. This seems odd, because P is in some sense already a function with an infinitely long argument. How do you feel about using 2 in the form of P(s∧r|T)=P(r|T)⋅P(s|T∪{r}), and therefore P(s|T∪{r})=P(s∧r|T)/P(r|T), which has the same amount of argument as P(s|T)? I'm confused about at least one thing.
Also, is there some reason you prefer not to reply using the button below the comment?
It's interesting that this is basically the opposite of the Gaifman condition - clearly there are conflicting intuitions about what makes a 'good' conditional logical probability.
On the open problem: in order to prove 3 from 2, all you need is that P(s∧r|R)=P(s|R) when s proves r - 3 follows from 2 if you do that substitution, and then divide by P(r|R), which is less than or equal to 1 (this may assume an extra commonsense axiom that probabilities are positive).
Now consider applying rule 1 to the theory R with a sentence r proven by s. R proves that only one of s∧r, ¬s is true, and also proves that only one of s, ¬s is true, so P(s∧r|R)=P(s|R). Thus 3 is derivable from 1 and 2.
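Spelled out (my sketch of the argument just above, reading T as a single sentence r with s ⊢ r, and assuming P(r|R) > 0):

\begin{align*}
&\text{From 1: } P(s\wedge r\mid R)+P(\neg s\mid R)=1=P(s\mid R)+P(\neg s\mid R),
\ \text{ so } P(s\wedge r\mid R)=P(s\mid R).\\
&\text{From 2: } P(s\wedge r\mid R)=P(r\mid R)\cdot P(s\mid R\cup\{r\}).\\
&\text{Hence } P(s\mid R\cup\{r\})=\frac{P(s\wedge r\mid R)}{P(r\mid R)}=\frac{P(s\mid R)}{P(r\mid R)}\ \ge\ P(s\mid R),
\end{align*}

which is condition 3 for T = {r}.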
This is interesting! I would dispute, though, that a good logical conditional probability must be able to condition on arbitrary, likely-non-r.e. sets of sentences.
Hm; we could add an uninterpreted predicate symbol X to the language of arithmetic, and let rn=X(n) and T∞ be constructed as in the proof. Then, it seems like the only barrier to recursive enumerability of T∞ is that P's opinions about the X(n) aren't computable; this seems worrying in practice, since it seems certain that we would like logical uncertainty to be able to reason about the values of computations that use more resources than we use to compute our own probability estimates. But on the other hand, all of this makes this sound like an issue of self-reference, which is its own can of worms (once we have a computable process assigning probabilities to the value of computations, we can consider the sentence saying "I'm assigned probability less than p", etc.).
Nice! Basically, it looks like you construct a theory by assembling an infinite quantity of what the prior takes as evidence about s, so that either the prior or the posterior has to take the most extreme odds on s. It's pretty intuitive in that light, and so I'm not dismayed that the "0 and 1 are not probabilities" property can't hold when conditioned on arbitrary theories.
Important typo: P assigns conditional probability 1 to some statement that cannot be proven.
Fix a theory T over a language L. A coherent probability function is one that satisfies the laws of probability theory; each coherent probability function represents a probability distribution on complete logical extensions of T.
One of many equivalent definitions of coherence is that P is coherent if P(s1)+P(s2)+…+P(sk)=1 whenever T can prove that exactly one of s1,…,sk is true.
Another very basic desirable property is that P(s)=1 only when s is provable. There have been many proposals of specific coherent probability assignments that all satisfy this basic requirement. Many satisfy stronger requirements that give bounds on how far P(s) is from 1 when s is not provable.
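As a small worked instance of this condition (a toy illustration I am adding, not from the original post): any theory T proves that exactly one of s, ¬s is true, and that exactly one of s∧r, s∧¬r, ¬s is true, so coherence forces

\[
P(s)+P(\neg s)=1
\qquad\text{and}\qquad
P(s\wedge r)+P(s\wedge\neg r)+P(\neg s)=1,
\]

hence P(s∧r)+P(s∧¬r)=P(s). The proof below uses the conditional analogue of exactly this identity.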
In this post, I modify the framework slightly to instead talk about conditional probability. Consider a function P which takes in a consistent theory T and a sentence s, and outputs a number P(s|T)∈[0,1], which represents the conditional probability of s given T.
We say that P is coherent if:
1. P(s1|T)+P(s2|T)+…+P(sk|T)=1 whenever T can prove that exactly one of s1,…,sk is true,
2. P(s∧r|T)=P(r|T)⋅P(s|T∪{r}), and
3. If s proves every sentence in T, then P(s|R∪T)≥P(s|R).
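For readability (my restatement, not part of the definition), condition 2 can be rearranged whenever P(r|T) > 0 into the familiar ratio form, with the theory T playing the role of background knowledge:

\[
P(s\mid T\cup\{r\})=\frac{P(s\wedge r\mid T)}{P(r\mid T)}.
\]

Condition 3 then says that enlarging the background theory by sentences that s itself proves can never lower the probability of s.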
Theorem: There is no coherent conditional probability function P such that P(s|T)=1 only when T proves s.
Proof:
This proof will use the notation of log odds ℓ(p)=log2(p/(1−p)) to make things simpler.
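A few standard facts about ℓ are used below without comment (they follow immediately from the definition): halving the odds lowers ℓ by exactly 1, complements flip its sign, and ℓ is infinite exactly at 0 and 1:

\[
\log_2\!\left(\tfrac{1}{2}\cdot\tfrac{p}{1-p}\right)=\ell(p)-1,
\qquad
\ell(1-p)=-\ell(p),
\qquad
\ell(p)\in\{-\infty,+\infty\}\iff p\in\{0,1\},
\]

and ℓ is strictly increasing on [0,1].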
Let P be a coherent conditional probability function. Fix a sentence s which is neither provable nor disprovable from the empty theory. Construct an infinite sequence of theories as follows:
T0 is the empty theory.
To construct Tn+1, choose a sentence rn such that neither s→rn nor s→¬rn is provable in Tn. If P(s∧rn|Tn)≤P(s∧¬rn|Tn), then let Tn+1=Tn∪{s→rn}. Otherwise, let Tn+1=Tn∪{s→¬rn}.
Fix an n, and without loss of generality, assume P(s∧rn|Tn)≤P(s∧¬rn|Tn). Since P is coherent we have P(s∧rn|Tn)+P(s∧¬rn|Tn)=P(s|Tn). In particular, this means that P(s∧rn|Tn)≤(1/2)P(s|Tn).
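The appeal to coherence here unpacks as follows (a step I am spelling out; the post states only the conclusion). Tn proves that exactly one of s∧rn, s∧¬rn, ¬s is true, and that exactly one of s, ¬s is true, so condition 1 gives

\begin{align*}
P(s\wedge r_n\mid T_n)+P(s\wedge\neg r_n\mid T_n)+P(\neg s\mid T_n)&=1,\\
P(s\mid T_n)+P(\neg s\mid T_n)&=1,
\end{align*}

and subtracting the second line from the first yields P(s∧rn|Tn)+P(s∧¬rn|Tn)=P(s|Tn); the smaller of the two summands is then at most half of P(s|Tn).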
Observe that P(s∧(s→rn)|Tn)=P(s|Tn+1)P(s→rn|Tn), and P(¬s∧(s→rn)|Tn)=P(¬s|Tn+1)P(s→rn|Tn). Therefore, P(s∧rn|Tn)/P(¬s|Tn)=P(s|Tn+1)/P(¬s|Tn+1), so (1/2)P(s|Tn)/P(¬s|Tn)≥P(s|Tn+1)/P(¬s|Tn+1).
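Both observations are instances of condition 2 with the new axiom s→rn, combined with the fact that provably equivalent sentences receive equal probability under condition 1 (here s∧(s→rn) is equivalent to s∧rn, and ¬s∧(s→rn) is equivalent to ¬s). Expanded (my unpacking of the step):

\begin{align*}
P(s\wedge r_n\mid T_n)&=P(s\wedge(s\to r_n)\mid T_n)=P(s\to r_n\mid T_n)\cdot P(s\mid T_n\cup\{s\to r_n\})=P(s\to r_n\mid T_n)\cdot P(s\mid T_{n+1}),\\
P(\neg s\mid T_n)&=P(\neg s\wedge(s\to r_n)\mid T_n)=P(s\to r_n\mid T_n)\cdot P(\neg s\mid T_{n+1}),
\end{align*}

and dividing the first line by the second cancels the common factor P(s→rn|Tn), giving the ratio identity and, with the bound from the previous step, the inequality above.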
In the language of log odds, this means that ℓ(P(s|Tn))−1≥ℓ(P(s|Tn+1)).
Let T∞ be the union of all the Tn. Note that by the third condition of coherence, ℓ(P(¬s|T∞))≥ℓ(P(¬s|Tn)) for all n, so ℓ(P(s|T∞))≤ℓ(P(s|Tn)) for all n.
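Concretely (my unpacking of how condition 3 is being invoked): take R = Tn, take T∞∖Tn in the role of T, and use ¬s in the role of s. Every sentence of T∞∖Tn has the form s→rk or s→¬rk, and ¬s proves any implication whose antecedent is s, so condition 3 gives

\[
P(\neg s\mid T_\infty)=P\big(\neg s\mid T_n\cup(T_\infty\setminus T_n)\big)\ \ge\ P(\neg s\mid T_n),
\]

and applying the increasing function ℓ, together with ℓ(P(s|T))=−ℓ(P(¬s|T)), yields the two inequalities just stated.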
Consider ℓ(P(s|T0)) and ℓ(P(s|T∞)). These numbers cannot both be finite, since ℓ(P(s|T∞))≤ℓ(P(s|Tn))≤ℓ(P(s|T0))−n for every n. Therefore, at least one of P(s|T0) and P(s|T∞) must be 0 or 1. However, neither T0 nor T∞ proves or disproves s (a short induction using the choice of the rn shows that no Tn proves or disproves s, and by compactness neither does T∞), so P assigns conditional probability 1 to some statement that cannot be proven (taking ¬s in place of s if the extreme value is 0).
Open Problem: Does this theorem still hold if we leave condition 3 out of the definition of coherence?