I've recently been getting into all of this wonderful Information Theory stuff and have come across a paper (thanks to John Salvatier) that was written by Kevin H. Knuth:
The paper sets up some intuitive minimal axioms for quantifying power sets and then (seemingly) uses them to derive Bayesian probability theory, information gain, and Shannon entropy. The paper also claims to use fewer assumptions than both Cox and Kolmogorov when choosing axioms. This seems like a significant foundation/unification. I'd like to hear whether others agree and which parts of the paper you think are the significant contributions.
If a 14-page paper is too long for you, I recommend skipping to the conclusion (starting at the bottom of page 12), where there is a nice pictorial representation of the axioms and a quick summary of what they imply.
I admit that I still haven't fully digested their purported proof. But does the proof require lots of equally-valued atoms? Or does it just accommodate lots of equally-valued atoms? (The regrading Θ doesn't have to be unique.)
Yes, I think that they forgot to say that axiom 1 only applies when y is non-null. Axiom 1 is based on equation (1): "x < x ∨ y". When they state equation (1), they do remember to require that y be non-null. So, let us charitably read axiom 1 as assuming that y is non-null.
Furthermore, in the sentence following axiom 3, they say "These equations are to hold for arbitrary values m(x), m(y), m(z) assigned to the disjoint x, y, z." A fair criticism is that they never defined "disjoint". I think that they mean for x and y to be disjoint if, when you write each as a join of atoms, no atom appears in both expressions. Strangely, they forgot to require that x and y be disjoint when they state equation (1). But it seems clear that they intended to require this, because they mention the disjointness condition explicitly when they give their "grounds" for equation (1).
The upshot is that I'd read both equation (1) and axiom 1 as assuming that y is non-null and disjoint from x.
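To make this reading concrete, here is a minimal Python sketch (mine, not from the paper). It assumes a three-atom power-set lattice with hypothetical integer atom weights, and it uses a plain additive valuation as one convenient choice of m. It checks that m(x) < m(x ∨ y) holds whenever y is non-null and disjoint from x, and collects the pairs where the inequality fails if those side conditions are dropped.

```python
from itertools import combinations

# Hypothetical atoms and positive weights, chosen only for illustration.
ATOMS = ("a", "b", "c")
WEIGHTS = {"a": 2, "b": 5, "c": 1}


def elements(atoms):
    """All elements of the power-set lattice, each a frozenset of atoms."""
    return [
        frozenset(combo)
        for size in range(len(atoms) + 1)
        for combo in combinations(atoms, size)
    ]


def disjoint(x, y):
    """x and y are disjoint if no atom appears in both."""
    return not (x & y)


def join(x, y):
    """The join of two elements is the union of their atoms."""
    return x | y


def m(x):
    """Assumed valuation: the sum of the atom weights (the null element gets 0)."""
    return sum(WEIGHTS[atom] for atom in x)


def order_axiom_holds(atoms):
    """Check m(x) < m(x v y) for every non-null y that is disjoint from x."""
    lattice = elements(atoms)
    return all(
        m(x) < m(join(x, y))
        for x in lattice
        for y in lattice
        if y and disjoint(x, y)
    )


def violations_without_side_conditions(atoms):
    """Pairs (x, y) where m(x) < m(x v y) fails when no conditions are imposed."""
    lattice = elements(atoms)
    return [(x, y) for x in lattice for y in lattice if not m(x) < m(join(x, y))]


if __name__ == "__main__":
    print(order_axiom_holds(ATOMS))
    print(len(violations_without_side_conditions(ATOMS)))
```

In this toy setup the inequality holds under the charitable reading, and every failing pair found without the side conditions has y null or sharing an atom with x.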
Yes, and I see now that they do claim to regrade ⊕ to be + "over positive reals". Suppose we charitably weaken their statement to the claim that ⊕ is + up to regrading when restricted to the image of the lattice under the valuation. That is, consider the claim that there is a regrading Θ satisfying Θ(x ⊕ y) = Θ(x) + Θ(y) for all x, y in the image of the valuation (which is a finite set). Is this claim false? If I understand their goals correctly, this claim is all that they really need, anyway. Why should they care about real numbers that never appear as values of elements in the lattice?
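As a toy illustration of the weakened claim (not a proof, and not the paper's construction), here is a Python sketch in which I assume ⊕ is ordinary multiplication and the atoms have hypothetical values 2, 3, and 5. The logarithm then serves as a regrading Θ, and the check runs only over values that actually occur in the image of the valuation.

```python
import math
from itertools import combinations

# Hypothetical atom values; each exceeds 1 so that joining a non-null
# disjoint element strictly increases the valuation.
ATOM_VALUES = {"a": 2.0, "b": 3.0, "c": 5.0}


def elements(atoms):
    """All elements of the power-set lattice, each a frozenset of atoms."""
    return [
        frozenset(combo)
        for size in range(len(atoms) + 1)
        for combo in combinations(atoms, size)
    ]


def oplus(u, v):
    """Assumed combination rule for disjoint joins: multiplication."""
    return u * v


def m(x):
    """Valuation built by combining atom values with oplus (null element gets 1)."""
    value = 1.0
    for atom in x:
        value = oplus(value, ATOM_VALUES[atom])
    return value


def theta(u):
    """Candidate regrading: the natural logarithm turns products into sums."""
    return math.log(u)


def image():
    """The finite set of values actually taken by lattice elements."""
    return sorted({m(x) for x in elements(tuple(ATOM_VALUES))})


def regrading_is_additive_on_image():
    """Check theta(m(x) (+) m(y)) == theta(m(x)) + theta(m(y)) for disjoint x, y."""
    lattice = elements(tuple(ATOM_VALUES))
    for x in lattice:
        for y in lattice:
            if x & y:
                continue  # only disjoint pairs are constrained
            lhs = theta(oplus(m(x), m(y)))
            rhs = theta(m(x)) + theta(m(y))
            if not math.isclose(lhs, rhs, rel_tol=1e-12, abs_tol=1e-12):
                return False
    return True


if __name__ == "__main__":
    print(image())
    print(regrading_is_additive_on_image())
```

This only shows that such a Θ exists for one hand-picked ⊕ on one small lattice; whether a suitable Θ exists for every finite lattice satisfying the axioms is exactly the open question here.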
Of course, if I had seen an unfixable inconsistency, I would have started my first comment with that. It is just a minor inconsistency.
As for the number of atoms needed: they use an unlimited number of atoms to prove the existence of Θ. Maybe they don't really need it, but well... What they write is not a proof, and what they claim to prove is technically false.
Their claim may be accidentally true for a finite lattice. So far I have failed either to prove it or to find a counterexample.
Regardless of whether their axioms are sufficient they provide a very good motivation for a well-connected concept ...