I've recently been getting into all of this wonderful Information Theory stuff and have come across a paper (thanks to John Salvatier) that was written by Kevin H. Knuth:
The paper sets up some intuitive minimal axioms for quantifying power sets and then (seems to) uses them to derive Bayesian probability theory, information gain, and Shannon entropy. The paper also claims to use fewer assumptions than either Cox or Kolmogorov when choosing axioms. This seems like a significant foundation/unification. I'd like to hear whether others agree, and which parts of the paper you think are the significant contributions.
If a 14-page paper is too long for you, I recommend skipping to the conclusion (starting at the bottom of page 12), where there is a nice pictorial representation of the axioms and a quick summary of what they imply.
Are you referring to the line with "to arbitrary precision" on the bottom of page 17?
Although they don't express themselves as clearly as they could, I don't think that they mean anything like, "and hence we arrive at the exact regrading Θ in the limit by sending the number of atoms with the same valuation to infinity." Rather, I think that they mean that a larger number of atoms with the same valuations puts stronger constraints on the regrading Θ, but it is never so constrained that it can't exist.
In other words, their proof accommodates arbitrarily many atoms with the same valuation, but it doesn't require it.
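To illustrate what I mean (my own toy example, not from the paper): if k atoms each carry the same valuation v, then additivity forces Θ(v ⊕ ⋯ ⊕ v) = jΘ(v) for each j-fold combination with j ≤ k, so every extra atom pins Θ down at one more point. But assuming the j-fold combinations strictly increase with j, these are just finitely many consistently ordered constraints, and a strictly monotonically increasing Θ through them always exists.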
The more closely I've read their proof, the more confident I've become that they prove the following:
Let L be a finite lattice satisfying equations (0)–(3), and let a valuation m: L → R and a binary operation ⊕ on R satisfying axioms (0)–(3) be given. (I take the equations and axioms to be corrected as described here.)
Then there exists a strictly monotonically increasing function Θ: R → R such that Θ(a ⊕ b) = Θ(a) + Θ(b) for all a, b ∈ R such that a, b, and a ⊕ b are in the range of m.
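As a concrete instance of such a regrading (my own toy example, not from the paper): take the valuations to be positive reals with a ⊕ b = a + b + ab. Then Θ(x) = log(1 + x) is strictly monotonically increasing, and Θ(a ⊕ b) = log(1 + a + b + ab) = log((1 + a)(1 + b)) = Θ(a) + Θ(b), so Θ regrades ⊕ to ordinary addition.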
Well, they claim that the interleaving of a and b is linear. When they prove that if a/b > 5/3 then a/b > 3/2 (this implication is needed for consistency once we have 5 copies of a), they use 9 copies of a. It is easy to prove this particular case without appealing to more than 5 copies of a, but you need to do the things that the authors seem to specifically avoid.
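If I'm reading the ratio notation right, the claim amounts to: if 3a ≻ 5b then 2a ≻ 3b, writing na for the n-fold combination a ⊕ ⋯ ⊕ a and ≻ for the induced order. Here is a short route (my own sketch, assuming ⊕ is strictly monotone in each argument, so that equal terms can be cancelled, and that b exceeds the null valuation; I suspect these are exactly the moves the authors avoid): suppose for contradiction that 2a ≼ 3b. Combining both sides with a gives 3a ≼ 3b ⊕ a, and together with 5b ≺ 3a this yields 3b ⊕ 2b = 5b ≺ 3b ⊕ a; cancelling 3b leaves 2b ≺ a. Then 3b ≺ 4b = 2b ⊕ 2b ≺ a ⊕ 2b ≺ a ⊕ a = 2a, contradicting 2a ≼ 3b. This never uses more than 3 copies of a.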
The worst part is that they seem to use the "to arbitrary precision" argument to prove how adding a and b would work.
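For what it's worth, here is a minimal sketch (my own, not the authors' construction) of what I take the "to arbitrary precision" move to be: if all you can do is compare n copies of a against m copies of b, you can still trap the ratio a/b between m/n and (m+1)/n for any n, which determines it to arbitrary precision without ever producing the regrading exactly.

```python
from fractions import Fraction

def bound_ratio(compare, n):
    """Trap a/b in [m/n, (m+1)/n] using only order comparisons of
    repeated combinations: compare(i, j) must return True iff
    i copies of a (combined with ⊕) exceed j copies of b."""
    m = 0
    while compare(n, m + 1):  # n copies of a still exceed (m+1) copies of b
        m += 1
    return Fraction(m, n), Fraction(m + 1, n)

# Toy check, with ordinary addition standing in for ⊕:
a, b = 1.6180339887, 1.0
lo, hi = bound_ratio(lambda i, j: i * a > j * b, 1000)
print(lo, hi)  # brackets a/b = 1.618... within 1/1000
```

The point of the sketch is that each n only yields a rational bracket around a/b; turning "bracketed for every n" into an exact value is precisely the limit step that makes me uneasy in their proof.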
Let me look up the paper once more...
OK. I give up.