twanvl comments on Results from MIRI's December workshop - Less Wrong

45 Post author: Benja 15 January 2014 10:29PM


Comment author: twanvl 16 January 2014 03:59:51PM 1 point

I am not convinced by the problematic example in the "Scientific Induction in Probabilistic Mathematics" writeup. Let's say that there are n atoms ϕ(1)..ϕ(n). If you don't condition, then by symmetry all consistent sets S drawn from the process have equal probability. So the prior on S is uniform, and the probability of ϕ(i) is therefore 1/2, by

P(ϕ(i)) = ∑{S} 1[ϕ(i)∈S] * P(S)

That is, the probability that ϕ(i) is in a consistent set S drawn from the process is exactly 1/2 for all i; this must be true by symmetry, because μ(x)=μ(¬x). Now, to condition on some statement X, you should simply throw out the sets S which don't satisfy that statement, i.e.

P(ϕ(i)|X) = ∑{S} 1[ϕ(i)∈S] * P(S) * 1[X(S)] / ∑{S} P(S) * 1[X(S)]

Since the prior on S was uniform, it will still be uniform on the restricted set after conditioning. So

P(ϕ(i)|X) = ∑{S} 1[ϕ(i)∈S] * 1[X(S)] / ∑{S} 1[X(S)]

This should be just 90% in the example where X is "90% of the ϕ are true": by symmetry, each ϕ(i) appears in 90% of the sets that satisfy X.
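The filtering argument above can be checked by brute force on a small case. The following is a minimal sketch in Python, assuming n = 10 atoms and reading X as "exactly 9 of the 10 ϕ(i) are true". The function name `conditional_probability` and the 0-based index are illustrative choices, not anything taken from the writeup.

```python
from fractions import Fraction
from itertools import product


def conditional_probability(n, i, condition):
    """P(phi(i) | X) under a uniform prior over all 2**n truth assignments.

    Each tuple stands for one consistent set S: entry j is True when the
    j-th atom is in S and False when its negation is in S (0-based index).
    """
    # Conditioning on X = throwing out the sets that do not satisfy X.
    kept = [s for s in product([True, False], repeat=n) if condition(s)]
    # The prior is uniform, so it stays uniform on the kept sets and the
    # conditional probability is a simple ratio of counts.
    favourable = sum(1 for s in kept if s[i])
    return Fraction(favourable, len(kept))


# Assumed reading of X for this sketch: exactly 9 of the 10 atoms are true.
n = 10
x = lambda s: sum(s) == 9
probs = [conditional_probability(n, i, x) for i in range(n)]
print(probs[0])  # 9/10
```

Every atom gets the same value here, since 10 assignments satisfy X and each atom is true in 9 of them.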

The mistake in the writeup is to directly define P(S|X) in an inconsistent way.

To avoid drowning in notation, let's consider a simpler example with the variables a, b and c. We will first pick a or ¬a uniformly, then b or ¬b, and finally c or ¬c. Then we try to condition on X="exactly one of a,b,c is true". You obviously get prior probabilities P(S) = 1/8 for all consistent sets.

If you condition the right way, you get P(S|X) = 1/3 for the sets with exactly one true atom, and P(S|X) = 0 for the other sets. So then

P(a|X) = P(a|{a,¬b,¬c})P({a,¬b,¬c}|X) + P(a|{¬a,b,¬c})P({¬a,b,¬c}|X) + P(a|{¬a,¬b,c})P({¬a,¬b,c}|X)
= 1 * 1/3 + 0 * 1/3 + 0 * 1/3 = 1/3

What the writeup does instead is first pick a or ¬a uniformly. If it picks a, then b and c must be false. If it picks ¬a, it continues with b. The uniform choice of a is akin to saying that

P({a,b,c}|X) = P(a) * P({b,c}|a,X).

But that first term should be P(a|X), not P(a)!
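The difference between the two procedures can be made concrete for the a, b, c example. The sketch below is in Python and rests on an assumption: `sequential_marginals` is one reading of the procedure as described in this comment (pick each atom in order, uniformly among the values that X still allows), not the writeup's own definition. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("a", "b", "c")


def exactly_one(values):
    """X = "exactly one of a, b, c is true"."""
    return sum(values) == 1


def filtered_marginals(condition):
    """Condition the uniform prior by discarding assignments that violate X."""
    kept = [v for v in product([True, False], repeat=len(ATOMS)) if condition(v)]
    return {name: Fraction(sum(v[k] for v in kept), len(kept))
            for k, name in enumerate(ATOMS)}


def sequential_marginals(condition):
    """Pick atoms in order, uniformly among the values X still allows.

    Assumed reading of the procedure criticised above: the first choice
    uses P(a) = 1/2 even though X is already known.
    """
    n = len(ATOMS)

    def extendable(prefix):
        # True if some completion of this partial assignment satisfies X.
        rest = n - len(prefix)
        return any(condition(prefix + tail)
                   for tail in product([True, False], repeat=rest))

    marginals = {name: Fraction(0) for name in ATOMS}

    def walk(prefix, weight):
        if len(prefix) == n:
            for k, name in enumerate(ATOMS):
                if prefix[k]:
                    marginals[name] += weight
            return
        options = [val for val in (True, False) if extendable(prefix + (val,))]
        for val in options:
            walk(prefix + (val,), weight / len(options))

    walk((), Fraction(1))
    return marginals


right = filtered_marginals(exactly_one)    # a, b and c each get 1/3
wrong = sequential_marginals(exactly_one)  # a gets 1/2, b and c get 1/4
print(right["a"], wrong["a"])
```

Under this reading, the sequential procedure gives a twice the probability of b or c, even though X treats the three atoms symmetrically.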

Comment author: twanvl 17 January 2014 02:59:51PM 0 points

After writing this I realize that there is a much simpler prior on finite sets S of consistent statements: simply have a prior over all sets of statements, and keep only the consistent ones. If your language is chosen such that it contains X if and only if it also contains ¬X, then this is equivalent to choosing a truth value for each basic statement, and a uniform prior over these valuations would work fine.

Comment author: JeremyHahn 17 January 2014 03:54:58PM 0 points

The key here is that you are using finite S. What do you do if S is infinite? More concretely, is your schema convergent if you grow your finite S by adding more and more statements? I believe we touch on such worries in the writeup.