Comment author: taiyo 01 October 2011 08:38:02PM 3 points [-]

Thank you for this info. I've signed up. I think this flipped my mood from gloomy to happy.

Incidentally, this is the second study I've signed up for via the web. The first is the Good Judgement Project which has been a fun exercise so far.

In response to Bayesian Minesweeper
Comment author: Oscar_Cunningham 20 September 2011 12:50:18AM 3 points [-]

Using probability to win normal minesweeper: http://nothings.org/games/minesweeper/

Comment author: taiyo 20 September 2011 04:45:00AM 1 point [-]

I think minesweeper makes a nice analogy with many of the ideas of epistemic rationality espoused in this community. At a basic level, it demonstrates how probabilities are subjectively objective -- our state of information (the board state) is what determines the probability of a mine under an unknown square, yet given that information there really is only one correct set of mine probabilities. However, we quickly run into the problem of bounded cognition, and in that situation we resort to heuristics. Of course, heuristics are of varying quality, and it is possible, with mathematics, to make better ones.

For example, if you find that the set of possible configurations of mines in a particular neighborhood is partitioned into, say, those that involve k mines and those that involve k+1 mines, then you can get a pretty good estimate of the probability that the true configuration will be in one partition or the other. It depends on the density of mines under the squares that aren't known (something like a prior).
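That weighting can be made concrete. Below is a minimal Python sketch of it; `partition_odds` and its parameter names are my own (hypothetical), and it assumes standard minesweeper, where the total mine count is fixed, so a border configuration with m mines gets weight C(outside, M - m) from the arrangements of the remaining mines:

```python
from math import comb

def partition_odds(k, n_configs_k, n_configs_k1, mines_left, unknown_outside):
    """Relative probability that the true configuration lies in the
    k-mine partition vs the (k+1)-mine partition.

    Each border configuration with m mines leaves (mines_left - m) mines
    to distribute over `unknown_outside` squares, so it gets weight
    C(unknown_outside, mines_left - m)."""
    w_k  = n_configs_k  * comb(unknown_outside, mines_left - k)
    w_k1 = n_configs_k1 * comb(unknown_outside, mines_left - (k + 1))
    total = w_k + w_k1
    return w_k / total, w_k1 / total
```

When `unknown_outside` is large relative to `mines_left`, each extra mine in the border costs roughly a factor of the density odds, which is the "something like a prior" mentioned above.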

There are situations which come up in which one must decide just what one's goals are -- is it to survive the next click or to maximize the chance to win the game? Often these two goals result in the same decision, but sometimes, interestingly, they result in different decisions.

I like to play on a 24x30 board (the maximum allowed on Windows machines) with 200 mines. This makes the game rarely about deductive logic alone; at this density, situations in which probability theory is necessary come up all the time.

Comment author: Soki 01 July 2010 03:23:58PM 2 points [-]

I could not figure out why alpha > 0 either, and it seems wrong to me too. But this does not look like a problem.

We know that J is an increasing function because of 2-49. So in 2-53, alpha and log(x/S(x)) must have the same sign, since the rest of the right-hand side tends toward 0 as q tends toward +infinity.

Then b is positive, and I think that is all that matters.

However, if alpha = 0, then b is not defined. But if alpha = 0, then log(x/S(x)) = 0 as a consequence of 2-53, so x/S(x) = 1. There is only one x that gives us this, since S is strictly decreasing. And by continuity we can still get 2-56.

Comment author: taiyo 01 July 2010 08:53:06PM *  0 points [-]

Lovely. Thanks.

Comment author: Morendil 01 July 2010 04:17:26PM 0 points [-]

I'm totally stuck on getting 2.50 from 2.48, would appreciate a hint.

Comment author: taiyo 01 July 2010 05:44:01PM 1 point [-]

K. S. Van Horn gives a few lines describing the derivation in his PT:TLoS errata. I don't understand why he does step 4 there -- it seems to me to be irrelevant. The two main facts which are needed are step 2-3 and step 5, the sum of a geometric series and the Taylor series expansion around y = S(x). Hopefully that is a good hint.

One nitpick with his errata: 1/(1-z) = 1 + z + O(z^2) "for all z" is wrong, since the interval of convergence of the series on the RHS is (-1,1). This does not matter for the problem at hand, since here z = exp(-q), which is less than 1 because q is positive.
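A quick numerical illustration of that nitpick (a sketch in Python; the helper name `geometric_partial` is mine): the partial sums of the geometric series converge to 1/(1-z) inside (-1,1), including at z = exp(-q) for positive q, but not outside.

```python
import math

def geometric_partial(z, n):
    """Partial sum 1 + z + ... + z^(n-1) of the geometric series."""
    return sum(z**k for k in range(n))

z = math.exp(-2.0)              # z = exp(-q) with q = 2 > 0, so |z| < 1
exact = 1.0 / (1.0 - z)
assert abs(exact - geometric_partial(z, 50)) < 1e-12  # converges

# Outside the interval of convergence the partial sums run away from
# the value of 1/(1-z):
z_bad = 1.5
assert geometric_partial(z_bad, 10) > 100
assert 1.0 / (1.0 - z_bad) == -2.0
```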

Comment author: taiyo 30 June 2010 01:52:01AM *  6 points [-]

I would like to share some interesting discussion on a hidden assumption used in Cox's Theorem (this is the result which states that what falls out of the desiderata is a probability measure).

First, some criticism of Cox's Theorem -- a paper by Joseph Y. Halpern published in the Journal of AI Research. Here he points out an assumption which is necessary to arrive at the associative functional equation:

F(x, F(y,z)) = F(F(x,y), z) for all x,y,z

This is (2.13) in PT:TLoS

Because this equation was derived by using the associativity of the conjunction operation A(BC) = (AB)C, there are restrictions on what values the plausibilities x, y, and z can take. If these restrictions were stringent enough that x,y and z could only take on finitely many values or if they were to miss an entire interval of values, then the proof would fall apart. There needs to be an additional assumption that the values they can take form a dense subset. Halpern argues that this assumption is unnatural and unreasonable since it disallows "notions of belief with only finitely many gradations." For example, many AI projects have only finitely many propositions that are considered.
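As a sanity check on what the associativity equation demands, here is a small Python sketch (illustration only, not part of the proof): the product F(x, y) = xy, which is essentially the solution Cox's theorem eventually singles out after rescaling, satisfies the equation on a continuum of values.

```python
import random

def F(x, y):
    """Candidate combination rule: the ordinary product."""
    return x * y

random.seed(0)
for _ in range(1000):
    x, y, z = (random.random() for _ in range(3))
    # F(x, F(y, z)) = F(F(x, y), z) for all x, y, z in (0, 1)
    assert abs(F(x, F(y, z)) - F(F(x, y), z)) < 1e-12
```

Halpern's point is that the derivation needs the equation to hold on a dense set like this; a belief scale with only finitely many gradations never produces enough (x, y, z) triples to pin F down.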

K. S. Van Horn's article on Cox's Theorem addresses this criticism directly and powerfully starting on page 9. He argues that the theory that is being proposed should be universal and so having holes in the set of plausibilities should be unacceptable.

Anyhow, I found it interesting if only because it makes explicit a hidden assumption in the proof.

Comment author: Cyan 29 June 2010 10:56:39PM *  1 point [-]

Given C -> ~A, ({any proposition} | AC) is undefined. That's why I couldn't follow your argument all the way.

Comment author: taiyo 29 June 2010 11:33:08PM 0 points [-]

Ah OK. You're right. I guess I was taking the 'extension of logic' thing a little too far there. I had it in my head that ({any prop} | {any contradiction}) = T since contradictions imply anything. Thanks.

Comment author: Cyan 29 June 2010 02:27:58PM *  1 point [-]

Suppose (AB|C) = F[(A|BC) , (B|AC)]. Compare A=B=C with (A = B) AND (C -> ~A).

Not sure what you're getting at. To rule out (AB|C) = F[(A|BC) , (B|AC)], set A = B and let A's plausibility given C be arbitrary. Let T represent the (fixed) plausibility of a tautology. Then we have

(A|BC) = (B|AC) = T (because A = B)
(AB|C) = F(T, T) = constant

But (AB|C) is arbitrary by hypothesis, so (AB|C) = F[(A|BC) , (B|AC)] is not useful.
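The argument can be made concrete with a few numbers (a sketch; the [0, 1] scale and the sample plausibilities are my own assumptions): with A = B, the pair of arguments handed to F is always (T, T), yet the required output (AB|C) = (A|C) varies.

```python
T = 1.0  # plausibility of a tautology, on an assumed [0, 1] scale

inputs, outputs = set(), set()
for p_A_given_C in (0.1, 0.5, 0.9):
    outputs.add(p_A_given_C)   # required value of (AB|C) = (A|C), since A = B
    inputs.add((T, T))         # the only argument pair F ever sees when A = B

# One input, three required outputs: no function F can do this.
assert len(inputs) == 1 and len(outputs) == 3
```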

ETA: Credit where it's due: page 13, point 4 of Kevin S. Van Horn's guide to Cox's theorem (warning: pdf).

Comment author: taiyo 29 June 2010 03:10:44PM 0 points [-]

Yeah. My solution is basically the same as yours. Setting A=B=C makes F(T,T) = T. But setting A=B AND C -> ~A makes F(T,T) = F (warning: unfortunate notation collision here).

Comment author: Morendil 29 June 2010 08:13:25AM 2 points [-]

OK, thanks.

I'm able to follow a fair bit of what's going on here; the hard portions for me are when Jaynes gets some result without saying which rule or operation justifies it - I suppose it's obvious to someone familiar with calculus, but when you lack these background assumptions it can be very hard to infer what rules are being used, so I can't even find out how I might plug the gaps in my knowledge. (Definitely "deadly unk-unk" territory for me.)

(Of course "follow" isn't the same thing at all as "would be able to get similar results on a different but related problem". I grok the notion of a functional equation, and I can verify intermediate steps using a symbolic math package, but Jaynes' overall strategy is obscure to me. Is this a common pattern, taking the derivative of a functional equation then integrating back?)

The next bit where I lose track is 2.22. What's going on here, is this a total derivative?

Comment author: taiyo 29 June 2010 08:56:07AM *  1 point [-]

Yeah. A total derivative. The way I think about it is the dv thing there (jargon: a differential 1-form) eats a tangent vector in the y-z plane. It spits out the rate of change of the function in the direction of the vector (scaled appropriately with the magnitude of the vector). It does this by looking at the rate of change in the y-direction (the dy stuff) and in the z-direction (the dz stuff) and adding those together (since after taking derivatives, things get nice and linear).
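In symbols (my own notation for the step, assuming as in the text that v is a function of y and z):

```latex
\mathrm{d}v \;=\; \frac{\partial v}{\partial y}\,\mathrm{d}y \;+\; \frac{\partial v}{\partial z}\,\mathrm{d}z
```

The dy and dz terms are exactly the "rate of change in each direction" pieces described above, added together because derivatives are linear.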

I'm not too familiar with the functional equation business either. I'm currently trying to figure out what the heck is happening on the bottom half of page 32. Figuring out the top half took me a really long while (esp. 2.50).

I'm convinced that the inequality in eqn 2.52 shouldn't be there. In particular, when you stick in the solution S(x) = 1 - x, it's false. I can't figure out if anything below it depends on that because I don't understand much below it.

Comment author: Morendil 29 June 2010 12:50:05AM *  0 points [-]

One question to start this off:

  • To derive the Product Rule we are invited to consider (AB|C) as a functional relation of other plausibilities; one candidate is ruled out with the aid of a scenario involving blue eyes and brown eyes. Can you think of similar examples ruling out other candidate relations?

(More discussion questions always welcome !)

I have lots of questions about the math itself. I'll limit myself to one to start with. Can someone with more math than I have confirm that we get (2-7 (*)) by application of the chain rule of calculus?

(*) ETA - that's 2.16 in the printed edition

Comment author: taiyo 29 June 2010 06:21:08AM *  0 points [-]

I did not go through the 9 remaining cases, but I did think about one...

Suppose (AB|C) = F[(A|BC) , (B|AC)]. Compare A=B=C with (A = B) AND (C -> ~A).

Re 2-7: Yep, chain rule gets it done. By the way, took me a few minutes to realize that your citation "2-7" refers to a line in the pdf manuscript of the text. The numbering is different in the hardcopy version. In particular, it uses periods (e.g. equation 2.7) instead of dashes (e.g. equation 2-7), so as long as we're all consistent with that, I don't suppose there will be much confusion.

Comment author: Morendil 22 June 2010 07:37:32AM 1 point [-]

Book Club Update

As promised, this is a "minor" update, i.e. I'm not making a new top-level post to prompt new reading for this week, but sticking to a comment. We have new information on meeting times, and new chunks to read. Next week we will start on Chapter 2, this time with a top-level update. We'll see how this works.

New live meeting schedule

The spreadsheet has proven effective as a way to coordinate meeting times for widely scattered participants starting from suboptimal initial values. The most voted-on time is 18:00 UTC, which is around 1pm in the Bay area and 9pm in Europe (other offsets can be looked up in the table). Participants have suggested a weekend meeting. I have updated the post above to reflect the new information.

Still, of the (now) 80 participants listed in the spreadsheet, only 16 have indicated a preferred meeting time so far. If you're interested in live meetings and haven't updated your info yet, please do so.

Reading for the week of 21/06

We continue with Chapter 1, sections: Boolean Algebra - Adequate Sets of Operations - The Basic Desiderata - Comments - Common Language vs Formal Logic - Nitpicking

(The Comments section is well worth reading, as it introduces the Mind Projection Fallacy which LW readers who have gone through the Sequences should be familiar with.)

Questions for the second part of Chapter 1 (some participants have already started on that, which is fine):

  • Jaynes discusses a "tricky point" with regard to the difference between the everyday meaning of the verb "imply" and its logical meaning; are there other differences between the formal language of logic and everyday language?
  • Can you think of further desiderata for plausible inference, or find issues with the one Jaynes lays out?
Comment author: taiyo 24 June 2010 10:18:18PM 1 point [-]

> Jaynes discusses a "tricky point" with regard to the difference between the everyday meaning of the verb "imply" and its logical meaning; are there other differences between the formal language of logic and everyday language?

In formal logic, the disjunction "or" is inclusive: "A or B" is true even when A and B are both true. In everyday language, "or" is typically exclusive: "A or B" is meant to exclude the possibility that both A and B are true.
