Comment author: DavidS 23 October 2014 12:39:14PM 36 points [-]

I am curious what kind of analysis you plan to run on the calibration questions. Obvious things to do:

For each user, compute the correlation between their probabilities and the 0-1 vector of right and wrong answers. Then display the correlations in some way (a histogram?).

For each question, compute the mean (or median) of the probabilities given with correct answers and with wrong answers, and see how well separated they are.

But neither of those feels like a really satisfactory measure of calibration.

Comment author: William_Quixote 24 September 2014 07:23:49PM 3 points [-]

More substantively, can we express mathematically how the correlation between the leaked signal and the final choice affects the degree of suboptimality in the final payouts?

Naively, in the actual Newcomb's problem, if Omega is only correct 1/999,000+epsilon percent of the time, then CDT seems to do about as well as whatever theory solves this problem. Is there a known general case for this reasoning?

Comment author: DavidS 03 October 2014 08:09:54PM *  1 point [-]

"Naively in the actual Newcomb's problem if Omega is only correct 1/999,000+epsilon percent of the time…"

I'd like to argue with this by way of a parable. The eccentric billionaire, Mr. Psi, invites you to his mansion for an evening of decision theory challenges. Upon arrival, Mr. Psi's assistant brings you a brandy and interviews you for hours about your life experiences, religious views, favorite philosophers, ethnic and racial background … You are then brought into a room. In front of you is a transparent box with a $1 bill in it, and an opaque box. Mr. Psi explains:

"You may take just the opaque box, or both boxes. If I predicted you would take one box, then the opaque box contains $1000; otherwise it is empty. I am not as good at this game as my friend Omega, but out of my last 463 games, I predicted "one box" 71 times and was right 40 times out of 71; I predicted "two boxes" 392 times and was right 247 times out of 392. To put it another way, those who one-boxed got an average of (40×$1000 + 145×$0)/185 ≈ $216, and those who two-boxed got an average of (31×$1001 + 247×$1)/278 ≈ $113."

So, do you one-box?

"Mind if I look through your records?" you say. He waves at a large filing cabinet in the corner. You read through the volumes of records of Mr. Psi's interviews, and discover his accuracy is as he claims. But you also notice something interesting (ROT13): Ze. Cfv vtaberf nyy vagreivrj dhrfgvbaf ohg bar -- ur cynprf $1000 va gur obk sbe gurvfgf naq abg sbe ngurvfgf.

Still willing to say you should one-box?

By the way, if it bothers you that the odds of $1000 are less than 50% no matter what, I also could have made Mr. Psi give money to 99/189 one boxers (expected value $524) and only to 132/286 two boxers (expected value $463) just by hfvat gur lrne bs lbhe ovegu (ROT13). This strategy has a smaller difference in expected value, and a smaller success rate for Mr. Psi, but might be more interesting to those of you who are anchoring on $500.
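For anyone who wants to verify the bookkeeping in the parable, here is the arithmetic, using only the counts Mr. Psi states:

```python
# Mr. Psi's record: 71 "one box" predictions (40 correct),
# 392 "two boxes" predictions (247 correct), 463 games total.
one_boxers = 40 + (392 - 247)   # correctly predicted + mispredicted -> 185
two_boxers = (71 - 40) + 247    # mispredicted + correctly predicted -> 278

# Only correctly predicted one-boxers find the $1000; mispredicted
# two-boxers take both the $1000 and the visible $1.
ev_one_box = (40 * 1000 + 145 * 0) / one_boxers
ev_two_box = (31 * 1001 + 247 * 1) / two_boxers
```

These come out to roughly $216 and $113, matching the figures in Mr. Psi's speech.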

Comment author: RichardKennaway 01 April 2014 01:43:56PM 2 points [-]

The problem I find with all pop-level proofs of Gödel's theorems and similar material, including this one, is that they gloss over a key component: how to make a machine that talks about itself. After the part quoted above, a blogger (not Smullyan) does go on to say:

The proof of Gödel's theorem shows that there are statements of pure arithmetic that essentially express NPR; the trick is to find some way to express NPR as a statement about arithmetic, and most of the technical details (and cleverness!) of Gödel's theorem are concerned with this trick.

No explanation of this essential part of the proof is given. Unless you do that part, there's nothing in the supposed proof to limit it to systems that include arithmetic.

Comment author: DavidS 04 April 2014 12:41:41AM 1 point [-]

A few years ago, I tried to write a friendly introduction to this technical part.

Comment author: Dahlen 20 February 2014 05:51:32PM *  0 points [-]

"Is false when preceded by its quotation" is false when preceded by its quotation.

I feel stupid for this, but I can't quite wrap my head around it. Can somebody please ELI5? (I'm asking LW because it seems to have more than its fair share of math & logic whizzes.)

Comment author: DavidS 21 February 2014 06:41:59AM *  0 points [-]

The grammar of the sentence is a bit hard to follow. When I am presenting this paradox to friends (I have interesting friends), I hand them a piece of paper with the following words on it:

Take another piece of paper and copy these words:

"Take another piece of paper and copy these words: "QQQ" Then replace the three consecutive capital letters with another copy of those words. The resulting paragraph will make a false claim."

Then replace the three consecutive capital letters with another copy of those words. The resulting paragraph will make a false claim.

I urge you to carry out the task. You should wind up with a paper that has the exact same words on it as the paper I gave you.

If you believe that the statement on my paper is true, then you should believe that the statement on your paper is false, and vice versa. Yet they are the same statement! Assuming that you think truth or falsehood is a property of grammatical sentences, independent of where they are written, this should bother you. Moreover, unlike the standard liar paradox, the paper I gave never talks about itself, it only talks about a message you will write on some other piece of paper (which does not, in turn, talk about the original message) when you perform some simple typographical operations.

Quine constructed this example to demonstrate the sort of subtleties that come up when inventing a mathematical formalism that can talk about truth, and can talk about manipulating symbols, without bringing in the liar paradox. (To learn how this problem is solved, take a course on mathematical logic and Gödel's theorem.)

Comment author: Oscar_Cunningham 20 February 2014 06:32:01PM 3 points [-]

A monkey who picked randomly between the experts would do better than both the "frequentist" and the "bayesian". Maybe that should worry us...

Comment author: DavidS 20 February 2014 09:19:18PM 1 point [-]

Well, I was trying to make the simplest possible example. We can of course add the monkey to our pool of experts. But part of the problem of machine learning is figuring out how long we need to watch an expert fail before we go to the monkey.

Comment author: trist 11 February 2014 04:27:46AM -1 points [-]

And how do you expect to do better by using the weighted majority algorithm? Not in the worst case, but on average?

Comment author: DavidS 20 February 2014 04:22:28PM *  2 points [-]

Suppose there are two experts and two horses. Expert 1 always predicts horse A, expert 2 always predicts horse B, and the winning horse cycles ABABABABABA... The frequentist randomizes his choice of expert according to the weights; the Bayesian always follows the expert who currently has more successes, and flips a coin when the experts are tied. (Disclaimer: I am not saying these are the only possible strategies consistent with these philosophies; I am just saying that they seem like the simplest possible instantiations of "when I actually choose which expert to follow on a given round, I randomize according to my weights, whereas a Bayesian would always deterministically pick the one with the highest expected value.")

If the frequentist starts with weights (1,1), then the weights will be (1/2^k, 1/2^k) half the time and (1/2^k, 1/2^{k+1}) half the time. In the former case, we guess A and B with equal probability, for a success rate of 50%; in the latter case, we guess A (the most recent winner) with probability 2/3 and B with probability 1/3, for a success rate of 33%. On average, we achieve 5/12 ≈ 41.7% success. Comparing mistake rates: 0.583/0.5 = 1.166... < 1.39, as predicted.

On half the horses, expert 1 has one more correct guess than expert 2, so the Bayesian follows expert 1 and loses every one of these bets; on the other half, the experts are tied and the Bayesian is guessing at random. His or her total success rate is therefore 25%, while the experts themselves each achieve 50%. Comparing mistake rates: 0.75/0.5 = 1.5 < 2.41, as we expect.

As is usually the case, softening the penalty from 1/2 to (for example) 0.9 would yield slower convergence but better ultimate behavior. In the limit as the penalty factor goes to 1, the frequentist achieves the same 50% that the "experts" do, while the Bayesian is still stuck at 25%.

Now, of course, one would hope that no algorithm would be so Sphexish as to ignore pure alternation like this. But the point of the randomized weighted majority guarantee is that it holds (up to the correctness of the random number generator) regardless of how much more complicated reality may be than the experts' models.
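Since the alternating-horses example is deterministic apart from the frequentist's coin, the expected success rates can be computed exactly rather than simulated. A sketch (function names are mine; "Bayesian" here means the follow-the-leader strategy described above):

```python
def rwm_success(rounds, penalty=0.5):
    """Expected success rate of randomized weighted majority: expert 0
    always picks horse A, expert 1 always picks horse B, and the
    winner alternates A, B, A, B, ..."""
    w = [1.0, 1.0]
    total = 0.0
    for t in range(rounds):
        winner = t % 2
        total += w[winner] / (w[0] + w[1])  # prob. of following the right expert
        w[1 - winner] *= penalty            # penalize the expert who was wrong
    return total / rounds

def ftl_success(rounds):
    """Expected success rate of follow-the-leader: back the expert with
    more past successes, flipping a fair coin when they are tied."""
    scores = [0, 0]
    total = 0.0
    for t in range(rounds):
        winner = t % 2
        if scores[0] == scores[1]:
            total += 0.5                    # coin flip on ties
        elif scores[winner] > scores[1 - winner]:
            total += 1.0                    # the leader happens to be right
        scores[winner] += 1
    return total / rounds
```

`rwm_success(1000)` comes out to the 5/12 computed above and `ftl_success(1000)` to 25%; raising `penalty` toward 1 pushes the frequentist's rate toward the experts' 50%, as in the last paragraph.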

In response to comment by DavidS on Even Odds
Comment author: Coscott 13 January 2014 04:54:14PM 3 points [-]

I thought it was very interesting that my natural assumptions led to a Brier-score-like system rather than the Bayes score. I really don't think Bayesians respect the Brier score enough.

In response to comment by Coscott on Even Odds
Comment author: DavidS 13 January 2014 06:20:26PM 0 points [-]

I thought it was interesting too. As far as I can tell, your result is special to the situation of two bettors and two events. The description I gave yields a betting method when there are more than two alternatives, and that method is strategy-proof, but it is not fair, and I can't find a fair version of it.

I am really stumped about what to do when there are three people and a binary question. Naive approaches give no money to the person with the median opinion.

In response to Even Odds
Comment author: DavidS 13 January 2014 04:14:53PM *  4 points [-]

Here is another attempt to present the same algorithm, with the goal of making it easier to memorize:

"Each puts in the square of their surprise, then swap."

To spell this out, I predict that some event will happen with probability 0.1, you say it is 0.25. When it happens, I am 0.9 surprised and you are only 0.75 surprised. So I put down (0.9)^2 * D, you put down (0.75)^2 * D, and we swap our piles of money. Since I was more surprised, I come out the loser on the deal.

"Square of the surprise" is a quantity commonly used to measure the failure rate of predictive agents in machine learning; it is also known as the Brier score. So we could describe this rule as "each bettor pays the other his or her Brier score." There was some discussion of the merits of various scoring systems in an earlier post of Coscott's.
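As a sketch in code (the function name and argument names are mine; `stake` is the D above):

```python
def brier_swap(p_mine, p_yours, happened, stake=1.0):
    """'Each puts in the square of their surprise, then swap.'
    Surprise is 1 minus the probability assigned to what actually
    occurred; each bettor stakes stake * surprise**2 and the piles
    are exchanged. Returns my net gain (negative if I lose)."""
    my_surprise = 1 - p_mine if happened else p_mine
    your_surprise = 1 - p_yours if happened else p_yours
    return stake * (your_surprise ** 2 - my_surprise ** 2)

# The example above: I say 10%, you say 25%, the event happens.
my_gain = brier_swap(0.1, 0.25, happened=True)  # negative: I was more surprised
```

Note the rule is automatically zero-sum: my gain is exactly your loss.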

Comment author: So8res 01 November 2013 01:07:39AM *  0 points [-]

Thanks! Good point, that distinction is useful. I've updated the post to make this more clear (under the "models" header).

Personally, I tend to view things the other way around. As far as I'm concerned, a model of abelian group theory is anything that interprets sentences appropriately (while obeying the rules of the logic), for some value of "interprets".

It so happens that any model of group theory is isomorphic to some pointed set with an associative operator for which the point is an identity, but the model doesn't have to be a pointed set with an associative operator for which the point is an identity. It could also be the operations on a Rubik's cube. From my point of view, you've got 'IS' and 'DOES' backwards :-)

It's just perspective, I suppose. I don't particularly view set theory as foundational; I view it as one formalization that happens to have high enough fidelity to represent the behavior of any given model.

Still, your view is definitely the more standard one.

As far as I can tell, you nowhere say what a model is, even approximately

Well, this post is "Context for Model Theory": I didn't intend to introduce models themselves here. Though your concerns probably apply to the follow-up post as well.

Comment author: DavidS 01 November 2013 02:10:57AM *  0 points [-]

The operations on a Rubik's cube aren't abelian. Is that just a typo on your part, or am I missing some subtle point you are making?

I'm not sure what you are getting at when you say you don't want to found math on sets. I definitely intended to use the word "set" in a naive sense, so that it is perfectly fine for sets to contain numbers, or rotations of a Rubik's cube, or for that matter rocks and flowers. I wasn't trying to imply that the elements of a model had to be recursively constructed from the empty set by the axioms of ZFC. If you prefer "collection of things", I'd be glad to go with that. But I (and more to the point, model theorists) do want to think of a model as a bunch of objects, together with functions that take these objects as inputs and produce other objects, and relations which do or do not hold between various pairs of the objects.

I'm retracting a bunch of the other things I wrote after this because, on reflection, the later material was replying to a misreading of what you wrote in your following post. I still think your de-emphasis on the fact that the model is the universe is very confusing, especially when you then talk about the cardinality of models. (What is the cardinality of a rule for assigning truth values?) But on careful reading, you weren't actually saying something wrong.

Comment author: shminux 01 November 2013 12:25:02AM *  -1 points [-]

However, you shouldn't identify an abelian group with a way of assigning truth values to statements about abelian groups. For example, the rational numbers and the real numbers are both abelian groups and, as it turns out, there is no statement using only +, 0, = and logical connectives whose truth value is different in these two groups. Nonetheless, they are different models.

Hmm, but the axiom sets are different for rationals and reals, since the latter require Dedekind-completeness, which selects a different theory from the language+logic (in So8res's terms). Why would one try to compare/distinguish models in different theories based on a subset of the logic and a subset of axioms?

Comment author: DavidS 01 November 2013 12:54:17AM 1 point [-]

The reals can be studied as models of many theories. They (with the operation +, relation = and element 0) are a model of the axioms of an abelian group. They are also a model of the axioms of a group. The reals with (+, *, 0, 1, =) are a model of the axioms of a field. The reals with (+, *, 0, 1, =, <) are a model of the axioms of an ordered field. Etcetera...

Models are things. Theories are collections of statements about things. A model can satisfy many theories; a theory can have many models. I agree completely with So8res's statement that it is important to keep the two straight.

In addition, your example of Dedekind completeness is an awkward one, because the Dedekind completeness axiom is a good example of the kind of thing you can't say in first-order logic. (There are partial ways around this, but I'm trying to keep my replies at the introductory level of this post.) But I can just imagine that you had instead distinguished the reals and the rationals by saying that ∃x : x^2 = 1+1 is true in R and false in Q, so I don't need to focus on that.
