BrianScurfield comments on Taking Ideas Seriously - Less Wrong

51 Post author: Will_Newsome 13 August 2010 04:50PM



Comment author: [deleted] 24 August 2010 07:40:22PM 0 points [-]

A better phrasing for that might have been "certain knowledge is a myth." What cannot be logically justified is reasoning from particular observations to certainty in universal truths.

No, that is not equivalent. Popper wrote that "inference based on many observations is a myth". He is saying that we never reason from observations, never mind reasoning to certainty. In order to observe, you need theories. Without those, you cannot know what things you should observe or even make sense of any observation. Observation enables us to test theories, it never enables us to construct theories. Furthermore, Popper throws out the whole idea of justifying theories. We don't need justification at all to progress. Judging from Where Recursive Justification Hits Bottom, this is something Eliezer has not fully taken on board (though I may be wrong). He sees the problem of the tu-quoque, but he still says [e]verything, without exception, needs justification. No, nothing can be justified. Knowledge advances not positively by justifying things but negatively by refuting things. Eliezer does see the importance of criticism, but my impression is that he doesn't know Popper well enough.

Comment author: timtyler 24 August 2010 08:48:32PM *  2 points [-]

For Yudkowsky on Popper, start here:

"Previously, the most popular philosophy of science was probably Karl Popper's falsificationism - this is the old philosophy that the Bayesian revolution is currently dethroning."

...and keep reading - at least as far as:

"On the other hand, Popper's idea that there is only falsification and no such thing as confirmation turns out to be incorrect. Bayes' Theorem shows that falsification is very strong evidence compared to confirmation, but falsification is still probabilistic in nature; it is not governed by fundamentally different rules from confirmation, as Popper argued."

Comment author: [deleted] 24 August 2010 09:54:20PM 3 points [-]

Yudkowsky gets a lot wrong even in a few sentences:

Previously, the most popular philosophy of science was probably Karl Popper's falsificationism - this is the old philosophy that the Bayesian revolution is currently dethroning.

First, Popper's philosophy cannot be accurately described as falsificationism - that is just one component of it, and not the most important one. Popperian philosophy consists of many inter-related ideas and arguments. Yudkowsky makes an error that Popperian newbies make. One suspects from this that Yudkowsky is making himself out to be more familiar with Popper than he actually is. His claim to be dethroning Popper would then be dishonest, as he does not have detailed knowledge of the rival position. He is also wrong that Popper is popular: he isn't. Furthermore, Popper was familiar with Bayesian epistemology and actually discusses it in his books. So calling Popper's philosophy old and making out that Bayesian epistemology is new is wrong as well.

Karl Popper's idea that theories can be definitely falsified, but never definitely confirmed, is yet another special case of the Bayesian rules;

Popper never said theories can be definitely falsified. He was a thoroughgoing fallibilist and viewed falsifications as fallible conjectures. Also he said that theories can never be confirmed at all, not that they can be partially or probabilistically confirmed, which the above sentence suggests he said. Saying falsification is a special case of the Bayesian rules also doesn't make sense: falsification is anti-induction whereas Bayesian epistemology is pro-induction.

Comment author: [deleted] 25 August 2010 04:41:27PM -1 points [-]

Further comments on Yudkowsky's explanation of Bayes:

science itself is a special case of Bayes' Theorem; experimental evidence is Bayesian evidence.

Science revolves around explanation and criticism. Most scientific ideas never get to the point of testing (which is a form of criticism), they are rejected via criticism alone. And they are rejected because they are bad explanations. Why is the emphasis in the quote solely on evidence? If science is a special case of Bayes, shouldn't Bayes have something to say about explanation and criticism? Do you assign probabilities to criticism? That seems silly. Explanations and criticism enable us to understand things and to see why they might be true or false. Trying to reduce things to probabilities is to completely ignore the substance of explanations and criticisms. Instead of trying to get a probability that something is true, you should look for criticisms. You accept as tentatively true anything that is currently unproblematic and reject as tentatively false anything that is currently problematic. It's a boolean decision: problematic or unproblematic.

Comment author: whpearson 25 August 2010 05:00:04PM 5 points [-]

Both bayesian induction (as we currently know it) and Popper fail my test for a complete epistemology.

The test is simple. Can I use the description of the formalism to program a real computer to do science? And it should, in theory, be able to bootstrap itself from no knowledge of science to our level.

Comment author: Perplexed 25 August 2010 05:33:57PM 4 points [-]

I think that the contribution that Bayesian methodology makes toward good criticism of a scientific hypothesis is that to "do the math", you need to be able to compute P(E|H). If H is a bad explanation, you will notice this when you try to determine (before you see E) how you would go about computing P(E|H). Alternately, you discover it when you try to imagine some E such that P(E|H) is different from P(E|not H).

No, you don't assign probabilities to criticisms, as such. But I do think that every atomic criticism of a hypothesis H contains at its heart a conditional proposition of the form (E|H) or else a likelihood odds ratio P(E|H)/P(E|not H) together with a challenge, "So how would you go about calculating that?"
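To make the "do the math" step concrete, here is a minimal Python sketch of turning such an atomic criticism into an odds update (all numbers are invented for illustration):

```python
# Hypothetical likelihoods: how probable is evidence E under H versus under not-H?
p_e_given_h = 0.8      # H predicts E fairly strongly
p_e_given_not_h = 0.2  # not-H makes E unlikely

# The likelihood odds ratio at the heart of the criticism (about 4 here):
odds_ratio = p_e_given_h / p_e_given_not_h

# Observing E multiplies the prior odds on H by this factor.
prior_odds = 1.0  # even odds, i.e. P(H) = 0.5
posterior_odds = prior_odds * odds_ratio
posterior_p_h = posterior_odds / (1 + posterior_odds)  # about 0.8
print(posterior_p_h)
```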

Incidentally, you also ought to look at some of the earlier postings where EY was, in effect, using naive Bayes classifiers to classify (i.e. create ontologies), rather than using Bayes's theorem to evaluate hypotheses that predict. Also take a look at Pearl's book to get a modern Bayesian view of what explanation is all about.

Comment author: timtyler 25 August 2010 05:13:37PM *  3 points [-]

Instead of trying to get a probability that something is true, you should look for criticisms.

If you were asked to bet on whether it was true or not, then you should assign a probability.

Scientists often do something like that when deciding how to allocate their research funds.

Comment author: [deleted] 25 August 2010 05:26:46PM 0 points [-]

But then we have to develop a quantitative formalism for both beliefs and utilities. Is it really necessary to attack both problems at once?

Comment author: [deleted] 25 August 2010 05:50:57PM 2 points [-]

Human beings don't actually seem to have utility functions, all they really have are "preferences" i.e. a method for choosing between alternatives. But von Neumann and Morgenstern showed that under some conditions this is the same as having a utility function.

Now Scurfield is saying that human beings, even smart ones like scientists, don't have prior probability distributions, all they really have is a database of claims and criticisms of those claims. Is there any result analogous to von Neumann-Morgenstern that says this is the same thing as having a prior, under conditions?

Comment author: Perplexed 26 August 2010 12:32:40AM 4 points [-]

Yes. The question has been addressed repeatedly by a variety of people. John Maynard Keynes may have been the first. Notable formulations since his include those of de Finetti, Savage, and Jeffrey (in his online book).

Discovering subjective probabilities is usually done in conjunction with discovering utilities by revealed preferences because much of the machinery (choices between alternatives, lotteries) is shared between the two problems. People like Jaynes who want a pure epistemology uncontaminated by crass utility considerations have to demand that their "test subjects" adhere to some fairly hard-to-justify consistency rules. But people like de Finetti don't impose arbitrary consistency, instead they prove that inconsistent probability assignments lose money to clever gamblers who construct "Dutch books".
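A tiny Python sketch of the de Finetti-style argument (the prices are invented): an agent whose degrees of belief violate P(A) + P(not A) = 1 will buy a pair of tickets that loses money no matter how A turns out.

```python
# An agent's incoherent beliefs: P(A) = 0.6 and P(not A) = 0.6, summing to 1.2.
p_a = 0.6
p_not_a = 0.6

# The agent regards $0.60 as a fair price for a $1-if-A ticket, and likewise
# for not-A, so a clever bookie sells it both tickets.
bookie_income = p_a + p_not_a  # collects $1.20 for the pair

# Exactly one ticket pays out, whichever way A turns out.
bookie_payout = 1.0

# The bookie pockets about $0.20 regardless of the outcome: a Dutch book.
guaranteed_profit = bookie_income - bookie_payout
print(guaranteed_profit)
```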

Comment author: Cyan 26 August 2010 01:30:34AM 0 points [-]

some fairly hard-to-justify consistency rules

I'd be interested in reading more about your views on this (unless you're referring to Halpern's papers on Cox's theorem).

Comment author: Perplexed 26 August 2010 01:57:55AM 1 point [-]

I'm not even familiar with Halpern's work. The only serious criticism I have seen regarding the usual consistency rules for subjective probabilities dealt with the "sure thing rule". I didn't find it particularly convincing.

No, I have no trouble justifying a mathematical argument in favor of this kind of consistency. But not everyone else is all that convinced by mathematics. Their attention can be grabbed, however, by the danger of being taken to the cleaners by professional bookies armed with Dutch books.

One of these days, I will get around to producing a posting on probability, developing it from what I call the "surprisal" of a proposition - the amount, on a scale from zero to positive infinity, by which you would be surprised upon learning that a proposition is true.

  • Prob(X) = 2^(-Surp(X)).
  • Surp(coin flip yields heads)= 1 bit.
  • Surp(A) + Surp(B|A) = Surp(A&B)

That last formula strikes me as particularly easy to justify (surprisals are additive). Given that and the first formula, you can easily derive Bayes' law. The middle formula simply fixes the scale for surprisals. I suppose we also need a rule that Surp(True) = 0.
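As a quick numeric check of these three rules, a Python sketch (`surp` is a hypothetical helper, not a standard-library function):

```python
import math

def surp(p):
    # Surprisal in bits: Surp(X) = -log2(Prob(X)), i.e. Prob(X) = 2^(-Surp(X)).
    return -math.log2(p)

# A small joint distribution over propositions A and B.
p_a = 0.4
p_b_given_a = 0.25
p_ab = p_a * p_b_given_a  # 0.1

# Additivity: Surp(A) + Surp(B|A) = Surp(A & B).
assert math.isclose(surp(p_a) + surp(p_b_given_a), surp(p_ab))

# Scale: a fair coin flip carries exactly 1 bit of surprisal.
assert surp(0.5) == 1.0

# A certain proposition is not surprising at all: Surp(True) = 0.
assert surp(1.0) == 0.0
```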

Comment author: Sniffnoy 26 August 2010 03:33:57AM 0 points [-]

developing it from what I call the "surprisal" of a proposition

Actually "Surprisal" is a pretty standard term, I think.

Comment author: [deleted] 26 August 2010 02:02:41AM 0 points [-]

surprisal

Yudkowsky suggests calling it "absurdity" here

Comment author: [deleted] 26 August 2010 12:43:15AM 0 points [-]

Wild. Is there an exposition of subjective expected utility better than wikipedia's?

Comment author: Perplexed 26 August 2010 12:47:57AM 1 point [-]

Jeffrey's book, which I already linked, or any good text on game theory. Myerson, for example, or Luce and Raiffa.

Comment author: timtyler 25 August 2010 05:32:41PM 0 points [-]

Agents can reasonably be expected to quantify both beliefs and utilities. How the ability to do that is developed - is up to the developer.

Comment author: [deleted] 25 August 2010 05:33:55PM 0 points [-]

People are agents, and they are very bad at quantifying their beliefs and utilities.

Comment author: [deleted] 25 August 2010 05:01:48PM 1 point [-]

I like this point a lot. But it seems very convenient and sensible to say that some things are more problematic than others. And at least for certain kinds of claims it's possible to quantify how problematic they are with numbers. This leads one (me at least) to want a formalism -- for handling beliefs -- that involves numbers, and Bayesianism is a good one.

What's the conjectures-and-refutations way of handling claims like "it's going to snow in February"? Do you think it's meaningless or useless to attach a probability to that claim?

Comment author: [deleted] 25 August 2010 09:12:10PM 1 point [-]

There is no problem with theories that make probabilistic predictions. But getting a probabilistic prediction is not tantamount to assigning a probability to the theory that made the prediction.

Comment author: Perplexed 25 August 2010 09:44:46PM *  2 points [-]

True. But you seem to be assuming that a "theory" has to be a universal law of nature. You are too attached to physics. In other sciences, you can have a theory which is quite explanatory but is not in any sense a "law"; rather, it describes an event. Examples:

  • the theory that the moon was formed by a collision between the earth and a Mars-sized planetesimal.
  • the theory that modern man originated in Africa within the past 200,000 years and that the Homo erectus population outside of Africa did not contribute to our ancestry.
  • the theory that Napoleon was poisoned with arsenic on St. Helena.
  • the "aquatic ape theory"
  • the endosymbiotic theory of the origin of mitochondria
  • the theory that the Chinese discovered America in 1421.

Probabilities can be assigned to these theories.

And even for universal theories, you can talk about the relative odds of competing theories being correct - say between a supersymmetric GUT based on E6 and one based on E8. (Notice, I said "talk about the odds", not "calculate them".) And you can definitely calculate how much one particular experimental result shifts those odds.

Comment author: [deleted] 25 August 2010 11:38:01PM *  2 points [-]

As you pointed out earlier, we have two ostensibly different ways of investigating the theory that the Chinese discovered America in 1421: the Popperian way, in which this theory and alternatives to it are criticized, and the Bayesian way, in which those criticisms are broken down into atomic criticisms, and likelihood ratios are attached and multiplied.

I've seen plenty of rigorous Popperian discussions but not very many very rigorous -- or even modestly rigorous -- Bayesian discussions, even on this website. One piece of evidence for the China-discovered-America theory is some business about old Chinese maps. How does a Bayesian go about estimating the likelihood ratio P(China discovered America | old maps) / P(China discovered America | no old maps)?

Comment author: Perplexed 25 August 2010 11:58:22PM 2 points [-]

I think you want to ask about P(maps|discover) / P(no maps|discover). Unless both wikipedia and my intuition are wrong.

Does catching you in this error relieve me of the responsibility of answering the question? I hope so. Because I would want to instead argue using something like P(maps|discover) vs P(maps|not discover). That doesn't take you all the way to P(discover), but it does at least give you a way to assess the evidential weight of the map evidence.
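A hedged Python sketch of that assessment (all likelihoods invented): the map evidence multiplies the odds on the discovery hypothesis by P(maps|discover) / P(maps|not discover), but by itself it does not fix P(discover).

```python
# Invented likelihoods, for illustration only.
p_maps_given_discover = 0.6      # old maps are fairly likely if the voyage happened
p_maps_given_not_discover = 0.1  # maps of this kind are less likely otherwise

# The evidential weight of the maps, as an odds-ratio multiplier (about 6 here):
bayes_factor = p_maps_given_discover / p_maps_given_not_discover

# The factor alone doesn't give P(discover); you still need prior odds.
prior_odds = 0.01 / 0.99  # a skeptical prior, P(discover) = 0.01
posterior_odds = prior_odds * bayes_factor
posterior_p = posterior_odds / (1 + posterior_odds)  # roughly 0.057
print(posterior_p)
```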

Comment author: [deleted] 26 August 2010 12:28:17AM 0 points [-]

Now P(Sewing-Machine is a phony) = ?

Here's another personal example of Bayesianism in action. Do you have a sense of how much you updated by? P(Richard Dawkins praises Steven Pinker | EP is bunk)/ P(Richard Dawkins praises Steven Pinker | EP is not bunk) is .5? .999? Any idea?

Comment author: Perplexed 26 August 2010 12:57:56AM 1 point [-]

P("Sewing Machine" is a nym) = 1.0
P(Sewing Machine has been disingenuous) = 0.5 and rising
P(Dawkins praises Pinker|EP is not bunk) is ill defined because
P(EP is not bunk) = ~0
but I have updated P(Dawkins believes EP is not bunk) to at least 0.5

Comment author: timtyler 25 August 2010 05:08:50PM 0 points [-]

More from Yudkowsky on the philosophy of science:

http://lesswrong.com/lw/ig/i_defy_the_data/

Comment author: timtyler 25 August 2010 05:00:27PM *  0 points [-]

The chance of a criticism being correct can unproblematically be assigned a probability.

Comment author: [deleted] 25 August 2010 05:04:48PM *  -2 points [-]

A criticism can have many components, some of which are correct and some of which are incorrect. Breaking a criticism down into its components can be difficult/problematic.

Edit: The way I put that sounds stupid. Let me try again: occasionally a pair of math papers is released, one purporting to prove a conjecture and the other purporting to disprove it. The authors then criticize each other's papers (let's say). Would you really characterize the task of assigning probabilities in this situation as "unproblematic"?

Comment author: timtyler 25 August 2010 05:10:43PM 0 points [-]

The point is that - if you were asked to bet on the criticism being correct - you would come up with some odds ratio.

Comment author: Perplexed 25 August 2010 05:40:46PM 2 points [-]

Maybe you would do that. I would instead bog down in a discussion of whether the criticism was a nitpick or a "real" criticism. But I would be interested to see what odds ratio you come up with for this criticism being correct.

Comment author: timtyler 25 August 2010 05:45:26PM 0 points [-]

Heh - is that your criticism? - or did you get it from Douglas Hofstadter? ;-)

Comment author: [deleted] 25 August 2010 05:24:40PM 0 points [-]

And in the math papers example, how exactly are you going to do that? Presumably you are going to go through the papers and the criticisms in detail and evaluate the content. And when you do that you are going to think of reasons why one is right and the other wrong. And then probabilities become irrelevant. It's your understanding of the content that will enable you to choose.

Comment author: timtyler 25 August 2010 05:29:47PM -2 points [-]

Right - but you don't "choose" - you assign probabilities. Rejecting something completely would be bad - because of:

http://lesswrong.com/lw/mp/0_and_1_are_not_probabilities/

Comment author: [deleted] 25 August 2010 05:32:21PM -1 points [-]

I don't think anyone is falling into this trap. It sounds like the Popperian version is replacing "true" and "false" by "tentatively true" and "tentatively false."

Comment author: timtyler 25 August 2010 05:34:39PM *  0 points [-]

"Tentatively true" and "tentatively false" sound a lot like probabilities which are not expressed in a format compatible with Bayes' rule.

It is hard to see how that adds anything - but rather easy to see how it subtracts the ability to quantitatively analyse problems.