Constant comments on On Debates with Trolls - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
I think your problem is that you don't understand what the issues at stake are, so you don't know what you're trying to find.
You said:
But then, when you found a well-known book by Popper which does have those words, and which does discuss Bayes' equation, you were not satisfied. You asked for something that wasn't actually what you wanted. That is not my fault.
You also said:
But you don't seem to understand that Popper's solution to the problem of induction is the same topic. You don't know what you're looking for. It wasn't a change of topic. (Hence I thought we should discuss this. But you refused. I'm not sure how you expect to make progress when you refuse to discuss the topic the other guy thinks is crucial to continuing.)
Bayesian updating, as a method of learning in general, is induction. It's trying to derive knowledge from data. Popper's criticisms of induction, in general, apply. And his solution solves the underlying problem, rendering Bayesian updating unnecessary even if it weren't wrong. (Of course, as usual, it's right when applied narrowly to certain mathematical problems. It's wrong when extended out of that context to be used for other purposes, e.g. to try to solve the problem of induction.)
So, question: what do you think you're looking for? There is tons of stuff about probability in various Popper books including chapter 8 of LScD titled "probability". There is tons of explanation about the problem of induction, and why support doesn't work, in various Popper books. Bayesian updating is a method of positively supporting theories; Popper criticized all such methods and his criticisms apply. In what way is that not what you wanted? What do you want?
So, for example, I opened that chapter to a random page and found this first sentence on p. 183, at the start of section 66:
This is a criticism of the Bayesian approach as unscientific. It's not specifically about the Bayesian approach, in that it applies to various non-Bayesian probabilistic approaches (whatever those may be; can you think of any other approaches, besides Bayesian epistemology, that you think this is targeted at? How would you do it without Bayes' theorem?). In any case it is a criticism, and it applies straightforwardly to Bayesian epistemology. It's not the only criticism.
The point of this criticism is that to even begin the Bayesian updating process you need probability estimates, which are created unscientifically by making them up. (No, making up a "prior" that assigns all of them at once, in a way so vague that you can't use it in real life without "estimating" arbitrarily, doesn't mean you haven't just made them up.)
EDIT: read the first 2 footnotes in section 81 of LScD, plus section 81 itself. And note that the indexer did not miss this but included it...
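To see the arithmetic behind this complaint, here is a small sketch (the numbers are made up purely for illustration): the same data, fed through Bayes' theorem, yields different conclusions depending on the prior you started by assuming.

```python
# Sketch: the posterior after a Bayesian update depends on the prior,
# which must be supplied before any updating can begin. Numbers invented.

def posterior(prior_h, likelihood_h, likelihood_not_h):
    """Bayes' theorem for a hypothesis H against its negation:
    P(H | D) = P(H) P(D | H) / P(D)."""
    evidence = prior_h * likelihood_h + (1 - prior_h) * likelihood_not_h
    return prior_h * likelihood_h / evidence

# Identical data (same likelihoods), two different made-up priors:
p1 = posterior(0.5, 0.8, 0.3)  # prior 0.5 for H
p2 = posterior(0.1, 0.8, 0.3)  # prior 0.1 for H
print(round(p1, 3), round(p2, 3))  # -> 0.727 0.229
```

The update itself is mechanical; the disagreement in the thread is over where the starting numbers come from.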
Only in a sense so broad that Popper can rightly be accused of the very same thing. Bayesians use experience to decide between competing hypotheses. That is the sort of "deriving" that Bayesians do. But if that is "deriving", then Popper "derives". David Deutsch, whom you know, says the following:
I direct you specifically to this sentence:
This is what Bayesians do. Experience is what Bayesians use to choose between theories which have already been guessed. They do this using Bayes' Theorem. But look back at the first sentence of the passage:
Clearly, then, Deutsch does not consider using the data to choose between theories to be "deriving". But Bayesians use the data to choose between theories. Therefore, as Deutsch himself defines it, Bayesians are not "deriving".
Yes, the Bayesians make them up, but notice that Bayesians therefore are not trying to derive them from data, which was your initial criticism above. Moreover, this is not importantly different from a Popperian scientist making up conjectures to test. The Popperian scientist comes up with some conjectures, and then, as Deutsch says, he uses experimental data to "choose between theories that have already been guessed". How exactly does he do that? Typical data does not decisively falsify a hypothesis. There is, just for starters, the possibility of experimental error. So how does one really employ data to choose between competing hypotheses? Bayesians have an answer: they choose on the basis of how well the data fits each hypothesis, which they interpret to mean how probable the data is given the hypothesis. Whether he admits it or not, the Popperian scientist can't help but do something fundamentally the same. He has no choice but to deal with probabilities, because probabilities are all he has.
The Popperian scientist, then, chooses between theories that he has guessed on the basis of the data. Since the data, being uncertain, does not decisively refute either theory but is merely more, or less, probable given the theory, then the Popperian scientist has no choice but to deal with probabilities. If the Popperian scientist chooses the theory that the data fits best, then he is in effect acting as a Bayesian who has assigned to his competing theories the same prior.
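The "same prior" claim can be checked with a toy calculation (theory names and likelihoods invented for illustration): when every theory gets an equal prior, ranking by posterior probability is exactly ranking by how well the data fits, i.e. by the likelihood P(data | theory).

```python
# Sketch: with equal priors, the theory the data "fits best" (highest
# likelihood) is also the theory with the highest posterior. All numbers
# are made up for illustration.

likelihoods = {"T1": 0.6, "T2": 0.25, "T3": 0.15}  # P(data | theory)
prior = 1 / len(likelihoods)                        # same prior for each

evidence = sum(prior * lk for lk in likelihoods.values())
posteriors = {t: prior * lk / evidence for t, lk in likelihoods.items()}

best_fit = max(likelihoods, key=likelihoods.get)
best_post = max(posteriors, key=posteriors.get)
assert best_fit == best_post  # choosing by fit == choosing by posterior
print(best_fit)  # -> T1
```

The equal priors cancel out of the comparison, so "pick the best-fitting theory" and "pick the most probable theory under a uniform prior" are the same rule.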
Where do you get the theories you consider?
Do you understand DD's point, made in both his books, that the majority of the time theories are rejected without testing? Testing is only useful when dealing with good explanations.
Do you understand that data alone cannot choose between the infinitely many theories consistent with it, which reach a wide variety of contradictory conclusions? So Bayesian updating based on data does not solve the problem of choosing between theories. What does?
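The point about data-consistent rivals can be sketched numerically (two invented theory names, invented priors): if two theories each predict the observed data with certainty, updating on that data leaves their relative standing exactly wherever the made-up priors put it.

```python
# Sketch: two rival theories, both fully consistent with the observed data
# (likelihood 1 under each). The update cannot separate them; the posterior
# ratio is just the prior ratio. Names and priors are invented.

priors = {"T1": 0.7, "T2": 0.3}
likelihood = {"T1": 1.0, "T2": 1.0}  # both predict the data perfectly

evidence = sum(priors[h] * likelihood[h] for h in priors)
posteriors = {h: priors[h] * likelihood[h] / evidence for h in priors}

# The data changed nothing: posteriors equal the priors.
assert all(abs(posteriors[h] - priors[h]) < 1e-9 for h in priors)
```

Whatever does the choosing here, it isn't the data; it's whatever assigned the priors.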
Bayesians are also seriously concerned with the fact that infinitely many theories are consistent with the evidence. DD evidently doesn't think so, given his comments on Occam's Razor, which he appears to know only in an old, crude version; but I think there is a lot in common between his "good explanation" criterion and parsimony considerations.