I'm pretty sure almost all frequentist methods are derivable from Bayes, or are close approximations of Bayes. Do frequentists have any tool which is radically un-Bayesian?
I think the interpretation of probability and what methods to use for inference are two separate debates. There was a really good discussion post on this a while back.
I completely agree with this. It seems to me that we should completely throw away the question of what probability is, and look at which form of inference is optimal.
Yeah, that's just a wiki entry. And the problems there are much more general than the sort of thing I am imagining. I'm thinking of things like interpretations of Dutch book arguments, solutions to grue, optimization problems, Newcomb problems, open decision theory problems, and the like.
Why can't a frequentist say, "Bayesians are conflating probability with subjective degree of belief"? They were here first, after all.
Probability does model frequency, and it does model subjective degree of belief, and this is not a contradiction. Using the copula is the problem, obviously: if subjective degree of belief is not frequency, and probability is frequency, then probability is not subjective degree of belief. Analogously, if subjective degree of belief is not frequency, and probability is subjective degree of belief, then probability is not frequency.
The problem is that both sides conflate "probability" with something it merely models: the Bayesian conflates probability with subjective degree of belief, and the frequentist conflates probability with frequency.
The frequentist/Bayesian dispute is of real import, because ad-hoc frequentist statistical methods often break down in extreme cases, throw away useful data, only work well with Gaussian sampling distributions etc.
The debate over whether to use Bayesian methods or frequentist methods is of import. I think potato was trying to say this here:
How we should actually model the situation as a probability distribution depends on our goal. But remember that Bayesianism is the stronger magic.
But the question of whether probability is frequency, or if probability is subjective degree of belief, is just as silly as a dispute over whether numbers are quantity, or if they are orders. The answer is that numbers model both, and are neither.
Narrowing my interests is probably not an option. The fact that I can practically work on anything and still be a philosopher is one of the things that appeals to me about the field, but maybe that has something to do with why it is so rarely done competently :/ My only other option is to work my butt off, but I know that to be a generalist and contribute takes lots of work. I do specialize in what I like to call algorithmic philosophy, and philosophy of mathematics, but that is only because I think they are of great import to my other fields of interest.
So what if anything is the standard lesswrong approach to Nelson Goodman's grue problem? If there is any paradox I could imagine someone posing against LW, I would imagine it would be the Grue problem.
(damn down voters edit): Not that I think it would pose any real threat. Just curious, I'm sure LW has a brilliant solution. And if not it can def be made by assembling the bits of other posts. I would really like to know why this got down voted.
"Minorfalsology" is totally the best word for it.
Being anti-philosophy is something philosophy needs. Not in a boring, the field is dead Rorty sense. In a, these are scientific questions with definite right and wrong answers, kind of way.
I don't think anyone is ever really anti-philosophy; perhaps my imagination is so daft that I can't imagine someone with different tastes. I think philosophy has really frustrated a lot of truth seekers because it was being done poorly. Even in analytic philosophy, only ever so rarely does a tool from analytic philosophy come about that could not be compared to using a stick to break apart and probe matter.
LessWrong needs to solve philosophical problems to do its job, whether that job is to build AI or to systematically cause rationality. It needs to solve scientific problems too, but LessWrong's practice seems to consist primarily in long-winded, immersive, and concentrated discussion, using previously established technical terminology and calculi, with the aim of settling the truth value of some claim. The method of argument is the method of philosophy. This, mixed with the philosophical nature of much of the content here on LW, is enough for me to think of LW as a philosophical movement, but a philosophical movement separated from the long Western tradition stretching back to Plato.
I like to think of LW as a philosophical movement, analogously to that famous internet meme about that statistician which goes something like this:
Derp was late to his probability class, and quickly jotted down the HW for that week's class. He worked on it for quite a while. When he got there next week, he told his professor that he found the HW harder than usual. Derp's professor informed him that what he had jotted down was not the HW, it was three unsolved conjectures. Derp then presented those proofs with the help of his professor as his dissertation.
LW solves some seemingly unsolvable philosophical dilemmas in a similar fashion; and if the average LW user is somehow helped in solving open and VERY DIFFICULT philosophical problems in the manner of insanely competent philosophers, by not thinking of him/herself as a philosopher, or by just treating philosophical problems as trivial HW, then who gives a damn? "Philosophy" is a pretty lame word anyway, "Lesswrongianism" however, that's a badass word. If you guys want us to be called "LWers" instead of "philosophers" I don't care, as long as we still solve the open philosophical problems of the previous and new century.
Wouldn't the rule be something more like:
((P(H|E) > P(H)) if and only if (P(H) > P(H|~E))) and ((P(H|E) = P(H)) if and only if (P(H) = P(H|~E)))
So, if some statement is evidence of a hypothesis, its negation must be evidence against. And if some statement's truth value is independent of a hypothesis, then so is that statement's negation.
This is implied by the expectation of posterior probabilities version, assuming 0 < P(E) < 1 so that both conditional probabilities are defined. Since P(E) + P(~E) = 1, the sum P(H|E)P(E)+P(H|~E)P(~E) is a weighted average of P(H|E) and P(H|~E), which means that the two are either equal, or one is greater than P(H) and one is less than P(H). If they were both less than P(H), then P(H|E)P(E)+P(H|~E)P(~E) could be no greater than the larger conditional probability in that formula; suppose P(H|E) is the larger one, then P(H|E)P(E)+P(H|~E)P(~E) ≤ P(H|E) and P(H|E) < P(H), so P(H|E)P(E)+P(H|~E)P(~E) ≠ P(H). If they were both larger than P(H), then P(H|E)P(E)+P(H|~E)P(~E) could be no smaller than the smaller conditional probability in that formula; suppose that P(H|E) is the smaller one, then we have P(H|E)P(E)+P(H|~E)P(~E) ≥ P(H|E), and P(H|E) > P(H), so P(H) ≠ P(H|E)P(E)+P(H|~E)P(~E). Likewise, if exactly one of them equaled P(H) while the other did not, the weighted average would be pulled strictly away from P(H), since both weights are positive. And if both posterior probabilities are equal, then P(H|E)P(E)+P(H|~E)P(~E) = P(H|E), and both posteriors must equal the prior. Q.e.d.
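A shorter algebraic route to the same conclusion can be sketched as follows. It assumes 0 < P(E) < 1, and it writes ~E as \lnot E; neither the assumption nor the notation comes from the original post, so treat this as one possible presentation rather than the canonical proof.

```latex
\begin{align*}
P(H) &= P(H \mid E)\,P(E) + P(H \mid \lnot E)\,P(\lnot E),
  \qquad P(\lnot E) = 1 - P(E) \\[4pt]
P(H \mid E) - P(H)
  &= P(H \mid E)\,\bigl(1 - P(E)\bigr) - P(H \mid \lnot E)\,P(\lnot E) \\
  &= P(\lnot E)\,\bigl[\,P(H \mid E) - P(H \mid \lnot E)\,\bigr] \\[4pt]
P(H) - P(H \mid \lnot E)
  &= P(H \mid E)\,P(E) - P(H \mid \lnot E)\,\bigl(1 - P(\lnot E)\bigr) \\
  &= P(E)\,\bigl[\,P(H \mid E) - P(H \mid \lnot E)\,\bigr]
\end{align*}
```

Because P(E) and P(~E) are both positive, the two left-hand sides are positive multiples of the same bracket. So P(H|E) - P(H) and P(H) - P(H|~E) always have the same sign, and one is zero exactly when the other is, which is the rule stated above.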
I think that the formula that expresses the prior as the average of the posterior probability weighted by the probabilities of observing that evidence and not observing that evidence, is a great way to express the point of this article. But it might not be trivial for everyone to get:
((P(H|E) > P(H)) if and only if (P(H) > P(H|~E))) and ((P(H|E) = P(H)) if and only if (P(H) = P(H|~E)))
from
P(H) = P(H|E)P(E) + P(H|~E)P(~E)
That something is evidence in favor if and only if its negation is evidence against, and that some result is independent of some hypothesis if and only if not observing that result is independent of that hypothesis, are the take-home messages of this post as far as I can tell. The law that "P(H) = P(H|E)P(E) + P(H|~E)P(~E)" says more than that: it also tells you how to get P(H|~E) from P(H|E), P(H) and P(E). But adding the Boolean statement and its proof from the weighted average statement to the post, or at least to a comment on this post, not even necessarily using the Boolean symbols or formalisms, might help a lot of students who come across this long after algebra class. I know it would have helped me.
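For readers who prefer numbers to symbols, here is a minimal Python sketch of solving the weighted-average law for P(H|~E) and checking the sign rule on one example. The function names and the example probabilities are made up for illustration, and the sketch assumes 0 < P(E) < 1.

```python
def posterior_given_not_e(p_h, p_h_given_e, p_e):
    """Solve P(H) = P(H|E)P(E) + P(H|~E)P(~E) for P(H|~E)."""
    if not 0.0 < p_e < 1.0:
        raise ValueError("need 0 < P(E) < 1 so that both E and ~E can occur")
    p_h_given_not_e = (p_h - p_h_given_e * p_e) / (1.0 - p_e)
    if not 0.0 <= p_h_given_not_e <= 1.0:
        raise ValueError("inputs are not jointly consistent probabilities")
    return p_h_given_not_e


def sign(x, tol=1e-12):
    """Return -1, 0, or 1, treating tiny floating-point noise as zero."""
    return (x > tol) - (x < -tol)


# Hypothetical numbers: prior 0.3, posterior 0.6 after seeing E, P(E) = 0.25.
p_h, p_h_given_e, p_e = 0.3, 0.6, 0.25
p_h_given_not_e = posterior_given_not_e(p_h, p_h_given_e, p_e)

# E raises the probability of H, so ~E must lower it.
e_shift = sign(p_h_given_e - p_h)
not_e_shift = sign(p_h - p_h_given_not_e)
print(p_h_given_not_e, e_shift, not_e_shift)
```

With these inputs P(H|~E) comes out at roughly 0.2, below the prior of 0.3, and the two shifts have the same sign. Passing a posterior equal to the prior returns the prior again, which is the independence half of the rule.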
I am confused: I thought we were to weight hypotheses by 2^-(Kolmogorov(H)), not 2^-length(H). Am I missing something?