Hmm, thanks. Seems similar to my description above, though as far as I can tell it doesn't deal with my criticisms. It is rather evasive when it comes to the question of what status models have in Bayesian calculations.
I am curious; what is the general LessWrong philosophy about what truth "is"? Personally I so far lean towards accepting an operational subjective Bayesian definition, i.e. the truth of a statement is defined only insofar as we agree on some (in principle) operational procedure for determining its truth; that is, we have to agree on what observations make it true or false.
For example "it will rain in Melbourne tomorrow" is true if we see it raining in Melbourne tomorrow (trivial, but also means that the truth of the statement doesn't depe...
Lol that is a nice story in that link, but it isn't a Dutch book. The bet in it isn't set up to measure subjective probability either, so I don't really see what the lesson in it is for logical probability.
Say that instead of the digits of pi, we were betting on the contents of some boxes. For concreteness let there be three boxes, one of which contains a prize. Say also that you have looked inside the boxes and know exactly where the prize is. For me, I have some subjective probability P( X_i | I_mine ) that the prize is inside box i. For you, all your s...
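To be concrete about what I mean by a bet "set up to measure subjective probability" (a minimal sketch with made-up numbers, not the numbers from my scenario):

```python
# Illustrative only: my fair prices for $1 tickets on "the prize is in box i"
# are just my subjective probabilities P(X_i | I_mine). As long as the prices
# I quote are coherent (non-negative and summing to 1 over the exhaustive set
# of boxes), no combination of tickets bought or sold against me loses for
# sure, no matter that you already know exactly where the prize is.

my_probs = {1: 0.5, 2: 0.3, 3: 0.2}   # made-up subjective probabilities

def fair_price(prob, payout=1.0):
    """Highest price I should pay for a ticket paying `payout` if the event occurs."""
    return prob * payout

prices = {box: fair_price(p) for box, p in my_probs.items()}
print(prices)                   # {1: 0.5, 2: 0.3, 3: 0.2}
print(sum(prices.values()))     # 1.0 -> coherent, so no Dutch book against me
```

Your extra knowledge just means you expect to profit on average at those prices; it doesn't make them incoherent, which is why the story in the link isn't a Dutch book in the technical sense.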
That sounds to me more like an argument for needing lower p-values, not higher ones. If there are many confounding factors, you need a higher threshold of evidence for claiming that you are seeing a real effect.
Physicists need low p-values for a different reason, namely that they do very large numbers of statistical tests. If you choose p=0.05 as your threshold then it means that you are going to be claiming a false detection at least one time in twenty (roughly speaking), so if physicists did this they would be claiming false detections every other day and their credibility would plummet like a rock.
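To put rough numbers on that (assuming independent tests, purely for illustration):

```python
# Chance of at least one false detection when running many independent
# statistical tests, each at the same p-value threshold.
alpha = 0.05
for n_tests in (1, 20, 100, 1000):
    p_any_false = 1 - (1 - alpha) ** n_tests
    print(f"{n_tests:5d} tests -> P(>=1 false detection) = {p_any_false:.3f}")
# 1 -> 0.050, 20 -> 0.642, 100 -> 0.994, 1000 -> ~1.000
```

This is roughly why the 5-sigma convention (p of order 3e-7) exists: with enough tests, a 0.05 threshold guarantees a steady stream of false "detections".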
Is there any more straightforward way to see the problem? I argued with you about this for a while and I think you convinced me, but it is still a little foggy. If there is a consistency problem, surely this means that we must be vulnerable to Dutch books, doesn't it? I.e. they would not seem to be Dutch books to us, with our limited resources, but a superior intelligence would know that they were and would use them to con us out of utility. Do you know of some argument like this?
Very well, then I will wait for the next entry. But I thought the fact that we were explicitly discussing things the robot could not compute made it clear that resources were limited. There is clearly no such thing as logical uncertainty to the magic logic god of the idealised case.
No we aren't, we're discussing a robot with finite resources. I obviously agree that an omnipotent god of logic can skip these problems.
It was your example, not mine. But you made the contradictory postulate that P("wet outside"|"rain")=1 follows from the robot's prior knowledge and the probability axioms, and simultaneously that the robot was unable to compute this. To correct this I alter the robot's probabilities such that P("wet outside"|"rain")=0.5 until such time as it has obtained a proof that "rain" correlates 100% with "wet outside". Of course the axioms don't determine this; it is part of the robot's prior, which is not det...
You haven't been very specific about what you think I'm doing incorrectly so it is kind of hard to figure out what you are objecting to. I corrected your example to what I think it should be so that it satisfies the product rule; where's the problem? How do you propose that the robot can possibly set P("wet outside"|"rain")=1 when it can't do the calculation?
Ok sure, so you can go through my reasoning leaving out the implication symbol, but retaining the dependence on the proof "p", and it all works out the same. The point is only that the robot doesn't know that A->B, therefore it doesn't set P(B|A)=1 either.
You had "Suppose our robot knows that P(wet outside | raining) = 1. And it observes that it's raining, so P(rain)=1. But it's having trouble figuring out whether it's wet outside within its time limit, so it just gives up and says P(wet outside)=0.5. Has it violated the product rule? Yes...
Hmm this does not feel the same as what I am suggesting.
Let me map my scenario onto yours:
A = "raining"
B = "wet outside"
A->B = "It will be wet outside if it is raining"
The robot does not know P("wet outside" | "raining") = 1. It only knows P("wet outside" | "raining", "raining->wet outside") = 1. It observes that it is raining, so we'll condition everything on "raining", taking it as true.
We need some priors. Let P("wet outside") = 0.5. We also need a ...
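Since my comment above is cut off, here is the computation I have in mind, with an illustrative value filled in for the prior on the implication (the 0.5 is an assumption for the sketch, not something the axioms dictate):

```latex
\begin{align*}
P(B \mid A) &= P(B \mid A,\, A\!\rightarrow\!B)\,P(A\!\rightarrow\!B \mid A)
             + P(B \mid A,\, \lnot(A\!\rightarrow\!B))\,P(\lnot(A\!\rightarrow\!B) \mid A) \\
            &= 1 \times 0.5 \;+\; 0 \times 0.5 \;=\; 0.5 .
\end{align*}
```

(Here I am reading the implication as the material conditional, so P(B | A, not(A->B)) = 0.) Once the robot actually finds a proof of the implication, P(A->B | A) goes to 1 and the product rule then pushes P(B | A) to 1 as well; until then nothing is violated.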
But it turns out that there is one true probability distribution over mathematical statements, given the axioms. The right distribution is obtained by straightforward application of the product rule - never mind that it takes 4^^^3 steps - and if you deviate from the right distribution that means you violate the product rule at some point.
This does not seem right to me. I feel like you are sneakily trying to condition all of the robot's probabilities on mathematical proofs that it does not have a-priori. E.g. consider A, A->B, therefore B. To learn th...
Perhaps, though, you could argue it differently. I have been trying to understand so-called "operational" subjective statistical methods recently (as advocated by Frank Lad and his friends), and he is insisting on only calling a thing a [meaningful, I guess] "quantity" when there is some well-defined operational procedure for measuring what it is. For him, "measuring" does not rely on a model; he is referring to reading numbers off some device or other, I think. I don't quite understand him yet, since it seems to me that the numbers reported by devices all rely on some model or other to define them, but maybe one can argue their way out of this...
Thanks, this seems interesting. It is pretty radical; he is very insistent on the idea that for all 'quantities' about which we want to reason there must be some operational procedure we can follow in order to find out what it is. I don't know what this means for the ontological status of physical principles, models, etc, but I can at least see the naive appeal... it makes it hard to understand why a model could ever have the power to predict new things we have never seen before though, like Higgs bosons...
An example of a "true number" is mass. We can measure the mass of a person or a car, and we use these values in engineering all the time. An example of a "fake number" is utility. I've never seen a concrete utility value used anywhere, though I always hear about nice mathematical laws that it must obey.
It is interesting that you choose mass as your prototypical "true" number. You say we can "measure" the mass of a person or car. This is true in the sense that we have a complex physical model of reality, and in one...
Sure, I don't want to suggest we only use the word 'probability' for epistemic probabilities (although the world might be a better place if we did...), only that if we use the word to mean different sorts of probabilities in the same sentence, or even whole body of text, without explicit clarification, then it is just asking for confusion.
Hmm, do you know of any good material to learn more about this? I am actually extremely sympathetic to any attempt to rid model parameters of physical meaning; I mean in an abstract sense I am happy to have degrees of belief about them, but in a prior-elucidation sense I find it extremely difficult to argue about what it is sensible to believe a-priori about parameters, particularly given parameterisation dependence problems.
I am a particle physicist, and a particular problem I have is that parameters in particle physics are not constant; they vary with re...
Hmm, interesting. I will go and learn more deeply what de Finetti was getting at. It is a little confusing... in this simple case ok fine p can be defined in a straightforward way in terms of the predictive distribution, but in more complicated cases this quickly becomes extremely difficult or impossible. For one thing, a single model with a single set of parameters may describe outcomes of vastly different experiments. E.g. consider Newtonian gravity. Ok fine strictly the Newtonian gravity part of the model has to be coupled to various other models to des...
Are you referring to De Finetti's theorem? I can't say I understand your point. Does it relate to the edit I made shortly before your post? i.e. Given a stochastic model with some parameters, you then have degrees of belief about certain outcomes, some of which may seem almost the same thing as the parameters themselves? I still maintain that the two are quite different: parameters characterise probability distributions, and just in certain cases happen to coincide with conditional degrees of belief. In this 'beliefs about beliefs' context, though, it is the parameters we have degrees of belief about, we do not have degrees of belief about the conditional degrees of belief to which said parameters may happen to coincide.
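For reference, the representation theorem I assume you mean, for an exchangeable sequence of binary outcomes (standard statement, notation mine):

```latex
P(x_1, \dots, x_n) \;=\; \int_0^1 \theta^{\,k}\,(1-\theta)^{\,n-k}\, d\mu(\theta),
\qquad k = \textstyle\sum_i x_i .
```

The parameter theta enters only as an integration variable with mixing measure mu; having a degree of belief (mu) about theta is not the same thing as theta itself being a degree of belief, which is the distinction I am trying to maintain.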
"Jonah was looking at probability distributions over estimates of an unknown probability (such as the probability of a coin coming up heads)"
It sounds like you are just confusing epistemic probabilities with propensities, or frequencies. I.e., due to physics, the shape of the coin, and your style of flipping, a particular set of coin flips will have certain frequency properties that you can characterise by a bias parameter p, which you call "the probability of landing on heads". This is just a parameter of a stochastic model, not a degre...
"I view these sorts of distributions over distributions as that- there's some continuous parameter potentially in the world (the proportion of white and black balls in the urn), and that continuous parameter may determine my subjective probability about binary events (whether ball #1001 is white or black)."
To me this just sounds like standard conditional probability. E.g. let p(x|I) be your subjective probability distribution over the parameter x (fraction of white balls in urn), given prior information I. Then
p("ball 1001 is white"|I)...
Lol ok, so long as I get my answer eventually :p.
Was the "Putting in the Numbers" post the one you were referring to? You didn't post that on Saturday, but now it is Monday and there doesn't seem be a third post. Anyway I did not see this question answered anywhere in "Putting in the Numbers"...
Yeah I think -integral( p*log(p) ) is it. The simplest problem is that if I have some parameter x to which I want to assign a prior (perhaps not over the whole real set, so it can be proper as you say -- the boundaries can be part of the maxent condition set), then via the maxent method I will get a different prior depending on whether I happen to assign the distribution over x, or x^2, or log(x) etc. That is, the prior pdf obtained for one parameterisation is not related to the one obtained for a different parameterisation by the correct transformation rul...
Referring to this:
"Simply knowing the fact that the entropy is concave down tells us that to maximize entropy we should split it up as evenly as possible - each side has a 1/4 chance of showing."
Ok, that's fine for discrete events, but what about continuous ones? That is, how do I choose a prior for real-valued parameters that I want to know about? As far as I am aware, MAXENT doesn't help me at all here, particularly as soon as I have several parameters, and no preferred parameterisation of the problem. I know Jaynes goes on about how continuous ...
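To make the parameterisation problem concrete, a toy example of my own (illustrative only): maxent with nothing but a support constraint on x in [0,1] gives the uniform prior p_x(x) = 1, and the same procedure applied directly to y = x^2 on [0,1] also gives a uniform prior; but the correct transformation of p_x under y = x^2 is

```latex
p_y(y) \;=\; p_x\!\left(\sqrt{y}\,\right)\left|\frac{dx}{dy}\right|
       \;=\; \frac{1}{2\sqrt{y}}, \qquad y \in (0,1],
```

which is not uniform. So "apply maxent" and "change variables" do not commute unless you also specify an invariant measure m(x), and choosing m is exactly the part that is left unspecified when there is no preferred parameterisation.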
It would have been kind of impossible to work on AI in 1850, before even modern set theory was developed. Unless by work on AI, you mean work on mathematical logic in general.
Ok, but do you really mean that sentence how it is written? To me it means the same thing as saying that assigning probability to anything is logically equivalent to assigning probability to 0=1 (which I am perfectly happy to do, so if that is the point then fine, but that doesn't seem to be your implication).
"But to assign some probability to the wrong answer is logically equivalent to assigning probability to 0=1."
Only if you know it is the wrong answer. You say the robot doesn't know, so what's the problem? We assign probabilities to propositions which are wrong all the time, before we know if they are wrong or not.
The statistics also remains important at the frontier of high energy physics. Trying to do reasoning about what models are likely to replace the Standard Model is plagued by every issue in the philosophy of statistics that you can imagine. And the arguments about this affect where billions of dollars worth of research funding end up (build bigger colliders? more dark matter detectors? satellites?)
I can't disagree with that :p. I will concede that the survey question needs some refinement.
Hmm, I couldn't agree with that latter definition. Physics is just the "map" after all, and we are always improving it. Mathematics (or some future "completed" mathematics) seems to me the space of things that are possible. I am not certain, but this might be along the lines of what Wittgenstein means when he says things like
"In logic nothing is accidental: if a thing can occur in an atomic fact the possibility of that atomic fact must already be prejudged in the thing.
If things can occur in atomic facts, this possibility must alre...
But don't you think there is an important distinction between events that defy logical description of any kind, and those that merely require an outlandish multi-layered reality to explain? I admit I can't think of anything that could occur in our world that cannot be explained by the simulation hypothesis, but assuming that some world DOES exist outside the layers of nested simulation I can (loosely speaking) imagine that some things really are logically impossible there. And that if the inhabitants of that world observe such impossible events, well, they...
I'm no theologian, but it seems to me that this view of the supernatural does not conform to the usual picture of God philosophers put forward, in terms of being the "prime mover" and so on. They are usually trying to solve the "first cause" problem, among other things, which doesn't really mesh with God as the super-scientist, since one is still left wondering about where the world external to the simulation comes from.
I agree that my definition of the supernatural is not very useful in practice, but I think it is necessary if one is t...
To me, the simulation hypothesis definitely does not imply a supernatural creator. 'Supernatural' implies 'unconstrained by natural laws', at least to me, and I see no reason to expect that the simulation creators are free from such constraints. Sure, it means that supernatural-seeming events can in principle occur inside the simulation, and the creators need not be constrained by the laws of the simulation since they are outside of it, but I fully expect that some laws or other would govern their behaviour.
You don't think people here have a term for their survey-completing comrades in their cost function? Since I probably won't win either way, this term dominated my own cost function, so I cooperated. An isolated defection can help only me, whereas an isolated cooperation helps everyone else and so gets a large numerical boost for that reason.
Lol, I cooperated because $60 was not a large enough sum of money for me to really care about trying to win it, and in the calibration I assumed most people would feel similarly. Reading your reasoning here, however, it is possible I should have accounted more strongly for people who like to win just for the sake of winning, a group that may be larger here than in the general population :p.
Edit: actually that's not really what I mean. I mean people who want to make a rational choice to maximise the probability of winning for its own sake, even if they don't...
It defined "God" as supernatural didn't it? In what sense is someone running a simulation supernatural? Unless you think for some reason that the real external world is not constrained by natural laws?
In this case, Feynman is worth listening to slowly. There is something about the way he explains this that the transcript does not do justice to.
When you prove something in mathematics, at the very least you implicitly assume you have made no mistakes anywhere, are not hallucinating, etc. Your "real" subjective degree of belief in some mathematical proposition, on the other hand, must take all these things into account.
For practical purposes the probability of hallucinations etc. may be very small and so you can usually ignore them. But the OP is right to demonstrate that in some cases this is a bad approximation to make.
Deductive logic is just the special limiting case of probability theory...
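The limiting cases I have in mind are the usual strong syllogisms, e.g.:

```latex
P(B \mid A,\; A\!\rightarrow\!B) = 1 \quad \text{(modus ponens)}, \qquad
P(\lnot A \mid \lnot B,\; A\!\rightarrow\!B) = 1 \quad \text{(modus tollens)};
```

deduction is what you get when the relevant conditional probabilities are exactly 0 or 1, and anything in between is plausible reasoning with the same rules.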
It is not very useful to discriminate between "seeing with your eyes" and "seeing with the aid of scientific instruments". Vast amounts of information processing occurs between light landing on your retina and an image forming in your brain, so if you are happy to call looking through glasses, or a microscope, or a telescope, "seeing with your eyes" then I see no reason to make a distinction when the information-carrying particle switches from photons to electrons. Especially since we mostly use digital microscopes etc. these days.
Bayes theorem only works with as much information as you put into it. Humans can only ever be approximate Bayesian agents. If you learn about some proposition you never thought of before, it is not a failing of Bayesian reasoning; it is just that you learn you have been doing it wrong up until that point and have to recompute everything.
I'd just like to point out that even #1 of the OP's "lessons" is far more problematic than they make it seem. Consider the statement:
"The fact that there are myths about Zeus is evidence that Zeus exists. Zeus's existing would make it more likely for myths about him to arise, so the arising of myths about him must make it more likely that he exists." (supposedly an argument of the form P(E | H) > P(E)).
So first, "Zeus's existing would make it more likely for myths about him to arise" - more likely than what? Than "a pr...
If you are introduced to 5 blue-haired Xians but no black-haired Xians, you might infer that all or most Xians have blue hair. That is a pretty obvious case of sampling bias.
If a-priori you had no reason to expect that the population was dominantly blue-haired then you should begin to suspect some alternative hypothesis, like your sampling is biased for some reason, rather than believe everyone is blue haired.
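Toy numbers (all made up, just to illustrate the trade-off between the prior and the likelihood):

```python
# Seeing 5/5 blue-haired introductions favours "most Xians are blue-haired"
# only if that hypothesis was not too improbable a-priori relative to
# "typical Xians, but my introductions were a biased sample".

hypotheses = {
    # name: (prior, P(an introduced Xian is blue-haired | hypothesis))
    "most Xians are blue-haired":           (0.05, 0.9),
    "typical Xians, biased introductions":  (0.95, 0.8),
}
n_blue = 5  # five blue-haired introductions, zero black-haired ones

unnormalised = {name: prior * p_blue ** n_blue
                for name, (prior, p_blue) in hypotheses.items()}
total = sum(unnormalised.values())
for name, weight in unnormalised.items():
    print(f"{name}: posterior = {weight / total:.2f}")
# -> roughly 0.09 vs 0.91: the biased-sampling hypothesis still dominates,
#    because five draws barely discriminate while the prior gap is large.
```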
Of course acting on beliefs is a decision theory matter. You don't have terribly much to lose by buying a losing lottery ticket, but you have a very large amount to gain if it wins, so yes 1/132 chance of winning sounds well worth $20 or so.
Keynes in his "Treatise on Probability" talks a lot about analogies in the sense you use it here, particularly in "Part 3: Induction and Analogy". You might find it interesting.