ChristianKl comments on Open thread, Oct. 12 - Oct. 18, 2015 - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
I'm contemplating a discussion post on this topic, but first I'll float it here, since there's a high chance that I'm just being really stupid.
I'm abysmally unsuccessful at using anything like Bayesian reasoning in real life.
I don't think it's because I'm doing anything fundamentally wrong. Maybe what I'm doing wrong is attempting to think of these things in a Bayesian way in the first place.
Let's use a concrete example. I bought a house. My prior probability that any given household appliance or fixture will break and/or need maintenance in a given month is on the order of 5%, obviously with some variability depending on what appliance we're talking about. This prior is an off-the-cuff intuitive figure based on decades of living in houses.
Within a month of buying this house, things immediately start breaking. The dishwasher breaks. Then the garbage disposal. The sump pump fails completely. The humidifier needs repair. The air conditioner unit needs to be entirely replaced. The siding needs to be repainted. A section of fence needs to be replaced. The sprinklers don't work. This is all within roughly the first four months.
So, my prior was garbage, but the real issue for me is that Bayesian reasoning didn't really help me. The dishwasher breaking didn't cause me to shift my Background Probabilistic Breakage Rate much at all. One thing breaking within the first month is allowed for by my prior model. Then the second thing breaks - okay, maybe I need to adjust my BPBR a bit. Still, there's little reason to expect that several more important things will break in short order. But that's exactly what happened.
There is a causal story that explains everything (apparently) breaking at basically the same time, which is that the previous owners were not taking good care of the house, and various things were already subtly broken and limping along at passable functionality for a long time. The problem is that this causal story only becomes promoted to "hypothesis with significant probability mass" after two or three consecutive major appliance disasters.
What is annoying about all this is that my wife doesn't attempt to use any kind of probabilistic reasoning, and she is basically right all the time. I was saying things like, "I really doubt the garbage disposal is really broken, we just had two other major things replaced, what are the odds that another thing would break so quickly?" and she would reply along the lines of, "I'm pretty sure it's actually broken, and I can't fathom why you keep talking about odds when your odds-based assessments are always wrong," and I'm at the point of agreeing with her. Not to mention that she was the one who suggested the "prior owners didn't maintain the house" hypothesis, while I was still grimly clinging to my initial model, increasingly bewildered by each new disaster.
I am probably a poster child for "doing probabilistic thinking wrong" in some obvious way that I am blind to. Please help me figure out how and where. I have my own thoughts, but I will wait for others to respond so as to avoid anchoring.
You are indeed doing it very wrong. As far as probabilistic reasoning goes, the fact that one item broke doesn't reduce the chances that a second item breaks at all.
Yeah, okay, I worded that stupidly. It's more like this:
"This 20-sided die just came up 20 twice in a row. The probability of three consecutive rolls of 20 is 0.0125%. I acknowledge that this next roll has a 1/20 chance of coming up 20, assuming the die is fair. However, if this next roll comes up 20, we are witnessing an extremely improbable sequence, so improbable that I have to start considering that the die is loaded."
The equivalent of "considering that the die is loaded" in your example is "the previous owners did a bad job of maintaining the house". It indeed makes sense to come to that conclusion. That's also basically what your wife did.
Apart from that, the difference between sequences picked by humans to look random and real random data is that real random data more frequently contains such improbable sequences.
The "however" part seems irrelevant.
I mean, regardless of what were the previous two rolls -- let's call them "X" and "Y" -- if the next roll comes up 20, we are witnessing a sequence "X, Y, 20", which has a probability 0.0125%. That's true even when "X" and "Y" are different than 20.
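To make the arithmetic concrete, here is a small sketch, assuming a fair 20-sided die. The function name `sequence_probability` is just an illustration and does not come from the thread. It shows that any specific three-roll sequence has the same probability as three 20s.

```python
from fractions import Fraction

def sequence_probability(sequence, sides=20):
    """Probability of one exact sequence of rolls, assuming a fair die."""
    return Fraction(1, sides) ** len(sequence)

# Three 20s and an arbitrary "X, Y, 20" sequence are equally likely.
p_all_twenties = sequence_probability([20, 20, 20])
p_mixed = sequence_probability([7, 13, 20])

print(p_all_twenties)               # exact fraction
print(float(p_all_twenties) * 100)  # the same value as a percentage
```

Both come out to 1/8000, which is the 0.0125% quoted above.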
You could make the sequence even more improbable by saying "if this next roll comes up 20, we are witnessing an extremely improbable sequence -- we are living in a universe whose conditions allow creation of matter, we happen to be on a planet where life exists, dinosaurs were killed by a comet, I decided to roll the 20-sided die three times, the first two rolls were 20... and now the third roll is also 20? Well this all just seems very very unlikely."
Or you could decide that the past is fixed, if you happen to be in some branch of the universe you are already there, and you are only going to estimate the probability of future events.
Even better, what ChristianKl said. A better model would be that depending on the existing state of the house there is a probability P saying how frequently things will break. At the beginning there is some prior distribution of P, but when things start breaking too fast, you should update that P is probably greater than you originally thought... and now you should expect things to break faster than you expected originally.
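One possible way to sketch "a prior distribution of P that gets updated" is a Beta prior with a binomial likelihood. Everything below is an assumption for illustration except the 5% prior and the eight breakages mentioned in the thread: the Beta(1, 19) shape, the count of roughly 20 appliances and fixtures, and treating each item-month as an independent trial are all made up.

```python
def update_beta(alpha, beta, breaks, item_months):
    """Conjugate update of a Beta(alpha, beta) prior on the monthly break
    probability P after seeing `breaks` failures in `item_months` trials."""
    return alpha + breaks, beta + (item_months - breaks)

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Assumed prior: Beta(1, 19), whose mean is the 5% figure from the thread.
prior = (1, 19)

# Assumed data: about 20 items watched for 4 months = 80 item-months,
# with the 8 breakages listed in the original comment.
posterior = update_beta(*prior, breaks=8, item_months=80)

print(beta_mean(*prior), beta_mean(*posterior))
```

Under these assumed numbers the expected monthly breakage rate moves from 5% to 9%, so the model now expects things to keep breaking faster than the original prior said. A weaker prior (smaller alpha and beta) would move further on the same evidence.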
Yes, all sequences X,Y,Z are equally (im)probable if the d20 is a fair one. But some sequences -- in particular those with X=Y=Z, and in more-particular those with X=Y=Z=1 or X=Y=Z=20, are more likely if the die is unfair because they're relatively easy and/or relatively useful/amusing for a die-fixer to induce.
As you consider longer and longer sequences 20,20,20,... their probability conditional on a fair d20 goes down rapidly, whereas their probability conditional on a dishonest d20 goes down much less rapidly because there's some nonzero chance that someone's made a d20 that almost always rolls 20s.
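A rough sketch of that comparison, using two hypotheses only. The specific numbers are assumptions chosen for illustration, not anything from the thread: a 0.1% prior that the die is loaded, and a loaded die that rolls 20 with probability 0.9.

```python
def posterior_loaded(n_twenties, prior_loaded=0.001,
                     p20_loaded=0.9, p20_fair=1 / 20):
    """Posterior probability that the die is loaded after observing
    `n_twenties` consecutive 20s, by Bayes' rule over two hypotheses."""
    loaded = prior_loaded * p20_loaded ** n_twenties
    fair = (1 - prior_loaded) * p20_fair ** n_twenties
    return loaded / (loaded + fair)

# The fair-die likelihood shrinks by a factor of 20 per roll, the
# loaded-die likelihood by only 0.9, so the posterior climbs quickly.
for n in range(5):
    print(n, round(posterior_loaded(n), 4))
```

With these made-up numbers the "loaded" hypothesis starts at 0.1%, is still below 50% after two 20s, and is above 80% after the third. That matches the house story, where the "previous owners neglected it" hypothesis only gains serious probability mass after two or three failures.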