MileyCyrus comments on A (very) tentative refutation of Pascal's mugging - Less Wrong

0 Post author: Arran_Stirton 30 March 2012 06:43AM

Comments (34)

Comment author: MileyCyrus 30 March 2012 07:19:17AM *  0 points [-]

"So for every extra unit of disutility predicted the probability penalty due to not knowing enough about the current state of the universe becomes greater."

Sure, but the probability shrinks slower than the disutility rises. A scenario in which 1000 times 3^^3 people are tortured has more probability than the probability that 3^^3 people are tortured, divided by 1000. Or more formally:

[P(Mugger tortures 1000*3^^3 people)] > [P(Mugger tortures 3^^3 people)]/1000

Read about Solomonoff Induction to find out why this is true.

Comment author: Dmytry 31 March 2012 06:57:47AM *  1 point [-]

How about this: the probabilities of torturing each exact number of beings have to sum to 1 or less?
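
One way to spell out this objection, as a sketch only: assume (this is an assumption, not something stated in the thread) that the inequality above is meant to hold for every multiplier k ≥ 2 and not just for 1000, and write N for 3^^3. The events "exactly kN people are tortured" are mutually exclusive, so their probabilities can sum to at most 1. If P(exactly N tortured) is nonzero, the assumed inequality would instead force the sum to diverge like the harmonic series:

```latex
\sum_{k=2}^{\infty} P(\text{exactly } kN \text{ tortured})
  \;>\; P(\text{exactly } N \text{ tortured}) \sum_{k=2}^{\infty} \frac{1}{k}
  \;=\; \infty ,
\qquad \text{but} \qquad
\sum_{k=2}^{\infty} P(\text{exactly } kN \text{ tortured}) \;\le\; 1 .
```

Under that reading, the inequality cannot hold for all k at once; it can only hold for some multipliers.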

Comment author: Manfred 30 March 2012 12:32:50PM *  1 point [-]

A word of caution - Solomonoff induction applies to things like the laws of physics, not to all hypotheses. Otherwise, if you flipped a coin 100 times, you would expect to see 100 heads much more often than average, and we don't.

Comment author: MileyCyrus 30 March 2012 03:40:11PM 2 points [-]

Otherwise, if you flipped a coin 100 times, you would expect to see 100 heads much more often than average, and we don't.

If you flip a coin 15 times, this result:

HHHHHHHHHHHHHHH

is far more probable than this:

HTHTTHTHTTTHHTH

That's because some coins are rigged, and it's much easier to rig a coin to conform to the first pattern than the second.

Comment author: Antisuji 30 March 2012 04:09:19PM 2 points [-]

This is true, but doesn't explain why we're more surprised when we see the former than the latter.

Comment author: [deleted] 30 March 2012 04:21:07PM 1 point [-]

we're more surprised when we see the former than the latter

I don't think this is actually true. If MileyCyrus successfully predicted the exact sequence of coinflips HTHTTHTHTTTHHTH, wouldn't you be more surprised than if it were HHHHHHHHHHHHHHH?

Comment author: Antisuji 30 March 2012 06:44:55PM 1 point [-]

Of course. When I said "we're more surprised" I was referring to the typical person who hasn't read this discussion thread. In the absence of the above prediction, I would be far more surprised to see HHHHHHHHHHHHHHH than HTHTTHTHTTTHHTH. Once the prediction is made, I become extremely surprised if either sequence appears, but somewhat more surprised by HTHTTHTHTTTHHTH.

Comment author: [deleted] 30 March 2012 06:58:47PM 1 point [-]

Oh, I see. In the case of the typical person, the answer is even easier: Lack of understanding of the conjunction rule of probability. HTHTTHTHTTTHHTH feels more representative of a random series of coin flips, so it is intuitively judged as more probable than HHHHHHHHHHHHHHH.

Comment author: Manfred 30 March 2012 04:24:39PM *  0 points [-]

I suppose that isn't all that unintuitive (though does this actually work if you start with a uniform prior over weights and do the math?). But does your intuitive model also predict the fact that HTHTHTHTHT is more probable than HTHHTHTHTT? :D
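
A minimal sketch of "doing the math" for the uniform-prior case, assuming independent flips with an unknown heads weight w drawn uniformly from [0, 1]. Under that assumption the probability of a specific sequence with k heads in n flips is the integral of w^k (1-w)^(n-k), which equals 1 / ((n+1) * C(n, k)). The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def seq_prob_uniform_bias(seq: str) -> Fraction:
    """Probability of one exact H/T sequence when the coin's heads weight
    is unknown and uniformly distributed on [0, 1] (an assumed prior)."""
    n = len(seq)
    k = seq.count("H")
    # Integral of w^k (1-w)^(n-k) over [0, 1] = k!(n-k)!/(n+1)!
    return Fraction(1, (n + 1) * comb(n, k))


if __name__ == "__main__":
    for s in ("HHHHHHHHHHHHHHH", "HTHTTHTHTTTHHTH", "HTHTHTHTHT", "HTHHTHTHTT"):
        print(s, seq_prob_uniform_bias(s))
```

Under this model the all-heads sequence does come out far more probable than the mixed one, but the model depends only on the head count, so it assigns HTHTHTHTHT and HTHHTHTHTT the same probability. It therefore does not by itself capture the second intuition raised here.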

Comment author: Dmytry 31 March 2012 07:01:21AM *  1 point [-]

Well, it is the case that all the random sequences together have much larger probability than HHHHHHHHHHHH, and so we should expect the sequence to be one among the random sequences.

edit: interesting issue: suppose you assign some prior probability to each possible sequence. Upon seeing the actual sequence, with a probability of 0.0001 that your eyes deceived you, how are you to update the probability of this particular sequence? Why would we assume sensory failure (or a biased coin) when we observe a hundred heads, but not when we observe something random-looking? It should have to do with sensory failure being much less likely to produce something random-looking.

Comment author: philh 30 March 2012 06:56:36PM 0 points [-]

First reaction: I don't know about "far" more probable. What's the prior that a coin is rigged? I would have said less than 1/32768, but low confidence on that.

According to this, you can't rig a coin to do that, which increases my confidence.

But you can rig your tossing, even by mistake: if it lands heads, and you balance it to flip with heads up again, then it's slightly more likely to land heads. I remember hearing a figure of 51% for that, in which case H*15 has probability 1/24331 instead of 1/32768, about a third more probable. But that scenario (fifteen times) is itself unlikely. If we estimate P(next is heads | last was heads) = 0.505 (corresponding to keeping the same side up 3/4 of the time, which I still feel is an overestimate), we get 1/28204, 16% more likely.

If we switched to dice, I would agree that 666666666666666 is far more probable than 136112642345553.
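
The coin arithmetic above can be rechecked with a short sketch. It assumes every one of the 15 flips, including the first, lands heads with the same probability, which appears to be how the quoted figures were derived; the function names are hypothetical. The results land close to the figures quoted above, with small differences in the last digits.

```python
def all_heads_odds(p_heads: float, flips: int = 15) -> float:
    """Return N such that P(all heads) = 1/N, assuming each flip
    independently lands heads with probability p_heads."""
    return 1.0 / p_heads ** flips


def repeat_probability(same_side_up: float, bias: float = 0.51) -> float:
    """P(next is heads | last was heads) when the tosser starts with the
    previous face up a fraction `same_side_up` of the time, and the
    starting face wins with probability `bias` (the remembered 51%)."""
    return same_side_up * bias + (1.0 - same_side_up) * (1.0 - bias)


if __name__ == "__main__":
    fair = all_heads_odds(0.5)
    always_same = all_heads_odds(0.51)
    mostly_same = all_heads_odds(repeat_probability(0.75))
    print(f"fair coin:          1/{fair:.0f}")
    print(f"p = 0.51:           1/{always_same:.0f} ({fair / always_same:.2f}x)")
    print(f"same side up 3/4:   1/{mostly_same:.0f} ({fair / mostly_same:.2f}x)")
```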

Comment author: Arran_Stirton 31 March 2012 02:16:12PM 0 points [-]

I'm treating the current state of the universe as a different thing entirely to the mugger's implied hypothesis about how the universe works. Both a program simulating Maxwell’s equations would obviously win out over a program simulating Thor, but in terms of predicting the shape of a magnetic field in a certain spot, that depends on the current state of the universe (at least the parts of the universe relevant to the equation).

Though if this is an invalid line of reasoning for some reason, please let me know, thanks.

Comment author: MileyCyrus 31 March 2012 05:53:58PM 0 points [-]

I have no idea where you're going with this.

Both a program simulating Maxwell’s equations would obviously win out over a program simulating Thor,

You use the word "both" but then refer to only one object. Did you forget to include something?

Comment author: Arran_Stirton 04 April 2012 12:35:47AM 0 points [-]

Sorry, I'll try to clarify:

If you want to predict the exact state of a system five minutes into the future, you need to know the current state of the system and the laws of that system. Call the current state s and the future state s'; the laws of the system are simulated by the Turing machine L. Instead of knowing the state of the system, we only know its laws (or rather we take them as a given).

Then any prediction we make about the future state of the system will restrict the range of values for s' that will validate our prediction. The more specific we are about s', the smaller the range of values it can take. In turn this restricts the range of possible values for s (as L(s) = s') that will give s'.

Because we have no information about the current state of the system, all possible states are equally likely, and as such the probability that the system will end up in a particular range of s' is the same as the fraction of s (out of all possible s) that will map there.

This is not in relation to any hypothesis about the laws of the system, but instead the current state of the system. I hope this makes my original argument make more sense. If not I'm sorry; please highlight to me where my explanation is going wrong.
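
The counting argument above can be sketched on a toy system. Everything concrete here is an assumption for illustration: a finite state space of ten states, a made-up deterministic law L(s) = s*s mod 10, and a uniform distribution over the unknown current state s.

```python
from fractions import Fraction
from typing import Callable, Iterable, Set


def prob_future_in(
    states: Iterable[int],
    law: Callable[[int], int],
    target: Set[int],
) -> Fraction:
    """P(s' in target) when s is uniform over `states` and s' = law(s):
    the fraction of current states that map into the target set."""
    all_states = list(states)
    hits = sum(1 for s in all_states if law(s) in target)
    return Fraction(hits, len(all_states))


def toy_law(s: int) -> int:
    # Hypothetical stand-in for the Turing machine L in the comment.
    return (s * s) % 10


if __name__ == "__main__":
    states = range(10)
    # A narrower prediction about s' is validated by fewer current states.
    print(prob_future_in(states, toy_law, {6}))
    print(prob_future_in(states, toy_law, {4, 6}))
    print(prob_future_in(states, toy_law, {0, 1, 4, 5, 6, 9}))
```

The more specific the predicted set for s', the smaller the fraction of current states that lead to it, which is the probability penalty the argument relies on.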