Decimal digit computations as a testing ground for reasoning about probabilities

4 Post author: multifoliaterose 15 July 2011 04:19AM

Points in this article emerged from a conversation with Anna Salamon

I think that thinking about decimal expansions of real numbers provides a good testing ground for one's intuition about probabilities. The context of computation is very different from most of the contexts that humans deal with; in particular, it's much cleaner. As such, this testing ground should not be used in isolation; the understanding that one reaps from it needs to be integrated with knowledge from other contexts. Despite its limitations, I think that it has something to add.

Given a computable real number x, a priori the probability that any string of n decimal digits comprises the first n decimal digits of x is 10^-n. For concreteness, we'll take x to be pi. It has long been conjectured that pi is a normal number. This is consistent with the notion that the digits of pi are "random" in some sense, and in this respect pi contrasts with (say) rational numbers and Liouville's constant.
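As a concrete aside, the leading digits are easy to compute and examine oneself. Here is a minimal Python sketch (the function names are my own) that computes pi's digits via Machin's arctangent formula in integer arithmetic and tallies how often each digit appears; if pi is normal, each digit's frequency tends to 1/10:

```python
from collections import Counter

def arctan_inv(x, scale):
    # arctan(1/x) * scale via the alternating Taylor series, in integer arithmetic
    total, term, n, sign = 0, scale // x, 1, 1
    x2 = x * x
    while term:
        total += sign * (term // n)
        term //= x2
        n += 2
        sign = -sign
    return total

def pi_digits(num):
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    guard = 10  # extra digits to absorb truncation error
    scale = 10 ** (num + guard)
    pi = 16 * arctan_inv(5, scale) - 4 * arctan_inv(239, scale)
    return str(pi // 10 ** guard)  # "3" followed by num decimal digits

digits = pi_digits(1000)
freq = Counter(digits[1:])  # tally the 1000 decimal digits
print(digits[:22])
print(freq)
```

This is far too slow for trillions of digits (serious computations use faster series, such as Chudnovsky's), but it suffices to check the digit statistics of the first few thousand digits by hand.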

According to the Northwestern University homepage, pi has been computed to five trillion digits. So, to the extent that one trusts the result of the computation, there exists an example of a statement which had an a priori probability of 10^-n, with n > 5·10^12, of being true and which we now know to be true with high confidence. How much should we trust the computation? Well, I don't know whether it's been verified independently, and there are a variety of relevant issues about which I know almost nothing (coding issues; hardware issues; the degree of rigor with which the algorithm used has been proven to be correct; etc.). One would have more confidence if one knew that several independent teams had succeeded in verifying the result using different algorithms & hardware. One would have still more confidence if one were personally involved in such a team and became convinced of the solidity of the methods used. Regardless:

(a) As early as 1962, mathematicians had computed pi to 10^5 digits. Presumably since then their computation has been checked many times over by a diversity of people and methods. Trusting a single source is still problematic, as there may have been a typo or whatever, but it seems uncontroversial to think that if one uses the nearest apparently reliable computational package (say, Wolfram Alpha), then the chance that the output is correct is > 10%. Thus we see how an initial probability estimate of 10^-100000 can rise to a probability over 10^-1 in practice.

(b) If one were determined, one could probably develop ~90% confidence in the accuracy of over a billion digits of pi. I say this because computational power and algorithms have permitted such a vast computation for over 20 years; presumably, by studying, testing and tweaking all programs written since then, one could do many checks on the accuracy of each of the first billion digits. Assuming that this is possible, an initial probability estimate of 10^-1000000000 can in practice grow to > 0.9.

This shows that probabilities which are apparently very small can rapidly shift to being quite large with the influx of new information. There's more that I could say about this, but I think that the chunk that I've written so far is enough to warrant posting, and that the rest of my thoughts are sufficiently ill-formed that I shouldn't try to say more right now. I welcome thoughts and comments.

Comments (31)

Comment author: Wei_Dai 15 July 2011 01:59:00PM 10 points [-]

This shows that probabilities which are apparently very small can rapidly shift to being quite large with the influx of new information.

Isn't there a simpler way of making this point, without going into the properties of pi and the history of its computation? For example, I could write a small program that will fill my screen with a random n-digit number at time X. Before time X, the probability of me seeing any particular number is 10^-n, and after time X it's 1 for one number and 0 for others.
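Such a program could be as simple as this sketch (n is arbitrary; leading zeros allowed):

```python
import random

n = 8
# Draw one of the 10**n equally likely n-digit strings:
number = "".join(random.choice("0123456789") for _ in range(n))
print(number)
```

Before running it, each particular output has probability 10^-n; afterward, one of them has probability 1.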

But I'm having trouble seeing the practical significance of this. It would be interesting to see Anna's side of this conversation, to get a better idea of what you guys were talking about...

Comment author: multifoliaterose 15 July 2011 09:37:27PM *  1 point [-]

Isn't there a simpler way of making this point, without going into the properties of pi and the history of its computation? For example, I could write a small program that will fill my screen with a random n-digit number at time X. Before time X, the probability of me seeing any particular number is 10^-n, and after time X it's 1 for one number and 0 for others.

  1. It may be that my exposition could be improved. I'm not sure whether or not the use of a particular real number is central to the line of thinking that I'm interested in exploring.

  2. If any two independent people compute the first thousand digits of pi, then the results should be identical, whereas if two people generate 1000 random digits, their results will not be identical. In this way, "the first 1000 digits of pi" is well-defined in a way that "1000 random digits" is not.

  3. Note that it's possible to think that a string of n digits is probably the first n digits of pi and then learn that there was an error in the coding/algorithm used, which causes a dramatic drop in the probability that the string of digits is correct, whereas it's not possible to learn that a string of random digits is "wrong." One may learn that the algorithm used was not a good random number generator, but that's not quite the same thing...

  4. Some readers may find the properties of pi and the history of its computation to be of independent interest.

But I'm having trouble seeing the practical significance of this. It would be interesting to see Anna's side of this conversation, to get a better idea of what you guys were talking about...

It was just a casual conversation; a tangent to an unrelated subject. But my interest in the topic is that I think that reasoning about small probabilities may be important to x-risk reduction, and I want to get a feel for the types of errors & the types of valid heuristics attached to the domain, with a view toward thinking about x-risk reduction.

Comment author: Alexei 15 July 2011 05:26:14AM 10 points [-]

Once I found out that you can compute any digit of pi without computing any other digit (albeit in hexadecimal), my life just wasn't the same anymore.
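The result being referred to is the Bailey–Borwein–Plouffe formula. A minimal Python sketch of the digit-extraction idea (reliable only for modest positions, since it uses floating point for the small terms):

```python
def frac_series(j, n):
    # Fractional part of sum over k of 16**(n-k) / (8k + j),
    # using modular exponentiation for the integer-part terms.
    s = 0.0
    for k in range(n + 1):
        s = (s + pow(16, n - k, 8 * k + j) / (8 * k + j)) % 1.0
    for k in range(n + 1, n + 20):  # a few tail terms with 16**(n-k) < 1
        s += 16.0 ** (n - k) / (8 * k + j)
    return s % 1.0

def pi_hex_digit(d):
    # The d-th hexadecimal digit of pi after the point, counting from d = 1,
    # computed without computing any of the preceding digits.
    n = d - 1
    x = (4 * frac_series(1, n) - 2 * frac_series(4, n)
         - frac_series(5, n) - frac_series(6, n)) % 1.0
    return "%x" % int(x * 16)

print("".join(pi_hex_digit(d) for d in range(1, 9)))  # pi = 3.243f6a88... in hex
```

The trick is that multiplying by 16^n shifts the hex expansion, and the fractional part of each shifted term can be computed cheaply with modular exponentiation, so the earlier digits never need to be materialized.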

Comment author: NancyLebovitz 15 July 2011 09:21:29AM *  4 points [-]

It surprised me tremendously, but I don't think it's affected me otherwise. If you were being literal, what changed for you?

It isn't limited to hexadecimal.

Comment author: Nisan 15 July 2011 10:18:03PM 0 points [-]

I like the way the author formatted their summation formulas in ascii.

Comment author: Alexei 15 July 2011 06:38:30PM 0 points [-]

I think it showed me that the Kolmogorov complexity of pi's digits is really low (this was before I even understood what Kolmogorov complexity was). Looking at it now, however, it's not at all surprising, given that pi's definition is already of very low complexity.

Comment author: Armok_GoB 15 July 2011 11:50:25AM 0 points [-]

!!!

Unless I misunderstand something, that could be used as an RNG with properties I've been looking for for ages! Sadly, I don't understand half the math involved. Example code would be nice...

Comment author: malthrin 15 July 2011 02:18:32PM 2 points [-]

Is there something wrong with the Mersenne Twister?

Comment author: Armok_GoB 15 July 2011 03:41:44PM *  0 points [-]

Yes, lots of things, although I don't know a quick or non-technical way of summing up what.

Looking at my response to Emile below will probably help.

Comment author: saturn 15 July 2011 08:32:05PM 1 point [-]

Is there something wrong with AES in counter mode?

Comment author: Armok_GoB 15 July 2011 08:35:46PM 0 points [-]

What is that?

Comment author: saturn 15 July 2011 08:43:03PM 2 points [-]

Encrypting successive integers with AES gives you a pretty fast and very high quality PRNG.
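The structure is: a deterministic stream obtained by encrypting a counter under a fixed key. The Python standard library has no AES, so here is a sketch of the same counter-mode shape with SHA-256 standing in for the block cipher (not identical in speed or security properties, but the idea is the same):

```python
import hashlib

def counter_prng(key: bytes):
    # Yield pseudorandom bytes by hashing key || counter, counter-mode style.
    counter = 0
    while True:
        block = hashlib.sha256(key + counter.to_bytes(16, "big")).digest()
        yield from block
        counter += 1

stream = counter_prng(b"some key")
print(bytes(next(stream) for _ in range(16)).hex())
```

The same key always reproduces the same stream, which is what makes this shape usable as a lookup-table-style RNG: you can jump to any counter value directly.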

Comment author: Armok_GoB 15 July 2011 10:46:01PM 0 points [-]

Apparently I'm getting too tired to cognite any compsci right now; it might work, though.

Comment author: Emile 15 July 2011 02:41:57PM 1 point [-]

I'm curious, what properties are you looking for?

(I'm pretty interested in roguelikes and procedural content generation in games, and have played around with various algorithms for infinite consistent worlds etc. - if your name's anything to go by, I expect you might be interested in that stuff too)

Comment author: Armok_GoB 15 July 2011 03:47:19PM *  1 point [-]

Something that can act like a lookup table of random binary data of infinite size and dimensionality, which is as fast as possible but has very low demands on the quality of the randomness.

This should, if I got things right, be both faster and more random than XORing a load of bitstrings repeating with different prime-number lengths.

Comment author: Khoth 15 July 2011 04:13:43PM 1 point [-]

I don't think it's fast. The any-base one that NancyLebovitz linked to claims to be O((n log n)^3). The original hex one, I don't know exactly but from the description on wikipedia it's definitely worse than O(n) (I'd guess significantly worse).

Comment author: Armok_GoB 15 July 2011 04:17:32PM 1 point [-]

Oh, yeah, then it's probably useless.

Comment author: Emile 15 July 2011 04:07:30PM *  1 point [-]

That sounds like a hash function, or are you looking for something else? (something faster?)

What applications do you have in mind? Procedural generation in games, or something else?

Comment author: Armok_GoB 15 July 2011 04:42:30PM *  0 points [-]

reads up on hash functions

Oooh, interesting. While not exactly right, some could probably be re-purposed as an RNG, yes... Especially cryptographic hash functions seem interesting, but they are probably far too slow. Something that looks like a cryptographic hash function to someone who really sucks at cryptography is probably what I'm looking for. Also, there are probably some things to gain from collisions being a non-issue and from the needed output being extremely short.

I don't actually know exactly what I need this for; it's just something I for some reason tend to stumble upon repeatedly when I think about compsci things and have thus concluded would be generally useful. Procedural generation, either for games, art, or math experiments, probably constitutes most of it, though.

Comment author: Emile 15 July 2011 04:57:25PM *  4 points [-]

In Python, you can do things like

>>> import random
>>> def random_lookup(key):
...     random.seed(("Whatever", key))
...     return random.randint(0, 999)
...
>>> random_lookup(11)
885
>>> random_lookup((42, 115, 802))
481
>>> random_lookup((31, 42, 717))
805
>>> random_lookup("strings are good keys too")
564
>>> random_lookup(11)
885
>>> random_lookup("strings are good keys too")
564

... internally, it's just using the hash of your input as a random seed, but then you can get random floats, etc.

I used something like that to get boundless fields of random temperatures, elevation and humidity, resulting in a tile-based world with mountains, oceans, swamps, forests etc. that you could scroll as far as you wanted to in any direction, and always get the same thing for a given position.

Comment author: Armok_GoB 15 July 2011 07:00:47PM 0 points [-]

Nice trick, thanks, but isn't it awfully slow?

Comment author: Pavitra 15 July 2011 07:20:09PM 2 points [-]

How fast do you need it to be? My laptop can do SHA1 about 2^20 times with no noticeable delay.

Comment author: Armok_GoB 15 July 2011 07:33:32PM *  0 points [-]

Ok, that's pretty fast for Python. Should be fast enough for anything with speed requirements that make Python a viable language in the first place. Unless I did the math wrong somewhere.

Thanks.

Comment author: Emile 15 July 2011 08:26:04PM *  1 point [-]

Not awfully - I haven't counted, but for a screenful of terrain tiles there can't be more than a thousand calls. That could be a bit slow if you're recalculating everything every frame, but it's fast enough as soon as you cache your results, so that you only need to calculate new, undiscovered terrain.

If performance is really a problem, it's always possible to rewrite the slow bits in C, using simpler algorithms than the Python hash function.

Comment author: Eugine_Nier 15 July 2011 05:40:12AM 2 points [-]

One would have still more confidence if one were personally involved in such a team.

I don't think this statement is correct, since it contradicts the law of conservation of expected evidence: if you anticipate that knowing some piece of information would increase your probability estimate of something, then simply knowing that that piece of information exists should correspondingly increase your estimate, unless you also suspect that knowing the information might actually decrease your estimate by a corresponding amount.
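The identity in question is just the law of total probability applied to posteriors; here is a quick numerical check with made-up numbers:

```python
from fractions import Fraction as F

p_h = F(3, 10)              # prior P(H) (illustrative)
p_e_given_h = F(9, 10)      # likelihood P(E | H)
p_e_given_not_h = F(2, 10)  # likelihood P(E | ~H)

p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e                   # Bayes' theorem
p_h_given_not_e = (1 - p_e_given_h) * p_h / (1 - p_e)

# Conservation of expected evidence: the expected posterior equals the prior.
assert p_e * p_h_given_e + (1 - p_e) * p_h_given_not_e == p_h
```

So if every possible outcome of an observation would raise your estimate, your current estimate was too low to begin with.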

Comment author: multifoliaterose 15 July 2011 08:20:06AM 2 points [-]

This is true. I meant that, assuming the computations are sound, being personally involved would increase one's confidence relative to what it would otherwise be. I'll change the post accordingly.

Comment author: MixedNuts 15 July 2011 07:01:35AM 2 points [-]

If one had the inside info given by being part of the team, and this info decreased one's confidence, one wouldn't be part of the team.