Can you recognize a random generator?

2 Post author: uzalud 28 December 2011 01:59PM

I can't seem to get my head around a simple issue of judging probability. Perhaps someone here can point to an obvious flaw in my thinking.

Let's say we have a binary generator: a machine that outputs a sequence of ones and zeros according to some internally encapsulated rule (deterministic or probabilistic). All binary generators look alike, and you can only infer (a probability of) a rule by looking at the output.

You have two binary generators: A and B. One of them is a true random generator (a fair coin tosser). The other is a biased random generator: stateless (each digit is generated independently of those before it), with a probability of outputting zero, p(0), somewhere between zero and one but NOT 0.5 - say, uniformly distributed over [0, 0.5) U (0.5, 1]. At this point, the chance that A is the true random generator is 50%.

Now you read the first ten digits generated by each machine. Machine A outputs 0000000000; machine B outputs 0010111101. Knowing this, is the probability that machine A is the true random generator now less than 50%?

My intuition says yes.

But the probability that a true random generator will output 0000000000 should be the same as the probability that it will output 0010111101, because all sequences of equal length are equally likely. The biased random generator is also just as likely to output 0000000000 as it is 0010111101.

So there seems to be no reason to think that a machine outputting a sequence of zeros of any size is any more likely to be a biased stateless random generator than it is to be a true random generator.

I know that you can never know that the generator is truly random. But surely you can statistically discern between random and non-random generators?

Comments (55)

Comment author: benelliott 28 December 2011 02:29:40PM 19 points [-]

The biased random generator is also just as likely to output 0000000000 as it is 0010111101.

This is the mistake.

If you actually do the maths the biased generator is significantly more likely to output 0000000000 than 0010111101.

For a much simpler example, suppose we run each machine twice. The fair random generator outputs 00 25% of the time, 01 25% of the time, 10 25% of the time and 11 25% of the time.

For the biased generator, we need calculus. First suppose its p(0) = x. Then p(00 | p(0) = x) = x^2. Since we have what is essentially a uniform distribution over [0,1] (the presence or absence of a single point makes no difference), we need to integrate f(x) = x^2 over the interval [0,1], which gives an answer of p(00) = 1/3. The same method gives p(11) = 1/3 and p(01) = p(10) = 1/6.

The general rule is that if we run it n times, then for any k between 0 and n the chance of it outputting k 1s is 1/(n+1), and that probability is shared out evenly among all the possible ways of outputting k 1s (also derivable from calculus). Thus p(0000000000) = 1/11 ≈ 9.1%, while p(0010111101) = 1/2310 ≈ 0.043%.
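The closed form above can be checked with a few lines of Python (my own sketch, not from the thread); `p_biased` implements the 1/((n+1)·C(n,k)) rule and `p_fair` the fair-coin probability:

```python
from math import comb

def p_biased(seq):
    """Probability that the biased generator emits exactly this sequence,
    with p(0) uniform on [0, 1]: the Beta integral works out to
    1 / ((n + 1) * C(n, k)), where n = length and k = number of 1s."""
    n, k = len(seq), seq.count("1")
    return 1.0 / ((n + 1) * comb(n, k))

def p_fair(seq):
    """Probability that the fair generator emits exactly this sequence."""
    return 0.5 ** len(seq)

print(p_biased("0000000000"))  # 1/11   ~ 0.0909
print(p_biased("0010111101"))  # 1/2310 ~ 0.000433
print(p_fair("0000000000"))    # 1/1024 ~ 0.000977
```

Note that `p_biased` depends on the sequence only through its count of 1s, which is why all-same-digit sequences get such a large share of the probability.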

Comment author: uzalud 28 December 2011 04:14:21PM 1 point [-]

Thanks, this is a great answer. It didn't occur to me that a stateless generator with unknown p(0) would have such a "preference" for all-digits-the-same sequences: p(ten zeros) = 1/11 if p(0) can be any number, but p(ten zeros) = 1/1024 if p(0) = 1/2.
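A quick Monte Carlo check of the 1/11 figure (my own sketch, assuming the uniform-p(0) model from the post):

```python
import random

def biased_run(n):
    """One run of the biased machine: draw p(0) uniformly, then emit n digits."""
    p0 = random.random()
    return "".join("0" if random.random() < p0 else "1" for _ in range(n))

random.seed(0)  # for reproducibility
trials = 200_000
hits = sum(biased_run(10) == "0000000000" for _ in range(trials))
print(hits / trials)  # should come out near 1/11 ~ 0.0909
```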

Comment author: timtyler 29 December 2011 02:06:50PM 4 points [-]

But surely you can statistically discern between random and non-random generators?

See DIEHARD.
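DIEHARD is a whole battery of tests; for flavor, here is a minimal sketch (mine, not part of DIEHARD itself) of the simplest such test, a NIST-style frequency ("monobit") test under the fair-coin null hypothesis:

```python
from math import erfc, sqrt

def monobit_pvalue(bits):
    """Frequency (monobit) test: under the fair-coin hypothesis, the sum
    of +/-1 digits divided by sqrt(n) is approximately standard normal."""
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)
    return erfc(abs(s) / sqrt(2 * n))  # two-sided p-value

print(monobit_pvalue("0000000000"))  # ~ 0.0016: strong evidence against fairness
print(monobit_pvalue("0010111101"))  # ~ 0.53: consistent with a fair coin
```

A low p-value flags bias, but no finite battery of tests can certify that a generator is "truly" random.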

Comment author: Cthulhoo 28 December 2011 02:36:45PM *  2 points [-]

I see a possible confusion in your post, in the sense that you have apparently fused two hypotheses into one: (1) A is random; (2) A is fair.

The two are not equivalent: A can be a random generator with p(0)!=50% or it can have p(0)=50% but output a very predictable string, e.g. 01010101010101010101...

Going back to your example: if your hypothesis is A = (random and fair), then seeing the string 0000000000 can lower your belief in the statement (not via the "random" part, but via the "fair" part). If your hypothesis were A = (random and p(0) = 0.00000000000000001%), then the same string could raise your belief instead.

Comment author: uzalud 28 December 2011 06:14:24PM 0 points [-]

Yes, I misspoke. The question is to discern between fair and biased random generators, not between random and non-random ones. As benelliott pointed out, stateless generators with an unknown bias assign quite unequal probabilities to different output sequences.

Comment author: snarles 28 December 2011 02:14:32PM *  1 point [-]

From a Bayesian viewpoint, no generator is inherently random. It is our inability to predict the outcome which makes the output "random."

Comment author: wedrifid 28 December 2011 08:51:58PM 2 points [-]

From a Bayesian viewpoint, no generator is inherently random. It is our inability to predict the outcome which makes the output "random."

Bayesianism is agnostic on the subject - it's physics that has an opinion. A rather strong opinion.

Comment author: DanielLC 28 December 2011 09:32:29PM 1 point [-]

Physics only tells you if you can predict the later output from the earlier output. It can't tell you what you know about the later output. If I flip a coin, it's random what it will land on before I flip it, but it has a probability near one of having landed on whatever I saw after I flipped it.

Comment author: Thomas 28 December 2011 02:25:15PM 2 points [-]

I've asked before. What about radioactive atoms decay? Are they truly random or not?

Do you think there is a mechanism behind those apparently random pops of radium atoms in a jar?

Comment author: prase 28 December 2011 04:05:34PM 1 point [-]

What does it mean "truly random"? If it means "indeterministic", in the sense of "there is no way to predict the outcome with certainty from any set of previously gathered data", then chances are that the radioactive decay is "truly random". However, this doesn't make the sentence

It is our inability to predict the outcome which makes the output "random."

false.

Comment author: Thomas 28 December 2011 04:19:33PM *  1 point [-]

What does it mean "truly random"?

That there is no finite algorithm behind it which would decide when each atom will explode.

In other words, that there are no hidden variables behind those events which would cause the decay.

The view that such hidden variables exist is quite unorthodox in quantum mechanics.

As far as I can see, the official view in QM is inherently "nonbayesian" in this sense. No hidden mechanism which would output the decay time of an uranium atom for example.

Comment author: DanielLC 28 December 2011 09:39:59PM *  2 points [-]

All of them will decay and all of them will not decay in different worlds.

Let's assume for the sake of argument that the Copenhagen interpretation is true. Before the particles decay, there is no way to tell which decay. It's random. After they decay, you can tell by looking. You know which ones decayed. It's not random at all.

Randomness is a state of the mind. All indeterminism tells you is that randomness must be a state of every mind that exists before the event.

Imagine a universe where it's always possible to tell the future from the past, but it's not always possible to tell what's on the right from what's on the left. This is deterministic, but if you rotate it 90 degrees, it isn't. A coordinate transformation can't be changing whether or not something is random, can it?

Comment author: shminux 28 December 2011 09:55:45PM 1 point [-]

This is deterministic, but if you rotate it 90 degrees, it isn't.

Time is not quite like space (or, in the PDE language, initial value problems are quite different from boundary value problems). There are QFT techniques that treat time as "imaginary space", but their applicability is quite limited and they certainly do not justify the view that "Randomness is a state of the mind", which is either untestable or manifestly false.

Comment author: prase 29 December 2011 12:44:14AM *  0 points [-]

initial value problems are quite different from boundary value problems

Could you be more specific, in what sense different?

In any case, final (or terminal?) value problems are the same as the initial value problems, PDE-wise.

Comment author: shminux 29 December 2011 01:22:35AM 0 points [-]

Could you be more specific, in what sense different?

Start with the obvious link, look for hyperbolic and elliptic PDEs and ways to solve them, especially numerically. The wave equation techniques are different from the Laplace equation techniques, though there is some overlap. Anyway, this is getting quite far from the original discussion.

In any case, final (or terminal?) value problems are the same as the initial value problems, PDE-wise.

Not quite. For example, the heat/diffusion equation cannot be run backwards, because it fails the uniqueness conditions with the sign of t reversed. In simpler language, you cannot reconstruct the shape of an ink drop once it has dissolved in a cup of water.

Comment author: prase 29 December 2011 01:57:38AM *  0 points [-]

I was mainly asking what differences are, in your opinion, important in the context of the present debate.

the heat/diffusion equation cannot be run backwards, because it fails the uniqueness conditions with the t sign reversed

1) You can run the diffusion equation backwards; you only encounter problems with precision when the solution grows exponentially.

2) Fundamental laws of nature are [second order in time and - edit:that's not true ] symmetric with respect to time reversal.

Comment author: shminux 29 December 2011 04:09:30AM 1 point [-]

I was mainly asking what differences are, in your opinion, important in the context of the present debate.

Well, we are quite a ways from the original context, but I was commenting that treating time and space on the same footing, and saying that future and past are basically interchangeable (sorry for paraphrasing), is often a bad assumption.

You can run the diffusion equation backwards, only you encounter problems with precision when the solution exponentially grows.

In other words, it is ill-posed and cannot be used to recover the initial conditions.

Fundamental laws of nature are second order in time and symmetric with respect to time reversal.

This is a whole other debate on what is fundamental and what is emergent. Clearly, the heat equation is pretty fundamental in many contexts, but its origins can be traced to microscopic models of diffusion. There are other reasons why the apparent time-reversal symmetry might not be there. For example, if you take the MWI seriously, the branching process has a clear time arrow attached to it.

Comment author: pragmatist 29 December 2011 02:45:43AM 0 points [-]

A small nitpick: The Schrodinger equation is not second order in time.

Comment author: DanielLC 29 December 2011 12:42:17AM 0 points [-]

Let me start over.

You could define "randomness" to mean indeterminism. If you do so, you would call the results of a fair, indeterministic coin toss random. Even so, if I tossed such a quantum coin, and you saw it land on heads, you would not be willing to bet even money that it landed on tails. P(coin landed on heads|your current information) ≈ 1. When you're finding expected value, this is all that matters.

Comment author: shminux 28 December 2011 09:48:55PM 1 point [-]

As far as I can see, the official view in QM is inherently "nonbayesian" in this sense. No hidden mechanism which would output the decay time of an uranium atom for example.

Indeed there is no hidden "time until I decay" number hidden inside each radioactive atom (based on some pseudo-random generator, or what have you), but how is it related to Bayes? And what do you mean by "official"?

Comment author: Thomas 28 December 2011 10:09:36PM *  0 points [-]

what do you mean by "official"?

I mean the prevailing view among (quantum) physicists that:

"Indeed there is no hidden "time until I decay" number hidden inside each radioactive atom "

You said it.

but how is it related to Bayes?

It is, insofar as one thinks one must update on every piece of evidence. You can't update anything based on the decay of that particular radioactive atom. It could have been another one, but it just wasn't, and what is there to update? Nothing, if that was a "truly random" event.

Either it wasn't truly random, or you have nothing to update based on this evidence.

Comment author: shminux 28 December 2011 10:15:56PM 1 point [-]

prevailing view among (quantum) physicists

This "view" has been experimentally tested in a simpler case of two-state systems as Bell's inequality, though I do not remember, off-hand, any tests related to radioactive decay.

It is, while one thinks, that he must update on every evidence. You can't update anything on a decay of the particular radioactive atom.

You can update your estimate of the element's half-life, if nothing else.

Comment author: Thomas 28 December 2011 10:33:24PM *  0 points [-]

You can update your estimate of the element's half-life, if nothing else.

You can update the half-life from the TIME of the decay. But nothing from the fact that it was atom number 2 and not number 1 or any other.

This "view" has been experimentally tested

I know. That's why I keep bringing up this "true random" case.

Comment author: Normal_Anomaly 29 December 2011 02:05:22AM 1 point [-]

If I understand correctly, there is no physical difference between atom 2 and atom 1. There just is no fact of the matter to update on.

Comment author: Thomas 29 December 2011 02:57:01PM 1 point [-]

Say you have two diamonds, each marked with a million uranium-238 atoms.

You can measure WHICH diamond the first decay occurs in. That is evidence you then can't use for any update.

Comment author: prase 29 December 2011 01:16:11AM 0 points [-]

That there is no finite algorithm behind, which would decide when which atom will explode.

We can test the algorithms we use to predict which atom would explode and when. The variables are part of the theory, not of the atoms. The absence of hidden variables effectively means that there is no regularity from which we could infer a law that would predict the state of an arbitrary system at time t1 with certainty* from observations made at time t0 < t1. Nevertheless, any selected atom* either explodes or doesn't at a given time, and we can observe afterwards which was the case. Bayesianism doesn't prohibit updating our beliefs about events after those events happened; in fact, it doesn't say anything at all about time. The "inherent randomness" of radioactive decay doesn't make the uncertainty non-Bayesian in any meaningful way.

That said, I am afraid we may start to argue over the silly problem of future contingents and over definitions in general. The right question to ask now is: why do you want to distinguish truly random numbers from apparently random ones? The answer to the question about the quality of quantum randomness may depend on that purpose.

*) Although I know that certainty is impossible to achieve and atoms are indistinguishable, I have chosen to formulate the sentences the way I did for sake of brevity.

Comment author: Luke_A_Somers 28 December 2011 06:34:46PM 0 points [-]

There are many forms of Bayesianism, and I've only seen a few that are married to the notion that ALL uncertainty is due to ignorance and none due to nondeterminism.

QM, incidentally, even in MWI, is nondeterministic in the sense that you don't know which of the outcomes you personally will experience.

Comment author: wedrifid 28 December 2011 07:11:15PM 1 point [-]

QM, incidentally, even in MWI, is nondeterministic in the sense that you don't know which of the outcomes you personally will experience.

Yes I do. All of them. What I cannot predict is what my observation will be when it is determined by a quantum event that has already occurred but with which I have not yet had any interaction. That's no more a failure of determinism than a 'for loop' in a computer - self-reflective code before the loop can predict exactly what is going to happen in the future, but code within the loop has to look up the counter variable (or a side effect) if it is going to behave conditionally.

Comment author: shminux 28 December 2011 08:33:38PM 2 points [-]

Yes I do. All of them.

That's not a testable prediction, or a useful one.

Comment author: pragmatist 28 December 2011 11:24:37PM -2 points [-]

That's not a testable prediction, or a useful one.

It is in fact a testable prediction.

Comment author: shminux 28 December 2011 11:50:04PM 0 points [-]

I cannot find anything in that entry that suggests that experiencing all possible outcomes can be experimentally tested. Feel free to elaborate.

Comment author: pragmatist 29 December 2011 01:53:21AM 1 point [-]

Sorry, I should have elaborated, but I was short on time when I wrote the comment.

Let's say you set up a sequence of quantum experiments, each of which has a 90% chance (according to the Born probabilities) of killing you instantly and a 10% chance of leaving you unharmed. After a number of such experiments you find yourself alive. This is something you would expect if some form of MWI were true and if all surviving future selves had conscious experience continuous with yours. It is not something you would expect if a collapse interpretation were true, or if MWI combined with some sort of indeterminism (governed by Born's rule, presumably) about which future self continues your conscious experience were true. So such a sequence of experiments should lead you to update in favor of MWI + experience all possible outcomes.

Comment author: dlthomas 29 December 2011 12:17:16AM -2 points [-]

QM, incidentally, even in MWI, is nondeterministic in the sense that you don't know which of the outcomes you personally will experience.

This is broken because

which of the outcomes you personally will experience

is incoherent in the context of MWI. There is a "you" now, on this side of the event. There will be many people labeled "you", on the other side. There is no one person on the other side that corresponds to "you personally" while the event is something you can say "will" about - at that point, it's "did".

Comment author: Luke_A_Somers 03 January 2012 05:14:00PM *  0 points [-]

Congratulations! You have constructed an interpretation of what I said that doesn't make sense.

Why don't you go back and try doing it the other way?

Comment author: dlthomas 03 January 2012 05:41:24PM 0 points [-]

Which other way?

Comment author: [deleted] 28 December 2011 02:58:33PM 1 point [-]

Assume for now that the universe is the expression of a computable function. Then there exists a universal Turing machine with a binary alphabet that computes this function. By compressing its input we can get a single number identifying our universe. If we knew this number and had enough computing power, we could then compute the exact time point of the radioactive decay of any single atom. But the number itself would be completely random, meaning it would not itself be determined by anything else. It's just the universe that happened to have us as part of its execution. Would you count that as "truly random" or not?

You could of course argue that the universe might not in fact be a computable function. But that claim would go far beyond just random atomic decay, since we could in principle encode the time points of all radioactive decays ever into the initial number from which to create our universal Turing machine.

Comment author: Maelin 29 December 2011 01:07:03AM 0 points [-]

Here is my understanding (which I hope others will correct if it is mistaken).

Suppose I have a duplication machine, with an in slot and two out slots. You drop something in the in slot and two identical versions of it are produced in the out slots. They are down-to-the-atom identical at the moment of production - there is not an 'original' and a 'copy', there are 'two originals'.

Now I position the machine so that one out slot drops into a blue box and the other into a yellow box. I drop you into the machine. As you tumble down the in slot, what is your subjective expectation about what colour box you will end up in? It might be ~50% blue and ~50% yellow (with a small probability that the machine explodes or the boxes change colour or whatever).

In fact, 'you' experience both. But no single instance of you remembers landing in both a blue box and a yellow box. Instead, one instance of you remembers landing in a blue box, and the other remembers landing in a yellow box. Both of you remember a single continuous narrative thread, from entering the in slot, to landing in a coloured box.

There is no sense in which the you from before the experiment landed in one box but not the other and could have predicted which one with better information. Each instance of you might initially think, "Well, I remember falling into the in slot, and then I ended up here. So the other guy must be the copy". But this is false. There was one of you, and now there are two.

Similarly with quantum events. The universe bifurcates (trifurcates, whatever - this is obviously a simplification) and different instances of us experience each different possibility. Before a quantum experiment, it is nonsensical to try to make definitive predictions about "which" outcome we will experience. The feeling that this could/should be possible is an intuition based in the fact that we only see our own narrative thread running back into the past, we don't see all the other narrative threads of the different versions of us. Different instances of us will experience each outcome of the experiment.

After the experiment, we can ask "which instance are we?" But before the experiment, it is meaningless to ask "which instance will we be?"

Comment author: wedrifid 28 December 2011 08:50:36PM *  0 points [-]

But the probability that a true random generator will output 0000000000 should be the same as the probability that it will output 0010111101, because all sequences of equal length are equally likely.

With a fair random generator:

p(0000000000) = 1/2^10
p(1000000000) = 1/2^10
p(0100000000) = 1/2^10
p(0010000000) = 1/2^10

The numbers produced are independent of each other and, for our purposes, we don't care about the order. The relevant thing is how likely it is to produce a given total number of zeroes or ones.

p(just one 1) = 10/2^10; A whole heap more likely!

So the chance that the generator is fair is rather slim. You can calculate just how slim by simply applying bayes rule (and doing some integration).
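The Bayes-rule calculation can be sketched in a few lines (my own code, not from the thread; the integration is done in closed form via the 1/((n+1)·C(n,k)) likelihood derived earlier in the thread):

```python
from math import comb

def posterior_fair(seq, prior_fair=0.5):
    """P(machine is fair | observed sequence), with the alternative's
    p(0) uniform on [0, 1]; the integration is done in closed form."""
    n, k = len(seq), seq.count("1")
    like_fair = 0.5 ** n
    like_biased = 1.0 / ((n + 1) * comb(n, k))  # Beta integral
    num = prior_fair * like_fair
    return num / (num + (1.0 - prior_fair) * like_biased)

print(posterior_fair("0000000000"))  # ~ 0.011: almost certainly the biased one
print(posterior_fair("0010111101"))  # ~ 0.69: mild evidence of fairness
```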

On a related note, if you roll two six-sided dice you are just as likely to get two sixes as you are to get a three and a five. But if you are playing Settlers of Catan and put all your settlements next to the twelve instead of the eight, then you are probably going to lose.

Comment author: Quinn 29 December 2011 02:44:08AM 0 points [-]

p(just one 1) = 1/2^9; A whole heap more likely!

Actually p(just one 1) = 10/(2^10).

if you roll two six-sided dice you are just as likely to get two sixes as you are to get a three and a five.

Nitpick: this is true if by "a three and a five" you mean (that the dice are labeled and) "die A comes up 3, and die B comes up 5", but it's false as written (and in games like Settlers, the identities of simultaneously thrown dice are not tracked).

Comment author: wedrifid 29 December 2011 03:21:12AM 0 points [-]

(and in games like Settlers, the identities of simultaneously thrown dice are not tracked).

They are tracked, actually, at least in the latest version - the only one worth playing. The red one determines whether or not you qualify for a bonus card. So yes, obviously I meant "a three and then a five". If you also consider a five and then a three, the point becomes even stronger.

Comment author: uzalud 28 December 2011 09:23:00PM 0 points [-]

Hm. So the only relevant measure is the prevalence of zeros, because the generators are stateless (the (n+1)th digit does not depend on the first n digits)?

But what if generator B were not necessarily stateless?