Suppose you have a system X that's equally likely to be in any of 8 possible states:
{X1, X2, X3, X4, X5, X6, X7, X8}.
There's an extraordinarily ubiquitous quantity—in physics, mathematics, and even biology—called entropy; and the entropy of X is 3 bits. This means that, on average, we'll have to ask 3 yes-or-no questions to find out X's value. For example, someone could tell us X's value using this code:
X1: 001 X2: 010 X3: 011 X4: 100 X5: 101 X6: 110 X7: 111 X8: 000
So if I asked "Is the first symbol 1?" and heard "yes", then asked "Is the second symbol 1?" and heard "no", then asked "Is the third symbol 1?" and heard "no", I would know that X was in state 4.
Now suppose that the system Y has four possible states with the following probabilities:
Y1: 1/2 (50%) Y2: 1/4 (25%) Y3: 1/8 (12.5%) Y4: 1/8 (12.5%)
Then the entropy of Y would be 1.75 bits, meaning that we can find out its value by asking, on average, 1.75 yes-or-no questions.
What does it mean to talk about asking one and three-fourths of a question? Imagine that we designate the states of Y using the following code:
Y1: 1 Y2: 01 Y3: 001 Y4: 000
First you ask, "Is the first symbol 1?" If the answer is "yes", you're done: Y is in state 1. This happens half the time, so 50% of the time, it takes 1 yes-or-no question to find out Y's state.
Suppose that instead the answer is "No". Then you ask, "Is the second symbol 1?" If the answer is "yes", you're done: Y is in state 2. Y is in state 2 with probability 1/4, and each time Y is in state 2 we discover this fact using two yes-or-no questions, so 25% of the time it takes 2 questions to discover Y's state.
If the answer is "No" twice in a row, you ask "Is the third symbol 1?" If "yes", you're done and Y is in state 3; if "no", you're done and Y is in state 4. The 1/8 of the time that Y is in state 3, it takes three questions; and the 1/8 of the time that Y is in state 4, it takes three questions.
(1/2 * 1) + (1/4 * 2) + (1/8 * 3) + (1/8 * 3)
= 0.5 + 0.5 + 0.375 + 0.375
= 1.75.
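The arithmetic above can be checked with a minimal sketch. The names `probs`, `code`, and `expected_questions` are illustrative choices, not part of the original text; the probabilities and codewords are copied from the example for Y.

```python
from fractions import Fraction

# Probabilities and codewords for Y, copied from the text.
probs = {"Y1": Fraction(1, 2), "Y2": Fraction(1, 4),
         "Y3": Fraction(1, 8), "Y4": Fraction(1, 8)}
code = {"Y1": "1", "Y2": "01", "Y3": "001", "Y4": "000"}

def expected_questions(probs, code):
    """Average number of yes-or-no questions, one question per code symbol."""
    return sum(p * len(code[state]) for state, p in probs.items())

average = expected_questions(probs, code)
print(average, float(average))  # 7/4 1.75
```

Exact fractions are used here only so that the result is 7/4 with no rounding.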
The general formula for the entropy of a system S is the sum, over all Si, of -p(Si)*log2(p(Si)).
For example, the log (base 2) of 1/8 is -3. So -(1/8 * -3) = 0.375 is the contribution of state S4 to the total entropy: 1/8 of the time, we have to ask 3 questions.
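As a rough sketch, the formula can be written as a short function and applied to both systems from the text. The function name `entropy_bits` is an assumption made for illustration.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits: the sum of -p * log2(p) over states with p > 0."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

x_entropy = entropy_bits([1 / 8] * 8)                   # eight equally likely states
y_entropy = entropy_bits([1 / 2, 1 / 4, 1 / 8, 1 / 8])  # the four states of Y
print(x_entropy, y_entropy)  # 3.0 1.75
```

States with probability zero are skipped, since they contribute nothing to the sum and log2(0) is undefined.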
You can't always devise a perfect code for a system, but if you have to tell someone the state of arbitrarily many copies of S in a single message, you can get arbitrarily close to a perfect code. (Google "arithmetic coding" for a simple method.)
Now, you might ask: "Why not use the code 10 for Y4, instead of 000? Wouldn't that let us transmit messages more quickly?"
But if you use the code 10 for Y4, then when someone answers "Yes" to the question "Is the first symbol 1?", you won't know yet whether the system state is Y1 (1) or Y4 (10). In fact, if you change the code this way, the whole system falls apart, because if you hear "1001", you don't know if it means "Y4, followed by Y2" or "Y1, followed by Y3."
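The ambiguity can be made concrete by listing every way a message splits into codewords. This is a small sketch; the helper `parses` and the names `good_code` and `bad_code` are hypothetical and introduced only for this example.

```python
def parses(message, code):
    """Return every way to split `message` into codewords of `code`."""
    if message == "":
        return [[]]
    results = []
    for state, word in code.items():
        if message.startswith(word):
            # Consume this codeword, then parse whatever is left.
            for rest in parses(message[len(word):], code):
                results.append([state] + rest)
    return results

good_code = {"Y1": "1", "Y2": "01", "Y3": "001", "Y4": "000"}
bad_code = {"Y1": "1", "Y2": "01", "Y3": "001", "Y4": "10"}

print(parses("1001", good_code))  # [['Y1', 'Y3']]
print(parses("1001", bad_code))   # [['Y1', 'Y3'], ['Y4', 'Y2']]
```

In the original code no codeword is the beginning of another, so there is only one way to read the message. Once Y4 becomes 10, the codeword for Y1 is a prefix of it and a second reading appears.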
The moral is that short words are a conserved resource.
The key to creating a good code—a code that transmits messages as compactly as possible—is to reserve short words for things that you'll need to say frequently, and use longer words for things that you won't need to say as often.
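One standard way to build such a code is Huffman's algorithm. The text does not name it, so take the following as an illustrative sketch rather than the method the essay has in mind. It repeatedly merges the two least probable groups, which leaves the rarest states with the longest codewords.

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary prefix code with Huffman's algorithm."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {state: ""}) for state, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, low = heapq.heappop(heap)   # least probable group
        p1, _, high = heapq.heappop(heap)  # next least probable group
        merged = {s: "0" + w for s, w in low.items()}
        merged.update({s: "1" + w for s, w in high.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"Y1": 0.5, "Y2": 0.25, "Y3": 0.125, "Y4": 0.125}
code = huffman_code(probs)
lengths = {state: len(word) for state, word in code.items()}
print(lengths)  # {'Y1': 1, 'Y2': 2, 'Y3': 3, 'Y4': 3}
```

The individual symbols may differ from the code given earlier for Y (which symbol means "yes" is arbitrary), but the codeword lengths come out the same: 1, 2, 3, and 3.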
When you take this art to its limit, the length of the message you need to describe something corresponds exactly or almost exactly to its improbability: the negative base-2 logarithm of its probability. This is the Minimum Description Length or Minimum Message Length formalization of Occam's Razor.
And so even the labels that we use for words are not quite arbitrary. The sounds that we attach to our concepts can be better or worse, wiser or more foolish. Even apart from considerations of common usage!
I say all this, because the idea that "You can X any way you like" is a huge obstacle to learning how to X wisely. "It's a free country; I have a right to my own opinion" obstructs the art of finding truth. "I can define a word any way I like" obstructs the art of carving reality at its joints. And even the sensible-sounding "The labels we attach to words are arbitrary" obstructs awareness of compactness. Prosody too, for that matter—Tolkien once observed what a beautiful sound the phrase "cellar door" makes; that is the kind of awareness it takes to use language like Tolkien.
The length of words also plays a nontrivial role in the cognitive science of language:
Consider the phrases "recliner", "chair", and "furniture". Recliner is a more specific category than chair; furniture is a more general category than chair. But the vast majority of chairs have a common use—you use the same sort of motor actions to sit down in them, and you sit down in them for the same sort of purpose (to take your weight off your feet while you eat, or read, or type, or rest). Recliners do not depart from this theme. "Furniture", on the other hand, includes things like beds and tables which have different uses, and call up different motor functions, from chairs.
In the terminology of cognitive psychology, "chair" is a basic-level category.
People have a tendency to talk, and presumably think, at the basic level of categorization—to draw the boundary around "chairs", rather than around the more specific category "recliner", or the more general category "furniture". People are more likely to say "You can sit in that chair" than "You can sit in that recliner" or "You can sit in that furniture".
And it is no coincidence that the word for "chair" contains fewer syllables than either "recliner" or "furniture". Basic-level categories, in general, tend to have short names; and nouns with short names tend to refer to basic-level categories. Not a perfect rule, of course, but a definite tendency. Frequent use goes along with short words; short words go along with frequent use.
Or as Douglas Hofstadter put it, there's a reason why the English language uses "the" to mean "the" and "antidisestablishmentarianism" to mean "antidisestablishmentarianism" instead of antidisestablishmentarianism other way around.
"So these people are frauds?"
To answer your question: I think the card may be one of the better random generators in existence, so it is absolutely not a fraud (and perhaps I will buy one... thank you for the link). But there are many definitions of random. The only theoretical definition of randomness I accept as true is that a bit string S is random only if K(S) >= Len(S). My strongly deterministic opinion is that in the real world such random objects exist only if we take short objects, even for quantum fields (as Wolfram thinks). I have read articles where people reason about quantum fields and their relation to an exponential or polynomial world. I don't know what the answer is, but I don't think that quantum fields open the door to an exponential world.
This opinion comes from the many discrepancies I find between the mathematical description and what happens in practice. For example, for the K function (Kolmogorov complexity) there are proofs saying that for the majority of values we have K(X) >= Log(X), and if you look at the function you can say this is absolutely correct. Moreover, only in very, very few cases do we have K(X) < Log(X), and from a statistical point of view everything is coherent too. The probability of compressing an X of N bits, using the optimal K function, into a Y of N/2 bits is (2^(N/2))/(2^N). This probability is exponentially low! So as X grows, the probability of having a small K(X) becomes very low.
What does this mean from a practical point of view? It means that if I take a file from my PC and try to compress it using a very, very bad compressor (bad because the K function is an idealized compressor with infinite power), it is crazy to hope to compress it. But I am absolutely sure that I am able to compress it, and with a high probability of 50%! Why? What happens? There are many explanations we could give for this phenomenon, but what follows is my opinion.
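To see how quickly the ratio (2^(N/2))/(2^N) quoted above shrinks, here is a small sketch that tabulates it for a few even values of N. The function name `fraction_compressible_to_half` is made up for illustration.

```python
from fractions import Fraction

def fraction_compressible_to_half(n_bits):
    """Ratio 2^(N/2) / 2^N from the comment above (n_bits assumed even)."""
    return Fraction(2 ** (n_bits // 2), 2 ** n_bits)

for n in (8, 16, 64):
    print(n, fraction_compressible_to_half(n))
# 8 1/16
# 16 1/256
# 64 1/4294967296
```

The ratio simplifies to 2^(-N/2), so it halves for every two bits added to N.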
When we get an object, when we receive an input, this object is only a representation of the information of the object, computed by a program. Every program has limited power, and we can define this power as a limit on the size of the bit string that the program can produce. I call this limit M, and with this limit some things change. For example, the number of available bit strings of length N, which in the mathematical view is 2^N, now becomes min( 2^N , M/N ). The compression probability changes... The universal distribution changes... But the interesting property of this view is that it makes practical expectations coherent with the theoretical functions. In the standard mathematical/statistical view it is easier to compress small strings and it becomes more difficult to compress large strings, in absolute opposition to what happens in the real world! When I have a small file I expect it will be difficult to compress, and when I have a big file I expect to get a big compression ratio.
If you look at this function min( 2^N , M/N ), this is exactly what happens! For small strings we have exponential behaviour, which is coherent with the classical mathematical view, but after the limit M the behaviour changes and the probability of compression, for example, increases!
Another important characteristic is that big variations in the parameter M make small variations in the behaviour of the function, so it is more important to assume the existence of this behaviour, even if we don't know M, even if it is impossible to compute M!
This causes a discrepancy in entropy, in the assumption of Log as the function for measuring information, etc., because these theories suppose an exponential world! A very big world! (Exponential functions are very big and we cannot underrate them!)
There are many consequences of this simple observation, and many still to be investigated.
I don't know if quantum fields will open the exponential door, but the world's behaviour seems polynomial to me.
Denis.