I sometimes wonder just how useful probability and statistics are. There is the theoretical argument that Bayesian probability is the fundamental method of correct reasoning, and that logical reasoning is just the limit as p=0 or 1 (although that never seems to be applied at the meta-level: what is the probability that Bayes' Theorem is true?), but today I want to consider the practice.
Casinos, lotteries, and quantum mechanics: no problem. The information required for deterministic measurement is simply not available, by adversarial design in the first two cases, and by we know not what in the third.

Insurance: by definition, this only works when it's impossible to predict the catastrophes insured against. No-one will offer insurance against a risk that is certain to happen, and no-one will buy it for a risk that certainly won't.

Randomised controlled trials are the gold standard of medical testing; but over on OB, Robin Hanson points out from time to time that the marginal dollar of medical spending has little effectiveness, and we don't actually know how a lot of treatments work.

Quality control: test a random sample from your production run and judge the whole batch from the results. Fine -- it may be too expensive to test every widget, or impossible if the test is destructive. But wherever someone is doing statistical quality control of how accurately jam jars are being filled with the weight of jam it says on the label, someone else will be thinking about how to weigh every single one, and how to make the filling process more accurate. (And someone else will be trying to get the labelling regulations amended to let you sell the occasional 15-ounce pound of jam.)
But when you can make real measurements, that's the way to go. Here is a technical illustration.
Prof. Sagredo has assigned a problem to his two students Simplicio and Salviati: "X is difficult to measure accurately. Predict it in some other way."
Simplicio collects some experimental data consisting of a great many pairs (X,Y) and with high confidence finds a correlation of 0.6 between X and Y. So given the value y of Y, his best prediction for the value of X is 0.6y. [Edit: that formula is mistaken. The regression line for Y against X is Y = bcX/a, assuming the means have been normalised to zero, where a and b are the standard deviations of X and Y respectively. For the Y=X+D1 model below, bc/a is equal to 1.]
Salviati instead tries to measure X, and finds a variable Z which is experimentally found to have a good chance of lying close to X. Let us suppose that the standard deviation of Z-X is 10% that of X.
How do these two approaches compare?
A correlation of 0.6 is generally considered pretty high in psychology and social science, especially if it's established with p=0.001 to be above, say, 0.5. So Simplicio is quite pleased with himself.
A measurement whose range of error is 10% of the range of the thing measured is about as bad as it could be and still be called a measurement. (One might argue that any sort of entanglement whatever is a measurement, but one would be wrong.) It's a rubber tape measure. By that standard, Salviati is doing rather badly.
In effect, Simplicio is trying to predict someone's weight from their height, while Salviati is putting them on a (rather poor) weighing machine (and both, presumably, are putting their subjects on a very expensive and accurate weighing machine to obtain their true weights).
So we are comparing a good correlation with a bad measurement. How do they stack up? Let us suppose that the underlying reality is that Y = X + D1 and Z = X + D2, where X, D1, and D2 are normally distributed and uncorrelated (and causally unrelated, which is a stronger condition). I'm choosing the normal distribution because it's easy to calculate exact numbers, but I don't believe the conclusions would be substantially different for other distributions.
For convenience, assume the variables are normalised to all have mean zero, and let X, D1, and D2 have standard deviations 1, d1, and d2 respectively.
Z-X is D2, so d2 = 0.1. The correlation between Z and X is c(X,Z) = cov(X,Z)/(sd(X)sd(Z)) = 1/sqrt(1 + d2²) = 0.995.
The correlation between X and Y is c(X,Y) = 1/sqrt(1 + d1²) = 0.6, so d1 = 1.333.
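Those figures are easy to check by simulation. Here is a minimal sketch in Python with numpy, assuming the Y = X + D1, Z = X + D2 model above (the variable names and seed are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

d1, d2 = 4.0 / 3.0, 0.1            # standard deviations of D1 and D2
X = rng.normal(0.0, 1.0, n)        # the quantity to be predicted, sd 1
Y = X + rng.normal(0.0, d1, n)     # Simplicio's correlate
Z = X + rng.normal(0.0, d2, n)     # Salviati's rough measurement

print(np.corrcoef(X, Y)[0, 1])     # ~0.6
print(np.corrcoef(X, Z)[0, 1])     # ~0.995
print(1.0 / d1)                    # signal-to-noise ratio of Y: 0.75
```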
We immediately see something suspicious here. Even a terrible measurement yields a sky-high correlation. Or put the other way round, if you're bothering to measure correlations, your data are rubbish. Even this "good" correlation gives a signal to noise ratio of less than 1. But let us proceed to calculate the mutual informations. How much do Y and Z tell you about X, separately or together?
For the bivariate normal distribution, the mutual information between variables A and B with correlation c is lg(I), where lg is the binary logarithm and I = sd(A)/sd(A|B). (The denominator here -- the standard deviation of A conditional on the value of B -- happens to be independent of the particular value of B for this distribution.) This works out to 1/sqrt(1 - c²). So the mutual information is -lg(sqrt(1 - c²)).
|           | corr. | mut. inf. |
|-----------|-------|-----------|
| Simplicio | 0.6   | 0.3219    |
| Salviati  | 0.995 | 3.3291    |
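For anyone who wants to reproduce the table, the formula reduces to a one-liner (a sketch; the helper name is mine):

```python
import numpy as np

def mutual_info_bits(c):
    """Mutual information in bits between two jointly normal variables
    with correlation c: -lg(sqrt(1 - c^2))."""
    return -0.5 * np.log2(1.0 - c * c)

print(mutual_info_bits(0.6))                 # 0.3219  (Simplicio)
print(mutual_info_bits(1 / np.sqrt(1.01)))   # 3.3291  (Salviati)
```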
What can you do with one third of a bit? If Simplicio tries to predict just the sign of X from the sign of Y, he will be right only 70% of the time (i.e. cos⁻¹(-c(X,Y))/π). Salviati will be right 96.8% of the time. Salviati's estimate will even be in the right decile 89% of the time, while on that task Simplicio can hardly do better than chance. So even a good correlation is useless as a measurement.
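The sign-prediction figures come straight from the cos⁻¹(-c)/π formula; a quick check (again a sketch, function name mine):

```python
import numpy as np

def p_same_sign(c):
    """Probability that two jointly normal, zero-mean variables with
    correlation c have the same sign: arccos(-c)/pi."""
    return np.arccos(-c) / np.pi

print(p_same_sign(0.6))                # ~0.705  (Simplicio)
print(p_same_sign(1 / np.sqrt(1.01)))  # ~0.968  (Salviati)
```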
Simplicio and Salviati show their results to Prof. Sagredo. Simplicio can't figure out how Salviati did so much better without taking measurements on thousands of samples. Salviati seemed to just think about the problem and come up with a contraption out of nowhere that did the job, without doing a single statistical test. "But at least," says Simplicio, "you can't throw away my 0.3219, it all adds up!" Sagredo points out that it literally does not add up. The information gained about X from Y and Z together is not 0.3219 + 3.3291 = 3.6510 bits. The correct result is found from the standard deviation of X conditional on both Y and Z, which is sqrt(1/(1 + 1/d1² + 1/d2²)). The information gained is then lg(sqrt(1 + 1/d1² + 1/d2²)) = 0.5*lg(101.5625) = 3.3331. The extra information over knowing just Z is only 0.0040 = 1/250 of a bit, because nearly all of Simplicio's information is already included in Salviati's.
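Sagredo's arithmetic, for the record (a sketch following the formulas above; variable names are mine):

```python
import numpy as np

d1, d2 = 4.0 / 3.0, 0.1
precision = 1.0 + 1.0 / d1**2 + 1.0 / d2**2    # = 101.5625
sd_given_both = np.sqrt(1.0 / precision)       # sd of X given both Y and Z
info_both = 0.5 * np.log2(precision)           # bits from Y and Z together
info_z = 0.5 * np.log2(1.0 + 1.0 / d2**2)      # bits from Z alone

print(info_both)                               # ~3.3331
print(info_both - info_z)                      # ~0.0040 bits gained by adding Y
```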
Sagredo tells Simplicio to go away and come up with some real data.
I'll give this one last try.
You've given me the two results I mentioned above, in the area of motor response. They sound like good experiments to me: you can take a model with relatively few free parameters, and find that most subjects' behavior will fit that model extremely well for some particular values of the parameters. That is the kind of experiment I'd take as good evidence that control theory operates in motor response. (Incidentally, if you could give me a link to those experiments, I'd much appreciate it.)
You've been claiming for months that this is just the tip of the iceberg, that PCT is able to isolate variables that subjects are controlling in cognitive contexts like belief. I would be very interested in this claim if I saw some evidence for it; fortunately, your claim that PCT is able to diagnose and treat cognitive conditions implies that it's strong enough to do the same kind of experiments as in the case of motor response. So I began by asking for references to such results, and gave an example of the kind of result that would definitely move me to look into PCT.
Experimental verification seems to me like the obvious thing for PCT advocates to do if they're confident in their theory and frustrated by its lack of academic respect. I would therefore find it highly unlikely, given that your claims are true, that in 35 years there hasn't been a single positive experimental result in a cognitive context, of the same form as the "controlling the position of the dot" or "varying the force on the bar" experiments. That you meet my question with outrage, rather than with citations, is thus Bayesian evidence against the validity of PCT.
Nope. I'm just claiming that if PCT doesn't have the kind of evidence it claims, then I probably shouldn't bother investigating it. The problems with mainstream psychology are manifold, but the discipline seems to be making (slow) progress by scientific criteria: Tversky and Kahneman, for instance, were making novel and unexpected experimental predictions that turned out to be correct. If your discipline does that much better than mainstream psychology, there should be some strong experimental results that show it.
I really can't imagine that's too much to ask, and that's why I've made this challenge. Point me to experimental results that validate PCT in a cognitive context, and I'll pick up the textbook of your choice. Keep grandstanding against the very kind of evidence you presented as evidence of PCT in motor response, and I'll have to conclude that you're peddling woo-woo. The ball is in your court.
I take it you skipped reading Marken's references then, since I believe one of the cited papers was on how physicians' errors in prescribing medicine match a PCT model of the situation, but fail to match an intuitive model of how such errors would respond to environmental changes.
You've apparently also been ig...