Eugine_Nier comments on Political ideas meant to provoke thought - Less Wrong
Every actual population differs from a parameterised mathematical function with few parameters, and for pretty much anything you can measure, if the mathematical distribution has infinite support, there will be some reason that the population cannot. But the question to ask is not, are they different, but, does the difference make a difference?
The way to answer this question is to repeat the analysis in the paper Eugine cited using a truncated power law. The bounds must be placed at the limits of what is possible, not at the accidental maximum and minimum values observed in the current population, as the point here is that the population is not fully exploring the tails.
I have not done this, but I did once do a simulation for the Cauchy distribution (which has no mean), finding empirically the standard deviation of the mean of samples of size N. Each individual set of N values has a mean, but they will be wildly different for different samples. Increasing N does not reduce the effect for any practical value of N (and I did this in Matlab, which is optimised for fast number-crunching on arrays). This is completely different from what happens for sample means drawn from distributions with finite mean and variance, whose means converge with increasing N to the population mean.
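For anyone who wants to try this kind of experiment, here is a minimal sketch in Python rather than Matlab. It is not the original simulation: the function names, sample sizes, trial counts and seed are all assumptions for illustration, and it measures the spread of the sample means by interquartile range (IQR) instead of standard deviation, because the IQR is a stable statistic even for the Cauchy case. A standard normal is included for contrast.

```python
import math
import random
import statistics

def cauchy_draw(rng):
    # Inverse-CDF sampling for the standard Cauchy distribution.
    return math.tan(math.pi * (rng.random() - 0.5))

def normal_draw(rng):
    # Standard normal, used only as a finite-variance comparison.
    return rng.gauss(0.0, 1.0)

def iqr_of_sample_means(draw, n, trials, rng):
    # Interquartile range of the means of `trials` independent samples of size n.
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(trials)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

def run_experiment(sizes=(10, 100, 1000), trials=300, seed=0):
    # Returns {n: (cauchy_iqr, normal_iqr)} for each sample size n.
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        results[n] = (
            iqr_of_sample_means(cauchy_draw, n, trials, rng),
            iqr_of_sample_means(normal_draw, n, trials, rng),
        )
    return results

if __name__ == "__main__":
    for n, (cauchy_iqr, normal_iqr) in run_experiment().items():
        print(f"N={n:5d}  Cauchy IQR={cauchy_iqr:8.3f}  normal IQR={normal_iqr:8.3f}")
```

The normal column should shrink roughly as 1/sqrt(N), while the Cauchy column should stay near the same value at every N, since the mean of N standard Cauchy draws is itself standard Cauchy.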
For my experiment with the Cauchy distribution, not a single one of my samples had to be rejected due to exceeding the limits of finite precision arithmetic. The absence of infinite tails from the samples made no difference to the experimental results, even though it is the presence of those infinite tails that gives the Cauchy distribution its lack of moments.
This may look like a paradox. You have two distributions, the Cauchy distribution and its truncation at 1e50 or wherever. The former has no moments, and the latter does. Yet the empirical behaviour of samples drawn from the latter agrees with mathematical analysis of the former, even though in the latter case the standard deviation of the sample mean must converge with increasing sample size to zero, and in the former case it remains infinite.
The resolution of this paradox lies in the fact that as the variance of a distribution that has a finite variance becomes larger and larger, the rate of convergence of sample means becomes slower and slower. For the Cauchy distribution truncated at +/- X and a sample size of N, for large X and N the variance of the sample mean is proportional to X/N. If we take the limit of this as X goes to infinity, we get infinity, independent of N. If we take the limit as N goes to infinity we get zero, independent of X. The behaviour found when both X and N are finite will depend on which is bigger. When X is very large, even the entire population (conceived as a sample from an underlying data-generation process) may not give a good estimate of the distribution mean.
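The X/N claim can be checked with a few lines of arithmetic. For a standard Cauchy truncated symmetrically at +/- X, the variance works out to (X - arctan X) / arctan X, which approaches 2X/pi for large X. The sketch below assumes the standard (location 0, scale 1) Cauchy; the function names are mine.

```python
import math

def truncated_cauchy_variance(x):
    # Variance of a standard Cauchy truncated to [-x, x].
    # The mean is 0 by symmetry, so this is E[t^2] under the truncated density:
    #   integral of t^2 / (1 + t^2) over [-x, x], divided by
    #   integral of 1 / (1 + t^2) over [-x, x].
    return (x - math.atan(x)) / math.atan(x)

def sample_mean_variance(x, n):
    # Variance of the mean of n independent draws from the truncated distribution.
    return truncated_cauchy_variance(x) / n

if __name__ == "__main__":
    for x in (1e3, 1e6, 1e50):
        for n in (1e3, 1e6, 1e9):
            print(f"X={x:.0e}  N={n:.0e}  var(mean)={sample_mean_variance(x, n):.3e}")
```

With X at 1e50, no sample size you could actually draw brings the variance of the mean anywhere near zero, which is the practical content of "it depends on which is bigger".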
Taleb and Douady's point is that for a power law distribution, wealth owned by the top 1% is subject to this phenomenon. A larger population will explore more of the tail of the distribution, and unlike the normal distribution, the tail is fat enough to give a different value for the statistic. The "true" distribution does not have to actually have infinite support, for the entire population of a country to be insufficient to explore the tails.
The authors draw the implication that as both population and technological development grow, the top 1% will be found to have larger proportions of the wealth, not because of any change in the mechanisms of society to favour them, but because more of the sample space is being explored. "So examining times series, we can easily get a historical illusion of rise in wealth concentration when it has been there all along." (Presumably one could quantify the effect and correct for it.)
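A rough way to see the effect for yourself is to draw populations of different sizes from the same Pareto distribution and compare the measured top-1% share. This is only an illustrative sketch, not the authors' analysis: the tail exponent of 1.16 (close to the 80/20 value), the population sizes, the trial count and the seed are all assumptions.

```python
import random
import statistics

ALPHA = 1.16  # assumed tail exponent, roughly the "80/20" Pareto value

def top_share(sample, fraction=0.01):
    # Share of the total held by the largest `fraction` of the sample.
    k = max(1, round(len(sample) * fraction))
    ordered = sorted(sample, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

def median_top_share(n, trials, rng, alpha=ALPHA):
    # Median, over `trials` simulated populations of size n, of the top-1% share.
    shares = [
        top_share([rng.paretovariate(alpha) for _ in range(n)])
        for _ in range(trials)
    ]
    return statistics.median(shares)

def theoretical_top_share(alpha=ALPHA, fraction=0.01):
    # Top-`fraction` share for an exact Pareto distribution (requires alpha > 1).
    return fraction ** (1.0 - 1.0 / alpha)

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 1000, 5000):
        print(f"n={n:5d}  median top-1% share={median_top_share(n, 100, rng):.3f}")
    print(f"theoretical share={theoretical_top_share():.3f}")
```

The generating process is identical in every run, yet the typical measured share rises with population size and stays below the theoretical figure, because small populations rarely contain the extreme values that carry much of the total.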
A possibility that the paper does not raise is that instead of calculating the actual wealth held by the actual top 1%, you could estimate the Gini coefficient from the whole population, and calculate a theoretical 1% wealth. This may be substantially more. The authors suggest that Pareto's empirical observation of the 80/20 rule, which implies 53% wealth held by the top 1%, might actually correspond to a figure of 70%.
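The 53% figure, and the Gini route to a theoretical 1% share, can be reproduced as below. The sketch assumes the population is exactly Pareto with exponent alpha > 1, for which the top fraction p holds p^(1 - 1/alpha) of the wealth and the Gini coefficient is 1/(2*alpha - 1). The function names are mine, and it makes no attempt to reproduce the paper's 70% figure.

```python
import math

def alpha_from_rule(top_fraction=0.2, share=0.8):
    # Solve top_fraction ** (1 - 1/alpha) == share for alpha.
    exponent = math.log(share) / math.log(top_fraction)  # equals 1 - 1/alpha
    return 1.0 / (1.0 - exponent)

def top_share(alpha, fraction):
    # Share of total wealth held by the top `fraction` under an exact Pareto law.
    return fraction ** (1.0 - 1.0 / alpha)

def gini_from_alpha(alpha):
    # Gini coefficient of a Pareto distribution with exponent alpha > 1.
    return 1.0 / (2.0 * alpha - 1.0)

def alpha_from_gini(gini):
    # Inverse of gini_from_alpha.
    return (1.0 + 1.0 / gini) / 2.0

if __name__ == "__main__":
    alpha = alpha_from_rule()      # the 80/20 rule
    gini = gini_from_alpha(alpha)
    print(f"alpha={alpha:.3f}  gini={gini:.3f}")
    print(f"top 1% share={top_share(alpha_from_gini(gini), 0.01):.3f}")
```

Given a Gini coefficient estimated from the whole population, `top_share(alpha_from_gini(gini), 0.01)` is the theoretical 1% share, which holds only to the extent that the Pareto assumption does.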
This could be spun in opposite ways. If you want to boom freedom and boo levellers, you can point to this and say there's always more room at the top. If you want to boom equality and boo the rich, you can say that the true situation is even worse than the 1% figure says, indeed that the figure is a systematic underestimate, a piece of evil propaganda used by the rich to conceal the true extent of the inequality inherent in the system.
Take your pick.
Taleb would probably object on the grounds that the above will lead to misleading results if the population is actually composed of a superposition of several distinct populations with different Gini coefficients.
His paper does go into these and other elaborations of the basic point.