
Some secondary statistics from the results of LW Survey

8 Nanashi 12 February 2015 04:46PM

 

Global LW (N=643) vs USA LW (N=403) vs. Average US Household (Comparable Income)
| Income Bracket | LW Mean Contributions | USA LW Mean Contributions | US Mean Contributions** [1] | LW Mean Income | USA LW Mean Income | US Mean Income*** [1] | LW Contributions/Income | USA LW Contributions/Income | US Contributions/Income [1] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $0-$25,000 (41% of LW) | $1,395.11 | $935.47 | $1,177.52 | $11,241.14 | $11,326.18 | $15,109.85 | 12.41% | 8.26% | 7.79% |
| $25,000-$50,000 (17% of LW) | $438.25 | $571.00 | $1,748.08 | $34,147.14 | $32,758.06 | $38,203.79 | 1.28% | 1.74% | 4.58% |
| $50,000-$75,000 (12% of LW) | $1,757.77 | $1,638.59 | $2,191.58 | $60,387.69 | $61,489.30 | $62,342.05 | 2.91% | 2.66% | 3.52% |
| $75,000-$100,000 (9% of LW) | $1,883.36 | $2,211.81 | $2,624.81 | $84,204.09 | $83,049.54 | $87,182.68 | 2.24% | 2.66% | 3.01% |
| $100,000-$200,000 (16% of LW) | $3,645.73 | $3,372.84 | $3,555.02 | $123,581.28 | $124,577.88 | $137,397.03 | 2.95% | 2.71% | 2.59% |
| >$200,000 (5% of LW) | $14,162.35 | $15,970.67 | $15,843.97 | $296,884.63 | $299,444.44 | $569,447.35 | 4.77% | 5.33% | 2.78% |
| Total | $2,265.56 | $2,669.85 | $3,949.26 | $62,285.72 | $75,130.37 | $133,734.60 | 3.64% | 3.55% | 2.95% |
| All <$200,000 | $1,689.36 | $1,649.32 | $2,515.29 | $51,254.43 | $58,306.81 | $81,207.03 | 3.30% | 2.83% | 3.10% |

 

Global LW (N=643) vs USA LW (N=403) vs. Average US Citizen (Comparable Age)
| Age Bracket* | LW Median | US LW Median | US Median*** [2] |
| --- | --- | --- | --- |
| 15-24 | $17,000.00 | $20,000.00 | $26,999.13 |
| 25-34 | $50,000.00 | $60,504.00 | $45,328.70 |
| All <35 | $40,000.00 | $58,000.00 | $40,889.57 |


Global LW (N=407) vs USA LW (N=243) vs. Average US Citizen (Comparable IQ)
|   | Average LW** | US LW | US Between 125-155 IQ [3] |
| --- | --- | --- | --- |
| Median Income | $40,000.00 | $58,000.00 | $60,528.70 |
| Mean Contributions | $2,265.56 | $2,669.85 | $2,016.00 |

 

Note: Three data points were removed from the sample because, in my subjective opinion, they were fake. Any self-reported IQ of 0 and any self-reported income of 0 were also removed.

*89% of the LW population is between the age of 15 and 34.

**88% of the LW population has an IQ between 125 and 155, with an average IQ of 138. 

***Median numbers were adjusted down by a factor of 1.15 to account for the fact that the source data reported household median income rather than individual median income.

[1] Internal Revenue Service, Charitable Giving by Households that Itemize Deductions (AGI and Itemized Contributions Summary by Zip, 2012), The Urban Institute, National Center for Charitable Statistics 

[2] U.S. Census Bureau, Current Population Survey, 2013 and 2014 Annual Social and Economic Supplements.

[3] Jay L. Zagorsky, "Do you have to be smart to be rich? The impact of IQ on wealth, income and financial distress," Intelligence, Vol. 35, No. 5 (September 2007).

 

Update 1: Updated Charts 1 and 2 to account for the fact that the source data measured household median income rather than individual income.

Update 2: Reverted Chart 1 to the original numbers because I realized its purpose is to compare LWers to those in similar income brackets, so whether the unit is a household or an individual is not as relevant. It does penalize households to an extent, since a household splitting its income three ways has less money available to donate to charity.

Update 3: Updated all charts to include data that is filtered for US only.

Gelman Against Parsimony

5 lukeprog 24 November 2013 03:23PM

In two posts, Bayesian stats guru Andrew Gelman argues against parsimony, even though it seems to be favored 'round these parts, notably in the form of Solomonoff induction and BIC, two imperfect formalizations of Occam's Razor.

Gelman says:

I’ve never seen any good general justification for parsimony...

Maybe it’s because I work in social science, but my feeling is: if you can approximate reality with just a few parameters, fine. If you can use more parameters to fold in more information, that’s even better.

In practice, I often use simple models–because they are less effort to fit and, especially, to understand. But I don’t kid myself that they’re better than more complicated efforts!

My favorite quote on this comes from Radford Neal‘s book, Bayesian Learning for Neural Networks, pp. 103-104: "Sometimes a simple model will outperform a more complex model . . . Nevertheless, I believe that deliberately limiting the complexity of the model is not fruitful when the problem is evidently complex. Instead, if a simple model is found that outperforms some particular complex model, the appropriate response is to define a different complex model that captures whatever aspect of the problem led to the simple model performing well."

...

...ideas like minimum-description-length, parsimony, and Akaike’s information criterion, are particularly relevant when models are estimated using least squares, maximum likelihood, or some other similar optimization method.

When using hierarchical models, we can avoid overfitting and get good descriptions without using parsimony–the idea is that the many parameters of the model are themselves modeled. See here for some discussion of Radford Neal’s ideas in favor of complex models, and see here for an example from my own applied research.

Stats Advice on a New N-back Game

4 Antisuji 29 May 2013 09:44PM

Cross-posted to my blog. I expect this will be of some interest to the LessWrong community both because of previous interest in N-back and because of the opportunity to apply Bayesian statistics to a real-world problem. The main reason I'm writing this article is to get feedback on my approach and to ask for help in the areas where I'm stuck. For some background, I'm a software developer who's been working in games for 7+ years and recently left my corporate job to work on this project full-time.

As I mentioned here and here, since early February I've been working on an N-back-like mobile game. I plan to release for iOS this summer and for Android a few months later if all goes well. I have fully implemented the core gameplay and most of the visual styling and UI, and am currently working with a composer on the sound and music.

I am just now starting on the final component of the game: an adaptive mode that assesses the player's skill and presents challenges that are tuned to induce a state of flow.

The Problem

The game is broken down into waves, each of which presents an N-back-like task with certain parameters, such as the number of attributes, the number of variants in each attribute, the tempo, and so on. I would like to find a way to collapse these parameters into a single difficulty parameter that I can compare against a player's skill level to predict their performance on a given wave.

But I realize that some players will be better at some challenges than others (e.g. memory, matching multiple attributes, handling fast tempos, dealing with visual distractions like rotation, or recognizing letters). Skill and difficulty are multidimensional quantities, and this makes performance hard to predict. The question is, is there a single-parameter approximation that delivers an adequate experience? Additionally, the task is not pure N-back — I've made it more game-like — and as a result the relationship between the game parameters and the overall difficulty is not as straightforward as it would be in a cleaner environment (e.g. difficulty might be nearly linear in tempo for some set-ups but highly non-linear for others).

I have the luxury of having access to fairly rich behavioral data. The game is partly a rhythm game, so not only do I know whether a match has been made correctly (or a non-match correctly skipped) but I also know the timing of a player's positive responses. A player with higher skill should have smaller timing errors, so a well-timed match is evidence for higher skill. I am still unsure exactly how I can use this information optimally.
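One option I'm considering is to treat each tap's timing error as a zero-mean Gaussian whose spread shrinks as skill grows, so that consistently tight timing is likelihood evidence for a higher skill hypothesis. This is only a sketch: the Gaussian model and the 200/(1 + skill) spread are placeholders I made up, not anything calibrated.

```python
import math

def timing_loglik(errors_ms, skill):
    """Log-likelihood of observed timing errors (in ms) under a
    zero-mean Gaussian whose standard deviation shrinks with skill.
    The skill-to-sigma mapping below is a made-up placeholder."""
    sigma = 200.0 / (1.0 + skill)  # hypothetical: higher skill -> tighter timing
    return sum(-0.5 * (e / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))
               for e in errors_ms)

errors = [30.0, -45.0, 10.0, 60.0]
# Small errors fit a tight-timing (high-skill) hypothesis better:
assert timing_loglik(errors, skill=2.0) > timing_loglik(errors, skill=0.1)
```

Summing these log-likelihoods across taps would then feed into the same posterior update as the match/no-match outcomes.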

I plan to display a plot of player skill over time, but this opens another set of questions. What exactly am I plotting? How do I model player skill over time (just a time-weighted average? as a series of slopes and plateaus? how should I expect skill to change over a period of time without any play?)? How much variation in performance is due to fatigue, attention, caffeine, etc.? Do I show error bars or box plots? What units do I use?
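As a baseline for the "just a time-weighted average?" option, here is a sketch of an exponentially decayed average over per-session skill estimates, so that recent sessions dominate. The 30-day half-life is an arbitrary placeholder, and timestamps are assumed to be in days.

```python
def decayed_skill(estimates, now, half_life_days=30.0):
    """Time-weighted average of (timestamp_days, skill_estimate) pairs.
    Older estimates count less, halving in weight every half_life_days.
    The half-life value is an arbitrary placeholder."""
    num = den = 0.0
    for t, s in estimates:
        w = 0.5 ** ((now - t) / half_life_days)  # exponential decay weight
        num += w * s
        den += w
    return num / den

# A session 30 days ago gets half the weight of one today:
current = decayed_skill([(0.0, 10.0), (30.0, 20.0)], now=30.0)
```

This wouldn't capture plateaus or decay from not playing, but it gives a defensible single number to plot while I work out a richer model.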

And finally, how do I turn a difficulty and a skill level into a prediction of performance? What is the model of the player playing the game?

Main Questions

  • Is there an adequate difficulty parameter and if so how do I calculate it?
  • Can I use timing data to improve predictions? How?
  • What model do I use for player skill changing over time?
  • How do I communicate performance stats to the user? Box and whiskers? Units?
  • What is the model of the player and how do I turn that into a prediction?

My Approach

I've read Sivia, so I have some theoretical background on how to solve this kind of problem, but only limited real-world experience. These are my thoughts so far.

Modeling gameplay performance as Bernoulli trials seems ok. That is, given a skill level S and a difficulty D, performance on a set of N matches should be closely matched by N Bernoulli trials with probability of success p(S, D) as follows:

  • if S ≪ D, p = 0.5
  • if S ≫ D, p is close to 1.0 (how close?)
  • if S = D, p = 0.9 feels about right
  • etc.

Then I can update S (and maybe D? see next paragraph) on actual player performance. This will result in a new probability density function over the "true" value of S, which will hopefully be unimodal and narrow enough to report as a single best estimate (possibly with error bars). Which reminds me, what do I use as a prior for S? And what happens if the player just stops playing halfway through, or hands the game to their 5-year-old?
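Here's a minimal grid-approximation sketch of that update. The logistic form of p(S, D), the slope k, and the flat prior are all assumptions on my part; the log(4) offset just pins p at 0.9 when S = D, per the bullets above.

```python
import math

def p_success(skill, difficulty, k=1.0):
    """Success probability matching the bullet points above:
    -> 0.5 when skill << difficulty, 0.9 at skill == difficulty,
    -> 1.0 when skill >> difficulty.  The slope k is a free parameter."""
    z = k * (skill - difficulty) + math.log(4)  # log(4) pins p(S=D) at 0.9
    return 0.5 + 0.5 / (1.0 + math.exp(-z))

def posterior_skill(grid, prior, difficulty, successes, failures):
    """Grid-approximation Bayesian update of skill from Bernoulli outcomes."""
    post = [pr * p_success(s, difficulty) ** successes
               * (1 - p_success(s, difficulty)) ** failures
            for s, pr in zip(grid, prior)]
    total = sum(post)
    return [w / total for w in post]

grid = [i / 10 for i in range(-50, 51)]   # candidate skill levels
prior = [1 / len(grid)] * len(grid)       # flat prior as a placeholder
post = posterior_skill(grid, prior, difficulty=0.0, successes=18, failures=2)
best = grid[max(range(len(grid)), key=post.__getitem__)]
```

With 18 hits out of 20 on a difficulty-0 wave, the posterior peaks at S = D, which is at least a sanity check that the pieces fit together.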

Determining difficulty is another hard problem. I currently have a complicated ad-hoc formula cobbled together from logarithms, exponentials, magic numbers, and lots of trial and error. It seems to work pretty well for the limited set of levels I've tested with a small group of playtesters, but I'm worried that it won't predict difficulty well outside of that domain. One possibility is to crowd-source it: after release I'd collect performance data across all users and update the difficulty ratings on the fly. This seems risky and difficult, and the initial difficulty ratings might be way off, which would lead to poor initial user experiences with the adaptive mode. I would also have to maintain a server back-end to gather the data and report updated difficulty levels.
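For the crowd-sourcing idea, one substitute scheme I've been mulling (not my current formula) is an Elo-style update that keeps skill and difficulty on a single shared scale and nudges both after every wave, so difficulty ratings self-correct as data comes in. The step size is arbitrary here.

```python
def elo_update(skill, difficulty, won, k_step=0.05):
    """Elo-style online update (a substitute scheme, not my current
    formula): after each wave, move the player's skill toward the
    observed outcome and the wave's crowd-sourced difficulty the
    opposite way.  k_step is an arbitrary learning rate."""
    expected = 1.0 / (1.0 + 10 ** (difficulty - skill))  # predicted win prob.
    delta = k_step * ((1.0 if won else 0.0) - expected)
    return skill + delta, difficulty - delta

skill, difficulty = 0.0, 0.0
for _ in range(10):  # ten consecutive cleared waves
    skill, difficulty = elo_update(skill, difficulty, won=True)
```

A win that was already expected barely moves either number, while an upset moves both a lot, which is roughly the behavior I'd want from on-the-fly ratings.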

Request For Feedback

So, any suggestions on how to tackle these problems? Or the first place to start looking?

I'm pretty excited about the potential to collect real-world data on skill acquisition over time. If there is sufficient interest I'll consider making the raw data public, and even instrument the code to collect other data of interest, by request. I do have some concerns over data privacy, so I may allow users to opt out of sending their data up to the server.