Comment author: turchin 23 September 2016 11:58:44PM *  -1 points [-]

It would be interesting to run a null experiment consisting only of two control groups, so we would know the typical difference between two equal groups. It would also be interesting to add two control groups to each experiment, so we could see how strong the effect really is.

For example, if the difference between the main and control groups is 10 per cent, it could look like a strong result. But if we have a second control group, and it differs from the first control group by 7 per cent, our result is not so strong after all.

I think it is clear that we can't do this just by splitting the existing control group in two, since the split could be done in many different ways and the researcher could choose the most favorable one; there could also be interactions inside the control group, and the smaller groups would have less statistical power.
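The null-experiment idea can be checked with a quick simulation (all numbers here are hypothetical): two groups drawn from the same population still show a nonzero percentage gap, which is the baseline a claimed treatment effect should be judged against.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a "null experiment": two control groups drawn from the
# same population, so any observed difference is pure noise.
n = 30                        # hypothetical per-group sample size
pop_mean, pop_sd = 50.0, 10.0

diffs = []
for _ in range(10_000):
    a = rng.normal(pop_mean, pop_sd, n)
    b = rng.normal(pop_mean, pop_sd, n)
    diffs.append(abs(a.mean() - b.mean()) / pop_mean * 100)  # percent gap

# The typical percent gap between two *equal* groups - the yardstick
# against which a "10%" main-vs-control difference should be judged.
print(f"median gap between equal groups: {np.median(diffs):.1f}%")
```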

Comment author: gwern 24 September 2016 12:50:48AM 0 points [-]

I think it is clear that we can't do this just by splitting the existing control group in two, since the split could be done in many different ways and the researcher could choose the most favorable one; there could also be interactions inside the control group, and the smaller groups would have less statistical power.

You can. Cross-validation, the bootstrap, permutation tests - these rely on that sort of procedure. They generate an empirical distribution of differences between groups or effect sizes which replace the assumption of being two normal distributions etc. It would be better to do those with both the experimental and control data, though.
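A permutation test is the simplest of these to sketch (with made-up data): shuffle the pooled observations into two pseudo-groups over and over, and the shuffled differences form an empirical null distribution, with no normality assumption anywhere.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data: a control group and an experimental group.
control = rng.normal(100, 15, 40)
treated = rng.normal(115, 15, 40)
observed = treated.mean() - control.mean()

# Permutation test: repeatedly shuffle the pooled data into two
# pseudo-groups of the original sizes; the resulting differences are
# an empirical distribution of "differences between equal groups".
pooled = np.concatenate([control, treated])
null_diffs = []
for _ in range(10_000):
    rng.shuffle(pooled)
    null_diffs.append(pooled[:40].mean() - pooled[40:].mean())

# p-value: how often pure shuffling produces a gap as big as observed
p = np.mean(np.abs(null_diffs) >= abs(observed))
print(f"observed difference: {observed:.2f}, permutation p-value: {p:.4f}")
```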

Comment author: MattG2 22 September 2016 07:17:29PM *  1 point [-]

You want some sort of adaptive or sequential design (right?), so the optimal design not being terribly helpful is not surprising: they're more intended for fixed up-front designing of experiments.

So after looking at the problem I'm actually working on, I realize an adaptive/sequential design isn't really what I'm after.

What I really want is a fractional factorial model that takes a prior (and minimizes regret between information learned and cumulative score). It seems like the goal of multi-armed bandit is to do exactly that, but I only want to do it once, assuming a fixed prior which doesn't update over time.

Do you think your Monte Carlo Bayesian experimental design is the best way to do this, or can I utilize some of the insights from Thompson sampling to make this process a bit less computationally expensive (which is important for my particular use case)?

Comment author: gwern 23 September 2016 04:34:44PM 1 point [-]

but I only want to do it once, assuming a fixed prior which doesn't update over time.

I still don't understand what you're trying to do. If you're trying to maximize test scores by picking textbooks, and this is done many times, you want a multi-armed bandit to help you find the best textbook over the many students exposed to different combinations. If you are throwing out the information from each batch and assuming the interventions are totally different each time, then your decision is made before you do any learning, and your optimal choice is simply whatever your prior says: the value of information lies in the subsequent decisions it affects, but since you're not updating your prior, the information can't change any decisions after the first one and is worthless.

Do you think your Monte Carlo Bayesian experimental design is the best way to do this, or can I utilize some of the insights from Thompson sampling to make this process a bit less computationally expensive (which is important for my particular use case)?

Dunno. Simulation is the most general way of tackling the problem, which will work for just about anything, but can be extremely computationally expensive. There are many special cases which can reuse computations or have closed-form solutions, but must be considered on a case by case basis.

Comment author: ChristianKl 20 September 2016 05:11:02PM 0 points [-]

I opened a question on skeptics to check the claim: http://skeptics.stackexchange.com/q/35204/196

Comment author: gwern 20 September 2016 07:36:41PM 5 points [-]

You did read the rest of the article, right? Perhaps looked at the bibliography with over a dozen references?

Checkmate atheists.

(More seriously, you should've posted that to the cognitive science stack where there might actually be someone who knows something about IQ or gifted & talented education.)

Comment author: MattG2 20 September 2016 03:52:15PM *  4 points [-]

Let's say I have a set of students, and a set of learning materials for an upcoming test. My goal is to run an experiment to see which learning materials are correlated with better scores on the test via multiple linear regression. I'm also going to make the simplifying assumption that the effects of the learning materials are independent.

I'm looking for an experimental protocol with the following conditions:

  1. I want to be able to give each student as many learning materials as possible. I don't want a simple RCT, but a factorial experiment where students get many materials and the statistics tease out each material's effect via the regression.

  2. I have a prior about which learning materials will do better; I'd like to use this prior by initially distributing those materials to more students.

  3. (Bonus) Students are constantly entering this class, I'd love to be able to do some multi-armed bandit thingy where as I get more data I continually change this prior.

I've looked at most of the links going from https://en.wikipedia.org/wiki/Optimal_design but they mostly show the mathematical interpretation of each method, not a clear explanation of the conditions under which you'd use each method.

Thanks!

Comment author: gwern 20 September 2016 04:38:27PM 2 points [-]

You want some sort of adaptive or sequential design (right?), so the optimal design not being terribly helpful is not surprising: they're more intended for fixed up-front designing of experiments. They also tend to be oriented towards overall information or reduction of variance, which doesn't necessarily correspond to your loss function. Having priors affects the optimal design somewhat (usually, you can spend fewer datapoints on the variables with prior information). For a Bayesian experimental design, you can simulate a set of parameters from your priors, simulate drawing n datapoints with a particular experimental design, fit the model, find your loss or your entropy/variance, record the loss and design, and repeat many times; then pick the design with the best average loss.
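That simulation loop can be written out compactly. Below is a sketch with hypothetical numbers (two binary "materials", a couple of candidate allocations, a simple squared-error loss); ordinary least squares stands in for a full Bayesian fit to keep it short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo evaluation of candidate designs: a design is an
# allocation of n students to the 4 material combinations.
n = 40
combos = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
designs = {
    "balanced":     np.array([10, 10, 10, 10]),
    "skewed-to-11": np.array([4, 8, 8, 20]),  # exploits a prior favoring both materials
}

def simulate_loss(counts, sims=500):
    losses = []
    for _ in range(sims):
        beta = rng.normal([5.0, 2.0], [2.0, 2.0])       # 1. draw true effects from the prior
        X = np.repeat(combos, counts, axis=0)            # 2. allocate students per the design
        y = 70 + X @ beta + rng.normal(0, 10, size=n)    #    and simulate their test scores
        Xd = np.column_stack([np.ones(n), X])
        est = np.linalg.lstsq(Xd, y, rcond=None)[0][1:]  # 3. fit the model
        losses.append(np.sum((est - beta) ** 2))         # 4. loss: error in effect estimates
    return float(np.mean(losses))

# 5. repeat per design; the design with the lowest average loss wins
results = {name: simulate_loss(c) for name, c in designs.items()}
for name, loss in results.items():
    print(name, round(loss, 2))
```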

If you are running the learning material experiment indefinitely and want to maximize cumulative test scores, then it's a multi-armed bandit and so Thompson sampling on a factorial Bayesian model will work well & handle your 3 desiderata: you set your informative priors on each learning material, model as a linear model (with interactions?), and Thompson sample from the model+data.
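A minimal sketch of that, with hypothetical effect sizes: a factorial Bayesian linear regression with a conjugate normal prior (known noise sd), updated after every student, where each new student gets the material combination whose *sampled* coefficients predict the highest score.

```python
import numpy as np

rng = np.random.default_rng(1)

# Arms are combinations of two materials: columns = [intercept, m1, m2].
combos = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
true_beta = np.array([70.0, 5.0, 2.0])   # hypothetical true effects
noise_sd = 10.0

# Informative prior beta ~ N(mu0, inv(P)); track precision P and b = P @ mu.
P = np.diag([1 / 25.0, 1 / 16.0, 1 / 16.0])
b = P @ np.array([70.0, 3.0, 3.0])

pulls = np.zeros(4, dtype=int)
for _ in range(300):
    Sigma = np.linalg.inv(P)
    mu = Sigma @ b
    beta_s = rng.multivariate_normal(mu, Sigma)   # Thompson draw from the posterior
    a = int(np.argmax(combos @ beta_s))           # best combo under the draw
    x = combos[a]
    y = x @ true_beta + rng.normal(0, noise_sd)   # simulated test score
    P += np.outer(x, x) / noise_sd**2             # conjugate normal update
    b += x * y / noise_sd**2
    pulls[a] += 1

print("assignments per combo:", pulls)  # typically concentrates on the best combo
```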

If you want to find what set of learning materials is optimal as fast as possible by the end of your experiment, then that's the 'best-arm identification' multi-armed bandit problem. You can do a kind of Thompson sampling there too: best-arm Thompson sampling: http://imagine.enpc.fr/publications/papers/COLT10.pdf https://www.escholar.manchester.ac.uk/api/datastream?publicationPid=uk-ac-man-scw:227658&datastreamId=FULL-TEXT.PDF http://nowak.ece.wisc.edu/bestArmSurvey.pdf http://arxiv.org/pdf/1407.4443v1.pdf https://papers.nips.cc/paper/4478-multi-bandit-best-arm-identification.pdf One version goes: with the full posteriors, find the action A with the best expected loss; for all the other actions B..Z, Thompson sample their possible value; take the action with the best loss out of A..Z. This explores the other arms in proportion to their remaining chance of being the best arm, better than A, while firming up the estimate of A's value.
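The "one version" above can be sketched for the simplest case, Bernoulli arms with Beta posteriors (success rates made up for illustration): arm A enters the comparison at its posterior mean, every other arm at a Thompson draw, so challengers are pulled in proportion to their remaining chance of beating A.

```python
import numpy as np

rng = np.random.default_rng(2)

true_p = np.array([0.3, 0.5, 0.55])   # hypothetical success rates
alpha = np.ones(3)                     # Beta(1, 1) priors
beta = np.ones(3)

for _ in range(2000):
    means = alpha / (alpha + beta)
    A = int(np.argmax(means))          # current best arm by expected value
    vals = rng.beta(alpha, beta)       # posterior draws for all arms...
    vals[A] = means[A]                 # ...but A competes at its expectation
    a = int(np.argmax(vals))           # a challenger wins only if its draw beats A
    r = rng.random() < true_p[a]
    alpha[a] += r                      # Beta-Bernoulli update
    beta[a] += 1 - r

print("posterior means:", np.round(alpha / (alpha + beta), 3))
```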

Comment author: buybuydandavis 19 September 2016 03:51:50AM 6 points [-]

If your lunatic sensor didn't go off reading this, you should get it adjusted.

A funny comment at LW.

Even lunatics can be right.

Gwern said

The assumption here is that both the general population and elite professions are described by a normal distribution (N(100,15) and N(125,6.5), respectively)

Is it? I didn't see that assumption stated. Problem is, they didn't explicitly specify where they got their distributions. At least I don't see it.

Looking again at some of their conclusions in the preceding paragraph, it does look like they're assuming gaussians based on the mean and sd of a small sample, then projecting that out to the tails. Clearly malpractice.

They don't come out and say it, but the "This means that" below shows that they are extrapolating to the tails.

This means that 95% of people in intellectually elite professions have IQs between 112 and 138; 99.98% have IQs between 99 and 151.

Funny that an article talking about how hard it is to be smart can be so dumb.

Still, my question remains - is there real data out there to support the contention that P(elite career|IQ) has a local max and then decreases for higher IQ?
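For what it's worth, the quoted tail extrapolation is easy to reproduce under the stated N(125, 6.5) assumption; the 95% band matches, while the quoted 99-151 range is closer to a plus/minus 4 sd band than to a 99.98% interval.

```python
from statistics import NormalDist

# Elite professions assumed distributed N(125, 6.5), per the article.
elite = NormalDist(mu=125, sigma=6.5)

# Central 95% interval
lo95, hi95 = elite.inv_cdf(0.025), elite.inv_cdf(0.975)
print(round(lo95), round(hi95))      # 112 138, matching the quoted band

# Central 99.98% interval
lo, hi = elite.inv_cdf(0.0001), elite.inv_cdf(0.9999)
print(round(lo), round(hi))          # 101 149; the quoted 99-151 is nearer +/-4 sd
```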

Comment author: gwern 19 September 2016 06:22:44PM 4 points [-]

Still, my question remains - is there real data out there to support the contention that P(elite career|IQ) has a local max and then decreases for higher IQ?

No. As I point out in my comment there, the evidence is strongly the other way: TIP/SMPY. To the extent that measures like wealth hit diminishing returns or even fall (eg Zagorsky), it has as much to do with personal choices & values as ability: the physicist who could make money on Wall Street but chooses to continue studying particles, the person who chooses to become an influential but poor writer, etc. (There are many coins of the realm, and greenbacks are but one.)

Comment author: James_Miller 19 September 2016 04:30:35AM 6 points [-]

My reading of the behavioral genetics literature is that high intelligence being driven by rare autism variants is looking unlikely.

I haven't looked at this literature, but people with autism and very high IQs might be able to fake being neurotypical. As Steve Hsu told me, we don't know if von Neumann had a normal personality because he certainly had the intelligence to fake being normal if he felt this suited his interests.

Comment author: gwern 19 September 2016 06:20:19PM *  7 points [-]

von Neumann was noted as being social and extraverted long before he began his lobbying and politicking, and was never described as a second Dirac, so I don't think he was simply acting out of expediency. If high intelligence enabled faking extraversion & social skills, which are useful in almost all contexts*, we would see a noted personality correlation with intelligence and increasing with intelligence, which we don't - extraversion is largely independent of IQ, it's Openness in the Big Five which correlates. High-functioning autistic people are also not noted for easily acquiring psychopath-level skills in imitating & manipulating without feeling.

* see for example the correlation of increasing extraversion with increasing lifetime income in the Terman semi-high IQ sample

Comment author: moridinamael 19 September 2016 02:12:28AM 4 points [-]

One interpretation I've seen is that ~130 is about as high as a human brain can get while still using basically the same architecture as an IQ 100 brain. The further beyond that you get, the more you're using significantly different systems. These differences may tend to be autism-related, such that the higher IQ comes at the expense of impairments.

Comment author: gwern 19 September 2016 02:40:47AM 9 points [-]

My reading of the behavioral genetics literature is that high intelligence being driven by rare autism variants is looking unlikely. DeFries-Fulker extremes analyses like "Thinking positively: The genetics of high intelligence", Shakeshaft et al 2015 aren't consistent with the (relatively) high end being due to rare variants (but are consistent with the low end being due to rare variants) and current attempts to find rare variants enriched in the very high IQ with large effect sizes have turned up nothing: "A genome-wide analysis of putative functional and exonic variation associated with extremely high intelligence", Spain et al 2015. There is also an autism heritability observed in the GCTAs/LD score regression using only common SNPs (>=1% population frequency), along with a positive autism/intelligence genetic correlation, which undermines that idea.

My speculation at this point is that Spearman's law of diminishing returns - based on all the genetic correlations with intelligence which have piled up and the current trends in brain imaging studies finding brain volume/thickness & global connectivity & white-matter integrity & connection speed to be the best predictors of intelligence - is due to intelligence reflecting a bottleneck between all the regions of the brain communicating to solve problems: as global communication approaches optimality thanks to better health & development, individual specialized brain regions start to become the bottleneck to higher performance, shrinking the g factor.

Comment author: morganism 13 September 2016 09:56:43PM 3 points [-]

Transposon Theory Of Aging, some more support.

"As cells get older, prior studies have shown, tightly wound heterochromatin wrapping that typically imprisons transposons becomes looser, allowing them to slip out of their positions in chromosomes and move to new ones, disrupting normal cell function. Meanwhile, scientists have shown that potentially related interventions, such as restricting calories or manipulating certain genes, can demonstrably lengthen lifespans in laboratory animals."

http://www.science20.com/news_articles/a_transposon_theory_of_aging-180287

report at PNAS, but link not provided.

Comment author: gwern 13 September 2016 11:57:19PM 3 points [-]

"Chromatin-modifying genetic interventions suppress age-associated transposable element activation and extend life span in Drosophila", Wood et al 2016 http://www.pnas.org/content/early/2016/09/07/1604621113.long

Comment author: gwern 07 September 2016 01:31:57AM 1 point [-]

"Philanthropy’s Success Stories"

I see impressive choices of causes and organizations; I don’t see impressive “tactics,” i.e., choices of projects or theories of change. This may simply be a reflection of the Casebook’s choice of approach and focus. There are many cases in which I found myself agreeing with the Casebook that a foundation had chosen an important and overlooked problem to put its money toward, but few (if any) cases where its strategy within the sector seemed particularly intricate, clever, noteworthy or crucial to its success. The spreadsheet above categorizes grants both by “sector” and by “type of grantmaking”; there is a surprising (to me) amount of variety when it comes to the former, and not as much when it comes to the latter. My intuition is that the choice of sector is the most important choice a funder makes.
