Exactly! This is a math problem! And it becomes a very complicated math problem very quickly as the prior information gets interesting.
There's nothing magical about an AI; it can't figure out anything a human couldn't figure out in principle. The difference is the "superintelligence" bit: a superintelligent AI could efficiently use much more complicated prior information for experiment design.
JWW suggests that an AI could partition trial subjects into control and experimental groups such that the expected number of events in both was equal, and presumably also such that cases involving assumptions were distributed equally, to minimize the impact of those assumptions. For instance, an AI doing a study of responses to an artificial sweetener could do some calculations to estimate the impact of each gene on sugar metabolism, then partition subjects so as to balance their allele frequencies for those genes.
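To make the sweetener example concrete, here is a minimal sketch of one way such a balanced partition could be searched for. It is not JWW's method: it swaps in plain rerandomization (draw many random half/half splits and keep the most balanced one), and everything named here (`imbalance`, `balanced_partition`, the per-gene `weights` standing in for estimated effects on sugar metabolism, genotypes coded as 0/1/2 allele counts) is a hypothetical choice made for illustration.

```python
import random

def imbalance(genotypes, weights, treated):
    """Weighted sum of absolute differences in mean allele count between arms.

    genotypes: list of per-subject lists of allele counts (assumed coding 0/1/2)
    weights:   assumed per-gene importance (hypothetical effect estimates)
    treated:   set of subject indices assigned to the experimental group
    """
    control = [i for i in range(len(genotypes)) if i not in treated]
    total = 0.0
    for g, w in enumerate(weights):
        mean_t = sum(genotypes[i][g] for i in treated) / len(treated)
        mean_c = sum(genotypes[i][g] for i in control) / len(control)
        total += w * abs(mean_t - mean_c)
    return total

def balanced_partition(genotypes, weights, n_draws=2000, seed=0):
    """Rerandomization: try n_draws random half/half splits, keep the best."""
    rng = random.Random(seed)
    n = len(genotypes)
    best, best_score = None, float("inf")
    for _ in range(n_draws):
        treated = frozenset(rng.sample(range(n), n // 2))
        score = imbalance(genotypes, weights, treated)
        if score < best_score:
            best, best_score = treated, score
    return best, best_score
```

Note that the result is only as good as the weights: if the estimated gene effects are wrong, the split is balanced on the wrong things, which is the objection raised further down the thread.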
(A more extreme interpretation would be that the AI is partitioning subjects and performing the experiment not in a way designed to test a single hypothesis, but to maximize total information extracted from the experiment. This would be optimal, but a radical departure from how we do science. Actually, now that I think of it, I wrote a grant proposal suggesting this 7 years ago. My idea was that molecular biology must now be done by interposing a layer of abstraction via computational intelligence in between the scientist and the data, so that the scientist is framing hypotheses not about individual genes or proteins, but about causes, effects, or systems. It was not well-received.)
There's another comment somewhere countering this idea by noting that this almost requires omniscience; the method one uses to balance out one bias may introduce another.
There is a lot of statistical literature on optimal experimental design, and it's used all the time. Years ago at Intel, we spent a lot of time on optimal design of quality control measurements, and I have no doubt a lot of industrial scientists in other companies spend their time thinking about such things.
The problem is that information is a model-dependent concept (the derivatives of the log-likelihood depend on the likelihood), so if your prior isn't fairly strong, there isn't a lot of improvement to be had. A lot of science is exploratory, and in that setting trying to optimize the experimental design is premature.
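A toy illustration of that model dependence, under assumptions that are mine rather than the commenter's: suppose a yes/no response follows a one-parameter logistic model, P(response | dose x) = 1 / (1 + exp(-b x)). The Fisher information about b from one observation at dose x is then x^2 p (1 - p), so the most informative dose depends on the value of b you assume going in. The names `fisher_info`, `best_dose`, and the dose grid are hypothetical.

```python
import math

def fisher_info(x, b):
    """Fisher information about slope b from one Bernoulli trial at dose x,
    assuming the logistic model p = 1 / (1 + exp(-b * x))."""
    p = 1.0 / (1.0 + math.exp(-b * x))
    return x * x * p * (1.0 - p)

def best_dose(b, grid):
    """Dose on the grid that maximizes information, given an assumed b."""
    return max(grid, key=lambda x: fisher_info(x, b))

# Candidate doses 0.01, 0.02, ..., 10.00 (an arbitrary illustrative grid).
grid = [i / 100 for i in range(1, 1001)]

for assumed_b in (0.5, 1.0, 2.0):
    print(assumed_b, best_dose(assumed_b, grid))
```

Under this model the best dose shrinks as the assumed slope grows, so a design tuned to the wrong b measures in the wrong place. With only a vague prior on b, no single dose is clearly best, which is the sense in which there is little to optimize.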
Either way, this isn't stuff you need an AI for at all; it's stuff people talk about and think about now, today, using computer-assisted human intellect.