satt points out that (via the Bienaymé formula) “An RCT with a sample size of e.g. 400 would still be 10 times better than 4 self-experiments by this metric.”
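To make the arithmetic behind that quote concrete, here is a minimal sketch. It assumes "this metric" means the standard error of the mean, and that the measurements are independent with a common standard deviation (which is what the Bienaymé formula needs). The name `standard_error` and the value `sigma = 1.0` are mine, chosen purely for illustration.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean of n independent measurements.

    By the Bienaymé formula, Var(mean) = sigma**2 / n for uncorrelated
    measurements, so the standard deviation of the mean is sigma / sqrt(n).
    """
    return sigma / math.sqrt(n)

sigma = 1.0  # assumed per-measurement standard deviation (illustrative only)

se_self = standard_error(sigma, 4)    # 4 self-experiments
se_rct = standard_error(sigma, 400)   # RCT with a sample size of 400

# How many times narrower the RCT's uncertainty is than the self-experiments'.
ratio = se_self / se_rct
print(f"self-experiments: {se_self:.3f}, RCT: {se_rct:.3f}, ratio: {ratio:.1f}")
```

The ratio depends only on the square root of the sample-size ratio, not on `sigma`: 100 times the data buys a 10-fold reduction in standard error, not a 100-fold one.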
Since this has come up again, I may as well point out that this is a very abstruse argument.
The way I'm thinking about this argument is to picture a normal distribution representing my uncertainty about some value. When I do 100 times as many experiments, the distribution
See also this comment of mine, which works through the math to show that just a few perfectly done self-experiments can be quite valuable in actionable terms: http://lesswrong.com/lw/bs0/knowledge_value_knowledge_quality_domain/6d9k
Master copy lives on gwern.net