ciphergoth just asked what the actual value of Quantified Self/self-experimentation is. This finally tempted me into running value of information calculations on my own experiments. It took me all afternoon because it turned out I didn’t actually understand how to do it and I had a hard time figuring out the right values for specific experiments. (I may not have gotten it right, still. Feel free to check my work!) Then it turned out to be too long for a comment, and as usual the master versions will be on my website at some point. But without further ado!
The value of an experiment is the information it produces. What is the value of information? Well, we can take the economic tack and say value of information is the value of the decisions it changes. (Would you pay for a weather forecast about somewhere you are not going to? No. Or a weather forecast about your trip where you have to make that trip, come hell or high water? Only to the extent you can make preparations like bringing an umbrella.)
Wikipedia says that for a risk-neutral person, value of perfect information is “value of decision situation with perfect information” - “value of current decision situation”. (Imperfect information is just weakened perfect information: if your information was not 100% reliable but 99% reliable, well, that’s worth 99% as much.)
1 Melatonin
http://www.gwern.net/Zeo#melatonin
& http://www.gwern.net/Melatonin
The decision is the binary take or not take. Melatonin costs ~$10 a year (if you buy in bulk during sales, as I did). Suppose I had perfect information it worked; I would not change anything, so the value is $0. Suppose I had perfect information it did not work; then I would stop using it, saving me $10 a year in perpetuity, which has a net present value (at 5% discounting) of $205. So the value of perfect information is $205, because it would save me from blowing $10 every year for the rest of my life. My melatonin experiment is not perfect, since I didn’t randomize or double-blind it, but I had a lot of data and it was well powered, with something like a >90% chance of detecting the decent effect size I expected, so the imperfection costs just 10%, down to $184. From my previous research and personal use over years, I am highly confident it works - say, 80%. If it works, the information is useless to me, and if it doesn’t, I save $184; what’s the expected value of obtaining the information, given these two outcomes? (80% * $0) + (20% * $184) = $36.8
. At minimum wage opportunity cost of $7 an hour, $36.8 is worth 5.25 hours of my time. Between the screenshots, summarizing, and analysis, I’d guess I spent closer to 10–15 hours all told.
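For anyone wanting to check the arithmetic, the whole melatonin calculation fits in a few lines; this is just a sketch using the numbers above ($10/year savings, 5% discounting, a 10% quality penalty, an 80% prior that it works, and $7/hour):

```python
import math

def perpetuity_npv(annual_savings, rate=0.05):
    """Net present value of a perpetual annual cash flow at the given discount rate."""
    return annual_savings / math.log(1 + rate)

npv = perpetuity_npv(10)            # ~$205: stopping melatonin saves $10/year forever
imperfect = 0.90 * npv              # ~$184 after the 10% penalty for the imperfect design
ev = 0.80 * 0 + 0.20 * imperfect    # ~$36.9: the information only pays off if melatonin fails
hours = ev / 7                      # ~5.3 hours justified at $7/hour minimum wage
```

(The tiny differences from the $36.8 and 5.25-hour figures in the text are rounding.)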
(The net present value formula is the annual savings divided by the natural log of 1 plus the discount rate, out to eternity. Exponential discounting means that a bond that expires in 50 years is worth a surprisingly similar amount to one that continues paying out forever. For example, a 50-year bond paying $10 a year at a discount rate of 5% is worth sum $ map (\t -> 10 / (1 + 0.05)^t) [1..50] ~> 182.5
but if that same bond never expires, it’s worth 10 / log 1.05 = 204.9
or just $22.4 more! My own expected longevity is ~50 more years, but I prefer to use the simple natural log formula rather than the more accurate summation. All the numbers here are questionable anyway.)
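The bond comparison in that parenthetical can be checked directly; this is just a Python transcription of the Haskell one-liner above:

```python
import math

# 50-year bond: explicit sum of discounted $10 payments
finite = sum(10 / 1.05**t for t in range(1, 51))   # ~182.56 (the text truncates to 182.5)
# perpetual bond: the continuous-discounting shortcut used in the text
perpetual = 10 / math.log(1.05)                    # ~204.96
difference = perpetual - finite                    # ~22.4
```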
This worked out example demonstrates that when a substance is cheap and you are highly confident it works, a long costly experiment may not be worth it. (Of course, I would have done it anyway due to factors not included in the calculation: to try out my Zeo, learn a bit about sleep experimentation, do something cool, and have something neat to show everyone.)
2 Vitamin D
http://www.gwern.net/Zeo#vitamin-d
I ran 2 experiments on vitamin D: whether it hurt sleep when taken in the evening, and whether it helped sleep when taken in the morning.
2.1 Evening
http://www.gwern.net/Zeo#vitamin-d-at-night-hurts
The first I had no opinion on. I actually did sometimes take vitamin D in the evening when I hadn’t gotten around to it earlier (I take it for its anti-cancer and SAD effects). There was no research background, and the anecdotal evidence was of very poor quality. Still, it was plausible, since vitamin D is involved in circadian rhythms, so I gave it 50% and decided to run an experiment. What effect would perfect information that it did negatively affect my sleep have? Well, I’d definitely switch to taking it in the morning and would never take it in the evening again, which would change maybe 20% of my future doses. And what was the negative effect? It couldn’t be that bad, or I would have noticed it already (like I noticed sulbutiamine made it hard to get to sleep). I’m not willing to change my routines very much to improve my sleep, so I would be lying if I estimated that the value of eliminating any vitamin D-related disturbance was more than, say, 10 cents per night; so the total value of the affected nights would be $0.10 * 0.20 * 365.25 = $7.30 a year
. On the plus side, my experiment design was high quality and ran for a fair number of days, so it would surely detect any sleep disturbance from the randomized vitamin D, so say 90% quality of information. This gives ((7.3 - 0) / log 1.05) * 0.90 * 0.50 = 67.3
, justifying <9.6 hours. Making the pills took perhaps an hour, recording used up some time, and the analysis took several hours to label & process all the data, play with it in R, and write it all up in a clean form for readers. Still, I don’t think it took almost 10 hours of work, so I think this experiment ran at a profit.
2.2 Morning
http://www.gwern.net/Zeo#vitamin-d-at-morn-helps
With the vitamin D theory partially vindicated by the previous experiment, I became fairly sure that vitamin D in the morning would benefit my sleep somehow: 70%. Benefit how? I had no idea, it might be large or small. I didn’t expect it to be a second melatonin, improving my sleep and trimming it by 50 minutes, but I hoped maybe it would help me get to sleep faster or wake up less. The actual experiment turned out to show, with very high confidence, absolutely no change except in my mood upon awakening in the morning.
What is the “value of information” for this experiment? Essentially - nothing! Zero!
- If the experiment had shown any benefit, I obviously would have continued taking it in the morning;
- if the experiment had shown no effect, I would have continued taking it in the morning anyway, to avoid incurring the evening penalty discovered in the previous experiment;
- if the experiment had shown the unthinkable, a negative effect, it would have had to be substantial to convince me to stop taking vitamin D altogether and forfeit its other health benefits, and it’s not worth bothering to analyze an outcome I would have given <=5% chance to.
Of course, I did it anyway because it was cool and interesting! (Estimated time cost: perhaps half the evening experiment, since I manually recorded less data and had the analysis worked out from before.)
3 Adderall
http://www.gwern.net/Nootropics#adderall-blind-testing
The amphetamine mix branded “Adderall” is terribly expensive to obtain even compared to modafinil, due to its tight regulation (Schedule II, a lower schedule than modafinil’s IV), its popularity in college as a study drug, and reported moves by its manufacturer to exploit its privileged position as a licensed amphetamine maker to extract more consumer surplus. I paid roughly $4 a pill but could have paid up to $10. Good stimulant hygiene involves recovery periods to avoid one’s body adapting to eliminate the stimulating effects, so even if Adderall were the answer to all my woes, I would not be using it more than 2 or 3 times a week. Assuming 50 uses a year (for specific projects, let’s say, and not ordinary aimless usage), that’s a cool $200 a year. My general belief was that Adderall would be too much of a stimulant for me, as I am amphetamine-naive and Adderall has a bad reputation for letting one waste time on unimportant things. We could say my prediction was 50% that Adderall would be useful and worth investigating further. The experiment was pretty simple: blind randomized pills, 10 placebo & 10 active. I took notes on how productive I was and the next day guessed whether it was placebo or Adderall before breaking the seal and finding out. I didn’t do any formal statistics for it, much less a power calculation, so let’s be conservative and penalize the information quality heavily, down to 25%. So ((200 - 0) / log 1.05) * 0.50 * 0.25 = 512
! The experiment probably used up no more than an hour or two total.
This example demonstrates that anything you are doing expensively is worth testing extensively.
4 Modafinil day
http://www.gwern.net/Nootropics#modalert-blind-day-trial
I tried 8 randomized days, like with Adderall, to see whether I was one of the people whom modafinil energizes during the day. (The other way to use it is to skip sleep, which is my preferred use.) I rarely use it during the day, since my initial uses did not impress me subjectively. The experiment was not my best: while it was double-blind and randomized, the measurements were subjective, not an objective measure of mental functioning like dual n-back (DNB) scores, which I could have compared statistically from day to day or against my many previous days of DNB practice. Between my high expectation of finding the null result, the poor experiment quality, and the minimal effect a result would have had (eliminating an already-rare use), it’s obvious without guesstimating any numbers that the value of this information was very small.
I mostly did it so I could tell people that “no, day usage isn’t particularly great for me; why don’t you run an experiment on yourself and see whether it was just a placebo effect (or whether you genuinely are sleep-deprived and it is indeed compensating)?”
5 Lithium
http://www.gwern.net/Nootropics#lithium-experiment
Low-dose lithium orotate is extremely cheap, ~$10 a year. There is some research literature on it improving mood and impulse control in regular people, but some of it is epidemiological (which implies considerable unreliability); my current belief is that there is probably some effect size, but at just 10mg, it may be too tiny to matter. I have ~40% belief that there will be a large effect size, but I’m doing a long experiment and I should be able to detect a large effect size with >75% chance. So, the formula is NPV of the difference between taking and not taking, times quality of information, times expectation: ((10 - 0) / log 1.05) * 0.75 * 0.40 = 61.4
, which justifies a time investment of less than 9 hours. As it happens, it took less than an hour to make the pills & placebos, and taking them is a matter of seconds per week, so the analysis will be the time-consuming part. This one may actually turn a profit.
6 Redshift
http://www.gwern.net/Zeo#redshiftf.lux
Like the modafinil day trial, this was another value-less experiment justified by its intrinsic interest. I expect the results will confirm what I believe: that red-tinting my laptop screen will result in less damage to my sleep by not forcing lower melatonin levels with blue light. The only outcome that might change my decisions is if the use of Redshift actually worsens my sleep, but I regard this as highly unlikely. It is cheap to run as it is piggybacking on other experiments, and all the randomizing & data recording is being handled by 2 simple shell scripts.
7 Meditation
http://www.gwern.net/Zeo#meditation-1
I find meditation useful when I am screwing around and can’t focus on anything, but I don’t meditate as much as I might, because I lose half an hour. Hence, I am interested in the suggestion that meditation may not be as expensive as it seems, because it reduces sleep need to some degree: if for every two minutes I meditate, I need one less minute of sleep, that halves the time cost - I spend 30 minutes meditating, gain back 15 minutes from sleep, for a net time loss of 15 minutes. So if I meditate regularly but there is no substitution, I lose out on 15 minutes a day. Figure I manage to meditate only 2 days in 3; that’s a total lost time of (15 * 2/3 * 365.25) / 60 = 61
hours a year, or $427 at minimum wage. I find the theory somewhat plausible (60%), and my year-long experiment has roughly a 60% chance of detecting the effect size (estimated based on the sleep reduction in an Indian sample of meditators). So ((427 - 0) / log 1.05) * 0.60 * 0.60 = $3150
. The experiment itself is unusually time-intensive, since it involves ~180 sessions of meditation, which, if I am “overpaying”, translates to 45 hours ((180 * 15) / 60
) of wasted time or $315. But even including the design and analysis, that’s less than the calculated value of information.
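As a check on the meditation numbers (all figures are from the text, with minimum wage taken as $7/hour):

```python
import math

lost_hours = (15 * 2 / 3 * 365.25) / 60        # ~61 hours/year: 15 min lost, 2 days in 3
annual_cost = round(lost_hours) * 7            # $427/year at minimum wage
voi = (annual_cost / math.log(1.05)) * 0.60 * 0.60   # ~$3150: NPV * power * plausibility
session_cost = (180 * 15) / 60 * 7             # $315: 45 hours of sessions if "overpaying"
```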
This example demonstrates that drugs aren’t the only expensive things for which you should do extensive testing.
First off, kudos for discussing non-VoI reasons to run these experiments. Real decisions have many factors.
The eyeballed estimate of how much the experimental design reduces the value from perfect information should be replaced by a decision tree. If the experiment can't give you enough data to change your position, then it's not material.
Using the first example, where W is melatonin works and "W" is the experiment saying that melatonin works, it looks like you provided P(W)=.8, P("W"|W)=.95, and P("W"|~W)=.05. I assumed that >90% corresponded to a point estimate of 95%, and that the test was symmetric, which should get thought about more if you're doing this seriously.
In the case where you get "W", you update and P(W|"W")=99% and you continue taking melatonin. But in the case where you get "~W", you update and P(W|"~W")=17%. Given the massive RoI you calculated for melatonin, it sounds like it's worth taking even if there's only a 17% chance that it's actually effective. Rather than continuing blindly on, you'd probably continue the test until you had enough data to be sure / pin down your RoI calculation, but you should be able to map that out now before you start the experiment.
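The updates Vaniver describes can be reproduced with Bayes' theorem (assuming, as above, an 80% prior and a symmetric 95%-accurate experiment):

```python
def posterior(prior, p_obs_given_true, p_obs_given_false):
    """P(hypothesis | observation) by Bayes' theorem."""
    numerator = p_obs_given_true * prior
    return numerator / (numerator + p_obs_given_false * (1 - prior))

p_works_given_pass = posterior(0.8, 0.95, 0.05)   # ~0.99: experiment says "works"
p_works_given_fail = posterior(0.8, 0.05, 0.95)   # ~0.17: experiment says "doesn't work"
# repeating the experiment after a negative result drives the posterior down further:
p_works_after_two_fails = posterior(p_works_given_fail, 0.05, 0.95)   # ~0.01
```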
There's a question of prior information here - from what you've written, it sounds like you should be more than 80% sure that melatonin worked for you. You might be interested in a different question - "melatonin still works for me" - which it might be reasonable to have an 80% prior on. If the uncertainty is about the value of taking melatonin, it seems like you could design a better experiment that narrows your uncertainty there (by looking for cognitive costs, or getting a better estimate of time saved, etc.).
A brief terminology correction: the "value of perfect information" would be $41, not $205 (i.e. it includes the 20% estimate that melatonin doesn't work). If you replace that with "value of a perfect negative result" you should be fine.
In 3, you're considering adding a new supplement, not stopping a supplement you already use. The "I don't try Adderall" case has value $0; the "Adderall fails" case is worth -$40 (assuming you only bought 10 pills, and this number should be increased by your analysis time and a weighted cost for potential permanent side effects); and the "Adderall succeeds" case is worth $X - $40 - $4099, where $4099 is the $200/year purchase cost discounted in perpetuity (200 / ln 1.05) and $X is the discounted lifetime value of the increased productivity due to Adderall, minus any discounted long-term side-effect costs. If you estimate Adderall will work with p=.5, then you should try out Adderall if you estimate that .5(X-4179)>0 -> X>4179. (Adderall working or not isn't binary, and so you might be more comfortable breaking down the various "how effective Adderall is" cases when eliciting X, by coming up with different levels it could work at, their values, and then using a weighted sum to get X. This can also give you a better target with your experiment - "this needs to show a benefit of at least Y from Adderall for it to be worth the cost, and I've designed it so it has a reasonable chance of showing that.")
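Vaniver's decision tree as arithmetic (the $4099 is the $200/year purchase cost discounted in perpetuity, and $40 is the 10 trial pills at $4 each):

```python
import math

lifetime_cost = 200 / math.log(1.05)   # ~$4099: buying Adderall every year forever
trial_cost = 10 * 4                    # $40 of blinded pills
p_works = 0.5

def trial_ev(x):
    """Expected value of running the trial; x = discounted lifetime productivity gain."""
    fails = -trial_cost                          # wasted pills, nothing else changes
    succeeds = x - trial_cost - lifetime_cost    # gain, minus pills, minus lifetime cost
    return p_works * fails + (1 - p_works) * succeeds

breakeven = 2 * trial_cost + lifetime_cost   # ~4179: the x that solves trial_ev(x) = 0
```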
One thing to notice is that the default case matters a lot. This asymmetry arises because you switch decisions in different possible worlds: when you would take Adderall but stop, you're in the world where Adderall doesn't work, and when you wouldn't take Adderall but do, you're in the world where Adderall does work (in the perfect information case, at least). One way to visualize this is that you don't penalize tests for giving you true negative information, and you reward them for giving you true positive information. (This might be worth a post by itself, and is very Litany of Gendlin.)
The rest is similar. I definitely agree with the last line: possibly a way to drive it home is to talk about dividing by ln(1.05), which is essentially multiplying by 20.5. If you can make a one-time investment that pays off annually until you die, that's worth 20.5 times the annual return, and multiplying the value of something by 20 can often move it from not worth thinking about to worth thinking about.
Thanks for the comments.
The Bayes calculation is
(0.05 * 0.8) / ((0.05 * 0.8) + (0.95 * 0.2)) = 0.1739...
, right? (A second experiment would knock it down to ~0.01, apparently.) I didn't notice that.