I'm not sure I should also count the time spent learning about causality in the first place (which I would probably estimate at ~2 weeks), but it's striking how much of the investment in generating the results is capital, and how little of it is labor. That is, now that I have the package downloaded, I can do this easily for other datasets. Time to start picking some low-hanging fruit.
Absolutely. When I look at my own projects, they go like 'gathering and cleaning data: 2 months. Figuring out the right analysis the first time: 2 days. Runtime of analysis: 2 hours.'
The first time this happened to me, it drove me nuts. It reminded me of writing my first program, where it took maybe 20 minutes to write it even looking everything up, and then 2 hours to debug. That was when the true horror of programming struck me. Years later, when I came across the famous quote by Wilkes, "I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs," I instantly knew it for truth.
(Same for http://lesswrong.com/lw/j8f/anonymous_feedback_forms_revisited/ - downloading the data and figuring out how to join up the two CSVs in just the right way, an irritating hour. The logistic regression? Maybe 3 minutes playing around with different predictor variables.)
That was when the true horror of programming struck me. Years later, when I came across the famous quote by Wilkes, "I can remember the exact instant when I realized that a large part of my life from then on was going to be spent in finding mistakes in my own programs," I instantly knew it for truth.
This realization caused one of my big life mistakes, I think. It struck me in high school, and so I foolishly switched my focus from computer science to physics (I think; there might have been a subject or two in between) because I disliked deb...