whpearson comments on Frequentist Statistics are Frequently Subjective - Less Wrong

59 Post author: Eliezer_Yudkowsky 04 December 2009 08:22PM

Comments (81)

Comment author: whpearson 05 December 2009 01:19:35AM 5 points

BitTorrent? You can publish shasums of the data sets in the paper, so anyone who downloads a copy can check that it is the data they are looking for.
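To make the checksum idea concrete, here is a minimal sketch of how a reader might verify a downloaded data set against a digest printed in a paper. It assumes SHA-256 as the hash; the function names `sha256_of_file` and `matches_published` are made up for illustration, and the thread does not say which hash or tooling would actually be used.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Chunked reads keep memory use flat even for very large data sets.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's digest with the one published alongside the paper.

    `published_hex` is a hypothetical value copied from the paper; it is
    normalised so stray whitespace or uppercase hex does not matter.
    """
    return sha256_of_file(path) == published_hex.strip().lower()
```

A reader would call `matches_published("dataset.tar", digest_from_paper)` after the download finishes; a `False` result means the file is not byte-for-byte the data set the authors published, whatever the source of the copy.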

Comment author: gwern 05 December 2009 06:53:22PM 3 points

BitTorrent specializes in short-term, spiky mass downloading. It's not so hot for the long tail of years or decades. How many large torrents are still alive after a few years?

Comment author: sketerpot 07 December 2009 05:08:57AM 5 points

This is exactly the problem that archive.org was set up to deal with. They've been doing an excellent job of it, and their cost-per-gigabyte-month is only going to drop as storage and bandwidth become cheaper.

Comment author: gwern 07 December 2009 06:43:02PM 1 point

Yes, they have been doing an excellent job. I've donated to them more than once because I find myself using the IA on a nigh-daily basis.

But the IA is no panacea. It can only store some categories of content reliably, and the rest is inaccessible. Nor have I seen them hold & distribute the truly enormous datasets that much research will use - the biggest files I've seen the IA offer for public download are in the range of hundreds of megabytes to a few gigabytes.