Viliam_Bur comments on Open Thread, March 16-31, 2012 - Less Wrong Discussion
Is the LW database structure available? If yes, you could prepare some SELECT queries and ask admins to run them for you and send you the result.
Anonymization: Replace user ids with "f(id+c)", where "f" is a hash function and "c" is a constant that the admin will change before running your script. Replace times of karma clicks with "ym(time+r)", where "r" is a random value between 0 and 30 days and "ym" is a function that returns only the month and year. Select only data from the most recent year, and only from users who were active during the whole year (made at least one vote in both the first and last months of the time period). Would such data still be useful to you?
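To make the proposal concrete, here is a minimal Python sketch of the two transformations. It is only an illustration: SHA-256 as the choice of "f", the "YYYY-MM" output format of "ym", and the helper names anonymize_user_id and anonymize_vote_time are all assumptions, not anything taken from the LW code base.

```python
import hashlib
import random
from datetime import datetime, timedelta


def f(value: int) -> str:
    """The hash function "f". SHA-256 is an assumption; the proposal names no algorithm."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


def anonymize_user_id(user_id: int, c: int) -> str:
    """Return f(id + c), where c is a constant known only to the admin."""
    return f(user_id + c)


def ym(moment: datetime) -> str:
    """Keep only the year and month of a timestamp (the "YYYY-MM" format is an assumption)."""
    return moment.strftime("%Y-%m")


def anonymize_vote_time(vote_time: datetime, rng: random.Random) -> str:
    """Return ym(time + r), where r is a random offset between 0 and 30 days."""
    r = timedelta(seconds=rng.uniform(0, 30 * 24 * 3600))
    return ym(vote_time + r)
```

The same user id always maps to the same hash for a given "c", so votes by one user can still be grouped. Because user ids are small integers, "c" would presumably need to be large and kept secret; otherwise a recipient could recover it by hashing candidate values.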
My day job is DB admin and development. In the unlikely event of LW back-end admin-types being comfortable running a query sent in by some dude off the site, I wouldn't be comfortable giving it to them. The effort of due diligence on a foreign script is probably greater than that required to put it together.
The data I want correspond to:
If I were providing this data, I would also scramble the IDs in some fashion while maintaining the underlying relationships, as consecutive IDs could provide some small clue as to the identity and chronology of users or posts. While this is pretty straightforward, the mechanism for such scrambling should not be known to recipients of the data.
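One way such scrambling might look, as a rough Python sketch: each original ID is mapped to a randomly assigned surrogate, and the same map is applied everywhere that ID appears, so relationships survive while the original ordering does not. The sample rows, the (voter_id, post_id) layout, and the helper name build_id_map are hypothetical, and this is not necessarily the mechanism the provider would use.

```python
import random


def build_id_map(ids, rng):
    """Map each distinct original ID to a surrogate ID assigned in random order."""
    unique_ids = sorted(set(ids))
    surrogates = list(range(1, len(unique_ids) + 1))
    rng.shuffle(surrogates)  # breaks the link between ID order and chronology
    return dict(zip(unique_ids, surrogates))


# Hypothetical sample rows: (voter_id, post_id)
votes = [(101, 7), (102, 7), (101, 8)]

# SystemRandom draws from the OS entropy source, so there is no seed to leak.
rng = random.SystemRandom()
user_map = build_id_map([voter for voter, _ in votes], rng)
post_map = build_id_map([post for _, post in votes], rng)

# Apply the same maps to every row so that relationships are preserved.
scrambled = [(user_map[voter], post_map[post]) for voter, post in votes]
```

The maps themselves stay with the data provider and are never shipped with the scrambled rows.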