gwern comments on LessWrong Help Desk - free paper downloads and more (2014) - Less Wrong

30 Post author: jsalvatier 16 January 2014 05:51AM

Comments (279)

Comment author: gwern 26 January 2014 11:40:52PM 2 points

The link to this data is provided in the gated HTML article, but there doesn't seem to be a link from an ungated page, so I wonder if these data are supposed to be freely accessible... In any case, all their data are currently ungated and accessible by appending '/downloadstats' to the appropriate URL.

Hm. I wonder how I would get a full list of URLs. It'd be nice to feed it into my archiver bot.
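For the '/downloadstats' trick itself, a minimal sketch of the URL rewriting might look like the following. It assumes the "appropriate URL" is an article page URL and that the suffix simply goes on the end of the path; the function name and the example.org URL are hypothetical, not anything the site documents.

```python
from urllib.parse import urlsplit, urlunsplit


def downloadstats_url(article_url: str) -> str:
    """Return article_url with '/downloadstats' appended to its path.

    Assumption: the stats page lives directly under the article URL.
    Any query string or fragment on the input is dropped.
    """
    parts = urlsplit(article_url)
    # Strip a trailing slash first so we never produce '//downloadstats'.
    path = parts.path.rstrip("/") + "/downloadstats"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


if __name__ == "__main__":
    # Hypothetical article URL, for illustration only.
    print(downloadstats_url("https://example.org/doi/10.1000/xyz/abstract/"))
```

An archiver bot could then map this over whatever list of article URLs it manages to collect.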

Comment author: Douglas_Knight 31 January 2014 03:59:35AM 2 points

It would be easy to extract a partial list of URLs from this. Google probably has better coverage with its inurl: search, but I don't know how to get lots of data out of it.

Comment author: gwern 31 January 2014 04:41:30PM 3 points

Looks like one would be better off using the site: parameter rather than inurl:, since what we want to match is a URL prefix; so site:onlinelibrary.wiley.com/doi/10.1002/14651858
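As a rough sketch of how that prefix could be used in a script: the helpers below build the site: query string and check whether a candidate URL (from whatever partial list one has) falls under the same prefix. The function names are hypothetical, and this only constructs strings; it makes no claim about how to pull results out of Google in bulk.

```python
from urllib.parse import urlsplit

# The prefix from the site: query above.
PREFIX = "onlinelibrary.wiley.com/doi/10.1002/14651858"


def site_query(prefix: str, extra: str = "") -> str:
    """Build a search query restricting results to URLs under prefix."""
    return f"site:{prefix} {extra}".strip()


def matches_prefix(url: str, prefix: str = PREFIX) -> bool:
    """True if url's host plus path starts with prefix (scheme ignored)."""
    parts = urlsplit(url)
    return (parts.netloc + parts.path).startswith(prefix)


if __name__ == "__main__":
    print(site_query(PREFIX))
    # Hypothetical candidate URLs, for illustration only.
    candidates = [
        "http://onlinelibrary.wiley.com/doi/10.1002/14651858.CD000001/abstract",
        "http://onlinelibrary.wiley.com/doi/10.1111/abc/abstract",
    ]
    print([u for u in candidates if matches_prefix(u)])
```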

Comment author: Douglas_Knight 31 January 2014 05:14:46PM 2 points

Huh. I didn't try that because I knew that site: doesn't work for all prefixes (e.g., it fails if you chop off the last digit). I thought it required termination with a slash, but maybe any punctuation works? I do recommend inurl:abstract.