gwern comments on Learned Blankness - Less Wrong

130 Post author: AnnaSalamon 18 April 2011 06:55PM

Comments (186)

Comment author: gwern 26 February 2014 03:26:53AM *  2 points [-]

I've even encountered hackerly tools with lock-outs to ensure that you've read the manual

One way to do this is to not even put the dangerous options in the manual. For example, I often use wget to download static copies of websites (many of whose owners would prefer I not). The default way to block spiders like wget is to throw up a narrow robots.txt, which wget will read and then abort; if you search the wget man page, you will find no option to stop this. However, if you are an advanced user, or you went as far as reading the default .wgetrc configuration file, you will find something useful, and in my own file one will find the following:

# Setting this to off makes Wget not download /robots.txt. Be sure to
# know *exactly* what /robots.txt is and how it is used before changing
# the default!
robots = off

Of course, I do know exactly what robots.txt is, why ignoring it is potentially dangerous, and why I want to ignore it. So it all works out for me, and avoids noobs DoSing websites.
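For readers who do not know exactly what robots.txt is, here is a minimal sketch of the check a well-behaved spider performs before fetching a page, using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_allowed` helper, and the example.com URL are all made up for illustration; this is not how wget itself implements the check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that turns away Wget but allows everyone else.
ROBOTS_TXT = """\
User-agent: Wget
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access happens.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://example.com/page.html"  # placeholder URL
    print("Wget allowed:", is_allowed(ROBOTS_TXT, "Wget", url))
    print("SomeBrowser allowed:", is_allowed(ROBOTS_TXT, "SomeBrowser", url))
```

Setting `robots = off` amounts to skipping this check entirely, which is why the comment in the config file is so stern.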


And a lack of documentation serves as its own self-enforcing way of making sure only the technically skilled can use a program... Consider a tool I am using right now to investigate Bitcoin usage among torrent uploaders on The Pirate Bay: https://github.com/andronikov/tpb2csv . This could use up a lot of bandwidth and hurt TPB if someone were to use it wastefully and accidentally throw it in, say, a while loop. Fortunately (?), the repo has hardly any documentation: it doesn't tell you which file you would run (download.py), how (python download.py), what dependencies you need (python-requests python-beautifulsoup), what the arguments are (unique IDs embedded in the URLs of the relevant TPB torrent pages), or where you would get them (valid torrent IDs start in the 3m range and increment to the latest in http://thepiratebay.se/recent ).

Anyone who can figure all that out probably has a good reason for using the code and won't abuse it.
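To make the "while loop" worry concrete, here is a rough sketch of the opposite: walking a range of IDs with a hard cap on requests and a pause between them. Everything in it is hypothetical (the `fetch_politely` name, its parameters, and the stand-in fetch function); it does not call or reproduce download.py, whose interface I am not describing here.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    ids: Iterable[int],
    fetch: Callable[[int], object],
    delay_seconds: float = 1.0,
    max_requests: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Call fetch(id) for each id, pausing between calls and stopping at a cap."""
    results: List[object] = []
    for count, item_id in enumerate(ids):
        if count >= max_requests:
            break  # hard cap: never runs unbounded, unlike a bare while loop
        if count > 0:
            sleep(delay_seconds)  # pause between requests, not before the first
        results.append(fetch(item_id))
    return results


if __name__ == "__main__":
    # Stand-in fetch that just echoes the ID; a real one would download a page.
    fetched = fetch_politely(range(3000000, 3000005), lambda i: i,
                             delay_seconds=0.1, max_requests=3)
    print(fetched)
```

The `sleep` parameter exists only so the pause can be swapped out when testing.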

Comment author: CronoDAS 26 February 2014 04:21:05AM 1 point [-]

Anyone who can figure all that out probably has a good reason for using the code and won't abuse it.

I agree with the first part. After all, there are many people for whom "stealing money" is their good reason for using poorly documented software tools, although this particular piece of code probably won't help with that...

Comment author: Lumifer 26 February 2014 05:18:02AM 0 points [-]

It's basically a script kiddie filter. Unfortunately it's not very effective.