Lumifer comments on Learned Blankness - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
One way to do this is to not even put the dangerous options in the manual. For example, I often use wget to download static copies of websites (many of whose owners would prefer I not). The default way to block spiders like wget is to throw up a restrictive robots.txt, which wget will read and then abort on; if you search the wget man page, there is no option to disable this. However, if you are an advanced user, or you went as far as reading the default .wgetrc configuration file, you will find something useful, and in my own file one will find the relevant setting. Of course, I know exactly what robots.txt is, why ignoring it is potentially dangerous, and why I want to ignore it. So it all works out for me, and avoids noobs DoSing websites.
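The quoted snippet itself appears to have been lost in extraction. wget's manual does document a .wgetrc directive for exactly this behavior; assuming that is what the original comment showed, the file would contain something like:

```
# In ~/.wgetrc -- tell wget to ignore robots.txt and nofollow directives.
# This is undocumented in the man page's option list; it only surfaces
# in the .wgetrc / wgetrc-command documentation.
robots = off
```

The same wgetrc-style command can also be passed for a single run via wget's -e flag, e.g. `wget -e robots=off URL`.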
And a lack of documentation serves as its own self-enforcing way of making sure only the technically skilled can use a program... Consider a tool I am using right now to investigate Bitcoin usage among torrent uploaders on The Pirate Bay: https://github.com/andronikov/tpb2csv This could use up a lot of bandwidth and hurt the TPB if someone were to use it wastefully and accidentally throw it in, say, a while loop. Fortunately (?), the repo has hardly any documentation: it doesn't tell you which file to run (download.py), how (python download.py), what dependencies you need (python-requests, python-beautifulsoup), what the arguments are (unique IDs embedded in the URLs of the relevant TPB torrent pages), or where you would get them (valid torrent IDs start in the 3m range and increment to the latest in http://thepiratebay.se/recent). Anyone who can figure all that out probably has a good reason for using the code and won't abuse it.
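The failure mode described above, a bare while loop hammering the site, is trivial to avoid by pausing between requests. A minimal sketch of that idea (polite_fetch and its parameters are hypothetical names for illustration, not part of tpb2csv):

```python
import time

def polite_fetch(ids, fetch, delay=1.0):
    """Call fetch() on each ID with a fixed pause between requests,
    rather than hitting the server as fast as a bare loop allows.
    (Hypothetical helper; delay should be tuned to the target site.)"""
    results = []
    for i in ids:
        results.append(fetch(i))
        time.sleep(delay)  # throttle: the whole point of the wrapper
    return results
```

Anyone who would wrap download.py in a loop without something like this is exactly the user the missing documentation is filtering out.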
It's basically a script kiddie filter. Unfortunately it's not very effective.