Error comments on The Web Browser is Not Your Client (But You Don't Need To Know That) - Less Wrong

22 Post author: Error 22 April 2016 12:12AM

Comments (47)

Comment author: Brotherzed 25 April 2016 10:50:29PM *  1 point

But consider the following problem: Find and display all comments by me that are children of this post, and only those comments, using only browser UI elements, i.e. not the LW-specific page widgets. You cannot -- and I'd be pretty surprised if you could make a browser extension that could do it without resorting to the API, skipping the previous elements in the chain above. For that matter, if you can do it with the existing page widgets, I'd love to know how.

If you mean parsing the document object model for your comments without using an external API, it would probably take me about a day, because I'm rusty with WatiN (the tool I used for web scraping when that was my job a couple of years ago). About four hours of that would be setting up an environment. If I were up to speed, it would take maybe a couple of hours to work out the script. Not even close to hard compared to the crap I used to have to scrape. And I'm definitely not the best web scraper; I'm a non-amateur novice, basically. The basic process is this: anchor to a certain node type that is the child of another node with certain attributes and properties, search all the matching nodes for your user name, then extract from each matched node the content of the child nodes that contain your post.
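To make that process concrete, here is a minimal sketch using only Python's standard library rather than WatiN or Selenium. The class names (`comment`, `author`, `comment-body`), the sample markup and the helper `comments_by` are all hypothetical, made up for illustration; the real page's markup would need to be inspected first. It also assumes reasonably well-balanced HTML, so treat it as an outline of the idea and not a finished scraper.

```python
from html.parser import HTMLParser

# Hypothetical markup: these class names are assumptions for illustration,
# not the site's actual HTML.
COMMENT_CLASS = "comment"
AUTHOR_CLASS = "author"
BODY_CLASS = "comment-body"

# Void elements never get an end tag, so they are not tracked.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class CommentScraper(HTMLParser):
    """Collect the body text of every comment written by one user."""

    def __init__(self, username):
        super().__init__()
        self.username = username
        self.results = []    # (position, text) for each matching comment
        self._open = []      # role of each currently open element (or None)
        self._comments = []  # stack of comment records, since comments nest
        self._seen = 0       # number of comment nodes opened so far

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        role = None
        if COMMENT_CLASS in classes:
            # Anchor: a node that marks one comment.
            role = "comment"
            self._comments.append({"pos": self._seen, "author": "", "body": []})
            self._seen += 1
        elif AUTHOR_CLASS in classes:
            role = "author"
        elif BODY_CLASS in classes:
            role = "body"
        self._open.append(role)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        if self._open.pop() == "comment":
            record = self._comments.pop()
            # Keep the comment only if its author matches the user name.
            if record["author"].strip() == self.username:
                text = " ".join("".join(record["body"]).split())
                self.results.append((record["pos"], text))

    def handle_data(self, data):
        if not self._comments:
            return
        # The nearest tagged ancestor decides where this text belongs.
        role = next((r for r in reversed(self._open) if r), None)
        if role == "author":
            self._comments[-1]["author"] += data
        elif role == "body":
            self._comments[-1]["body"].append(data)


def comments_by(html, username):
    """Return the comment bodies written by `username`, in page order."""
    scraper = CommentScraper(username)
    scraper.feed(html)
    scraper.close()
    return [text for _, text in sorted(scraper.results)]


# Made-up sample thread with nested replies.
SAMPLE = """
<div class="comment">
  <a class="author">Error</a>
  <div class="comment-body">Find all comments by me.</div>
  <div class="comment">
    <a class="author">Brotherzed</a>
    <div class="comment-body">It would take me <b>about a day</b>.</div>
    <div class="comment">
      <a class="author">Error</a>
      <div class="comment-body">Upvoted.</div>
    </div>
  </div>
</div>
"""

print(comments_by(SAMPLE, "Error"))
```

The stack of open comment records is what handles threading: text inside a reply is credited to the reply, not to the parent comment that encloses it.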

WatiN: http://watin.org/

Selenium: http://www.seleniumhq.org/

These are the most popular tools in the Microsoft ecosystem.

As someone who has the ability to control how content is displayed to me (tip: hit F12 in Google Chrome), I disagree with the statement that a web browser is not a client. It is absolutely a client, and if I were sufficiently motivated I could view this page in any number of ways. So can you. Easy examples that require no special knowledge are disabling the CSS, disabling JS, etc.

Comment author: Error 28 April 2016 04:43:05PM 0 points

Upvoted for actually considering how it could be done. It does sort of answer the letter if not the spirit of what I had in mind.

Comment author: Brotherzed 10 May 2016 08:52:37PM *  0 points

I admit I didn't think it all the way through. If your goal isn't ultimately data collection, you would make a browser addon and use JavaScript injection (JavaScript being the frontend scripting language that web pages run in the browser). I replied to another person with loose technical details, but you could create a browser addon where you push a button in the top right corner of your browser, type a username, and it then transforms the page to show nothing but that user's posts on the page, by leveraging the web page's frontend scripting language.

So there's a user-friendly way to transform your browser's rendering without APIs, clunky web scrapers or excess server load. It's basically the same principle that adblockers work on.