(Part of a sequence on discussion technology and NNTP. As last time, I should probably emphasize that I am a crank on this subject and do not actually expect anything I recommend to be implemented. Add whatever salt you feel is necessary)1
If there is one thing I hope readers get out of this sequence, it is this: The Web Browser is Not Your Client.
It looks like you have three or four viable clients -- IE, Firefox, Chrome, et al. You don't. You have one. It has a subforum listing with two items at the top of the display; some widgets on the right-hand side for user details, RSS feed, meetups; the top-level post display; and below that, replies nested in the usual way.
Changing your browser has the exact same effect on your Less Wrong experience as changing your operating system, i.e. next to none.
For comparison, consider the Less Wrong IRC, where you can tune your experience with a wide range of different software. If you don't like your UX, there are other clients that give a different UX to the same content and community.
That is how the mechanics of discussion used to work, and it is not how they work now. Today, your user experience (UX) in a given community is dictated mostly by the admins of that community, and software development is often neither their forte nor something they have time for. I'll often find myself snarkily responding to feature requests with "you know, someone wrote something that does that 20 years ago, but no one uses it."
Semantic Collapse
What defines a client? More specifically, what defines a discussion client, a Less Wrong client?
The toolchain by which you read LW probably looks something like this; anyone who's read the source please correct me if I'm off:
Browser -> HTTP server -> LW UI application -> Reddit API -> Backend database.
The database stores all the information about users, posts, etc. The API presents subsets of that information in a way that's convenient for a web application to consume (probably JSON objects, though I haven't checked). The UI layer generates a web page layout and content using that information, which is then presented -- in the form of (mostly) HTML -- by the HTTP server layer to your browser. Your browser figures out what color pixels go where.
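To make that concrete, a comment object coming out of such an API might look roughly like this (the field names are invented for illustration; I haven't checked the real Reddit/LW schema either):

  {
    "id": "c42",
    "parent_id": "p7",
    "author": "somebody",
    "created": "2016-04-05T12:00:00Z",
    "body": "lorem ipsum nonsensical statement involving plankton...",
    "score": 17
  }

The point is that this is still semantic data: it says who wrote what in reply to which post, and nothing about fonts, columns, or pixel positions.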
All of this is a gross oversimplification, obviously.
In some sense, the browser is self-evidently a client: It talks to an http server, receives hypertext, renders it, etc. It's a UI for an HTTP server.
But consider the following problem: Find and display all comments by me that are children of this post, and only those comments, using only browser UI elements, i.e. not the LW-specific page widgets. You cannot -- and I'd be pretty surprised if you could even write a browser extension to do it without resorting to the API, skipping the earlier elements in the chain above. For that matter, if you can do it with the existing page widgets, I'd love to know how.
That isn't because the browser is poorly designed; it's because the browser lacks the semantic information to figure out what elements of the page constitute a comment, a post, an author. That information was lost in translation somewhere along the way.
Your browser isn't actually interacting with the discussion. Its role is more akin to an operating system than a client. It doesn't define a UX. It provides a shell, a set of system primitives, and a widget collection that can be used to build a UX. Similarly, HTTP is not the successor to NNTP; the successor is the plethora of APIs, for which HTTP is merely a substrate.
The Discussion Client is the point where semantic metadata is translated into display metadata; where you go from 'I have post A from user B with content C' to 'I have a text string H positioned above visual container P containing text string S.' Or, more concretely, when you go from this:
Author: somebody
Subject: I am right, you are mistaken, he is mindkilled.
Date: timestamp
Content: lorem ipsum nonsensical statement involving plankton....
to this:
<h1>I am right, you are mistaken, he is mindkilled.</h1>
<div><span style="float: left">somebody</span><span style="float: right">timestamp</span></div>
<div><p>lorem ipsum nonsensical statement involving plankton....</p></div>
That happens at the web application layer. That's the part that generates the subforum headings, the interface widgets, the display format of the comment tree. That's the part that defines your Less Wrong experience, as a reader, commenter, or writer.
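To make the translation step concrete, here is a minimal sketch in Python (purely illustrative, not anything Less Wrong actually runs) that takes a record shaped like the first block and emits markup shaped like the second:

  import html

  def render_post(post):
      # Semantic metadata in, display metadata out: this is the client's job.
      return (
          "<h1>{subject}</h1>\n"
          '<div><span style="float: left">{author}</span>'
          '<span style="float: right">{date}</span></div>\n'
          "<div><p>{content}</p></div>"
      ).format(
          subject=html.escape(post["subject"]),
          author=html.escape(post["author"]),
          date=html.escape(post["date"]),
          content=html.escape(post["content"]),
      )

  print(render_post({
      "author": "somebody",
      "subject": "I am right, you are mistaken, he is mindkilled.",
      "date": "timestamp",
      "content": "lorem ipsum nonsensical statement involving plankton....",
  }))

Nothing about this step requires a browser; any program holding the semantic fields could emit whatever markup, or native widgets, it pleases.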
That is your client, not your web browser. If it doesn't suit your needs, if it's missing features you'd like to have, well, you probably take for granted that you're stuck with it.
But it doesn't have to be that way.
Mechanism and Policy
One of the difficulties in forming an argument about clients is that the proportion of people who have ever had a choice of clients available for any given service keeps shrinking. I have this mental image of the Average Internet User as having no real concept of this.
Then I think about email. Most people have probably used at least two different clients for email, even if it's just Gmail and their phone's built-in mail app. Or perhaps Outlook, if they're using a company system. And they (I think?) mostly take for granted that if they don't like Outlook they can use something else, or if they don't like their phone's mail app they can install a different one. They assume, correctly, that the content and function of their mail account is not tied to the client application they use to work with it.
(They may make the same assumption about web-based services, on the reasoning that if they don't like IE they can switch to Firefox, or if they don't like Firefox they can switch to Chrome. They are incorrect, because The Web Browser is Not Their Client)
Email does a good job of separating mechanism from policy. Its message format is defined in RFC 5322 and its transmission protocol is defined in RFC 5321. Neither defines any conventions for user interfaces. There are good reasons for that from a software-design standpoint, but more relevant to our discussion is that interface conventions change more rapidly than the objects they interface with. Forum features change with the times; but the concepts of a Post, an Author, or a Reply are forever.
The benefit of this separation: If someone sends you mail from Outlook, you don't need to use Outlook to read it. You can use something else -- something that may look and behave entirely differently, in a manner more to your liking.
The comparison: If there is a discussion on Less Wrong, you do need to use the Less Wrong UI to read it. The same goes for, say, Facebook.
I object to this.
Standards as Schelling Points
One could argue that the lack of choice is for lack of interest. Less Wrong, like Reddit on which it is based, has an API. One could write a native client; Reddit actually has them.
Let's take a tangent and talk about Reddit. Seems like they might have done something right. They have (I think?) the largest contiguous discussion community on the net today. And they have a published API for talking to it. It's even in use.
The problem with this method is that Reddit's API applies only to Reddit. I say problem, singular, but it's really problems, plural, because it hits users and developers in different ways.
On the user end, it means you can't have a unified user interface across different web forums; other forum servers have entirely different APIs, or none at all.2 It also makes life difficult when you want to move from one forum to another.
On the developer end, something very ugly happens when a content provider defines its own provision mechanism. Yes, you can write a competing client. But your client exists only at the provider's sufferance, subject to their decision not to make incompatible API changes or just pull the plug on you and your users outright. That isn't paranoia; in at least one case, it actually happened. Using an agreed-upon standard limits this sort of misbehavior, although it can still happen in other ways.
NNTP is a standard for discussion, like SMTP is for email. It is defined in RFC 3977 and its data format is defined in RFC 5536. The point of a standard is to ensure lasting interoperability; because it is a standard, it serves as a deliberately-constructed Schelling point, a place where unrelated developers can converge without further coordination.
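For a sense of what the standard actually pins down, an article in RFC 5536 format looks almost exactly like an email message, with a few news-specific headers (the newsgroup name below is invented, and the Path header that servers add as an article propagates is omitted):

  From: somebody@example.org (Somebody)
  Newsgroups: hypothetical.lesswrong.discussion
  Subject: I am right, you are mistaken, he is mindkilled.
  Date: Tue, 05 Apr 2016 12:00:00 +0000
  Message-ID: <unique-string@example.org>

  lorem ipsum nonsensical statement involving plankton....

Any software that understands this format can carry, store, or display such a post, whether or not its authors have ever heard of the community it came from.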
Expertise is a Bottleneck
If you're trying to build a high-quality community, you want a closed system. Well-kept gardens die by pacifism, and it's impossible to fully moderate an open system. But if you're building a communication infrastructure, you want an open system.
In the early Usenet days, this was exactly what existed; NNTP was standardized and open, but Usenet was a de facto closed community, accessible mostly to academics. Then AOL hooked its customers into the system. The closed community became open, and the Eternal September began.3 I suspect, but can't prove, that this was a partial cause of the flight of discussion from Usenet to closed web forums.
I don't think that was the appropriate response. I think the appropriate response was private NNTP networks or even single servers, not connected to Usenet at large.
Modern web forums throw the open-infrastructure baby out with the open-community bathwater. The result, in our specific case, is that if we want something not provided by the default Less Wrong interface, it must be implemented by Less Wrongers.
I don't think UI implementation is our comparative advantage. In fact I know it isn't, or the Less Wrong UI wouldn't suck so hard. We're pretty big by web-forum standards, but we still contain only a tiny fraction of the Internet's technical expertise.
The situation is even worse among the diaspora; for example, at SSC, if Scott's readers want something new out of the interface, it must be implemented either by Scott himself or his agents. That doesn't scale.
One of the major benefits of a standardized, open infrastructure is that your developer base is no longer limited to a single community. Any software written by any member of any community backed by the same communication standard is yours for the using. Additionally, the developers are competing for the attention of readers, not admins; you can expect the reader-facing feature set to improve accordingly. If readers want different UI functionality, the community admins don't need to be involved at all.
A Real Web Client
When I wrote the intro to this sequence, the most common thing people insisted on was this: Any system that actually gets used must allow links from the web, and those links must reach a web page.
I completely, if grudgingly, agree. No matter how insightful a post is, if people can't link to it, it will not spread. No matter how interesting a post is, if Google doesn't index it, it doesn't exist.
One way to achieve a common interface to an otherwise-nonstandard forum is to write a gateway program, something that answers NNTP requests and does magic to translate them to whatever the forum understands. This can work and is better than nothing, but I don't like it -- I'll explain why in another post.
Assuming I can suppress my gag reflex for the next few moments, allow me to propose: a web client.
(No, I don't mean write a new browser. The Browser Is Not Your Client.4)
Real NNTP clients use the OS's widget set to build their UI and talk to the discussion board using NNTP. There is no fundamental reason the same cannot be done using the browser's widget set. Google did it. Before them, Deja News did it. Both of them suck, but they suck on the UI level. They are still proof that the concept can work.
I imagine an NNTP-backed site where casual visitors never need to know that's what they're dealing with. They see something very similar to a web forum or a blog, but whatever software today talks to a database on the back end instead talks to NNTP, which is the canonical source of posts and post metadata. For example, it resolves a link to http://lesswrong.com/posts/message_id.html by sending ARTICLE message_id to its upstream NNTP server (which may be hosted on the same system), just as a native client would.
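To show how thin that layer can be, here is a rough sketch in Python (hostnames and the message-id are placeholders; a real client would add caching, threading, and error handling) of fetching one article exactly as described above:

  import socket

  def fetch_article(host, message_id, port=119):
      """Fetch one article by Message-ID with the ARTICLE command (RFC 3977)."""
      with socket.create_connection((host, port)) as sock:
          f = sock.makefile("rwb")
          f.readline()                      # 200/201 greeting from the server
          f.write(b"ARTICLE <%s>\r\n" % message_id.encode("ascii"))
          f.flush()
          status = f.readline()             # expect "220 ... article follows"
          if not status.startswith(b"220"):
              raise LookupError(status.decode(errors="replace"))
          lines = []
          for line in f:
              line = line.rstrip(b"\r\n")
              if line == b".":              # lone dot ends the multi-line response
                  break
              if line.startswith(b".."):    # undo dot-stuffing
                  line = line[1:]
              lines.append(line.decode("utf-8", errors="replace"))
          return "\n".join(lines)

  # e.g. the handler behind /posts/message_id.html might do:
  # article = fetch_article("news.example.org", "message_id@example.org")

The web client's only remaining job is the one described earlier: turning that article's headers and body into HTML.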
To the drive-by reader, nothing has changed. Except, maybe, one thing. When a regular reader, someone who's been around long enough to care about such things, says "Hey, I want feature X," and our hypothetical web client doesn't have it, I can now answer:
Someone wrote something that does that twenty years ago.
Here is how to get it.
1. Meta-meta: This post took about eight hours to research and write, plus two weeks procrastinating. If anyone wants to discuss it in realtime, you can find me on #lesswrong or, if you insist, the LW Slack.
2. The possibility of "universal clients" that understand multiple APIs is an interesting case, as with Pidgin for IM services. I might talk about those later.
3. Ironically, despite my nostalgia for Usenet, I was a part of said September; or at least its aftermath.
4. Okay, that was a little shoehorned in. The important thing is this: What I tell you three times is true.
Your proposal requires a lot of work: both coding, and the social effort of getting everyone to use new custom software on their backends. So we should compare it not to existing alternatives, but to potential solutions we could implement at similar cost.
Let's talk about a concrete alternative: a new protocol, using JSON over HTTP, with an API representing CRUD operations over a simple schema of users, posts, comments, et cetera; with some non-core features provided over existing protocols like RSS. An optional extension could provide e.g. server push notifications, but that would be for performance or convenience, not strictly for functionality.
It would be simpler to specify (compared to contorting NNTP), and everyone's used to JSON/HTTP CRUD. It would be simpler to implement -- almost trivial, in fact -- in any client or server language, easier than writing an HTTP-to-NNTP gateway even though NNTP servers already exist. It would better match the existing model of forums and users. And it would (more easily) allow integration with existing forum software, so instead of telling everyone to find a Linux host and install custom software, we'd be telling them to find a WordPress+MySQL host and install one plugin.
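For concreteness, the core of such a protocol might look something like this (resource names invented here for illustration; the real work would be agreeing on the schema and the authentication story):

  GET    /posts                 -> list of posts (id, author, title, date)
  GET    /posts/{id}            -> one post, including body
  GET    /posts/{id}/comments   -> the comment tree for that post
  POST   /posts/{id}/comments   -> create a comment (authenticated)
  PUT    /comments/{id}         -> edit your own comment
  DELETE /comments/{id}         -> remove or moderate a comment (authorized)
  GET    /users/{id}            -> public profile, karma

Everything else -- search, notifications, karma details -- could live in optional extensions, as suggested above for RSS and push.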
I think the current model is fine. Posts and comments are associated with forums (sites), and links to them are links to those sites. (As opposed to a distributed design like NNTP that forwards messages to different hosts.) User accounts are also associated with sites, but sites can delegate authentication to other sites via Google/Facebook login, OpenID, etc. Clients can aggregate data from different sites and crosslink posts by the same users on different sites. A site owner has moderator powers over content on their site, including comments by users whose account is registered at a different site.
The UXs for posters, commenters, readers, and site owners all need to be improved. But I don't see a problem with the basic model.
Then you suffer all the problems of NNTP's distributed design (which I outlined in my first comment) without getting any of the benefits.
It seems easy to me. The user account lives on LW, but the actual comment lives on SSC, so an SSC mod can moderate it or ban the user from SSC. There are plenty of competing cross-site authentication systems and we don't even have to limit ourselves to supporting or endorsing one of them.
Also, we can just as easily support non-site-associated accounts, which are authenticated by a pubkey. System designers usually don't like this choice because it's too easy to create lots of new accounts, but frankly it's also very easy to create lots of Google accounts. SSC even allows completely auth-less commenting, so anyone can claim another's username, and it hasn't seemed to hurt them too badly yet.
I'll just repeat my core argument here. Extant NNTP software is far more terrible if you penalize it for things like not supporting incoming hyperlinks, not allowing post editing, not having karma, having no existing Web clients, etc. Adding those things to NNTP (both the protocol and the software) would require more work than building a new Web-friendly forum standard and implementations, and would also be much more difficult for site admins to adopt and install.
I don't know of any concrete ones, but I haven't really searched for them either. It just feels as though it's likely there were some - which were ultimately unsuccessful, clearly.
Having an RFC isn't really that important. There are lots of well-documented, historically stable protocols with many opensource implementations that aren't any worse just because they haven't been published via the IETF or OASIS or ECMA or what have you.
Well, yes. That's more or less why I expect it to never, ever happen. I did say I'm a crank with no serious hopes. ;-)
While I don't object in theory to a new protocol, JSON over HTTP specifically is a paradigm I would like to destroy.
(which is kind of hilarious given that my day job involves an app with exactly that design)
Some kind of NNTP2 would be nice. The trouble with taking ...