A couple of weeks ago I asked "Should LW have an official list of norms?" and I appreciate the responses there. Here I want to say what I'm currently thinking following that post, and to continue having a public conversation about it.
I think saying more on this topic actually gets into a bunch of interesting questions around LessWrong's purpose, userbase, de facto norms and culture, moderation mandate, etc. Without locking things in as "Officially How It Is Forever", I'll opine on my current thinking on these topics and how I relate to them in practice. It's possible that further public discussion will shift some things here, and after more back-and-forth, it'd make sense to "ratify" some of it more.
With all that said...
LessWrong and The Art of Discourse
LessWrong was founded to be a place for perfecting the Art of Human Rationality, i.e., generally thinking in ways which more reliably result in true beliefs, etc. Similarly, I think there's a closely related "Art of Discourse": communicating in ways that more reliably result in those conversing (and reading along) having more true beliefs. Perhaps it's a sub-art of the Art of Human Rationality.
The real rules of which communication most efficiently gets you towards truth live in reality. You can choose your norms, but whether those norms are conducive to truth isn't up to you.
The LessWrong community, over its 10-15 year existence, has assembled a number of beliefs about the Art of Discourse. Things like communicating degrees of belief quantitatively, preference for asymmetric weapons, an interest in local validity, etc. We of course don't have the complete art and may be mistaken about pieces of it, but we feel strongly about some of the pieces we believe we possess of this art.
Different people in our community have somewhat different senses of the Art of Discourse, and these even form clusters. But there's a pretty solid common core set of norms on the site, such that if someone is not conforming to them, most people would want them to change their behavior or go elsewhere.
The core point I want to make here is: The Art of [Truth-seeking] Discourse lives in the territory, and we community members attempt to discover it and practice it.
Moderators moderate according to their own understanding of The Art
A thing you could imagine doing is having the community come together, write down its sense of how you ought to behave, and enshrine that as The Law. The moderators (judges/police) then interpret and enforce the law. I think this sometimes gets called "Rule of Law".
I think that gets you some advantages, but requires infrastructure and investment LessWrong can't realistically have, both for enshrining the initial law and then updating it over time in cases of incompleteness and ambiguity.
(edit: "Rule of Man" as an existing phrase means something crucially different from what I wanted to describe. See my comment here for clarification.)
Instead, LessWrong operates by a "Rule of Man"[1] system where the moderators apply our own understanding of the Art of Discourse to making moderation decisions about which behaviors are okay or not, and what to do with users who behave badly according to us. This has quite a few benefits: it allows us to be flexible and adaptable to new cases, it means we ask a direct question of "does this seem good or not?" rather than "did it violate the enshrined law?", and it allows us to smoothly improve the enforced policy as our understanding of the Art of Discourse improves over time.
This approach does run the risk of moderators making bad calls (or being corrupt or biased), which is why I favor moderation being transparent where doing so isn't too costly, so people can call out things they think are mistakes.
Components of Decision-Making: Inside-View/Outside-View/Stakeholder-Game-Theory
It'd actually be imprecise to say that moderators just moderate according to our inside views of the Art of Discourse. I could carve it up a few ways, but here's one attempted breakdown of how we make our site decisions:
- We make moderation decisions based on our inside-view beliefs about what would be good for the Discourse and LessWrong's goals, via both indirect reasoning (principles we've settled on) and direct consequentialist reasoning[2].
- We might sometimes weight the views of people we think are wrong but whose thinking we generally respect. This is something like applying an "outside view" to situations.
- One form of this is trying to hold ourselves accountable to the people we think we should hold ourselves accountable to. I can't currently provide you a list or clear criteria, but it's like "these people seem to really capture the spirit of my values; by doing well in their lights, I will do well according to my own values and judgment". Another framing might be "this includes the people that, if they thought we were fucking up or were unhappy with us, I'd really, really care." Eliezer and Scott would be on that list, for example.
- We do some "game theory" to figure out how to account for the views and preferences of people that we feel have meaningful "stake" in LessWrong. For example, if it was the case that a number of very core contributors differed from the LessWrong mod team in their beliefs about the Art of Discourse (could be something like different beliefs about what politeness-norms or psychologizing-norms are good), we would likely weigh those beliefs in the actual policy we upheld.
Legibilizing one's understanding of the Art
There's a bunch of encoded functions and algorithms in my brain which, for any given post or comment on LessWrong, will provide an evaluation of it. This illegible function is what I actually use to make moderation calls, and it would be very difficult, perhaps impossible, for me to make it fully legible (even to myself). The other members of the LessWrong team have their own functions, and for that matter, so does every user on LessWrong.
However, I can attempt to capture aspects of my encoded function into something explicit: lists of principles that, while not the actual thing, point you in the right direction, or that I can invoke to help explain my reasoning in various cases. These legible lists of principles or rules aren't the law in the sense in which the US Constitution is law, but they'll give you a better sense of the real rules than you'd have without them.
You end up with a fair bit of indirection:
written discussion principles attempt to capture the LW team's understanding of the Art of Discourse, which in turn attempts to capture the Actual Art of Discourse.
The written principles/norms are then hopefully useful by:
- Being a useful start to learning the actual Art of Discourse for new users
- Helping new users understand which behaviors get upvoted/downvoted, approved/rejected, moderated/not-moderated
- Helping moderators and other users explain their reactions
- Focusing community discussions around okay/not-okay behavior
At the same time, the written principles are not the end-all be-all. A moderator might say "while none of our existing written things capture what you're doing, we're pretty sure it's bad and we're taking moderator action to prevent more of this".
A list of norms for LessWrong which is of the shape "here's our understanding of the Art of Discourse (work in progress)" seems like it could be pretty good.
Towards a settled picture
I think the above picture is pretty good, and it's approximately the model/philosophy behind current moderation. But it seems good to write it up and discuss it in advance of us taking bolder actions on the basis of it (e.g. writing a list of site norms). I'm very interested in feedback here that could result in amending the picture.
Rather than something framed as "site norms", I like the idea of writing up "here's our understanding of the Art of Discourse so far" that can be shared with new users and cited in moderation decisions. Ideally it also gets updated over time as we figure out more and more of the Art of Discourse, and make LW more successful at its missions.
- ^
This term gets used elsewhere in not quite the sense I mean it. Elsewhere, Rule of Man means something like "the laws come from the man", whereas I mean something like "the laws live in the territory but are interpreted and applied by man". Perhaps there's a better term for it.
- ^
I've long been a fan of R. M. Hare's two-level utilitarianism, and think it in fact matches how we moderate – attempting to figure out general principles, but both deriving those principles and applying them via more direct consequentialist reasoning.
This alleged "hybrid system" doesn't get you the benefits of rule of law, because the distinguishing feature of the rule of law is that the law is not an optimizer. As Yudkowsky explains in "Free to Optimize", the function of the legal system is "to provide a predictable environment in which people can optimize their own futures." In a free country (as contrasted to an authoritarian dictatorship), a good citizen is someone who pays their taxes and doesn't commit crimes. That way, citizens who have different ideas about what the good life looks like can get along with each other. Sometimes people might make bad decisions, but it turns out that it's actually more fun to live in a country where people have the right to make their own bad decisions (and suffer the natural consequences, like losing money or getting downvoted or failing to persuade people), than a country where a central authority tries to micromanage everyone's decisions.
A country where judges try to get citizens to "actively (credibly) agree to stop optimizing in a fairly deep way", and impose ad hoc punishments to prevent those who don't agree from doing things that "feel damaging" to the judge, is not reaping the benefits of the rule of law, because the benefits of the rule of law flow precisely from the fact that there are rules, and that citizens are free to optimize in a deep way as long as they obey the rules—that the centralized authority isn't trying to grab all the optimization power in the system for itself.
Crucially, assurances that the power structure is trying to optimize for something good are not rules. I'm sure the judges of the Inquisition or Soviet show trials would have told you that they weren't exercising power arbitrarily, because they were making judgements against an external standard—the platonic ideal of God's will, or the common good. I'm sure they were being perfectly sincere about that. The rule of law is about imposing more stringent limits on the use of power than an authority figure's subjectively sincere allegiance to an abstract ideal that isn't written down.
I wrote a post explaining why there's very obviously not going to be any such thing. Probability theory isn't going to tell you how polite to be. It just isn't. Why would it? How could it?
What's your counterargument? If you don't have a counterargument, then how can you possibly claim with a straight face that "go[ing] to the rulers and say[ing] 'hey, you're mistaken about Art of Discourse'" is a redress mechanism?