I came across a 2015 blog post by Vitalik Buterin that contains some ideas similar to Paul Christiano's recent Crowdsourcing moderation without sacrificing quality. The basic idea in both is that it would be nice to have a panel of trusted moderators carefully pore over every comment and decide on its quality, but since that is too expensive, we can instead use some tools to predict moderator decisions, and have the trusted moderators look at only a small subset of comments in order to calibrate the prediction tools. In Paul's proposal the prediction tool is machine learning (mainly using individual votes as features), and in Vitalik's proposal it's prediction markets where people bet on what the moderators would decide if they were to review each comment.
It seems worth thinking about how to combine the two proposals to get the best of both worlds. One fairly obvious idea is to let people both vote on comments to express their own opinions and place bets on moderator decisions, with ML setting the baseline odds; this would reduce how much the forum has to pay out to incentivize accurate prediction markets. The hoped-for outcome is that the ML algorithm makes correct decisions most of the time, people bet against it when they see it making mistakes, and moderators review the comments with the greatest disagreement, whether between the ML and the bettors or among the bettors themselves. Another part of Vitalik's proposal is that each commenter must make an initial bet that moderators would judge their comment to be good. The article notes that such a bet can also be viewed as a refundable deposit. Such forced bets / refundable deposits would help solve a security problem with Paul's ML-based proposal.
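To make the review-queue idea concrete, here is a minimal sketch (all names and the scoring rule are hypothetical, not from either proposal) of ranking comments for moderator attention by how strongly the market disagrees with the ML baseline, plus how much bettors disagree among themselves:

```python
# Hypothetical sketch: prioritize comments for moderator review where the
# prediction market disagrees with the ML baseline, or bettors split.

def review_priority(ml_prob, market_prob, bet_probs):
    """Higher score = more worth a moderator's attention.

    ml_prob:     ML-predicted probability that moderators would approve.
    market_prob: current market price for "moderators approve".
    bet_probs:   individual bettors' implied probabilities.
    """
    ml_vs_market = abs(ml_prob - market_prob)
    # Disagreement among bettors themselves, as a simple spread.
    bettor_spread = (max(bet_probs) - min(bet_probs)) if bet_probs else 0.0
    return ml_vs_market + bettor_spread

# Toy data: (ml_prob, market_prob, bet_probs) per comment.
comments = {
    "c1": (0.9, 0.85, [0.8, 0.9]),  # ML and market agree: low priority
    "c2": (0.9, 0.30, [0.2, 0.4]),  # market bets against ML: high priority
    "c3": (0.5, 0.50, [0.2, 0.8]),  # bettors split among themselves
}
ranked = sorted(comments, key=lambda c: review_priority(*comments[c]), reverse=True)
print(ranked)
```

Many other scoring rules would do; the point is only that the moderators' limited attention goes wherever the predictors are least sure, which is also where reviewing produces the most calibration signal.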
Are there better ways to combine these prediction tools to help with forum moderation? Are there other prediction tools that can be used instead or in addition to these?
Luis von Ahn (of CAPTCHA fame) came up with a number of games with a purpose, such as the ESP game. The idea to bet on moderator decisions reminded me of those games.
I recall another one that judged the aesthetic quality of images, but I don't remember the name. We could use something similar to judge the quality of posts in a way that would be resistant to abuse.
I'm pretty sure that aesthetic game was on the original gwap.com, but unfortunately the creators seem to have moved on to other projects, and the game part of the site no longer exists. I don't remember exactly how it worked. The rules might be on the Wayback Machine, but the site doesn't seem to be well preserved there. Does anyone remember the name of that game or how it worked?
Von Ahn also invented reCAPTCHA, which gives me another idea. We could perhaps require that a user participate in a judging game on two other posts as a cost to submit a post of their own.
The aim of the judging game is to predict how the other player will judge the post (for example: upvote, downvote, flag as inappropriate, or some other set of emojis with standard meanings). The post is chosen randomly from recent posts, and the identity of the other player is kept secret by the system. If the two players agree on an emoji, it gets applied to the post and both players earn points toward their predictor score. If they disagree, they lose predictor points and the emoji is not applied.
New moderators can then be chosen from the highest-scoring predictors.
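The judging-game rules above can be sketched in a few lines. Everything here (the reaction set, the one-point reward/penalty, the data structures) is an illustrative assumption, not part of the original proposal:

```python
# Hypothetical sketch of the judging game: two anonymous players each pick
# a reaction for the same post; agreement applies the reaction and rewards
# both players, disagreement costs both a point.

REACTIONS = {"upvote", "downvote", "flag"}  # assumed standard emoji set

def judge_round(scores, post_reactions, post_id, player_a, pick_a, player_b, pick_b):
    """Resolve one round; mutates scores and post_reactions in place."""
    assert pick_a in REACTIONS and pick_b in REACTIONS
    if pick_a == pick_b:
        # Agreement: the emoji is applied and both predictor scores go up.
        post_reactions.setdefault(post_id, []).append(pick_a)
        scores[player_a] = scores.get(player_a, 0) + 1
        scores[player_b] = scores.get(player_b, 0) + 1
    else:
        # Disagreement: nothing is applied and both lose predictor points.
        scores[player_a] = scores.get(player_a, 0) - 1
        scores[player_b] = scores.get(player_b, 0) - 1

scores, reactions = {}, {}
judge_round(scores, reactions, "post-1", "alice", "upvote", "bob", "upvote")
judge_round(scores, reactions, "post-2", "alice", "flag", "carol", "downvote")

# Highest-scoring predictors become moderator candidates.
candidates = sorted(scores, key=scores.get, reverse=True)
print(candidates, scores, reactions)
```

In this toy run, only the agreed-upon "upvote" is applied to post-1, and bob ends up as the top predictor, matching the rule that new moderators are drawn from the highest scorers.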
Games with a purpose seem to work when useful things are also fun or can be made fun; indeed, since people sometimes do fun but useless things, and some useful things are fun, I certainly agree that we should move as much of the fun to "useful stuff" as possible.
Schemes based on markets can work even if they are not fun (e.g. even when participants are algorithms and companies offering professional moderation services).