Combining Prediction Technologies to Help Moderate Discussions

Wei Dai

I came across a 2015 blog post by Vitalik Buterin that contains some ideas similar to Paul Christiano's recent Crowdsourcing moderation without sacrificing quality. The basic idea in both is that it would be nice to have a panel of trusted moderators carefully pore over every comment and decide on its quality, but since that is too expensive, we can instead use some tools to predict moderator decisions, and have the trusted moderators look at only a small subset of comments in order to calibrate the prediction tools. In Paul's proposal the prediction tool is machine learning (mainly using individual votes as features), and in Vitalik's proposal it's prediction markets where people bet on what the moderators would decide if they were to review each comment.

It seems worth thinking about how to combine the two proposals to get the best of both worlds. One fairly obvious idea is to let people both vote on comments as an expression of their own opinions and place bets about moderator decisions, with ML setting the baseline odds, which would reduce how much the forum would have to pay out to incentivize accurate prediction markets. The hoped-for outcome is that the ML algorithm would make correct decisions most of the time, but people can bet against it when they see it making mistakes, and moderators would review the comments with the greatest disagreement between ML and people, or between different bettors in general.

Another part of Vitalik's proposal is that each commenter has to make an initial bet that moderators would decide that their comment is good. The article notes that such a bet can also be viewed as a refundable deposit. Such forced bets / refundable deposits would help solve a security problem with Paul's ML-based proposal.
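To make the combined scheme a bit more concrete, here is a minimal sketch of how comments might be ranked for moderator review by disagreement. Everything in it is an assumption made for illustration rather than part of either proposal: the names (Comment, market_prob, disagreement, pick_for_review), the use of a stake-weighted average as the market's implied probability, and the choice to score disagreement as the larger of the ML-versus-bettors gap and the spread among bettors.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    comment_id: str
    ml_prob_good: float  # hypothetical: ML baseline probability that moderators rate it "good"
    # Each bet is (probability_good, stake); the commenter's forced initial bet can be included here.
    bets: list = field(default_factory=list)


def market_prob(comment):
    """Stake-weighted average of bettors' probabilities (an assumed aggregation rule).

    Falls back to the ML baseline when nobody has bet.
    """
    total_stake = sum(stake for _, stake in comment.bets)
    if total_stake == 0:
        return comment.ml_prob_good
    return sum(p * stake for p, stake in comment.bets) / total_stake


def disagreement(comment):
    """Larger of the ML-vs-bettors gap and the spread among bettors."""
    ml_gap = abs(comment.ml_prob_good - market_prob(comment))
    probs = [p for p, _ in comment.bets]
    spread = max(probs) - min(probs) if probs else 0.0
    return max(ml_gap, spread)


def pick_for_review(comments, budget):
    """Return the `budget` comments with the highest disagreement scores."""
    return sorted(comments, key=disagreement, reverse=True)[:budget]
```

With this scoring, a comment that the ML model rates highly but that attracts a large bet against it moves to the front of the review queue, while a comment nobody bets on stays at the ML baseline and is never prioritized. A real system would presumably also sample some undisputed comments at random so that the ML model itself stays calibrated.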

Are there better ways to combine these prediction tools to help with forum moderation? Are there other prediction tools that can be used instead of or in addition to these?

It also adds an attack vector, both for those willing to spend to influence the automation

I'm optimistic that we can cope with this in a very robust way (e.g. by ensuring that when there is disagreement, the disagreeing parties end up putting in enough money that the arbitrage can be used to fund moderation).

and for those wanting to make a profit on their influence over the moderators

This seems harder to convincingly address.

But I don't think there's any solution that doesn't involve a lot more ground-truthing by trusted evaluators.

So far I don't see any lower bounds on the amount of ground truth required. I expect that there aren't really theoretical limits: if the moderator were only willing to moderate in return for very large sums of money, then the cost per review would be quite high, but they would potentially have to moderate very few times. I see two fundamental limits:

Moderation is required in order to reveal info about the moderator's behavior, which is needed by sophisticated bettors. This could also be provided in other ways.
Moderation is required in order to actually move money from the bad predictors to the good predictors. (This doesn't seem important for "small" forums, since then the incentive effects are always the main thing, i.e. the relevant movement of funds from bad to good predictors is happening at the scale of the world at large, not at the scale of a particular small forum.)
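The trade-off between an expensive moderator and rare moderation can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that each comment is reviewed independently with probability review_prob and that bets settle only on reviewed comments, with winnings scaled up by 1 / review_prob. Neither the function names nor the scaling rule come from the discussion above.

```python
def expected_cost_per_comment(review_prob, cost_per_review):
    """Average moderation cost per comment when only a fraction is reviewed."""
    return review_prob * cost_per_review


def scaled_payout(base_payout, review_prob):
    """Payout on a reviewed comment, scaled by 1 / review_prob.

    Assumed rule: bets settle only when a review happens, so scaling keeps a
    bettor's expected payout equal to what it would be under full review.
    """
    return base_payout / review_prob


# A moderator who charges 500 per review but reviews 1 comment in 100
# costs the same on average as one who charges 5 and reviews everything.
rare_and_expensive = expected_cost_per_comment(0.01, 500.0)
always_and_cheap = expected_cost_per_comment(1.0, 5.0)
```

Under these assumptions the average cost depends only on the product of review probability and review price, which is one way to see why no lower bound on the number of reviews follows from cost alone. The scaled payouts grow as reviews become rarer, so bettors face more variance even though their expected return is unchanged.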

I'm optimistic that we can cope with this in a very robust way (e.g. by ensuring that when there is disagreement, the disagreeing parties end up putting in enough money that the arbitrage can be used to fund moderation).

That assumes that many people are aware of a given post over which there are disagreements in the first place.
