- A very important fact which just came to my attention is that people do not tend to sum or take the max reasonableness of arguments for P to form a judgement about P; rather, they tend to take the average.
- This is a somewhat reasonable heuristic in some situations. For instance, if someone gives you a really unreasonable argument for P this is evidence that their judgement of arguments isn’t very good, and so their best argument is more likely to be secretly bad.
- Similarly, it is evidence that they are motivated to convince you even using faulty arguments, which is generally speaking a bad sign.
- It has important implications. Sometimes people think “oh I will make 50 ok arguments for P instead of one really good one” but most folks are not very impressed by this, even though they should be.
- Relatedly, if you try to turn a complicated thesis T into a social movement, the average reasonableness of an argument in favor of T will plummet, and so you may very quickly find that everyone perceives the anti-T-ers as being much more reasonable.
- This will probably still be true even if the best pro-T arguments are very good, and especially true if the best pro-T arguments are subtle or hard to follow.
- Yes, this is about ai risk. I don’t think this is a slam dunk argument against trying to make ai-risk-pilled-ness into a popular social movement, but it is a real cost, and nearly captures the shape of my real worries.
- That is, the best version of my real worries. I also have real worries that are not nearly as defensible or cool.
- Oh actually probably they take the min rather than the average unless they like you, in which case they take the max.
First of all, is this important fact actually true? I'd love to know. Reviewing my life experience... it sure seems true? At least true in many circumstances? I think I can think of lots of examples where this fact being true is a good explanation of what happened. If people have counterarguments or sources of skepticism, I'd be very interested to hear them in the comments.
Secondly, I concluded a while back that One Strong Argument Beats Many Weak Arguments, and in Ye Olden Days of Original Less Wrong when rationalists spent more time talking about rationality there was a whole series of posts arguing for the opposite claim (1, 2, 3). Seems possibly related. I'd love to see this debate revived, and tied in to the more general questions of:
(A) Does rationality in practice recommend aggregating the quality of a group of arguments for a claim by taking the sum, the max, the min, the mean, or what? (To be clear, obviously the ideal is more complicated & looks more like Bayesian conditionalization on a huge set of fleshed-out hypotheses. But in practice, when you don't have time for that, what do you do?)
(B) What do people typically do, and on what factors does that depend--e.g. do they take the min if they don't like you or the claim, and take the max if they do?
Finally: Steven Adler pointed me to this paper that maybe provides some empirical evidence for Ronny's claim.
Any time you get a data point about X, you get to update both on X and on the process that generated the data point. If you get several data points in a row, then as your view of the data-generating process changes you have to re-evaluate all of the data it gave you earlier. Examples:
None of this is cheap to compute; there are a bunch of subtle, clashing considerations. So if we don't have a lot of time, should we use the sum, or the average, or what? Equivalently: what prior should we have over data-generating processes? Here's how I think about it:
Sum: Use this when you think your data points are independent, and not filtered in any particular way -- or if you think you can precisely account for conditional dependence, selection, and so on. Ideal, but sometimes impractical and too expensive to use all the time.
Max: Useful when your main concern is noise. Probably what I use the most in my ordinary life. The idea is that most of the data I get doesn't pertain to X at all, and the data that is about X is both subject to large random distortions and probably secretly correlated in a way that I can't quantify very well. Nevertheless, if X is true you should expect to see signs of it, here and there, and tracking the max leaves you open to that evidence without having to worry about double-updating. As a bonus, it's very memory efficient: you only have to remember the strongest data favoring X and the strongest data disfavoring it, and can forget all the rest.
Average: What I use when I'm evaluating an attempt at persuasion from someone I don't know well. Averaging is a lousy way to evaluate arguments but a pretty-good-for-how-cheap-it-is way to evaluate argument-generating processes. Data points that aren't arguments probably shouldn't ever be averaged.
Min: I don't think this one has any legitimate use at all. Lots of data points are only very weakly about X, even when X is true.
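To make the difference between these four heuristics concrete, here is a minimal Python sketch. It assumes each argument's strength can be scored as a single number (here, hypothetical "bits of evidence for X"), which is a big simplification. The `aggregate` helper and the example scores are made up for illustration and do not come from any data.

```python
from statistics import mean

# Each rule maps a list of per-argument strengths to one overall score.
# The strengths are hypothetical "bits of evidence for X" (an assumption).
RULES = {"sum": sum, "max": max, "mean": mean, "min": min}


def aggregate(strengths, rule):
    """Combine per-argument strengths using the named heuristic."""
    return RULES[rule](strengths)


# Case 1: a single strong argument.
one_strong = [4.0]
# Case 2: the same strong argument plus ten weak ones.
strong_plus_weak = [4.0] + [0.5] * 10

for rule in RULES:
    before = aggregate(one_strong, rule)
    after = aggregate(strong_plus_weak, rule)
    print(f"{rule:>4}: one strong = {before:.2f}, strong + 10 weak = {after:.2f}")
```

With these made-up numbers, adding the ten weak arguments raises the sum from 4.0 to 9.0, leaves the max at 4.0, drags the mean down to about 0.82, and drops the min to 0.5. A listener who averages, or takes the min, ends up less convinced after hearing more supporting arguments.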
All of these heuristics have cases where they abjectly fail, and none of them work well when your adversary is smarter than you are.