cross-posted on the EA Forum
I'm interested in questions of the form, "I have a bit of metadata/structure to the question, but I know very little about the content of the question (or alternatively, I'm too worried about biases or hacks affecting how I think about the problem or which pieces of information I pay attention to). In those situations, what prior should I start with?"
I'm not sure if there is a more technical term than "low-information prior."
Some examples of what I found useful recently:
1. Laplace's Rule of Succession, for when the underlying mechanism is unknown.
2. Percentage of binary questions that resolve as "yes" on Metaculus. It turns out that of all binary (Yes-No) questions asked on the prediction platform Metaculus, 29% resolved yes. This means that even if you know nothing about the content of a Metaculus question, a reasonable starting point for answering a randomly selected binary Metaculus question is 29%.
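For concreteness, here is a minimal sketch of the first item. Laplace's Rule of Succession estimates the probability that the next trial succeeds as (s + 1) / (n + 2) after observing s successes in n trials. The function name `rule_of_succession` is just an illustrative choice, not something from a library.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Laplace's estimate of P(next trial succeeds): (s + 1) / (n + 2)."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# With no observations at all, the estimate is 1/2.
print(rule_of_succession(0, 0))
# After 9 successes in 9 trials, the estimate is 10/11, not 1.
print(rule_of_succession(9, 9))
```

Note that the estimate never reaches exactly 0 or 1, which is part of why it works as a starting point when the mechanism is unknown.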
In both cases, obviously there are reasons to override the prior (for example, you can arbitrarily flip all questions on Metaculus such that your prior is now 71%). However (I claim), having a decent prior is nonetheless useful in practice, even if it's theoretically unprincipled.
I'd be interested in seeing something like 5-10 examples of low-information priors as useful as the rule of succession or the Metaculus binary prior.
I'd imagine publication bias priors are helpful, especially with increasing specificity of research area, and especially where you can think of any remote possibility for interference.
Just as an example I'm familiar with (note this is probably a somewhat more extreme example than for most research areas due to the state of pharmacological research): If you see 37 RCTs in favour of a given drug, and 3 that find no significant impact (i.e. 93% in favour), it is not unfounded to assume that the trials actually performed are roughly equal in favour and against, and that there may be a missing 34-odd studies.
A 2009 analysis found that this was almost exactly the case: the registered studies were split 38 to 36 in favour of the drug. One positive RCT went missing before publication, twenty-two non-significant studies were missing altogether, and a further 11 were so poorly analysed as to appear significant.
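As a rough arithmetic check on those tallies, the sketch below recomputes both shares. It assumes my reading of the figures: that the 11 poorly analysed studies and the 22 missing ones both count among the registered studies that were not in favour of the drug. The variable names are illustrative only.

```python
# Tallies as quoted in this comment.
published_positive = 37
published_negative = 3
positive_unpublished = 1
negative_unpublished = 22
negative_reported_as_positive = 11  # assumption: counted among the registered negatives

# Reconstruct the registered totals from the pieces.
registered_positive = published_positive + positive_unpublished
registered_negative = (
    published_negative + negative_unpublished + negative_reported_as_positive
)

# Share in favour as it appears in print vs. among registered trials.
apparent_share = published_positive / (published_positive + published_negative)
registered_share = registered_positive / (registered_positive + registered_negative)

print(registered_positive, registered_negative)
print(f"apparent: {apparent_share:.1%}, registered: {registered_share:.1%}")
```

Under that reading, the registered trials come out at 38 in favour and 36 not, so the share in favour falls from roughly 93% in print to roughly half.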
(Bad Pharma, by Ben Goldacre, is a pretty sound resource for this topic in general)
Wow, thank you! I found this really helpful.