I'm assuming their opinions are independent, usually because they're trained on different features that have low correlations with each other. I was thinking of adding in log-odds space, as a way of adding up bits of information, and this turns out to be the same as using DanielLC's method. Averaging instead seems reasonable if correlations are high.
Yes, but the key point I was trying to make is that using different features with low correlations does not at all ensure that adding the evidence is correct. What matters is not correlations between the features, but correlations between the experts. Correlated features will of course mean correlated experts, but the converse is not true. The features don't have to be correlated for the experts to make mistakes on the same inputs. It's often the case that they do simply because some inputs are fundamentally more difficult than others, in ways that affect ...
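To make the double-counting risk concrete, here is a minimal sketch of the extreme case where the second expert is an exact copy of the first, so their errors are perfectly correlated. The function names and the numbers (prior .4, both opinions .7) are illustrative assumptions, not anything from the thread. Adding log-odds treats the copy as fresh evidence; averaging does not.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

def sigmoid(x):
    """Inverse of logit."""
    return 1 / (1 + math.exp(-x))

def add_evidence(prior, opinions):
    # Assumes every opinion is independent evidence:
    # sum each expert's log-odds shift away from the prior.
    shift = sum(logit(p) - logit(prior) for p in opinions)
    return sigmoid(logit(prior) + shift)

def average_evidence(prior, opinions):
    # Assumes the opinions largely repeat each other:
    # use the mean log-odds shift instead of the sum.
    shift = sum(logit(p) - logit(prior) for p in opinions) / len(opinions)
    return sigmoid(logit(prior) + shift)

prior = 0.4
duplicated = [0.7, 0.7]  # hypothetical: expert B just repeats expert A

added = add_evidence(prior, duplicated)
averaged = average_evidence(prior, duplicated)
print(f"added: {added:.3f}, averaged: {averaged:.3f}")
```

With these inputs, adding gives about .891 while averaging stays at .7, which is the right answer when the second opinion carries no new information. Real experts sit somewhere between these two extremes.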
Suppose you have a property Q which certain objects may or may not have. You've seen many of these objects; you know the prior probability P(Q) that an object has this property.
You have 2 independent measurements of object O, which each assign a probability that Q(O) (O has property Q). Call these two independent probabilities A and B.
What is P(Q(O) | A, B, P(Q))?
To put it another way, expert A has opinion O(A) = A, which asserts P(Q(O)) = A = .7, and expert B says P(Q(O)) = B = .8, and the prior P(Q) = .4, so what is P(Q(O))? The correlation between the opinions of the experts is unknown, but probably small. (They aren't human experts.) I face this problem all the time at work.
You can see that the problem isn't solvable without the prior P(Q), because if the prior P(Q) = .9, then two experts assigning P(Q(O)) < .9 should result in a probability lower than the lowest opinion of those experts. But if P(Q) = .1, then the same estimates by the two experts should result in a probability higher than either of their estimates. But is it solvable or at least well-defined even with the prior?
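The dependence on the prior can be checked numerically. The sketch below assumes the two opinions are conditionally independent given Q (and given not-Q), which is an assumption and not something the problem statement guarantees. Under that assumption each expert's likelihood ratio is their odds divided by the prior odds, so the combined odds are odds(A) * odds(B) / odds(prior). The function names are made up for illustration.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def combine(prior, a, b):
    # Assumption: A and B are conditionally independent given Q.
    # Each expert alone moves the prior odds to their own odds, so
    # their likelihood ratio is odds(expert) / odds(prior).
    combined_odds = odds(a) * odds(b) / odds(prior)
    return combined_odds / (1 + combined_odds)

high_prior = combine(0.9, 0.7, 0.8)  # both experts are below the prior
low_prior = combine(0.1, 0.7, 0.8)   # both experts are above the prior

print(f"prior .9 -> {high_prior:.3f}")
print(f"prior .1 -> {low_prior:.3f}")
```

With prior .9 the result is about .509, lower than either expert; with prior .1 it is about .988, higher than either. The same two opinions give very different answers, so the prior has to be part of the problem.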
The experts both know the prior, so if you just had expert A saying P(Q(O)) = .7, the answer must be .7. Expert B's opinion B must revise the probability upwards if B > P(Q), and downwards if B < P(Q).
When expert A says O(A) = A, she probably means, "If I consider all the n objects I've seen that looked like this one, nA of them had property Q."
One approach is to add up the bits of information each expert gives, with positive bits for indications that Q(O) and negative bits for indications that not(Q(O)).
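A minimal sketch of that bookkeeping, assuming each expert's evidence is measured as the base-2 log-odds of their opinion minus the base-2 log-odds of the prior, and that the experts' evidence is independent so the bits can simply be summed. The function names are hypothetical; the numbers are the ones from the example above (prior .4, experts at .7 and .8).

```python
import math

def bits(p):
    """Log-odds of p, measured in bits (base 2)."""
    return math.log2(p / (1 - p))

def combine_bits(prior, opinions):
    prior_bits = bits(prior)
    # Evidence per expert: positive if the opinion is above the prior,
    # negative if it is below.
    evidence = [bits(p) - prior_bits for p in opinions]
    # Assumption: independent experts, so evidence adds.
    total_bits = prior_bits + sum(evidence)
    posterior = 1 / (1 + 2 ** (-total_bits))
    return evidence, posterior

evidence, posterior = combine_bits(0.4, [0.7, 0.8])
print([round(e, 3) for e in evidence], round(posterior, 3))

# An opinion below the prior contributes negative bits.
below_prior, _ = combine_bits(0.4, [0.3])
print(round(below_prior[0], 3))
```

Here expert A contributes about 1.8 bits and expert B about 2.6 bits, for a combined probability of about .933. If the experts' errors are correlated, this sum overstates the evidence.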