I recently read "Maximum ignorance probability, with applications to surgery's error rates" by N.N. Taleb, in which he proposes a new estimator for the parameter p of a Bernoulli random variable. In this article, I review its main points and share my own thoughts on it.
The estimator in question (which I will call the maximum ignorance estimator) takes the following form:
$$\hat{p} = 1 - I^{-1}_{0.5}(n - m,\, m + 1)$$
where $I_x(a, b)$ is the regularized incomplete beta function (inverted here with respect to $x$ and evaluated at $1/2$), $n$ is the number of independent trials, and $m$ is the number of successes.
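This formula can be evaluated directly with SciPy's `betaincinv`, the inverse of the regularized incomplete beta function. The sketch below is my own, not code from the paper; the helper name is mine.

```python
from scipy.special import betaincinv


def max_ignorance_estimate(n, m):
    """Maximum ignorance estimator: p_hat = 1 - I^{-1}_{0.5}(n - m, m + 1).

    Requires 0 <= m < n, since betaincinv needs both shape
    parameters to be positive.
    """
    return 1.0 - betaincinv(n - m, m + 1, 0.5)


# e.g. 10 trials, 3 successes: the estimate sits a bit above m/n = 0.3
print(max_ignorance_estimate(10, 3))
```

Note that the estimate is not simply the empirical frequency m/n; it is shifted slightly upward, which is the point of the construction.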
This estimator is derived by solving the following equation
$$F_p(m) = q$$
where $F_p$ is the cumulative distribution function of a binomial random variable with $n$ trials and success probability $p$. In words, this estimator sets $p$ to a value such that the probability of observing $m$ successes or... (read 438 more words →)
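For q = 1/2, the defining equation can be checked numerically: the estimated p should make observing m or fewer successes exactly a coin flip. A small sketch of my own, using SciPy's binomial CDF:

```python
from scipy.special import betaincinv
from scipy.stats import binom

n, m = 20, 7
p_hat = 1.0 - betaincinv(n - m, m + 1, 0.5)  # maximum ignorance estimate

# F_{p_hat}(m) should equal 1/2, since the binomial CDF satisfies
# F_p(m) = I_{1-p}(n - m, m + 1) and p_hat inverts exactly that relation
print(binom.cdf(m, n, p_hat))  # ~0.5
```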
That's a very interesting question, and it is unfortunate that it did not get more traction, because I think I could learn a lot from reading more answers. What follows is in no way a definitive answer; it is just my own take.
A naive answer would be to pick the forecaster with the lowest cross-entropy, i.e. to do the same thing as when we train a binary classifier that outputs probabilities: pick the model that minimises the cross-entropy. I say this answer is naive because, if we take the question at face value and we want to pick the best forecaster,... (read 531 more words →)
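To make the naive rule concrete, here is a minimal sketch of it, assuming hypothetical forecast data (the forecaster names and numbers are made up for illustration):

```python
import math


def cross_entropy(probs, outcomes):
    """Mean negative log-likelihood of binary outcomes under forecast probs."""
    eps = 1e-12  # clip probabilities to avoid log(0)
    return -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for p, y in zip(probs, outcomes)
    ) / len(outcomes)


# hypothetical forecasts from two forecasters on the same five events
outcomes = [1, 0, 1, 1, 0]
forecaster_a = [0.9, 0.2, 0.7, 0.8, 0.1]  # confident and well-aligned
forecaster_b = [0.6, 0.5, 0.5, 0.6, 0.4]  # hedging near 0.5

# the naive rule: pick the forecaster with the lower cross-entropy
best = min(
    [("A", forecaster_a), ("B", forecaster_b)],
    key=lambda kv: cross_entropy(kv[1], outcomes),
)[0]
print(best)  # "A" here, since A's probabilities track the outcomes more closely
```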