I'm pretty sure that you're actually asking some deep questions right now.
I'm not all that well-versed in epistemology or probability theory, but when you write:
A friend in academia suggested that this touches on a problem with Bayes priors that has not been settled.
I think this is a reference to the problem of priors.
I think 'a problem with Bayes priors that has not been settled' kind of understates the significance.
And:
The first issue is that there are infinite people who never existed and did not have a coin made. If I narrow it to historic figures who turned out not to exist and did not have a coin made it becomes possible but also becomes subjective as to whether someone actually thought they existed. For example, did people believe the Minotaur existed? Perhaps I should choose another filter instead of historic figure, like humans that existed. But picking and choosing the category is again so subjective. Someone may also argue that woman inequality back then was so great that the data should only look at men, as a woman’s chance of being portrayed on a coin was skewed in a way that isn’t applicable to men.
I believe this is referred to as the reference class problem. In a Bayesian framework, it seems to be something of a subproblem of the problem of priors: you're only trying to define a suitable reference class in the first place because you're trying to produce a reasonable prior.
It's my understanding that one approach to the problem of priors has been to come up with a 'universal' prior, a prior which is reasonable to adopt before making any observations of the world. One example is Solomonoff's algorithmic probability. It seems, however, that even this may not be a satisfactory solution, because this prior defies our intuition in some ways. For example, humans might find it intuitive to assign nonzero prior probability to uncomputable hypotheses (e.g. our physics involves hypercomputation), but algorithmic probability only assigns nonzero probability to computable hypotheses, so an agent with this prior will never be able to have credence in uncomputable hypotheses.
Another problem is that, with this prior, hypotheses are penalized for their complexity, but utility can grow far more quickly than complexity. Increasing the number of happy people in a program from 1000 people to 1,000,000 people seems to increase its utility a lot without increasing its complexity by much. Taking this up to larger and larger numbers that become difficult to intuitively comprehend, it may be that such a prior would result in agents whose decision-making is dominated by very improbable outcomes with very high utilities. We can also ask whether this apparently absurd result is a point against Solomonoff induction or a point against how humans think, but even if we humans are thinking the right way, we still don't know what it is that's going right inside of our heads and how it compares to Solomonoff induction.
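To make the "utility grows faster than complexity" worry a little more concrete, here is a rough Python sketch. It is only an illustration under loud assumptions: true algorithmic (Kolmogorov) complexity is uncomputable, so `description_bits` is a made-up stand-in that charges a fixed hypothetical overhead plus the bits needed to write down the exponent, and the utility is simply taken to be the number of happy people.

```python
from fractions import Fraction

def description_bits(exponent, overhead_bits=100):
    """Crude, assumed proxy for complexity: a fixed overhead plus the
    number of bits needed to write the exponent in binary."""
    return overhead_bits + exponent.bit_length()

def prior_weight(exponent):
    """Complexity penalty in the style of a 2^(-length) prior (unnormalized)."""
    return Fraction(1, 2 ** description_bits(exponent))

def utility(exponent):
    """Assumed utility: one unit per happy person, 10**exponent people."""
    return 10 ** exponent

def expected_contribution(exponent):
    """Prior weight times utility for the hypothesis '10**exponent happy people'."""
    return prior_weight(exponent) * utility(exponent)

# Going from 10**3 to 10**6 people costs one extra bit in this proxy,
# but multiplies the utility by a thousand.
exponents = [3, 6, 100, 1000]
contributions = [expected_contribution(e) for e in exponents]
for e, c in zip(exponents, contributions):
    print(e, description_bits(e), float(c) if e <= 100 else "very large")
```

In this toy model the product of weight and utility keeps growing as the numbers get bigger, which is the sense in which very improbable, very high-utility outcomes could come to dominate. Whether the same happens under the real prior depends on details this proxy glosses over.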
For any other readers, sorry if I'm mistaken on any of this; it is quite technical and I haven't really studied it. Do correct me if I've made a mistake.
Back to my point: I think you accidentally hit upon a problem that takes few prerequisites to run into and that, after a bit of squinting, turns out to be far harder than that small number of prerequisites would suggest. Personally, I would consider the task of providing a complete answer to your questions an open research problem.
The Stanford Encyclopedia of Philosophy's entry on Bayesian epistemology might be some good reading for this.
Thanks for taking the time to write all that for me. This is exactly the nudge in the right direction I was looking for. I will need at least the next few months to cover all this and all the further Google searches it sends me down. Perfect, thanks again!
I posted before about an open source decision-making website I am working on called WikiLogic. The site has a 2-minute explanatory animation if you are interested. I won't repeat myself, but the tl;dr is that it will follow the Wikipedia model of allowing everyone to collaborate on a giant connected database of arguments where previously established claims can be used as supporting evidence for new claims.
The raw deduction element of it works fine and would be great in a perfect world where such a thing as absolute truths existed. In reality, however, we normally have to deal with claims that are just the most probable. My program allows opposing claims to be connected and then evidence to be gathered for each. The evidence will create a probability of each claim being correct, and whichever is highest gets marked as best answer. Principles such as Occam's razor are applied automatically: a long list of claims used as evidence will be less likely, because each claim has its own likelihood, which dilutes the strength of the whole.
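As a minimal sketch of that dilution effect (not WikiLogic's actual code), suppose an argument only holds if every one of its supporting claims holds, and assume the claims are independent. Both of those are simplifying assumptions, and the function name and the probabilities below are hypothetical.

```python
from math import prod

def chain_probability(claim_probabilities):
    """Probability that all supporting claims hold at once,
    assuming (for illustration) that they are independent."""
    return prod(claim_probabilities)

# Two hypothetical opposing answers, each backed by a list of claims.
answers = {
    "short argument": [0.9, 0.9],   # two fairly likely claims
    "long argument": [0.9] * 6,     # six equally likely claims
}

scores = {name: chain_probability(ps) for name, ps in answers.items()}

# The answer with the highest combined probability is marked as best.
best_answer = max(scores, key=scores.get)
print(scores, best_answer)
```

Even though every individual claim is equally plausible, the longer chain ends up with the lower combined probability, so the shorter argument wins. If the claims are not independent, a plain product like this would misstate the result.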
However, my only qualification in this area is my passion and I am hitting a wall with some basic questions. I am not sure if this is the correct place to get help with these. If not, please direct me somewhere else and I will remove the post.
The arbitrarily chosen example claim I am working with is whether “Alexander the Great existed”. This has the useful properties of 1: an expected outcome (that he existed - although, perhaps my problem is that this is not the case!) and 2: it relies heavily on probability as there is little solid evidence.
One popular claim is that coins were minted with his face on them. I want to use Bayes to find how likely a face appearing on a coin is for someone who existed. As I understand it, there should be 4 combinations:
The first issue is that there are infinite people who never existed and did not have a coin made. If I narrow it to historic figures who turned out not to exist and did not have a coin made it becomes possible but also becomes subjective as to whether someone actually thought they existed. For example, did people believe the Minotaur existed?
Perhaps I should choose another filter instead of historic figure, like humans that existed. But picking and choosing the category is again so subjective. Someone may also argue that woman inequality back then was so great that the data should only look at men, as a woman’s chance of being portrayed on a coin was skewed in a way that isn’t applicable to men.
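To show how much the choice of category matters, here is a small Python sketch of the Bayes calculation. It assumes the four combinations are the cells of a two-by-two table (existed or not, coin or no coin), and every count in it is invented purely for illustration, not taken from any historical data.

```python
def posterior_existed(prior_existed, p_coin_given_existed, p_coin_given_not_existed):
    """Bayes' theorem: P(existed | coin)."""
    p_coin = (p_coin_given_existed * prior_existed
              + p_coin_given_not_existed * (1 - prior_existed))
    return p_coin_given_existed * prior_existed / p_coin

def from_counts(existed_coin, existed_no_coin, not_existed_coin, not_existed_no_coin):
    """Estimate P(existed | coin) from the four counts of a chosen reference class."""
    total = existed_coin + existed_no_coin + not_existed_coin + not_existed_no_coin
    prior = (existed_coin + existed_no_coin) / total
    p_coin_existed = existed_coin / (existed_coin + existed_no_coin)
    p_coin_not_existed = not_existed_coin / (not_existed_coin + not_existed_no_coin)
    return posterior_existed(prior, p_coin_existed, p_coin_not_existed)

# Hypothetical narrow class: figures whom historians seriously debated.
narrow = from_counts(existed_coin=40, existed_no_coin=60,
                     not_existed_coin=5, not_existed_no_coin=95)

# Hypothetical broad class: the same real people, plus many mythical
# figures, some of whom also appear on coins.
broad = from_counts(existed_coin=40, existed_no_coin=60,
                    not_existed_coin=50, not_existed_no_coin=9850)

print(round(narrow, 3), round(broad, 3))
```

The evidence and the formula are the same in both runs; only the reference class changes, and the answer moves from roughly 0.89 to roughly 0.44. Widening the class toward "everyone who never existed" pushes the last count, and with it the prior, in whatever direction the person choosing the class happens to pick.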
I hope I have successfully communicated the problem I am grappling with and what I want to use it for. If not, please ask for clarifications. A friend in academia suggested that this touches on a problem with Bayes priors that has not been settled. If that is the case, are there any suggested resources for a novice with limited free time to start exploring the issue? References to books or other online resources, or even somewhere else I should be posting this kind of question, would all be gratefully received. Not to mention a direct answer in the comments!