Theoretical Computer Science MSc student at the University of [Redacted] in the United Kingdom.
I'm an aspiring alignment theorist; my research vibes are descriptive formal theories of intelligent systems (and their safety properties) with a bias towards constructive theories.
I think it's important that our theories of intelligent systems remain rooted in the characteristics of real world intelligent systems; we cannot develop adequate theory from the null string as input.
I have not read all of them!
My current position is basically:
Actually, I'm less confident and now unsure.
Harth's framing was presented as an argument re: the canonical Sleeping Beauty problem.
And the question I need to answer is: "should I accept Harth's frame?"
I am at least convinced that it is genuinely a question about how we define probability.
There is still a disconnect though.
While I agree with the frequentist answer, it's not clear to me how to backpropagate this in a Bayesian framework.
Suppose I treat myself as identical to all other agents in the reference class.
I know that my reference class will do better if we answer "tails" when asked about the outcome of the coin toss.
But it's not obvious to me that there is anything to update from when trying to do a Bayesian probability calculation.
That there are many more observers in the tails world doesn't seem to me to alter any of these probabilities:
- P(waking up)
- P(being asked questions)
- P(...)
By stipulation my observational evidence is the same in both cases.
And I am not compelled by assuming I should be randomly sampled from all observers.
That there are many more versions of me in this other world does not by itself seem to raise the probability of me witnessing the observational evidence, since by stipulation all versions of me witness the same evidence.
I'm curious how your conception of probability accounts for logical uncertainty?
So in this case, I agree that if this experiment is repeated many times and every Sleeping Beauty version created answers tails, the reference class of Sleeping Beauty agents would have many more correct answers than if the experiment is repeated many times and every Sleeping Beauty created answers heads.
I think there's something tangible here and I should reflect on it.
I separately think though that if the actual outcome of each coin flip was recorded, there would be a roughly equal distribution between heads and tails.
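The two counts above can be checked with a small simulation. This is only a sketch, and it assumes the canonical protocol (one awakening on heads, two on tails); the function name `simulate` and its parameters are made up for illustration.

```python
import random


def simulate(n_experiments, seed=0):
    """Run the experiment many times and return two frequencies:
    the fraction of coin flips that landed tails, and the fraction
    of awakenings that happened in a tails run."""
    rng = random.Random(seed)
    tails_flips = 0
    awakenings = 0
    tails_awakenings = 0
    for _ in range(n_experiments):
        is_tails = rng.random() < 0.5
        # Assumption: canonical setup, 1 awakening on heads, 2 on tails.
        wakes = 2 if is_tails else 1
        awakenings += wakes
        if is_tails:
            tails_flips += 1
            tails_awakenings += wakes
    return tails_flips / n_experiments, tails_awakenings / awakenings


per_flip, per_awakening = simulate(100_000)
print(f"tails per recorded flip: {per_flip:.3f}")
print(f"tails per awakening:     {per_awakening:.3f}")
```

Counting recorded flips should come out near 1/2, while counting awakenings should come out near 2/3, so which number you get depends on what is being counted rather than on the coin.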
And when I was thinking through the question before it was always about trying to answer a question regarding the actual outcome of the coin flip and not what strategy maximises monetary payoffs under even bets.
While I do think that betting odds aren't convincing re: actual probabilities, because you can just have asymmetric payoffs on equally probable, mutually exclusive and jointly exhaustive events, the "reference class of agents being asked this question" seems like a more robust rebuttal.
I want to take some time to think on this.
Strong upvoted because this argument genuinely makes me think I might be wrong here.
Much less confident now, and mostly confused.
I mean I am not convinced by the claim that Bob is wrong.
Bob's prior probability is 50%. Bob sees no new evidence to update this prior so the probability remains at 50%.
I don't favour an objective notion of probabilities. From my OP:
2. Bayesian Reasoning
- Probability is a property of the map (agent's beliefs), not the territory (environment).
- For an observation O to be evidence for a hypothesis H, P(O|H) must be > P(O|¬H).
- The wake-up event is equally likely under both Heads and Tails scenarios, thus provides no new information to update priors.
- The original 50/50 probability should remain unchanged after waking up.
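The update rule in the list above can be written out as a short calculation. This is a minimal sketch of Bayes' rule for a binary hypothesis; the function name `posterior` and the example likelihoods are illustrative assumptions, not something from the original post.

```python
from fractions import Fraction


def posterior(prior_h, p_obs_given_h, p_obs_given_not_h):
    """Bayes' rule for a binary hypothesis H given an observation O."""
    numerator = prior_h * p_obs_given_h
    evidence = numerator + (1 - prior_h) * p_obs_given_not_h
    return numerator / evidence


prior = Fraction(1, 2)

# Equal likelihoods: waking up is certain under both Heads and Tails,
# so the observation leaves the prior where it was.
unchanged = posterior(prior, Fraction(1), Fraction(1))

# Hypothetical contrast case: an observation three times likelier
# under H than under not-H does move the prior.
shifted = posterior(prior, Fraction(3, 4), Fraction(1, 4))

print(unchanged, shifted)
```

With equal likelihoods the likelihood ratio is 1 and the posterior equals the prior, which is the sense in which the wake-up event carries no information on this view.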
So I am unconvinced by your thought experiments. Observing nothing new, I think the observer's priors should remain unchanged.
I feel like I'm not getting the distinction you're trying to draw out with your analogy.
I mean, I think the "gamble her money" interpretation is just a different question. It doesn't feel to me like a different notion of what probability means, just a bet on a fair coin with asymmetric payoffs.
The second question feels closer to an accurate interpretation of what probability means.
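The "fair coin, asymmetric payoffs" reading of the gambling question can be sketched numerically. This assumes the canonical setup where an even-money bet is settled once on heads and twice on tails; the function name `expected_payoff` and the unit stake are hypothetical.

```python
from fractions import Fraction


def expected_payoff(guess, p_tails=Fraction(1, 2), stake=1):
    """Expected total winnings per experiment for an agent who makes
    the same even-money guess at every awakening."""
    # Assumption: one bet is settled on heads, two on tails.
    bets_settled = {"heads": 1, "tails": 2}
    probability = {"heads": 1 - p_tails, "tails": p_tails}
    total = Fraction(0)
    for outcome in ("heads", "tails"):
        per_bet = stake if guess == outcome else -stake
        total += probability[outcome] * bets_settled[outcome] * per_bet
    return total


print(expected_payoff("tails"))  # coin is fair, yet guessing tails pays
print(expected_payoff("heads"))
```

Under these assumptions guessing tails has expected payoff +1/2 per experiment and guessing heads has -1/2, even though `p_tails` is held at 1/2 throughout. The asymmetry comes from how many bets get settled, not from the coin.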
i.e. if each forecaster has an first-order belief , and is your second-order belief about which forecaster is correct, then should be your first-order belief about the election.
I think there might be a typo here. Did you instead mean to write: "" for the second order beliefs about the forecasters?
The claim is that, given the presence of differential adversarial examples, the optimisation process would adjust the parameters of the model such that its optimisation target is the base goal.
That was it, thanks!
Yeah, since posting this question:
I had a firm notion in mind for what I thought probability meant. But Rafael Harth's answer really shook my confidence that the notion I had in mind was the right notion of probability for the question.