JesseClifton - LessWrong

Notes on Occam via Solomonoff vs. hierarchical Bayes

Thanks!

where you just take the uniform prior over all programs of length T, then let T to infinity

Sure, but because of language-dependence I'm not sure why we would want to apply the principle of indifference at this level. (Note that the quote says "if we can find a privileged parameterization to which we can apply the principle".) I tend to think you should apply the POI at the "explanatorily basic" level (see here, here), which might be the properties of fundamental objects in the ontology (e.g., position and momentum in Newtonian mechanics, maybe?). Otherwise I think you run into unsatisfying-to-me arbitrariness.

In what sense does the universe have structure, such that a-priori bit strings of observations about the universe ought to be treated by us as more than members of the set of all possible bit strings?

Right, I think this is the kind of question you're not going to be able to answer without thinking about the kinds of ontological considerations pointed to here.

afaik that's false

Thanks, will change.

In Defense of Open-Minded UDT

JesseClifton2moΩ112

FWIW, in our original formulation of open-minded updatelessness, the idea was about revising the prior via either

- becoming aware of new possibilities (thereby changing the support of the prior);
- "philosophy", i.e., reflecting on principles for prior-setting. (Which would allow for the use of an "objective" prior, if we ended up thinking there was such a thing.)

(Cool post, Abraham, thanks!)

Updatelessness doesn't solve most problems

JesseClifton2moΩ110

Open-Minded Updatelessness doesn’t push back on this fundamental trade-off. Instead, given the trade-off persists, it explores which kinds and shapes of partial commitments seem more robustly net-positive from the perspective of our current game-theoretic knowledge (that is, our current prior)

I would distinguish between two cases:

1. We’re aware of specific reasons why continuing to follow an OMU policy could be harmful (e.g., specific reasons other agents might punish us for doing so). In such cases, if those hypotheses have high-enough weight, OMU as a normative criterion can itself recommend not continuing to follow the OMU policy. So, in this sense I agree with the quote, but this doesn’t undermine OMU as a normative criterion.
2. We’re not aware of any specific reasons why continuing to follow the OMU policy could be harmful. In that case, it seems arbitrary to form beliefs according to which it’s net-negative to continue following OMU, and so it seems reasonable to continue following OMU (I’m not sure what else to do).

Sorry for only now commenting...

Why I’m not a Bayesian

JesseClifton4mo40

This paper discusses two semantics for Bayesian inference in the case where the hypotheses under consideration are known to be false.

Verisimilitude: p(h) = the probability that h is closest to the truth [according to some measure of closeness-to-truth] among hypotheses under consideration
Counterfactual: p(h) = the probability of h given the (false) supposition that one of the hypotheses under consideration is true

In any case, it’s unclear what motivates making decisions by maximizing expected value against such probabilities, which seems like a problem for boundedly rational decision-making.

Winning isn't enough

JesseClifton5mo*43

mildly disapprove of words like "a widely-used strategy"

The text says “A widely-used strategy for arguing for norms of rationality involves avoiding dominated strategies”, which is true* and something we thought would be familiar to everyone who is interested in these topics. For example, see the discussion of Dutch book arguments in the SEP entry on Bayesianism and all of the LessWrong discussion on money pump/dominance/sure loss arguments (e.g., see all of the references in and comments on this post). But fair enough, it would have been better to include citations.

"we often encounter claims"

We did include (potential) examples in this case. Also, similarly to the above, I would think that encountering claims like “we ought to use some heuristic because it has worked well in the past” is commonplace among readers, so we didn’t see the need to provide extensive evidence.

*Granted, we are using “dominated strategy” in the wide sense of “strategy that you are certain is worse than something else”, which glosses over technical points like the distinction between dominated strategy and sure loss.

D0TheMath's Shortform

JesseClifton1y*20

What principles? It doesn’t seem like there’s anything more at work here than “Humans sometimes become more confident that other humans will follow through on their commitments if they, e.g., repeatedly say they’ll follow through”. I don’t see what that has to do with FDT, more than any other decision theory.

If the idea is that Mao’s forming the intention is supposed to have logically-caused his adversaries to update on his intention, that just seems wrong (see this section of the mentioned post).

(Separately, I’m not sure what this has to do with not giving in to threats in particular, as opposed to preemptive commitment in general. Why were Mao’s adversaries not able to coerce him by committing to nuclear threats, using the same principles? See this section of the mentioned post.)

D0TheMath's Shortform

JesseClifton1y50

I don't think FDT has anything to do with purely causal interactions. Insofar as threats were actually deterred here, this can be understood in standard causal game theory terms. (I.e., you claim in a convincing manner that you won't give in -> People assign high probability to you being serious -> Standard EV calculation says not to commit to a threat against you.) Also see this post.
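The causal chain above can be sketched as a simple expected-value comparison from the would-be threatener's side. This is only an illustration under stated assumptions: the function names and payoff numbers are hypothetical, not threatening is assumed to be worth 0, and a committed threat is assumed to be carried out (at a cost) whenever the target resists.

```python
from fractions import Fraction

def threat_ev(p_target_resists, gain_if_target_yields, cost_if_carried_out):
    """Threatener's expected value of committing to a threat.

    Assumptions (hypothetical): not threatening is worth 0, and a
    committed threat is executed at a cost whenever the target resists.
    """
    p = Fraction(p_target_resists)
    return (1 - p) * gain_if_target_yields - p * cost_if_carried_out

def resistance_threshold(gain_if_target_yields, cost_if_carried_out):
    """Probability of resistance at which threatening exactly breaks even."""
    return Fraction(gain_if_target_yields,
                    gain_if_target_yields + cost_if_carried_out)

# Hypothetical payoffs: gain 10 if the target yields, lose 5 if the threat is executed.
print(threat_ev(Fraction(9, 10), 10, 5))  # EV when resistance is judged 90% likely
print(resistance_threshold(10, 5))        # break-even probability of resistance
```

With these made-up numbers, any credence above the break-even point that the target will resist makes committing to the threat a losing bet, which is all the "standard EV calculation" step needs.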

SIA > SSA, part 4: In defense of the presumptuous philosopher

JesseClifton1y20

Awesome sequence!

I wish that discussions of anthropics were clearer about metaphysical commitments around personal identity and possibility. I appreciated your discussions of this, e.g., in Section XV. I agree with you, though, that it is quite unclear what justifies the picture “I am sampled from the set of all possible people-in-my-epistemic-situation (weighted by probability of existence)”. I take it the view of personal identity at work here is something like “‘I’ am just a sequence of experiences S”, and so I know I am one of the sequences of experiences consistent with my current epistemic situation E. But the straightforward Bayesian way of thinking about this would seem to be: “I am sampled from all of the sequences of experiences S consistent with E, in the actual world”.

(Compare with: I draw a ball from an urn, which either contains (A) 10 balls or (B) 100 balls, 50% chance each. I don’t say “I am indifferent between the 110 possible balls I could’ve drawn, and therefore it’s 10:1 that this ball came from (B).” I say that with 50% probability the ball came from (A) and with 50% probability it came from (B). Of course, there may be some principled difference between this and how you want to think about anthropics, but I don’t see what it is yet.)
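A minimal sketch of the two calculations in the urn comparison, using the urn sizes and 50/50 prior from the example. The function names are hypothetical. The key modelling assumption is that "I drew some ball" is certain under either urn, so it has likelihood 1 under both hypotheses.

```python
from fractions import Fraction

def bayes_over_urns(urn_sizes, urn_priors):
    """Standard Bayes: drawing *a* ball is certain under each urn
    (likelihood 1), so the posterior over urns equals the prior."""
    joint = {urn: urn_priors[urn] * Fraction(1) for urn in urn_sizes}
    total = sum(joint.values())
    return {urn: weight / total for urn, weight in joint.items()}

def indifference_over_balls(urn_sizes):
    """The rejected reasoning: spread credence evenly over all
    possible balls, then read off each urn's share."""
    total_balls = sum(urn_sizes.values())
    return {urn: Fraction(n, total_balls) for urn, n in urn_sizes.items()}

sizes = {"A": 10, "B": 100}
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}

print(bayes_over_urns(sizes, priors))   # credence stays 1/2 for each urn
print(indifference_over_balls(sizes))   # 10/11 on B, i.e. odds of 10:1
```

The second function is the analogue of weighting possible people by how many there are in each world; the first is the analogue of conditioning only on the actual world containing an observation like mine.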

This is just minimum reference class SSA, which you reject because of its verdict in God’s Coin Toss with Equal Numbers. I agree that this result is counterintuitive. But I think it becomes much more acceptable if (1) we get clear about the notion of personal identity at work and (2) we try to stick with standard Bayesianism. mrcSSA also avoids many of the apparent problems you list for SSA. Overall I think mrcSSA's answer to God's Coin Toss with Equal Numbers is a good candidate for a "good bullet" =).

(Cf. Builes (2020), part 2, who argues that if you have a deflationary view of personal identity, you should use (something that looks extensionally equivalent to) mrcSSA.)

Open-minded updatelessness

JesseClifton1y*33

But it's true that if you had been aware from the beginning that you were going to be threatened, you would have wanted to give in.

To clarify, I didn’t mean that if you were sure your counterpart would Dare from the beginning, you would’ve wanted to Swerve. I meant that if you were aware of the possibility of Crazy types from the beginning, you would’ve wanted to Swerve. (In this example.)

I can’t tell if you think that (1) being willing to Swerve in the case that you’re fully aware from the outset (because you might have a sufficiently high prior on Crazy agents) is a problem. Or if you think (2) this somehow only becomes a problem in the open-minded setting (even though the EA-OMU agent is acting according to the exact same prior as they would've if they started out fully aware, once their awareness grows).

(The comment about regular ol exploitability suggests (1)? But does that mean you think agents shouldn't ever Swerve, even given arbitrarily high prior mass on Crazy types?)

What if anything does this buy us?

In the example in this post, the ex ante utility-maximizing action for a fully aware agent is to Swerve. The agent starts out not fully aware, and so doesn’t Swerve unless they are open-minded. So it buys us being able to take actions that are ex ante optimal for our fully aware selves when we otherwise wouldn’t have due to unawareness. And being ex ante optimal from the fully aware perspective seems to me preferable to being, e.g., ex ante optimal from the less-aware perspective.
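To illustrate the contrast between the two ex ante perspectives, here is a toy Chicken calculation. Everything specific in it is an assumption made for illustration and is not taken from the post: the payoff numbers, the 50% prior mass on Crazy types, and the behaviour of Normal types (they observe our commitment and best-respond).

```python
from fractions import Fraction

# Hypothetical payoffs to our agent: (our action, counterpart's action) -> utility.
PAYOFF = {
    ("Dare", "Swerve"): 1,
    ("Dare", "Dare"): -10,    # crash
    ("Swerve", "Dare"): -1,
    ("Swerve", "Swerve"): 0,
}

def counterpart_action(counterpart_type, our_action):
    """Assumed behaviour: Crazy types always Dare; Normal types
    best-respond to our commitment (Swerve against Dare, Dare against Swerve)."""
    if counterpart_type == "Crazy":
        return "Dare"
    return "Swerve" if our_action == "Dare" else "Dare"

def ex_ante_eu(our_action, prior):
    """Expected utility of committing to our_action under a prior over types."""
    return sum(p * PAYOFF[(our_action, counterpart_action(t, our_action))]
               for t, p in prior.items())

def best_action(prior):
    return max(("Dare", "Swerve"), key=lambda a: ex_ante_eu(a, prior))

# Less-aware prior: Crazy types are not even in the support.
less_aware_prior = {"Normal": Fraction(1)}
# Fully aware prior: hypothetical 50% mass on Crazy types.
fully_aware_prior = {"Normal": Fraction(1, 2), "Crazy": Fraction(1, 2)}

print(best_action(less_aware_prior))    # Dare
print(best_action(fully_aware_prior))   # Swerve
```

In this toy setup the less-aware prior recommends Dare and the fully aware prior recommends Swerve, so an agent that evaluates policies against the fully aware prior once its awareness grows ends up acting as the fully aware agent would have.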

More generally, we are worried that agents will make commitments based on “dumb” priors (because they think it’s dangerous to think more and make their prior less dumb). And EA-OMU says: No, you can think more (in the sense of becoming aware of more possibilities), because the right notion of ex ante optimality is ex ante optimality with respect to your fully-aware prior. That's what it buys us.

And revising priors based on awareness growth differs from updating on empirical evidence because it only gives other agents incentives to make you aware of things you would’ve wanted to be aware of ex ante.

they need to gradually build up more hypotheses and more coherent priors over time

I’m not sure I understand—isn't this exactly what open-mindedness is trying to (partially) address? I.e., how to be updateless when you need to build up hypotheses (and, as mentioned briefly, better principles for specifying priors).

Open-minded updatelessness

JesseClifton1y21

If I understand correctly, you’re making the point that we discuss in the section on exploitability. It’s not clear to me yet why this kind of exploitability is objectionable. After all, had the agent in your example been aware of the possibility of crazy agents from the start, they would have wanted to swerve, and non-crazy agents would want to take advantage of this. So I don’t see how the situation is any worse than if the agents were making decisions under complete awareness.
