By 'mean of the utility function', I meant the mean of the utility function over all possible universes rather than just valid universes. The validity constraint forces the expected utility to diverge from the mean of the utility function - it must for the agent to make any useful decisions!
Okay. In that case there are two reasons that mugger hypotheses are still important. First, the unupdated expected utility is not necessarily anywhere near the naive tail-less expected utility. Second, while the central limit theorem shows that updating on observations is unlikely to shift the utility of the tails by much relative to the bounds on the utility function, the shift can still be large relative to the actual utility.
The way I'm approaching this is to ask whether most of the expected utility comes from high-probability events or from low-probability ones.
My entire post concerns the subset of universes with probabilities approaching 1/infinity, corresponding to programs with length going to infinity. The high-probability scenarios (shorter-program universes) don't matter in mugger scenarios: we categorically assume they all have boring, extremely low utilities (the mugger is joking/lying/crazy).
The utility of the likely scenarios is essential here. If we don't take into account the utility of $5, we have no obvious reason not to pay the mugger. What matters is the ratio between the utility differences of the various actions due to the likely hypotheses and those due to the high-utility hypotheses.
Your observations retain a hypothesis of length N with some probability P(T|N). I don't see why this would depend that strongly on the value of N.
In AIXI-models, hypothesis acceptance is not probabilistic; it is completely binary: a universe program either perfectly fits the observation history or it does not. If even 1 bit is off, the program is ignored.
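A minimal sketch of that binary validity check, with toy strings standing in for program outputs (real AIXI quantifies over Turing-machine programs, not string literals):

```python
def is_valid(program_output: str, observations: str) -> bool:
    """A universe program survives only if its output reproduces
    the observation history exactly, bit for bit."""
    return program_output.startswith(observations)

# Toy run: three candidate universe outputs vs. the history "0110".
candidates = ["0110101", "0111000", "0110110"]
print([c for c in candidates if is_valid(c, "0110")])
# ['0110101', '0110110'] -- the middle program is off by one bit and ignored
```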
That is a probability (well really a frequency) taken over all hypotheses of length N (or L if you prefer).
It's unfortunate that I started using N for program length in my prior post; that was a mistake - L was the term for program length in the EU equation. L (program length) matters because of the Solomonoff prior's complexity penalty: 2^-L.
The space of valid programs of length L, for any L, is simply all possible programs of length L, which is expected to be a set of around 2^L in size.
Well, an O(1) factor less, since otherwise our prior measure would diverge, but you don't have to write it explicitly; when working with Kolmogorov complexity, you expect everything to be within a constant factor.
Now consider O:{1}. We have cut out exactly half of the program space. O:{11} cuts out 3/4ths of the tegmark, and in general an observation history O filters the universe space down to 2^-length(O) of its original size, removing a 1 - 2^-length(O) fraction of possible universes - but there are an infinite number of total universes.
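As a brute-force illustration of that counting argument - a toy model in which universes are raw output bitstrings, which is precisely the simplification challenged below:

```python
from itertools import product

# Toy model: universes are all output bitstrings of length 10; an
# observation history O retains exactly those that begin with O.
universes = ["".join(bits) for bits in product("01", repeat=10)]

for O in ["", "1", "11", "111"]:
    survivors = sum(u.startswith(O) for u in universes)
    print(f"O={O!r:>5}  survivors={survivors:4}  fraction={survivors / len(universes)}")
# Each observed bit halves the surviving fraction: 1.0, 0.5, 0.25, 0.125.
```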
No, not quite. Observations are not perfectly informative. If someone wanted to optimally communicate their observations, they would use such a system, but a real observation will not be perfectly optimized to rule out half the hypothesis space. We are reading bits from the output of the program, not its source code!
However, for length(P) > length(O) + C, for some small C, valid programs are absolutely guaranteed. Specifically, for some constant C there are programs which simply directly encode random strings which happen to align with O. This set of programs corresponds to 'chaos'.
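The constant-overhead construction can be made concrete. Here Python stands in for the universal machine; the particular encoding is illustrative, but the point survives any choice of language:

```python
def literal_program(O: str) -> str:
    """A 'chaos' program: reproduce O by directly encoding it. Its length
    is len(O) plus a fixed syntactic overhead C, no matter what O is."""
    return f"print({O!r}, end='')"

O = "011010"
src = literal_program(O)
print(len(src) - len(O))  # the constant C: identical for every bitstring O
exec(src)                 # prints 011010 -- a perfect match with the observations
```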
I don't think this set behaves how you think it behaves. A 1 - 2^-length(O) fraction of this set will be ruled out, but there are more programs, with more structure than "print this string", that don't get falsified: they have just enough structure to reproduce our observations (about K(O) bits) and they use the leftover bits to encode various unobservable things that might have high utility.
Looking at your conclusions, you can actually replace l(O) with K(O) and everything qualitatively survives.
The utility of the likely scenarios is essential here. If we don't take into account the utility of $5, we have no obvious reason not to pay the mugger.
No, not necessarily. The cost could be arbitrarily small: the mugger could ask you merely to look at him for a nanosecond, and even that almost costless action could still not be worthwhile.
If AIXI can not find a full observation history O matching program P which generates a future we would describe as (mugger really does have matrix powers and causes massive negative reward) under the constraints that leng...
Edit: Added clarification of the limit in response to gwern's comment.
For recent examples, see this post by MileyCyrus, or this post from XiXiDu (where I reply with unbounded utility functions, which is not the general solution).
I encountered this issue again while reading through a fascinating discussion thread on John Baez's blog from earlier this year where Greg Egan jumped in with a "Yudkowsky/Bostrom" criticism:
In short, Egan is indirectly accusing SIAI and FHI of Pascal's Mugging (among other things): something serious indeed. Egan in particular presents the following (presumably Yudkowsky) quote as evidence:
Yudkowsky responds with his Pascal's Wager Fallacy Fallacy, and points out that in fact he agrees there is no case for investing in defense against highly improbable existential risks:
The rest of the thread makes for an entertaining read, but the takeaway I'd like to focus on is the original source of Egan's criticism: the apparent domination of immensely unlikely scenarios of immensely high utility.
It occurred to me that the expected value of any action - properly summed over subsets of integrated futures - necessarily converges to zero as the probability of the considered subsets goes to zero. Critically, this convergence occurs for *all* utility functions, as it does not depend on any particular utility assignments. Alas, LW is vast enough that there may be little new left under the sun: in researching this idea, I encountered an earlier form of it in a post by SilasBart here, as well as some earlier attempts by RichardKennaway, Komponisto, and jimrandomh.
Now that we've covered the background, I'll jump to the principle:
The Infinitesimal Probability Utility Convergence Principle (IPUP): For any action A, utility function U, and a subset of possible post-action futures F, EU(F) -> 0 as p(F) -> 0.
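In symbols (a restatement of the definition of expected utility over a future-set, not an additional assumption):

```latex
\mathrm{EU}(F) \;=\; \sum_{f \in F} p(f)\, U(f)
\;\longrightarrow\; 0
\qquad \text{as} \qquad
p(F) \;=\; \sum_{f \in F} p(f) \;\to\; 0 .
```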
In Pascal's Mugging scenarios we are considering possible scenarios (futures) that have some low probability. It is important to remember that rational agents compute expected reward over all possible futures, not just the one scenario we may be focusing on.
The principle can be formalized in the theoretical context of idealized agents approaching omniscience, running on computers approaching infinite power.
The AIXI formalization provides a simple mathematical model of such agents. Its single-line equation has a concise English summary:
AIXI is just a mathematical equation. We must be very careful in mapping it to abstract scenarios lest we lose much in translation. It is best viewed as a family of agent-models: the reward observations it seeks to maximize could be anything.
When one ponders "What would AIXI/Omega do?", there are a couple of key points to keep in mind:
In other words, the perfectly rational agent considers everything that could possibly happen as a consequence of its action in every possible universe it could be in, weighted by an exponential penalty against high-complexity universes.
Here is a sketch of how the limit convergence (the IPUP above) can be derived. When considering a possible action A, such as giving $5 to a Pascal Mugger, an optimal agent considers all possible dependent futures across all possible physics-universes. As we advance into scenarios of infinitesimal probability, we climb the complexity ladder into increasingly chaotic universes featuring completely random rewards that approach positive/negative infinity. In this regime of infinitesimal probability, causality itself breaks down completely and the expected reward of any action goes to zero.
The convergence principle can be derived from the program-length prior 2^-l(q). An agent which has accumulated P perception bits so far can fully explain those perceptions with completely random programs of length P, so 2^-P forms a probability floor below which the agent's perceptions start becoming irrelevant and chaotic, non-causal physics dominate. Chaos should dominate expected reward for actions where p(A) << 2^-P.
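A toy placement of that floor (the numbers are illustrative; only the comparison of prior weights matters):

```python
# An agent with P perception bits can always be 'explained' by noise
# programs that hardcode those bits: collective prior weight about 2**-P.
P = 100                    # perception bits accumulated so far
noise_floor = 2.0 ** -P    # prior weight of the hardcoded-noise explanations

# A hypothesis we assign probability far below this floor sits in the
# regime where chaotic, non-causal universes carry comparable weight.
p_mugger = 2.0 ** -500     # toy probability for "the mugger has matrix powers"
print(p_mugger < noise_floor)   # True: deep inside the chaos regime
```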
Thinking as limited humans, we impose abstractions and collapse all futures that are extremely similar (to us). All the tiny random quantum-dependent variations of the particular future corresponding to "giving the Mugger $5" we collapse into a single set of futures, to which we assign a probability based on counting the subinstances in that set as a fraction of the whole.
AIXI does not do this: it actually computes each individual future path.
But since we can't hope to think that way, we have to think in terms of probability categorizations. Fine. Imagine collapsing any futures that are sufficiently indistinguishable that humans would consider them identical: described by the same natural language. We then get subsets of futures to which we assign probabilities as relative size measures.
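Here is a sketch of that collapsing step; the descriptions and weights are made up purely for illustration:

```python
from collections import defaultdict

# Micro-futures: (human-level description, probability of this exact variation).
micro_futures = [
    ("mugger is lying; you lose $5",  0.30),
    ("mugger is lying; you lose $5",  0.25),  # quantum-level variation, same story
    ("mugger is joking; you lose $5", 0.40),
    ("mugger has matrix powers",      1e-40),
]

future_sets = defaultdict(float)
for description, p in micro_futures:
    future_sets[description] += p  # a set's probability is its relative measure

# View the collapsed future-sets, largest measure first.
for description, p in sorted(future_sets.items(), key=lambda kv: -kv[1]):
    print(f"{p:.2e}  {description}")
```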
Now consider ranking all of those future-sets in decreasing probability order. Most of the early list is dominated by "Mugger is (joking/lying/crazy/etc)". Farther down the list you get into scenarios where we do live in a multi-level Simulation (AIXI only ever considers itself to be in some simulation), but the Mugger is still (joking/lying/crazy/etc).
By the time you get down the list to scenarios where the Mugger says "Or else I will use my magic powers from outside the Matrix to run a Turing machine that simulates and kills 3^^^^3 people" and what the Mugger says actually happens, we are almost certainly down in infinitesimal-probability land.
Infinitesimal-probability land is a weird place. It is a regime where the physics we commonly accept is wrong - which is to say, simply, that the exponential complexity penalty no longer rules out ultra-complex universes. It is dominated by chaos: universes of every possible fancy, where nothing is what it seems, where everything you ever thought is completely wrong, where there is no causality, etc.
At the complete limit of improbability, we get universes where our entire observation history is completely random - generated by programs more complex than our observations. You give the mugger $5 and the universe simply dissolves into white noise and nothing happens (or god appears and gives you infinite heaven, or infinite hell, or the speed of light goes to zero, or a black hole forms near your nose, or the Mugger turns into jellybeans, and so on - an infinite number of stories, over which the net reward summation necessarily collapses to zero).
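A Monte Carlo sketch of that collapse, assuming (as above) that the chaotic stories assign rewards symmetrically at random:

```python
import random

random.seed(0)

def mean_chaotic_reward(n: int) -> float:
    """Mean reward over n 'white noise' universes, each assigning the
    action an essentially arbitrary reward drawn symmetrically."""
    return sum(random.uniform(-1.0, 1.0) for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, mean_chaotic_reward(n))
# The mean shrinks toward 0 as the number of chaotic stories grows:
# their contributions to the expected reward cancel out.
```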
Remember, AIXI doesn't treat the mugger's words as 'evidence'; they are simply observations. In the more complex universes they are completely devoid of meaning, as causality itself collapses.