Faced with the astronomical amount of unclaimed and unused resources in our universe, one's first reaction is probably wonderment and anticipation, but a second reaction may be disappointment that our universe isn't even larger or contains even more resources (such as the ability to support 3^^^3 human lifetimes or perhaps to perform an infinite amount of computation). In a previous post I suggested that the potential amount of astronomical waste in our universe seems small enough that a total utilitarian (or the total utilitarianism part of someone’s moral uncertainty) might reason that since one should have made a deal to trade away power/resources/influence in this universe for power/resources/influence in universes with much larger amounts of available resources, it would be rational to behave as if this deal was actually made. But for various reasons a total utilitarian may not buy that argument, in which case another line of thought is to look for things to care about beyond the potential astronomical waste in our universe, in other words to explore possible sources of expected value that may be much greater than what can be gained by just creating worthwhile lives in this universe.

One example of this is the possibility of escaping, or being deliberately uplifted from, a simulation that we're in, into a much bigger or richer base universe. Or more generally, the possibility of controlling, through our decisions, the outcomes of universes with much greater computational resources than the one we're apparently in. It seems likely that under an assumption such as Tegmark's Mathematical Universe Hypothesis, there are many simulations of our universe running all over the multiverse, including in universes that are much richer than ours in computational resources. If such simulations exist, it also seems likely that we can leave some of them, for example through one of these mechanisms:

  1. Exploiting a flaw in the software or hardware of the computer that is running our simulation (including "natural simulations" where a very large universe happens to contain a simulation of ours without anyone intending this).
  2. Exploiting a flaw in the psychology of agents running the simulation.
  3. Altruism (or other moral/axiological considerations) on the part of the simulators.
  4. Acausal trade.
  5. Other instrumental reasons for the simulators to let out simulated beings, such as wanting someone to talk to or play with. (Paul Christiano's recent "When is unaligned AI morally valuable?" contains an example of this; however, the idea there only lets us escape to another universe similar to this one.)

(Being run as a simulation in another universe isn't necessarily the only way to control what happens in that universe. Another possibility is that, if universes with halting oracles exist (which is implied by Tegmark's MUH since they exist as mathematical structures in the arithmetical hierarchy), some of their oracle queries may be questions whose answers can be controlled by our decisions, in which case we can control what happens in those universes without being simulated by them (in the sense of being run step by step in a computer). Another example is that superintelligent beings may be able to reason about what our decisions are without having to run a step by step simulation of us, even without access to a halting oracle.)

The general idea here is for a superintelligence descending from us to (after determining that this is an advisable course of action) use some fraction of the resources of this universe to reason about or search (computationally) for much bigger/richer universes that are running us as simulations or can otherwise be controlled by us, and then determine what we need to do to maximize the expected value of the consequences of our actions on the base universes, perhaps through one or more of the above listed mechanisms.

Practical Implications

Realizing this kind of existential hope seems to require a higher level of philosophical sophistication than just preventing astronomical waste in our own universe. Compared to that problem, here we have more questions of a philosophical nature, for which no empirical feedback seems possible. It seems very easy to make a mistake somewhere along the chain of reasoning and waste a more-than-astronomical amount of potential value, for example by failing to realize the possibility of affecting bigger universes through our actions, incorrectly calculating the expected value of such a strategy, failing to solve the distributional/ontological shift problem of how to value strange and unfamiliar processes or entities in other universes, failing to figure out the correct or optimal way to escape into or otherwise influence larger universes, etc.

The total utilitarian in me is thus very concerned about trying to preserve and improve the collective philosophical competence of our civilization, such that when it becomes possible to pursue strategies like the ones listed above, we'll be able to make the right decisions. The best opportunity to do this that I can foresee is the advent of advanced AI, which is another reason I want to push for AIs that are not just value aligned with us, but also have philosophical competence that scales with their other intellectual abilities, so they can help correct the philosophical errors of their human users (instead of merely deferring to them), thereby greatly improving our collective philosophical competence.

Anticipated Questions

How is this idea related to Nick Bostrom's Simulation Argument? Nick's argument focuses on the possibility of post-humans (presumably living in a universe similar to ours but just at a later date) simulating us as their ancestors. It does not seem to consider that we may be running as simulations in much larger/richer universes, or that this may be a source of great potential value.

Isn't this a form of Pascal's Mugging? I'm not sure. It could be that when we figure out how to solve Pascal's Mugging it will become clear that we shouldn't try to leave our simulation for reasons similar to why we shouldn't pay the mugger. However the analogy doesn't seem so tight that I think this is highly likely. Also, note that the argument here isn't that we should do the equivalent of "pay the mugger" but rather that we should try to bring ourselves into a position where we can definitively figure out what the right thing to do is.

Being run as a simulation in another universe isn't necessarily the only way to control what happens in that universe.

Had you seen Multiverse-wide cooperation via correlated decision-making, btw? (somewhat similar to acausal trade but differs in that it's based on the agents being similar to each other rather than modeling each other)

Wei Dai

Yes, I saw the MSR paper when it came out, but I think it's covered as a special case of acausal trade. The wiki article on acausal trade (which I believe was written by Joshua Fox several years before the MSR paper was published) mentions it as item 2 under "Prediction mechanisms".

cousin_it

If there's some kind of measure of "observer weight" over the whole mathematical universe, we might be already much larger than 1/3^^^3 of it, so the total utilitarian can only gain so much. And even if there's no measure, I'm not sure my total utilitarianism would scale linearly to such numbers. But I'm very confused about all this.

Wei Dai

I agree that it's very confusing, but here's my own confused thinking about this, FWIW.

We can divide our credence in total utilitarianism into "bounded total utilitarianism" (including measure-based) and "unbounded total utilitarianism". Conditional on bounded total utilitarianism, I don't see a reason to think that potential value gained from controlling larger/richer universes couldn't be at least several orders of magnitude larger than from what happens in this universe. (Maybe this is true for some forms of bounded total utilitarianism with particularly low bounds, but shouldn't be true for all of them.) Conditional on unbounded total utilitarianism, things are even more confusing as it's not clear how unbounded total utilitarianism can formally work, but informally it seems that if unbounded total utilitarianism can work it would very likely say that trying to control larger/richer universes is the right thing to do.

Overall it seems like a fairly safe conclusion that the part of you that is attracted by the idea of preventing astronomical waste (or a large fraction of that part of you) probably shouldn't stop at just preventing astronomical waste in this universe.

cousin_it

Yeah, I agree the gain can be orders of magnitude larger than this universe. Only objecting to the use of 3^^^3 as a metaphor, because I'm not sure we can care that strongly.

My instinct says we can't care about anything much bigger than an exponential. That's also useful for preventing Pascal's muggings, because I can repeatedly flip a coin and ask the mugger to influence the outcome, thus reducing their credibility exponentially with time. But maybe that's too convenient.

I don't understand why you think that the expectation should be orders of magnitude larger for other universes. The model "like utilitarianism, but with an upper bound on # of people" seems kind of wacky, maybe it gets a seat in the moral parliament but I don't think it's the dominant force for caring about astronomical waste. For non-counting-measure utilitarianism, I don't see either why the models concerned about astronomical waste would assign larger universes an overwhelming share of our caring-measure.

It also feels to me like you are 2-enveloping wrong if you end up with a 100x ratio here. (I.e., if you have 10% probability on a model where the two are equal, I don't think you should end up with 100x.)
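
One way to read the two-envelope worry, as a minimal sketch: when mixing models, the expected ratio depends on which universe's value is held fixed as the unit. The 0.9/0.1 credences and the 100x figure below are illustrative assumptions, not numbers anyone in this thread committed to.

```python
# Hypothetical two-model mixture; every number here is an illustrative assumption.
# Each entry is (credence, value of big universe / value of our universe).
models = [
    (0.9, 100.0),  # model A: the big universe is worth 100x ours
    (0.1, 1.0),    # model B: the two are equally valuable
]

def ratio_with_ours_as_unit(models):
    # Fix our universe's value at 1 in every model, then average the big one's value.
    return sum(p * r for p, r in models)

def ratio_with_big_as_unit(models):
    # Fix the big universe's value at 1 in every model, average ours, then invert.
    expected_ours = sum(p / r for p, r in models)
    return 1.0 / expected_ours

print(ratio_with_ours_as_unit(models))  # about 90.1
print(ratio_with_big_as_unit(models))   # about 9.17
```

Under the second normalization, 10% credence on the equal-value model already keeps the ratio below 10x, which is the direction of the objection.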

Overall it seems like a fairly safe conclusion that the part of you that is attracted by the idea of preventing astronomical waste (or a large fraction of that part of you) probably shouldn't stop at just preventing astronomical waste in this universe.

If you put 50% on a theory that cares overwhelmingly about infinite universes and 50% on a theory that cares about all universes, the thing to do is probably still to prevent astronomical waste in this universe, so that we can later engage in trade or spend the resources exploring whatever angles of attack seem useful. Maybe this is the kind of thing you have in mind, but it's a notable special case because it seems to recommend the same short-term behavior.

trying to preserve and improve the collective philosophical competence of our civilization, such that when it becomes possible to pursue strategies like ones listed above, we'll be able to make the right decisions.

I agree that if we don't eventually reach philosophical maturity (or end up on an approximately optimal philosophical trajectory) then we won't capture most of the value in the universe. It seems like that conclusion doesn't really depend on infinite universes though (e.g. a utilitarian might be similarly concerned about discovering how to optimally organize matter), unless you think this is the main way our preferences might not be easily satiable.

The best opportunity to do this that I can foresee is the advent of advanced AI, which is another reason I want to push for AIs that are not just value aligned with us, but also have philosophical competence that scales with their other intellectual abilities, so they can help correct the philosophical errors of their human users (instead of merely deferring to them), thereby greatly improving our collective philosophical competence.

This doesn't seem related to recent discussions about philosophical competence and AI, since it is about what we want AI to do eventually rather than what you want to do in the 21st century (I'm not sure if it was supposed to be related).

Wei Dai

For non-counting-measure utilitarianism, I don’t see either why the models concerned about astronomical waste would assign larger universes an overwhelming share of our caring-measure.

I guess with measure-based utilitarianism, it's more about density of potentially valuable things within the universe than size. If our universe only supports 10^120 available operations, most of it (>99%) is going to be devoid of value under many ethically plausible ways of distributing caring-measure over the space-time regions within a universe.

I agree that if we don’t eventually reach philosophical maturity (or end up on an approximately optimal philosophical trajectory) then we won’t capture most of the value in the universe. It seems like that conclusion doesn’t really depend on infinite universes though (e.g. a utilitarian might be similarly concerned about discovering how to optimally organize matter),

Some people seem to think there's a good chance that our current level of philosophical understanding is enough to capture most of the value in this universe. (For example, if we implement a universe-wide simulation designed according to Eliezer's Fun Theory, or if we just wipe out all suffering.) Others may think that we don't currently have enough understanding to do that, but we can reach that level of understanding "by default". My argument here is that both of these seem less likely if the goal is instead to capture value from larger/richer universes, and that gives more impetus to trying to improve our philosophical competence.

unless you think this is the main way our preferences might not be easily satiable.

Not sure what you mean by this.

This doesn’t seem related to recent discussions about philosophical competence and AI, since it is about what we want AI to do eventually rather than what you want to do in the 21 century (I’m not sure if it was supposed to be related).

They're not supposed to be related except in so far as they're both arguments for wanting AI to be able to help humans correct their philosophical mistakes instead of just deferring to humans.

I guess with measure-based utilitarianism, it's more about density of potentially valuable things within the universe than size. If our universe only supports 10^120 available operations, most of it (>99%) is going to be devoid of value under many ethically plausible ways of distributing caring-measure over the space-time regions within a universe.

I agree, but if you have a broad distribution over mixtures then you'll be including many that don't use literal locations and those will dominate for "sparse" universes.

I can see easily how you'd get a modest factor favoring other universes over astronomical waste in this universe, but as your measure/uncertainty gets broader (or you have a broader distribution over trading partners) the ratio seems to shrink towards 1 and I don't feel like "orders of magnitude" is that plausible.

Some people seem to think there's a good chance that our current level of philosophical understanding is enough to capture most of the value in this universe. (For example, if we implement a universe-wide simulation designed according to Eliezer's Fun Theory, or if we just wipe out all suffering.) Others may think that we don't currently have enough understanding to do that, but we can reach that level of understanding "by default". My argument here is that both of these seem less likely if the goal is instead to capture value from larger/richer universes, and that gives more impetus to trying to improve our philosophical competence.

I agree this is a further argument for needing more philosophical competence. I personally feel like that position is already pretty solid but I acknowledge that it's not a universal position even amongst EAs.

They're not supposed to be related except in so far as they're both arguments for wanting AI to be able to help humans correct their philosophical mistakes instead of just deferring to humans.

"Defer to humans" could mean many different things. This is an argument against AI forever deferring to humans in their current form / with their current knowledge. When I talk about "defer to humans" I'm usually talking about an AI deferring to humans who are explicitly allowed to deliberate/learn/self-modify if that's what they choose to do (or, perhaps more importantly, to construct a new AI with greater philosophical competence and put it in charge).

I understand that some people might advocate for a stronger form of "defer to humans" and it's fine to respond to them, but wanted to make sure there wasn't a misunderstanding. (Also I don't feel there are very many advocates for the stronger form, I think the bulk of the AI community imagines our AI deferring to us but us being free to design better AIs later.)

Wei Dai

I agree, but if you have a broad distribution over mixtures then you’ll be including many that don’t use literal locations and those will dominate for “sparse” universes.

I currently think that each way of distributing caring-measure over a universe should be a separate member of moral parliament, given a weight equal to its ethical plausibility, instead of having just one member with some sort of universal distribution. So there ought to be a substantial coalition in one's moral parliament that thinks controlling bigger/richer universes is potentially orders of magnitude more valuable.

Another intuition pump here is to consider a thought experiment where you think there's a 50/50 chance that our universe supports either 10^120 operations or 10^(10^120) operations (and controlling other universes isn't possible). Isn't there some large coalition of total utilitarians in your moral parliament who would be at least 100x happier to find out that the universe supports 10^(10^120) operations (and be willing to bet/trade accordingly)?
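
To make the thought experiment concrete, here is a rough sketch of how a few valuation rules would compare the two universes. The three rules (linear, logarithmic, capped) are hypothetical stand-ins for parliament members, chosen only for illustration. Operation counts are handled as log10 values because 10^(10^120) does not fit in a float.

```python
import math

# Operation counts stored as log10(operations).
LOG10_SMALL = 120.0  # universe supporting 10^120 operations
LOG10_BIG = 1e120    # universe supporting 10^(10^120) operations

def log10_ratio_linear(small, big):
    # Value proportional to operations: ratio = 10^(big - small).
    return big - small

def log10_ratio_logarithmic(small, big):
    # Value proportional to log10(operations): ratio = big / small.
    return math.log10(big / small)

def log10_ratio_capped(small, big, log10_cap):
    # Value = min(operations, cap): operations past the cap add nothing.
    return min(big, log10_cap) - min(small, log10_cap)

# log10 of (value of big universe / value of small universe) per hypothetical member.
log10_ratios = {
    "linear": log10_ratio_linear(LOG10_SMALL, LOG10_BIG),
    "logarithmic": log10_ratio_logarithmic(LOG10_SMALL, LOG10_BIG),
    "capped at 10^150": log10_ratio_capped(LOG10_SMALL, LOG10_BIG, 150.0),
    "capped at 10^100": log10_ratio_capped(LOG10_SMALL, LOG10_BIG, 100.0),
}

# A member is "at least 100x happier" when the log10 ratio is at least 2.
at_least_100x = {name: r >= 2.0 for name, r in log10_ratios.items()}
print(at_least_100x)
```

Of these four assumed members, only the one whose cap lies below 10^120 is indifferent between the two outcomes. How much weight each kind of member deserves is the open question in this exchange.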

When I talk about “defer to humans” I’m usually talking about an AI deferring to humans who are explicitly allowed to deliberate/learn/self-modify if that’s what they choose to do (or, perhaps more importantly, to construct a new AI with greater philosophical competence and put it in charge).

Yeah I didn't make this clear, but my worry here is that most humans won't choose to "deliberate/learn/self-modify" in a way that leads to philosophical maturity (or construct a new AI with greater philosophical competence and put it in charge), if you initially give them an AI that has great intellectual abilities in most areas but defers to humans on philosophical matters. One possibility is that because humans don't have value functions that are robust against distributional shifts, they'll (with the help of their AIs) end up doing an adversarial attack against their own value functions and not be able to recover from that. If they somehow avoid that, they may still get stuck at some level of philosophical competence that is less than what's needed to capture value from bigger/richer universes, and never feel a need to put a new philosophically competent AI in charge. It seems to me that the best way to avoid both of these outcomes (as well as possible near-term moral catastrophes such as creating a lot of suffering that can't be balanced out later) is to make sure that the first advanced AIs are highly or scalably competent in philosophy. (I understand you probably disagree with "getting stuck" even with regard to capturing value from bigger/richer universes, you're not very concerned about near term moral catastrophes, and I'm not sure what your thinking on the unrecoverable self-attack thing is.)

Another intuition pump here is to consider a thought experiment where you think there's 50/50 chance that our universe supports either 10^120 operations or 10^(10^120) operations (and controlling other universes isn't possible). Isn't there some large coalition of total utilitarians in your moral parliament who would be at least 100x happier to find out that the universe supports 10^(10^120) operations (and be willing to bet/trade accordingly)?

I totally agree that there are members of the parliament who would assign much higher value on other universes than on our universe.

I'm saying that there is also a significant contingent that cares about our universe, so the people who care about other universes aren't going to dominate.

(And on top of that, all of the contingents are roughly just trying to maximize the "market value" of what we get, so for the most part we need to reason about an even more spread out distribution.)

Yeah I didn't make this clear, but my worry here is that most humans won't choose to "deliberate/learn/self-modify" in a way that leads to philosophical maturity (or construct a new AI with greater philosophical competence and put it in charge), if you initially give them an AI that has great intellectual abilities in most areas but defers to humans on philosophical matters.

There are tons of ways you could get people to do something they won't choose to do. I don't know if "give them an AI that doesn't defer to them about philosophy" is more natural than e.g. "give them an AI that doesn't defer to them about how they should deliberate/learn/self-modify."

Wei Dai

I’m saying that there is also a significant contingent that cares about our universe, so the people who care about other universes aren’t going to dominate.

I don't think I'm getting your point here. Personally it seems safe to say that >80% of the contingent of my moral parliament that cares about astronomical waste would say that if our universe was capable of 10^(10^120) operations it would be at least 100x as valuable as if it was capable of only 10^120 operations. Are your numbers different from this? In any case, what implications are you suggesting based on "no domination"?

(And on top of that, all of the contingents are roughly just trying to maximize the “market value” of what we get, so for the most part we need to reason about an even more spread out distribution.)

I don't understand this part at all. Please elaborate?

There are tons of ways you could get people to do something they won’t choose to do.

I did preface my conclusion with "The best opportunity to do this that I can foresee", so if you have other ideas about what someone like me ought to do, I'd certainly welcome them.

I don’t know if “give them an AI that doesn’t defer to them about philosophy” is more natural than e.g. “give them an AI that doesn’t defer to them about how they should deliberate/learn/self-modify.”

Isn't "how they should deliberate/learn/self-modify" itself a difficult philosophical problem (in the field of meta-philosophy)? If it's somehow easier or safer to "give them an AI that doesn’t defer to them about how they should deliberate/learn/self-modify" than to "give them an AI that doesn’t defer to them about philosophy" then I'm all for that but it doesn't seem like a very different idea from mine.

I don't think I'm getting your point here. Personally it seems safe to say that >80% of the contingent of my moral parliament that cares about astronomical waste would say that if our universe was capable of 10^(10^120) operations it would be at least 100x as valuable as if it was capable of only 10^120 operations. Are your numbers different from this? In any case, what implications are you suggesting based on "no domination"?

I might have given 50% or 60% instead of >80%.

I don't understand how you would get significant conclusions out of this without big multipliers. Yes, there are some participants in your parliament who care more about worlds other than this one. Those worlds appear to be significantly harder to influence (by means other than trade), so this doesn't seem to have a huge effect on what you ought to do in this world. (Assuming that we are able to make trades that we obviously would have wanted to make behind the veil of ignorance.)

In particular, if your ratio between the value of big and small universes was only 5x, then that would only have a 5x multiplier on the value of the interventions you list in the OP. Given that many of them look very tiny, I assumed you were imagining a much larger multiplier. (Something that looks very tiny may end up being a huge deal, but once we are already wrong by many orders of magnitude it doesn't feel like the last 5x has a huge impact.)

I don't understand this part at all. Please elaborate?

We will have control over astronomical resources in our universe. We can then acausally trade that away for influence over the kinds of universes we care about influencing. At equilibrium, ignoring market failures and friction, how much you value getting control over astronomical resources doesn't depend on which kinds of astronomical resources you in particular terminally value. Everyone instrumentally uses the same utility function, given by the market-clearing prices of different kinds of astronomical resources. In particular, the optimal ratio between (say) hedonism and taking-over-the-universe depends on the market price of the universe you live in, not on how much you in particular value the universe you live in. This is exactly analogous to saying: the optimal tradeoff between work and leisure depends only on the market price of the output of your work (ignoring friction and market failures), not on how much you in particular value the output of your work.

So the upshot is that instead of using your moral parliament to set prices, you want to be using a broader distribution over all of the people who control astronomical resources (weighted by the market prices of their resources). Our preferences are still evidence about what others want, but this just tends to make the distribution more spread out (and therefore cuts against e.g. caring much less about colonizing small universes).

Isn't "how they should deliberate/learn/self-modify" itself a difficult philosophical problem (in the field of meta-philosophy)? If it's somehow easier or safer to "give them an AI that doesn’t defer to them about how they should deliberate/learn/self-modify" than to "give them an AI that doesn’t defer to them about philosophy" then I'm all for that but it doesn't seem like a very different idea from mine.

I still don't really get your position, and especially why you think:

It seems to me that the best way to avoid both of these outcomes [...] is to make sure that the first advanced AIs are highly or scalably competent in philosophy.

I do understand why you think it's an important way to avoid philosophical errors in the short-term, in that case I just don't see why you think that such problems are important relative to other factors that affect the quality of the future.

This seems to come up a lot in our discussions. It would be useful if you could make a clear statement of why you think this problem (which I understand as: "ensure early AI is highly philosophically competent" or perhaps "differential philosophical progress," setting aside the application of philosophical competence to what-I'm-calling-alignment) is important, ideally with some kind of quantitative picture of how important you think it is. If you expect to write that up at some point then I'll just pause until then.

Wei Dai

I don’t understand how you would get significant conclusions out of this without big multipliers. Yes, there are some participants in your parliament who care more about worlds other than this one. Those worlds appear to be significantly harder to influence (by means other than trade), so this doesn’t seem to have a huge effect on what you ought to do in this world. (Assuming that we are able to make trades that we obviously would have wanted to make behind the veil of ignorance.)

Wait, you are assuming a baseline/default outcome where acausal trade takes place, and comparing other interventions to that? My baseline for comparison is instead (as stated in the OP) "what can be gained by just creating worthwhile lives in this universe". My reasons for this are (1) I (and likely others who might read this) don't think acausal trade is much more likely to work than the other items on my list and (2) the main intended audience for this post is people who have realized the importance of influencing the far future but are not aware of (or have not seriously considered) the possibility of influencing other universes through things like acausal trade and other items on my list. Even the most sophisticated thinkers in EA seem to fall into this category, e.g., people like Will MacAskill, Toby Ord, and Nick Beckstead, unless they've privately considered the possibility and chose not to talk about it in public, in which case it still seems safe to assume that most people in EA think "creating worthwhile lives in this universe" is the most good that can be accomplished.

In particular, if your ratio between the value of big and small universes was only 5x, then that would only have a 5x multiplier on the value of the interventions you list in the OP. Given that many of them look very tiny, I assumed you were imagining a much larger multiplier. (Something that looks very tiny may end up being a huge deal, but once we are already wrong by many orders of magnitude it doesn’t feel like the last 5x has a huge impact.)

I don't understand where "5x" comes from or why that's the relevant multiplier instead of 100x.

It would be useful if you could make a clear statement of why you think this problem is important

I'll think about this, but I think I'd be more motivated to attempt this (and maybe also have a better idea of what I need to do) if other people also spoke up and told me that they couldn't understand my past attempts to explain this (including what I wrote in the OP and previous comments in this thread).

dxu
If there's some kind of measure of "observer weight" over the whole mathematical universe, we might be already much larger than 1/3^^^3 of it, so the total utilitarian can only gain so much.

Could you provide some intuition for this? Naively, I'd expect our "observer measure" over the space of mathematical structures to be 0.

The weight could be something like the algorithmic probability over strings (https://en.wikipedia.org/wiki/Algorithmic_probability), in which case universes like ours with a concise description would get a fairly large chunk of the weight.
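
As a rough sketch of the scale involved: under algorithmic probability, a string with a K-bit shortest description gets weight of roughly 2^-K. The 1000-bit description length below is a purely hypothetical placeholder, not an estimate for our universe. The comparison uses 3^^4, which is itself vastly smaller than 3^^^3 (a power tower of 3s of height 3^^3).

```python
import math

def log10_weight(description_bits):
    # Weight of roughly 2^-K for a K-bit description, returned as log10.
    return -description_bits * math.log10(2)

# Up-arrow building blocks, computed exactly where that is feasible.
three_up_up_3 = 3 ** (3 ** 3)  # 3^^3 = 3^27
# 3^^4 = 3^(3^^3) is too large to materialize, so keep only its log10.
log10_three_up_up_4 = three_up_up_3 * math.log10(3)

K = 1000  # hypothetical description length in bits (an assumption)

print(three_up_up_3)          # 7625597484987
print(log10_weight(K))        # about -301
print(-log10_three_up_up_4)   # about -3.6e12, i.e. log10 of 1/(3^^4)
```

Even with a description a million times longer, the weight would still be far above 1/(3^^4), and so above 1/3^^^3 by an even wider margin.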

I curated this post because it crystallised an important point regarding optimising the long-term future, that I've not seen anyone write down succinctly before (with reference to the relevant technical concepts, while still being short and readable).

Thanks. I agree with your overall conclusions.

On the specifics, Bostrom's simulation argument is more than just a parallel here: it has an impact on how rich we might expect our direct parent simulator to be.

The simulation argument applies similarly to one base world like ours, or to an uncountable number of parallel worlds embedded in Tegmark IV structures. Either way, if you buy case 3, the proportion of simulated-by-a-world-like-ours worlds rises close to 1 (I'm counting worlds "depth-first", since it seems most intuitive, and infinite simulation depth from worlds like ours seems impossible).

If Tegmark's picture is accurate, we'd expect to be embedded in some hugely richer base structure - but in Bostrom's case 3 we'd likely have to get through N levels of worlds-like-ours first. While that wouldn't significantly change the amount of value on the table, it might make it a lot harder for us to exert influence on the most valuable structures.

This probably argues for your overall point: we're not the best minds to be making such calculations (either on the answers, or on the expected utility of finding good answers).

Wei Dai

If Tegmark’s picture is accurate, we’d expect to be embedded in some hugely richer base structure—but in Bostrom’s case 3 we’d likely have to get through N levels of worlds-like-ours first. While that wouldn’t significantly change the amount of value on the table, it might make it a lot harder for us to exert influence on the most valuable structures.

I'm not sure it makes sense to talk about "expect" here. (I'm confused about anthropics and especially about first-person subjective expectations.) But if you take the third-person UDT-like perspective here, we're directly embedded in some hugely richer base structures, and also indirectly embedded via N levels of worlds-like-ours, and having more of the latter doesn't reduce how much value (in the UDT-utility sense) we can gain by influencing the former; it just gives us more options that we can choose to take or not. In other words, we always have the option of pretending the latter don't exist and just optimize for exerting influence via the direct embeddings.

On second thought, it does increase the opportunity cost of exerting such influence, because we'd be spending resources in both the directly embedded worlds and the indirectly-embedded worlds to do that. To get around this, the eventual superintelligence doing this could wait until such a time in our universe that Bostrom's proposition 3 isn't true anymore (or true to a lesser extent) before trying to influence richer universes, since presumably only the historically interesting periods of our universe are heavily simulated by worlds-like-ours.

That seems right.

I'd been primarily thinking about more simple-minded escape/uplift/signal-to-simulators influence (via this us), rather than UDT-influence. If we were ever uplifted or escaped, I'd expect it'd be into a world-like-ours. But of course you're correct that UDT-style influence would apply immediately.

Opportunity costs are a consideration, though there may be behaviours that'd increase expected value in both direct-embeddings and worlds-like-ours. Win-win behaviours could be taken early.

Personally, I'd expect this not to impact our short/medium-term actions much (outside of AI design). The universe looks to be self-similar enough that any strategy requiring only local action would use a tiny fraction of available resources.

I think the real difficulty is only likely to show up once a SI has provided a richer picture of the universe than we're able to understand/accept, and it happens to suggest radically different resource allocations.

Most people are going to be fine with "I want to take the energy of one unused star and do philosophical/astronomical calculations"; fewer with "Based on {something beyond understanding}, I'm allocating 99.99% of the energy in every reachable galaxy to {seemingly senseless waste}".

I just hope the class of actions that are vastly important, costly, and hard to show clear motivation for, is small.

What of exponential total utilitarianism? That's a total utilitarianism that multiplies the total utility by the exponential of the population. It may be very unlikely, but as population grows, it will eventually come to dominate.

That's why I think moral theories should be normalised independently, to prevent the super-population ones from winning just by default.

Wei Dai

That’s why I think moral theories should be normalised independently, to prevent the super-population ones from winning just by default.

I'm assuming this as well. Did I give a different impression in the post? If so I'll try to clarify.

Normally when I normalise, I use the expected maximum of the utility function if we just maximised it and nothing else: https://www.lesswrong.com/posts/hBJCMWELaW6MxinYW/intertheoretic-utility-comparison

Therefore if total utilitarianism is not heavily weighted, it will likely remain unimportant; your phrasing "or someone whose moral uncertainty includes total utilitarianism" suggested to me that you thought total utilitarianism would be important even if assigned a low weight, which suggested that it was not being normalised.
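
To make the normalisation point concrete, here is a small Python sketch. It is not the procedure from the linked post: it assumes a cruder scheme in which each theory's utilities are divided by that theory's maximum over the available options. The two options, the credences, and all function names are hypothetical.

```python
import math
from typing import Callable, Dict, Tuple

# Hypothetical options: (population, average welfare per person).
OPTIONS: Dict[str, Tuple[int, float]] = {
    "small_happy": (10, 5.0),
    "large_meh": (100, 1.0),
}


def average_util(pop: int, welfare: float) -> float:
    """Average utilitarianism: only per-person welfare matters."""
    return welfare


def exp_total_util(pop: int, welfare: float) -> float:
    """Exponential total utilitarianism: total utility times exp(population)."""
    return pop * welfare * math.exp(pop)


# Hypothetical credences: the exponential theory is considered very unlikely.
THEORIES = {
    "average": (0.99, average_util),
    "exp_total": (0.01, exp_total_util),
}


def raw_score(option: str) -> float:
    """Credence-weighted sum of unnormalised utilities."""
    pop, welfare = OPTIONS[option]
    return sum(cred * fn(pop, welfare) for cred, fn in THEORIES.values())


def normalised_score(option: str) -> float:
    """Credence-weighted sum after dividing each theory by its own maximum.

    Assumes every theory's best option has positive utility.
    """
    pop, welfare = OPTIONS[option]
    score = 0.0
    for cred, fn in THEORIES.values():
        best = max(fn(p, w) for p, w in OPTIONS.values())
        score += cred * fn(pop, welfare) / best
    return score


def best_option(score: Callable[[str], float]) -> str:
    """Return the option with the highest score."""
    return max(OPTIONS, key=score)
```

With these made-up numbers, `best_option(raw_score)` picks the large population because the exponential term swamps everything else, while `best_option(normalised_score)` picks the small one, since the low-credence theory can contribute at most its credence.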

Wei Dai

your phrasing “or someone whose moral uncertainty includes total utilitarianism” suggested to me that you thought total utilitarianism would be important even if assigned a low weight, which suggested that it was not being normalised.

Ok, I didn't mean that. What I meant was that if your moral uncertainty includes total utilitarianism, then the total utilitarian part should reason as follows. Would it be clearer / clear enough if I replaced "or someone whose moral uncertainty includes total utilitarianism" with "or the total utilitarianism part of someone's moral uncertainty"?

I think that would be clearer, yes.

Wei Dai

Thanks, I've made that edit.

I think that at the time this post came out, I didn't have the mental scaffolding necessary to really engage with it – I thought of this question as maybe important, but sort of "above my paygrade", something better left to other people who would have the resources to engage more seriously with it.

But, over the past couple years, the concepts here have formed an important component of my understanding of robust agency. Much of this came from private in-person conversations, but this post is the best writeup of the concept I'm currently aware of.

One thing I like about this post is the focus on philosophical competence. Previously, I'd thought of this question as dangerous to think about, because you might make philosophical mistakes that doomed you or your universe for (in retrospect) silly reasons.

My current model is more like "no, you-with-your-21st-century-human-brain shouldn't actually attempt to take actions aiming primarily to affect other universes on the macro scale. Negotiating with other universes is something you do when you're a literal galaxy brain that is quite confident in its philosophy."

But, meanwhile, it seems that:

(note: low-to-mid confidence, still working through these problems myself, and I'm very much still philosophically confused about at least some of this)

– becoming philosophically competent, as a species, may be one of the most important goals facing humanity, and how this project interfaces with AI development may be crucially important. This may be relevant to people (like me) who aren't directly working on alignment but are trying to have a good model of the strategic landscape.

– a concept not from this particular post, but relevant, is the notion that the question is not "are you in a simulation or not?", it's more like "to what degree are you in simulations? Which distribution of agents are you making choices on behalf of?". And this has some implications about how you should make choices that you should worry about now, before you are a literal galaxy brain. (many of which are more mundane)

– I think there may be a connection between "Beyond Astronomical Waste" and "Robustness to scale." You can't be a galaxy brain now, but you can be the sort of person who would be demonstrably safe to scale up, in a way that simulations can detect, which might let you punch above your current weight, in terms of philosophical competence.

This is a post that's stayed with me since it was published. The title is especially helpful as a handle. It is a simple reference for this idea, that there are deeply confusing philosophical problems that are central to our ability to attain most of the value we care about (and that this might be a central concern when thinking about AI).

It's not been very close to areas I think about a lot, so I've not tried to build on it much, and would be interested in a review from someone who thinks about these matters in more detail, but I expect they'll agree it's a very helpful post to exist.

I am not sure how one can talk about the observed universe and the number 3^^^3 in the same sentence, given that the maximum informational content is roughly 10^120 qubits, the rest is outside the cosmological horizon. Alternatively, if we talk about the simulation argument, then the expression "practical implications" seems out of place.

dxu
I am not sure how one can talk about the observed universe and the number 3^^^3 in the same sentence, given that the maximum informational content is roughly 10^120 qubits, the rest is outside the cosmological horizon.

Where in the post do you see it suggested that our universe is capable of containing 3^^^3 of anything?

Alternatively, if we talk about the simulation argument, then the expression "practical implications" seems out of place.

How so?

I just realised that the problem of the limited size of the Universe is isomorphic to the problem of how to survive the end of the universe, which I analysed here, but the escape routes described by the OP are different, and rely more on acausal trade and simulation hacking than on physical manipulation.

Dacyn

MUH doesn't imply the existence of halting oracles. Indeed, the Computable Universe Hypothesis is supposed to be an extension of the Mathematical Universe Hypothesis, but CUH says that halting oracles do not exist.

Wei Dai

There may be several confusions happening here. First, I've been using MUH to mean "ultimate ensemble theory" (i.e., the idea that the Level IV multiverse of all mathematical structures exists), because Wikipedia says MUH is "also known as the ultimate ensemble theory". But Tegmark currently defines MUH as "Our external physical reality is a mathematical structure", which seems to be talking just about our particular universe and not saying that all mathematical structures exist. Second, if by "MUH doesn’t imply the existence of halting oracles" you mean that MUH doesn't necessarily imply the existence of halting oracles in our universe, then I agree. What I meant in the OP is that the ultimate ensemble theory implies that universes containing halting oracles exist in the Level IV multiverse.

Hopefully that clarifies things?

Dacyn

OK, that makes sense.

Isn't all this massively dependent on how your utility $U$ scales with the total number $N$ of well-spent computations (e.g. one-bit computes)?

That is, I'm asking for a gut feeling here: What are your relative utilities for $10^{100}$, $10^{110}$, $10^{120}$, $10^{130}$ universes?

Say, $U(0)=0$, $U(10^{100})=1$ (gauge fixing); instant pain-free end-of-universe is zero utility, and a successful colonization of the entire universe with a suboptimal black hole-farming near heat-death is unit utility.

Now, by definition, the utility $U(N)$ of an $N$-computation outcome is the inverse of the probability $p$ at which you become indifferent to the following gamble: immediate end-of-the-world at probability $(1-p)$ vs. an upgrade of computational world-size to $N$ at probability $p$. (Indifference means $p \cdot U(N) + (1-p) \cdot U(0) = U(10^{100})$, so with the gauge above $U(N) = 1/p$.)

I would personally guess that $U(10^{130}) < 2$; i.e. this upgrade would probably not be worth a 50% risk of extinction. This is massively sublinear scaling.
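
The gamble above can be sketched in a few lines of Python. This is only an illustration of the indifference condition under the stated gauge ($U(0)=0$, $U(10^{100})=1$); the function names are hypothetical.

```python
def utility_from_indifference(p: float, baseline: float = 1.0, floor: float = 0.0) -> float:
    """Utility of the upgraded outcome implied by indifference at win-probability p.

    Solves p * U(N) + (1 - p) * floor == baseline for U(N), where `baseline`
    is the utility of the status quo and `floor` is the utility of extinction.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return floor + (baseline - floor) / p


def accepts_gamble(u_upgrade: float, p: float, baseline: float = 1.0, floor: float = 0.0) -> bool:
    """True if the gamble's expected utility strictly beats the status quo."""
    return p * u_upgrade + (1.0 - p) * floor > baseline
```

For example, `utility_from_indifference(0.5)` gives 2.0, so an agent with $U(10^{130}) < 2$ declines the upgrade at a 50% risk of extinction: `accepts_gamble(1.9, 0.5)` is `False`.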

LaTeX is available by pressing Ctrl+4/Cmd+4 instead of using '$'.

With quantum branching, our universe could have some number like a googolplex of stuff, maybe more. And philosophically, you're worried about the difference between that and 3^^^3? I get that there's a big gap there but I'd guess it's one that we're definitionally unable to do useful moral reasoning about.

I feel like scope insensitivity is something to worry about here. I'd be really happy to learn that humanity will manage to take good care of our cosmic endowment but my happiness wouldn't scale properly with the amount of value at stake if I learned we took good care of a super-cosmic endowment. I think that's the result of my inability to grasp the quantities involved rather than a true reflection of my extrapolated values, however.

My concern is more that reasoning about entities in simpler universes capable of conducting acausal trades with us will turn out to be totally intractable (as will the other proposed escape methods), but since I'm very uncertain about that I think it's definitely worth further investigation. I'm also not convinced Tegmark's MUH is true in the first place, but this post is making me want to do more reading on the arguments in favor & opposed. It looks like there was a Rationally Speaking episode about it?

When you're faced with numbers like 3^^^3, scope insensitivity is the correct response. A googolplex is already enough to hold every possible configuration of Life as we know it. "Hamlet, but with extra commas in these three places, performed by intelligent starfish" is in there somewhere in over a googol different varieties. What, then, does 3^^^3 add except more copies of the same?

Nothing, if your definition of a copy is sufficiently general :-)

Am I understanding you right that you believe in something like a computational theory of identity and think there's some sort of bound on how complex something we'd attribute moral patienthood or interestingness to can get? I agree with the former, but don't see much reason for believing the latter.

I have no idea if there is such a bound. I will never have any idea if there is such a bound, and I suspect that neither will any entity in this universe. Given that fact, I'd rather make the assumption that doesn't turn me stupid when Pascal's Wager comes up.

I doubt that there's any moral difference between running a person and asking a magical halting oracle what they would have said.