Over the last few years, progress has been made in estimating the density of intelligent life in the universe (e.g., Olson 2015, Sandberg 2018, Hanson 2021). Some of this work has been used to update longtermist macrostrategy (Finnveden 2019, Olson 2020, Olson 2021, Cook 2022), but these efforts stopped short of their potential: at best, they hint at the meatiest implications and leave half of the work almost untouched, namely comparing the expected utility produced by different Space-Faring Civilizations (SFCs). In this post, we hint at the possible macrostrategic implications of these works: a possible switch for the longtermist community from decreasing X-risks (including increasing P(Alignment)[1]) to increasing P(Alignment | Humanity creates an SFC).
Sequence: This post is part 1 of a sequence investigating the longtermist implications of alien Space-Faring Civilizations. Each post aims to be standalone.
Summary
We define two hypotheses:
Civ-Saturation Hypothesis: Most resources will be claimed by Space-Faring Civilizations (SFCs) regardless of whether humanity creates an SFC[2].
Civ-Similarity Hypothesis: Humanity's Space-Faring Civilization would produce utility[3] similar to other SFCs.
If these hypotheses hold, longtermist priorities could shift away from reducing pure extinction risks and toward optimizing P(Alignment | Humanity creates an SFC)[1]: rather than focusing broadly on preventing misaligned AI and extinction, longtermists might need to prioritize strategies that specifically increase the probability of alignment conditional on humanity creating an SFC. Macrostrategy updates include the following:
(i) Significantly deprioritizing pure extinction risks, such as nuclear weapon and bioweapon risks.
(ii) Somewhat deprioritizing AI Safety agendas that mostly increase P(Humanity creates an SFC) without increasing P(Alignment | Humanity creates an SFC) much.
(iii) Giving more weight to previously neglected AI Safety agendas, e.g., a "Plan B AI Safety" agenda focused on decreasing P(Humanity creates an SFC | Misalignment), for instance by implementing (active and corrigible) preferences against space colonization in early AI systems.
The Civ-Saturation Hypothesis
Will Humanity's SFC grab marginal resources? The Civ-Saturation Hypothesis posits that, when making decisions, we should assume that most of the resources Humanity's SFC could grab will eventually be grabbed by other SFCs regardless of whether Humanity's SFC exists.
Plausibly low marginal resources under EDT. The validity of this hypothesis can be studied using models estimating the frequency of Space-Faring Civilizations (SFCs) in the universe (Sandberg 2018, Finnveden 2019, Olson 2020, Hanson 2021, Snyder-Beattie 2021, Cook 2022). Its validity also depends on which decision theory we use and on how much credence we put in each. As soon as we put some credence on evidential decision theories and on our actions being correlated with those of our exact copies[4], we may have to put significant weight on the Civ-Saturation Hypothesis. We will produce a first quantitative evaluation of this hypothesis in a later post.
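To make the hypothesis concrete, here is a minimal toy sketch in Python (the quantity q, the function name, and all numbers are hypothetical illustrations, not estimates from the models cited above): if a fraction q of the resources Humanity's SFC could reach would eventually be claimed by alien SFCs anyway, only the remaining fraction is counterfactually added by our expansion, while the rest is merely substituted.

```python
# Toy illustration of the Civ-Saturation Hypothesis (all numbers are invented).
# q = probability that a given unit of the resources Humanity's SFC could reach
#     would eventually be claimed by some alien SFC even if Humanity's SFC never exists.

def marginal_resource_fraction(q: float) -> float:
    """Fraction of reachable resources counterfactually added by Humanity's SFC."""
    return 1.0 - q

for q in [0.0, 0.5, 0.9, 0.99]:
    added = marginal_resource_fraction(q)
    print(f"q = {q:.2f} -> counterfactually added: {added:.0%}, merely substituted: {q:.0%}")
```

Under the Civ-Saturation Hypothesis, q is close to 1, so almost all of the impact of Humanity's SFC comes from substitution: using resources that some other SFC would otherwise have used.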
Hinting at longtermist macrostrategic implications
What is the impact of human ancestry on an SFC's expected utility? For simplicity, let’s assume the Civ-Saturation Hypothesis is 100% true. How much counterfactual value Humanity creates then depends entirely on the utility Humanity’s SFC creates relative to that of other SFCs. Are other SFCs going to create more or less utility per unit of resources than Humanity’s SFC? I.e., how different are U(SFC) and U(SFC | Human-ancestry)? Little progress has been made on this question. For reference, see the quotes from (Finnveden, 2019)[5], (Brian Tomasik, 2015)[6], (Brauner and Grosse-Holz, 2019)[7], and (Anthony DiGiovanni, 2021)[8] in the footnotes. Most discussions stop after a few of the following arguments.
Under moral anti-realism, Humanity's SFC is more likely to produce higher utility, since we should expect less convergence between moral values, and we are more likely to carry out our own precise values.
There may be convergence among goals, especially between ancestors of SFCs.
Human moral values may depend on the biological structure of our brain or on contingent cultural features.
It is plausible that humans are more compassionate than other intelligent civilizations, since humans are unusually compassionate among Earthly animals.
Humanity's SFC may cause more suffering than other SFCs because we are sentient, and conflicts may target our values.
Humanity's SFC may create more utility because we are sentient and are more likely to create a sentient SFC.
Humans are likely biased towards valuing themselves.
For clarity, I am not endorsing these arguments. I am listing arguments found in existing discussions.
No existing work directly studies this precise question in depth. Some related work exists, but it mostly looks at the moral values of aliens or of alien SFCs, much more rarely at those of SFCs' alien ancestors, and not at the relative expected utility of Humanity's SFC versus other SFCs. I will introduce novel object-level arguments about this question in a later post.
A priori, the expected utility of Humanity's SFC is not special. For now, let’s assume we know nothing about how conditioning on Human-ancestry impacts the utility produced by an SFC; then U(SFC) ~ U(SFC | Human-ancestry). This assumption is similar to using the Principle of Mediocrity. What would be the macrostrategic longtermist implications in that case?
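To make this concrete, here is a rough sketch of the counterfactual value of Humanity creating an SFC. It uses the post's notation plus two toy quantities that are my own simplifications (not results from the cited models): R, the resources Humanity's SFC would grab, and q, the fraction of those resources that alien SFCs would claim anyway.

$$\Delta V \;\approx\; R\,\big[(1-q)\,u_H \;+\; q\,(u_H - \bar{u}_{\text{alien}})\big], \qquad u_H := U(\text{SFC} \mid \text{Human-ancestry}),$$

where $\bar{u}_{\text{alien}}$ is the average utility efficiency of the alien SFCs that would otherwise claim those resources. Under full Civ-Saturation ($q \approx 1$), this reduces to $\Delta V \approx R\,(u_H - \bar{u}_{\text{alien}})$, and the mediocrity assumption $u_H \approx \bar{u}_{\text{alien}}$ drives it toward zero. The implications below follow from this sketch.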
Reducing pure extinction risks is much less valuable. Increasing P(Humanity creates an SFC) has much less longtermist value, and nuclear and bio X-risk reduction agendas would have a reduced priority, though their neartermist justifications would remain.
Longtermists should optimize P(Alignment | Humanity creates an SFC). Concerning AI Safety, from the point of view of impartial longtermists[9], increasing P(Alignment | Humanity creates an SFC) would replace the currently common target of increasing P(Alignment AND Humanity creates an SFC). Longtermist AI Safety agendas would need to be re-evaluated against this new target; a toy numerical illustration follows below.
Some existing AI Safety agendas may increase P(Alignment AND Humanity creates an SFC) while increasing P(Alignment | Humanity creates an SFC) much less, or even, if we are unlucky, reducing it. For example, such agendas may mostly work by preventing early AIs and their misuse from destroying both Humanity's potential and that of AIs.
Other currently neglected agendas may increase P(Alignment | Humanity creates an SFC) while not increasing P(Alignment AND Humanity creates an SFC). These include agendas aiming at decreasing P(Humanity creates an SFC | Misalignment). An example intervention within such an agenda is overriding instrumental goals for space colonization and replacing them with an active desire not to colonize space; this defensive preference could be removed later, conditional on achieving corrigibility.
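Here are toy numbers illustrating the difference between the two targets (all probabilities, and the agendas X and Y, are invented for the example; "Plan B" refers to the kind of agenda sketched above):

```python
# Toy comparison of the two optimization targets (all probabilities are invented).
# S = "Humanity creates an SFC", A = "Humanity's SFC is aligned".

def joint(p_s: float, p_a_given_s: float) -> float:
    """P(A AND S) = P(A | S) * P(S)."""
    return p_a_given_s * p_s

scenarios = {
    "baseline": {"P(S)": 0.80, "P(A|S)": 0.30},
    # Agenda X mostly prevents early catastrophes: P(S) rises a lot,
    # but the marginal SFCs it saves are slightly less likely to be aligned.
    "agenda X": {"P(S)": 0.95, "P(A|S)": 0.28},
    # Agenda Y ("Plan B") makes misaligned AIs less likely to colonize space:
    # P(S) falls, but the SFCs that do get created are more often aligned.
    "agenda Y": {"P(S)": 0.65, "P(A|S)": 0.36},
}

for name, w in scenarios.items():
    print(f"{name:9s} P(A AND S) = {joint(w['P(S)'], w['P(A|S)']):.3f}   P(A|S) = {w['P(A|S)']:.2f}")
```

With these made-up numbers, agenda X looks better under the P(Alignment AND Humanity creates an SFC) target while agenda Y looks better under P(Alignment | Humanity creates an SFC); which target is appropriate depends on how much weight we give to the Civ-Saturation and Civ-Similarity Hypotheses.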
The Civ-Similarity Hypothesis
Is human ancestry neutral, positive, or negative? The implications hinted at above are only plausible if U(SFC) ~ U(SFC | Human-ancestry). We formulate this requirement as a hypothesis. The Civ-Similarity Hypothesis posits that the expected utility efficiency, per unit of resources, of Humanity's future SFC is similar to that of other SFCs.
How could this hypothesis be valid? Two main considerations point in that direction:
High uncertainty about the future may flatten expected utilities. We may not know enough about how conditioning on human (or other) ancestry impacts the value of the long-term future produced by an SFC.
SFCs are rare, and creating them may be strongly constrained, i.e., convergent evolution and strong selection. Selection mechanisms and convergent evolutionary processes may drastically reduce the space of possible characteristics an SFC’s ancestors can have.
How could this hypothesis be invalid?
We may know enough to predict significant differences in expected utilities. We may already have enough information to say that Humanity's SFC will be abnormal in some specific ways relative to other SFCs. If, additionally, we are confident in how these abnormalities impact the long-term utility of Humanity's SFC, then we should be able to conclude that our future SFC produces significantly higher or lower utility than other SFCs.
We may only care about our precise values, and we may succeed at aligning our future SFC. Suppose we consider that only our own precise values are valuable (e.g., no moral uncertainty), that the distribution of alien moral values is much more diffuse than that of humans even after conditioning on those ancestors first creating an SFC, and, finally, that we are confident enough in how our values impact the long-term utility produced by SFCs (e.g., we think we will succeed at alignment). Then we should conclude that the hypothesis is invalid.
In later posts, we will look deeper into evaluating the Civ-Similarity Hypothesis and the tractability of making further progress there. We will see that a lot can be said regarding this hypothesis.
The Existence Neutrality Hypothesis
A third hypothesis, the conjunction of the previous two. This third and last hypothesis is simply the conjunction of the first two. The Existence Neutrality Hypothesis posits that influencing Humanity's chance of creating an SFC produces little value compared to increasing the quality of the SFC we would eventually create, conditional on doing so. Note that this hypothesis somewhat contradicts Nick Bostrom's astronomical waste argument.
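Using the same toy decomposition as above (a sketch under my simplifying assumptions, not a modeling result), write $V \approx P(S)\,R\,u_H + (1-P(S))\,q\,R\,\bar{u}_{\text{alien}}$ for the expected value produced from the resources Humanity's SFC would otherwise reach, with $S$ standing for "Humanity creates an SFC". The hypothesis then amounts to comparing two levers:

$$\frac{\partial V}{\partial P(S)} \;\approx\; R\,(u_H - q\,\bar{u}_{\text{alien}}) \;\approx\; 0 \quad \text{under Civ-Saturation } (q \approx 1) \text{ and Civ-Similarity } (u_H \approx \bar{u}_{\text{alien}}),$$
$$\frac{\partial V}{\partial u_H} \;\approx\; P(S)\,R, \quad \text{which remains astronomically large.}$$

Existence interventions lose most of their longtermist value while quality interventions, such as increasing P(Alignment | Humanity creates an SFC), keep theirs; this is where the tension with the astronomical waste argument comes from.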
Whispers of plausible importance. A few discussions of the implications of alien SFCs (including the Existence Neutrality Hypothesis) are already available online but, to my knowledge, have never led to a proper assessment of these questions. For reference, in the footnotes, you can find relevant quotes from (Brian Tomasik 2015)[10], (Jan M. Brauner and Friederike M. Grosse-Holz, 2018)[11], (Anthony DiGiovanni, 2021)[8], (Maxwell Tabarrok, 2022)[12], (MacAskill, 2023)[13], (Toby Ord's answer to MacAskill 2023)[14], (Jim Buhler, 2023)[15], and (Magnus Vinding 2024)[16].
Context
Evaluating the Neutrality Hypothesis - Introductory Series. This post is part of a series introducing a research project for which I am seeking funding: Evaluating the Neutrality Hypothesis. This project includes evaluating both the Civ-Saturation and the Civ-Similarity Hypotheses and their longtermist macrostrategic implications. This introductory series hints at preliminary research results and looks at the tractability of making further progress in evaluating these hypotheses.
Next: A first evaluation of the Civ-Saturation Hypothesis. Over the next few posts, we will introduce a first evaluation of the Civ-Saturation Hypothesis. We will start by reviewing existing SFC density estimates and the models producing them, and by clarifying the meaning of Civ-Saturation and its impact on which possible worlds we should bet on.
Plan of the sequence
(Introduction)
(1) Longtermist implications of alien Space-Faring Civilizations - Introduction
(A first pass at evaluating the Civ-Saturation Hypothesis)
(2) Space-Faring Civilization density estimates and models - Review
(3) Decision-Relevance of worlds and ADT implementations
(4) Formalizing Civ-Saturation concepts and metrics
(5) Should we bet on worlds saturated with Space-Faring Civilizations? - A first-pass evaluation
(Object-level arguments about the Civ-Similarity Hypothesis and its tractability)
(6) Selection Pressures on Space-Faring Civilization Shapers - Preliminary Insights
(7) High-level reasons for optimism in studying the Existence Neutrality Hypothesis
(Introducing the research project & implications)
(8) Evaluating the Existence Neutrality Hypothesis - A research project
(9) Macrostrategic Implications of the Existence Neutrality Hypothesis
Acknowledgments
Thanks to Tristan Cook, Magnus Vinding, Miles Kodama, and Justis Mills for their excellent feedback on this post and ideas. Note that this research was done under my personal name and that this content is not meant to represent any organization's stance.
By increasing P(Alignment), I mean increasing the probability that the SFC Humanity would create is aligned with some kind of ideal moral value (e.g., CEV), and has the ability to optimize it strongly. This requires some degree of success at both technical alignment and AI governance.
The hypothesis is specifically about what we should bet on when we are making decisions. Its extended version is: When making decisions, we should bet on the fact that most resources will be claimed by Space-Faring Civilizations (SFCs) regardless of whether humanity creates an SFC.
Expected utility per unit of resource grabbed.
Exact copies are agents that are exactly equivalent to you: the positions of all the particles composing them are identical to those composing you. They are perfect copies of you living in different parts of the world (e.g., the multiverse).
Quote: “How much one should value Earth-originating and alien civilisations is very unclear. If you accept moral anti-realism, one reason to expect aliens to be less valuable than Earth-originating civilisations is that humans are more likely to share your values, since you are a human. However, there might be some convergence among goals, so it’s unclear how strong this effect is.” (Finnveden 2019)
Quote: “If we knew for certain that ETs would colonize our region of the universe if Earth-originating intelligence did not, then the question of whether humans should try to colonize space becomes less obvious. As noted above, it's plausible that humans are more compassionate than a random ET civilization would be. On the other hand, human-inspired computations might also entail more of what we consider to count as suffering because the mind architectures of the agents involved would be more familiar. And having more agents in competition for our future light cone might lead to dangerous outcomes.” (Brian Tomasik 2015)
Quote: "We may however assume that our reflected preferences depend on some aspects of being human, such as human culture or the biological structure of the human brainfn-48. Thus, our reflected preferences likely overlap more with a (post-)human civilization than alternative civilizations. As future agents will have powerful tools to shape the world according to their preferences, we should prefer (post-)human space colonization over space colonization by an alternative civilization." (Jan M. Brauner and Friederike M. Grosse-Holz, 2019)
Quote: "Arguments on this point will very likely not be robust; on any side of the debate, we are left with speculation, as our data consists of only one sample from the distribution of potentially space-colonizing species (i.e., ourselves).[51] On the side of optimism about humans relative to aliens, our species has historically displayed a capacity to extend moral consideration from tribes to other humans more broadly, and partly to other animals. Pessimistic lines of evidence include the exponential growth of factory farming, genocides of the 19th and 20th centuries, and humans’ unique degree of proactive aggression among primates (Wrangham, 2019).[52] Our great uncertainty arguably warrants focusing on increasing the quality of future lives conditional on their existence, rather than influencing the probability of extinction in either direction.
It does seem plausible that, by evolutionary forces, biological nonhumans would care about the proliferation of sentient life about as much as humans do, with all the risks of great suffering that entails. To the extent that impartial altruism is a byproduct of cooperative tendencies that were naturally selected (rather than “spandrels”), and of rational reflection, these beings plausibly would care about as much as humans do about reducing suffering. If, as suggested by work such as that of Henrich (2020), impartial values are largely culturally contingent, this argument does not provide a substantial update against +ERR if our prior view was that impartiality is an inevitable consequence of philosophical progress.[53] On the other hand, these cultures that tend to produce impartial values may themselves arise from convergent economic factors.[54] Brauner and Grosse-Holz’s mathematical model also acknowledges the following piece of weak evidence against +ERR in this respect: intelligent beings with values orthogonal to most humans’ (or most philosophically deliberative humans’) would tend not only to create less value in the future, but also less disvalue. Given the arguments in section 2.2 for the simplicity of disvalue, however, this difference may not be large." (Anthony DiGiovanni, 2021)
More precisely, from the point of view of impartial longtermists who also, at least, care for the impact of their exact copies (or believe in stronger forms of EDT).
Quote: "If another species took over and built a space-faring civilization, would it be better or worse than our own? There's some chance it could be more compassionate, such as if bonobos took our place. But it might also be much less compassionate, such as if chimpanzees had won the evolutionary race, not to mention killer whales. On balance it's plausible our hypothetical replacements would be less compassionate, because compassion is something humans value a lot, while a random other species probably values something else more. The reason I'm asking this question in the first place is because humans are outliers in their degree of compassion. Still, in social animals, various norms of fair play are likely to emerge regardless of how intrinsically caring the species is. Simon Knutsson pointed out to me that if human survivors do recover from a near-extinction-level catastrophe, or if humans go extinct and another species with potential to colonize space evolves, they'll likely need to be able to cooperate rather than fighting endlessly if they are to succeed in colonizing space. This suggests that if they colonize space, they will be more moral or peaceful than we were. My reply is that while this is possible, a rebuilding civilization or new species might curb infighting via authoritarian power structures or strong ingroup loyalty that doesn't extend to outgroups, which might imply less compassion than present-day humans have." (Brian Tomasik 2015)
Quote: "If humanity goes extinct without colonizing space, some kind of other beings would likely survive on earthfn-47. These beings might evolve into a non-human technological civilization in the hundreds of millions of years left on earth and eventually colonize space. Similarly, extraterrestrials (that might already exist or come into existence in the future) might colonize (more of) our corner of the universe, if humanity does not.
In these cases, we must ask whether we prefer (post-)human space colonization over the alternatives. Whether alternative civilizations would be more or less compassionate or cooperative than humans, we can only guess. We may however assume that our reflected preferences depend on some aspects of being human, such as human culture or the biological structure of the human brainfn-48. Thus, our reflected preferences likely overlap more with a (post-)human civilization than alternative civilizations. As future agents will have powerful tools to shape the world according to their preferences, we should prefer (post-)human space colonization over space colonization by an alternative civilization." (Jan M. Brauner and Friederike M. Grosse-Holz, 2018)
Quote: "The base rate of formation of intelligent or morally valuable life on earth and in the universe is an essential but unknown parameter for EA Longtermist philosophy. Longtermism currently assumes that this rate is very low which is fair given the lack of evidence. If we find evidence that this rate is higher, then wide moral circle Longtermists should shift their efforts from shielding humanity from as much existential risk as possible, to maximizing expected value by taking higher volatility paths into the future." (Maxwell Tabarrok, 2022)
Quote: "I think one could reasonably hold, for example, that the probability of a technologically-capable species evolving, if Homo sapiens goes extinct, is 90%, that non-Earth-originating alien civilisations settling the solar systems that we would ultimately settle is also 90%, and that such civilisations would have similar value to human-originating civilisation.
(They also change how you should think about longterm impact. If alien civilisations will settle the Milky Way (etc) anyway, then preventing human extinction is actually about changing how interstellar resources are used, not whether they are used at all.)
And I think it means we miss out on some potentially important ways of improving the future. For example, consider scenarios where we fail on alignment. There is no “humanity”, but we can still make the future better or worse. A misaligned AI system that promotes suffering (or promotes something that involves a lot of suffering) is a lot worse than an AI system that promotes something valueless. " (MacAskill 2023)
Quote: "You are right that the presence or absence of alien civilisations (especially those that expand to settle very large regions) can change things. I didn't address this explicitly because (1) I think it is more likely that we are alone in the affectable universe, and (2) there are many different possible dynamics for multiple interacting civilisations and it is not clear what is the best model. But it is still quite a plausible possibility and some of the possible dynamics are likely enough and simple enough that they are worth analysing." (Toby Ord's answer to MacAskill 2023)
Quote: "Hanson (2021) and Cook (2022) estimate that we should expect to eventually “meet” (grabby) alien AGIs/civilizations – just AGIs, from here on – if humanity expands, and that our corner of the universe will eventually be colonized by aliens if humanity doesn’t expand.
This raises the following three crucial questions:
What would happen once/if our respective AGIs meet? Values handshakes (i.e., cooperation) or conflict? Of what forms?
Do we have good reasons to think the scenario where our corner of the universe is colonized by humanity is better than that where it is colonized by aliens? Should we update on the importance of reducing existential risks?[1]
Considering the fact that aliens might fill our corner of the universe with things we (dis)value, does humanity have an (inter-civilizational) comparative advantage in focusing on something the grabby aliens will neglect?" (Jim Buhler, 2023)
Quote: "Impartial AI safety would plausibly give strong consideration to our potential impact on other cosmic agents, whereas AI safety that exclusively prioritizes, say, human survival or human suffering reduction would probably not give it strong consideration, if indeed any consideration at all. So the further we diverge from ideals of impartiality in our practical focus, the more likely we may be to neglect our potential impact on other cosmic agents." (Magnus Vinding 2024)