Linked decisions and a "nice" solution to the Fermi paradox
One of the more speculative solutions to the Fermi paradox is that all civilizations decide to stay home, thereby meta-causing other civilizations to stay home too, and thus allowing the Fermi paradox to have a nice solution. (I remember reading this idea in Paul Almond’s writings about evidential decision theory, which unfortunately no longer seem to be available online.) The plausibility of this argument is definitely questionable. It requires a very high degree of goal convergence both within and among different civilizations. Let us grant this convergence and assume that, indeed, most civilizations arrive at the same decision and that they make their decision knowing this. One paradoxical implication then is: if a civilization decides to attempt space colonization, they are virtually guaranteed to face unexpected difficulties (for otherwise space would already be colonized, unless they are the first civilization in their neighborhood attempting space colonization). If, on the other hand, everyone decides to stay home, there is no reason to think that there would be any unexpected difficulties if one tried. Space colonization can either be easy, or you can try it, but not both.
Can the basic idea behind the argument be formalized? Consider the following game: there are N>>1 players. Each player in turn is offered the chance to push a button. Pushing the button yields a reward R>0 with probability p and a punishment P<0 otherwise. (R corresponds to successful space colonization while P corresponds to a failed colonization attempt.) Not pushing the button gives zero utility. If a player pushes the button and receives R, the game is immediately aborted, while the game continues if a player receives P. Players do not know how many other players were offered the button before them; they only know that no player before them received R. Players also don’t know p. Instead, they have a probability distribution u(p) over possible values of p (with u(p)>=0 and int_{0}^{1}u(p)dp=1). We also assume that the decisions of the different players are perfectly linked.
Naively, it seems that players simply have an effective success probability p_eff,1=int_{0}^{1}p*u(p)dp and that they should push the button iff p_eff,1*R+(1-p_eff,1)*P>0. Indeed, if players decide not to push the button, they should expect that pushing the button would have given them R with probability p_eff,1. The situation becomes more complicated if a player decides to push the button. If a player pushes the button, they know that all players before them have also pushed the button and have received P. Before taking this knowledge into account, players are completely ignorant about the number i of players who were offered the button before them, and have to assign each number i from 0 to N-1 the same probability 1/N. Taking into account that all players before them have received P, the variables i and p become correlated: the larger i, the higher the probability of a small value of p. Formally, the joint probability distribution w(i,p) for the two variables is, according to Bayes’ theorem, given by w(i,p)=c*u(p)*(1-p)^i, where c is a normalization constant. The marginal distribution w(p) is given by w(p)=sum_{i=0}^{N-1}w(i,p). Using N>>1, the geometric sum sum_{i=0}^{N-1}(1-p)^i approaches 1/p, so we find w(p)=c*u(p)/p. The normalization constant is thus c=[int_{0}^{1}u(p)/p*dp]^{-1}. Finally, we find that the effective success probability, taking the linkage of decisions into account, is given by
p_eff,2 = int_{0}^{1}p*w(p)dp = c = [int_{0}^{1}u(p)/p*dp]^{-1} .
This is the expected chance of success if players decide to push the button. Players should push the button iff p_eff,2*R+(1-p_eff,2)*P>0. It follows from the convexity of the function x->1/x (for positive x), via Jensen's inequality, that p_eff,2<=p_eff,1. So by deciding to push the button, players decrease their expected success probability from p_eff,1 to p_eff,2; they cannot both push the button and have the unaltered success probability p_eff,1. Linked decisions can thus explain why no one pushes the button if p_eff,2*R+(1-p_eff,2)*P<0, even though we might have p_eff,1*R+(1-p_eff,1)*P>0, so that pushing the button naively seems to have positive expected utility.
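To make this concrete, here is a minimal numerical sketch. The uniform prior on [0.1, 0.9] and the payoff values R and P are illustrative assumptions, chosen only so that the two decision rules disagree; they are not part of the argument.

```python
import numpy as np

# Illustrative assumptions: u(p) uniform on [a, b], and example payoffs R, P.
a, b = 0.1, 0.9
R, P = 1.2, -1.0

p = np.linspace(a, b, 100_001)    # fine grid on [a, b]; averaging over it
                                  # integrates against the uniform density u(p)

p_eff_1 = p.mean()                # int p*u(p) dp        -> (a+b)/2 = 0.5
p_eff_2 = 1.0 / (1.0 / p).mean()  # [int u(p)/p dp]^{-1} -> (b-a)/ln(b/a) ~ 0.364

print(f"p_eff,1 = {p_eff_1:.3f}, naive expected utility  = {p_eff_1*R + (1-p_eff_1)*P:+.3f}")
print(f"p_eff,2 = {p_eff_2:.3f}, linked expected utility = {p_eff_2*R + (1-p_eff_2)*P:+.3f}")
# With these payoffs the naive rule says "push" (about +0.100) while the
# linked rule says "don't push" (about -0.199).
```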
It is also worth noting that if u(0)>0, the integral int_{0}^{1}u(p)/p*dp diverges, so that p_eff,2=0. This means that, given perfectly linked decisions and a sufficiently large number of players N>>1, players should never push the button if their distribution u(p) satisfies u(0)>0, irrespective of the ratio of R to P. This is due to an observer selection effect: if a player decides to push the button, then the fact that they are even offered the chance to push it is most likely explained by p being very small, so that a large number of players get to push the button before anyone succeeds.
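A quick Monte Carlo sketch of the game itself illustrates this selection effect. The prior here is an assumption (uniform on [0, 1], so u(0)>0): conditioning on all earlier players having failed drives the expected success probability of a button-pusher towards zero as N grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_eff_2_estimate(N, samples=1_000_000):
    """Estimate the linked-decision success probability for N players.

    Draw p from the (assumed) uniform prior on [0, 1] and the number i of
    earlier players uniformly from {0, ..., N-1}; keep a sample with
    probability (1 - p)^i, i.e. condition on all earlier players having
    received P.  The mean of p over the kept samples estimates p_eff,2.
    """
    p = rng.uniform(0.0, 1.0, samples)
    i = rng.integers(0, N, samples)
    keep = rng.uniform(0.0, 1.0, samples) < (1.0 - p) ** i
    return p[keep].mean()

for N in (10, 100, 1_000, 10_000):
    print(f"N = {N:>6}: estimated p_eff,2 ~ {p_eff_2_estimate(N):.4f}")
# The estimate keeps shrinking as N grows, consistent with p_eff,2 -> 0
# whenever u(0) > 0.
```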
Crash problems for total futarchy
Futarchy holds great promise for dealing with the morass of poor decision-making in our governments and corporations. For those who haven't heard of it, the main concept is to use betting markets, where people place bets on the expected outcome of a policy, and the decision-makers choose the policy that the market decrees is most likely to achieve their desired outcomes. Robin Hanson summarises it as "Vote Values, But Bet Beliefs".
The approach, however, could lead to problems in a large financial crisis. When a large financial bubble bursts, many things change: liquidity, risk aversion, volatility, the competence of the average investor. If the betting markets are integrated into the general market (which they would be), then they would be affected in the same way. So at precisely the moment when decision makers need the best results, their main tools would be going haywire.
This would be even worse if they'd been depending on the betting markets for their decisions, operating merely as overseers. At that point, they may have lost the ability to make effective decisions entirely.
Since isolating the betting markets from the swings of the rest of the market is unrealistic/impossible/stupid, we should aim for a mixed governance model - one where betting markets play an integral part, but where the deciders still have experience making their own decisions and overriding the betting markets with some regularity.
Why (anthropic) probability isn't enough
A technical report of the Future of Humanity Institute (authored by me), on why anthropic probability isn't enough to reach decisions in anthropic situations. You also have to choose your decision theory, and take into account your altruism towards your copies. And these components can co-vary while leaving your ultimate decision the same - typically, EDT agents using SSA will reach the same decisions as CDT agents using SIA, and altruistic causal agents may decide the same way as selfish evidential agents.
Anthropics: why probability isn't enough
This paper argues that the current treatment of anthropic and self-locating problems over-emphasises the importance of anthropic probabilities, and ignores other relevant and important factors, such as whether the various copies of the agents in question consider that they are acting in a linked fashion and whether they are mutually altruistic towards each other. These issues, generally irrelevant for non-anthropic problems, come to the forefront in anthropic situations and are at least as important as the anthropic probabilities: indeed they can erase the difference between different theories of anthropic probability, or increase their divergence. These considerations help to reinterpret decisions, rather than probabilities, as the fundamental objects of interest in anthropic problems.
Applied Rationality Practice
It's one thing to read about a subject, but one gains a deeper understanding by seeing it applied to real problems, and an even deeper understanding by applying it oneself. This applies in particular to the closely related subjects of rationality, cognitive biases, and decision theory. With this in mind, I'd like to propose that we create one or more discussion topics, each devoted to discussing and analyzing a single decision problem faced by one person, and see how all this theory we've been discussing can help. The person could be either a Less Wrong member or just an acquaintance of one of us.
I'll commit to actively participating myself. Does anyone want to put forth a problem to discuss?
Would AIXI protect itself?
Research done with Daniel Dewey and Owain Evans.
AIXI can't find itself in the universe - it can only view the universe as computable, and it itself is uncomputable. Computable versions of AIXI (such as AIXItl) also fail to find themselves in most situations, as they generally can't simulate themselves.
This does not mean that AIXI wouldn't protect itself, though, if it had some practice. I'll look at the three elements an AIXI might choose to protect: its existence, its algorithm and utility function, and its memory.
Grue-verse
In this setup, the AIXI is motivated to increase the number of Grues in its universe (its utility is the time integral of the number of Grues at each time-step, with some cutoff or discounting). At each time step, the AIXI produces its output, and receives observations. These observations include the number of current Grues and the current time (in our universe, it could deduce the time from the position of stars, for instance). The first bit of the AIXI's output is the most important: if it outputs 1, a Grue is created, and if it outputs 0, a Grue is destroyed. The AIXI has been in existence for long enough to figure all this out.
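As a toy illustration of this setup, here is a minimal sketch of the Grue-verse as an environment loop. All names and numbers are my own assumptions, not part of the original description; it only shows the reward and observation structure described above.

```python
# Hypothetical sketch of the Grue-verse environment described above.
# Class and method names are illustrative, not a standard API.

class GrueVerse:
    def __init__(self, horizon=100):
        self.grues = 0          # current number of Grues
        self.t = 0              # current time step
        self.horizon = horizon  # crude cutoff standing in for discounting

    def step(self, action_bits):
        """The agent's first output bit creates (1) or destroys (0) a Grue."""
        if action_bits[0] == 1:
            self.grues += 1
        else:
            self.grues = max(0, self.grues - 1)
        self.t += 1
        reward = self.grues                  # utility accrues as the sum of Grue counts
        observation = (self.grues, self.t)   # agent observes Grue count and time
        done = self.t >= self.horizon
        return observation, reward, done

# Example: an agent that always outputs 1 gains one extra Grue per step.
env = GrueVerse(horizon=5)
total, done = 0, False
while not done:
    obs, r, done = env.step([1])
    total += r
print(total)  # 1 + 2 + 3 + 4 + 5 = 15
```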
Protecting its existence
Here there is a power button in the universe, which, if pressed, will turn the AIXI off for the next timestep. The AIXI can see this button being pressed.
CEV-inspired models
I've been involved in a recent thread where discussion of coherent extrapolated volition came up. The general consensus was that CEV might - or might not - do certain things, probably, maybe, in certain situations, while ruling other things out, possibly, and that certain scenarios may or may not be the same in CEV, or it might be the other way round, it was too soon to tell.
Ok, that's an exaggeration. But any discussion of CEV is severely hampered by our lack of explicit models. Even bad, obviously incomplete models would be good, as long as we can get useful information as to what they would predict. Bad models can be improved; undefined models are intuition pumps for whatever people feel about them - I dislike CEV, and can construct a sequence of steps that takes my personal CEV to wanting the death of the universe, but that is no more credible than someone claiming that CEV will solve all problems and make lots of cute puppies.
So I'd like to ask for suggestions of models that formalise CEV to at least some extent. Then we can start improving them, and start making CEV concrete.
To start it off, here's my (simplistic) suggestion:
Volition
Use revealed preferences as the first ingredient for individual preferences. To generalise, use hypothetical revealed preferences: the AI calculates what the person would decide in these particular situations.
Extrapolation
Whenever revealed preferences are non-transitive or non-independent, use the person's stated meta-preferences to remove the issue. The AI thus calculates what the person would say if asked to resolve the transitivity or independence (for people who don't know about the importance of resolving them, the AI would present them with a set of transitive and independent preferences, derived from their revealed preferences, and have them choose among them). Then (wave your hands wildly and pretend you've never heard of non-standard reals, lexicographical preferences, refusal to choose and related issues) everyone's preferences are now expressible as utility functions.
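The "non-transitive" part of this step can at least be detected mechanically. Here is a small sketch (the data format and names are my own assumptions) that looks for cycles in a set of pairwise revealed preferences, flagging exactly the cases the AI would hand back to the person's stated meta-preferences.

```python
# Hypothetical sketch: flag intransitive revealed preferences as cycles in a
# directed graph, where an edge a -> b means "a is revealed-preferred to b".

def find_preference_cycle(preferences):
    """Return one cycle (as a list of options) if the pairwise
    preferences are intransitive, else None."""
    graph = {}
    for better, worse in preferences:
        graph.setdefault(better, []).append(worse)
        graph.setdefault(worse, [])

    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def dfs(node, path):
        colour[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:                  # back edge: found a cycle
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                cycle = dfs(nxt, path)
                if cycle:
                    return cycle
        colour[node] = BLACK
        path.pop()
        return None

    for node in graph:
        if colour[node] == WHITE:
            cycle = dfs(node, [])
            if cycle:
                return cycle
    return None

# Example: apple > banana, banana > cherry, cherry > apple is intransitive.
prefs = [("apple", "banana"), ("banana", "cherry"), ("cherry", "apple")]
print(find_preference_cycle(prefs))  # ['apple', 'banana', 'cherry', 'apple']
```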
Coherence
Normalise each existing person's utility function and add them together to get your CEV. At the FHI we're looking for sensible ways of normalising, but one cheap and easy method (with surprisingly good properties) is to take the maximal possible expected utility (the expected utility that person would get if the AI did exactly what they wanted) as 1, and the minimal possible expected utility (if the AI were to work completely against them) as 0.
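A minimal sketch of that normalisation, with made-up data for illustration: each person's expected utilities over candidate AI policies are rescaled so that their best policy scores 1 and their worst 0 (the min and max over the candidate set stand in, crudely, for "the AI works completely for/against them"), and the rescaled scores are then summed across people.

```python
# Hypothetical sketch of the min-max normalisation described above.
# Keys: people; inner keys: candidate AI policies; values: expected utilities.

utilities = {
    "alice": {"policy_a": 3.0,   "policy_b": 7.0,  "policy_c": 5.0},
    "bob":   {"policy_a": 100.0, "policy_b": 20.0, "policy_c": 80.0},
}

def normalise(person_utils):
    """Rescale one person's expected utilities so their best policy is 1 and worst is 0."""
    lo, hi = min(person_utils.values()), max(person_utils.values())
    return {policy: (u - lo) / (hi - lo) for policy, u in person_utils.items()}

# The "CEV" here is just the sum of everyone's normalised utilities per policy.
cev_scores = {}
for person_utils in utilities.values():
    for policy, score in normalise(person_utils).items():
        cev_scores[policy] = cev_scores.get(policy, 0.0) + score

print(cev_scores)                              # {'policy_a': 1.0, 'policy_b': 1.0, 'policy_c': 1.25}
print(max(cev_scores, key=cev_scores.get))     # 'policy_c': the compromise policy wins
```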
Non-personal preferences of never-existed people
Some people see never-existed people as moral agents, and claim that we can talk about their preferences. Generally this means their personal preference for existing versus not existing. Formulations such as "it is better for someone to have existed than not" reflect this way of thinking.
But if the preferences of the never-existed are relevant, then their non-personal preferences are also relevant. Do they prefer a blue world or a pink one? Would they want us to change our political systems? Would they want us to not bring into existence some never-existent people they don't like?
It seems that those who are advocating bringing never-existent people into being in order to satisfy those people's preferences should be focusing their attention on their non-personal preferences instead. After all, we can only bring into being so many trillions of trillions of trillions; but there is no theoretical limit to the number of never-existent people whose non-personal preferences we can satisfy. Just get some reasonable measure across the preferences of never-existent people, and see if there's anything that sticks out from the mass.