[LINK] IBM simulate a "brain" with 500 billion neurons and 100 trillion synapses

5 drnickbone 21 November 2012 10:23PM

Recent article in The New Yorker:

http://www.newyorker.com/online/blogs/newsdesk/2012/11/ibm-brain-simulation-compass.html

Here is the research report from IBM, with the simple title "10^14":

http://www.modha.org/blog/SC12/RJ10502.pdf

It's nothing like a real brain simulation, of course, but it illustrates that the hardware needed to run one is getting very close.

There is likely to be quite a long overhang between the hardware and the software...

 

How do we really escape Prisoners' Dilemmas?

1 drnickbone 31 August 2012 11:36PM

This discussion article was provoked in part by Yvain's post on Main a few weeks ago, and some of the follow-up comments.

EDIT: I've also just noticed that there was a recent sequence rerun on the point about finite iterations. My bad: I simply didn't see the rerun article, as it had already slipped down a couple of pages when I posted. If you down-voted (or didn't read) out of a feeling of "Didn't we just do this?" then sorry.

In any case, one of my main motivations for running this article was point 5 (Does an environment of commitment and reputation create the background against which TDT - or something like it - can easily evolve?) I didn't get any responses on that point, so I might try to run it again in a future article.

END EDIT

It is well-known that in a one-shot prisoner's dilemma, the only stable solution (Nash equilibrium) is for both parties to defect. But, perhaps less well-known, this is true for any finite-shot version of the dilemma, or any version where there is a finite upper bound on the number of iterations. For instance, a more sophisticated strategy than Tit For Tat (TFT) would determine when it has reached the last iteration, and then defect. Call this TFT-1. But then once TFT-1 has established itself, a strategy which detects the last two iterations and defects on both of them (TFT-2) would establish itself, and so on.
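To make the unravelling concrete, here is a minimal sketch of the TFT-k family in a game with a known number of rounds. The pay-off values (5, 3, 1, 0) are the conventional textbook ones and are my assumption, as are the names `tft_k_move` and `play`; nothing here is taken from a particular library.

```python
# Pay-offs (mine, theirs) for each pair of moves; the numbers are assumed.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def tft_k_move(k, round_no, n_rounds, opponent_history):
    """TFT-k: play Tit For Tat, but defect unconditionally in the last k rounds."""
    if round_no > n_rounds - k:
        return "D"
    if not opponent_history:
        return "C"  # Tit For Tat opens by co-operating
    return opponent_history[-1]  # otherwise copy the opponent's previous move

def play(k1, k2, n_rounds):
    """Total pay-offs when TFT-k1 meets TFT-k2 over n_rounds rounds."""
    history1, history2 = [], []
    score1 = score2 = 0
    for round_no in range(1, n_rounds + 1):
        move1 = tft_k_move(k1, round_no, n_rounds, history2)
        move2 = tft_k_move(k2, round_no, n_rounds, history1)
        pay1, pay2 = PAYOFF[(move1, move2)]
        score1 += pay1
        score2 += pay2
        history1.append(move1)
        history2.append(move2)
    return score1, score2

print(play(0, 0, 10))  # TFT against TFT: (30, 30)
print(play(1, 0, 10))  # TFT-1 against TFT: (32, 27)
print(play(1, 1, 10))  # TFT-1 against TFT-1: (28, 28)
print(play(2, 1, 10))  # TFT-2 against TFT-1: (30, 25)
```

With these assumed pay-offs, TFT-1 earns more against TFT than TFT earns against itself, and TFT-2 earns more against TFT-1 than TFT-1 earns against itself, which is the step-by-step invasion described above.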

Since prisoners' dilemmas are always finite in practice, and always have been (we are mortal, and the Sun will blow up at some point), this raises the question of why we actually co-operate in practice. Why is TFT, or something very like it, still around?

Somehow, evolution (biological, cultural or both) has managed to engineer into us a strategy which is not a Nash equilibrium. Because any "evolutionarily stable strategy" (as usually defined) is a Nash equilibrium, somehow we have evolved a strategy which is not strictly evolutionarily stable. How could that have happened?

I can think of a few possibilities, and have a view about which of these are more realistic. I'm also wondering if other Less Wrong contributors have seriously thought through the problem, and have alternative suggestions.

 

1. Strategies like TFT succeed because they are very simple, and the alternatives are too complicated to replace them.

The argument here is that there are big costs to a strategy in "hardware" or "software" complexity, so that a crude strategy will out-compete a more sophisticated strategy. In particular TFT-1 is more complex than TFT and the additional computational costs outweigh the benefits. This is most plausibly the case where there is a very large upper bound on iterations (such as 100 years), but the upper bound is so rarely (if ever) reached in practice, that strategies which do something different in the final phase just don't have a selective advantage compared to the cost of the additional complexity. So the replacement of TFT by TFT-1 never happens.     

The difficulty with this explanation is that humans can (often) recognize when "this time is the last", and the computational cost of doing something different in that case is not great. Yet we either don't change, or we change in ways that TFT-1 would not predict. For instance, we can tell when we are visiting a restaurant we will never visit again (on a trip abroad say), but are still likely to tip. Also, it is striking that people co-operate about 50% of the time in known one-shot prisoners' dilemmas and similar games (see this analysis of Split or Steal?). Why 50%, rather than nearly 0%, or nearly 100%? And we often change our behaviour radically when we know we are going to die soon, but this change rarely involves antisocial behaviour like stealing, mugging, running up huge debts we'll never have to pay back and so on.

So I'm not convinced by this "alternatives are too complicated" explanation.

 

2. Emotional commitments change the pay-offs

Victims of defection don't take it lying down. They react angrily, and vengefully. Even if there are no obvious opportunities for future co-operation, and even where it involves further cost, victims will go out of their way to attempt to hurt the defector. On the nicer side, emotions of friendliness, indebtedness, duty, loyalty, admiration or love can cause us to go out of our way to reward co-operators, again even if there are no obvious opportunities for future co-operation.

Given these features of human nature as a background, the pay-offs change in a one-shot or finite-bound prisoner's dilemma, and may convert it to a non-dilemma. The pay-off for co-operating becomes greater than the pay-off for defection. This "solves" the problem of why we co-operate in a PD by denying it - effectively there wasn't a true Prisoner's Dilemma in the first place.

There are a number of difficulties with this "solution", one being that even allowing for emotional reactions, there are some true PDs and we can usually recognize them. Take scenarios such as the foreign restaurant, where we know we will not be pursued across the world by a vengeful waiter demanding pay-back for a missing tip. So why don't we always defect in such cases? Why is there a voice of conscience telling us not to? Perhaps this objection could be solved by the "too complicated" response. For example, a strategy which could reliably detect when it is safe to defect (no vengeful payback) would in principle work, but it is likely to have a large complexity overhead. And a strategy which almost works (sometimes thinks it can "get away with it" but actually can't) may have a big negative payoff, so there is no smooth evolutionary pathway towards the "perfect" strategy.

A further difficulty is to explain why humans react in this convenient pay-off-shifting fashion anyway. On one level, it is obvious: we are committed to doing so by strong emotions. Even when we suspect that emotions of vengeance and duty are "irrational" (all pain to us from now on, no gain) we can't help ourselves. Yet, it is this emotional commitment that increases the likelihood that others co-operate with us in the first place. So we can tell a plausible-sounding story about how ancestors with emotional commitments induced more co-operation from their fellows than those without, and hence the "irrationally emotional" ancestors out-competed the "coldly rational" non-ancestors.

But there is a major problem with this story: the "emotionally committed" ancestors could be out-competed in turn by bluffers. Anyone who could fake the emotional signals would be able to elicit the benefits of co-operation (they would successfully deter defection), but without having to follow through on the (costly) commitments in case the co-operation failed. Bluffing out-competes commitment.

Ahh, but if the bluff has been called, and the threatened vengeance (or promised loyalty) doesn't materialise, won't this lead to more defection? So won't people who genuinely follow through on their commitments succeed at the expense of the bluffers? The answer is yes, but again only in the case of iterated interactions, and only in a potentially infinite scenario. The problem of the finite bound returns: it is always better to "bluff" a commitment on the very last interaction. And once bluffing on the last turn has been established, it is better to bluff on the next-to-last. And so on, leading to bluffing on all turns. And then there is no advantage in believing the bluffs, so no deterrent effect, and (in the final equilibrium), no advantage in making the bluffs either. The only true equilibrium has no commitment, no deterrence and no co-operation.

Again, we can try to rescue the "commitment" theory by recourse to the "too complicated" theory. Quite possibly, alternatives to true commitment are very costly in hardware or software: it is just too hard to bluff convincingly and successfully. That might be true, but on the other hand, there are plenty of poker players and con artists who would say differently.

 

3. Social pressures and reputational effects change the pay-offs

Human decisions to co-operate or defect are very rarely made in isolation, and this could help explain why we co-operate even though we know (or can predict) "this time is the last". We won't benefit from defection if we simultaneously gain reputations as defectors. 

As in explanation 2, the effect of this social pressure is to change the pay-off matrix. Although there may appear to be a benefit from one-shot/last-shot defecting, in a social context where our actions are known (and defections by us will lead to defections by third parties against us), then there is a greater pay-off from co-operating rather than defecting.

Once again this "solves" the problem of why we co-operate in PDs by denying it. Once again it faces the objection that there are true PDs (involving secret defection) and we can recognize them, but often don't defect in them. Again, perhaps this objection could be met by the "too complicated" response; it is just too hard to tell when the defection is really secret.  

A second objection is that this reputational theory still doesn't cover end-of-life effects: why are we worried at all about our reputation when death is near? (Why do we even worry more about our reputation in such cases?)

But a more basic objection is "How did we ever get into a social environment where third party reputation matters like this?" Consider for instance a small society involving Anne, Bob, and Charles. Anne and Bob are engaging in an iterated prisoners' dilemma, and regularly co-operating. Bob and Charles meet in a one-shot prisoners' dilemma, and Bob defects. Anne sees this. How does it help Anne in this situation to start defecting against Bob? Generally it doesn't. A reputational system only helps if it identifies and isolates people who won't co-operate at all (the pure defectors). But Bob is not a pure defector, so why does he end up being penalized by Anne?

Perhaps the relevant model is where Anne hasn't interacted with Bob yet at all, but there is a new opportunity for iterated co-operation coming up. By seeing Bob defect against Charles, Anne gets evidence that Bob is a defector rather than a co-operator, so she won't even start to co-operate with him. If Anne could discriminate a bit more clearly, she would see that Bob is not a pure defector, but she can't. And this is enough to penalize Bob for defecting against Charles. Possibly that works, but I'm doubtful whether these "new opportunity to co-operate" cases occur often enough in practice to really penalize one-shot defection (which is observed at exactly the time needed to spoil the opportunity). Or more to the point, did they occur often enough in human history and pre-history to matter?

But suppose for the moment that we have an explanation for how the reputational system arises and persists. Then the reputational effect will apply to commitments as well: individuals won't benefit if they are identified as bluffers, so truly committed individuals (with strong emotions) benefit over those who are known to fake emotions, or to "coldly" override their emotions. So a reputational explanation for co-operation can strengthen a commitment explanation for co-operation. Or in the other direction, any emotional commitments (to principles of justice, disgust at exploitation etc.) can reinforce the reputational system. So it seems we have two somewhat dubious mechanisms which could nevertheless reinforce each other and build to a strong mechanism. Perhaps.

 

4. Group selection

There have been different societies / social groups through history. Perhaps some have had reputational systems which successfully converted Prisoners' Dilemmas into non-Prisoners' Dilemmas, while others haven't, and their members were left with lots of true PDs (and lots of defection). The societies which avoided true PDs experienced less defection, and out-competed the others.

This has a ring of plausibility about it, but suffers from many of the same general problems as any Group-selection theory. Human groups aren't isolated from each other like separate organisms, and don't reproduce like organisms: they exchange members too often.

Still, this solution might address one of the main theoretical objections to Group selection, that "co-operating" groups are unstable to defection (either arising from internal changes, or brought in by new members), and the defection will spread through the group faster than the group can out-reproduce rival groups. Groups with the right reputational systems are - by hypothesis - stable against defection. So it might work.

Or perhaps reputational systems aren't quite stable against defection - they eventually collapse because of secret defections, "last time" defections which can't be punished by other members, laziness of other members in enforcing the co-operation, false reputations and so on. This slow erosion eventually kills the group, but not before it has established child groups of some sort. Again perhaps this might work.

 

5. Prediction and Omegas: from TFT to TDT

One striking feature about both the commitment explanation (2) and the reputational explanation (3) is how they reward successful prediction of human behaviour. This is obvious for commitments: it is the predictable emotional commitment that creates the deterrent against defection (or the lure towards co-operation). And being able to predict who is really vengeful and loyal, and who is just bluffing, gives individuals a further strong advantage.

But this extends to the reputational system too. Suppose Bob defects against Charles, while Charles co-operates. Anne sees this and is disgusted (how could Bob exploit poor Charles like that?). Yet suppose Charles defects as well. Then Anne admires Bob for his prudence (rather than being taken for a ride by that evil Charles). So Bob gets the reputational pay-off precisely when he can successfully predict how Charles will behave, and do the same. If the reputational pay-off is high, then there is a strong pressure towards a "mirror" strategy (try to predict whether the other person will co-operate or defect and then do likewise).

This is rather interesting, since it is starting to sound like Newcomb's problem, where we have a (hypothetical) predictor who can't be outwitted. Why is that a believable story at all? Why don't we just stare in bemusement at the very idea? Well, suppose we model "co-operation" as the human player taking one box, which Omega fills with $1 million, versus "defection" as the human player taking both boxes (and Omega not filling the opaque one). Or suppose we treat a resolution to take one box as a "commitment" and an after-the-fact decision to take two boxes (because it no longer makes a difference) as bluffing on a commitment. And of course the rationale for a human player to "co-operate" or to truly "commit" is Omega's reputation for always predicting correctly!

So, here is a story about how "Timeless Decision Theory" (or something like it) could emerge from "Tit for Tat". A combination of commitment effects (2) and reputational effects (3) leads to an environment where successful prediction of human behaviour is rewarded. Such an environment is - possibly - maintained by group selection (4).

People get rather good at prediction. When meeting a successful predictor who will co-operate if you co-operate, and defect if you defect, it is better to co-operate. When the successful predictor will defect if he suspects you are bluffing on a commitment, it is better to have a true commitment. But it is still not obvious what to do on a one-shot prisoner's dilemma, because you don't know how the other party's prediction will go, and don't know what will enhance your own reputation (so sometimes people co-operate, sometimes defect).

All this favours a style of reasoning rather like TDT. But it can also favour a rather "superstitious" approach to justifying the reasoning, since there is no causal connection between our action and the prediction. Instead we get weird pseudo-causal explanations/justifications like gods who are always watching, ancestral spirits who can be angered, bad karma, what goes around comes around etc. and a general suspicion of those who don't go along with the local superstition (since they can't be predicted to co-operate with those who do).

Does this sound familiar? 

Problematic Problems for TDT

36 drnickbone 29 May 2012 03:41PM

A key goal of Less Wrong's "advanced" decision theories (like TDT, UDT and ADT) is that they should out-perform standard decision theories (such as CDT) in contexts where another agent has access to the decider's code, or can otherwise predict the decider's behaviour. In particular, agents who run these theories will one-box on Newcomb's problem, and so generally make more money than agents which two-box. Slightly surprisingly, they may well continue to one-box even if the boxes are transparent, and even if the predictor Omega makes occasional errors (a problem due to Gary Drescher, which Eliezer has described as equivalent to "counterfactual mugging"). More generally, these agents behave in the way a CDT agent would wish it had pre-committed itself to behaving before being faced with the problem.

However, I've recently thought of a class of Omega problems where TDT (and related theories) appears to under-perform compared to CDT. Importantly, these are problems which are "fair" - at least as fair as the original Newcomb problem - because the reward is a function of the agent's actual choices in the problem (namely which box or boxes get picked) and independent of the method that the agent uses to choose, or of its choices on any other problems. This contrasts with clearly "unfair" problems like the following:

Discrimination: Omega presents the usual two boxes. Box A always contains $1000. Box B contains nothing if Omega detects that the agent is running TDT; otherwise it contains $1 million.

 

So what are some fair "problematic problems"?

Problem 1: Omega (who experience has shown is always truthful) presents the usual two boxes A and B and announces the following. "Before you entered the room, I ran a simulation of this problem as presented to an agent running TDT. I won't tell you what the agent decided, but I will tell you that if the agent two-boxed then I put nothing in Box B, whereas if the agent one-boxed then I put $1 million in Box B. Regardless of how the simulated agent decided, I put $1000 in Box A. Now please choose your box or boxes."

Analysis: Any agent who is themselves running TDT will reason as in the standard Newcomb problem. They'll prove that their decision is linked to the simulated agent's, so that if they two-box they'll only win $1000, whereas if they one-box they will win $1 million. So the agent will choose to one-box and win $1 million.

However, any CDT agent can just take both boxes and win $1,001,000. In fact, any other agent who is not running TDT (e.g. an EDT agent) will be able to re-construct the chain of logic and reason that the simulation one-boxed and so box B contains the $1 million. So any other agent can safely two-box as well.

Note that we can modify the contents of Box A so that it contains anything up to $1 million; the CDT agent (or EDT agent) can in principle win up to twice as much as the TDT agent.
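The analysis of Problem 1 can be laid out as a small pay-off calculation. This is only a sketch of the reasoning as stated above: the function names are mine, and the modelling assumption is that a TDT chooser treats its own choice and the simulated agent's choice as the same thing, while any other agent treats the simulation's choice as already fixed.

```python
BOX_A = 1_000
BOX_B_PRIZE = 1_000_000
CHOICES = ("one-box", "two-box")

def box_b_contents(simulated_choice):
    """Omega fills Box B only if the simulated TDT agent one-boxed."""
    return BOX_B_PRIZE if simulated_choice == "one-box" else 0

def winnings(actual_choice, simulated_choice):
    """Money taken home by the chooser, given what the simulation did."""
    b = box_b_contents(simulated_choice)
    return b if actual_choice == "one-box" else BOX_A + b

# A TDT chooser is (by assumption) linked to the simulation: both make the same choice.
tdt_outcomes = {choice: winnings(choice, choice) for choice in CHOICES}
tdt_choice = max(tdt_outcomes, key=tdt_outcomes.get)

# Any other agent can take the simulation's choice as fixed and then optimise.
other_outcomes = {choice: winnings(choice, tdt_choice) for choice in CHOICES}
other_choice = max(other_outcomes, key=other_outcomes.get)

print(tdt_choice, tdt_outcomes[tdt_choice])        # one-box 1000000
print(other_choice, other_outcomes[other_choice])  # two-box 1001000
```

Raising `BOX_A` towards `BOX_B_PRIZE` widens the gap, up to the factor of two mentioned above.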

 

Problem 2: Our ever-reliable Omega now presents ten boxes, numbered from 1 to 10, and announces the following. "Exactly one of these boxes contains $1 million; the others contain nothing. You must take exactly one box to win the money; if you try to take more than one, then you won't be allowed to keep any winnings. Before you entered the room, I ran multiple simulations of this problem as presented to an agent running TDT, and determined the box which the agent was least likely to take. If there were several such boxes tied for equal-lowest probability, then I just selected one of them, the one labelled with the smallest number. I then placed $1 million in the selected box. Please choose your box."

Analysis: A TDT agent will reason that whatever it does, it cannot have more than 10% chance of winning the $1 million. In fact, the TDT agent's best reply is to pick each box with equal probability; after Omega calculates this, it will place the $1 million in box number 1 and the TDT agent has exactly 10% chance of winning it.
 
But any non-TDT agent (e.g. CDT or EDT) can reason this through as well, and just pick box number 1, so winning $1 million. By increasing the number of boxes, we can ensure that TDT has arbitrarily low chance of winning, compared to CDT which always wins.
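Here is a rough sketch of Omega's placement rule in Problem 2, assuming Omega can read off the simulated agent's exact pick probabilities. The function names and the skewed example distribution are mine, chosen only for illustration; exact fractions are used so that ties are detected reliably.

```python
from fractions import Fraction

def omega_places_prize(pick_probabilities):
    """Return the number (1-based) of the box the simulated agent is least likely
    to take; ties go to the lowest-numbered box, as Omega announces."""
    lowest = min(pick_probabilities)
    return pick_probabilities.index(lowest) + 1  # index() finds the first minimum

def win_chance(pick_probabilities, prize_box):
    """Chance that an agent using the same probabilities picks the prize box."""
    return pick_probabilities[prize_box - 1]

# The best reply described above: every box with equal probability.
uniform = [Fraction(1, 10)] * 10
prize = omega_places_prize(uniform)
print(prize, win_chance(uniform, prize))  # 1 1/10

# A hypothetical skewed mix does worse: Omega just moves the prize to a neglected box.
skewed = [Fraction(1, 2)] + [Fraction(1, 18)] * 9
prize = omega_places_prize(skewed)
print(prize, win_chance(skewed, prize))  # 2 1/18
```

Since the lowest probability across ten boxes can never exceed 1/10, the uniform mix is the best the linked agent can do, while any agent not linked to the simulation can simply take the box that `omega_places_prize(uniform)` returns.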


Some questions:

1. Have these or similar problems already been discovered by TDT (or UDT) theorists, and if so, is there a known solution? I had a search on Less Wrong but couldn't find anything obviously like them.

2. Is the analysis correct, or is there some subtle reason why a TDT (or UDT) agent would choose differently from described?

3. If a TDT agent believed (or had reason to believe) that Omega was going to present it with such problems, then wouldn't it want to self-modify to CDT? But this seems paradoxical, since the whole idea of a TDT agent is that it doesn't have to self-modify.

4. Might such problems show that there cannot be a single TDT algorithm (or family of provably-linked TDT algorithms) so that when Omega says it is simulating a TDT agent, it is quite ambiguous what it is doing? (This objection would go away if Omega revealed the source-code of its simulated agent, and the source-code of the choosing agent; each particular version of TDT would then be out-performed on a specific matching problem.)

5. Are these really "fair" problems? Is there some intelligible sense in which they are not fair, but Newcomb's problem is fair? It certainly looks like Omega may be "rewarding irrationality" (i.e. giving greater gains to someone who runs an inferior decision theory), but that's exactly the argument that CDT theorists use about Newcomb.

6. Finally, is it more likely that Omegas - or things like them - will present agents with Newcomb and Prisoner's Dilemma problems (on which TDT succeeds) rather than problematic problems (on which it fails)?

 

Edit: I tweaked the explanation of Box A's contents in Problem 1, since this was causing some confusion. The idea is that, as in the usual Newcomb problem, Box A always contains $1000. Note that Box B depends on what the simulated agent chooses; it doesn't depend on Omega predicting what the actual deciding agent chooses (so Omega doesn't put less money in any box just because it sees that the actual decider is running TDT).

Sneaky Strategies for TDT

8 drnickbone 25 May 2012 04:13PM

My previous article on problematic problems attracted quite a lot of feedback and comment. One of the questions it posed (2) was whether TDT should do something other than the initial analysis suggests.

I've had a couple of ideas on that, partly in response to the comments. I'm posting some follow-up thoughts in the hope that they might help clarify the original problems. Basically, it seems there are some sneaky things that TDT could try to do, but mostly they are not robust to slightly different variants of the problems. Interestingly, some of these variants look even "fairer", since they contain no mentions of TDT anywhere in the problem statement.

An alternative approach could be that TDT resolves to never let itself be out-performed by any other decision theory, because of evolutionary considerations as discussed here, even if that requires a large sacrifice of immediate utility (e.g. two-boxing and taking $1000 along with CDT, rather than one-boxing and taking $1 million, but with CDT getting $1,001,000). I don't currently know what to think about that, except that it makes my head spin; it also sounds like a rather Unfriendly form of AI.

 

Problem 1: Omega (who experience has shown is always truthful) presents the usual two boxes A and B and announces the following. "Before you entered the room, I ran a simulation of this problem as presented to an agent running TDT. I won't tell you what the agent decided, but I will tell you that if the agent two-boxed then I put nothing in Box B, whereas if the agent one-boxed then I put $1 million in Box B. Regardless of how the simulated agent decided, I put $1000 in Box A. Now please choose your box or boxes."

Consider the following sneaky strategy S, which TDT agents might try to adopt. 

"Use every piece of evidence at your disposal to try to decide whether you are the simulated agent. If you can prove you are not the simulated agent, then 2-box; otherwise 1-box".

We could imagine Omega making this strategy very easy for agents, by revealing the full source-code of the simulated agent "C-sim" as well as the full source code of the actual choosing agent "C-act". By "full source code" here, we have to include not just code describing how the agent executes TDT, but also its exact computational state on entry to the room (or simulated room), including a database of existing beliefs and proven theorems, preferences, values etc. To be fully precise, it should also describe any computational constraints that the agent faces, including time-out after t steps, or an upper bound on memory usage m, if they exist. The choosing agent then checks whether C-sim = C-act, and unless there is an exact match, it two-boxes. (If Omega is not kind enough to reveal C-act, a TDT agent may be able to generate it anyway via a quine-program construction.)
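As a very rough sketch of strategy S in the case where Omega reveals the code, assume the "full source code" (including computational state and resource bounds) can be serialised to a single string and compared for exact equality. The function name `strategy_s` and the use of `None` for "Omega did not reveal C-sim" are my own conventions.

```python
from typing import Optional

def strategy_s(c_act: str, c_sim: Optional[str]) -> str:
    """Two-box only when the chooser can prove it is not the simulated agent.

    c_act: full source code of the choosing agent (assumed serialisable to a string).
    c_sim: full source code of the simulated agent, or None if Omega withheld it.
    """
    if c_sim is not None and c_sim != c_act:
        return "two-box"  # proven distinct from the simulation
    return "one-box"      # exact match, or no proof of distinctness available

print(strategy_s("agent-v1", "agent-v1"))  # one-box
print(strategy_s("agent-v1", "agent-v2"))  # two-box
print(strategy_s("agent-v1", None))        # one-box
```

The third case is the one that matters for the difficulties below: when C-sim is withheld, no proof of distinctness is available and every flavour of TDT ends up one-boxing.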

There seem to be a few difficulties with strategy S:

1. Each particular C-act will still under-perform on a matching problem. Though, arguably, it is an "unfair" problem (specifically tailored by Omega to get that C-act to shoot itself in the foot).

2. It looks unattractive for different "flavours" of TDT agent to de-link themselves like this. Aren't we going to run into problems with the different flavours refusing to co-operate with each other on Prisoner's Dilemma, and only co-operating with exact clones? One answer is "We don't know until we perform a formal analysis". Another answer is "No, the different flavours of TDT are all linked at the strategy level, since they all follow strategy S; they are just de-linked at the execution level, since they do something different based on their different epistemic circumstances." That's slightly more convincing, but I still feel a need for analysis. One concern is if there are problems where some TDT agents/utilities benefit from a fine-grained de-linking of execution and others don't; what then is the best overall TDT strategy?

3. It doesn't solve all versions of the problem, since Omega doesn't have to reveal the exact C-sim code which it used. Omega could just say that it picked C-sim's code at random from a probability distribution across all full source-codes for TDT agents. (To model this as a formally-defined problem, we'd need to specify the distribution of course.) In such a case, no TDT agent can prove it is distinct from the simulation, so by strategy S, all TDT agents will one-box. And CDT will win against all flavours of TDT.

It is still arguable that the problem is "unfair" to TDT as a whole, but now suppose that Omega samples its C-sim from a probability distribution across multiple types of agent source-code, with TDT agents just part of the population. There is thus some probability p_t > 0 for the simulated agent being a TDT agent. If the difference in box values is big enough (roughly value_A / value_B < p_t; e.g. with the original values, 1000/1000000 = 1/1000 < p_t) then a TDT agent would still maximize expected winnings by 1-boxing. This doesn't seem particularly unfair to TDT, and yet CDT would still do better.
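The rough threshold value_A / value_B < p_t can be checked with a small expected-value sketch. One extra quantity is needed that the problem statement leaves open: the chance that a simulated non-TDT agent one-boxed, which I call `q_other` and treat as independent of the chooser's decision. That parameter, and the function name, are my assumptions.

```python
def expected_winnings(choice, p_t, q_other, value_a=1_000, value_b=1_000_000):
    """Expected winnings of a TDT chooser when the simulated agent is TDT with
    probability p_t (and then makes the same choice as the chooser), and otherwise
    one-boxed with probability q_other regardless of what the chooser does."""
    linked_b = value_b if choice == "one-box" else 0
    expected_b = p_t * linked_b + (1 - p_t) * q_other * value_b
    own_a = value_a if choice == "two-box" else 0
    return own_a + expected_b

def best_choice(p_t, q_other):
    """Whichever choice has the higher expected winnings."""
    return max(("one-box", "two-box"),
               key=lambda choice: expected_winnings(choice, p_t, q_other))

print(best_choice(p_t=0.01, q_other=0.5))    # p_t above 1/1000
print(best_choice(p_t=0.0005, q_other=0.5))  # p_t below 1/1000
```

The `q_other` term appears in both expectations and cancels, so the comparison reduces to p_t * value_B against value_A, matching the inequality in the text. An agent with no link to the simulation gains value_A by two-boxing whatever p_t is.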


An alternative strategy to S is what I'd informally call "TDT uber alles". It relies on long-range consequentialism, and perhaps "utility trading" as well (for TDT agents that don't inherently care about long-range consequences). A TDT agent might argue to itself "If TDT beats CDT - and other theories - at each and every available opportunity, and at least matches them where it can't beat them, then TDT will come to dominate the agent space as quickly as possible, which will maximize my expected utility. So I'm not going to let CDT beat me here: I'll two-box".

 

Problem 2: Our ever-reliable Omega now presents ten boxes, numbered from 1 to 10, and announces the following. "Exactly one of these boxes contains $1 million; the others contain nothing. You must take exactly one box to win the money; if you try to take more than one, then you won't be allowed to keep any winnings. Before you entered the room, I ran multiple simulations of this problem as presented to an agent running TDT, and determined the box which the agent was least likely to take. If there were several such boxes tied for equal-lowest probability, then I just selected one of them, the one labelled with the smallest number. I then placed $1 million in the selected box. Please choose your box."

My original analysis gave TDT no more than 10% chance of winning the $1 million. However, here's something a bit cleverer. Suppose each particular TDT agent has a favourite number in the set {1, 2} constructed as a balanced function over TDT full source-codes. (One way I imagine doing this is by taking a SHA-256 hash of the full source-code, extracting the first bit, and adding 1.)  Each agent chooses the box of their favourite number with probability 1. Since the simulated agent does so as well, by the rules of the game, the $1 million must be in one of boxes 1 or 2. And unless the full source-code of the simulated agent is known, it will not be possible for the choosing agent to tell whether Box 1 or 2 was picked by the sim, so that if the choosing agent picks their own favourite box, they have a 50% chance of winning. And CDT won't do any better.
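A sketch of the "favourite number" construction, using the SHA-256 idea from the parenthesis above. I'm assuming the full source-code is available as a string and taking "first bit" to mean the most significant bit of the first digest byte; the function names and the example source strings are hypothetical.

```python
import hashlib

def favourite_box(full_source: str) -> int:
    """Map an agent's full source-code to a favourite box in {1, 2}:
    the first bit of its SHA-256 digest, plus 1."""
    digest = hashlib.sha256(full_source.encode("utf-8")).digest()
    first_bit = digest[0] >> 7  # most significant bit of the first byte
    return first_bit + 1

def prize_box_for_deterministic_sim(sim_pick: int, n_boxes: int = 10) -> int:
    """If the simulated agent takes sim_pick with probability 1, every other box is
    tied at probability 0, so Omega uses the lowest-numbered of those."""
    return min(box for box in range(1, n_boxes + 1) if box != sim_pick)

def chooser_wins(c_act: str, c_sim: str) -> bool:
    """Does the choosing agent win by taking its own favourite box?"""
    prize = prize_box_for_deterministic_sim(favourite_box(c_sim))
    return favourite_box(c_act) == prize

# Hypothetical source strings, purely for illustration.
print(chooser_wins("tdt-agent-source-A", "tdt-agent-source-B"))
```

Under this rule the chooser wins exactly when its favourite number differs from the simulated agent's, so an exact clone of C-sim always loses; the 50% figure is an average over the balanced assignment of favourite numbers.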

Omega could prevent this approach by the information it reveals. If it reveals the full source-code for C-sim (and in its simulation, presents this same source-code C-sim to C-sim itself) then TDT shouldn't try to execute the "favourite number" strategy. A better strategy is to pick each of the ten boxes with equal probability if finding that C-act = C-sim; or if finding that C-act differs from C-sim, then pick Box 1 with certainty.

Or much as for Problem 1, Omega can vary the problem as follows:

"...Before you entered the room, I ran multiple simulations of this problem as presented to different randomly-selected TDT agents. I determined which box they were collectively least likely to take..." (Again this needs a distribution to be specified to become formally precise.)

There doesn't seem much that TDT agents can do about that, except to give a collective groan, and arrange that TDT collectively selects each of the ten boxes with equal probability. The simplest way to ensure that is for each TDT agent individually to select the boxes with equal probability (so each individual agent at least gets an equal chance at the prize). And any other agent just takes Box 1, laughing all the way to the bank. 

Consider a final variant as follows:

"...Before you entered the room, I ran multiple simulations of this problem as presented to different agents, sampled uniformly from different possible future universes according to their relative numbers, with the universes themselves sampled from my best projections of the future. I determined the box which the agents were least likely to take..."

If TDT uber alles is the future, then almost all the sampled agents will be TDT agents, so the problem is essentially as before. And now it doesn't look like Omega is being unfair at all (nothing discriminatory in the problem description). But TDT is still stuck, and can get beaten by CDT in the present.

One thought is that the TDT collective should vary the box probabilities very very slightly, so that Omega can tell which has the lowest probability, but regular CDT agents can't - in that case CDT also has at most a 10% chance of winning. Possibly, the computationally-advanced members of the collective toss a logical coin (which only they and Omega can compute) to decide which box to de-weight; the less advanced members - ones who actually have to compete against rival decision theories - just pick at random. If CDT tries to simulate TDT instances, it will detect equal probabilities, pick Box 1 and most likely get it wrong...


Edit 2: I've clarified the alternative to the "favourite number" strategy if Omega reveals C-sim in Problem 2. We can actually get a range of different problems and strategies by slight variations here. See the comment below from lackofcheese, and my replies.

Self-Indication Assumption - Still Doomed

2 drnickbone 28 January 2012 11:09PM

I recently posted a discussion article on the Doomsday Argument (DA) and Strong Self-Sampling Assumption. See http://lesswrong.com/lw/9im/doomsday_argument_with_strong_selfsampling/

This new post is related to another part of the literature concerning the Doomsday Argument - the Self-Indication Assumption or SIA. For those not familiar, the SIA says (roughly) that I would be more likely to exist if the world contains a large number of observers. So, when taking into account the evidence that I exist, this should shift my probability assessments towards models of the world with more observers.

Further, at first glance, it looks like the SIA shift can be arranged to exactly counteract the effect of the DA shift. Consider, for instance, these two hypotheses:

 

H1. Across all of space time, there is just one civilization of observers (humans) and a total of 200 billion observers.

H2. Across all of space time, there is just one civilization of observers (humans) and a total of 200 billion trillion observers. 

 

Suppose I had assigned a prior probability ratio p_r = P(H1)/P(H2) before considering either SIA or the DA. Then when I apply the SIA, this ratio will shrink by a factor of a trillion i.e. I've become much more confident in hypothesis H2. But then when I observe I'm roughly the 100 billionth human being, and apply the DA, the ratio expands back by exactly the same factor of a trillion, since this observation is much more likely under H1 than under H2. So my probability ratio returns to p_r. I should not make any predictions about "Doom Soon" unless I already believed them at the outset, for other reasons.
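The cancellation can be checked with exact arithmetic. This is only a sketch under the usual simplifications: SIA weights each hypothesis in proportion to its total number of observers, and the DA likelihood of having a given birth rank is 1/N when the rank is at most N. The function names are my own.

```python
from fractions import Fraction

N_H1 = 200 * 10**9              # total observers under H1
N_H2 = 200 * 10**9 * 10**12     # total observers under H2
MY_RANK = 100 * 10**9           # roughly the 100 billionth human


def sia_factor(n1: int, n2: int) -> Fraction:
    """Factor applied to P(H1)/P(H2) by SIA (weight proportional to observers)."""
    return Fraction(n1, n2)


def da_factor(n1: int, n2: int, rank: int) -> Fraction:
    """Factor applied to P(H1)/P(H2) by observing one's birth rank."""
    if rank > n1 or rank > n2:
        raise ValueError("rank must be possible under both hypotheses")
    likelihood_h1 = Fraction(1, n1)
    likelihood_h2 = Fraction(1, n2)
    return likelihood_h1 / likelihood_h2


if __name__ == "__main__":
    sia = sia_factor(N_H1, N_H2)          # 1 / 10^12
    da = da_factor(N_H1, N_H2, MY_RANK)   # 10^12
    print("SIA factor:", sia)
    print("DA factor:", da)
    print("Combined:", sia * da)          # 1, so the ratio returns to p_r
```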

Now I won't discuss here whether the SIA is justified or not; my main concern is whether it actually helps to counteract the Doomsday Argument. And it seems quite clear to me that it doesn't. If we choose to apply the SIA at all, then it will instead overwhelmingly favour a hypothesis like H3 below over either H1 or H2:

 

H3. Across all of space time, there are infinitely many civilizations of observers, and infinitely many observers in total.

 

In short, by applying the SIA we wipe out from consideration all the finite-world models, and then only have to look at the infinite ones (e.g. models with an infinite universe, or with infinitely many universes). But now, consider that H3 has two sub-models:

 

H3.1. Across all of space time, there are infinitely many civilizations of observers, but the mean number of observers per civilization (taking a suitable limit construction to define the mean) is 200 billion observers.

H3.2. Across all of space time, there are infinitely many civilizations of observers, but the mean number of observers per civilization (taking the same limit construction) is 200 billion trillion observers.

 

Notice that while SIA is indifferent between these sub-cases (since both contain the same number of observers), it seems clear that DA still greatly favours H3.1 over H3.2. Whatever our prior ratio r' = P(H3.1)/P(H3.2), DA raises that ratio by a trillion, and so the combination of SIA and DA also raises that ratio by a trillion. SIA doesn't stop the shift. 

 

Worse still, the conclusion of the DA has now become far *stronger*, since it seems that the only way for H3.1 to hold is if there is some form of "Universal Doom" scenario. Loosely, pretty much every one of those infinitely-many civilizations will have to terminate itself before managing to expand away from its home planet. 

Looked at more carefully, there is some probability p_e of a civilization expanding which is consistent with H3.1, but it has to be unimaginably tiny. If the population ratio of an expanded civilization to a non-expanded one is R_e, then H3.1 requires that p_e < 1/R_e. But values of R_e > 10^12 (a trillion) look right; indeed values of R_e > 10^24 (a trillion trillion) look plausible, which then forces p_e < 10^-12 and plausibly < 10^-24. The believer in the SIA has to be a really strong Doomer to get this to work!
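One way to see where a bound of roughly 1/R_e comes from is to write the mean population per civilization as a mixture of expanded and non-expanded civilizations. The sketch below rests on an assumption that is mine rather than anything stated above: that H3.1 "holds" as long as the mean stays within a factor of 2 of the non-expanded population. The function names and the `max_ratio` parameter are illustrative only.

```python
from fractions import Fraction


def mean_population_ratio(p_e: Fraction, r_e: int) -> Fraction:
    """Mean observers per civilization, in units of a non-expanded civilization.

    A fraction p_e of civilizations expand (population ratio r_e),
    and the rest stay at ratio 1.
    """
    return (1 - p_e) + p_e * r_e


def max_expansion_probability(r_e: int, max_ratio: int = 2) -> Fraction:
    """Largest p_e keeping the mean at or below max_ratio (assumed threshold).

    Solves (1 - p) + p * r_e = max_ratio for p.
    """
    return Fraction(max_ratio - 1, r_e - 1)


if __name__ == "__main__":
    for r_e in (10**12, 10**24):
        p_max = max_expansion_probability(r_e)
        print(f"R_e = {r_e:.0e}: p_e must be below about {float(p_max):.1e}")
```

With this threshold the bound comes out as 1/(R_e - 1), which is indistinguishable from 1/R_e at these magnitudes; a different threshold only changes the constant, not the conclusion that p_e has to be tiny.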

By contrast the standard DA doesn't have to be quite so doomerish. It can work with a rather higher probability p_e of expansion and avoiding doom, as long as the world is finite and the total number of actual civilizations is less than 1 / p_e.  As an example, consider:

H4. There are 1000 civilizations of observers in the world, and each has a probability of 1 in 10000 of expanding beyond its home planet. Conditional on a civilization not expanding, its expected number of observers is 200 billion. 

This hypothesis seems to be pretty consistent with our current observations (observing that we are the 100 billionth human being). It predicts that - with 90% probability - all observers will find themselves on the home planet of their civilization. Since this H4 prediction applies to all observers, we don't actually have to worry about whether we are a "random" observer or not; the prediction still holds. The hypothesis also predicts that, while the prospect of expansion will appear just about attainable for a civilization, it won't in fact happen.
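The 90% figure can be checked directly, assuming (as H4 seems to intend, though it doesn't say so explicitly) that the 1000 civilizations expand independently of each other. The function name is my own.

```python
def prob_no_civilization_expands(n_civilizations: int, p_expand: float) -> float:
    """Probability that none of n independent civilizations expands."""
    return (1.0 - p_expand) ** n_civilizations


if __name__ == "__main__":
    p = prob_no_civilization_expands(1000, 1 / 10000)
    print(f"P(no civilization expands) = {p:.3f}")  # approximately 0.905
```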

P.S. With a bit of re-scaling of the numbers, this post also works with observations or observer-moments, not just observers. See my previous post for more on this.

Doomsday Argument with Strong Self-Sampling Assumption

7 drnickbone 20 January 2012 11:50PM

Hello everyone; I'm new to the forum, and have been advised to post this in the "discussion" section. Hope this is OK.

I've found some references to discussions here on Brandon Carter / John Leslie's "Doomsday Argument" and they seemed well-informed. One thing I've noticed about the argument though (but haven't seen discussed before) is that it can be made much sharper by assuming that we are making random *observations*, rather than just that we are a random *observer*.  

For those who know the literature, this is a form of Nick Bostrom's Strong Self-Sampling Assumption as opposed to the (basic) Self-Sampling Assumption. Oddly enough, Bostrom discusses SSSA quite a lot in connection with the Doomsday Argument, but I can't see that he's done quite the analysis below. 

So here goes:

In the "random observer" model (the Self-Sampling Assumption with the widest reference class of "all observers"), we discover that we are in a human civilization and there have been ~100 billion observers before us in that civilization. We should then predict (crudely) that there will be ~100 billion observers coming after us in that civilization; also we should predict that a typical civilization of observers won't have much more than ~100-200 billion observers in total (otherwise we'd be in one of the much bigger ones, rather than in a smaller one). So typical civilizations don't expand beyond their planets of origin, and don't even last very long on their planets of origin.

Further, since there are currently ~150 million human births per year that would imply the end of the human race in ~700 years at current population size and birth-rates. Doom soon-ish but not very soon.

 

But what about the "random observation" model? One difference here is that a large portion of the ~100 billion humans living before us died very young (high infant mortality rate) so made very few observations. For instance, Carl Haub, who calculated the 100 billion number (see http://www.prb.org/Articles/2002/HowManyPeopleHaveEverLivedonEarth.aspx) reckons that for most of human history, life expectancy at birth has been little more than 10 years. By contrast, recent observers have had a life expectancy of 60+ years, so are making many more observations through their lives than average. This means that *observations* are much more concentrated in the present era than *observers*.

 

Working with Haub's population numbers, there have been about 1-2 trillion "person-years" of observations before our current observations (in January 2012). Also, that estimate is very stable even when we make quite different estimates about birth-rate. (The reason is that the overall population at different stages in history is easier to estimate than the overall birth-rate, so integrating population through time to give person-years is easier than integrating birth-rate through time to give births).

Under the "random observation" model, we would predict a similar number of person-years of observations to come in the future of our civilization. At a human population size of ~7 billion, that leaves only around (1,000 to 2,000 billion person-years) / (7 billion people), i.e. roughly 150-300 years, or ~200 years as a central figure, until human extinction: doom rather sooner. And if population climbs to 10 or 14 billion before flattening out (as some demographers predict) then doom even sooner still.
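For anyone who wants to play with the numbers, here is the back-of-envelope arithmetic for both models in one place. It is only a sketch: it uses the round figures quoted in this post (100 billion births, 150 million births per year, 1-2 trillion person-years) and assumes population and birth-rate stay flat until the end. The function name is my own.

```python
def years_until_doom(remaining: float, rate_per_year: float) -> float:
    """Years left if `remaining` units are used up at `rate_per_year`."""
    return remaining / rate_per_year


# "Random observer" model: ~100 billion births still to come,
# at ~150 million births per year.
observer_years = years_until_doom(100e9, 150e6)

# "Random observation" model: 1-2 trillion person-years still to come,
# at a constant population of 7, 10 or 14 billion.
observation_years = {
    population: (
        years_until_doom(1e12, population),
        years_until_doom(2e12, population),
    )
    for population in (7e9, 10e9, 14e9)
}

if __name__ == "__main__":
    print(f"Random observer model: ~{observer_years:.0f} years")
    for population, (low, high) in observation_years.items():
        print(
            f"Random observation model at {population / 1e9:.0f} billion: "
            f"~{low:.0f} to ~{high:.0f} years"
        )
```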

What's also quite striking is that over 20% of all observations *so far* have happened since 1900, and under a "doom soon" model the *majority* of all observations would happen in the period of multi-billion population sizes. So our current observations look very typical in this model.

 

Now I'm aware that Bostrom thinks the SSSA is a way out of the Doomsday Argument, since by relativizing the "reference class" (to something other than all observations, or all human observations) we get a less "doomish" prediction. All we can conclude is that the reference class we are part of (whatever that is) will terminate soon, whereas observers in general can carry on. I'm also aware of a number of criticisms of the whole SSA/SSSA approach.

On the other hand, it is quite striking that a very simple reference class (all observations), coupled to a very simple population model for observers (exponential growth -> short peak -> collapse) predicts more or less exactly what we are seeing now.
