Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

## Robust Cooperation in the Prisoner's Dilemma

61 07 June 2013 08:30AM

I'm proud to announce the preprint of Robust Cooperation in the Prisoner's Dilemma: Program Equilibrium via Provability Logic, a joint paper with Patrick LaVictoire (me), Mihaly Barasz, Paul Christiano, Benja Fallenstein, Marcello Herreshoff, and Eliezer Yudkowsky.

This paper was one of three projects to come out of the 2nd MIRI Workshop on Probability and Reflection in April 2013, and had its genesis in ideas about formalizations of decision theory that have appeared on LessWrong. (At the end of this post, I'll include links for further reading.)

Below, I'll briefly outline the problem we considered, the results we proved, and the (many) open questions that remain. Thanks in advance for your thoughts and suggestions!

## Background: Writing programs to play the PD with source code swap

(If you're not familiar with the Prisoner's Dilemma, see here.)

The paper concerns the following setup, which has come up in academic research on game theory: say that you have the chance to write a computer program X, which takes in one input and returns either Cooperate or Defect. This program will face off against some other computer program Y, but with a twist: X will receive the source code of Y as input, and Y will receive the source code of X as input. And you will be given your program's winnings, so you should think carefully about what sort of program you'd write!

Of course, you could simply write a program that defects regardless of its input; we call this program DefectBot, and call the program that cooperates on all inputs CooperateBot. But with the wealth of information afforded by the setup, you might wonder if there's some program that might be able to achieve mutual cooperation in situations where DefectBot achieves mutual defection, without thereby risking a sucker's payoff. (Douglas Hofstadter would call this a perfect opportunity for superrationality...)

## Previously known: CliqueBot and FairBot

And indeed, there's a way to do this that's been known since at least the 1980s. You can write a computer program that knows its own source code, compares it to the input, and returns C if and only if the two are identical (and D otherwise). Thus it achieves mutual cooperation in one important case where it intuitively ought to: when playing against itself! We call this program CliqueBot, since it cooperates only with the "clique" of agents identical to itself.
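As a rough illustration of the comparison CliqueBot performs, here is a minimal Python sketch. The `Program` record, the `run_match` helper, and the placeholder source strings are all assumptions made for this example. In particular, the harness hands each program its own source text, whereas a real entry would have to obtain it itself (for instance via a quine).

```python
from typing import Callable, NamedTuple


class Program(NamedTuple):
    """Hypothetical stand-in for a submitted program."""
    source: str                        # placeholder for the program's source text
    decide: Callable[[str, str], str]  # (own_source, opponent_source) -> "C" or "D"


def run_match(x: Program, y: Program) -> tuple:
    """Give each program the other's source and collect both moves."""
    return x.decide(x.source, y.source), y.decide(y.source, x.source)


def clique_decide(own_source: str, opponent_source: str) -> str:
    # Cooperate only with an exact textual copy of ourselves.
    return "C" if own_source == opponent_source else "D"


clique_bot = Program("cliquebot-v1", clique_decide)
# Same behavior, but the "source" differs by one trailing space.
clique_bot_variant = Program("cliquebot-v1 ", clique_decide)
defect_bot = Program("defectbot", lambda own, opp: "D")
cooperate_bot = Program("cooperatebot", lambda own, opp: "C")

print(run_match(clique_bot, clique_bot))          # ('C', 'C')
print(run_match(clique_bot, defect_bot))          # ('D', 'D')
print(run_match(clique_bot, cooperate_bot))       # ('D', 'C')
print(run_match(clique_bot, clique_bot_variant))  # ('D', 'D')
```

The last match shows how literal the equality test is: a copy that differs only in whitespace is treated as a stranger.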

There's one particularly irksome issue with CliqueBot, and that's the fragility of its cooperation. If two people write functionally analogous but syntactically different versions of it, those programs will defect against one another! This problem can be patched somewhat, but not fully fixed. Moreover, mutual cooperation might be the best strategy against some agents that are not even functionally identical, and extending this approach requires you to explicitly delineate the list of programs that you're willing to cooperate with. Is there a more flexible and robust kind of program you could write instead?

As it turns out, there is: in a 2010 post on LessWrong, cousin_it introduced an algorithm that we now call FairBot. Given the source code of Y, FairBot searches for a proof (of less than some large fixed length) that Y returns C when given the source code of FairBot, and then returns C if and only if it discovers such a proof (otherwise it returns D). Clearly, if our proof system is consistent, FairBot only cooperates when that cooperation will be mutual. But the really fascinating thing is what happens when you play two versions of FairBot against each other. Intuitively, it seems that either mutual cooperation or mutual defection would be stable outcomes, but it turns out that if their limits on proof lengths are sufficiently high, they will achieve mutual cooperation!

The proof that they mutually cooperate follows from a bounded version of Löb's Theorem from mathematical logic. (If you're not familiar with this result, you might enjoy Eliezer's Cartoon Guide to Löb's Theorem, which is a correct formal proof written in much more intuitive notation.) Essentially, the asymmetry comes from the fact that both programs are searching for the same outcome, so that a short proof that one of them cooperates leads to a short proof that the other cooperates, and vice versa. (The opposite is not true, because the formal system can't know it won't find a contradiction. This is a subtle but essential feature of mathematical logic!)

## Generalization: Modal Agents

Unfortunately, FairBot isn't what I'd consider an ideal program to write: it happily cooperates with CooperateBot, when it could do better by defecting. This is problematic because in real life, the world isn't separated into agents and non-agents, and any natural phenomenon that doesn't predict your actions can be thought of as a CooperateBot (or a DefectBot). You don't want your agent to be making concessions to rocks that happened not to fall on them. (There's an important caveat: some things have utility functions that you care about, but don't have sufficient ability to predicate their actions on yours. In that case, though, it wouldn't be a true Prisoner's Dilemma if your values actually prefer the outcome (C,C) to (D,C).)

However, FairBot belongs to a promising class of algorithms: those that decide on their action by looking for short proofs of logical statements that concern their opponent's actions. In fact, there's a really convenient mathematical structure that's analogous to the class of such algorithms: the modal logic of provability (known as GL, for Gödel-Löb).

So that's the subject of this preprint: what can we achieve in decision theory by considering agents defined by formulas of provability logic?

## Prisoner's Dilemma (with visible source code) Tournament

45 07 June 2013 08:30AM

After the iterated prisoner's dilemma tournament organized by prase two years ago, there was discussion of running tournaments for several variants, including one in which two players submit programs, each of which is given the source code of the other player's program and outputs either “cooperate” or “defect”. However, as far as I know, no such tournament has been run until now.

Here's how it's going to work: Each player will submit a file containing a single Scheme lambda-function. The function should take one input. Your program will play exactly one round against each other program submitted (not including itself). In each round, two programs will be run, each given the source code of the other as input, and will be expected to return either of the symbols “C” or “D” (for "cooperate" and "defect", respectively). The programs will receive points based on the following payoff matrix:

$\begin{array}{cccc} & C & D & other\\ C & (2,\,2) & (0,\,3) & (0,\,2)\\ D & (3,\,0) & (1,\,1) & (1,\,0)\\ other & (2,\,0) & (0,\,1) & (0,\,0) \end{array}$

“Other” includes any result other than returning “C” or “D”, including failing to terminate, throwing an exception, and even returning the string “Cooperate”. Notice that “Other” results in a worst-of-both-worlds scenario where you get the same payoff as you would have if you cooperated, but the other player gets the same payoff as if you had defected. This is an attempt to ensure that no one ever has incentive for their program to fail to run properly, or to trick another program into doing so.
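To make the scoring rule concrete, here is a small Python sketch of the payoff table above. The names `PAYOFFS`, `normalize`, and `score_round` are hypothetical, and this is not the organizer's actual scoring code (which would be run in Racket). `None` stands in for a program that timed out or raised an exception.

```python
# (row player's result, column player's result) -> (row payoff, column payoff)
PAYOFFS = {
    ("C", "C"): (2, 2), ("C", "D"): (0, 3), ("C", "other"): (0, 2),
    ("D", "C"): (3, 0), ("D", "D"): (1, 1), ("D", "other"): (1, 0),
    ("other", "C"): (2, 0), ("other", "D"): (0, 1), ("other", "other"): (0, 0),
}


def normalize(result):
    """Anything except exactly "C" or "D" counts as "other"."""
    return result if result in ("C", "D") else "other"


def score_round(row_result, column_result):
    """Return (row payoff, column payoff) for one round."""
    return PAYOFFS[(normalize(row_result), normalize(column_result))]


print(score_round("C", "C"))          # (2, 2)
print(score_round("Cooperate", "D"))  # (0, 1): the string "Cooperate" is "other"
print(score_round(None, "C"))         # (2, 0): a failed program against a cooperator
```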

Your score is the sum of the number of points you earn in each round. The player with the highest score wins the tournament. Edit: There is a 0.5 bitcoin prize being offered for the winner. Thanks, VincentYu!

Details:
• All submissions must be emailed to wardenPD@gmail.com by July 5, at noon PDT. Your email should also say how you would like to be identified when I announce the tournament results.
• Each program will be allowed to run for 10 seconds. If it has not returned either “C” or “D” by then, it will be stopped, and treated as returning “Other”. For consistency, I will have Scheme collect garbage right before each run.
• One submission per person or team. No person may contribute to more than one entry. Edit: This also means no copying from each other's source code. Describing the behavior of your program to others is okay.
• I will be running the submissions in Racket. You may be interested in how Racket handles time (especially the (current-milliseconds) function), threads (in particular, “thread”, “kill-thread”, “sleep”, and “thread-dead?”), and possibly randomness.
• Don't try to open the file you wrote your program in (or any other file, for that matter). I'll add code to the file before running it, so if you want your program to use a copy of your source code, you will need to use a quine. Edit: No I/O of any sort.
• Unless you tell me otherwise, I assume I have permission to publish your code after the contest.
• You are encouraged to discuss strategies for achieving mutual cooperation in the comments thread.
• I'm hoping to get as many entries as possible. If you know someone who might be interested in this, please tell them.
• It's possible that I've said something stupid that I'll have to change or clarify, so you might want to come back to this page again occasionally to look for changes to the rules. Any edits will be bolded, and I'll try not to change anything too drastically, or make any edits late in the contest.

Here is an example of a correct entry, which cooperates with you if and only if you would cooperate with a program that always cooperates (actually, if and only if you would cooperate with one particular program that always cooperates):

    (lambda (x)
      (if (eq? ((eval x) '(lambda (y) 'C)) 'C)
          'C
          'D))
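For readers who don't read Scheme, the following is a loose Python analogue of the entry above and of how a single round is played. It is only a sketch: the tournament itself runs Scheme in Racket, and `play_round`, `EXAMPLE_ENTRY`, `COOPERATE`, and `DEFECT` are names invented for this illustration. Programs are represented as source strings for one-argument lambdas and turned into functions with `eval`, which is only reasonable here because every string is written by us.

```python
# Each "program" is the source text of a one-argument function.
COOPERATE = 'lambda y: "C"'
DEFECT = 'lambda y: "D"'

# Analogue of the example entry: run the opponent on the source of a program
# that always cooperates, and cooperate only if the opponent returns "C".
EXAMPLE_ENTRY = '''lambda x: "C" if eval(x)('lambda y: "C"') == "C" else "D"'''


def play_round(source_a: str, source_b: str) -> tuple:
    """Run both programs, each receiving the other's source as input."""
    program_a = eval(source_a)
    program_b = eval(source_b)
    return program_a(source_b), program_b(source_a)


print(play_round(EXAMPLE_ENTRY, COOPERATE))      # ('C', 'C')
print(play_round(EXAMPLE_ENTRY, DEFECT))         # ('D', 'D')
print(play_round(EXAMPLE_ENTRY, EXAMPLE_ENTRY))  # ('C', 'C')
```

The entry cooperates with a copy of itself because that copy, when handed the always-cooperate program, returns "C".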

## Tiling Agents for Self-Modifying AI (OPFAI #2)

51 06 June 2013 08:24PM

An early draft of publication #2 in the Open Problems in Friendly AI series is now available:  Tiling Agents for Self-Modifying AI, and the Löbian Obstacle.  ~20,000 words, aimed at mathematicians or the highly mathematically literate.  The research reported on was conducted by Yudkowsky and Herreshoff, substantially refined at the November 2012 MIRI Workshop with Mihaly Barasz and Paul Christiano, and refined further at the April 2013 MIRI Workshop.

Abstract:

We model self-modification in AI by introducing 'tiling' agents whose decision systems will approve the construction of highly similar agents, creating a repeating pattern (including similarity of the offspring's goals).  Constructing a formalism in the most straightforward way produces a Gödelian difficulty, the Löbian obstacle.  By technical methods we demonstrate the possibility of avoiding this obstacle, but the underlying puzzles of rational coherence are thus only partially addressed.  We extend the formalism to partially unknown deterministic environments, and show a very crude extension to probabilistic environments and expected utility; but the problem of finding a fundamental decision criterion for self-modifying probabilistic agents remains open.

Commenting here is the preferred venue for discussion of the paper.  This is an early draft and has not been reviewed, so it may contain mathematical errors, and reporting of these will be much appreciated.

The overall agenda of the paper is to introduce the conceptual notion of a self-reproducing decision pattern which includes reproduction of the goal or utility function, by exposing a particular possible problem with a tiling logical decision pattern and coming up with some partial technical solutions.  This then makes it conceptually much clearer to point out the even deeper problems with "We can't yet describe a probabilistic way to do this because of non-monotonicity" and "We don't have a good bounded way to do this because maximization is impossible, satisficing is too weak and Schmidhuber's swapping criterion is underspecified."  The paper uses first-order logic (FOL) because FOL has a lot of useful standard machinery for reflection which we can then invoke; in real life, FOL is of course a poor representational fit to most real-world environments outside a human-constructed computer chip with thermodynamically expensive crisp variable states.

As further background, the idea that something-like-proof might be relevant to Friendly AI is not about achieving some chimera of absolute safety-feeling, but rather about the idea that the total probability of catastrophic failure should not have a significant conditionally independent component on each self-modification, and that self-modification will (at least in initial stages) take place within the highly deterministic environment of a computer chip.  This means that statistical testing methods (e.g. an evolutionary algorithm's evaluation of average fitness on a set of test problems) are not suitable for self-modifications which can potentially induce catastrophic failure (e.g. of parts of code that can affect the representation or interpretation of the goals).  Mathematical proofs have the property that they are as strong as their axioms and have no significant conditionally independent per-step failure probability if their axioms are semantically true, which suggests that something like mathematical reasoning may be appropriate for certain particular types of self-modification during some developmental stages.

Thus the content of the paper is very far off from how a realistic AI would work, but conversely, if you can't even answer the kinds of simple problems posed within the paper (both those we partially solve and those we only pose) then you must be very far off from being able to build a stable self-modifying AI.  Being able to say how to build a theoretical device that would play perfect chess given infinite computing power is very far off from the ability to build Deep Blue.  However, if you can't even say how to play perfect chess given infinite computing power, you are confused about the rules of chess or the structure of chess-playing computation in a way that would make it entirely hopeless for you to figure out how to build a bounded chess-player.  Thus "In real life we're always bounded" is no excuse for not being able to solve the much simpler unbounded form of the problem, and being able to describe the infinite chess-player would be substantial and useful conceptual progress compared to not being able to do that.  We can't be absolutely certain that an analogous situation holds between solving the challenges posed in the paper, and realistic self-modifying AIs with stable goal systems, but every line of investigation has to start somewhere.

Parts of the paper will be easier to understand if you've read Highly Advanced Epistemology 101 For Beginners including the parts on correspondence theories of truth (relevant to section 6) and model-theoretic semantics of logic (relevant to 3, 4, and 6), and there are footnotes intended to make the paper somewhat more accessible than usual, but the paper is still essentially aimed at mathematically sophisticated readers.

## Rationality Quotes June 2013

3 03 June 2013 03:08AM

Another month has passed and here is a new rationality quotes thread. The usual rules are:

• Please post all quotes separately, so that they can be upvoted or downvoted separately. (If they are strongly related, reply to your own comments. If strongly ordered, then go ahead and post them together.)
• Do not quote yourself.
• Do not quote from Less Wrong itself, Overcoming Bias, or HPMoR.
• No more than 5 quotes per person per monthly thread, please.

## Earning to Give vs. Altruistic Career Choice Revisited

30 02 June 2013 02:55AM

A commonly voiced sentiment in the effective altruist community is that the best way to do the most good is generally to make as much money as possible, with a view toward donating to the most cost-effective charities. This is often referred to as “earning to give.” In the article To save the world, don’t get a job at a charity; go work on Wall Street William MacAskill wrote:

Top undergraduates who want to “make a difference” are encouraged to forgo the allure of Wall Street and work in the charity sector ... while researching ethical career choice, I concluded that it’s in fact better to earn a lot of money and donate a good chunk of it to the most cost-effective charities, a path that I call “earning to give.” ... In general, the charitable sector is people-rich but money-poor. Adding another person to the labor pool just isn’t as valuable as providing more money, so that more workers can be hired.

In private correspondence, MacAskill clarified that he wasn’t arguing that “earning to give” is the best way to do good, only that it’s often better than working at a given nonprofit.  In a recent comment MacAskill wrote

I think there's too much emphasis on “earning to give” as the *best* option rather than as the *baseline* option

and raised a number of counter-considerations against “earning to give.” Despite this, the idea that “earning to give” is optimal has caught on in the effective altruist community, and so it’s important to discuss it.

Over the past three years, I myself have shifted from the position that earning to give is philanthropically optimal, to the position that it’s generally the case that one can do more good by choosing a career with high direct social value than by choosing a lucrative career with a view toward donating as much as possible.

In this post I’ll outline some arguments in favor of this view.

## Reductionism sequence now available in audio format

17 02 June 2013 02:55AM

The sequence "Reductionism", which includes the subsequences "Joy in the Merely Real" and "Zombies", is now available as a professionally read podcast.

Thanks to those who've been listening. Let us know how your experience has been thus far and what you think of the service by dropping an email to support@castify.co.

## The Centre for Applied Rationality: a year later from a (somewhat) outside perspective

39 27 May 2013 06:31PM

I recently had the privilege of being a CFAR alum volunteering at a later workshop, which is a fascinating thing to do, and which put me in a position both to evaluate how much of a difference the first workshop actually made in my life, and to see how the workshops themselves have evolved.

Exactly a year ago, I attended one of the first workshops, back when they were still inexplicably called “minicamps”. I wasn't sure what to expect, and I especially wasn't sure why I had been accepted. But I bravely bullied the nursing faculty staff until they reluctantly let me switch a day of clinical around, and later stumbled off my plane into the San Francisco airport in a haze of exhaustion. The workshop spat me out three days later, twice as exhausted, with teetering piles of ideas and very little time or energy to apply them. I left with a list of annual goals, which I had never bothered to have before, and a feeling that more was possible–this included the feeling that more would have been possible if the workshop had been longer and less chaotic, if I had slept more the week before, if I hadn't had to rush out on Sunday evening to catch a plane and miss the social.

Like I frequently do on Less Wrong the website, I left the minicamp feeling a bit like an outsider, but also a bit like I had come home. As well as my written goals, I made an unwritten pre-commitment to come back to San Francisco later, for longer, and see whether I could make the "more is possible" in my head more specific. Of my thirteen written goals on my list, I fully accomplished only four and partially accomplished five, but I did make it back to San Francisco, at the opportunity cost of four weeks of sacrificed hospital shifts.

A week or so into my stay, while I shifted around between different rationalist shared houses and attempted to max out interesting-conversations-per-day, I found out that CFAR was holding another May workshop. I offered to volunteer, proved my sincerity by spending 6 hours printing and sticking nametags, and lived on site for another 4-day weekend of delightful information overload and limited sleep.

Before the May 2012 workshop, I had a low prior that any four-day workshop could be life-changing in a major way. A four-year nursing degree, okay–I've successfully retrained my social skills and my ability to react under pressure by putting myself in particular situations over and over and over and over again. Four days? Nah. Brains don't work that way.

In my experience, it's exceedingly hard for the human brain to do anything deliberately. In Kahneman-speak, habits are System 1, effortless and automatic. Doing things on purpose involves System 2, effortful and a bit aversive. I could have had a much better experience in my final intensive care clinical if I'd thought to open up my workshop notes and tried to address the causes of aversions, or use offline time to train habits, or, y'know, do anything on purpose instead of floundering around trying things at random until they worked.

(Then again, I didn't apply concepts like System 1 and System 2 to myself a year ago. I read 'Thinking Fast and Slow' by Kahneman and 'Rationality and the Reflective Mind' by Stanovich as part of my minicamp goal 'read 12 hard nonfiction books this year', most of which came from the CFAR recommended reading list. If my preceptor had had any idea what I was saying when I explained to her that she was running particular nursing skills on System 1, because they were ingrained on the level of habit, and I was running the same tasks on System 2 in working memory because they were new and confusing to me, and that was why I appeared to have poor time management, because System 2 takes forever to do anything, this terminology might have helped. Oh, for the world where everyone knows all jargon!)

...And here I am, setting aside a month of my life to think only about rationality. I can't imagine that my counterfactual self-who-didn't-attend-in-May-2012 would be here. I can't imagine that being here now will have zero effect on what I'm doing in a year, or ten years. Bingo. I did one thing deliberately!

So what was the May 2013 workshop actually like?

The curriculum has shifted around a lot in the past year, and I think with 95% probability that it's now more concretely useful. (Speaking of probabilities, the prediction markets during the workshop seemed to flow better and be more fun and interesting this time, although this may just show that I was more averse to games in general and betting in particular. In that case, yay for partly-cured aversions!)

The classes are grouped in an order that allows them to build on each other usefully, and they've been honed by practice into forms that successfully teach skills, instead of just putting words in the air and on flipcharts. For example, having a personal productivity system like GTD came across as a culturally prestigious thing at the last workshop, but there wasn't a lot of useful curriculum on it. Of course, I left on this trip wanting to spend my offline month creating a GTD system better than paper to-do lists taped to walls, so I have both motivation and a low threshold for improvement.

There are also some completely new classes, including "Againstness training" by Valentine, which seems to relate to some of the 'reacting under pressure' stuff in interesting ways, and gave me vocabulary and techniques for something I've been doing inefficiently by trial and error for a good part of my life.

In general, there are more classes about emotions, both how to deal with them when they're in the way and how to use them when they're the best tool available. Given that none of us are Spock, I think this is useful.

Rejection therapy has morphed into a less terrifying and more helpful form with the awesome name of CoZE (Comfort Zone Expansion). I didn't personally find the original rejection therapy all that awful, but some people did, and that problem is largely solved.

The workshops are vastly more orderly and organized. (I like to think I contributed to this slightly with my volunteer skills of keeping the fridge stocked with water bottles and calling restaurants to confirm orders and make sure food arrived on time.) Classes began and ended on time. The venue stayed tidy. The food was excellent. It was easier to get enough sleep. Etc. The May 2012 venue had a pool, and this one didn't, which made exercise harder for addicts like me. CFAR staff are talking about solving this.

The workshops still aren't an easy environment for introverts. The negative parts of my experience in May 2012 were mostly because of this. It was easier this time, because as a volunteer I could skip classes if I started to feel socially overloaded, but periods of quiet alone time had to be effortfully carved out of the day, and at an opportunity cost of missing interesting conversations. I'm not sure if this problem is solvable without either making the workshops longer, in order to space the material out, and thus less accessible for people with jobs, or cutting out curriculum. Either would impose a cost on the extroverts who don't want an hour at lunch to meditate or go running alone or read a sci-fi book, etc.

In general, I found the May 2012 workshop too short and intense–we had material thrown at us at a rate far exceeding the usual human idea-digestion rate. Keeping in touch via Skype chats with other participants helped. CFAR now does official followups with participants for six weeks following the workshop.

Meeting the other participants was, as usual, the best part of the weekend. The group was quite diverse, although I was still the only health care professional there. (Whyyy???? The health care system needs more rationality so badly!) The conversations were engaging. Many of the participants seem eager to stay in touch. The May 2012 workshop has a total of six people still on the Skype chats list, which is a 75% attrition rate. CFAR is now working on strategies to help people who want to stay in touch do it successfully.

Conclusions?

I thought the May 2012 workshop was awesome. I thought the May 2013 workshop was about an order of magnitude more awesome. I would say that now is a great time to attend a CFAR workshop...except that the organization is financially stable and likely to still be around in a year and producing even better workshops. So I'm not sure. Then again, rationality skills have compound interest–the value of learning some new skills now, even if they amount more to vocab words and mental labels than superpowers, compounds over the year that you spend seeing all the books you read and all the opportunities you have in that framework. I'm glad I went a year ago instead of this May. I'm even more glad I had the opportunity to see the new classes and meet the new participants a year later.

## Robustness of Cost-Effectiveness Estimates and Philanthropy

36 24 May 2013 08:28PM

Note: I formerly worked as a research analyst at GiveWell. This post describes the evolution of my thinking about robustness of cost-effectiveness estimates in philanthropy. All views expressed here are my own.

Up until 2012, I believed that detailed explicit cost-effectiveness estimates are very important in the context of philanthropy. My position was reflected in a comment that I made in 2011:

The problem with using unquantified heuristics and intuitions is that the “true” expected values of philanthropic efforts plausibly differ by many orders of magnitude, and unquantified heuristics and intuitions are frequently insensitive to this. The last order of magnitude is the only one that matters; all others are negligible by comparison. So if at all possible, one should do one’s best to pin down the philanthropic efforts with the “true” expected value per dollar of the highest (positive) order of magnitude. It seems to me as though any feasible strategy for attacking this problem involves explicit computation.

During my time at GiveWell, my position on this matter shifted. I still believe that there are instances in which rough cost-effectiveness estimates can be useful for determining good philanthropic foci. But I’ve shifted toward the position that effective altruists should spend much more time on qualitative analysis than on quantitative analysis in determining how they can maximize their positive social impact.

In this post I’ll focus on one reason for my shift: explicit cost-effectiveness estimates are generally much less robust than I had previously thought.

## Post ridiculous munchkin ideas!

47 15 May 2013 10:27PM

A Munchkin is the sort of person who, faced with a role-playing game, reads through the rulebooks over and over until he finds a way to combine three innocuous-seeming magical items into a cycle of infinite wish spells.  Or who, in real life, composes a surprisingly effective diet out of drinking a quarter-cup of extra-light olive oil at least one hour before and after tasting anything else.  Or combines liquid nitrogen and antifreeze and life-insurance policies into a ridiculously cheap method of defeating the invincible specter of unavoidable Death.  Or figures out how to build the real-life version of the cycle of infinite wish spells.

It seems that many here might have outlandish ideas for ways of improving our lives. For instance, a recent post advocated installing really bright lights as a way to boost alertness and productivity. We should not adopt such hacks into our dogma until we're pretty sure they work; however, one way of knowing whether a crazy idea works is to try implementing it, and you may have more ideas than you're planning to implement.

So: please post all such lifehack ideas! Even if you haven't tried them, even if they seem unlikely to work. Post them separately, unless some other way would be more appropriate. If you've tried some idea and it hasn't worked, it would be useful to post that too.

## Pascal's Muggle: Infinitesimal Priors and Strong Evidence

39 08 May 2013 12:43AM

Short form:  Pascal's Muggle

tl;dr:  If you assign superexponentially infinitesimal probability to claims of large impacts, then apparently you should ignore the possibility of a large impact even after seeing huge amounts of evidence.  If a poorly-dressed street person offers to save $10^{10^{100}}$ lives (googolplex lives) for \$5 using their Matrix Lord powers, and you claim to assign this scenario less than $10^{-10^{100}}$ probability, then apparently you should continue to believe absolutely that their offer is bogus even after they snap their fingers and cause a giant silhouette of themselves to appear in the sky.  For the same reason, any evidence you encounter showing that the human species could create a sufficiently large number of descendants - no matter how normal the corresponding laws of physics appear to be, or how well-designed the experiments which told you about them - must be rejected out of hand.  There is a possible reply to this objection using Robin Hanson's anthropic adjustment against the probability of large impacts, and in this case you will treat a Pascal's Mugger as having decision-theoretic importance exactly proportional to the Bayesian strength of evidence they present you, without quantitative dependence on the number of lives they claim to save.  This however corresponds to an odd mental state which some, such as myself, would find unsatisfactory.  In the end, however, I cannot see any better candidate for a prior than having a leverage penalty plus a complexity penalty on the prior probability of scenarios.

In late 2007 I coined the term "Pascal's Mugging" to describe a problem which seemed to me to arise when combining conventional decision theory and conventional epistemology in the obvious way.  On conventional epistemology, the prior probability of hypotheses diminishes exponentially with their complexity; if it would take 20 bits to specify a hypothesis, then its prior probability receives a $2^{-20}$ penalty factor and it will require evidence with a likelihood ratio of 1,048,576:1 - evidence which we are 1,048,576 times more likely to see if the theory is true, than if it is false - to make us assign it around 50-50 credibility.  (This isn't as hard as it sounds.  Flip a coin 20 times and note down the exact sequence of heads and tails.  You now believe in a state of affairs you would have assigned a million-to-one probability beforehand - namely, that the coin would produce the exact sequence HTHHHHTHTTH... or whatever - after experiencing sensory data which are more than a million times more probable if that fact is true than if it is false.)  The problem is that although this kind of prior probability penalty may seem very strict at first, it's easy to construct physical scenarios that grow in size vastly faster than they grow in complexity.
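The 20-bit arithmetic can be checked directly. The sketch below is only an illustration; `posterior_probability` is a hypothetical helper. It takes the prior probability to be exactly $2^{-20}$, applies a likelihood ratio of $2^{20}$ in odds form, and uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction


def posterior_probability(prior: Fraction, likelihood_ratio: int) -> Fraction:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


bits = 20
prior = Fraction(1, 2**bits)   # complexity penalty of 2^-20
likelihood_ratio = 2**bits     # 1,048,576 : 1

posterior = posterior_probability(prior, likelihood_ratio)
print(likelihood_ratio)  # 1048576
print(posterior)         # 1048576/2097151
print(float(posterior))  # just over 0.5
```

The posterior comes out a hair above one half, which matches the "around 50-50" in the text: evidence at $2^{20}$:1 almost exactly cancels a $2^{-20}$ prior.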

I originally illustrated this using Pascal's Mugger:  A poorly dressed street person says "I'm actually a Matrix Lord running this world as a computer simulation, along with many others - the universe above this one has laws of physics which allow me easy access to vast amounts of computing power.  Just for fun, I'll make you an offer - you give me five dollars, and I'll use my Matrix Lord powers to save 3↑↑↑↑3 people inside my simulations from dying and let them live long and happy lives" where ↑ is Knuth's up-arrow notation.  This was originally posted in 2007, when I was a bit more naive about what kind of mathematical notation you can throw into a random blog post without creating a stumbling block.  (E.g.:  On several occasions now, I've seen someone on the Internet approximate the number of dust specks from this scenario as being a "billion", since any incomprehensibly large number equals a billion.)  Let's try an easier (and way smaller) number instead, and suppose that Pascal's Mugger offers to save a googolplex lives, where a googol is $10^{100}$ (a 1 followed by a hundred zeroes) and a googolplex is 10 to the googol power, so $10^{10^{100}}$ or $10^{10{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000{,}000}$ lives saved if you pay Pascal's Mugger five dollars, if the offer is honest.