SUDT: A toy decision theory for updateless anthropics

15 Benja 23 February 2014 11:50PM

The best approach I know for thinking about anthropic problems is Wei Dai's Updateless Decision Theory (UDT). We aren't yet able to solve all problems that we'd like to—for example, when it comes to game theory, the only games we have any idea how to solve are very symmetric ones—but for many anthropic problems, UDT gives the obviously correct solution. However, UDT is somewhat underspecified, and cousin_it's concrete models of UDT based on formal logic are rather heavyweight if all you want is to figure out the solution to a simple anthropic problem.

In this post, I introduce a toy decision theory, Simple Updateless Decision Theory or SUDT, which is most definitely not a replacement for UDT but makes it easy to formally model and solve the kind of anthropic problems that we usually apply UDT to. (And, of course, it gives the same solutions as UDT.) I'll illustrate this with a few examples.

This post is a bit boring, because all it does is take a bit of math that we already implicitly use all the time when we apply updateless reasoning to anthropic problems, and spell it out in excruciating detail. If you're already well-versed in that sort of thing, you're not going to learn much from this post. The reason I'm posting it anyway is that there are things I want to say about updateless anthropics, with a bit of simple math here and there, and while the math may be intuitive, the best things I can point to in terms of details are the posts on UDT, which contain lots of irrelevant complications. So the main purpose of this post is to save people from having to reverse-engineer the simple math of SUDT from the more complex / less well-specified math of UDT.

(I'll also argue that Psy-Kosh's non-anthropic problem is a type of counterfactual mugging, I'll use the concept of l-zombies to explain why UDT's response to this problem is correct, and I'll explain why this argument still works if there aren't any l-zombies.)

*

I'll introduce SUDT by way of a first example: the counterfactual mugging. In my preferred version, Omega appears to you and tells you that it has thrown a very biased coin, which had only a 1/1000 chance of landing heads; however, in this case, the coin has in fact fallen heads, which is why Omega is talking to you. It asks you to choose between two options, (H) and (T). If you choose (H), Omega will create a Friendly AI; if you choose (T), it will destroy the world. However, there is a catch: Before throwing the coin, Omega made a prediction about which of these options you would choose if the coin came up heads (and it was able to make a highly confident prediction). If the coin had come up tails, Omega would have destroyed the world if it had predicted that you'd choose (H), and it would have created a Friendly AI if it had predicted (T). (Incidentally, if it hadn't been able to make a confident prediction, it would just have destroyed the world outright.)

|                                    | Coin falls heads (chance = 1/1000) | Coin falls tails (chance = 999/1000) |
| You choose (H) if coin falls heads | Positive intelligence explosion    | Humanity wiped out                   |
| You choose (T) if coin falls heads | Humanity wiped out                 | Positive intelligence explosion      |

In this example, we are considering two possible worlds: $H$ and $T$. We write $\Omega$ (no pun intended) for the set of all possible worlds; thus, in this case, $\Omega = \{H, T\}$. We also have a probability distribution over $\Omega$, which we call $\mathbb{P}$. In our example, $\mathbb{P}(H) = 1/1000$ and $\mathbb{P}(T) = 999/1000$.

In the counterfactual mugging, there is only one situation you might find yourself in in which you need to make a decision, namely when Omega tells you that the coin has fallen heads. In general, we write $\mathcal{I}$ for the set of all possible situations in which you might need to make a decision; the $\mathcal{I}$ stands for the information available to you, including both sensory input and your memories. In our case, we'll write $\mathcal{I} = \{\ast\}$, where $\ast$ is the single situation where you need to make a decision.

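For every $i \in \mathcal{I}$, we write $\mathcal{A}(i)$ for the set of possible actions you can take if you find yourself in situation $i$. In our case, $\mathcal{A}(\ast) = \{H, T\}$. A policy (or "plan") is a function $\pi$ that associates to every situation $i \in \mathcal{I}$ an action $\pi(i) \in \mathcal{A}(i)$ to take in this situation. We write $\Pi$ for the set of all policies. In our case, $\Pi = \{\pi_H, \pi_T\}$, where $\pi_H(\ast) = H$ and $\pi_T(\ast) = T$.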

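Next, there is a set of outcomes, $\mathcal{O}$, which specify all the features of what happens in the world that make a difference to our final goals, and the outcome function $o : \Omega \times \Pi \to \mathcal{O}$, which for every possible world $\omega \in \Omega$ and every policy $\pi \in \Pi$ specifies the outcome $o(\omega, \pi)$ that results from executing $\pi$ in the world $\omega$. In our case, $\mathcal{O} = \{F, D\}$ (standing for FAI and DOOM), and $o(H, \pi_H) = o(T, \pi_T) = F$ and $o(H, \pi_T) = o(T, \pi_H) = D$.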

Finally, we have a utility function $u : \mathcal{O} \to \mathbb{R}$. In our case, $u(F) = 1$ and $u(D) = 0$. (The exact numbers don't really matter, as long as $u(F) > u(D)$, because utility functions don't change their meaning under affine transformations, i.e. when you add a constant to all utilities or multiply all utilities by a positive number.)

Thus, an SUDT decision problem consists of the following ingredients: The sets $\Omega$, $\mathcal{I}$ and $\mathcal{O}$ of possible worlds, situations you need to make a decision in, and outcomes; for every $i \in \mathcal{I}$, the set $\mathcal{A}(i)$ of possible actions in that situation; the probability distribution $\mathbb{P}$; and the outcome and utility functions $o$ and $u$. SUDT then says that you should choose a policy $\pi \in \Pi$ that maximizes the expected utility $\mathbb{E}[u(o(\omega, \pi))]$, where $\mathbb{E}$ is the expectation with respect to $\mathbb{P}$, and $\omega$ is the true world.
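For a finite set of possible worlds, the decision rule just described can be written out as a plain sum. The symbols here are notational assumptions rather than anything fixed by the post: $\Omega$ for the set of worlds, $\Pi$ for the set of policies, $\mathbb{P}$ for the prior, $o$ for the outcome function, $u$ for the utility function, and $\pi^*$ for a recommended policy.

```latex
\pi^* \in \operatorname*{arg\,max}_{\pi \in \Pi} \; \mathbb{E}\bigl[u(o(\omega, \pi))\bigr]
      = \operatorname*{arg\,max}_{\pi \in \Pi} \; \sum_{\omega \in \Omega} \mathbb{P}(\omega)\, u\bigl(o(\omega, \pi)\bigr)
```

Note that the sum runs over all worlds with their prior weights; nothing is conditioned on the situation you find yourself in, which is the "updateless" part.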

In our case, $\mathbb{E}[u(o(\omega, \pi))]$ is just the probability of the good outcome $F$, according to the (prior) distribution $\mathbb{P}$. For $\pi_H$, that probability is 1/1000; for $\pi_T$, it is 999/1000. Thus, SUDT (like UDT) recommends choosing (T).
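The calculation above is small enough to check mechanically. What follows is a minimal sketch of a brute-force SUDT solver for finite problems, applied to the counterfactual mugging. The function name `sudt_solve`, the situation label `"asked"`, and the outcome labels `"FAI"` and `"DOOM"` are all hypothetical names chosen for illustration; utilities of 1 and 0 are assumed for the good and bad outcome.

```python
from fractions import Fraction
from itertools import product


def sudt_solve(prior, actions, outcome, utility):
    """Enumerate all policies of a finite SUDT problem and score them.

    prior:   dict mapping each world to its prior probability
    actions: dict mapping each situation to the list of actions available there
    outcome: function (world, policy) -> outcome, where policy is a dict
             mapping each situation to the chosen action
    utility: dict mapping each outcome to a number
    Returns (best_policy, scores); scores is keyed by the tuple of actions,
    in the same order as the situations in `actions`.
    """
    situations = list(actions)
    scores = {}
    for choice in product(*(actions[i] for i in situations)):
        policy = dict(zip(situations, choice))
        # Prior expected utility: no updating on the situation you are in.
        scores[choice] = sum(p * utility[outcome(w, policy)] for w, p in prior.items())
    best = max(scores, key=scores.get)
    return dict(zip(situations, best)), scores


# Counterfactual mugging (labels are illustrative assumptions).
prior = {"H": Fraction(1, 1000), "T": Fraction(999, 1000)}
actions = {"asked": ["H", "T"]}
utility = {"FAI": 1, "DOOM": 0}


def outcome(world, policy):
    # Good outcome exactly when the chosen option matches the coin.
    return "FAI" if policy["asked"] == world else "DOOM"


best_policy, scores = sudt_solve(prior, actions, outcome, utility)
print(best_policy)  # {'asked': 'T'}
```

Exact fractions are used instead of floats so that the scores come out as 1/1000 and 999/1000 without rounding.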

If you set up the problem in SUDT like that, it's kind of hidden why you could possibly think that's not the right thing to do, since we aren't distinguishing situations that are "actually experienced" in a particular possible world $\omega$; there's nothing in the formalism that reflects the fact that Omega never asks us for our choice if the coin comes up tails. In my post on l-zombies, I've argued that this makes sense because even if there's no version of you that actually consciously experiences being in the heads world, this version still exists as a Turing machine and the choices that it makes influence what happens in the real world. If all mathematically possible experiences exist, so that there aren't any l-zombies, but some experiences are "experienced more" (have more "magical reality fluid") than others, the argument is even clearer: even if there's some anthropic sense in which, upon being told that the coin fell heads, you can conclude that you should assign a high probability of being in the heads world, the same version of you still exists in the tails world, and its choices influence what happens there. And if everything is experienced to the same degree (no magical reality fluid), the argument is clearer still.

*

From Vladimir Nesov's counterfactual mugging, let's move on to what I'd like to call Psy-Kosh's probably counterfactual mugging, better known as Psy-Kosh's non-anthropic problem. This time, you're not alone: Omega gathers you together with 999,999 other advanced rationalists, all well-versed in anthropic reasoning and SUDT. It places each of you in a separate room. Then, as before, it throws a very biased coin, which has only a 1/1000 chance of landing heads. If the coin does land heads, then Omega asks all of you to choose between two options, (H) and (T). If the coin falls tails, on the other hand, Omega chooses one of you at random and asks that person to choose between (H) and (T). If the coin lands heads and you all choose (H), Omega will create a Friendly AI; same if the coin lands tails, and the person who's asked chooses (T); else, Omega will destroy the world.

|                                     | Coin falls heads (chance = 1/1000) | Coin falls tails (chance = 999/1000) |
| Everyone chooses (H) if asked       | Positive intelligence explosion    | Humanity wiped out                   |
| Everyone chooses (T) if asked       | Humanity wiped out                 | Positive intelligence explosion      |
| Different people choose differently | Humanity wiped out                 | (Depends on who is asked)            |

We'll assume that all of you prefer a positive FOOM over a gloomy DOOM, which means that all of you have the same values as far as the outcomes of this little dilemma are concerned: $\mathcal{O} = \{F, D\}$, as before, and all of you have the same utility function, given by $u(F) = 1$ and $u(D) = 0$. As long as that's the case, we can apply SUDT to find a sensible policy for everybody to follow (though when there is more than one optimal policy, and the different people involved can't talk to each other, it may not be clear how one of the policies should be chosen).

This time, we have a million different people, who can in principle each make an independent decision about what to answer if Omega asks them the question. Thus, we have $\mathcal{I} = \{1, \dots, 1{,}000{,}000\}$. Each of these people can choose between (H) and (T), so $\mathcal{A}(i) = \{H, T\}$ for every person $i \in \mathcal{I}$, and a policy $\pi$ is a function that returns either (H) or (T) for every $i \in \mathcal{I}$. Obviously, we're particularly interested in the policies $\pi_H$ and $\pi_T$ satisfying $\pi_H(i) = H$ and $\pi_T(i) = T$ for all $i \in \mathcal{I}$.

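The possible worlds are $\Omega = \{H, T_1, \dots, T_{1{,}000{,}000}\}$, where $T_i$ is the world in which the coin falls tails and person $i$ is asked, and their probabilities are $\mathbb{P}(H) = 1/1000$ and $\mathbb{P}(T_i) = \frac{999}{1000} \cdot \frac{1}{1{,}000{,}000}$. The outcome function is as follows: $o(H, \pi_H) = F$, $o(H, \pi) = D$ for $\pi \neq \pi_H$, $o(T_i, \pi) = F$ if $\pi(i) = T$, and $o(T_i, \pi) = D$ otherwise.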

What does SUDT recommend? As in the counterfactual mugging, $\mathbb{E}[u(o(\omega, \pi))]$ is the probability of the good outcome $F$, under policy $\pi$. For $\pi = \pi_H$, the good outcome can only happen if the coin falls heads: in other words, with probability $1/1000$. If $\pi \neq \pi_H$, then the good outcome can not happen if the coin falls heads, because in that case everybody gets asked, and at least one person chooses (T). Thus, in this case, the good outcome will happen only if the coin comes up tails and the randomly chosen person answers (T); this probability is $\frac{999}{1000} \cdot \frac{n}{1{,}000{,}000}$, where $n$ is the number of people answering (T). Clearly, this is maximized for $\pi = \pi_T$, where $n = 1{,}000{,}000$; moreover, in this case we get the probability $999/1000$, which is better than $1/1000$ for $\pi_H$, so SUDT recommends the plan $\pi_T$.
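As a sanity check on this argument, here is a small sketch that scores policies for the probably counterfactual mugging. Enumerating all policies for a million people is infeasible, so the sketch brute-forces a group of four (an arbitrary size chosen for illustration) and then uses the observation from the text that only the number of (T)-answerers matters for the full-size problem. The names `p_good` and `p_good_by_count` are hypothetical.

```python
from fractions import Fraction
from itertools import product

P_HEADS = Fraction(1, 1000)
P_TAILS = 1 - P_HEADS


def p_good(policy):
    """Prior probability of the good outcome.

    policy is a tuple with one entry per person, each 'H' or 'T'.
    """
    n_people = len(policy)
    # Heads: everyone is asked, and all of them must answer (H).
    heads_part = P_HEADS if all(a == "H" for a in policy) else 0
    # Tails: one person is picked uniformly and must answer (T).
    n_t = sum(a == "T" for a in policy)
    tails_part = P_TAILS * Fraction(n_t, n_people)
    return heads_part + tails_part


# Brute force over every policy for a small group (size is an assumption).
N_SMALL = 4
best_small = max(product("HT", repeat=N_SMALL), key=p_good)
print(best_small, p_good(best_small))

# Full-size problem: the probability depends only on n, the number answering (T).
N = 1_000_000


def p_good_by_count(n, total=N):
    return P_HEADS if n == 0 else P_TAILS * Fraction(n, total)


print(p_good_by_count(0), p_good_by_count(N))
```

In the small group, the all-(T) policy is the unique maximizer, and the count-based formula gives 1/1000 for everyone answering (H) and 999/1000 for everyone answering (T) in the full-size problem.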

Again, when you set up the problem in SUDT, it's not even obvious why anyone might think this wasn't the correct answer. The reason is that if Omega asks you, and you update on the fact that you've been asked, then after updating, you are quite certain that the coin has landed heads: yes, your prior probability was only 1/1000, but if the coin has landed tails, the chance that you would be asked was only one in a million, so the posterior odds are about 1000:1 in favor of heads. So, you might reason, it would be best if everybody chose (H); and moreover, all the people in the other rooms will reason the same way as you, so if you choose (H), they will as well, and this maximizes the probability that humanity survives. This relies on the fact that the others will choose the same way as you, but since you're all good rationalists using the same decision theory, that's going to be the case.
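The "about 1000:1" figure comes from a one-line application of Bayes' rule in odds form, using only the numbers stated in the problem (a 1/1000 prior for heads, certainty of being asked under heads, and a one-in-a-million chance of being asked under tails):

```latex
\frac{P(\text{heads} \mid \text{asked})}{P(\text{tails} \mid \text{asked})}
  = \frac{P(\text{heads}) \, P(\text{asked} \mid \text{heads})}{P(\text{tails}) \, P(\text{asked} \mid \text{tails})}
  = \frac{\frac{1}{1000} \cdot 1}{\frac{999}{1000} \cdot \frac{1}{1{,}000{,}000}}
  = \frac{1{,}000{,}000}{999}
  \approx 1001
```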

But in the worlds where the coin comes up tails, and Omega chooses someone other than you, the version of you that gets asked for its decision still "exists"... as an l-zombie. You might think that what this version of you does or doesn't do doesn't influence what happens in the real world; but if we accept the argument from the previous paragraph that your decisions are "linked" to those of the other people in the experiment, then they're still linked if the version of you making the decision is an l-zombie: If we see you as a Turing machine making a decision, that Turing machine should reason, "If the coin came up tails and someone else was chosen, then I'm an l-zombie, but the person who is actually chosen will reason exactly the same way I'm doing now, and will come to the same decision; hence, my decision influences what happens in the real world even in this case, and I can't do an update and just ignore those possible worlds."

I call this the "probably counterfactual mugging" because in the counterfactual mugging, you are making your choice because of its benefits in a possible world that is ruled out by your observations, while in the probably counterfactual mugging, you're making it because of its benefits in a set of possible worlds that is made very improbable by your observations (because most of the worlds in this set are ruled out). As with the counterfactual mugging, this argument is just all the stronger if there are no l-zombies because all mathematically possible experiences are in fact experienced.

*

As a final example, let's look at what I'd like to call Eliezer's anthropic mugging: the anthropic problem that inspired Psy-Kosh's non-anthropic one. This time, you're alone again, except that there are many of you: Omega is creating a million copies of you. It flips its usual very biased coin, and if that coin falls heads, it places all of you in exactly identical green rooms. If the coin falls tails, it places one of you in a green room, and all the others in red rooms. It then asks all copies in green rooms to choose between (H) and (T); if your choice agrees with the coin, FOOM, else DOOM.

|                          | Coin falls heads (chance = 1/1000) | Coin falls tails (chance = 999/1000) |
| Green roomers choose (H) | Positive intelligence explosion    | Humanity wiped out                   |
| Green roomers choose (T) | Humanity wiped out                 | Positive intelligence explosion      |

Our possible worlds are back to being $\Omega = \{H, T\}$, with probabilities $\mathbb{P}(H) = 1/1000$ and $\mathbb{P}(T) = 999/1000$. We are also back to being able to make a choice in only one particular situation, namely when you're a copy in a green room: $\mathcal{I} = \{\ast\}$. Actions are $\mathcal{A}(\ast) = \{H, T\}$, outcomes $\mathcal{O} = \{F, D\}$, utilities $u(F) = 1$ and $u(D) = 0$, and the outcome function is given by $o(H, \pi_H) = o(T, \pi_T) = F$ and $o(H, \pi_T) = o(T, \pi_H) = D$. In other words, from SUDT's perspective, this is exactly identical to the situation with the counterfactual mugging, and thus the solution is the same: Once more, SUDT recommends choosing (T).

On the other hand, the reason why someone might think that (H) could be the right answer is closer to that for Psy-Kosh's probably counterfactual mugging: After waking up in a green room, what should be your posterior probability that the coin has fallen heads? Updateful anthropic reasoning says that you should be quite sure that it has fallen heads. If you plug those probabilities into an expected utility calculation, it comes out as in Psy-Kosh's case, heavily favoring (H).
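To make the contrast concrete, here is a rough sketch of the "updateful" calculation next to the updateless one. It assumes one particular way of doing the anthropic update, namely weighting each world by its prior times the number of green-room copies in it; the post does not commit to a specific anthropic rule, so this weighting is an illustrative assumption. Utilities of 1 for the good outcome and 0 for the bad one are also assumed, so each expected utility is just a probability of matching the coin.

```python
from fractions import Fraction

N_COPIES = 1_000_000
prior = {"H": Fraction(1, 1000), "T": Fraction(999, 1000)}
# Number of copies that wake up in a green room in each world.
green_copies = {"H": N_COPIES, "T": 1}

# Assumed anthropic update: weight each world by prior * number of green-room copies.
weights = {w: prior[w] * green_copies[w] for w in prior}
total = sum(weights.values())
posterior = {w: weights[w] / total for w in prior}

# The good outcome happens exactly when the choice matches the coin.
updateful_eu = {choice: posterior[choice] for choice in ("H", "T")}
updateless_eu = {choice: prior[choice] for choice in ("H", "T")}

print("updateful favors: ", max(updateful_eu, key=updateful_eu.get))
print("updateless favors:", max(updateless_eu, key=updateless_eu.get))
```

Under this weighting the posterior for heads is 1,000,000/1,000,999, so the updateful calculation favors (H), while the prior-based SUDT calculation favors (T).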

But even if these are good probabilities to assign epistemically (to satisfy your curiosity about what the world probably looks like), in light of the arguments from the counterfactual and the probably counterfactual muggings (where updating definitely is the right thing to do epistemically, but plugging these probabilities into the expected utility calculation gives the wrong result), it doesn't seem strange to me to come to the conclusion that choosing (T) is correct in Eliezer's anthropic mugging as well.

Comment author: torekp 17 February 2014 09:28:24PM 0 points

I'm confused by your use of "priors". On a Tegmark IV sort of view, all meaningful sentences are true (in some universe). So the usefulness of the term "prior probability" turns on one's having at least some doubt about Tegmark IV, yes? I'm not accusing you of making any mistake over this; I just want reassurance or correction about my (mis)understanding of your probability talk.

Comment author: Benja 17 February 2014 10:04:24PM * 1 point

It's priors over logical states of affairs. Consider the following sentence: "There is a cellular automaton that can be described in at most 10 KB in programming language X, plus a computable function f() which can be described in another 10 KB in the same programming language, such that f() returns a space/time location within the cellular automaton corresponding to Earth as we know it in early 2014." This could be false even if Tegmark IV is true, and prior probability (i.e., probability without trying to do an anthropic update of the form "I observe this, so it's probably simple") says it's probably false.

Comment author: shminux 14 February 2014 10:49:15PM * 3 points

Imagine this: Once you finish reading this article, you hear a bell ringing, and then a sonorous voice announces: "You do indeed live in a Tegmark IV multiverse without a measure. You had better deal with it." And then it turns out that it's not just you who's heard that voice: Every single human being on the planet (who didn't sleep through it, isn't deaf etc.) has heard those same words.

Suppose instead the same voice says "You do not live in a Tegmark IV multiverse". Wouldn't you still conclude that you do, anyway? It is still a "disorderly experience", isn't it?

Comment author: Benja 14 February 2014 10:56:20PM 1 point

Yup, sure.

Comment author: FeepingCreature 14 February 2014 10:04:35PM 1 point

So you don't know that you live in a simple world. But, goes the obvious reply, you care much more about what happens if you do happen to live in the simple world.

You kind of seem to jump around there. Our world looks simple, that's why we're worrying so much about why our world looks so simple in the first place! Sure our world might not actually be simple, but we simply have no sufficient reason to distrust the copious simplicity our scientific inquiry seems to yield.

If I live in a simple world, I want to believe I live in a simple world. If I live in a complex, interventionistic world I want to believe I live in a complex, interventionistic world. The way to find out what sort of world I live in is to look at the world. It looks simple.

Comment author: Benja 14 February 2014 10:24:16PM 1 point

To summarize that part of the post: (1) The view I'm discussing there argues that the reason we find ourselves in a simple-looking world is that all possible experiences are consciously experienced, including the ones where the world looks simple, and we just happen to experience the latter. (2) If this is correct, then you cannot use the fact that you look around and see a simple-looking world to infer that you live in an actually simple world, because there are plenty of complex interventionistic worlds that look deceptively simple. In fact, the prior probability that the particular world you see is actually simple is extremely low. (3) However, if you value the things that happen in actually simple worlds more than the things that happen in complex worlds, then it's still correct to act as if your simple-looking world is in fact simple, despite the fact that prior probability says this is probably wrong (or to put this differently, even though most of the equally-existing mathematically possible humans reasoning like this will be wrong).

Comment author: Coscott 14 February 2014 09:37:16PM * 0 points

I feel like my first reaction was like yours, that I do not care about simplicity THAT much, but reflecting on it has made me think that maybe I really do care about simplicity that much.

First, let's remodel the problem. There is a collection of universes, each one an infinite string of bits, which encodes a Turing machine. Let's say the actual states of the TM are encoded on some finite subset of the infinite string, and the rest of the string is random bits that the TM can read if it chooses to.

The first obstacle to our intuition is that caring based on K-complexity is the same as caring about all of these different TMs equally, so what seems unfair in one model seems very fair in another model. This might be enough to convince you to care THAT much, but I imagine you have the following rebuttal:

Many of these simple TMs never even read the infinite string of random bits at the end. They are all exactly the same. I have this vague feeling of diminishing returns. A million of the same good thing and a million different bad things does not feel as good as a million of the same bad thing and a million different good things.

I feel this intuition myself, but maybe this is just a fallacy of projecting intuitions about diminishing returns from within one universe to questions about multiple universes that do not communicate with each other.

Comment author: Benja 14 February 2014 09:57:35PM 0 points

I don't feel like considering these different ways to approach K-complexity addresses the point I was trying to make. The rebuttal seems to be arguing that we should weigh the TMs that don't read the end of the tape equally, rather than weighing TMs more that read less of the tape. But my point isn't that I don't want to weigh complex TMs as much as simple TMs; it is (1) that I seem to be willing to consider TMs with one obviously disorderly event "pretty simple", even though I think they have high K-complexity; and (2) given this, only disregarding the possibility of magical reality fluid in worlds where I've seen a single obviously disorderly event doesn't seem to lose me all that much utility if measureless Tegmark IV is true, compared to the utility I may lose if there actually is magical reality fluid or something like that and I ignore this possibility and, because of this, act in a way that is very bad.

(If there aren't any important ways in which I'd act differently if measureless Tegmark IV is false, then this argument has no pull, but I think there may be; for example, if the ultrafinitist hypothesis from the end of my post were correct, that might make a difference to FAI theory.)

Comment author: Coscott 14 February 2014 08:50:58PM * 1 point

I might care more about the complex universes than I would if one universe existed, randomly selected from a probability measure. However, I attribute this to risk aversion. I think that I care similarly in my model to the way I would care with "reality fluid," but I care differently from the way I would care if there were one randomly selected universe, drawn from the same probability measure as the "reality fluid."

Comment author: Benja 14 February 2014 09:24:21PM 0 points

So, I can see that you would care similarly as you would in a multiverse with magical reality fluid that's distributed in the same proportions as your measure of caring, and if your measure of caring is K-complexity with respect to a universal Turing machine (UTM) we would consider simple, it's at least one plausible possibility that the true magical reality fluid is distributed in roughly those proportions. But given the state of our confusion, I think that conditional on there being a true measure, any single hypothesis as to how that measure is distributed should have significantly less than 50% probability, so "Conditional on there being a true measure, I would act the same way as according to my K-complexity based preferences" sounds wrong to me. (One particularly salient other possibility is that we could have magical reality fluid due to Tegmark I -- infinite space -- and Tegmark III -- many-worlds -- but not due to all mathematically possible universes existing, in which case we surely wouldn't get weightings that are close to K-complexity with a simple UTM. I mean, this is a case of one single universe, but with all possible experiences existing, to different degrees.)

I like simplicity, but not THAT much

15 Benja 14 February 2014 07:51PM

Followup to: L-zombies! (L-zombies?)
Reply to: Coscott's Preferences without Existence; Paul Christiano's comment on my l-zombies post

In my previous post, I introduced the idea of an "l-zombie", or logical philosophical zombie: A Turing machine that would simulate a conscious human being if it were run, but that is never run in the real, physical world, so that the experiences that this human would have had, if the Turing machine were run, aren't actually consciously experienced.

One common reply to this is to deny the possibility of logical philosophical zombies just like the possibility of physical philosophical zombies: to say that every mathematically possible conscious experience is in fact consciously experienced, and that there is no kind of "magical reality fluid" that makes some of these be experienced "more" than others. In other words, we live in the Tegmark Level IV universe, except that unlike what Tegmark argues in his paper, there's no objective measure on the collection of all mathematical structures, according to which some mathematical structures somehow "exist more" than others (and, although IIRC that's not part of Tegmark's argument, according to which the conscious experiences in some mathematical structures could be "experienced more" than those in other structures). All mathematically possible experiences are experienced, and to the same "degree".

So why is our world so orderly? There's a mathematically possible continuation of the world that you seem to be living in, where purple pumpkins are about to start falling from the sky. Or the light we observe coming in from outside our galaxy is suddenly replaced by white noise. Why don't you remember ever seeing anything as obviously disorderly as that?

And the answer to that, of course, is that among all the possible experiences that get experienced in this multiverse, there are orderly ones as well as non-orderly ones, so the fact that you happen to have orderly experiences isn't in conflict with the hypothesis; after all, the orderly experiences have to be experienced as well.

One might be tempted to argue that it's somehow more likely that you will observe an orderly world if everybody who has conscious experiences at all, or if at least most conscious observers, see an orderly world. (The "most observers" version of the argument assumes that there is a measure on the conscious observers, a.k.a. some kind of magical reality fluid.) But this requires the use of anthropic probabilities, and there is simply no (known) system of anthropic probabilities that gives reasonable answers in general. Fortunately, we have an alternative: Wei Dai's updateless decision theory (which was motivated in part exactly by the problem of how to act in this kind of multiverse). The basic idea is simple (though the details do contain devils): We have a prior over what the world looks like; we have some preferences about what we would like the world to look like; and we come up with a plan for what we should do in any circumstance we might find ourselves in that maximizes our expected utility, given our prior.

*

In this framework, Coscott and Paul suggest, everything adds up to normality if, instead of saying that some experiences objectively exist more, we happen to care more about some experiences than about others. (That's not a new idea, of course, or the first time this has appeared on LW -- for example, Wei Dai's What are probabilities, anyway? comes to mind.) In particular, suppose we just care more about experiences in mathematically really simple worlds -- or more precisely, places in mathematically simple worlds that are mathematically simple to describe (since there's a simple program that runs all Turing machines, and therefore all mathematically possible human experiences, always assuming that human brains are computable). Then, even though there's a version of you that's about to see purple pumpkins rain from the sky, you act in a way that's best in the world where that doesn't happen, because that world has so much lower K-complexity, and because you therefore care so much more about what happens in that world.

There's something unsettling about that, which I think deserves to be mentioned, even though I do not think it's a good counterargument to this view. This unsettling thing is that on priors, it's very unlikely that the world you experience arises from a really simple mathematical description. (This is a version of a point I also made in my previous post.) Even if the physicists had already figured out the simple Theory of Everything, which is a super-simple cellular automaton that accords really well with experiments, you don't know that this simple cellular automaton, if you ran it, would really produce you. After all, imagine that somebody intervened in Earth's history so that orchids never evolved, but otherwise left the laws of physics the same; there might still be humans, or something like humans, and they would still run experiments and find that they match the predictions of the simple cellular automaton, so they would assume that if you ran that cellular automaton, it would compute them -- except it wouldn't, it would compute us, with orchids and all. Unless, of course, it does compute them, and a special intervention is required to get the orchids.

So you don't know that you live in a simple world. But, goes the obvious reply, you care much more about what happens if you do happen to live in the simple world. On priors, it's probably not true; but it's best, according to your values, if all people like you act as if they live in the simple world (unless they're in a counterfactual mugging type of situation, where they can influence what happens in the simple world even if they're not in the simple world themselves), because if the actual people in the simple world act like that, that gives the highest utility.

You can adapt an argument that I was making in my l-zombies post to this setting: Given these preferences, it's fine for everybody to believe that they're in a simple world, because this will increase the correspondence between map and territory for the people that do live in simple worlds, and that's who you care most about.

*

I mostly agree with this reasoning. I agree that Tegmark IV without a measure seems like the most obvious and reasonable hypothesis about what the world looks like. I agree that there seems no reason for there to be a "magical reality fluid". I agree, therefore, that on the priors that I'd put into my UDT calculation for how I should act, it's much more likely that true reality is a measureless Tegmark IV than that it has some objective measure according to which some experiences are "experienced less" than others, or not experienced at all. I don't think I understand things well enough to be extremely confident in this, but my odds would certainly be in favor of it.

Moreover, I agree that if this is the case, then my preferences are to care more about the simpler worlds, making things add up to normality; I'd want to act as if purple pumpkins are not about to start falling from the sky, precisely because I care more about the consequences my actions have in more orderly worlds.

But.

*

Imagine this: Once you finish reading this article, you hear a bell ringing, and then a sonorous voice announces: "You do indeed live in a Tegmark IV multiverse without a measure. You had better deal with it." And then it turns out that it's not just you who's heard that voice: Every single human being on the planet (who didn't sleep through it, isn't deaf etc.) has heard those same words.

On the hypothesis, this is of course about to happen to you, though only in one of those worlds with high K-complexity that you don't care about very much.

So let's consider the following possible plan of action: You could act as if there is some difference between "existence" and "non-existence", or perhaps some graded degree of existence, until you hear those words and confirm that everybody else has heard them as well, or until you've experienced one similarly obviously "disorderly" event. So until that happens, you do things like invest time and energy into trying to figure out what the best way to act is if it turns out that there is some magical reality fluid, and into trying to figure out what a non-confused version of something like a measure on conscious experience could look like, and you act in ways that don't kill you if we happen to not live in a measureless Tegmark IV. But once you've had a disorderly experience, just a single one, you switch over to optimizing for the measureless mathematical multiverse.

If the degree to which you care about worlds is really set by their K-complexity (the higher the complexity, the less you care), with respect to what you and I would consider a "simple" universal Turing machine, then this would be a silly plan; there is very little to be gained from being right in worlds that have that much higher K-complexity. But when I query my intuitions, it seems like a rather good plan:

  • Yes, I care less about those disorderly worlds. But not as much less as if I valued them by their K-complexity. I seem to be willing to tap into my complex human intuitions to refer to the notion of "single obviously disorderly event", and assign the worlds with a single such event, and otherwise low K-complexity, not that much lower importance than the worlds with actual low K-complexity.
  • And if I imagine that the confused-seeming notions of "really physically exists" and "actually experienced" do have some objective meaning independent of my preferences, then I care much more about the difference between "I get to 'actually experience' a tomorrow" and "I 'really physically' get hit by a car today" than I care about the difference between the world with true low K-complexity and the worlds with a single disorderly event.

In other words, I agree that on the priors I put into my UDT calculation, it's much more likely that we live in measureless Tegmark IV; but my confidence in this isn't extreme, and if we don't, then the difference between "exists" and "doesn't exist" (or "is experienced a lot" and "is experienced only infinitesimally") is very important; much more important than the difference between "simple world" and "simple world plus one disorderly event" according to my preferences if we do live in a Tegmark IV universe. If I act optimally according to the Tegmark IV hypothesis in the latter worlds, that still gives me most of the utility that acting optimally in the truly simple worlds would give me -- or, more precisely, the utility differential isn't nearly as large as if there is something else going on, and I should be doing something about it, and I'm not.
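One way to make this comparison concrete is a toy expected-utility calculation. The sketch below is only an illustration under stated assumptions: the three policies, the names `Scenario` and `expected_utilities`, and every number are made up for the example, and the model ignores disorderly worlds when the Tegmark IV hypothesis is false. It is not part of UDT or SUDT as specified anywhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """All fields are made-up illustrative quantities, not values from the post."""
    p_t4: float          # prior that we live in a measureless Tegmark IV
    w_disorderly: float  # caring weight of one-disorderly-event worlds (simple worlds = 1)
    gain_t4: float       # per-unit-weight gain from optimizing for Tegmark IV when it is true
    gain_hedge: float    # gain from hedging when Tegmark IV is false


def expected_utilities(s: Scenario) -> dict:
    """Expected utility of three hypothetical policies, relative to a zero baseline."""
    p, q = s.p_t4, 1.0 - s.p_t4
    return {
        # Optimize for Tegmark IV everywhere; gains nothing if the hypothesis is false.
        "always_t4": p * s.gain_t4 * (1.0 + s.w_disorderly),
        # Hedge everywhere, even after a disorderly event.
        "never_switch": q * s.gain_hedge,
        # Hedge until a disorderly event, then optimize for Tegmark IV.
        "switch_on_disorder": p * s.gain_t4 * s.w_disorderly + q * s.gain_hedge,
    }


# Strict valuation: a disorderly event costing about 100 extra bits gets weight 2**-100.
strict = Scenario(p_t4=0.9, w_disorderly=2.0 ** -100, gain_t4=1.0, gain_hedge=100.0)
# Intuitive valuation: such worlds matter only somewhat less than simple ones.
intuitive = Scenario(p_t4=0.9, w_disorderly=0.1, gain_t4=1.0, gain_hedge=100.0)

for name, scenario in [("strict", strict), ("intuitive", intuitive)]:
    print(name, expected_utilities(scenario))
```

In this toy model, switching beats never switching by `p_t4 * gain_t4 * w_disorderly`, which is negligible under the strict weighting and noticeable under the milder one. Whether either hedging policy beats `always_t4` depends only on whether `(1 - p_t4) * gain_hedge` exceeds `p_t4 * gain_t4`.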

This is the reason why I'm trying to think seriously about things like l-zombies and magical reality fluid. I mean, I don't even think that these are particularly likely to be exactly right even if the measureless Tegmark IV hypothesis is wrong; I expect that there would be some new insight that makes even more sense than Tegmark IV, and makes all the confusion go away. But trying to grapple with the confused intuitions we currently have seems at least a possible way to make progress on this, if it should be the case that there is in fact progress to be made.

*

Here's one avenue of investigation that seems worthwhile to me, and wouldn't without the above argument. One thing I could imagine finding, that could make the confusion go away, would be that the intuitive notion of "all possible Turing machines" is just wrong, and leads to outright contradictions (e.g., to inconsistencies in Peano Arithmetic, or something similarly convincing). Lots of people have entertained the idea that concepts like the real numbers don't "really" exist, and only the behavior of computable functions is "real"; perhaps not even that is real, and true reality is more restricted? (You can reinterpret many results about real numbers as results about computable functions, so maybe you could reinterpret results about computable functions as results about these hypothetical weaker objects that would actually make mathematical sense.) So it wouldn't be the case after all that there is some Turing machine that computes the conscious experiences you would have if pumpkins started falling from the sky.

Does the above make sense? Probably not. But I'd say that there's a small chance that maybe yes, and that if we understood the right kind of math, it would seem very obvious that not all intuitively possible human experiences are actually mathematically possible (just as obvious as it is today, with hindsight, that there is no Turing machine which takes a program as input and outputs whether this program halts). Moreover, it seems plausible that this could have consequences for how we should act. This, together with my argument above, makes me think that this sort of thing is worth investigating -- even if my priors are heavily on the side of expecting that all experiences exist to the same degree, and ordinarily this difference in probabilities would make me think that our time would be better spent on investigating other, more likely hypotheses.

*

Leaving aside the question of how I should act, though, does all of this mean that I should believe that I live in a universe with l-zombies and magical reality fluid, until such time as I hear that voice speaking to me?

I do feel tempted to try to invoke my argument from the l-zombies post that I prefer the map-territory correspondences of actually existing humans to be correct, and don't care about whether l-zombies have their map match up with the territory. But I'm not sure that I care much more about actually existing humans being correct, if the measureless mathematical multiverse hypothesis is wrong, than I care about humans in simple worlds being correct, if that hypothesis is right. So I think that the right thing to do may be to have a subjective belief that I most likely do live in the measureless Tegmark IV, as long as that's the view that seems by far the least confused -- but continue to spend resources on investigating alternatives, because on priors they don't seem sufficiently unlikely to make up for the potential great importance of getting this right.

Comment author: ThisSpaceAvailable 08 February 2014 07:39:23AM 3 points

But you see Eliezer's comments because a conscious copy of Eliezer has been run. If I'm figuring out what output a program "would" give "if" it were run, in what sense am I not running it? Suppose I have a program MaybeZombie, and I run a Turing Test with it as the Testee and you as the Tester. Every time you send a question to MaybeZombie, I figure out what MaybeZombie would say if it were run, and send that response back to you. Can I get MaybeZombie to pass a Turing Test, without ever running it?

Comment author: Benja 10 February 2014 11:05:04PM 2 points

But you see Eliezer's comments because a conscious copy of Eliezer has been run.

A conscious copy of Eliezer that thought about what Eliezer would do when faced with that situation, not a conscious copy of Eliezer actually faced with that situation -- the latter Eliezer is still an l-zombie, if we live in a world with l-zombies.

Comment author: trist 07 February 2014 09:19:32PM * 10 points

Are cryopreserved humans l-zombies?

keeping in mind that if they were an l-zombie, they would still say "I have conscious experiences, so clearly I can't be an l-zombie"?

As well they should. For l-zombies to do anything they need to be run, whereupon they stop being l-zombies.

Comment author: Benja 07 February 2014 10:30:15PM 2 points

For l-zombies to do anything they need to be run, whereupon they stop being l-zombies.

Omega doesn't necessarily need to run a conscious copy of Eliezer to be pretty sure that Eliezer would pay up in the counterfactual mugging; it could use other information about Eliezer, like Eliezer's comments on LW, the way that I just did. It should be possible to achieve pretty high confidence that way about what Eliezer-being-asked-about-a-counterfactual-mugging would do, even if that version of Eliezer should happen to be an l-zombie.

Comment author: ESRogs 07 February 2014 09:54:49PM 4 points

Actually, there probably aren't any p-zombies

Should that (from the first line of the third paragraph) be l-zombies?

Comment author: Benja 07 February 2014 10:24:27PM 1 point

Fixed, thanks!
