A proof of Löb's theorem in Haskell
I'm not sure if this post is very on-topic for LW, but we have many folks who understand Haskell and many folks who are interested in Löb's theorem (see e.g. Eliezer's picture proof), so I thought why not post it here? If no one likes it, I can always just move it to my own blog.
A few days ago I stumbled across a post by Dan Piponi, claiming to show a Haskell implementation of something similar to Löb's theorem. Unfortunately his code had a couple flaws. It was circular and relied on Haskell's laziness, and it used an assumption that doesn't actually hold in logic (see the second comment by Ashley Yakeley there). So I started to wonder, what would it take to code up an actual proof? Wikipedia spells out the steps very nicely, so it seemed to be just a matter of programming.
Well, it turned out to be harder than I thought.
One problem is that Haskell has no type-level lambdas, which are the most obvious way (by Curry-Howard) to represent formulas with propositional variables. These are very useful for proving stuff in general, and Löb's theorem uses them to build fixpoints by the diagonal lemma.
The other problem is that Haskell is Turing complete, which means it can't really be used for proof checking, because a non-terminating program can be viewed as the proof of any sentence. Several people have told me that Agda or Idris might be better choices in this regard. Ultimately I decided to use Haskell after all, because that way the post will be understandable to a wider audience. It's easy enough to convince yourself by looking at the code that it is in fact total, and transliterate it into a total language if needed. (That way you can also use the nice type-level lambdas and fixpoints, instead of just postulating one particular fixpoint as I did in Haskell.)
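To see concretely why Turing completeness ruins proof checking, here is a tiny illustration of my own (the type `Claim` is hypothetical, standing in for any proposition): a looping definition inhabits every type, so the type checker alone cannot tell a real proof from a bogus one.

```haskell
-- A looping definition type-checks at *every* type, so in Haskell an
-- inhabitant of a type is not evidence that the proposition holds.
data Claim a  -- no constructors; stands in for an arbitrary proposition

bogus :: Claim a
bogus = bogus  -- never terminates, yet the type checker accepts it

main :: IO ()
main = putStrLn "bogus type-checks at every type"  -- we never force 'bogus'
```

This is exactly why the code below must be checked for totality by eye (or transliterated into Agda or Idris, where the checker does it for you).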
But the biggest problem for me was that the Web didn't seem to have any good explanations for the thing I wanted to do! At first it seems like modal proofs and Haskell-like languages should be a match made in heaven, but in reality it's full of subtle issues that no one has written down, as far as I know. So I'd like this post to serve as a reference, an example approach that avoids all difficulties and just works.
LW user lmm has helped me a lot with understanding the issues involved, and wrote a candidate implementation in Scala. The good folks on /r/haskell were also very helpful, especially Samuel Gélineau who suggested a nice partial implementation in Agda, which I then converted into the Haskell version below.
To play with it online, you can copy the whole bunch of code, then go to CompileOnline and paste it in the edit box on the left, replacing what's already there. Then click "Compile & Execute" in the top left. If it compiles without errors, that means everything is right with the world, so you can change something and try again. (I hate people who write about programming and don't make it easy to try out their code!) Here we go:
main = return ()
-- Assumptions
data Theorem a
logic1 = undefined :: Theorem (a -> b) -> Theorem a -> Theorem b
logic2 = undefined :: Theorem (a -> b) -> Theorem (b -> c) -> Theorem (a -> c)
logic3 = undefined :: Theorem (a -> b -> c) -> Theorem (a -> b) -> Theorem (a -> c)
data Provable a
rule1 = undefined :: Theorem a -> Theorem (Provable a)
rule2 = undefined :: Theorem (Provable a -> Provable (Provable a))
rule3 = undefined :: Theorem (Provable (a -> b) -> Provable a -> Provable b)
data P
premise = undefined :: Theorem (Provable P -> P)
data Psi
psi1 = undefined :: Theorem (Psi -> (Provable Psi -> P))
psi2 = undefined :: Theorem ((Provable Psi -> P) -> Psi)
-- Proof
step3 :: Theorem (Psi -> Provable Psi -> P)
step3 = psi1

step4 :: Theorem (Provable (Psi -> Provable Psi -> P))
step4 = rule1 step3

step5 :: Theorem (Provable Psi -> Provable (Provable Psi -> P))
step5 = logic1 rule3 step4

step6 :: Theorem (Provable (Provable Psi -> P) -> Provable (Provable Psi) -> Provable P)
step6 = rule3

step7 :: Theorem (Provable Psi -> Provable (Provable Psi) -> Provable P)
step7 = logic2 step5 step6

step8 :: Theorem (Provable Psi -> Provable (Provable Psi))
step8 = rule2

step9 :: Theorem (Provable Psi -> Provable P)
step9 = logic3 step7 step8

step10 :: Theorem (Provable Psi -> P)
step10 = logic2 step9 premise

step11 :: Theorem ((Provable Psi -> P) -> Psi)
step11 = psi2

step12 :: Theorem Psi
step12 = logic1 step11 step10

step13 :: Theorem (Provable Psi)
step13 = rule1 step12

step14 :: Theorem P
step14 = logic1 step10 step13
-- All the steps squished together
lemma :: Theorem (Provable Psi -> P)
lemma = logic2 (logic3 (logic2 (logic1 rule3 (rule1 psi1)) rule3) rule2) premise

theorem :: Theorem P
theorem = logic1 lemma (rule1 (logic1 psi2 lemma))
To make sense of the code, you should interpret the type constructor Theorem as the symbol ⊢ from the Wikipedia proof, and Provable as the symbol ☐. All the assumptions have value "undefined" because we don't care about their computational content, only their types. The assumptions logic1..3 give just enough propositional logic for the proof to work, while rule1..3 are direct translations of the three rules from Wikipedia. The assumptions psi1 and psi2 describe the specific fixpoint used in the proof, because adding general fixpoint machinery would make the code much more complicated. The types P and Psi, of course, correspond to sentences P and Ψ, and "premise" is the premise of the whole theorem, that is, ⊢(☐P→P). The conclusion ⊢P can be seen in the type of step14.
As for the "squished" version, I guess I wrote it just to satisfy my refactoring urge. I don't recommend that anyone try to read it, except maybe to marvel at the complexity :-)
EDIT: in addition to the previous Reddit thread, there's now a new Reddit thread about this post.
Anthropics doesn't explain why the Cold War stayed Cold
(Epistemic status: There are some lines of argument that I haven’t even started here, which potentially defeat the thesis advocated here. I don’t go into them because this is already too long or I can’t explain them adequately without derailing the main thesis. Similarly some continuations of chains of argument and counterargument begun here are terminated in the interest of focussing on the lower-order counterarguments. Overall this piece probably overstates my confidence in its thesis. It is quite possible this post will be torn to pieces in the comments—possibly by my own aforementioned elided considerations. That’s good too.)
I
George VI, King of the United Kingdom, had five siblings. That is, the father of current Queen Elizabeth II had as many siblings as on a typical human hand. (This paragraph is true, and is not a trick; in particular, the second sentence of this paragraph really is trying to disambiguate and help convey the fact in question and relate it to prior knowledge, rather than introduce an opening for some sleight of hand so I can laugh at you later, or whatever fear such a suspiciously simple proposition might engender.)
Let it be known.
II
Exactly one of the following stories is true:
Story One
Recently I hopped on Facebook and saw the following post:
“I notice that I am confused about why a nuclear war never occurred. Like, I think (knowing only the very little I know now) that if you had asked me, at the start of the Cold War or something, the probability that it would eventually lead to a nuclear war, I would've said it was moderately likely. So what's up with that?”
The post had 14 likes. In the comments, the most-Liked explanation was:
“anthropically you are considerably more likely to live in a world where there never was a fullscale nuclear war”
That comment had 17 Likes. The second-most-liked comment that offered an explanation had 4 Likes.
Story Two
Too good to be true
A friend recently posted a link on his Facebook page to an informational graphic about the alleged link between the MMR vaccine and autism. It said, if I recall correctly, that out of 60 studies on the matter, not one had indicated a link.
Presumably, with 95% confidence.
This bothered me. What are the odds, supposing there is no link between X and Y, of conducting 60 studies of the matter, and of all 60 concluding, with 95% confidence, that there is no link between X and Y?
Answer: .95 ^ 60 ≈ .046. (Use the first term of the binomial distribution.)
So if it were in fact true that 60 out of 60 studies failed to find a link between vaccines and autism at 95% confidence, this would prove, with 95% confidence, that studies in the literature are biased against finding a link between vaccines and autism.
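The arithmetic above is easy to check directly. A minimal sketch in Haskell (my own illustration, assuming each study independently has exactly a 5% false-positive rate):

```haskell
-- If each of 60 independent studies has a 5% chance of spuriously finding
-- a link, the chance that *all* 60 come back negative is 0.95^60 ~ 4.6%,
-- which is itself below the conventional 5% threshold.
main :: IO ()
main = do
  let pAllNegative = 0.95 ** 60 :: Double
  print (pAllNegative < 0.05)  -- True
```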
[LINK] No Boltzmann Brains in an Empty Expanding Universe
Another link to Sean Carroll's blog: Squelching Boltzmann Brains (And Maybe Eternal Inflation). The discussion of Boltzmann brains has come up many times on LW, starting with this post by Eliezer. Now Sean and his collaborators argue that in an empty expanding universe:
Quantum fluctuations are not dynamical processes inherent to a system, but instead reflect the statistical nature of measurement outcomes. Making a definite measurement requires an out-of-equilibrium, low-entropy detection apparatus that interacts with an environment to induce decoherence. Quantum variables are not equivalent to classical stochastic variables. They may behave similarly when measured repeatedly over time, in which case it is sensible to identify the nonzero variance of a quantum-mechanical observable with the physical fluctuations of a classical variable. In a truly stationary state, however, there are no fluctuations that decohere. We conclude that systems in such a state, including in particular the Hartle-Hawking vacuum, never fluctuate into lower-entropy states, including false vacua or configurations with Boltzmann brains.
Although our universe, today or during inflation, is of course not in the vacuum, the cosmic no-hair theorem implies that any patch in an expanding universe with a positive cosmological constant will asymptote to the vacuum. Within QFT in curved spacetime, the Boltzmann brain problem is thus eliminated: a patch in eternal de Sitter can form only a finite (and small) number of brains on its way to the vacuum.
In other words, in an empty universe no macroscopic areas of low entropy can form. And a non-vacuum expanding universe like ours becomes vacuum after a time too short to form more than a few Boltzmann brains.
Siren worlds and the perils of over-optimised search
tl;dr An unconstrained search through possible future worlds is a dangerous way of choosing positive outcomes. Constrained, imperfect or under-optimised searches work better.
Some suggested methods for designing AI goals, or controlling AIs, involve unconstrained searches through possible future worlds. This post argues that this is a very dangerous thing to do, because of the risk of being tricked by "siren worlds" or "marketing worlds". The thought experiment starts with an AI designing a siren world to fool us, but that AI is not crucial to the argument: it's simply an intuition pump to show that siren worlds can exist. Once they exist, there is a non-zero chance of us being seduced by them during an unconstrained search, whatever the search criteria are. This is a feature of optimisation: satisficing and similar approaches don't have the same problems.
The AI builds the siren worlds
Imagine that you have a superintelligent AI that's not just badly programmed, or lethally indifferent, but actually evil. Of course, it has successfully concealed this fact, as "don't let humans think I'm evil" is a convergent instrumental goal for all AIs.
We've successfully constrained this evil AI in an Oracle-like fashion. We ask the AI to design future worlds and present them to human inspection, along with an implementation pathway to create those worlds. Then if we approve of those future worlds, the implementation pathway will cause them to exist (assume perfect deterministic implementation for the moment). The constraints we've programmed mean that the AI will do all these steps honestly. Its opportunity to do evil is limited exclusively to its choice of worlds to present to us.
The AI will attempt to design a siren world: a world that seems irresistibly attractive while concealing hideous negative features. If the human mind is hackable in the crude sense - maybe through a series of coloured flashes - then the AI would design the siren world to be subtly full of these hacks. It might be that there is some standard of "irresistibly attractive" that is actually irresistibly attractive: the siren world would be full of genuine sirens.
Even without those types of approaches, there's so much manipulation the AI could indulge in. I could imagine myself (and many people on Less Wrong) falling for the following approach:
Don't teach people how to reach the top of a hill
When is it faster to rediscover something on your own than to learn it from someone who already knows it?
Sometimes it's faster to re-derive a proof or algorithm than to look it up. Keith Lynch re-invented the fast Fourier transform because he was too lazy to walk all the way to the library to get a book on it, although that's an extreme example. But if you have a complicated proof already laid out before you, and you are not Marc Drexler, it's generally faster to read it than to derive a new one. Yet I found a knowledge-intensive task where it would have been much faster to tell someone nothing at all than to tell them how to do it.
Political Skills which Increase Income
Summary: This article is intended for those who are "earning to give" (i.e. maximize income so that it can be donated to charity). It is basically an annotated bibliography of a few recent meta-analyses of predictors of income.
Key Results
- The degree to which management “sponsors” your career development is an important predictor of your salary, as is how skilled you are politically.
- Despite the stereotype of a silver-tongued salesman preying on people’s biases, rational appeals are generally the best tactic.
- After rationality, the best tactics are types of ingratiation, including flattery and acting modest.
Ng et al. performed a metastudy of over 200 individual studies of objective and subjective career success. Here are the variables they found best correlated with salary:
| Predictor | Correlation |
|---|---|
| Political Knowledge & Skills | 0.29 |
| Education Level | 0.29 |
| Cognitive Ability (as measured by standardized tests) | 0.27 |
| Age | 0.26 |
| Training and Skill Development Opportunities | 0.24 |
| Hours Worked | 0.24 |
| Career Sponsorship | 0.22 |
(all significant at p = .05)
(For reference, the “Big 5” personality traits all have a correlation under 0.12.)
Before we go on, a few caveats: while these correlations are significant and important, none are overwhelming (the authors cite Cohen as saying the range 0.24-0.36 is “medium” and correlations over 0.37 are “large”). Also, in addition to the usual correlation/causation concerns, there is lots of cross-correlation: e.g. older people might have greater political knowledge but less education, thereby confusing things. For a discussion of moderating variables, see the paper itself.
Career Sponsorship
There are two broad models of career advancement: contest-mobility and sponsorship-mobility. They are best illustrated with an example.
Suppose Peter and Penelope are both equally talented entry-level employees. Under the contest-mobility model, they would both be equally likely to get a raise or promotion, because they are equally skilled.
Sponsorship-mobility theorists argue that even if Peter and Penelope are equally talented, it’s likely that one of them will catch the eye of senior management. Perhaps it’s due to one of them having an early success by chance, making a joke in a meeting, or simply just having a more memorable name, like Penelope. This person will be singled out for additional training and job opportunities. Because of this, they’ll have greater success in the company, which will lead to more opportunities etc. As a result, their initial small discrepancy in attention gets multiplied into a large differential.
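To make the compounding story concrete, here is a toy numerical sketch of my own (the 5% per-cycle edge is an invented illustration, not a figure from Ng et al. or any sponsorship study):

```haskell
-- Under sponsorship-mobility, a small per-cycle advantage in opportunities
-- compounds multiplicatively across review cycles.
advantageAfter :: Int -> Double
advantageAfter cycles = (1.05 :: Double) ^ cycles  -- hypothetical 5% edge per cycle

main :: IO ()
main = print (advantageAfter 10 > 1.6)  -- True: ~63% cumulative gap after ten cycles
```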
The authors of the metastudy found that self-reported sponsorship levels (i.e. how much you feel the management of your company “sponsors” you) have a significant, although moderate, relationship to salary. Therefore, the level at which you currently feel sponsored in your job should be a factor when you consider alternate opportunities.
The Dilbert Effect
The strongest predictor of salary (tied with education level) is what the authors politely term “Political Knowledge & Skills” - less politely, how good you are at manipulating others.
Several popular books (such as Cialdini’s Influence) on the subject of influencing others exist, and the study of these “influence tactics” in business stretches back 30 years to Kipnis, Schmidt and Wilkinson. Recently, Higgins et al. reviewed 23 individual studies of these tactics and how they relate to career success. Their results:
| Tactic | Correlation | Definition (From Higgins et al.) |
|---|---|---|
| Rationality | 0.26 | Using data and information to make a logical argument supporting one's request |
| Ingratiation | 0.23 | Using behaviors designed to increase the target's liking of oneself or to make oneself appear friendly in order to get what one wants |
| Upward Appeal | 0.05 | Relying on the chain of command, calling in superiors to help get one's way |
| Self-Promotion | 0.01 | Attempting to create an appearance of competence or that you are capable of completing a task |
| Assertiveness | -0.02 | Using a forceful manner to get what one wants |
| Exchange | -0.03 | Making an explicit offer to do something for another in exchange for their doing what one wants |
(Only ingratiation and rationality are significant.)
This site has a lot of information on how to make rational appeals, so I will focus on the less-talked-about ingratiation techniques.
How to be Ingratiating
Gordon analyzed 69 studies of ingratiation and found the following (unlike the previous two sections, success here is measured in lab tests as well as in career advancement; however, similar but less comprehensive results have been found in terms of career success):
| Tactic | Weighted Effectiveness (Cohen’s d difference between control and intervention) | Description |
|---|---|---|
| Other Enhancement | 0.31 | Flattery |
| Opinion Conformity | 0.23 | “Go along to get along” |
| Self-presentation | 0.15 | Any of the following tactics: Self-promotion, self-deprecation, apologies, positive nonverbal displays and name usage |
| Combination | 0.10 | Includes studies where the participants weren’t told which strategy to use, in addition to when they were instructed to use multiple strategies |
| Rendering Favors | 0.05 | |
Self-presentation is split further:
| Tactic | Weighted Effect Size | Comment |
|---|---|---|
| Modesty | 0.77 | |
| Apology | 0.59 | Apologizing for poor performance |
| Generic | 0.28 | When the participant is told in generic terms to improve their self-presentation |
| Nonverbal behavior and name usage | -0.14 | Nonverbal behavior includes things like wearing perfume. Name usage means referring to people by name instead of a pronoun. |
| Self-promotion | -0.17 | |
Moderators
One important moderator is the direction of the appeal. If you are talking to your boss, your tactics should be different than if you’re talking to a subordinate. Other-enhancement (flattery) is always the best tactic no matter who you’re talking to, but when talking to superiors it’s by far the best. When talking to those at similar levels to you, opinion conformity comes close to flattery, and the other techniques aren't far behind.
Unsurprisingly, when the target realizes you’re being ingratiating, the tactic is less effective. (Although effectiveness doesn’t go to zero - even when people realize you’re flattering them just to suck up, they generally still appreciate it.) Also, women are better at being ingratiating than men, and men are more influenced by these ingratiating tactics than women. The most important caveat is that lab studies find much larger effect sizes than in the field, to the extent that the average field effect for the ingratiating tactics is negative. This is probably due to the fact that lab experiments can be better controlled.
Conclusion
It’s unlikely that a silver-tongued receptionist will out-earn an introverted engineer. But simple techniques like flattery and attempting to get "sponsored" can appreciably improve returns, to the extent that political skills are one of the strongest predictors of salaries.
I would like to thank Brian Tomasik and Gina Stuessy for reading early drafts of this article.
References
Cohen, Jacob. Statistical power analysis for the behavioral sciences. Psychology Press, 1988.
Gordon, Randall A. "Impact of ingratiation on judgments and evaluations: A meta-analytic investigation." Journal of Personality and Social Psychology 71.1 (1996): 54.
Higgins, Chad A., Timothy A. Judge, and Gerald R. Ferris. "Influence tactics and work outcomes: a meta‐analysis." Journal of Organizational Behavior 24.1 (2003): 89-106.
Judge, Timothy A., and Robert D. Bretz Jr. "Political influence behavior and career success." Journal of Management 20.1 (1994): 43-65.
Kipnis, David, Stuart M. Schmidt, and Ian Wilkinson. "Intraorganizational influence tactics: Explorations in getting one's way." Journal of Applied psychology 65.4 (1980): 440.
Ng, Thomas WH, et al. "Predictors of objective and subjective career success: A meta‐analysis." Personnel psychology 58.2 (2005): 367-408.
Expecting Short Inferential Distances
Homo sapiens' environment of evolutionary adaptedness (aka EEA or "ancestral environment") consisted of hunter-gatherer bands of at most 200 people, with no writing. All inherited knowledge was passed down by speech and memory.
In a world like that, all background knowledge is universal knowledge. All information not strictly private is public, period.
In the ancestral environment, you were unlikely to end up more than one inferential step away from anyone else. When you discover a new oasis, you don't have to explain to your fellow tribe members what an oasis is, or why it's a good idea to drink water, or how to walk. Only you know where the oasis lies; this is private knowledge. But everyone has the background to understand your description of the oasis, the concepts needed to think about water; this is universal knowledge. When you explain things in an ancestral environment, you almost never have to explain your concepts. At most you have to explain one new concept, not two or more simultaneously.
Don't rely on the system to guarantee you life satisfaction
A brief essay intended for high school students: any thoughts?
If you go to school, take the classes that people tell you to, do your homework, and engage in the extracurricular activities that your peers do, you'll be setting yourself up for an "okay" life. But you can do better than that.
I like simplicity, but not THAT much
Followup to: L-zombies! (L-zombies?)
Reply to: Coscott's Preferences without Existence; Paul Christiano's comment on my l-zombies post
In my previous post, I introduced the idea of an "l-zombie", or logical philosophical zombie: A Turing machine that would simulate a conscious human being if it were run, but that is never run in the real, physical world, so that the experiences that this human would have had, if the Turing machine were run, aren't actually consciously experienced.
One common reply to this is to deny the possibility of logical philosophical zombies just like the possibility of physical philosophical zombies: to say that every mathematically possible conscious experience is in fact consciously experienced, and that there is no kind of "magical reality fluid" that makes some of these be experienced "more" than others. In other words, we live in the Tegmark Level IV universe, except that unlike Tegmark argues in his paper, there's no objective measure on the collection of all mathematical structures, according to which some mathematical structures somehow "exist more" than others (and, although IIRC that's not part of Tegmark's argument, according to which the conscious experiences in some mathematical structures could be "experienced more" than those in other structures). All mathematically possible experiences are experienced, and to the same "degree".
So why is our world so orderly? There's a mathematically possible continuation of the world that you seem to be living in, where purple pumpkins are about to start falling from the sky. Or the light we observe coming in from outside our galaxy is suddenly replaced by white noise. Why don't you remember ever seeing anything as obviously disorderly as that?
And the answer to that, of course, is that among all the possible experiences that get experienced in this multiverse, there are orderly ones as well as non-orderly ones, so the fact that you happen to have orderly experiences isn't in conflict with the hypothesis; after all, the orderly experiences have to be experienced as well.
One might be tempted to argue that it's somehow more likely that you will observe an orderly world if everybody who has conscious experiences at all, or if at least most conscious observers, see an orderly world. (The "most observers" version of the argument assumes that there is a measure on the conscious observers, a.k.a. some kind of magical reality fluid.) But this requires the use of anthropic probabilities, and there is simply no (known) system of anthropic probabilities that gives reasonable answers in general. Fortunately, we have an alternative: Wei Dai's updateless decision theory (which was motivated in part exactly by the problem of how to act in this kind of multiverse). The basic idea is simple (though the details do contain devils): We have a prior over what the world looks like; we have some preferences about what we would like the world to look like; and we come up with a plan for what we should do in any circumstance we might find ourselves in that maximizes our expected utility, given our prior.
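The basic UDT move described above can be sketched in a few lines of Haskell. This is my own toy simplification, not Wei Dai's actual formalism: the worlds, observations, and utility numbers are all invented for illustration.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Toy UDT sketch: instead of updating on what we observe, choose once and
-- for all the policy (observation -> action) with the best prior-weighted
-- utility across all worlds.

data Obs = Orderly | Pumpkins deriving (Eq, Show)
data Act = ActNormal | ActWeird deriving (Eq, Show)

type Policy = Obs -> Act

-- hypothetical prior over worlds, each determining what we would observe
prior :: [(Double, Obs)]
prior = [(0.999, Orderly), (0.001, Pumpkins)]

-- hypothetical utility of each action given the world's observation
utility :: Obs -> Act -> Double
utility Orderly  ActNormal = 1
utility Orderly  ActWeird  = 0
utility Pumpkins ActNormal = 0.5
utility Pumpkins ActWeird  = 0.6

expectedU :: Policy -> Double
expectedU pol = sum [ p * utility o (pol o) | (p, o) <- prior ]

-- all four deterministic policies over the two observations
policies :: [Policy]
policies = [ \o -> if o == Orderly then a1 else a2
           | a1 <- [ActNormal, ActWeird], a2 <- [ActNormal, ActWeird] ]

best :: Policy
best = maximumBy (comparing expectedU) policies

main :: IO ()
main = print (best Orderly, best Pumpkins)  -- (ActNormal,ActWeird)
```

The chosen plan acts normally on orderly observations while still specifying what to do if the pumpkins fall, which is exactly the "plan for any circumstance" flavor of the paragraph above.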
*
In this framework, Coscott and Paul suggest, everything adds up to normality if, instead of saying that some experiences objectively exist more, we happen to care more about some experiences than about others. (That's not a new idea, of course, or the first time this has appeared on LW -- for example, Wei Dai's What are probabilities, anyway? comes to mind.) In particular, suppose we just care more about experiences in mathematically really simple worlds -- or more precisely, places in mathematically simple worlds that are mathematically simple to describe (since there's a simple program that runs all Turing machines, and therefore all mathematically possible human experiences, always assuming that human brains are computable). Then, even though there's a version of you that's about to see purple pumpkins rain from the sky, you act in a way that's best in the world where that doesn't happen, because that world has so much lower K-complexity, and because you therefore care so much more about what happens in that world.
There's something unsettling about that, which I think deserves to be mentioned, even though I do not think it's a good counterargument to this view. This unsettling thing is that on priors, it's very unlikely that the world you experience arises from a really simple mathematical description. (This is a version of a point I also made in my previous post.) Even if the physicists had already figured out the simple Theory of Everything, which is a super-simple cellular automaton that accords really well with experiments, you don't know that this simple cellular automaton, if you ran it, would really produce you. After all, imagine that somebody intervened in Earth's history so that orchids never evolved, but otherwise left the laws of physics the same; there might still be humans, or something like humans, and they would still run experiments and find that they match the predictions of the simple cellular automaton, so they would assume that if you ran that cellular automaton, it would compute them -- except it wouldn't, it would compute us, with orchids and all. Unless, of course, it does compute them, and a special intervention is required to get the orchids.
So you don't know that you live in a simple world. But, goes the obvious reply, you care much more about what happens if you do happen to live in the simple world. On priors, it's probably not true; but it's best, according to your values, if all people like you act as if they live in the simple world (unless they're in a counterfactual mugging type of situation, where they can influence what happens in the simple world even if they're not in the simple world themselves), because if the actual people in the simple world act like that, that gives the highest utility.
You can adapt an argument that I was making in my l-zombies post to this setting: Given these preferences, it's fine for everybody to believe that they're in a simple world, because this will increase the correspondence between map and territory for the people that do live in simple worlds, and that's who you care most about.
*
I mostly agree with this reasoning. I agree that Tegmark IV without a measure seems like the most obvious and reasonable hypothesis about what the world looks like. I agree that there seems no reason for there to be a "magical reality fluid". I agree, therefore, that on the priors that I'd put into my UDT calculation for how I should act, it's much more likely that true reality is a measureless Tegmark IV than that it has some objective measure according to which some experiences are "experienced less" than others, or not experienced at all. I don't think I understand things well enough to be extremely confident in this, but my odds would certainly be in favor of it.
Moreover, I agree that if this is the case, then my preferences are to care more about the simpler worlds, making things add up to normality; I'd want to act as if purple pumpkins are not about to start falling from the sky, precisely because I care more about the consequences my actions have in more orderly worlds.
But.
*
Imagine this: Once you finish reading this article, you hear a bell ringing, and then a sonorous voice announces: "You do indeed live in a Tegmark IV multiverse without a measure. You had better deal with it." And then it turns out that it's not just you who's heard that voice: Every single human being on the planet (who didn't sleep through it, isn't deaf etc.) has heard those same words.
On the hypothesis, this is of course about to happen to you, though only in one of those worlds with high K-complexity that you don't care about very much.
So let's consider the following possible plan of action: You could act as if there is some difference between "existence" and "non-existence", or perhaps some graded degree of existence, until you hear those words and confirm that everybody else has heard them as well, or until you've experienced one similarly obviously "disorderly" event. So until that happens, you do things like invest time and energy into trying to figure out what the best way to act is if it turns out that there is some magical reality fluid, and into trying to figure out what a non-confused version of something like a measure on conscious experience could look like, and you act in ways that don't kill you if we happen to not live in a measureless Tegmark IV. But once you've had a disorderly experience, just a single one, you switch over to optimizing for the measureless mathematical multiverse.
If the degree to which you care about worlds really falls off as steeply with their K-complexity as a simplicity prior suggests, with respect to what you and I would consider a "simple" universal Turing machine, then this would be a silly plan; there is very little to be gained from being right in worlds that have that much higher K-complexity. But when I query my intuitions, it seems like a rather good plan:
- Yes, I care less about those disorderly worlds. But not as much less as if I valued them by their K-complexity. I seem to be willing to tap into my complex human intuitions to pick out the notion of a "single obviously disorderly event", and to assign worlds with a single such event, but otherwise low K-complexity, not that much less importance than worlds of genuinely low K-complexity.
- And if I imagine that the confused-seeming notions of "really physically exists" and "actually experienced" do have some objective meaning independent of my preferences, then I care much more about the difference between "I get to 'actually experience' a tomorrow" and "I 'really physically' get hit by a car today" than I care about the difference between the world with true low K-complexity and the worlds with a single disorderly event.
In other words, I agree that on the priors I put into my UDT calculation, it's much more likely that we live in a measureless Tegmark IV; but my confidence in this isn't extreme, and if we don't, then the difference between "exists" and "doesn't exist" (or "is experienced a lot" and "is experienced only infinitesimally") is very important -- much more important, according to my preferences if we do live in a Tegmark IV universe, than the difference between "simple world" and "simple world plus one disorderly event". If I act optimally according to the Tegmark IV hypothesis in the latter worlds, that still gives me most of the utility that acting optimally in the truly simple worlds would -- or, more precisely, the utility differential isn't nearly as large as the one I face if something else entirely is going on, I should be doing something about it, and I'm not.
This is the reason why I'm trying to think seriously about things like l-zombies and magical reality fluid. I mean, I don't even think that these are particularly likely to be exactly right even if the measureless Tegmark IV hypothesis is wrong; I expect that there would be some new insight that makes even more sense than Tegmark IV, and makes all the confusion go away. But trying to grapple with the confused intuitions we currently have seems at least a possible way to make progress on this, if it should be the case that there is in fact progress to be made.
*
Here's one avenue of investigation that seems worthwhile to me, and wouldn't seem so without the above argument. One thing I could imagine finding, that could make the confusion go away, would be that the intuitive notion of "all possible Turing machines" is just wrong, and leads to outright contradictions (e.g., to inconsistencies in Peano Arithmetic, or something similarly convincing). Lots of people have entertained the idea that concepts like the real numbers don't "really" exist, and that only the behavior of computable functions is "real"; perhaps not even that is real, and true reality is more restricted still? (You can reinterpret many results about real numbers as results about computable functions, so maybe you could reinterpret results about computable functions as results about these hypothetical weaker objects that would actually make mathematical sense.) So it wouldn't be the case after all that there is some Turing machine that computes the conscious experiences you would have if pumpkins started falling from the sky.
Does the above make sense? Probably not. But I'd say that there's a small chance that maybe it does, and that if we understood the right kind of math, it would seem very obvious that not all intuitively possible human experiences are actually mathematically possible (just as obvious as it is today, with hindsight, that there is no Turing machine which takes a program as input and outputs whether that program halts). Moreover, it seems plausible that this could have consequences for how we should act. This, together with my argument above, makes me think that this sort of thing is worth investigating -- even if my priors are heavily on the side of expecting that all experiences exist to the same degree, and ordinarily this difference in probabilities would make me think that our time would be better spent on investigating other, more likely hypotheses.
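As an aside, the "no halting decider" fact invoked above can be made concrete in a few lines. Here's a minimal Python sketch of Turing's diagonal argument (the names `diagonal`, `halts`, and `g` are my own illustrative choices, not anything from the post): given any candidate halting-decider, we can mechanically build a program that it misjudges.

```python
def diagonal(halts):
    """Given a claimed halting-decider `halts(f)` (returns True iff
    calling f() would halt), build a function g that does the opposite
    of whatever `halts` predicts about g itself."""
    def g():
        if halts(g):
            while True:   # decider said "g halts", so loop forever
                pass
        # decider said "g loops", so halt immediately
    return g

# Refute the decider that claims every program loops:
g = diagonal(lambda f: False)
g()  # returns immediately, so that decider was wrong about g
```

A decider that instead claims "everything halts" is refuted symmetrically: its diagonal program loops forever, so we don't actually run that case; we just observe that the prediction and the behavior disagree either way.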
*
Leaving aside the question of how I should act, though, does all of this mean that I should believe that I live in a universe with l-zombies and magical reality fluid, until such time as I hear that voice speaking to me?
I do feel tempted to try to invoke my argument from the l-zombies post that I prefer the map-territory correspondences of actually existing humans to be correct, and don't care about whether l-zombies have their map match up with the territory. But I'm not sure that I care much more about actually existing humans being correct, if the measureless mathematical multiverse hypothesis is wrong, than I care about humans in simple worlds being correct, if that hypothesis is right. So I think that the right thing to do may be to have a subjective belief that I most likely do live in the measureless Tegmark IV, as long as that's the view that seems by far the least confused -- but continue to spend resources on investigating alternatives, because on priors they don't seem sufficiently unlikely to make up for the potential great importance of getting this right.