Dr. Jubjub: Sir, I have been running some calculations and I’m worried about the way our slithy toves are heading.
Prof. Bandersnatch: Huh? Why? The toves seem fine to me. Just look at them, gyring and gimbling in the wabe over there.
Dr. Jubjub: Yes, but there is a distinct negative trend in my data. The toves are gradually losing their slithiness.
Prof. Bandersnatch: Hmm, okay. That does sound serious. How long until it becomes a problem?
Dr. Jubjub: Well, I’d argue that it’s already having negative effects but I’d say we will reach a real crisis in around 120 years.
Prof. Bandersnatch: Phew, okay, you had me worried there for a moment. But it sounds like this is actually a non-problem. We can carry on working on the important stuff – technology will bail us out here in time.
Dr. Jubjub: Sir! We already have the technology to fix the toves. The most straightforward way would be to whiffle their tulgey wood but we could also...
Prof. Bandersnatch: What?? Whiffle their tulgey wood? Do you have any idea what that would cost? And besides, people won’t stand for it – slithy toves with unwhiffled tulgey wood are a part of our way of life.
Dr. Jubjub: So, when you say technology will bail us out you mean you expect a solution that will be cheap, socially acceptable and developed soon?
Prof. Bandersnatch: Of course! Prof. Jabberwock assures me the singularity will be here around tea-time on Tuesday. That is, if we roll up our sleeves and don’t waste time with trivialities like your tove issue.
Maybe it’s just me but I feel like I run into a lot of conversations like this around here. On any problem that won’t become an absolute crisis in the next few decades, someone will take the Bandersnatch view that it will be more easily solved later (with cheaper or more socially acceptable technology) so we shouldn’t work directly on it now. The way out is forward - let’s step on the gas and get to the finish line before any annoying problems catch up with us.
For all I know, Bandersnatch is absolutely right. But my natural inclination is to take the Jubjub view. I think the chances of a basically business-as-usual future for the next 200 or 300 years are not epsilon. They may not be very high but they seem like they need to be seriously taken into account. Problems may prove harder than they look. Apparently promising technology may not become practical. Maybe we'll have the capacity for AI in 50 years - but need another 500 years to make it friendly. I'd prefer humanity to plan in such a way that things will gradually improve rather than gradually deteriorate, even in a slow-technology scenario.
Several months ago I began a list of "things to try," which I share at the bottom of this post. It suggests many mundane, trivial-to-medium-cost changes to lifestyle and routine. Now that I've spent some time with most of them and pursued at least as many more personal items in the same spirit, I'll suggest you do something similar. Why?
- Raise the temperature in your optimization algorithm: avoid the trap of doing too much analysis on too little data and escape local optima.
- You can think of this as a system for self-improvement; something that operates on a meta level, unlike an object-level goal or technique; something that helps you fail at almost everything but still win big.
- Variety of experience is an intrinsic pleasure to many, and it may make you feel less as though time has flown when you look back on your life.
- Practice implementing small life changes, practice observing the effects of the changes, practice noticing further opportunities for changes, practice value of information calculations, and reinforce your self-image as an empiricist working to improve your life. Build small skills in the right order and you'll have better chances at bigger wins in the future.
- Advice often falls prey to the typical-mind (or typical-body) fallacy. That doesn't mean you should dismiss it out of hand. Think not just about how likely it is to work for you, but how beneficial it would be if it worked, how much it would cost to try, and how likely it is that trying it would give you enough information to change your behavior (see the back-of-envelope sketch after this list). Then just try it anyway if it's cheap enough, because you forgot to account for uncertainty in your model inputs.
- Speaking of value of information: don't ignore tweakable variables just because you don't yet have a gwern-tier tracking and evaluation apparatus for the perfect self-experiment. Sometimes you can expect consciously noticeable non-placebo effects from a successful trial. You might do better picking the low hanging fruit to gain momentum before you invest in a Zeo and a statistics textbook.
- You know what, if there's an effect, it may not even need to be non-placebo. Cf. "Lampshading," as well as the often-observed "honeymoon" period of success with new productivity systems.
- It's very tempting, especially in certain communities, to focus exclusively on shiny, counterintuitive, "rational," tech-based, hackeresque, or otherwise clever interventions and grand personal development schemes. Some of these are even good, but one suspects that some are optimized for punchiness, not effectiveness. Conversely, mundane ideas may not propagate as well, despite being potentially equally or more likely to succeed.
- If you were already convinced of all of the above, then great! I hope you have the agency to try stuff like this all the time. If not, you might find it useful, as I did, just to have a list like this available. It's one less trivial inconvenience between thinking "I should try more things" and actually trying something. I've also found that I'm more likely to notice and remember optimization opportunities now that I have a place to capture them. And having spent the time to write them down and occasionally look over them, I'm more likely to notice when I'm in a position to enact something context-dependent on the list.
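To make the cost-benefit framing above concrete, here's a back-of-envelope sketch in Python (the function and the numbers are mine, purely illustrative, not a rigorous VOI calculation):

```python
def value_of_trying(p_works, benefit, cost, p_detect=1.0):
    """Rough expected value of trying a piece of advice.

    p_works: chance the advice actually works for you
             (discounted for the typical-mind fallacy)
    benefit: value to you if it works and you keep it up
    cost: cost of running the trial
    p_detect: chance the trial gives you enough information
              to notice whether it worked
    """
    return p_works * p_detect * benefit - cost

# A $20, one-week trial of something with a 10% chance of being
# worth $1000 to you over time is clearly worth running:
print(value_of_trying(p_works=0.1, benefit=1000, cost=20))  # 80.0
```

And since point estimates of these inputs are themselves uncertain, cheap trials are even better deals than a single calculation like this suggests.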
I removed the terribly personal items from my list, but what remains is still somewhat tailored to my own situation and habits. These are not recommendations; they are just things that struck me as having enough potential value to try for a week or two. The list isn't remotely comprehensive, even as far as mundane self-experiments are concerned, but it's left as an exercise to the reader to find and fill the gaps. Take this list as an example or as a starting point, and brainstorm ideas of your own in the comments. The usual recommendation applies against going overboard in domains where you're currently impulsive or unreflective.
Related posts: Boring Advice Repository, Break your habits: Be more empirical, On saying the obvious, Value of Information: Four Examples, Spend money on ergonomics, Go try things, Don't fear failure, Just try it: Quantity trumps quality, No, seriously, just try it, etc.
It feels like most people have a moral intuition along the lines of "you should let people do what they want, unless they're hurting other people". We follow this guideline, and we expect other people to follow it. I'll call this the permissiveness principle, that behaviour should be permitted by default. When someone violates the permissiveness principle, we might call them a fascist, someone who exercises control for the sake of control.
And there's another moral intuition, the harm-minimising principle: "you should not hurt other people unless you have a good reason". When someone violates harm-minimisation, we might call them a rake, someone who acts purely for their own pleasure without regard for others.
But sometimes people disagree about what counts as "hurting other people". Maybe one group of people believes that tic-tacs are sentient, and that eating them constitutes harm; and another group believes that tic-tacs are not sentient, so eating them does not hurt anyone.
What should happen here is that people try to work out exactly what it is they disagree about and why. What actually happens is that people appeal to permissiveness.
Of course, by the permissiveness principle, people should be allowed to believe what they want, because holding a belief is harmless as long as you don't act on it. So we say something like "I have no problem with people being morally opposed to eating tic-tacs, but they shouldn't impose their beliefs on the rest of us."
Except that by the harm-minimising principle, those people probably should impose their beliefs on the rest of us. Forbidding you to eat tic-tacs doesn't hurt you much, and it saves the tic-tacs a lot of grief.
It's not that they disagree with the permissiveness principle, they just think it doesn't apply. So appealing to the permissiveness principle isn't going to help much.
I think the problem (or at least part of it) is, depending how you look at it, either double standards or not-double-enough standards.
I apply the permissiveness principle "unless they're hurting other people", which really means "unless I think they're hurting other people". I want you to apply the permissiveness principle "unless they're hurting other people", which still means "unless I think they're hurting other people".
Meanwhile, you apply the permissiveness principle unless you think someone is hurting other people; and you want me to apply it unless you think they're hurting other people.
So when we disagree about whether or not something is hurting other people, I think you're a fascist because you're failing to apply the permissiveness principle; and you think I'm a rake because I'm failing to apply the harm-minimisation principle; or vice-versa. Neither of these things is true, of course.
It gets worse, because once I've decided that you're a fascist, I think the reason we're arguing is that you're a fascist. If you would only stop being a fascist, we could get along fine. You can go on thinking tic-tacs are sentient, you just need to stop being a fascist.
But you're not a fascist. The real reason we're arguing is that you think tic-tacs are sentient. You're acting exactly as you should do if tic-tacs were sentient, but they're not. I need to stop treating you like a fascist, and start trying to convince you that tic-tacs are not sentient.
And, symmetrically, you've decided I'm a rake, which isn't true, and you've decided that that's why we're arguing, which isn't true; we're arguing because I think tic-tacs aren't sentient. You need to stop treating me like a rake, and start trying to convince me that tic-tacs are sentient.
I don't expect either of us to actually convince the other, very often. If it was that easy, someone would probably have already done it. But at least I'd like us both to acknowledge that our opponent is neither a fascist nor a rake, they just believe something that isn't true.
Does the surveillance state affect us? It has affected me, and I didn't realize it was affecting me until recently. Here are a few examples of how:
- I was once engaged in a discussion on Facebook about Obama's foreign policy. Around that time, I was going to apply for a US visa. I stopped the discussion early. Semi-consciously, I was worried that what I was writing would be checked by US visa officials and would lead to my visa being denied.
- I was once really interested in reading up on the Unabomber and his manifesto, because somebody mentioned that he had some interesting ideas and, though fundamentally misguided, might have been onto something. I didn't explore much because I was worried, again semi-consciously, that my traffic history would be logged on some NSA computer somewhere, and that I'd pattern-match to the Unabomber (I'm a physics grad student; the Unabomber was a mathematician).
- I didn't visit Silk Road as I was worried that my visits would be traced, even though I had no plans of buying anything.
- Just generally, I try to not search for some really weird stuff that I want to search for (I'm a curious guy!).
- I was almost not going to write this post.
A big Singularity-themed Hollywood movie out in April offers many opportunities to talk about AI risk
There's a big Hollywood movie coming out with an apocalyptic Singularity-like story, called Transcendence. (IMDB, Wiki, official site) With an A-list cast and big budget, I contend this movie is the front-runner to be 2014's most significant influence on discussions of superintelligence outside specialist circles. Anyone hoping to influence those discussions should start preparing some talking points.
I don't see anybody here agreeing with me on this. The movie was briefly discussed on LW when it was first announced in March 2013, but since then, only the trailer (out since December) has been mentioned. MIRI hasn't published a word about it. This amazes me. We have three months until millions of people who have never considered superintelligence start thinking about it - is nobody bothering to craft a response to the movie yet? Shouldn't there be something that lazy journalists, assigned to write about this movie, can find?
Because if there isn't, they'll dismiss the danger of AI like Erik Sofge already did in an early piece about the movie for Popular Science, and nudge their readers to do so too. And that'd be a shame, wouldn't it?
After moving in with my new roomies (Danny and Bethany of Beeminder), I discovered they have a fair and useful way of auctioning off joint decisions. It helps you figure out how much you value certain chores or activities, and it guarantees that these decisions are worked out in a fair way. They call it "yootling", and wrote more about it here.
A quick example (Note: this only works if all participants are of the types of people who consider this sort of thing a Good Idea, and not A Grotesque Parody of Caring or whatnot):
Use Case: Who Picks up the Kids from Grandma's?
D and B are both busy working, but it's time to pick up the kids from their grandparents house. They decide to yootle for it.
B bids $100 (In a regular Normal Person exchange, this would be like saying "I'm elbows deep in code right now, and don't want to break flow. I'd really rather continue working right now, but of course I'll go if it's needed.")
D bids $15 (In a regular Normal Person exchange this would be like saying "I don't mind too much, though I do have other things to do now...")
So D "wins" the bid, and B pays him $15 to go get the kids from their grandma's.
Of course... it would be a pain in the butt to constantly be paying each other, so instead they have a 10% chance of paying 10x the amount, and a 90% chance of paying nothing, using a random number generator.
This is made easier by the fact that we have a bot to run it, but before that they used the high-tech solution of Holding Up Fingers.
We may do this multiple times per day, whenever there's a good that we have shared ownership of and one of us wants to offload their shares onto the other person. The goods can be anything, e.g. the last brownie, but they're more often "bads", like who will get up in the middle of the night with a vomiting child, or who will book plane tickets for a trip. We find this an elegant means of assigning loathed tasks. The person who minded least winds up doing the chore, but gets compensated for it at a price that, by their own estimation, was fair.
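For concreteness, here's a minimal Python sketch of the decision auction plus the randomized-payment trick described above (my own illustration, not their actual bot; the function names are hypothetical):

```python
import random

def yootle(my_bid, their_bid):
    """Sealed-bid chore auction as in the kids-pickup example:
    the lower bidder 'wins' (does the chore) and is owed their own
    bid by the other person. Ties are broken by coin flip."""
    if my_bid == their_bid:
        i_win = random.random() < 0.5
    else:
        i_win = my_bid < their_bid
    return i_win, min(my_bid, their_bid)

def randomized_payment(amount, multiplier=10):
    """Pay `multiplier` times the amount with probability
    1/multiplier, else nothing -- same expected value as paying
    `amount` outright, but far fewer actual transfers."""
    return amount * multiplier if random.random() < 1.0 / multiplier else 0.0

# The kids-pickup example: D bids $15, B bids $100.
d_wins, owed = yootle(my_bid=15, their_bid=100)
print(d_wins, owed)              # True, 15: D goes, and B owes him $15
print(randomized_payment(owed))  # $150 with 10% odds, $0 otherwise
```

Note that the expected payment is unchanged (0.1 x $150 = $15), so in the long run nobody does better or worse than with exact payments.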
Joint purchase auction

The decision auction and variants are about allocating shared or partially shared resources to one person or the other, or picking one person to do something. Once in a while you have the opposite problem: deciding on a joint purchase.

Suppose Danny thinks we need a new sofa (this is very hypothetical). I think the one we have is just fine, thank you. After some discussion I concede that it would be nice to have a sofa that was less doggy. Danny, being terribly excited about getting a new sofa, does a bunch of research and finds his ideal sofa. I think it is a bit overpriced considering it is going to be a piece of gymnastics equipment for the kids for the next 6 years. Conflict ensues! I could bluff that I'm not interested in a new sofa at all and that he can buy it himself if he wants it that badly. But he probably doesn't want it that bad, and I do want it a little.

If only we could buy the sofa conditional on our combined utility for it exceeding the cost, and pay in proportion to our utilities to boot. Well, thanks to separate finances and the magic of mechanism design, we can! We submit sealed bids for the sofa and buy it if the sum of our bids is enough. (And, importantly, commit to not buying it for at least a year otherwise.) Any surplus is redistributed in proportion to our bids. For example, if Danny bid $80 and I bid $40 to buy a hundred dollar sofa, then we'd buy it, with Danny chipping in twice as much as me, namely $67 to my $33.
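Again a sketch in Python, mirroring the sofa example (the function name is mine):

```python
def joint_purchase(bids, cost):
    """Sealed-bid joint purchase: buy only if the bids sum to at
    least the cost, splitting the cost in proportion to the bids,
    which redistributes any surplus in the same proportions."""
    total = sum(bids.values())
    if total < cost:
        return None  # no purchase (and commit not to buy for a year)
    return {name: cost * bid / total for name, bid in bids.items()}

# The sofa example: Danny bids $80, Bethany bids $40, sofa costs $100.
print(joint_purchase({"Danny": 80, "Bethany": 40}, cost=100))
# {'Danny': 66.66..., 'Bethany': 33.33...} -- roughly the $67 and $33 above
```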
Generosity without sacrificing social efficiency

If you're thinking "how mercenary all this is!" then, well, I'm unclear how you made it this far into this post. But it's not nearly as cold as it may sound. We do nice things for each other all the time, and frequently use yootling to make sure it's socially efficient to do so.

Suppose I invite Danny to a sing-along showing of Once More With Feeling (this may or may not be hypothetical) and Danny doesn't exactly want to go but can see that I have value for his company. He might (quite non-hypothetically) say "I'll half-accompany you!" by which he means that he'll yootle me for whether he goes or not. In other words, he magnanimously decides to treat his joining me as a 50/50 joint decision. If I have greater value for him coming than he has for not coming, then I'll pay him to come. But if it's the other way around, he will pay me to let him off the hook. We don't actually care much about the payments, though those are necessary for the auction to work. We care about making sure that he comes to the Buffy sing-along if and only if my value for his company exceeds his value for staying home. The payments are simply what keep us honest in assessing that. The increased fairness — the winner sharing their utility with the loser — is icing.
In another attack on the resource-based model of willpower, Michael Inzlicht, Brandon J. Schmeichel and C. Neil Macrae have a paper called "Why Self-Control Seems (but may not be) Limited", in press at Trends in Cognitive Sciences. Ungated version here.
Some of the most interesting points:
- Over 100 studies appear to be consistent with self-control being a limited resource, but these studies generally do not observe resource depletion directly; they infer it from whether people's performance declines in a second self-control task.
- The only attempts to directly measure the loss or gain of a resource have been studies measuring blood glucose, but these studies have serious limitations, the most important being an inability to replicate evidence of mental effort actually affecting the level of glucose in the blood.
- Self-control also seems to be replenished by things such as "watching a favorite television program, affirming some core value, or even praying", which would seem to conflict with the hypothesis of inherent resource limitations. The resource-based model also seems evolutionarily implausible.
The authors offer their own theory of self-control. One-sentence summary (my formulation, not from the paper): "Our brains don't want to only work, because by doing some play on the side, we may come to discover things that will allow us to do even more valuable work."
- Ultimately, self-control limitations are proposed to be an exploration-exploitation tradeoff, "regulating the extent to which the control system favors task engagement (exploitation) versus task disengagement and sampling of other opportunities (exploration)".
- Research suggests that cognitive effort is inherently aversive, and that after humans have worked on some task for a while, "ever more resources are needed to counteract the aversiveness of work, or else people will gravitate toward inherently rewarding leisure instead". According to the model proposed by the authors, this allows the organism to both focus on activities that will provide it with rewards (exploitation), but also to disengage from them and seek activities which may be even more rewarding (exploration). Feelings such as boredom function to stop the organism from getting too fixated on individual tasks, and allow us to spend some time on tasks which might turn out to be even more valuable.
The explanation of the actual proposed psychological mechanism is good enough that it deserves to be quoted in full:
Based on the tradeoffs identified above, we propose that initial acts of control lead to shifts in motivation away from “have-to” or “ought-to” goals and toward “want-to” goals (see Figure 2). “Have-to” tasks are carried out through a sense of duty or contractual obligation, while “want-to” tasks are carried out because they are personally enjoyable and meaningful; as such, “want-to” tasks feel easy to perform and to maintain in focal attention. The distinction between “have-to” and “want-to,” however, is not always clear cut, with some “want-to” goals (e.g., wanting to lose weight) being more introjected and feeling more like “have-to” goals because they are adopted out of a sense of duty, societal conformity, or guilt instead of anticipated pleasure.
According to decades of research on self-determination theory, the quality of motivation that people apply to a situation ranges from extrinsic motivation, whereby behavior is performed because of external demand or reward, to intrinsic motivation, whereby behavior is performed because it is inherently enjoyable and rewarding. Thus, when we suggest that depletion leads to a shift from “have-to” to “want-to” goals, we are suggesting that prior acts of cognitive effort lead people to prefer activities that they deem enjoyable or gratifying over activities that they feel they ought to do because it corresponds to some external pressure or introjected goal. For example, after initial cognitive exertion, restrained eaters prefer to indulge their sweet tooth rather than adhere to their strict views of what is appropriate to eat. Crucially, this shift from “have-to” to “want-to” can be offset when people become (internally or externally) motivated to perform a “have-to” task. Thus, it is not that people cannot control themselves on some externally mandated task (e.g., name colors, do not read words); it is that they do not feel like controlling themselves, preferring to indulge instead in more inherently enjoyable and easier pursuits (e.g., read words). Like fatigue, the effect is driven by reluctance and not incapability (see Box 2).
Research is consistent with this motivational viewpoint. Although working hard at Time 1 tends to lead to less control on “have-to” tasks at Time 2, this effect is attenuated when participants are motivated to perform the Time 2 task, personally invested in the Time 2 task, or when they enjoy the Time 1 task. Similarly, although performance tends to falter after continuously performing a task for a long period, it returns to baseline when participants are rewarded for their efforts, and remains stable for participants who have some control over and are thus engaged with the task. Motivation, in short, moderates depletion. We suggest that changes in task motivation also mediate depletion.
Depletion, however, is not simply less motivation overall. Rather, it is produced by lower motivation to engage in “have-to” tasks, yet higher motivation to engage in “want-to” tasks. Depletion stokes desire. Thus, working hard at Time 1 increases approach motivation, as indexed by self-reported states, impulsive responding, and sensitivity to inherently-rewarding, appetitive stimuli. This shift in motivational priorities from “have-to” to “want-to” means that depletion can increase the reward value of inherently-rewarding stimuli. For example, when depleted dieters see food cues, they show more activity in the orbitofrontal cortex, a brain area associated with coding reward value, compared to non-depleted dieters.
Let's say that you are at your local Less Wrong meetup and someone makes some strong claim and seems very sure of himself: "blah blah blah resurrected blah blah alicorn princess blah blah 99 percent sure." You think he is probably correct, you estimate a 67 percent chance, but you think he is way overconfident. "Wanna bet?" you ask.
"Sure," he responds, and you both check your wallets and have 25 dollars each. "Okay," he says, "now you pick some betting odds, and I'll choose which side I want to pick."
"That's crazy," you say, "I am going to pick the odds so that I cannot be taken advantage of, which means that I will be indifferent between which of the two options you pick, which means that I will expect to gain 0 dollars from this transaction. I wont take it. It is not fair!"
"Okay," he says, annoyed with you. "We will both write down the probability we think that I am correct, average them together, and that will determine the betting odds. We'll bet as much as we can of our 25 dollars with those odds."
"What do you mean by 'average' I can think of at least four possibilities. Also, since I know your probability is high, I will just choose a high probability that is still less than it to maximize the odds in my favor regardless of my actual belief. Your proposition is not strategy proof."
"Fine, what do you suggest?"
You take out some paper, solve some differential equations, and explain how the bet should go.
Satisfied with your math, you share your probability; he puts $13.28 on the table, and you put down $2.72.
"Now what?" He asks.
A third meetup member quickly takes the 16 dollars from the table and answers, "You wait."
I will now derive a general algorithm for determining a bet from two probabilities and a maximum amount of money that people are willing to bet. This algorithm is both strategy proof and fair. The solution turns out to be simple, so if you like, you can skip to the last paragraph, and use it next time you want to make a friendly bet. If you want to try to derive the solution on your own, you might want to stop reading now.
First, we have to be clear about what we mean by strategy proof and fair. "Strategy proof" is clear. Our algorithm should ensure that neither person believes that they can increase their expected profit by lying about their probabilities. "Fair" will be a little harder to define. There is more than one way to define "fair" in this context, but there is one way which I think is probably the best. When the players make the bet, they both will expect to make some profit. They will not both be correct, but they will both believe they are expected to make profit. I claim the bet is fair if both players expect to make the same profit on average.
Now, let's formalize the problem:
Alice believes S is true with probability p. Bob believes S is false with probability q. Both players are willing to bet up to d dollars. Without loss of generality, assume p+q>1. Our betting algorithm will output a dollar amount, f(p,q), for Alice to put on the table and a dollar amount, g(p,q) for Bob to put on the table. Then if S is true, Alice gets all the money, and if S is false, Bob gets all the money.
From Alice's point of view, her expected profit will be p(g(p,q))+(1-p)(-f(p,q)).
From Bob's point of view, his expected profit will be q(f(p,q))+(1-q)(-g(p,q)).
Setting these two values equal, and simplifying, we get that (1+p-q)g(p,q)=(1+q-p)f(p,q), which is the condition that the betting algorithm is fair.
For convenience of notation, we will define h(p,q) by h(p,q)=g(p,q)/(1+q-p)=f(p,q)/(1+p-q).
Now, we want to look at what will happen if Alice lies about her probability. If instead of saying p, Alice were to say that her probability was r, then her expected profit would be p(g(r,q))+(1-p)(-f(r,q)), which equals p(1+q-r)h(r,q)+(1-p)(-(1+r-q)h(r,q))=(2p-1-r+q)h(r,q).
We want this value, as a function of r, to be maximized when r=p, which means that -h+(-1+r+q)(dh/dr)=0 when we set r=p.
Separation of variables gives us (1/h)dh=1/(-1+r+q)dr,
which integrates to ln(h)=C+ln(-1+r+q),
which simplifies to h=e^C(-1+r+q), so setting r=p gives h=e^C(-1+p+q).
This gives the solution f(p,q)=e^C(-1+p+q)(1+p-q)=e^C(p^2-(1-q)^2) and g(p,q)=e^C(-1+p+q)(1+q-p)=e^C(q^2-(1-p)^2).
It is quick to verify that this solution is actually fair, and both players' expected profit is maximized by honest reporting of beliefs.
The value of the constant multiplied out in front can be anything, and the most either player could ever have to put on the table is equal to this constant. Therefore, if both players are willing to bet up to d dollars, we should define e^C=d.
Alice and Bob are willing to bet up to d dollars, Alice thinks S is true with probability p, and Bob thinks S is false with probability q. Assuming p+q>1, Alice should put in d(p^2-(1-q)^2), while Bob should put in d(q^2-(1-p)^2). I suggest you use this algorithm next time you want to have a friendly wager (with a rational person), and I suggest you set d to 25 dollars and require both players to say an odd integer percent to ensure a whole number of cents.
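If you'd rather not redo the arithmetic at the table, here is that final recipe as a short Python function (a sketch; the function and variable names are mine):

```python
def bet_stakes(p, q, d=25.0):
    """Stakes for the fair, strategy proof bet derived above.

    p: Alice's probability that S is true
    q: Bob's probability that S is false (requires p + q > 1)
    d: the most either player is willing to put on the table

    Returns (alice_stake, bob_stake); whoever is right takes the pot.
    """
    assert p + q > 1, "Alice and Bob don't disagree in the right direction"
    alice_stake = d * (p**2 - (1 - q)**2)
    bob_stake = d * (q**2 - (1 - p)**2)
    return alice_stake, bob_stake

# The example from the dialogue: he is 99% sure, you think he's right
# with probability 67% (so your probability that he's wrong is q = 0.33):
print(bet_stakes(p=0.99, q=0.33))
# approximately (13.28, 2.72) -- the $13.28 and $2.72 from the story
```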
There's been a lot of fuss lately about Google's gadgets. Computers can drive cars - pretty amazing, eh? I guess. But what amazed me as a child was that people can drive cars. I'd sit in the back seat while an adult controlled a machine taking us at insane speeds through a cluttered, seemingly quite unsafe environment. I distinctly remember thinking that something about this just doesn't add up.
It looked to me like there was just no adequate mechanism to keep the car on the road. At the speeds cars travel, a tiny deviation from the correct course would take us flying off the road in just a couple of seconds. Yet the adults seemed pretty nonchalant about it - the adult in the driver's seat could have relaxed conversations with other people in the car. But I knew that people were pretty clumsy. I was an ungainly kid but I knew even the adults would bump into stuff, drop things and generally fumble from time to time. Why didn't that seem to happen in the car? I felt I was missing something. Maybe there were magnets in the road?
Now that I am a driving adult I could more or less explain this to a 12-year-old me:
1. Yes, the course needs to be controlled very exactly and you need to make constant tiny course corrections or you're off to a serious accident in no time.
2. Fortunately, the steering wheel is a really good instrument for making small course corrections. The design is somewhat clumsiness-resistant.
3. Nevertheless, you really are just one misstep away from death and you need to focus intently. You can't take your eyes off the road for even one second. Under good circumstances, you can have light conversations while driving but a big part of your mind is still tied up by the task.
4. People can drive cars - but only just barely. You can't do it safely even while only mildly inebriated. That's not just an arbitrary law - the hit to your reflexes substantially increases the risks. You can do pretty much all other normal tasks after a couple of drinks, but not this.
So my 12-year-old self was not completely mistaken but still ultimately wrong. There are no magnets in the road. The explanation for why driving works out is mostly that people are just somewhat more capable than I'd thought. In my more sunny moments I hope that I'm making similar errors when thinking about artificial intelligence. Maybe creating a safe AGI isn't as impossible as it looks to me. Maybe it isn't beyond human capabilities. Maybe.
Edit: I intended no real analogy between AGI design and driving or car design - just the general observation that people are sometimes more competent than I expect. I find it interesting that multiple commenters note that they have also been puzzled by the relative safety of traffic. I'm not sure what lesson to draw.
On the most recent LessWrong readership survey, I assigned a probability of 0.30 on the cryonics question. I had previously been persuaded to sign up for cryonics by reading the sequences, but this thread and particularly this comment lowered my estimate of the chances of cryonics working considerably. Also relevant from the same thread was ciphergoth's comment:
By and large cryonics critics don't make clear exactly what part of the cryonics argument they mean to target, so it's hard to say exactly whether it covers an area of their expertise, but it's at least plausible to read them as asserting that cryopreserved people are information-theoretically dead, which is not guesswork about future technology and would fall under their area of expertise.
Based on this, I think there's a substantial chance that there's information out there that would convince me that the folks who dismiss cryonics as pseudoscience are essentially correct, that the right answer to the survey question was epsilon. I've seen what seem like convincing objections to cryonics, and it seems possible that an expanded version of those arguments, with full references and replies to pro-cryonics arguments, would convince me. Or someone could just go to the trouble of showing that a large majority of cryobiologists really do think cryopreserved people are information-theoretically dead.
However, it's not clear to me how worthwhile it is for me to seek out such information. It seems coming up with decisive information would be hard, especially since e.g. ciphergoth has put a lot of energy into trying to figure out what the experts think about cryonics and come away without a clear answer. And part of the reason I signed up for cryonics in the first place is that it doesn't cost me much: the largest component is the life insurance for funding, only $50 / month.
So I've decided to put a bounty on being persuaded to cancel my cryonics subscription. If no one succeeds in convincing me, it costs me nothing, and if someone does succeed in convincing me the cost is less than the cost of being signed up for cryonics for a year. And yes, I'm aware that providing one-sided financial incentives like this requires me to take the fact that I've done this into account when evaluating anti-cryonics arguments, and apply extra scrutiny to them.
Note that while there are several issues that ultimately go into whether you should sign up for cryonics (the neuroscience / evaluation of current technology, an estimate of the probability of a "good" future, various philosophical issues), I anticipate the greatest chance of being persuaded by scientific arguments. In particular, I find questions about the personal identity and consciousness of uploads made from preserved brains confusing, but think there are very few people in the world, if any, who are likely to have much chance of getting me un-confused about those issues. The offer is blind to the exact nature of the arguments given, but I mostly foresee being persuaded by the neuroscience arguments.
And of course, I'm happy to listen to people tell me why the anti-cryonics arguments are wrong and I should stay signed up for cryonics. There's just no prize for doing so.
Of all the stimulants I've tried, modafinil is my favorite. There are more powerful substances, e.g. amphetamine or methylphenidate, but modafinil has far fewer negative effects on physical and mental health and is far less addictive. All things considered, the cost-benefit ratio of modafinil is unparalleled.
For those reasons I decided to publish on LessWrong my bachelor thesis on the cognitive effects of modafinil in healthy, non-sleep-deprived individuals. Forgive its shortcomings.
Here are some relevant quotes:
...the main research question of this thesis is whether and to what extent modafinil has positive effects on cognitive performance (operationalized as performance improvements in a variety of cognitive tests) in healthy, non-sleep-deprived individuals.... The abuse liability and adverse effects of modafinil are also discussed. A literature review of all available randomized, placebo-controlled, double-blind studies which examined those effects was therefore conducted.
Overview of effects in healthy individuals:
...Altogether 19 randomized, double-blind, placebo-controlled studies of the effects of modafinil on cognitive functioning in healthy, non-sleep-deprived individuals were reviewed. One of them (Randall et al., 2005b) was a retrospective analysis of 2 other studies (Randall et al., 2002 and 2005a), so 18 independent studies remain.
Out of the 19 studies, 14 found performance improvements in at least one of the administered cognitive tests through modafinil in healthy volunteers.
Modafinil significantly improved performance in 26 out of 102 cognitive tests, but significantly decreased performance in 3 cognitive tests.
...Several studies suggest that modafinil is only effective in subjects with lower IQ or lower baseline performance (Randall et al., 2005b; Müller et al., 2004; Finke et al., 2010). Significant differences between modafinil and placebo also often only emerge in the most difficult conditions of cognitive tests (Müller et al., 2004; Müller et al., 2012; Winder-Rhodes et al., 2010; Marchant et al., 2009).
...A study by Wong et al. (1999) of 32 healthy, male volunteers showed that the most frequently observed adverse effects among modafinil subjects were headache (34%), followed by insomnia, palpitations and anxiety (each occurring in 21% of participants). Adverse events were clearly dose-dependent: 50%, 83%, 100% and 100% of the participants in the 200 mg, 400 mg, 600 mg, and 800 mg dose groups respectively experienced at least one adverse event. According to the authors of this study the maximal safe dosage of modafinil is 600 mg.
...Using a randomized, double-blind, placebo-controlled design, Rush et al. (2002) examined subjective and behavioral effects of cocaine (100, 200 or 300 mg), modafinil (200, 400 or 600 mg) and placebo in cocaine users. ...Of note, while subjects taking cocaine were willing to pay $3 for 100 mg, $6 for 200 mg and $10 for 300 mg of cocaine, participants on modafinil were willing to pay $2, regardless of the dose. These results suggest that modafinil has a low abuse liability, but the rather small sample size (n=9) limits the validity of this study.
The study by Marchant et al. (2009), which is discussed in more detail in part 2.4.12, found that subjects receiving modafinil were significantly less content (p<0.05) than subjects receiving placebo, which indicates a low abuse potential of modafinil. In contrast, in a study by Müller et al. (2012), which is also discussed in more detail above, modafinil significantly increased (p<0.05) ratings of "task-enjoyment", which may suggest a moderate potential for abuse.
...Overall, these results indicate that although modafinil promotes wakefulness, its effects are distinct from those of more typical stimulants like amphetamine and methylphenidate and more similar to the effects of caffeine which suggests a relatively low abuse liability.
In healthy individuals modafinil seems to improve cognitive performance, especially on the Stroop Task, stop-signal and serial reaction time tasks, and tests of visual memory, working memory, spatial planning ability and sustained attention. However, these cognitive-enhancing effects emerged in only a subset of the reviewed studies. Additionally, significant performance increases may be limited to subjects with low baseline performance. Modafinil also appears to have detrimental effects on mental flexibility.
...The abuse liability of modafinil seems to be small, particularly in comparison with other stimulants such as amphetamine and methylphenidate. Headache and insomnia are the most common adverse effects of modafinil.
...Because several studies suggest that modafinil may only provide substantial beneficial effects to individuals with low baseline performance, the big question ultimately remains whether modafinil can really improve the cognitive performance of already high-functioning, healthy individuals. Only in the latter case can modafinil justifiably be called a genuine cognitive enhancer.
You can download the whole thing below. (Just skip the sections on substance-dependent individuals and patients with dementia. My professor wanted them.)
About a year ago, I attended my first CFAR workshop and wrote a post about it here. I mentioned in that post that it was too soon for me to tell if the workshop would have a large positive impact on my life. In the comments to that post, I was asked to follow up on that post in a year to better evaluate that impact. So here we are!
Very short summary: overall I think the workshop had a large and persistent positive impact on my life.
However, anyone using this post to evaluate the value of going to a CFAR workshop themselves should be aware that I'm local to Berkeley and have had many opportunities to stay connected to CFAR and the rationalist community. More specifically, in addition to the January workshop, I also
- visited the March workshop (and possibly others),
- attended various social events held by members of the community,
- taught at the July workshop, and
- taught at SPARC.
These experiences all helped me digest and reinforce the workshop material (which was also improving over time), and a typical workshop participant might not have these advantages.
Answering a question
pewpewlasergun wanted me to answer the following question:
I'd like to know how many techniques you were taught at the meetup you still use regularly. Also which has had the largest effect on your life.
The short answer is: in some sense very few, but a lot of the value I got out of attending the workshop didn't come from specific techniques.
In more detail: to be honest, many of the specific techniques are kind of a chore to use (at least as of January 2013). I experimented with a good number of them in the months after the workshop, and most of them haven't stuck (but that isn't so bad; the cost of trying a technique and finding that it doesn't work for you is low, while the benefit of trying a technique and finding that it does work for you can be quite high!). One that has stuck is the idea of a next action, which I've found incredibly useful. Next actions are the things that to-do list items should be, say in the context of using Remember The Milk. Many to-do list items you might be tempted to write down are difficult to actually do because they're either too vague or too big and hence trigger ugh fields. For example, you might have an item like
- Do my taxes
that you don't get around to until right before you have to because you have an ugh field around doing your taxes. This item is both too vague and too big: instead of writing this down, write down the next physical action you need to take to make progress on this item, which might be something more like
- Find tax forms and put them on desk
which is both concrete and small. Thinking in terms of next actions has been a huge upgrade to my GTD system (as was Workflowy, which I also started using because of the workshop) and I do it constantly.
But as I mentioned, a lot of the value I got out of attending the workshop was not from specific techniques. Much of the value comes from spending time with the workshop instructors and participants, which had effects that I find hard to summarize, but I'll try to describe some of them below:
The workshop readjusted my emotional attitudes towards several things for the better, and at several meta levels. For example, a short conversation with a workshop alum completely readjusted my emotional attitude towards both nutrition and exercise, and I started paying more attention to what I ate and going to the gym (albeit sporadically) for the first time in my life not long afterwards. I lost about 15 pounds this way (mostly from the eating part, not the gym part, I think).
At a higher meta level, I did a fair amount of experimenting with various lifestyle changes (cold showers, not shampooing) after the workshop and overall they had the effect of readjusting my emotional attitude towards change. I find it generally easier to change my behavior than I used to because I've had a lot of practice at it lately, and am more enthusiastic about the prospect of such changes.
(Incidentally, I think emotional attitude adjustment is an underrated component of causing people to change their behavior, at least here on LW.)
Using all of my strength
The workshop is the first place I really understood, on a gut level, that I could use my brain to think about something other than math. It sounds silly when I phrase it like that, but at some point in the past I had incorporated into my identity that I was good at math but absentminded and silly about real-world matters, and I used it as an excuse not to fully engage intellectually with anything that wasn't math, especially anything practical. One way or another the workshop helped me realize this, and I stopped thinking this way.
The result is that I constantly apply optimization power to situations I wouldn't have even tried to apply optimization power to before. For example, today I was trying to figure out why the water in my bathroom sink was draining so slowly. At first I thought it was because the strainer had become clogged with gunk, so I cleaned the strainer, but then I found out that even with the strainer removed the water was still draining slowly. In the past I might've given up here. Instead I looked around for something that would fit farther into the sink than my fingers and saw the handle of my plunger. I pumped the handle into the sink a few times and some extra gunk I hadn't known was there came out. The sink is fine now. (This might seem small to people who are more domestically talented than me, but trust me when I say I wasn't doing stuff like this before last year.)
Reflection and repair
Thanks to the workshop, my GTD system is now robust enough to consistently enable me to reflect on and repair my life (including my GTD system). For example, I'm quicker to attempt to deal with minor medical problems I have than I used to be. I also think more often about what I'm doing and whether I could be doing something better. In this regard I pay a lot of attention in particular to what habits I'm forming, although I don't use the specific techniques in the relevant CFAR unit.
For example, at some point I had recorded in RTM that I was frustrated by the sensation of hours going by without remembering how I had spent them (usually because I was mindlessly browsing the internet). In response, I started keeping a record of what I was doing every half hour and categorizing each hour according to a combination of how productively and how intentionally I spent it (in the first iteration it was just how productively I spent it, but I found that this was making me feel too guilty about relaxing). For example:
- a half-hour intentionally spent reading a paper is marked green.
- a half-hour half-spent writing up solutions to a problem set and half-spent on Facebook is marked yellow.
- a half-hour intentionally spent playing a video game is marked with no color.
- a half-hour mindlessly browsing the internet when I had intended to do work is marked red.
The act of doing this every half hour itself helps make me more mindful about how I spend my time, but having a record of how I spend my time has also helped me notice interesting things, like how less of my time is under my direct control than I had thought (but instead is taken up by classes, commuting, eating, etc.). It's also easier for me to get into a success spiral when I see a lot of green.
Being around workshop instructors and participants is consistently intellectually stimulating. I don't have a tactful way of saying what I'm about to say next, but: two effects of this are that I think more interesting thoughts than I used to and also that I'm funnier than I used to be. (I realize that these are both hard to quantify.)
I worry that I haven't given a complete picture here, but hopefully anything I've left out will be brought up in the comments one way or another. (Edit: this totally happened! Please read Anna Salamon's comment below.)
Takeaway for prospective workshop attendees
I'm not actually sure what you should take away from all this if your goal is to figure out whether you should attend a workshop yourself. My thoughts are roughly this: I think attending a workshop is potentially high-value and therefore that even talking to CFAR about any questions you might have is potentially high-value, in addition to being relatively low-cost. If you think there's even a small chance you could get a lot of value out of attending a workshop I recommend that you at least take that one step.
The trick of saying "yes" instead of "no" is *not* to say "no" less often at the cost of allowing more when you say "yes". That just trades the stress of saying "no" (standing firm despite a clash of wills) against the effort to fulfill, monitor, pay for or clean up after the "yes".
Soft paternalism applied to parenting means saying "Yes, but" or "Yes, later" or "Yes, if". This signals to the child that you understand his/her wish but also supplies some context the child may not be aware of. It reduces your cost of saying "yes" at the expense of making the "yes" costlier for the child to cash in.
Inspired by the recent batch of productivity posts, I wrote my short story down. To a reasonable extent, this story is real.
| Timer | Thoughts and actions |
| --- | --- |
| 2:00 | As usual, I put my breakfast into the microwave and set it to 2:00. |
| 1:58 | It took two seconds for a train of thought to start: |
| | "Waste, again. Again!" |
| | "What am I supposed to do this entire time? Stare at the clock?" |
| | "Useless and boring, but I can't start any serious work or thought in a time-frame this short." |
| | "Why don't I switch to Soylent, hire a maid or just eat the damn thing cold?" |
| 1:53 | It took me five seconds to notice and derail that train on the basis of "Been there, done that, nothing significant changed since." |
| | But this time something went differently. |
| 1:52 | "What do I get to lose if I just try to do something, anything?" |
| | "Food's gonna get cold. Also you can't multitask that much, so no thinking either." |
| | "So what. If I don't make it in time I'll just reheat the food, and at least one other thing will be done already. And didn't I just say I can't think of anything serious that fast anyway?" |
| | It took me 16 seconds to scan today's TODO for anything I had any chance of accomplishing within ninety-something seconds. |
| | "Work. Takes too long to start, hardware limitation." |
| | "Read... what? Would take a while to find something new and short enough. I could prepare for next time, not now." |
| | "A shower. Usually takes me at least 5 minutes... but why?" |
| 1:36 | I rushed to the bathroom. |
| | "Skip everything that can be skipped, but nothing important." |
| | Leaving clothes where I stand. |
| | "No time to fiddle with the faucet. Just turn it on roughly around the point where it's supposed to be." |
| | A bit too cold. So what. |
| | "No time to select soap/shampoo/gel. Just apply top-to-bottom whatever comes up first." |
| | Blargh, hair conditioner. Bad idea. |
| | "Note to self: sort this stuff." |
| | Top-to-bottom, fast moves, keep accurate. |
| | "Come on, come on, come on! I can't believe the microwave hasn't finished yet!" |
| | Head too. Don't skip anything important, remember? One last jet of water and I grab the towel. |
| | "No need to dry the hair so much, you're not going out anytime soon." |
| | I throw the towel back on the hanger and run to the kitchen. Did it beep already and I didn't hear it? Did it break? |
| 0:20 | For 21 seconds of my Saved Time I allowed myself to stare at the clock to make sure time flows at the same rate it used to. |
| | "It took me 54 seconds to take an okay shower. A minute and 40 seconds ago I didn't believe it was possible." |
| | "What else can I do?" |
| | "Now we're talking!" |
| 0:00 | I spent the last 20 seconds building a mental model of what just happened and storing it for later experiments... |
| | ...and then it hit me, again. I made breakfast, took a shower and thought of something new and possibly significant, all within a time-frame so short I didn't believe it possible. I could multitask that much. And I will do better with training. |
Scott, known on LessWrong as Yvain, recently wrote a post complaining about an inaccurate rape statistic.
Arthur Chu, notable for his recent Jeopardy! winning streak, argued against Scott's stance that we should be honest in arguments, in a comment thread on Jeff Kaufman's Facebook profile, which can be read here.
Scott just responded here, with a number of points relevant to the topic of rationalist communities.
I am interested in what LW thinks of this.
Obviously, at some point being polite in our arguments is silly. I'd be interested in people's opinions on how dire the real-world consequences have to be before it's worthwhile debating dishonestly.
A long blog post explains why the author, a feminist, is not comfortable with the rationalist community despite thinking it is "super cool and interesting". It's directed specifically at Yvain, but it's probably general enough to be of some interest here.
I'm not sure if I can summarize this fairly but the main thrust seems to be that we are overly willing to entertain offensive/taboo/hurtful ideas and this drives off many types of people. Here's a quote:
In other words, prizing discourse without limitations (I tried to find a convenient analogy for said limitations and failed. Fenders? Safety belts?) will result in an environment in which people are more comfortable speaking the more social privilege they hold.
The author perceives a link between LW type open discourse and danger to minority groups. I'm not sure whether that's true or not. Take race. Many LWers are willing to entertain ideas about the existence and possible importance of average group differences in psychological traits. So, maybe LWers are racists. But they're racists who continually obsess over optimizing their philanthropic contributions to African charities. So, maybe not racists in a dangerous way?
An overly rosy view, perhaps, and I don't want to deny the reality of the blogger's experience. Clearly, the person is intelligent and attracted to some aspects of LW discourse while turned off by other aspects.
If you want people to ask you stuff, reply to this post with a comment to that effect.
More accurately, ask any participating LessWronger anything that is in the category of questions they indicate they would answer.
If you want to talk about this post you can reply to my comment below that says "Discussion of this post goes here.", or not.
Eliezer and Marcello's article on tiling agents and the Löbian obstacle discusses several things that you would intuitively expect a rational agent to be able to do, but that, because of Löb's theorem, are problematic for an agent using logical reasoning. One of these desiderata is naturalistic trust: Imagine that you build an AI that uses PA for its mathematical reasoning, and this AI happens to find in its environment an automated theorem prover which, the AI carefully establishes, also uses PA for its reasoning. Our AI looks at the theorem prover's display and sees that it flashes a particular lemma that would be very useful for our AI in its own reasoning; the fact that it's on the prover's display means that the prover has just completed a formal proof of this lemma. Can our AI now use the lemma? Well, even if it can establish in its own PA-based reasoning module that there exists a proof of the lemma, by Löb's theorem this doesn't imply in PA that the lemma is in fact true; as Eliezer would put it, our agent treats proofs checked inside the boundaries of its own head differently from proofs checked somewhere in the environment. (The above isn't fully formal, but the formal details can be filled in.)
At MIRI's December workshop (which started today), we've been discussing a suggestion by Nik Weaver for how to handle this problem. Nik starts from a simple suggestion (which he doesn't consider to be entirely sufficient, and his linked paper is mostly about a much more involved proposal that addresses some remaining problems, but the simple idea will suffice for this post): Presumably there's some instrumental reason that our AI proves things; suppose that in particular, the AI will only take an action after it has proven that it is "safe" to take this action (e.g., the action doesn't blow up the planet). Nik suggests to relax this a bit: The AI will only take an action after it has (i) proven in PA that taking the action is safe; OR (ii) proven in PA that it's provable in PA that the action is safe; OR (iii) proven in PA that it's provable in PA that it's provable in PA that the action is safe; etc.
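In symbols (my own notation, not necessarily the paper's): write $\Box\varphi$ for the arithmetized claim "$\varphi$ is provable in PA". Then the relaxed criterion says the AI may take an action $X$ once it has established, for some concrete $n \ge 0$ that it has found,

$$\mathrm{PA} \vdash \Box^n\,\mathrm{safe}(X), \qquad \text{where } \Box^0\varphi = \varphi \text{ and } \Box^{n+1}\varphi = \Box\left(\Box^n\varphi\right).$$

Note that the AI must exhibit a concrete $n$; as we'll see below, merely proving that some such $n$ exists turns out not to be enough.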
Now suppose that our AI sees that lemma, A, flashing on the theorem prover's display, and suppose that our AI can prove that A implies that action X is safe. Then our AI can also prove that it's provable that A -> safe(X), and it can prove that A is provable because it has established that the theorem prover works correctly; thus, it can prove that it's provable that safe(X), and therefore take action X.
Even if the theorem prover has only proved that A is provable, so that the AI only knows that it's provable that A is provable, it can use the same sort of reasoning to prove that it's provable that it's provable that safe(X), and again take action X.
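Spelling out the first case as a sketch (using the standard derivability conditions): the AI has proven A → safe(X) itself, and it has proven Prov(⌜A⌝) from the prover's display plus the prover's verified correctness. Then:

```latex
% From PA |- (A -> safe(X)), formalized necessitation gives:
\mathrm{PA} \vdash \mathrm{Prov}\!\left(\ulcorner A \rightarrow \mathit{safe}(X) \urcorner\right)

% PA proves that provability is closed under modus ponens:
\mathrm{PA} \vdash \mathrm{Prov}\!\left(\ulcorner A \rightarrow \mathit{safe}(X) \urcorner\right)
  \wedge \mathrm{Prov}\!\left(\ulcorner A \urcorner\right)
  \rightarrow \mathrm{Prov}\!\left(\ulcorner \mathit{safe}(X) \urcorner\right)

% Combining these with PA |- Prov("A") yields case (ii) of the criterion:
\mathrm{PA} \vdash \mathrm{Prov}\!\left(\ulcorner \mathit{safe}(X) \urcorner\right)
```

For the second case, the same argument run one provability level up gives PA ⊢ Prov(⌜Prov(⌜safe(X)⌝)⌝), i.e., case (iii).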
But on hearing this, Eliezer and I had the same skeptical reaction: It seems that our AI, in an informal sense, "trusts" that A is true if it finds (i) a proof of A, or (ii) a proof that A is provable, or -- etc. Now suppose that the theorem prover our AI is looking at flashes statements on its display after it has established that they are "trustworthy" in this sense -- if it has found a proof, or a proof that there is a proof, etc. Then when A flashes on the display, our AI can only prove that there exists some n such that it's "provable^n" that A, and that's not enough for it to use the lemma. If the theorem prover flashed n on its screen together with A, everything would be fine and dandy; but if the AI doesn't know n, it's not able to use the theorem prover's work. So it still seems that the AI is unwilling to "trust" another system that reasons just like the AI itself.
I want to try to shed some light on this obstacle by giving an intuition for why the AI's behavior here could, in some sense, be considered to be the right thing to do. Let me tell you a little story.
One day you talk with a bright young mathematician about a mathematical problem that's been bothering you, and she suggests that it's an easy consequence of a theorem in cohistonomical tomolopy. You haven't heard of this theorem before, and find it rather surprising, so you ask for the proof.
"Well," she says, "I've heard it from my thesis advisor."
"Oh," you say, "fair enough. Um--"
"You're sure that your advisor checked it carefully, right?"
"Ah! Yeah, I made quite sure of that. In fact, I established very carefully that my thesis advisor uses exactly the same system of mathematical reasoning that I use myself, and only states theorems after she has checked the proof beyond any doubt, so as a rational agent I am compelled to accept anything as true that she's convinced herself of."
"Oh, I see! Well, fair enough. I'd still like to understand why this theorem is true, though. You wouldn't happen to know your advisor's proof, would you?"
"Ah, as a matter of fact, I do! She's heard it from her thesis advisor."
"Something the matter?"
"Er, have you considered..."
"Oh! I'm glad you asked! In fact, I've been curious myself, and yes, it does happen to be the case that there's an infinitely descending chain of thesis advisors all of which have established the truth of this theorem solely by having heard it from the previous advisor in the chain." (This parable takes place in a world without a big bang -- human history stretches infinitely far into the past.) "But never to worry -- they've all checked very carefully that the previous person in the chain used the same formal system as themselves. Of course, that was obvious by induction -- my advisor wouldn't have accepted it from her advisor without checking his reasoning first, and he would have accepted it from his advisor without checking, etc."
"Uh, doesn't it bother you that nobody has ever, like, actually proven the theorem?"
"Whatever in the world are you talking about? I've proven it myself! In fact, I just told you that infinitely many people have each proved it in slightly different ways -- for example my own proof made use of the fact that my advisor had proven the theorem, whereas her proof used her advisor instead..."
This can't literally happen with a sound proof system, but the reason is that a system like PA can only accept things as true if they have been proven in a system weaker than PA -- i.e., because we have Löb's theorem. Our mathematician's advisor would have to use a weaker system than the mathematician herself, and the advisor's advisor a weaker system still; this sequence would have to terminate after finitely many steps (I don't have a formal proof of this, but I'm fairly sure you can turn the above story into a formal proof that something like this has to be true of sound proof systems), and so someone must actually have proved the theorem on the object level.
So here's my intuition: A satisfactory solution of the problems around the Löbian obstacle will have to make sure that the buck doesn't get passed on indefinitely -- you can accept a theorem because someone reasoning like you has established that someone else reasoning like you has proven the theorem, but there can only be a finite number of links between you and someone who has actually done the object-level proof. We know how to do this by decreasing the mathematical strength of the proof system, and that's not satisfactory, but my intuition is that a satisfactory solution will still have to make sure that there's something that decreases when you go up the chain of thesis advisors, and when that thing reaches zero you've found the thesis advisor that has actually proven the theorem. (I sense ordinals entering the picture.)
...aaaand in fact, I can now tell you one way to do something like this: Nik's idea, which I was talking about above. Remember how our AI "trusts" the theorem prover that flashes the number n which says how many times you have to iterate "that it's provable in PA that", but doesn't "trust" the prover that's exactly the same except it doesn't tell you this number? That's the thing that decreases. If the theorem prover actually establishes A by observing a different theorem prover flashing A and the number 1584, then it can flash A, but only with a number at least 1585. And hence, if you go 1585 thesis advisors up the chain, you find the gal who actually proved A.
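Here's a toy model of the level-tracking idea (entirely my own sketch, not MIRI's formalism): a prover flashes a lemma together with a number n meaning "I have established Prov^n(lemma) in PA", where n = 0 means "I proved the lemma myself". A prover that learns the lemma from another prover's display has only established "it's provable that Prov^n(lemma)", so it must flash n + 1.

```python
class Prover:
    def __init__(self, name):
        self.name = name
        self.display = {}  # lemma -> n, meaning Prov^n(lemma)

    def prove_directly(self, lemma):
        # Object-level proof: flash the lemma at level 0.
        self.display[lemma] = 0

    def learn_from(self, other, lemma):
        # Having verified that `other` reasons soundly in PA, seeing it
        # flash (lemma, n) gives us a PA proof of Prov^(n+1)(lemma).
        self.display[lemma] = other.display[lemma] + 1

chain = [Prover("advisor 0")]
chain[0].prove_directly("A")
for i in range(1, 1586):
    p = Prover(f"advisor {i}")
    p.learn_from(chain[-1], "A")
    chain.append(p)

# The last prover flashes 1585: walk that many links up the chain and
# you find the one who actually proved A on the object level.
print(chain[-1].display["A"])  # -> 1585
```

The point is just that the level is finite and strictly increasing along the chain, so the buck stops after finitely many links.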
The cool thing about Nik's idea is that it doesn't change mathematical strength while going down the chain. In fact, it's not hard to show that if PA proves a sentence A, then it also proves that PA proves A; and the other way, we believe that everything that PA proves is actually true, so if PA proves PA proves A, then it follows that PA proves A.
I can guess what Eliezer's reaction to my argument here might be: The problem I've been describing can only occur in infinitely large worlds, which have all sorts of other problems, like utilities not converging and stuff.
We settled for a large finite TV screen, but we could have had an arbitrarily larger finite TV screen. #infiniteworldproblems
We have Porsches for every natural number, but at every time t we have to trade down the Porsche with number t for a BMW. #infiniteworldproblems
We have ever-rising expectations for our standard of living, but the limit of our expectations doesn't equal our expectation of the limit. #infiniteworldproblems
-- Eliezer, not coincidentally after talking to me
I'm not going to be able to resolve that argument in this post, but briefly: I agree that we probably live in a finite world, and that finite worlds have many properties that make them nice to handle mathematically, but we can formally reason about infinite worlds of the kind I'm talking about here using standard, extremely well-understood mathematics.
Because proof systems like PA (or more conveniently ZFC) allow us to formalize this standard mathematical reasoning, a solution to the Löbian obstacle has to "work" properly in these infinite worlds, or we would be able to turn our story of the thesis advisors' proof that 0=1 into a formal proof of an inconsistency in PA, say. To be concrete, consider the system PA*, which consists of PA + the axiom schema "if PA* proves phi, then phi" for every formula phi; this is easily seen to be inconsistent by Löb's theorem, but if we didn't know that yet, we could translate the story of the thesis advisors (which are using PA* as their proof system this time) into a formal proof of the inconsistency of PA*.
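For concreteness, here's the definition and the one-line Löb argument (a sketch):

```latex
% PA* is PA plus the reflection schema for PA*'s own provability predicate:
\mathrm{PA}^{*} \;=\; \mathrm{PA} \;+\;
  \left\{\, \mathrm{Prov}_{\mathrm{PA}^{*}}\!\left(\ulcorner \varphi \urcorner\right)
  \rightarrow \varphi \;:\; \varphi \text{ a sentence} \,\right\}

% Löb's theorem for PA*: whenever PA* |- Prov_{PA*}("phi") -> phi,
% it follows that PA* |- phi. The reflection schema supplies the
% hypothesis for every phi, so PA* proves every sentence; in particular,
\mathrm{PA}^{*} \vdash 0 = 1
```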
Therefore, thinking intuitively in terms of infinite worlds can give us insight into why many approaches to the Löbian family of problems fail -- as long as we make sure that these infinite worlds, and their properties that we're using in our arguments, really can be formalized in standard mathematics, of course.
So I know we've already seen them buying a bunch of ML and robotics companies, but now they're purchasing Shane Legg's AGI startup. This is after they've acquired Boston Dynamics, several smaller robotics and ML firms, and started their own life-extension firm.
Is it just me, or are they trying to make Accelerando or something closely related actually happen? Given that they're buying up real experts and not just "AI is inevitable" prediction geeks (who shall remain politely unnamed out of respect for their real, original expertise in machine learning), has someone had a polite word with them about not killing all humans by sheer accident?
Over the last year, VincentYu, gwern and others have provided many papers for the LessWrong community (87% success rate in 2012) through previous help desk threads. We originally intended to provide editing, research and general troubleshooting help, but article downloads are by far the most requested service.
If you're doing a LessWrong relevant project we want to help you. If you need help accessing a journal article or academic book chapter, we can get it for you. If you need some research or writing help, we can help there too.
Turnaround time for articles published in the last 20 years or so is usually less than a day. Older articles often take a couple of days.
Please make new article requests in the comment section of this thread.
If you would like to help out with finding papers, please monitor this thread for requests. If you want to monitor via RSS like I do, many RSS readers will give you the comment feed if you give it the URL for this thread (or use this link directly).
If you have some special skills you want to volunteer, mention them in the comment section.
I'm not sure I've ever seen such a compelling "rationality success story". There's so much that's right here.
The part that really grabs me about this is that there's no indication that his success has depended on "natural" skill or talent. And none of the strategies he's using are from novel research. He just studied the "literature" and took the results seriously. He didn't arbitrarily deviate from the known best practice based on aesthetics or intuition. And he kept a simple, single-minded focus on his goal. No lost purposes here --- just win as much money as possible, bank the winnings, and use it to self-insure. It's rationality-as-winning, plain and simple.
For reasons mentioned in So8res' article, as well as for other reasons, studying with a partner can be very good. In November, Adele_L posted an article for people wanting to find a study partner. It got 17 comments, but only 1 since November 16th. So I thought we (I) should make a monthly thread for this instead of constantly going back to an old article that people seem to have forgotten about. If people agree, I will make a post like this every month.
So if you're looking for a study partner for an online course or reading a manual (whether it's in the MIRI course list or not) tell others in the comment section.
I've had a manageable-but-important Problem for a few months now (financial in nature, details neither relevant nor interesting), of moderate complexity and relatively minor importance unless I leave it unsolved just a little longer.
Unfortunately, this seems to be the precise combination of things that triggers one of my ugh fields, which manifests subjectively as a fuzzy blank inability to maintain focus. Several times last week, it occurred to me that I should really Solve The Problem, but I wasn't able to get myself to spend any time thinking about it. Like, at all.
On Saturday, the Problem found itself top of mind once again. How irritating that I couldn't solve the Problem because it was the weekend, and when it wasn't the weekend, maybe Tuesday when work wasn't busy and the Bureau was open, I should really email Dr. Somebody and call Mrs. Administrator for the ...
I had a solution, and a plan. What the what?
My working theory is that when there's no chance of actually Doing Something, this particular ugh field deactivates.
To me, this suggests a strategy (of uncertain generalizability): when an ugh field is preventing thought about something important, find a time when action is impossible and use it to generate a plan.
I would feel better about this advice if it had deeper theoretical backing. Anybody?
[Summary: Trying to use new ideas is more productive than trying to evaluate them.]
I haven't posted to LessWrong in a long time. I have a fan-fiction blog where I post theories about writing and literature. Topics don't overlap at all between the two websites (so far), but I prioritize posting there much higher than posting here, because responses seem more productive there.
The key difference, I think, is that people who read posts on LessWrong ask whether they're "true" or "false", while the writers who read my posts on writing want to write. If I say something that doesn't ring true to one of them, he's likely to say, "I don't think that's quite right; try changing X to Y," or, "When I'm in that situation, I find Z more helpful", or, "That doesn't cover all the cases, but if we expand your idea in this way..."
Whereas on LessWrong a more typical response would be, "Aha, I've found a case for which your step 7 fails! GOTCHA!"
It's always clear from the context of a writing blog why a piece of information might be useful. It often isn't clear how a LessWrong post might be useful. You could blame the author for not providing you with that context. Or, you could be pro-active and provide that context yourself, by thinking as you read a post about how it fits into the bigger framework of questions about rationality, utility, philosophy, ethics, and the future, and thinking about what questions and goals you have that it might be relevant to.
When I was a freshman in high school, I was a mediocre math student: I earned a D in second semester geometry and had to repeat the course. By the time I was a senior in high school, I was one of the strongest few math students in my class of ~600 students at an academic magnet high school. I went on to earn a PhD in math. Most people wouldn't have guessed that I could have improved so much, and the shift that occurred was very surreal to me. It’s all the more striking in that the bulk of the shift occurred in a single year. I thought I’d share what strategies facilitated the change.
The pomodoro technique is, in short, starting a timer and doing 25 minutes of focused work on a single task without interruption, followed by a five minute break. Choose a new task, restart the timer, and repeat.
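As code, the loop is almost embarrassingly simple (a minimal sketch; the task names are made up):

```python
import time

POMODORO_MIN, BREAK_MIN = 25, 5  # the classic 25-on / 5-off cycle

def pomodoro(task: str) -> None:
    print(f"Working on: {task}")
    time.sleep(POMODORO_MIN * 60)  # heads-down, single-task work
    print("Ding! Take a break.")
    time.sleep(BREAK_MIN * 60)

# Pick the next task and just start:
for task in ["draft report", "draft report", "clear inbox"]:
    pomodoro(task)
```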
Throughout 2013 I used pomodoros to execute on pretty much all of my life projects, organized into the following categories:
- work – at MIRI
- bizdev – other income-generating projects
- growth – personal development projects (e.g. reading books, taking notes, making Anki decks; monthly reviews)
- misc – miscellaneous life maintenance projects (e.g. banking stuff, knocking off a bunch of small todo’s, house cleanup)
- health – exercise projects (mostly climbing, some running, some misc other stuff)
The Result: 5,008 Pomodoros
The end result was 2,504 hours of recorded work—5,008 pomodoros in total:
[Chart: Stacked Pomodoros by Week in 2013]
A summary, by category (with hours in brackets):
- work – 2,457 (1,228.5h) – 47.3 (23.7h) avg/week
- bizdev – 700 (350h) – 13.5 (6.7h) avg/week
- growth – 996 (498h) – 19.2 (9.6h) avg/week
- misc – 448 (224h) – 8.6 (4.3h) avg/week
- health – 407 (203.5h) – 7.8 (3.9h) avg/week
Grand Total: 5,008 (2,504h) – 96.3 (48.2h) avg/week
My version of the pomodoro technique
To be clear, I didn’t use the pomodoro technique 100% faithfully. For certain things here, such as most health (exercise) stuff, I never actually ran a pomodoro timer. But since I had a system for tracking where and how I spent my time, and since “claiming” all that time helped motivate me, e.g., to climb regularly, I included them.
Ways I deviate from the “true” pomodoro technique:
- I don’t always take breaks. For example, if I do two pomodoros, get in the zone, and work for another two hours straight, I’d still record that as 6 pomodoros (3 hours) total.
- I don’t always use a timer. Sometimes I just start working, remembering to take small intermittent breaks, and record the total time in pomodoros (4h of work = 8 pomodoros).
- I don’t record interruptions. You’re supposed to track all internal and external interruptions, but I don’t bother with that. I merely try to remain conscious of interruptions and eliminate/avoid them as much as possible.
- I don’t let interruptions cancel out pomodoros. Let’s say I work for fifteen minutes and someone comes in to chat about something important that’s been on their mind. I know that “a pomodoro is indivisible”, but screw it, I chat, and when the conversation ends I count a pomodoro after ten more minutes of work. Pomodoro blasphemy? Maybe.
- I don’t always set targets. I don’t constantly set detailed pomodoro targets and track how many pomodoros were actually required. I only do this occasionally if I think my estimating ability is getting really off. I do set weekly pomodoro targets by category.
How did I track?
Near the end of 2012 I whipped up a simple web app that I use for tracking all of my pomodoros. Here’s a sample screenshot of a week from earlier this year:
Every pomodoro added is given a description, project, major area, and count. This way I can view all pomodoros by project, area, over a given date range, etc. (I’m pretty sure there are other apps out there that let you do basically the same thing, but I haven’t taken much time to explore them.)
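The app isn’t public, but the data model described above is simple enough to sketch (the field and category names follow the post; everything else is my guess):

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date

@dataclass
class Pomodoro:
    day: date
    description: str
    project: str
    area: str    # one of: work, bizdev, growth, misc, health
    count: int   # pomodoros logged (0.5h each)

log = [
    Pomodoro(date(2013, 3, 4), "decision theory writeup", "MIRI", "work", 6),
    Pomodoro(date(2013, 3, 4), "climbing session", "exercise", "health", 3),
]

def totals_by_area(entries):
    # Roll up pomodoro counts per major area.
    totals = defaultdict(int)
    for e in entries:
        totals[e.area] += e.count
    return dict(totals)

print(totals_by_area(log))            # {'work': 6, 'health': 3}
print(sum(e.count for e in log) / 2)  # total hours: 4.5
```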
Why I think it’s worked really well for me
Of all the productivity hacks I’ve tried over the last decade, the pomodoro technique was, for me, hands-down the most effective. My thoughts on why it has worked so well for me:
- It helps you start – start the timer and then just start working. You’ve already decided what to work on, so just start already.
- It helps you focus on one thing at a time – work on only one thing and ignore everything else.
- It helps you prioritize – look at your lists/projects/tasks/whatever, pick the most important thing to work on, and then just start already.
- It helps create success spirals – when you have 5 successful pomodoros under your belt, it’s motivation to keep going.
In summary, if you haven’t yet, I highly recommend giving the pomodoro technique a try.
In physical science the first essential step in the direction of learning any subject is to find principles of numerical reckoning and practicable methods for measuring some quality connected with it. I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be.
-- Lord Kelvin
If you believe that science is about describing things mathematically, you can fall into a strange sort of trap where you come up with some numerical quantity, discover interesting facts about it, use it to analyze real-world situations - but never actually get around to measuring it. I call such things "theoretical quantities" or "fake numbers", as opposed to "measurable quantities" or "true numbers".
An example of a "true number" is mass. We can measure the mass of a person or a car, and we use these values in engineering all the time. An example of a "fake number" is utility. I've never seen a concrete utility value used anywhere, though I always hear about nice mathematical laws that it must obey.
The difference is not just about units of measurement. In economics you can see fake numbers happily coexisting with true numbers using the same units. Price is a true number measured in dollars, and you see concrete values and graphs everywhere. "Consumer surplus" is also measured in dollars, but good luck calculating the consumer surplus of a single cheeseburger, never mind drawing a graph of aggregate consumer surplus for the US! If you ask five economists to calculate it, you'll get five different indirect estimates, and it's not obvious that there's a true number to be measured in the first place.
Another example of a fake number is "complexity" or "maintainability" in software engineering. Sure, people have proposed different methods of measuring it. But if they were measuring a true number, I'd expect them to agree to the 3rd decimal place, which they don't :-) The existence of multiple measuring methods that give the same result is one of the differences between a true number and a fake one. Another sign is what happens when two of these methods disagree: do people say that they're both equally valid, or do they insist that one must be wrong and try to find the error?
It's certainly possible to improve something without measuring it. You can learn to play the piano pretty well without quantifying your progress. But we should probably try harder to find measurable components of "intelligence", "rationality", "productivity" and other such things, because we'd be better at improving them if we had true numbers in our hands.
Less Wrong requires no politics / minimal humor / definitely unambiguously rationality-relevant / careful referencing / airtight reasoning (as opposed to a sketch of something which isn't exactly true but points to the truth.) This makes writing for Less Wrong a chore as opposed to an enjoyable pastime.
But Kaj disagreed that this was the actual standard:
I agree with the "no politics" bit, but I don't think the rest are correct. I've certainly had "sketch of something that isn't quite true but points in the right direction" posts with no references and unclear connections to rationality promoted before (example), as well as ones plastered with unnecessary jokes (example).
This raises two questions: what is the real standard, and what should the standard be?
Because on the one hand, it's not clear Yvain is right, but on the other hand if he is right on the factual question, that standard seems way too high to me. It would suggest that, as John Maxwell says in the same thread, "The overwhelming LW moderation focus seems to be on stifling bad content. There's very little in place to encourage good content."
The wiki sort-of answers the factual question:
These traditionally go in Discussion:
- a link with minimal commentary
- a question or brainstorming opportunity for the Less Wrong community
Beyond that, here are some factors that suggest you should post in Main:
- Your post discusses core Less Wrong topics.
- The material in your post seems especially important or useful.
- You put a lot of thought or effort into your post. (Citing studies, making diagrams, and agonizing over wording are good indicators of this.)
- Your post is long or deals with difficult concepts. (If a post is in Main, readers know that it may take some effort to understand.)
- You've searched the Less Wrong archives, and you're pretty sure that you're saying something new and non-obvious.
But this isn't an entirely unambiguous answer: how many of the five "factors" does a post need to satisfy to go in Main? Furthermore, it often seems that the "real" rules are significantly different from what the wiki says. Yvain's perception may be incorrect, but I think there were reasons why he (and presumably the people who upvoted his comment) had that perception. Also, Eliezer recently explained that:
Whenever a non-meta post stays under 5, I always feel free to move it to Discussion, especially if an upvoted comment has also suggested it. I don't always, but often do.
This makes me wonder what other poorly-publicized rules there are in this vicinity.
As for what the rules should be, I'm going to limit myself to two general suggestions:
- The standard for posting in Main should not be so high that it makes posting at LessWrong feel like a chore, thereby chasing away good contributors like Yvain.
- The standard should not be so high that it would force any significant portion of Eliezer's original sequences off into Discussion.
Finally, whatever standard we settle on, I think it's really important that we make it clearer to people what it is. Aside from the obvious benefits of doing that, I've found that trying to navigate the unclear Main/Discussion distinction is itself often enough to make blogging at LessWrong feel like a chore.
Edited to add: In terms of karma I'm currently the top contributor for the past 30 days on LessWrong by a wide margin. I managed this in spite of the fact that I'm in the middle of doing App Academy and have no time (this past week has been an exception because of vacation). I take this not as evidence of how awesome I am, but as evidence that way too little quality content is being posted in Main.
Ever since Tversky and Kahneman started to gather evidence purporting to show that humans suffer from a large number of cognitive biases, other psychologists and philosophers have criticized these findings. For instance, philosopher L. J. Cohen argued in the 80's that there was something conceptually incoherent with the notion that most adults are irrational (with respect to a certain problem). By some sort of Wittgensteinian logic, he thought that the majority's way of reasoning is by definition right. (Not a high point in the history of analytic philosophy, in my view.) See chapter 8 of this book (where Gigerenzer, below, is also discussed).
Another attempt to resurrect human rationality is due to Gerd Gigerenzer and other psychologists. They have a) shown that if you tweak some of the heuristics-and-biases experiments (i.e., those of the research program led by Tversky and Kahneman) just a little - for instance by expressing probabilities in terms of frequencies - people make far fewer mistakes, and b) argued, on the back of this, that the heuristics we use are in many situations good (and fast and frugal) rules of thumb (which explains why they are evolutionarily adaptive). Regarding this, I don't think that Tversky and Kahneman ever doubted that the heuristics we use are quite useful in many situations. Their point was rather that there are lots of naturally occurring set-ups which fool our fast and frugal heuristics. Gigerenzer's findings are not completely uninteresting - it seems to me he does nuance the thesis of massive irrationality a bit - but his claims to the effect that these heuristics are rational in a strong sense are wildly overblown in my opinion. The Gigerenzer vs. Tversky/Kahneman debates are well discussed in this article (although I think they're too kind to Gigerenzer).
A strong argument against attempts to save human rationality is the argument from individual differences, championed by Keith Stanovich. He argues that the fact that some intelligent subjects consistently avoid falling prey to the Wason selection task, the conjunction fallacy, and other fallacies indicates that there is something wrong with the claim that the answers psychologists have traditionally seen as normatively correct are in fact misguided.
Hence I side with Tversky and Kahneman in this debate. Let me just mention one interesting and possibly successful method for disputing some supposed biases. This method is to argue that people have other kinds of evidence than the standard interpretation assumes, and that given this new interpretation of the evidence, the supposed bias in question is in fact not a bias. For instance, it has been suggested that the "false consensus effect" can be re-interpreted in this way:
The False Consensus Effect
Bias description: People tend to imagine that everyone responds the way they do. They tend to see their own behavior as typical. The tendency to exaggerate how common one’s opinions and behavior are is called the false consensus effect. For example, in one study, subjects were asked to walk around on campus for 30 minutes, wearing a sign board that said "Repent!". Those who agreed to wear the sign estimated that on average 63.5% of their fellow students would also agree, while those who disagreed estimated 23.3% on average.
Counterclaim (Dawes & Mulford, 1996): The correctness of reasoning is not estimated on the basis of whether or not one arrives at the correct result. Instead, we look at whether people reach reasonable conclusions given the data they have. Suppose we ask people to estimate whether an urn contains more blue balls or red balls, after allowing them to draw one ball. If one person first draws a red ball, and another person draws a blue ball, then we should expect them to give different estimates. In the absence of other data, you should treat your own preferences as evidence for the preferences of others. Although the actual mean for people willing to carry a sign saying "Repent!" probably lies somewhere in between the estimates given, these estimates are quite close to the one-third and two-thirds estimates that would arise from a Bayesian analysis with a uniform prior distribution of belief. A study by the authors suggested that people do actually give their own opinion roughly the right amount of weight.
(The quote is from an excellent Less Wrong article on this topic due to Kaj Sotala. See also this post by him, this by Andy McKenzie, this by Stuart Armstrong and this by lukeprog on this topic. I'm sure there are more that I've missed.)
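To spell out the Bayesian analysis mentioned in the counterclaim (a sketch via Laplace's rule of succession): let p be the fraction of students who would agree to wear the sign, give it a uniform prior, and treat your own response as a single draw.

```latex
p \sim \mathrm{Beta}(1,1) \quad (\text{uniform prior})

% After observing your own "agree" (one success in one draw):
p \mid \text{agree} \sim \mathrm{Beta}(2,1), \qquad
\mathbb{E}\!\left[p \mid \text{agree}\right] = \tfrac{2}{3} \approx 66.7\%

% After observing your own "disagree":
p \mid \text{disagree} \sim \mathrm{Beta}(1,2), \qquad
\mathbb{E}\!\left[p \mid \text{disagree}\right] = \tfrac{1}{3} \approx 33.3\%
```

The observed 63.5% and 23.3% are close to these posterior means, which is the sense in which subjects gave their own response roughly the right amount of weight.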
It strikes me that the notion that people are "massively flawed" is something of an intellectual cornerstone of the Less Wrong community (e.g. note the names "Less Wrong" and "Overcoming Bias"). In the light of this it would be interesting to hear what people have to say about the rationality wars. Do you all agree that people are massively flawed?
Let me make two final notes to keep in mind when discussing these issues. Firstly, even though the heuristics and biases program is sometimes seen as pessimistic, one could turn the tables around: if they're right, we should be able to improve massively (even though Kahneman himself seems to think that that's hard to do in practice). I take it that CFAR and lots of LessWrongers who attempt to "refine their rationality" assume that this is the case. On the other hand, if Gigerenzer or Cohen are right, and we already are very rational, then it would seem that it is hard to do much better. So in a sense the latter are more pessimistic (and conservative) than the former.
Secondly, note that parts of the rationality wars seem to be merely verbal and revolve around how "rationality" is to be defined (tabooing this word is very often a good idea). The real question is not if the fast and frugal heuristics are in some sense rational, but whether there are other mental algorithms which are more reliable and effective, and whether it is plausible to assume that we could learn to use them on a large scale instead.
Reply to: Benja2010's Self-modification is the correct justification for updateless decision theory; Wei Dai's Late great filter is not bad news
"P-zombie" is short for "philosophical zombie", but here I'm going to re-interpret it as standing for "physical philosophical zombie", and contrast it to what I call an "l-zombie", for "logical philosophical zombie".
A p-zombie is an ordinary human body with an ordinary human brain that does all the usual things that human brains do, such as the things that cause us to move our mouths and say "I think, therefore I am", but that isn't conscious. (The usual consensus on LW is that p-zombies can't exist, but some philosophers disagree.) The notion of p-zombie accepts that human behavior is produced by physical, computable processes, but imagines that these physical processes don't produce conscious experience without some additional epiphenomenal factor.
An l-zombie is a human being that could have existed, but doesn't: a Turing machine which, if anybody ever ran it, would compute that human's thought processes (and its interactions with a simulated environment); that would, if anybody ever ran it, compute the human saying "I think, therefore I am"; but that never gets run, and therefore isn't conscious. (If it's conscious anyway, it's not an l-zombie by this definition.) The notion of l-zombie accepts that human behavior is produced by computable processes, but supposes that these computational processes don't produce conscious experience without being physically instantiated.
Actually, there probably aren't any l-zombies: The way the evidence is pointing, it seems like we probably live in a spatially infinite universe where every physically possible human brain is instantiated somewhere, although some are instantiated less frequently than others; and if that's not true, there are the "bubble universes" arising from cosmological inflation, the branches of many-worlds quantum mechanics, and Tegmark's "level IV" multiverse of all mathematical structures, all suggesting again that all possible human brains are in fact instantiated. But (a) I don't think that even with all that evidence, we can be overwhelmingly certain that all brains are instantiated; and, more importantly actually, (b) I think that thinking about l-zombies can yield some useful insights into how to think about worlds where all humans exist, but some of them have more measure ("magical reality fluid") than others.
So I ask: Suppose that we do indeed live in a world with l-zombies, where only some of all mathematically possible humans exist physically, and only those that do have conscious experiences. How should someone living in such a world reason about their experiences, and how should they make decisions — keeping in mind that if they were an l-zombie, they would still say "I have conscious experiences, so clearly I can't be an l-zombie"?
If we can't update on our experiences to conclude that someone having these experiences must exist in the physical world, then we must of course conclude that we are almost certainly l-zombies: After all, if the physical universe isn't combinatorially large, the vast majority of mathematically possible conscious human experiences are not instantiated. You might argue that the universe you live in seems to run on relatively simple physical rules, so it should have high prior probability; but we haven't really figured out the exact rules of our universe, and although what we understand seems compatible with the hypothesis that there are simple underlying rules, that's not really proof that there are such underlying rules, if "the real universe has simple rules, but we are l-zombies living in some random simulation with a hodgepodge of rules (that isn't actually run)" has the same prior probability; and worse, if you don't have all we do know about these rules loaded into your brain right now, you can't really verify that they make sense, since there is some mathematically possible simulation whose initial state has you remember seeing evidence that such simple rules exist, even if they don't; and much worse still, even if there are such simple rules, what evidence do you have that if these rules were actually executed, they would produce you? Only the fact that you, like, exist, but we're asking what happens if we don't let you update on that.
I find myself quite unwilling to accept this conclusion that I shouldn't update, in the world we're talking about. I mean, I actually have conscious experiences. I, like, feel them and stuff! Yes, true, my slightly altered alter ego would reason the same way, and it would be wrong; but I'm right...
...and that actually seems to offer a way out of the conundrum: Suppose that I decide to update on my experience. Then so will my alter ego, the l-zombie. This leads to a lot of l-zombies concluding "I think, therefore I am", and being wrong, and a lot of actual people concluding "I think, therefore I am", and being right. All the thoughts that are actually consciously experienced are, in fact, correct. This doesn't seem like such a terrible outcome. Therefore, I'm willing to provisionally endorse the reasoning "I think, therefore I am", and to endorse updating on the fact that I have conscious experiences to draw inferences about physical reality — taking into account the simulation argument, of course, and conditioning on living in a small universe, which is all I'm discussing in this post.
NB. There's still something quite uncomfortable about the idea that all of my behavior, including the fact that I say "I think therefore I am", is explained by the mathematical process, but actually being conscious requires some extra magical reality fluid. So I still feel confused, and using the word l-zombie in analogy to p-zombie is a way of highlighting that. But this line of reasoning still feels like progress. FWIW.
But if that's how we justify believing that we physically exist, that has some implications for how we should decide what to do. The argument is that nothing very bad happens if the l-zombies wrongly conclude that they actually exist. Mostly, that also seems to be true if they act on that belief: mostly, what l-zombies do doesn't seem to influence what happens in the real world, so if only things that actually happen are morally important, it doesn't seem to matter what the l-zombies decide to do. But there are exceptions.
Consider the counterfactual mugging: Accurate and trustworthy Omega appears to you and explains that it just has thrown a very biased coin that had only a 1/1000 chance of landing heads. As it turns out, this coin has in fact landed heads, and now Omega is offering you a choice: It can either (A) create a Friendly AI or (B) destroy humanity. Which would you like? There is a catch, though: Before it threw the coin, Omega made a prediction about what you would do if the coin fell heads (and it was able to make a confident prediction about what you would choose). If the coin had fallen tails, it would have created an FAI if it has predicted that you'd choose (B), and it would have destroyed humanity if it has predicted that you would choose (A). (If it hadn't been able to make a confident prediction about what you would choose, it would just have destroyed humanity outright.)
There is a clear argument that, if you expect to find yourself in a situation like this in the future, you would want to self-modify into somebody who would choose (B), since this gives humanity a much larger chance of survival. Thus, a decision theory stable under self-modification would answer (B). But if you update on the fact that you consciously experience Omega telling you that the coin landed heads, (A) would seem to be the better choice!
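The numbers behind the self-modification argument, as a quick sketch: evaluated before the coin flip, a disposition to choose (B) saves humanity on tails, while a disposition to choose (A) saves it only on heads.

```latex
P(\text{FAI} \mid \text{disposed to choose B}) \;=\; P(\text{tails}) \;=\; \tfrac{999}{1000}

P(\text{FAI} \mid \text{disposed to choose A}) \;=\; P(\text{heads}) \;=\; \tfrac{1}{1000}
```

After updating on actually seeing heads, though, (A) yields the FAI with certainty; that clash is exactly the puzzle.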
One way of looking at this is that if the coin falls tails, the l-zombie that is told the coin landed heads still exists mathematically, and this l-zombie now has the power to influence what happens in the real world. If the argument for updating was that nothing bad happens even though the l-zombies get it wrong, well, that argument breaks here. The mathematical process that is your mind doesn't have any evidence about whether the coin landed heads or tails, because as a mathematical object it exists in both possible worlds, and it has to make a decision in both worlds, and that decision affects humanity's future in both worlds.
Back in 2010, I wrote a post arguing that yes, you would want to self-modify into something that would choose (B), but that that was the only reason why you'd want to choose (B). Here's a variation on the above scenario that illustrates the point I was trying to make back then: Suppose that Omega tells you that it actually threw its coin a million years ago, and if it had fallen tails, it would have turned Alpha Centauri purple. Now throughout your history, the argument goes, you would never have had any motive to self-modify into something that chooses (B) in this particular scenario, because you've always known that Alpha Centauri isn't, in fact, purple.
But this argument assumes that you know you're not an l-zombie; if the coin had in fact fallen tails, you wouldn't exist as a conscious being, but you'd still exist as a mathematical decision-making process, and that process would be able to influence the real world, so you-the-decision-process can't reason that "I think, therefore I am, therefore the coin must have fallen heads, therefore I should choose (A)." Partly because of this, I now accept choosing (B) as the (most likely to be) correct choice even in that case. (The rest of my change in opinion has to do with the fact that every way of making my earlier intuition formal gets into trouble in decision problems where you can influence whether you're brought into existence, but that's a topic for another post.)
However, should you feel cheerful while you're announcing your choice of (B), since with high (prior) probability, you've just saved humanity? That would lead to an actual conscious being feeling cheerful if the coin has landed heads and humanity is going to be destroyed, and an l-zombie computing, but not actually experiencing, cheerfulness if the coin has landed heads and humanity is going to be saved. Nothing good comes out of feeling cheerful, not even alignment of a conscious being's map with the physical territory. So I think the correct thing is to choose (B), and to be deeply sad about it.
You may be asking why I should care what the right probabilities to assign or the right feelings to have are, since these don't seem to play any role in making decisions; sometimes you make your decisions as if updating on your conscious experience, but sometimes you don't, and you always get the right answer if you don't update in the first place. Indeed, I expect that the "correct" design for an AI is to fundamentally use (more precisely: approximate) updateless decision theory (though I also expect that probabilities updated on the AI's sensory input will be useful for many intermediate computations), and "I compute, therefore I am"-style reasoning will play no fundamental role in the AI. And I think the same is true for humans' decisions — the correct way to act is given by updateless reasoning. But as a human, I find myself unsatisfied by not being able to have a picture of what the physical world probably looks like. I may not need one to figure out how I should act; I still want one, not for instrumental reasons, but because I want one. In a small universe where most mathematically possible humans are l-zombies, the argument in this post seems to give me a justification to say "I think, therefore I am, therefore probably I either live in a simulation or what I've learned about the laws of physics describes how the real world works (even though there are many l-zombies who are thinking similar thoughts but are wrong about them)."
And because of this, even though I disagree with my 2010 post, I also still disagree with Wei Dai's 2010 post arguing that a late Great Filter is good news, which my own 2010 post was trying to argue against. Wei argued that if Omega gave you a choice between (A) destroying the world now and (B) having Omega destroy the world a million years ago (so that you are never instantiated as a conscious being, though your choice as an l-zombie still influences the real world), then you would choose (A), to give humanity at least the time it's had so far. Wei concluded that this means that if you learned that the Great Filter is in our future, rather than our past, that must be good news, since if you could choose where to place the filter, you should place it in the future. I now agree with Wei that (A) is the right choice, but I don't think that you should be happy about it. And similarly, I don't think you should be happy about news that tells you that the Great Filter is later than you might have expected.
In a discussion a couple months ago, Luke said, "I think it's hard to tell whether donations do more good at MIRI, FHI, CEA, or CFAR." So I want to have a thread to discuss that.
My own very rudimentary thoughts: I think the research MIRI does is probably valuable, but I don't think it's likely to lead to MIRI itself building FAI. I'm convinced AGI is much more likely to be built by a government or major corporation, which makes me more inclined to think movement-building activities, which MIRI isn't doing, are likely to be valuable for increasing the odds that the people at that government or corporation are conscious of AI safety issues.
It seems like FHI is the obvious organization to donate to for that purpose, but Luke seems to think CEA (the Centre for Effective Altruism) and CFAR could also be good for that, and I'm not entirely clear on why. I sometimes get the impression that some of CFAR's work ends up being covert movement-building for AI-risk issues, but I'm not sure to what extent that's true. I know very little about CEA, and a brief check of their website leaves me a little unclear on why Luke recommends them, aside from the fact that they apparently work closely with FHI.
This has some immediate real-world relevance to me: I'm currently in the middle of a coding bootcamp and not making any money, but today my mom offered to make a donation to a charity of my choice for Christmas. So any input on what to tell her would be greatly appreciated, as would more information on CFAR and CEA, which I'm sorely lacking in.
In late December 2013, Jonah, my collaborator at Cognito Mentoring, announced the service on LessWrong. Information about the service was also circulated in other venues with high concentrations of gifted and intellectually curious people. Since then, we've received ~70 emails asking for mentoring from learners across all ages, plus a few parents. At least 40 of our advisees heard of us through LessWrong, and the number is probably around 50. Of the 23 who responded to our advisee satisfaction survey, 16 filled in information on where they'd heard of us, and 14 of those 16 had heard of us from LessWrong. The vast majority of student advisees with whom we had substantive interactions, and the ones we felt we were able to help the most, came from LessWrong (we got some parents through the Davidson Forum post, but that's a very different sort of advising).
In this post, I discuss some common themes that emerged from our interaction with these advisees. Obviously, this isn't a comprehensive picture of the LessWrong community the way that Yvain's 2013 survey results were.
- A significant fraction of the people who contacted us via LessWrong aren't active LessWrong participants, and many don't even have user accounts on LessWrong. The prototypical advisees we got through LessWrong don't have many distinctive LessWrongian beliefs. Many of them use LessWrong primarily as a source of interesting stuff to read, rather than a community to be part of.
- About 25% of the advisees we got through LessWrong were female, and a slightly higher proportion of the advisees with whom we had substantive interaction (and subjectively feel we helped a lot) were female. You can see this by looking at the sex distribution of the public reviews of us from students.
- Our advisees included people in high school (typically, grades 11 and 12) and college. Our advisees in high school tended to be interested in mathematics, computer science, physics, engineering, and entrepreneurship. We did have a few who were interested in economics, philosophy, and the social sciences as well, but this was rarer. Our advisees in college and graduate school were also interested in the above subjects but skewed a bit more in the direction of being interested in philosophy, psychology, and economics.
- Somewhat surprisingly and endearingly, many of our advisees were interested in effective altruism and social impact. Some had already heard of the cluster of effective altruist ideas. Others were interested in generating social impact through entrepreneurship or choosing an impactful career, even though they weren't familiar with effective altruism until we pointed them to it. Of those who had heard of effective altruism as a cluster of ideas, some had either already consulted with or were planning to consult with 80,000 Hours, and were connecting with us largely to get a second opinion or to get advice on matters other than career choice.
- Some of our advisees had had some sort of past involvement with MIRI/CFAR/FHI. Some were seriously considering working in existential risk reduction or on artificial intelligence. The two subsets overlapped considerably.
- Our advisees were somewhat better educated about rationality issues than we'd expect others of similar academic accomplishment to be, and more than the advisees we got from sources other than LessWrong. That's obviously not a surprise at all.
- We hadn't been expecting it, but many advisees asked us questions related to procrastination, social skills, and other life skills. We were initially somewhat ill-equipped to handle these, but we've built a base of recommendations, with some help from LessWrong and other sources.
- One thing that surprised me personally is that many of these people had never spent time exploring Quora. I'd have expected Quora to be much more widely known and used by the sort of people who were sufficiently aware of the Internet to know LessWrong. But it's possible there's not that much overlap.
My overall takeaway is that LessWrong seems to still be one of the foremost places that smart and curious young people interested in epistemic rationality visit. I'm not sure of the exact reason, though HPMOR probably gets a significant fraction of the credit. As long as things stay this way, LessWrong remains a great way to influence a subset of the young population today that's likely to be disproportionately represented among the decision-makers a few years down the line.
It's not clear to me why they don't participate more actively on LessWrong. Maybe no special reasons are needed: the ratio of lurkers to posters is huge for most Internet fora. Maybe the people who contacted us were relatively young and still didn't have an Internet presence, or were being careful about building one. On the other hand, maybe there is something about the comments culture that dissuades people from participating (this need not be a bad feature per se: one reason people may refrain from participating is that comments are held to a high bar and this keeps people from offering off-the-cuff comments). That said, if people could somehow participate more, LessWrong could transform itself into an interactive forum for smart and curious people that's head and shoulders above all the others.
PS: We've now made our information wiki publicly accessible. It's still in beta and a lot of content is incomplete and there are links to as-yet-uncreated pages all over the place. But we think it might still be interesting to the LessWrong audience.
If you had an interesting Less Wrong meetup recently, but don't have the time to write up a big report to post to Discussion, feel free to write a comment here. Even if it's just a couple lines about what you did and how people felt about it, it might encourage some people to attend meetups or start meetups in their area.
If you have the time, you can also describe what types of exercises you did, what worked and what didn't. This could help inspire meetups to try new things and improve themselves in various ways.
If you're inspired by what's posted below and want to organize a meetup, check out this page for some resources to get started! You can also check FrankAdamek's weekly post on meetups for the week.
Tell us about your meetup!
I ran across this bit of pop-sci (a review of Jeremy Dean's Making Habits, Breaking Habits), which claims that habits typically take around 66 days to form, not the 21 days that self-help articles tend to cite. The somewhat surprising thing to me, on reflection, was how readily I'd taken the 21-day statistic as fact. From the article:
When he became interested in how long it takes for us to form or change a habit, psychologist Jeremy Dean found himself bombarded with the same magic answer from popular psychology websites and advice columns: 21 days. And yet, strangely — or perhaps predictably, for the internet — this one-size-fits-all number was being applied to everything from starting a running regimen to keeping a diary, but wasn’t backed by any concrete data.
The original article is here. Abstract:
To investigate the process of habit formation in everyday life, 96 volunteers chose an eating, drinking or activity behaviour to carry out daily in the same context (for example ‘after breakfast’) for 12 weeks. They completed the self-report habit index (SRHI) each day and recorded whether they carried out the behaviour. The majority (82) of participants provided sufficient data for analysis, and increases in automaticity (calculated with a sub-set of SRHI items) were examined over the study period. Nonlinear regressions fitted an asymptotic curve to each individual's automaticity scores over the 84 days. The model fitted for 62 individuals, of whom 39 showed a good fit. Performing the behaviour more consistently was associated with better model fit. The time it took participants to reach 95% of their asymptote of automaticity ranged from 18 to 254 days; indicating considerable variation in how long it takes people to reach their limit of automaticity and highlighting that it can take a very long time. Missing one opportunity to perform the behaviour did not materially affect the habit formation process. With repetition of a behaviour in a consistent context, automaticity increases following an asymptotic curve which can be modelled at the individual level. [My emphasis.]
- There is an observed “automaticity plateau.” Can individuals influence the height of the plateau through interventions such as rewards? Would this change the exponential rate constant? Or do we have less control over these things than we think?
- 95% of maximum automaticity doesn't quite seem like the right metric to use to describe habit formation, especially if the maximum is on the low side.
- Presumably you'd need familiarity with the SRHI survey to answer this, but it's not clear to me what an automaticity score of 40 really means. (Examples or a baseline might help: what's my automaticity for toothbrushing? checking email?)
- N=96 seems small. It seems slightly problematic that the 14 participants who dropped out were not included in the analysis, and rather problematic that they used a 3-parameter model and only got a ‘good fit’ for half of the participants. (I'm not an expert in this, so I'd appreciate knowing if my intuitions here are right.)
- It seems that changing habits is harder than I'd previously thought, at least in the absence of CFAR-like techniques. (Which, as far as I know, we still don't know whether they work. I'm looking forward to their research.)
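For concreteness, here is a sketch of the kind of per-individual fit the abstract describes; the exponential form and all numbers below are my assumptions for illustration, not the paper's data:

```python
import numpy as np
from scipy.optimize import curve_fit

def automaticity(t, asymptote, gain, rate):
    # Asymptotic curve: starts at (asymptote - gain), rises toward asymptote.
    return asymptote - gain * np.exp(-rate * t)

rng = np.random.default_rng(0)
days = np.arange(84)                                 # the 12-week study window
scores = automaticity(days, 40.0, 25.0, 0.05)        # fake "true" curve
scores = scores + rng.normal(0.0, 1.5, days.shape)   # daily self-report noise

(asym, gain, rate), _ = curve_fit(automaticity, days, scores, p0=(40, 25, 0.1))
days_to_95 = np.log(20) / rate  # e^(-rt) = 0.05  =>  t = ln(20)/r
print(f"asymptote ~{asym:.1f}, about {days_to_95:.0f} days to 95% automaticity")
```

With a rate constant around 0.05/day, this gives roughly 60 days to the 95% mark, which is the ballpark of the 66-day figure.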
A brief essay intended for high school students: any thoughts?
If you go to school, take the classes that people tell you to, do your homework, and engage in the extracurricular activities that your peers do, you'll be setting yourself up for an "okay" life. But you can do better than that.
This post is to raise a question about the demographics of rationality: Is rationality something that can appeal to low-IQ people as well?
I don't mean in theory, I mean in practice. From what I've seen, people who are concerned about rationality (in the sense that it has on LW, OvercomingBias, etc.) are overwhelmingly high-IQ.
Meanwhile, HPMOR and other stories in the "rationality genre" appeal to me, and to other people I know. However I wonder: Perhaps part of the reason they appeal to me is that I think of myself as a smart person, and this allows me to identify with the main characters, cheer when they think their way to victory, etc. If I thought of myself as a stupid person, then perhaps I would feel uncomfortable, insecure, and alienated while reading the same stories.
So, I have four questions:
1.) Do we have reason to believe that the kind of rationality promoted on LW, OvercomingBias, CFAR, etc. appeals to a fairly normal distribution of people around the IQ mean? Or should we think, as I suggested, that people with lower IQs are disposed to find the idea of being rational less attractive?
2.) Ditto, except replace "being rational" with "celebrating rationality through stories like HPMOR." Perhaps people think that rationality is a good thing in much the same way that being wealthy is a good thing, but they don't think that it should be celebrated, or at least they don't find such celebrations appealing.
3.) Supposing #1 and #2 have the answers I am suggesting, why?
4.) Making the same supposition, what are the implications for the movement in general?
Note: I chose to use IQ in this post instead of a more vague term like "intelligence," but I could easily have done the opposite. I'm happy to do whichever version is less problematic.
Looking at the discussion section recently, it seems like over half of the posts are meetups. I think it's really great that so many LessWrongers are able to get together and do interesting stuff. Looking at a lot of the topics, I often find myself thinking "I wonder what they ended up talking about." I looked at the meetups page and it looks like many give a description of the topic, but there is rarely any public followup. I also did a search which turned up surprisingly few post-meetup posts.
For example, this Los Angeles meetup from a few days ago about resolutions looked really interesting to me, and I'm curious to hear what kinds of strategies were proposed and whether any insights or anecdotes came up that would be worth sharing with those of us who couldn't attend.
I remember reading a meetup report back in November that told the story of the exercises they went through, and it seemed to spark some good discussion. It even prompted me to make a note to try some things on my own. That one was atypical in that it was very detailed and was a crosspost from a personal blog, but I feel like even short reports would give the rest of the community a chance to chime in with praise, suggestions, and feedback.
When I tried to think of reasons not to share what happened in meetups, I came up with a few potential factors:
- It's extra work
- Keeping it private increases the feeling of community within the group
- Meetups are supposed to be a safe place where your actions or comments won't be broadcast to the world
- Nothing really post-worthy happened
On the other hand, sharing seems to have clear benefits:
- It would allow LessWrongers who weren't in attendance to get involved in the discussion
- Insights would be shared with the whole community
- Meetup organizers and attendees could get suggestions for ways to improve future meetups
- Non-attendees could use these ideas to host their own meetups
- Summarizing key points of a discussion is helpful for those involved to retain the information they discussed
Given LW’s keen interest in bias, it would seem pertinent to be aware of the biases engendered by the karma system. Note: I used to be strictly opposed to comment scoring mechanisms, but witnessing the general effectiveness with which LWers use karma has largely redeemed the system for me.
In “Social Influence Bias: A Randomized Experiment” by Muchnik et al., random comments on a “social news aggregation Web site” were up-voted after being posted. The likelihood of such rigged comments receiving additional up-votes was quantified in comparison to a control group. The results show that users were significantly biased towards the randomly up-voted posts:
The up-vote treatment significantly increased the probability of up-voting by the first viewer by 32% over the control group ... Up-treated comments were not down-voted significantly more or less frequently than the control group, so users did not tend to correct the upward manipulation. In the absence of a correction, positive herding accumulated over time.
At the end of their five-month testing period, the comments that had artificially received an up-vote had an average rating 25% higher than those of the control group. Interestingly, the severity of the bias depended heavily on the topic of discussion:
We found significant positive herding effects for comment ratings in “politics,” “culture and society,” and “business,” but no detectable herding behavior for comments in “economics,” “IT,” “fun,” and “general news”.
The herding behavior outlined in the paper seems rather intuitive to me. If before I read a post, I see a little green ‘1’ next to it, I’m probably going to read the post in a better light than if I hadn't seen that little green ‘1’ next to it. Similarly, if I see a post that has a negative score, I’ll probably see flaws in it much more readily. One might say that this is the point of the rating system, as it allows the group as a whole to evaluate the content. However, I’m still unsettled by just how easily popular opinion was swayed in the experiment.
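To get a feel for how a single early up-vote can snowball, here is a toy Monte Carlo sketch. Only the 32% first-vote boost comes from the paper; the viewer count, the base up-vote rate, and the assumption that the boost applies whenever the visible score is positive are all invented for illustration.

```python
import random

def final_score(treated, viewers=200, p_up=0.05, herd_boost=0.32):
    # The only difference between conditions: one artificial up-vote at posting.
    score = 1 if treated else 0
    for _ in range(viewers):
        # Assume each viewer is 32% likelier to up-vote a positively scored comment.
        p = p_up * (1 + herd_boost) if score > 0 else p_up
        if random.random() < p:
            score += 1
    return score

random.seed(0)
n = 10_000
treated = sum(final_score(True) for _ in range(n)) / n
control = sum(final_score(False) for _ in range(n)) / n
print(f"treated mean: {treated:.2f}, control mean: {control:.2f}")
```

Even in this crude model the treated comments end up with a persistently higher average, since the initial up-vote shifts every subsequent viewer's behavior and nothing pushes the score back down.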
This certainly doesn't necessitate that we reprogram the site and eschew the karma system. Rather, understanding the biases inherent in such a system will allow us to use it much more effectively. Discussion of how this bias affects LW in particular would be welcome. Here are some questions to begin with:
- Should we worry about this bias at all? Are its effects negligible in the scheme of things?
- How does the culture of LW contribute to this herding behavior? Is it positive or negative?
- If there are damages, how can we mitigate them?
In the paper, they mention that comments were not sorted by popularity, thereby “mitigating the selection bias.” This of course implies that the bias would be more severe on forums where comments are sorted by popularity, such as this one.
For those interested, another enlightening paper is “Overcoming the J-shaped distribution of product reviews” by Nan Hu et al., which discusses rating biases on websites such as Amazon. User gwern has also recommended a longer 2007 paper by the same authors, on which the one above is based: “Why do Online Product Reviews have a J-shaped Distribution? Overcoming Biases in Online Word-of-Mouth Communication”
In my previous post, I introduced the idea of an "l-zombie", or logical philosophical zombie: A Turing machine that would simulate a conscious human being if it were run, but that is never run in the real, physical world, so that the experiences that this human would have had, if the Turing machine were run, aren't actually consciously experienced.
One common reply is to deny the possibility of logical philosophical zombies, just as one might deny the possibility of physical philosophical zombies: to say that every mathematically possible conscious experience is in fact consciously experienced, and that there is no kind of "magical reality fluid" that makes some of these be experienced "more" than others. In other words, we live in the Tegmark Level IV universe, except that, contrary to what Tegmark argues in his paper, there is no objective measure on the collection of all mathematical structures according to which some mathematical structures somehow "exist more" than others (and according to which, though IIRC this isn't part of Tegmark's argument, the conscious experiences in some structures could be "experienced more" than those in other structures). All mathematically possible experiences are experienced, and to the same "degree".
So why is our world so orderly? There's a mathematically possible continuation of the world that you seem to be living in, where purple pumpkins are about to start falling from the sky. Or the light we observe coming in from outside our galaxy is suddenly replaced by white noise. Why don't you remember ever seeing anything as obviously disorderly as that?
And the answer to that, of course, is that among all the possible experiences that get experienced in this multiverse, there are orderly ones as well as non-orderly ones, so the fact that you happen to have orderly experiences isn't in conflict with the hypothesis; after all, the orderly experiences have to be experienced as well.
One might be tempted to argue that it's somehow more likely that you will observe an orderly world if everybody who has conscious experiences at all, or if at least most conscious observers, see an orderly world. (The "most observers" version of the argument assumes that there is a measure on the conscious observers, a.k.a. some kind of magical reality fluid.) But this requires the use of anthropic probabilities, and there is simply no (known) system of anthropic probabilities that gives reasonable answers in general. Fortunately, we have an alternative: Wei Dai's updateless decision theory (which was motivated in part exactly by the problem of how to act in this kind of multiverse). The basic idea is simple (though the details do contain devils): We have a prior over what the world looks like; we have some preferences about what we would like the world to look like; and we come up with a plan for what we should do in any circumstance we might find ourselves in that maximizes our expected utility, given our prior.
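The shape of that calculation can be made concrete in a few lines. What follows is my own toy gloss, not Wei Dai's formalism; the worlds, prior, observations, and payoffs are all invented for illustration. The key feature is that UDT scores entire policies (maps from observation to action) against the prior, rather than updating on an observation first.

```python
from itertools import product

# Made-up prior over worlds, and the observation each world produces.
worlds = {"simple": 0.9, "pumpkin": 0.1}
observation = {"simple": "orderly", "pumpkin": "pumpkins falling"}
actions = ["act as if simple", "act as if pumpkin"]

def utility(world, action):
    # Made-up payoffs: acting on the true state of the world pays off.
    payoffs = {("simple", "act as if simple"): 10,
               ("simple", "act as if pumpkin"): 0,
               ("pumpkin", "act as if simple"): 0,
               ("pumpkin", "act as if pumpkin"): 1}
    return payoffs[(world, action)]

# A policy fixes an action for every possible observation.
obs_list = sorted(set(observation.values()))
policies = [dict(zip(obs_list, choice))
            for choice in product(actions, repeat=len(obs_list))]

def expected_utility(policy):
    # Expected utility over the *prior*; no Bayesian updating step.
    return sum(p * utility(w, policy[observation[w]]) for w, p in worlds.items())

best = max(policies, key=expected_utility)
print(best, expected_utility(best))
```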
In this framework, Coscott and Paul suggest, everything adds up to normality if, instead of saying that some experiences objectively exist more, we happen to care more about some experiences than about others. (That's not a new idea, of course, or the first time this has appeared on LW -- for example, Wei Dai's What are probabilities, anyway? comes to mind.) In particular, suppose we just care more about experiences in mathematically really simple worlds -- or more precisely, places in mathematically simple worlds that are mathematically simple to describe (since there's a simple program that runs all Turing machines, and therefore all mathematically possible human experiences, always assuming that human brains are computable). Then, even though there's a version of you that's about to see purple pumpkins rain from the sky, you act in a way that's best in the world where that doesn't happen, because that world has so much lower K-complexity, and because you therefore care so much more about what happens in that world.
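To see how brutally a strict 2^-K weighting discounts even one disorderly event, one can use compressed length as a crude, computable stand-in for K-complexity (it's only an upper bound, and a loose one). Everything below is illustrative.

```python
import zlib

def approx_K_bits(description):
    # Compressed length in bits: a rough, computable proxy for the
    # (uncomputable) Kolmogorov complexity of the description.
    return 8 * len(zlib.compress(description.encode()))

orderly = "the familiar laws of physics hold everywhere, always. " * 40
pumpkin = orderly + "except that today, purple pumpkins fall from the sky."

k_orderly = approx_K_bits(orderly)
k_pumpkin = approx_K_bits(pumpkin)

# Under a 2^-K weighting, the disorderly world matters 2^(K difference) times less.
print(f"approx K: orderly {k_orderly} bits, pumpkin {k_pumpkin} bits")
print(f"care ratio: about 2^{k_pumpkin - k_orderly} to 1 for the orderly world")
```

The point of the sketch is just the exponent: a few hundred extra bits of description translate into an astronomically smaller weight, which is why, as argued below, human intuitions don't seem to discount disorderly-but-otherwise-simple worlds this steeply.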
There's something unsettling about that, which I think deserves to be mentioned, even though I do not think it's a good counterargument to this view. The unsettling thing is that on priors, it's very unlikely that the world you experience arises from a really simple mathematical description. (This is a version of a point I also made in my previous post.) Even if the physicists had already figured out a simple Theory of Everything -- say, a super-simple cellular automaton that accords really well with experiments -- you wouldn't know that this cellular automaton, if you ran it, would really produce you. After all, imagine that somebody intervened in Earth's history so that orchids never evolved, but otherwise left the laws of physics the same; there might still be humans, or something like humans, and they would still run experiments and find that they match the predictions of the simple cellular automaton, so they would assume that if you ran that automaton, it would compute them -- except it wouldn't, it would compute us, with orchids and all. Unless, of course, it does compute them, and a special intervention is required to get the orchids.
So you don't know that you live in a simple world. But, goes the obvious reply, you care much more about what happens if you do happen to live in the simple world. On priors, it's probably not true; but it's best, according to your values, if all people like you act as if they live in the simple world (unless they're in a counterfactual mugging type of situation, where they can influence what happens in the simple world even if they're not in the simple world themselves), because if the actual people in the simple world act like that, that gives the highest utility.
You can adapt an argument that I was making in my l-zombies post to this setting: Given these preferences, it's fine for everybody to believe that they're in a simple world, because this will increase the correspondence between map and territory for the people that do live in simple worlds, and that's who you care most about.
I mostly agree with this reasoning. I agree that Tegmark IV without a measure seems like the most obvious and reasonable hypothesis about what the world looks like. I agree that there seems no reason for there to be a "magical reality fluid". I agree, therefore, that on the priors that I'd put into my UDT calculation for how I should act, it's much more likely that true reality is a measureless Tegmark IV than that it has some objective measure according to which some experiences are "experienced less" than others, or not experienced at all. I don't think I understand things well enough to be extremely confident in this, but my odds would certainly be in favor of it.
Moreover, I agree that if this is the case, then my preferences are to care more about the simpler worlds, making things add up to normality; I'd want to act as if purple pumpkins are not about to start falling from the sky, precisely because I care more about the consequences my actions have in more orderly worlds.
Imagine this: Once you finish reading this article, you hear a bell ringing, and then a sonorous voice announces: "You do indeed live in a Tegmark IV multiverse without a measure. You had better deal with it." And then it turns out that it's not just you who's heard that voice: Every single human being on the planet (who didn't sleep through it, isn't deaf etc.) has heard those same words.
On this hypothesis, that is of course about to happen to you, though only in one of those worlds with high K-complexity that you don't care about very much.
So let's consider the following possible plan of action: You could act as if there is some difference between "existence" and "non-existence", or perhaps some graded degree of existence, until you hear those words and confirm that everybody else has heard them as well, or until you've experienced one similarly obviously "disorderly" event. So until that happens, you do things like invest time and energy into trying to figure out what the best way to act is if it turns out that there is some magical reality fluid, and into trying to figure out what a non-confused version of something like a measure on conscious experience could look like, and you act in ways that don't kill you if we happen to not live in a measureless Tegmark IV. But once you've had a disorderly experience, just a single one, you switch over to optimizing for the measureless mathematical multiverse.
If the degree to which you care about worlds really falls off with their K-complexity as steeply as a 2^-K weighting implies, with respect to what you and I would consider a "simple" universal Turing machine, then this would be a silly plan; there is very little to be gained from being right in worlds with that much higher K-complexity. But when I query my intuitions, it seems like a rather good plan:
- Yes, I care less about those disorderly worlds. But not as much less as if I valued them by their K-complexity. I seem to be willing to tap into my complex human intuitions to refer to the notion of "single obviously disorderly event", and assign the worlds with a single such event, and otherwise low K-complexity, not that much lower importance than the worlds with actual low K-complexity.
- And if I imagine that the confused-seeming notions of "really physically exists" and "actually experienced" do have some objective meaning independent of my preferences, then I care much more about the difference between "I get to 'actually experience' a tomorrow" and "I 'really physically' get hit by a car today" than I care about the difference between the world with true low K-complexity and the worlds with a single disorderly event.
In other words: on the priors I put into my UDT calculation, it's much more likely that we live in measureless Tegmark IV, but my confidence in this isn't extreme. If we don't live there, then the difference between "exists" and "doesn't exist" (or "is experienced a lot" and "is experienced only infinitesimally") is very important -- much more important, according to my preferences if we do live in a Tegmark IV universe, than the difference between "simple world" and "simple world plus one disorderly event". If I act optimally according to the Tegmark IV hypothesis in the latter worlds, that still gives me most of the utility that acting optimally in the truly simple worlds would give me. More precisely, that utility differential isn't nearly as large as the one I face if something else entirely is going on, something I should be doing something about and am not.
This is the reason why I'm trying to think seriously about things like l-zombies and magical reality fluid. I mean, I don't even think that these are particularly likely to be exactly right even if the measureless Tegmark IV hypothesis is wrong; I expect that there would be some new insight that makes even more sense than Tegmark IV, and makes all the confusion go away. But trying to grapple with the confused intuitions we currently have seems at least a possible way to make progress on this, if it should be the case that there is in fact progress to be made.
Here's one avenue of investigation that seems worthwhile to me, and wouldn't seem so without the above argument. One thing I could imagine finding, that could make the confusion go away, would be that the intuitive notion of "all possible Turing machines" is just wrong, and leads to outright contradictions (e.g., to inconsistencies in Peano Arithmetic, or something similarly convincing). Lots of people have entertained the idea that concepts like the real numbers don't "really" exist, and that only the behavior of computable functions is "real"; perhaps not even that is real, and true reality is more restricted? (You can reinterpret many results about real numbers as results about computable functions, so maybe you could reinterpret results about computable functions as results about these hypothetical weaker objects that would actually make mathematical sense.) So it wouldn't be the case after all that there is some Turing machine that computes the conscious experiences you would have if pumpkins started falling from the sky.
Does the above make sense? Probably not. But I'd say that there's a small chance that maybe it does, and that if we understood the right kind of math, it would seem very obvious that not all intuitively possible human experiences are actually mathematically possible (just as obvious as it is today, with hindsight, that there is no Turing machine which takes a program as input and outputs whether that program halts). Moreover, it seems plausible that this could have consequences for how we should act. This, together with my argument above, makes me think that this sort of thing is worth investigating -- even though my priors are heavily on the side of expecting that all experiences exist to the same degree, and ordinarily this difference in probabilities would make me think our time better spent investigating other, more likely hypotheses.
Leaving aside the question of how I should act, though, does all of this mean that I should believe that I live in a universe with l-zombies and magical reality fluid, until such time as I hear that voice speaking to me?
I do feel tempted to try to invoke my argument from the l-zombies post that I prefer the map-territory correspondences of actually existing humans to be correct, and don't care about whether l-zombies have their map match up with the territory. But I'm not sure that I care much more about actually existing humans being correct, if the measureless mathematical multiverse hypothesis is wrong, than I care about humans in simple worlds being correct, if that hypothesis is right. So I think that the right thing to do may be to have a subjective belief that I most likely do live in the measureless Tegmark IV, as long as that's the view that seems by far the least confused -- but continue to spend resources on investigating alternatives, because on priors they don't seem sufficiently unlikely to make up for the potential great importance of getting this right.