
Vegetarianism Ideological Turing Test Results

1 Raelifin 14 October 2015 12:34AM

Back in August I ran a Caplan Test (or more commonly an "Ideological Turing Test") both on Less Wrong and in my local rationality meetup. The topic was diet, specifically: Vegetarian or Omnivore?

If you're not familiar with Caplan Tests, I suggest reading Palladias' post on the subject or reading Wikipedia. The test I ran was pretty standard; thirteen blurbs were presented to the judges, selected by the toss of a coin to either be from a vegetarian or from an omnivore, and also randomly selected to be genuine or an impostor trying to pass themselves off as the alternative. My main contribution, which I haven't seen in previous tests, was using credence/probability instead of a simple "I think they're X".

I originally chose vegetarianism because I felt like it's an issue which splits our community (and particularly my local community) pretty well. A third of test participants were vegetarians, and according to the 2014 census, only 56% of LWers identify as omnivores.

Before you see the results of the test, please take a moment to say aloud how well you think you can do at predicting whether someone participating in the test was genuine or a fake.














If you think you can do better than chance you're probably fooling yourself. If you think you can do significantly better than chance you're almost certainly wrong. Here are some statistics to back that claim up.

I got 53 people to judge the test: 43 from Less Wrong, and 10 from my local group. Averaging across the entire group, 51.1% of judgments were correct. If my chi-squared math is correct, the p-value under the null hypothesis (judges performing no better than chance) is 0.57 on this data. (Note that this includes people who judged an entry as 50%. If we don't include those folks the success rate drops to 49.4%.)
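For concreteness, here is a minimal Python sketch of that chi-squared test. The total judgment count (53 judges times 13 entries) is an assumption made for illustration; the real number of recorded judgments may be somewhat smaller, which would shift the p-value slightly.

    # Hypothetical sketch of the chi-squared test described above.
    # ASSUMPTION: roughly 53 judges x 13 entries = 689 judgments, 51.1% correct.
    from scipy.stats import chisquare

    total = 53 * 13                    # assumed number of judgments
    correct = round(0.511 * total)     # ~352 judged correctly
    expected = [total / 2, total / 2]  # chance performance: 50/50

    chi2, p = chisquare([correct, total - correct], f_exp=expected)
    print(f"chi2 = {chi2:.2f}, p = {p:.2f}")  # p comes out near 0.57 under these assumptions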

In retrospect, this seems rather obvious to me. Vegetarians aren't significantly different from omnivores. Unlike a religion or a political party, there aren't many cultural centerpieces to diet. Vegetarian judges did no better than omnivore judges, even when judging vegetarian entries. In other words, in this instance the minority doesn't possess any special power for detecting other members of the in-group. This test shows null results; whatever distinguishes vegetarians from omnivores, it is not familiarity with the other side's arguments or culture, at least not to a degree that we can distinguish at a glance.

More interesting, in my opinion, than the null results were the results I got on the calibration of the judges. Back when I asked you to say aloud how good you'd be, what did you say? Did the last three paragraphs seem obvious? Would it surprise you to learn that not a single one of the 53 judges held their guesses to a confidence band of 40%-60%? In other words, every single judge thought themselves decently able to discern genuine writing from fakery. The numbers suggest that every single judge was wrong.

(The flip-side to this is, of course, that every entrant to the test won! Congratulations rationalists: signs point to you being able to pass as vegetarians/omnivores when you try, even if you're not in that category. The average credibility of an impostor entry was 59%, while the average credibility of a genuine response was 55%. No impostors got an average credibility below 49%.)

Using the logarithmic scoring rule for the calibration game we can measure the error of the community. The average judge got a score of -543. For comparison, a judge that answered 50% ("I don't know") to all questions would've gotten a score of 0. Only eight judges got a positive score, and only one had a score higher than 100 (consistent with random chance). This is actually one area where Less Wrong should feel good. We're not at all calibrated... but for this test at least, the judges from the website were much better calibrated than my local community (who mostly just lurk). If we separate the two groups we see that the average score for my community was -949, while LW had an average of -448. Given that I restricted the choices to multiples of 10, a random selection of credences gives an average score of -921.
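For readers unfamiliar with it, here is a minimal Python sketch of a logarithmic scoring rule of this kind. The scaling constant is an assumption; the sketch only illustrates the key property that a flat 50% answer scores zero while confident wrong answers are punished heavily.

    import math

    def log_score(credence, was_correct, scale=100.0):
        # ASSUMPTION: score = scale * log2(2p), where p is the probability assigned
        # to the true outcome. The constant used in the actual test may differ; the
        # shape (zero at 50%, steep penalties for confident misses) is what matters.
        p = credence if was_correct else 1.0 - credence
        return scale * math.log2(2.0 * p)

    print(log_score(0.5, True))    #   0.0  -- "I don't know" is the baseline
    print(log_score(0.9, True))    # +84.8  -- confident and right
    print(log_score(0.9, False))   # -232.2 -- confident and wrong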

In short, the LW community didn't prove to be any better at discerning fact from fiction, but it was significantly less overconfident. More de-biasing needs to be done, however! The next time you think of a probability to reflect your credence, ask yourself "Is this the sort of thing that anyone would know? Is this the sort of thing I would know?" That answer will probably be "no" a lot more than it feels like from the inside.

Full data (minus contact info) can be found here.

Those of you who submitted a piece of writing that I used, or who judged the test and left your contact information: I will be sending out personal scores very soon (probably by this weekend). My apologies for the delay on this post; I had a vacation in late August and it threw off my attention to this project.

[Link] Max Tegmark and Nick Bostrom Speak About AI Risk at UN International Security Event

2 Gram_Stone 13 October 2015 11:25PM

Stupid questions thread, October 2015

0 philh 13 October 2015 07:39PM

This thread is for asking any questions that might seem obvious, tangential, silly or what-have-you. Don't be shy: everyone has holes in their knowledge, and the fewer and smaller we can make them, the better.

Please be respectful of other people's admitting ignorance and don't mock them for it, as they're doing a noble thing.

To any future monthly posters of SQ threads, please remember to add the "stupid_questions" tag.

New positions and recent hires at the Centre for the Study of Existential Risk (Cambridge, UK)

8 Sean_o_h 13 October 2015 11:11AM

[Cross-posted from EA Forum. Summary: Four new postdoc positions at the Centre for the Study of Existential Risk: Evaluation of extreme technological risk (philosophy, economics); Extreme risk and the culture of science (philosophy of science); Responsible innovation and extreme technological risk (science & technology studies, sociology, policy, governance); and an academic project manager (cutting across the Centre’s research projects, and playing a central role in Centre development). Please help us to spread the word far and wide in the academic community!]


An inspiring first recruitment round

The Centre for the Study of Existential Risk (Cambridge, UK) has been making excellent progress in building up our research team. Our previous recruitment round was a great success, and we made three exceptional hires. Dr Shahar Avin joined us in September from Google, with a background in the philosophy of science (Cambridge, UK). He is currently fleshing out several potential research projects, which will be refined and finalised following a research visit to FHI later this month. Dr Yang Liu joined us this month from Columbia University, with a background in mathematical logic and philosophical decision theory. Yang will work on problems in decision theory that relate to long-term AI, and will help us to link the excellent work being done at MIRI with relevant expertise and talent within academia. In February 2016, we will be joined by Dr Bonnie Wintle from the Centre of Excellence for Biosecurity Risk Analysis (CEBRA), who will lead our horizon-scanning work in collaboration with Professor Bill Sutherland’s group at Cambridge; among other things, she has worked on IARPA-funded development of automated horizon-scanning tools, and has been involved in the Good Judgement Project.

We are very grateful for the help of the existential risk and EA communities in spreading the word about these positions, and helping us to secure an exceptionally strong field. Additionally, I have now moved on from FHI to be CSER’s full-time Executive Director, and Huw Price is now 50% funded as CSER’s Academic Director (we share him with Cambridge’s Philosophy Faculty, where he remains Bertrand Russell Chair of Philosophy).

Four new positions:

We’re delighted to announce four new positions at the Centre for the Study of Existential Risk; details are provided below. Unlike the previous round, where we invited project proposals from across our areas of interest, in this case we have several specific positions that we need to fill for our three-year Managing Extreme Technological Risk project, funded by the Templeton World Charity Foundation. As we are building up our academic brand within a traditional university, we expect to predominantly hire from academia, i.e. academic researchers with (or near to the completion of) PhDs. However, we are open to hiring excellent candidates without PhDs but with an equivalent and relevant level of expertise, for example in think tanks, policy settings or industry.

Three of these positions are in the standard academic postdoc mould, working on specific research projects. I’d like to draw attention to the fourth, the academic project manager. For this position, we are looking for someone with the intellectual versatility to engage across our research strands – someone who can coordinate these projects, synthesise and present our research to a range of audiences including funders, collaborators, policymakers and industry contacts. Additionally, this person will play a key role in developing the centre over the next two years, working with our postdocs and professorial advisors to secure funding, and contributing to our research, media, and policy strategy among other things. I’ve been interviewed in the past (https://80000hours.org/2013/02/bringing-it-all-together-high-impact-research-management/) about the importance of roles of this nature; right now I see it as our biggest bottleneck, and a position in which an ambitious person could make a huge difference.

We need your help – again!

In some ways, CSER has been the quietest of the existential risk organisations of late – we’ve mainly been establishing research connections, running lectures and seminars, writing research grants and building relations with policymakers (plus some behind-the-scenes involvement with various projects). But we’ve been quite successful in these things, and now face an exciting but daunting level of growth: by next year we aim to have a team of 9-10 postdoctoral researchers here at Cambridge, plus senior professors and other staff. It’s very important we continue our momentum by getting world-class researchers motivated to do work of the highest impact. Reaching out and finding these people is quite a challenge, especially given our still-small team. So the help of the existential risk and EA communities in spreading the word – on your Facebook feeds, on relevant mailing lists in your universities, passing these positions on to talented people you know – will make a huge difference to us.

Thank you so much!

Seán Ó hÉigeartaigh (Executive Director, CSER)


“The Centre for the Study of Existential Risk is delighted to announce four new postdoctoral positions for the subprojects below, to begin in January 2016 or as soon as possible afterwards. The research associates will join a growing team of researchers developing a general methodology for the management of extreme technological risk.

Evaluation of extreme technological risk will examine issues such as:

The use and limitations of approaches such as cost-benefit analysis when evaluating extreme technological risk; the importance of mitigating extreme technological risk compared to other global priorities; issues in population ethics as they relate to future generations; challenges associated with evaluating small probabilities of large payoffs; challenges associated with moral and evaluative uncertainty as they relate to the long-term future of humanity. Relevant disciplines include philosophy and economics, although suitable candidates outside these fields are welcomed. More: Evaluation of extreme technological risk

Extreme risk and the culture of science will explore the hypothesis that the culture of science is in some ways ill-adapted to successful long-term management of extreme technological risk, and investigate the option of ‘tweaking’ scientific practice, so as to improve its suitability for this special task. It will examine topics including inductive risk, use and limitations of the precautionary principle, and the case for scientific pluralism and ‘breakout thinking’ where extreme technological risk is concerned. Relevant disciplines include philosophy of science and science and technology studies, although suitable candidates outside these fields are welcomed. More: Extreme risk and the culture of science;

Responsible innovation and extreme technological risk asks what can be done to encourage risk-awareness and societal responsibility, without discouraging innovation, within the communities developing future technologies with transformative potential. What can be learned from historical examples of technology governance and culture-development? What are the roles of different forms of regulation in the development of transformative technologies with risk potential? Relevant disciplines include science and technology studies, geography, sociology, governance, philosophy of science, plus relevant technological fields (e.g., AI, biotechnology, geoengineering), although suitable candidates outside these fields are welcomed. More: Responsible innovation and extreme technological risk

We are also seeking to appoint an academic project manager, who will play a central role in developing CSER into a world-class research centre. We seek an ambitious candidate with initiative and a broad intellectual range for a postdoctoral role combining academic and administrative responsibilities. The Academic Project Manager will co-ordinate and develop CSER’s projects and the Centre’s overall profile, and build and maintain collaborations with academic centres, industry leaders and policy makers in the UK and worldwide. This is a unique opportunity to play a formative research development role in the establishment of a world-class centre. More: CSER Academic Project Manager

Candidates will normally have a PhD in a relevant field or an equivalent level of experience and accomplishment (for example, in a policy, industry, or think tank setting). Application Deadline: Midday (12:00) on November 12th 2015.”

Open thread, Oct. 12 - Oct. 18, 2015

4 MrMind 12 October 2015 06:57AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

General buying considerations?

4 Elo 12 October 2015 05:18AM

The following is an incomplete list of generic considerations you might like to make when you go out to buy a thing. I have tried to put the list in a sensible order, but because it is generic, certain things will be more or less important, and in a different order, for different people and purchases.


0. Do I need the thing? Am I just wanting it on a whim (you are allowed to do that, but at least try not to do it for expensive things that don’t have resale value)?  If a month went by, would I still want it?

  1. What is the thing? What functionality considerations do you need to make?  What does it need to do?  If you already had it - what would it be doing? Will it fit in your life?
  2. What is your expected use? Daily? Once-off? Occasional? (no more than 5 times in your predicted future)
  3. What do I want it to do?  Does this thing do what I want it to do?  (It can be very easy to buy a thing that doesn't quite suit the need because we get distracted between wanting a thing and getting a thing)

Consider your options that avoid buying it:

  1. Can I borrow one from a friend? Or a family member? (Some things don’t lend themselves to borrowing - there’s no sense borrowing a wristwatch if it’s an item you wear every day - and there can be other reasons not to borrow a thing.)
  2. Can I get one second hand?  
    Some items are perfectly fine second hand (e.g. books), whereas others are potentially less fine (e.g. cars), where more can go wrong with a second-hand one.  The point of this inclusion is to encourage you to consider the option when you previously would not have, for whatever reason.  Second-hand books can occasionally be out of date or damaged, and second-hand cars can be excellent purchases.
  3. Is anyone I know also interested in having the thing, and would they be willing to split the cost with me in order to have it on a kind of timeshare, and can we agree on a depreciation schedule such that one of us buys out the other's share in the future, if one of us is moving away or something?
  4. Rent/hire the thing as a one-off (works for most power tools, as well as storage space, a boat, all kinds of things...).  It is also an option to rent short term while you decide whether the thing fits your life, e.g. rent a jetski.  If you find you don’t use it enough to warrant a full purchase, you will only have spent a fraction of the purchase price, and may have saved money by doing so.
  5. Timeshare - businesses exist around sharing cars; boats; holiday houses and various other products.  You might be able to take advantage of these businesses.
  6. Can I apply for credit for the thing? Can I get the item on consignment?
  7. Could I earn money using the thing and return some costs? (Am I likely to do that based on my past experiences doing so with other purchases?)

Knowledge about the thing:

  1. Do any of your trusted friends have opinions or knowledge in the area?
  2. What do online reviews say?
  3. Is there a community of enthusiasts (e.g. online) who have resources, or to whom you can outsource the search? 
  4. Are there experts in the field (e.g. when buying houses)? Is it worth engaging an expert for this transaction?
  5. How much time do I want to spend on considering and shopping vs how much use will I get out of the thing? (For items under $20, try not to spend more than half an hour on it; beyond that it’s almost better to buy one at random from what’s available, depending on the value of your time - see the sketch below.)
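As a rough illustration of that time-versus-savings trade-off, here is a minimal Python sketch. The prices and hourly rate are made-up numbers, not recommendations.

    # Hypothetical worked example: is more shopping time worth the saving?
    def worth_more_shopping(best_price, worst_price, hourly_rate, extra_hours):
        """Return True if the extra search time costs less than the potential saving."""
        potential_saving = worst_price - best_price
        cost_of_time = hourly_rate * extra_hours
        return cost_of_time < potential_saving

    # A $15 price spread is not worth two more hours of searching at $20/hour:
    print(worth_more_shopping(best_price=35, worst_price=50,
                              hourly_rate=20, extra_hours=2))  # False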

Purchase considerations:

  • What is my budget?
  • Can I afford it? (see options that avoid buying it)
  • Price range of the things on the market?
  • Is it cheaper somewhere else in the world and posted to me?
  • Can I ask for a discount?
  • Can I combine postage with other items?
  • How long will the thing last?
  • How long do I need it for?
  • How quickly do I need it?
  • Do I want to be able to sell it when I'm done?
  • What's the return policy of the various places selling it vs price vs shipping?
  • What is the shipping time?
  • Does it come with a warranty?  Does the warranty last long enough for my liking?
  • Are any laws, customs or taxes applicable to it; or its purchase, or resale?
  • What's the difference between the best price and the worst price, and when do you wind up spending more time (in terms of the value of your time) than that difference trying to get the best price?
  • Does it have resale value?  Do some have better resale than others? (are you actually a person who re-sells things? - have you resold a thing before?)
  • Can I get it in a physical store?  Can I get it online?

General specifics:

  • Is the one I want a quality item?
  • Is the item disposable or not? Have you considered the merits of a similar but disposable one? (or a similar non-disposable one)
  • Does it have the correct colour? Or other embellishments?
  • Do I have storage space for it within my existing storage area?
  • Is it big? Can I get a smaller one?
  • Is it heavy? Can I get a more lightweight version?
  • What are its power options? AC, DC, battery, built-in battery, built-in solar, etc.
  • What is it made out of? Does it come in metal, plastic, wood, etc. what would I prefer?
  • Does it suit my existing possessions?
  • Will this one cost more to repair than the other similar ones?

Miscellaneous considerations:

  • Do I have a backup for if this one fails?
  • What are the consequences of a lower-quality thing breaking while I'm using it?
  • Can I pay for it from someone who is going to donate proceeds to charitable causes?
  • For any purchase under $50 (adjust for your life circumstances) it’s not really worth running through this checklist; but for more expensive purchases, a process like this will help you feel confident that you put in the effort and came to a good conclusion.
  • Does the process of buying it give me pleasure, or will I suffer in a long line for it?
  • What kind of signalling is the thing going to give me?  Do I want that?
  • Does the thing have an upkeep or maintenance cost?


Nearly all of the points listed here could be expanded into posts of their own.  These points apply to everyone to different extents.  “Consider borrowing” is advice that is priceless to one person and useless to another.  Similarly, “budget” might be significant to one person because they rarely spend but then spend whatever they like when they do, and useless to another person because they live and breathe their budget.

I plan to cover this in another post about making advice applicable to you.

meta: 3-hour write-up.  3-5 reviewers; the Slack channel inspired the post and gave me a place to flesh out the thoughts.

This post is certainly open to improvements.  Please add your comments below.

See also: My Table of contents for other posts in this collection.

See also other repositories on lesswrong:

Clothing is Hard (A Brief Adventure into my Inefficient Brain)

-2 k_ebel 12 October 2015 01:44AM

This is an active solicitation for suggestions on how to train it differently.


Apparently, this morning I put on my underwear wrong.  

Upon noticing that they were on incorrectly, I took them off by turning them inside out on the Z axis (top of head to bottom of feet), and then rotating them 180 degrees along the Y axis (belly button to back, travelling through the spine). 

I noted the degrees of off-ness on the two axes, intending to remember them for the next time this happens.  Yes, this happens often enough that I'll probably use the information again.  Sometimes, even clothing is hard.


It was only then that I realized that the easier way to understand what happened would be to say that they were 180 degrees off on the X axis (left shoulder to right shoulder, travelling across the back).




Ultimately, how this seems to play out is that I get ahead of myself in some rather strange ways.   I tend to think of things in motion before I fully understand them in their static forms.   In the example above, it would have meant that I was trying to store larger chunks of more complex data, when a simpler notation would have done just as well.   I also find that it can distract me from recognizing the context around whatever I'm observing.


I'm only just beginning to be able to identify when that's happening.   

Obviously, I want to address this.  I just don't know how to go about figuring out what needs to be done - from how to gather more information, to what to do with it.







Unbounded linear utility functions?

1 snarles 11 October 2015 11:30PM

The LW community seems to assume, by default, that "unbounded, linear utility functions are reasonable."  That is, if you value the existence of 1 swan at 1.5 utilons, then 10 swans should be worth 15, etc.

Yudkowsky in his post on scope insensitivity argues that nonlinearity of personal utility functions is a logical fallacy.

However, unbounded and linearly increasing utility functions lead to conundrums such as Pascal's Mugging.  A recent discussion topic on Pascal's Mugging suggests ignoring probabilities that are too small.  But such extreme measures are not necessary if tamer utility functions are used: one imagines a typical personal utility function to be bounded and nonlinear. 
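The contrast can be made concrete with a small sketch. The particular bounded function and the numbers below are illustrative assumptions, not anything proposed in the posts under discussion; they only show why a bounded utility caps the leverage of tiny probabilities.

    import math

    def linear_utility(lives):
        return lives                       # unbounded and linear in lives

    def bounded_utility(lives, cap=1e9, scale=1e9):
        # ASSUMPTION: an arbitrary saturating (bounded, nonlinear) form,
        # chosen only to illustrate the qualitative difference.
        return cap * (1 - math.exp(-lives / scale))

    p = 1e-5           # a 0.001% chance, as in the example later in this post
    lives = 1e100      # the promised or threatened number of lives

    print(p * linear_utility(lives))   # 1e95 -- dominates every ordinary consideration
    print(p * bounded_utility(lives))  # <= p * cap = 1e4 -- stays comparable to ordinary stakes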

In that recent discussion topic, V_V and I questioned the adoption of such an unbounded, linear utility function.  I would argue that nonlinearity of utility functions is not a logical fallacy.

To make my case clear, I will clarify my personal interpretation of utilitarianism.  Utility functions are mathematical constructs that can be used to model individual or group decision-making.  However, it is unrealistic to suppose that every individual actually has a utility function or even a preference ordering; at best, one could find a utility function which approximates the behavior of the individual.  This is confirmed by studies demonstrating the inconsistency of human preferences.  The decisions made by coordinated groups (e.g. corporate partners, citizens in a democracy, or the entire community of effective altruists) could also be more or less well-approximated by a utility function: presumably, the accuracy of the utility function model of decision-making depends on the cohesion of the group.  Utilitarianism, as proposed by Bentham and Mill, proposes an ethical framework based on some idealized utility function.  Rather than using utility functions to model group decision-making, Bentham and Mill propose to use some utility function to guide decision-making, in the form of an ethical theory.  It is important to distinguish these two different use-cases of utility functions, which might be termed descriptive utility and prescriptive utility.

But what is ethics?  I hold the hard-nosed position that moral philosophies (including utilitarianism) are human inventions which serve the purpose of facilitating large-scale coordination.  Another way of putting it is that moral philosophy is a manifestation of the limited superrationality that our species possesses.  [Side note: one might speculate that the intellectual aspect of human political behavior, of forming alliances based on shared ideals (including moral philosophies), is a memetic or genetic trait which propagated due to positive selection pressure: moral philosophy is necessary for the development of city-states and larger political entities, which in turn rose as the dominant form of social organization in our species.  But this is a separate issue from the discussion at hand.]

In this larger context, we can be prepared to evaluate the relative worth of a moral philosophy, such as utilitarianism, against competing philosophies.  If the purpose of a moral philosophy is to facilitate coordination, then an effective moral philosophy is one that can actually hope to achieve that kind of coordination.  Utilitarianism is a good candidate for facilitating global-level coordination due to its conceptual simplicity, because most people can agree with its principles, and because it provides a clear framework for decision-making, provided that a suitable utility function can be identified, or at least that the properties of the "ideal utility function" can be debated.  Furthermore, utilitarianism and related consequentialist moralities are arguably better equipped to handle the tragedy of the commons than competing deontological theories.

And if we accept utilitarianism, and if our goal is to facilitate global coordination, we can go further and evaluate the properties of any proposed utility function by the same criterion as before: how well will the proposed utility function facilitate global coordination?  Will the proposed utility function find broad support among the key players in the global community?  Unbounded, linearly increasing utility functions clearly fail, because few people would support conclusions such as "it's worth spending all our resources to prevent a 0.001% chance that 1e100 human lives will be created and tortured."

If so, why are such utility functions so dominant in the LW community?  One cannot overlook the biased composition of the LW community as a potential factor: generally proficient in mathematical or logical thinking, but less adept than the general population in empathetic skills.  Oversimplified theories, such as linear unbounded utility functions, appeal more strongly to this type of thinker, while more realistic but complicated utility functions are instinctively dismissed as "illogical" or "irrational", when the real reason they are dismissed is not that they have actually been shown to be illogical, but that they are perceived as uglier.

Yet another reason stems from the motives of the founders of the LW community, who make a living primarily out of researching existential risk and friendly AI.  Since existential risks are the kind of low-probability, long-term and high-impact event which would tend to be neglected by "intuitive" bounded and nonlinear utility functions, but favored by unintuitive, unbounded linear utility functions, it is in the founders' best interests to personally adopt a form of utilitarianism employing the latter type of utility function.

Finally, let me clarify that I do not dispute the existence of scope insensitivity.  I think the general population is ill-equipped to reason about problems on a global scale, and that education could help remedy this kind of scope insensitivity.  However, even if natural utility functions asymptote far too early, I doubt that the end result of proper training against scope insensitivity would be an unbounded linear utility function; rather, it would still be a nonlinear utility function, but which asymptotes at a larger scale.



Simulations Map: what is the most probable type of the simulation in which we live?

3 turchin 11 October 2015 05:10AM

There is a chance that we may be living in a computer simulation created by an AI or a future super-civilization. The goal of the simulations map is to depict an overview of all possible simulations. It will help us to estimate the distribution of the many possible types of simulation, along with their measure and probability. This in turn will help us to estimate the probability that we are in a simulation and – if we are – what kind of simulation it is and how it could end.

Simulation argument

The simulation map is based on Bostrom’s simulation argument. Bostrom showed that “at least one of the following propositions is true:

(1) the human species is very likely to go extinct before reaching a “posthuman” stage;

(2) any posthuman civilization is extremely unlikely to run a significant number of simulations of their evolutionary history (or variations thereof);

(3) we are almost certainly living in a computer simulation”. http://www.simulation-argument.com/simulation.html

The third proposition is the strongest one, because (1) requires that not only human civilization but almost all other technological civilizations go extinct before they can begin simulations, because non-human civilizations could model human ones and vice versa. This makes (1) an extremely strong universal conjecture, and therefore very unlikely to be true. It requires that all possible civilizations kill themselves before they create AI, but we can hardly even imagine such a universal cause of extinction. If destruction is down to dangerous physical experiments, some civilizations may live in universes with different physics; if it is down to bioweapons, some civilizations would have enough control to prevent them.

In the same way, (2) requires that all super-civilizations with AI will refrain from creating simulations, which is unlikely.

Feasibly there could be some kind of universal physical law against the creation of simulations, but such a law is impossible, because some kinds of simulation already exist, for example human dreaming. During human dreaming, very precise simulations of the real world are created (which can’t be distinguished from the real world from within – that is why lucid dreams are so rare). So we could conclude that, after small genetic manipulations, it would be possible to create a brain that is 10 times more capable of creating dreams than an ordinary human brain. Such a brain could be used for the creation of simulations, and a strong AI will surely find more effective ways of doing it. So simulations are technically possible (and qualia are no problem for them, as we have qualia in dreams).

Any future strong AI (regardless of whether it is FAI or UFAI) should run at least several million simulations in order to solve the Fermi paradox and to calculate the probability of the appearance of other AIs on other planets, and their possible and most typical goal systems. AI needs this in order to calculate the probability of meeting other AIs in the Universe and the possible consequences of such meetings.

As a result, the a priori estimate that I am in a simulation is very high, with odds of possibly 1,000,000 to 1. The best chance of lowering this estimate is to find some flaw in the argument; possible flaws are discussed below.
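As a minimal sketch of the arithmetic behind an estimate like this (the count of simulated copies is purely an illustrative assumption): if every real observer like me has N indistinguishable simulated copies, then under a uniform self-locating prior the odds of being one of the simulated copies are N to 1.

    # Hypothetical self-location arithmetic; the copy count is an illustrative assumption.
    def p_simulated(simulated_copies_per_real_observer):
        n = simulated_copies_per_real_observer
        return n / (n + 1)   # uniform prior over all indistinguishable copies

    print(p_simulated(1_000_000))  # ~0.999999, i.e. odds of about 1,000,000 to 1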

Most abundant classes of simulations

If we live in a simulation, we are going to be interested in knowing the kind of simulation it is. Probably we belong to the most abundant class of simulations, and to find it we need a map of all possible simulations; an attempt to create one is presented here. 

There are two main reasons for simulation domination: goal and price. Some goals require the creation of a very large number of simulations, so such simulations will dominate. Also, cheaper and simpler simulations are more likely to be abundant.

Eitan_Zohar suggested http://lesswrong.com/r/discussion/lw/mh6/you_are_mostly_a_simulation/  that FAI will deliberately create an almost infinite number of simulations in order to dominate the total landscape and to ensure that most people will find themselves inside FAI-controlled simulations, which will be better for them, as in such simulations unbearable suffering can be excluded. (If in an infinite world an almost infinite number of FAIs exist, each of them could not change the landscape of simulation distribution, because its share of all simulations would be infinitely small. So we would need an acausal trade between an infinite number of FAIs to really change the proportion of simulations. I can't say that it is impossible, but it may be difficult.)

Another candidate for the largest subset of simulations is the one created for leisure and for the education of some kind of high-level beings.

The cheapest simulations are simple, low-resolution, and me-simulations (one real actor, with the rest of the world around him like a backdrop), similar to human dreams. I assume here that simulations are distributed according to the same power law as planets, cars and many other things: smaller and cheaper ones are more abundant.

Simulations could also be laid on one another in so-called Matryoshka simulations where one simulated civilization is simulating other civilizations. The lowest level of any Matryoshka system will be the most populated. If it is a Matryoshka simulation, which consists of historical simulations, the simulation levels in it will be in descending time order, for example the 24th century civilization models the 23rd century one, which in turn models the 22nd century one, which itself models the 21st century simulation. A simulation in a Matryoshka will end on the level where creation of the next level is impossible. The beginning of 21st century simulations will be the most abundant class in Matryoshka simulations (similar to our time period.)

Arguments against simulation theory

There are several possible objections to the simulation argument, but I do not find them strong enough to refute it. 

1.    Measure

The idea of measure was introduced to quantify the extent of the existence of something, mainly in quantum universe theories. While we don’t know how to actually measure “the measure”, the idea is based on intuition that different observers have different powers of existence, and as a result I could find myself to be one of them with a different probability. For example, if we have three functional copies of me, one of them is the real person, another is a hi-res simulation and the third one is low-res simulation, are my chances of being each of them equal (1/3)?

The “measure” concept is the most fragile element of all simulation arguments. It is based mostly on the idea that all copies have equal measure. But perhaps measure also depends on the energy of the calculations. If we have a computer which is using 10 watts of energy to calculate an observer, it may be presented as two parallel computers which are using five watts each. These observers may be divided again until we reach the minimum amount of energy required for calculation, which could be called a “Planck observer”. In this case our initial 10-watt computer will be equal to – for example – one billion Planck observers.

And here we see a great difference in the case of simulations, because simulation creators have to spend less energy on calculations (otherwise it would be easier to run real-world experiments). But in this case such simulations will have a lower measure. If the total number of simulations is large enough, the total measure of all simulations will still be higher than the measure of real worlds. But if most real worlds end with global catastrophe, the result would be an even higher proportion of real worlds, which could outweigh simulations after all.

2. Universal AI catastrophe

One possible universal global catastrophe could be that a civilization develops an AI overlord, but any AI meets some kind of unresolvable mathematical and philosophical problems which terminate it at an early stage, before it can create many simulations. See an overview of this type of problem in my “AI failures modes and levels” map.

3. Universal ethics

Another idea is that all AIs converge to some kind of ethics and decision theory which prevent them from creating simulations, or they create p-zombie simulations only. I am skeptical about that.

4. Infinity problems

If everything possible exists, or if the universe is infinite (which are equivalent statements), the proportion between two infinite sets is meaningless. We could overcome this objection using the idea of a mathematical limit: if we take bigger and bigger regions of the universe and longer and longer periods of time, simulations become more and more abundant within them.

But in all cases, in the infinite universe any world exists an infinite number of times, and this means that my copies exist in real worlds an infinite number of times, regardless of whether I am in a simulation or not.

5. Non-uniform measure over Universe (actuality)

Contemporary physics is based on the idea that everything that exists, exists in an equal sense, meaning that the Sun and very remote stars have the same measure of existence, even in causally separated regions of the universe. But if our region of space-time is somehow more real, this may change the simulation distribution in a way which favors real worlds. 

6. Flux universe

The same copies of me exist in many different real and simulated worlds. In simple form it means that the notion that “I am in one specific world” is meaningless, but the distribution of different interpretations of the world is reflected in the probabilities of different events.

E.g. the higher the chance that I am in a simulation, the bigger the probability that I will experience some kind of miracle during my lifetime. (Many miracles, like flying in dreams, almost prove that you are in a simulation.) But here correlation is not causation.

The stronger version of the same principle implies that I exist in many different worlds at once, and that I could manipulate the probability of finding myself in a given set of possible worlds, basically by forgetting who I am and becoming identical to a larger set of observers. It may work without any new physics; it only requires changing the number of similar observers, and if such observers are Turing computer programs, they could manipulate their own numbers quite easily.

Higher levels of flux theory do require new physics, or at least quantum mechanics in the many-worlds interpretation. In this picture, different interpretations of the world outside the observer could interact with each other or experience some kind of interference.

See further discussion about a flux universe here: http://lesswrong.com/lw/mgd/the_consequences_of_dust_theory/

7. Boltzmann brains outweigh simulations

It may turn out that BBs outweigh both real worlds and simulations. This may not be a problem from a planning point of view because most BBs correspond to some real copies of me.

But if we take this approach to solve the BBs problem, we will have to use it in the simulation problem as well, meaning: "I am not in a simulation, because for any simulation there exists a real world with the same 'me'." This is counterintuitive.

Simulation and global risks

Simulations may be switched off or may simulate worlds which are near global catastrophe. Such worlds may be of special interest for future AI because they help to model the Fermi paradox and they are good for use as games.

Miracles in simulations

The map also has blocks about types of simulation hosts, about many level simulations, plus ethics and miracles in simulations.

The main point about simulation is that it disturbs the random distribution of observers. In the real world I would find myself in mediocre situations, but simulations are more focused on special events and miracles (think about movies, dreams and novels). The more interesting my life is, the less chance that it is real.

If we are in a simulation we should expect more global risks, strange events and miracles, so being in a simulation changes our probability estimates for different occurrences.  

This map is parallel to the Doomsday argument map.

The estimates given in the map of the number of different types of simulation, or of the required flops, are more like placeholders, and may be several orders of magnitude too high or too low.

I think that this map is rather preliminary and its main conclusions may be updated many times.

The pdf of the map is here, and jpg is below.

Previous posts with maps:

Digital Immortality Map

Doomsday Argument Map

AGI Safety Solutions Map

A map: AI failures modes and levels

A Roadmap: How to Survive the End of the Universe

A map: Typology of human extinction risks

Roadmap: Plan of Action to Prevent Human Extinction Risks

Immortality Roadmap

Conflicting advice on altruism

4 leplen 11 October 2015 12:25AM

As far as I can tell, rather than reflecting a single well-defined set of preferences or utility function, my actions more closely reflect the outcome of a set of competing internal drives. One of my internal drives is strongly oriented towards a utilitarian altruism. While the altruist internal drive doesn't dominate my day-to-day life, compared to the influence of more basic drives like the desires for food, fun, and social validation, I have traditionally been very willing to drop whatever I'm doing and help someone who asks for, or appears to need, help. This altruistic drive has an even more significant degree of influence on my long-term planning, since my drives for food, fun, etc. are ambivalent between the many possible futures in which they can be well-satisfied. 

I'm not totally sure to what extent strong internal drives are genetic or learned or controllable, but I've had a fairly strong impulse towards altruism for well over a decade. Unfortunately, even over fairly long time frames it isn't clear to me that I've been a particularly "effective" altruist. This discussion attempts to understand some of the beliefs and behaviors that contributed to my personal failure/success as an altruist, and may also be helpful to other people looking to engage in or encourage similar prosocial habits.


Game Theory Model

Imagine a perfect altruist competing in a Prisoner's Dilemma style game. The altruist in this model is by definition a pure utilitarian who wants to maximize the average utility, but is completely insensitive to the distribution of the utility.1 A trivial real world example similar to this would be something like picking up litter in a public place. If the options are Pick up (Cooperate) and Litter (Defect) then an altruist might choose to pick up litter even though they themselves don't capture enough of the value to justify the action. Even if you're skeptical that unselfish pure utilitarians exist, the payoff matrix and much of this analysis applies to a broader range of prosocial behaviors where it's difficult for a single actor to capture the value he or she generates.

The prisoner's dilemma payoff matrix for the game in which the altruist is competing looks something like this:

                          Agent B
                     Cooperate    Defect
Agent A  Cooperate     2, 2       -2, 4
         Defect        4, -2      -1, -1

(Payoffs are listed as Agent A's, Agent B's.)

Other examples with altered payoff ratios are possible, but this particular payoff matrix creates an interesting inversion of the typical strategy for the prisoner's dilemma. If we label the altruist Agent A (A for Altruist), then A's dominant strategy is Cooperate: since A maximizes the average of the two payoffs, cooperating yields an average of 2 rather than 1 when B cooperates, and 1 rather than -1 when B defects. Just as in the traditional prisoner's dilemma, A prefers that B also cooperate, but A will cooperate regardless of what B does. The iterated prisoner's dilemma is even more interesting. If A and B are allowed to communicate before and between rounds, A may threaten to employ a tit-for-tat-like strategy and to defect in the future against defectors, but this threat is somewhat hollow, since regardless of threats, A's dominant strategy in any given round is still to cooperate. 

A population of naive altruists is somewhat unstable for the same reason that a population of naive cooperators is unstable: it's vulnerable to infiltration by defectors. The obvious meta-strategies for individual altruists and altruist populations are either to become proficient at identifying defectors and ignoring/avoiding them, or to successfully threaten defectors into cooperating. Both the identify/avoid and the threaten/punish tactics have costs associated with them, and which approach is the better strategy depends on how much players are expected to change over the course of time/a series of games. Incorrigible defectors cannot be threatened/punished and must be avoided, while for more malleable defectors it may be possible to threaten them into cooperation.

If we assume that agent B is selfish and we express the asymmetry in the agent values in terms of our payoff matrix, then the symmetric payoff matrix above is equivalent to the top portion of a new payoff matrix given by

                          Agent B
                     Cooperate    Defect
Agent A  Cooperate     2, 2        1, 4
         Defect        1, -2      -1, -1
         Avoid         0, 0        0, 0

The only difference between the two matrices is that in this latter case we've given the altruist an avoid option.  There is no simple way to include the threaten option, since threatening relies on trying to convince Agent B that Agent A is either unreasonable or not an altruist, and including that sort of bluff in the formal model makes it difficult to create payoff matrices that are both simple and reasonable. However, we can still make a few improvements to our formal model before we're forced to abandon it and talk about the real world.


Adding Complexity

The relatively simple payoff matrices in the previous section can easily be made more realistic and more complicated. In the iterated version of the game, if the total number of times A can cooperate in games is limited, then for each game in which she cooperates, she incurs an opportunity cost equal to the difference between her received payoff and her ideal payout. Under this construction an altruist who cooperates with a defector receives a negative utility as long as games with other cooperators are available. 

                          Agent B
                     Cooperate    Defect
Agent A  Cooperate     2, 2       -1, 4
         Defect       -1, -2      -3, -1
         Avoid         0, 0        0, 0

In this instance, A no longer has a dominant strategy. A should cooperate with B if she thinks that B will cooperate, but A should avoid B if she thinks that B will defect. A thus has a strong incentive to build a sophisticated model of B, which can be used either to convince B to cooperate or, at the very least, to correctly predict B's defection. For a perfect altruist, more information about and judgment of agent B leads to better average outcomes.
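A quick sketch of that decision rule, using the opportunity-cost payoff matrix above; A's belief about B is the only free parameter.

    # A's expected utility as a function of her belief that B will cooperate,
    # using the matrix above (A's payoffs: cooperate 2 or -1, avoid 0).
    # Defect (-1 or -3) is dominated, so the real choice is cooperate vs. avoid.
    def best_action(p_b_cooperates):
        eu_cooperate = p_b_cooperates * 2 + (1 - p_b_cooperates) * (-1)
        eu_avoid = 0.0
        return "cooperate" if eu_cooperate > eu_avoid else "avoid"

    print(best_action(0.9))   # cooperate
    print(best_action(0.2))   # avoid
    # The break-even belief is p = 1/3: above it A cooperates, below it A avoids.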

The popularity of gatekeeper organizations like GiveWell and Charity Navigator in altruist communities makes a lot of sense if those communities are aware of their vulnerability to defectors. Because charitable dollars are so fungible, giving money to a charity is an instance where opportunity costs play a significant role. While meta-charities offer some other advantages, a significant part of their appeal, especially for organizations like Charity Navigator, is helping people avoid "bad" charities.

Interestingly, with this addition, A's behavior may start to look less and less like pure altruism. Even if A is totally indifferent to the distribution of utility, if A can reliably identify some other altruists then she will preferentially cooperate with them and avoid games with unknown agents in which there is a risk of defection. The benefits of cooperation could then disproportionately accrue within the altruist in-group, even if none of the altruists intend that outcome.

An observer who had access only to the results of the games and not the underlying utility functions of the players would be unlikely to conclude that the clique of A-like agents that exhibited strong internal cooperation and avoided games with all other players had a purely altruistic utility function. Their actions pattern-match much more readily to something more selfish and more like typical human tribal behavior, suggesting either a self-serving or an "us versus them" utility function instead of one that has increasing the average payoff as its goal. If we include the threaten/punish option, the altruist population may look even less like a population of altruists.

That erroneous pattern match isn't a huge issue for the perfectly rational pure altruist in our game theory model. Unfortunately, human beings are often neither of those things. A significant amount of research suggests that people's beliefs are strongly influenced by their actions, and by what they think those actions say about them. An actual human who started with the purely altruistic utility function of Agent A in this section, and who rationally cooperated with a set of other easily identified altruists, might very well alter his utility function to seem more consistent with his actions. The game-theoretic model, in which the values of the agent are independent of the agent's choices, starts to break down. 

Trying to be an altruist

While very few individuals are perfect altruists/pure utilitarians as defined here, a much larger fraction of the population nominally considers the altruist value system to be an ethical ideal. The ideal that people have approximately equal value may not always be reflected in how most people live, but many people espouse such a belief and even want to believe it. We see this idea under all sorts of labels: altruism,  being a utilitarian, trying to "love your neighbor as yourself", believing in the spiritual unity of humankind, or even just an innate sense of fairness.

Someone who is trying to be an altruist may have altruism or a similar ethical injunction as one of many of their internal drives, and the drive for altruism may be relatively weak compared to their desires for personal companionship, increased social status, greater material wealth, etc. For this individual, the primary threat to the effectiveness of their prosocial behavior is not the possibility that they might cooperate with a defector; it is instead the possibility that their selfish drives might overwhelm their desire to act altruistically, and they themselves might not cooperate. 


Received Wisdom on Altruism

Much of the cultural wisdom in my native culture that addresses how to be a good altruist is geared towards people who are trying to be altruists, rather than towards altruists who are trying to be effective. The best course of action in the two situations is often very different, but it took me a considerable amount of time to realize the distinction.

For people trying to be altruists, focusing on the opportunity costs of their altruism is exactly the wrong thing to do. Imagining all the other things that they could buy with their money instead of giving it to a homeless person or donating it to the AMF will make it very unlikely they will give the money away. Judging the motivations of others often provides ample excuses for not helping someone. Seeking out similar cooperators can quickly turn into self-serving tribalism and indifference towards people unlike the tribe. Most people have really stringent criteria for helping others, and so, given the chance to help, most people don't.

The cultural advice I received on altruism tended to focus on avoiding these pitfalls. It stressed ideas like, "Do whatever good you can, wherever you are", and emphasized not to judge or condemn others, but to give second chances, to try and believe in the fundamental goodness of people, and to try to cooperate and value non-tribe members and even enemies. 

When I was trying to be an altruist, I took much of this cultural how-to advice on altruism very seriously, and for much of my life I often helped/cooperated with anyone who asked, regardless of whether the other person was likely to defect. Even when people literally robbed me I would rationalize that whoever stole my bike must have really needed a bike, and so even my involuntary "cooperation" with the thief was probably a net positive from a utilitarian standpoint.


Effective Altruism

I don't think I've been particularly effective as an altruist because I haven't been judgmental enough, because I've been too focused on doing whatever good I could where I was instead of finding the places I could do the most good and moving myself to those places. I'm now trying to spend nearly as much energy identifying opportunities to do good, as I do actively trying to improve the world.

At the same time, I'm still profoundly wary of the instinct not to help, or of thinking, "This isn't my best opportunity to do good", because I know that it's very easy to get in the habit of not helping people. I'm trying to move away from my instinct of reactively helping anyone who asks towards something that looks more like proactive planning, but I'm not at all convinced that most other people should be trying to move in that same direction.

As with achieving any goal, success requires a balance between insufficient planning and analysis paralysis. I think for altruism in particular, this balance was and is difficult to strike in part because of the large potential for motivated selfish reasoning, but also because most of my (our?) cultural wisdom emphasizes convenient immediate action as the correct form of altruism. Long term altruistic planning is typically not much mentioned or discussed, possibly because most people just aren't that strongly oriented towards utilitarian values.



If helping others is something that you're committed enough to that a significant limitation on your effectiveness is that you often help the wrong people, then diverting energy into judging who you help and consciously considering opportunity costs is probably a good idea. If helping others is something you'd like to do, but you rarely find yourself actually doing, the opposite advice may be apropos.



1. In idealized formulations of game theory, "utility" is intended to describe not just physical or monetary gain, but to include effects like desire for fairness, moral beliefs, etc. Symmetric games are fairly unrealistic under that assumption, and such a definition of utility would preclude our altruist from many games altogether. Utility in this first example is defined only in terms of personal gain, and explicitly does not include the effects of moral satisfaction, desire for fairness, etc.

Aumann Agreement Game

9 abramdemski 09 October 2015 05:14PM

I've written up a rationality game which we played several times at our local LW chapter and had a lot of fun with. The idea is to put Aumann's agreement theorem into practice as a multi-player calibration game, in which players react to the probabilities which other players give (each holding some privileged evidence). If you get very involved, this implies reasoning not only about how well your friends are calibrated, but also how much your friends trust each other's calibration, and how much they trust each other's trust in each other.

You'll need a set of trivia questions to play. We used these

The write-up includes a helpful scoring table which we have not play-tested yet. We did a plain Bayes loss rather than an adjusted Bayes loss when we played, and calculated things on our phone calculators. This version should feel a lot better, because the numbers are easier to interpret and you get your score right away rather than calculating at the end.
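For anyone who wants to play before printing the scoring table, here is a minimal sketch of one common choice of calibration score, the logarithmic rule. Whether this matches the write-up's "plain Bayes loss" exactly is an assumption on my part, and the scaling constant is arbitrary.

    import math

    def log_score(credence_in_truth, scale=100):
        """Logarithmic score for one question.

        credence_in_truth: the probability the player assigned to the answer
        that turned out to be correct (0 < p < 1).
        Returns 0 for p = 0.5 ("I don't know"), positive for p > 0.5,
        and increasingly negative for confident wrong answers.
        """
        return scale * math.log2(2 * credence_in_truth)

    # Example: a player says 70% and is right, then says 90% and is wrong.
    round_scores = [log_score(0.7), log_score(1 - 0.9)]
    print(round(sum(round_scores)))  # net score over the two questions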

New LW Meetups: Reading, Stockholm, and Suzhou

2 FrankAdamek 09 October 2015 04:11PM

This summary was posted to LW Main on October 2nd. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, Denver, London, Madison WI, Melbourne, Moscow, Mountain View, New Hampshire, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.

continue reading »

Toy model for wire-heading [EDIT: removed for improvement]

2 Stuart_Armstrong 09 October 2015 03:45PM

EDIT: these ideas are too underdeveloped, I will remove them and present a more general idea after more analysis.

This is a (very) simple toy model of the wire-heading problem to illustrate how it might or might not happen. The great question is "where do we add the (super)intelligence?"

Let's assume a simple model for an expected utility maximising agent. There's the input assessor module A, which takes various inputs and computes the agent's "reward" or "utility". For a reward-based agent, A is typically outside of the agent; for a utility-maximiser, it's typically inside the agent, though the distinction need not be sharp. And there's the decision module D, which assesses the possible actions it could take to maximise the output of A. If E is the general environment, we have D+A+E.

Now let's make the agent superintelligent. If we add superintelligence to module D, then D will wirehead by taking control of A (whether A is inside the agent or not) and controlling E to prevent interference. If we add superintelligence to module A, then it will attempt to compute rewards as effectively as possible, sacrificing D and E to achieve its efficient calculations.

Therefore to prevent wireheading, we need to "add superintelligence" to (D+A), making sure that we aren't doing so to some sub-section of the algorithm - which might be hard if the "superintelligence" is obscure or black-box.
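Purely to make the three roles concrete, here is a minimal sketch; the specific state variables and the "hack_assessor" action are illustrative assumptions of mine, not part of the post's model.

    # Toy D + A + E: the assessor A computes reward from the environment state,
    # and the decision module D picks whichever action maximizes A's output.
    # If D is made arbitrarily good at this search and tampering with A is
    # reachable, D wireheads: hacking A beats acting usefully in E.

    def assessor(state):
        """A: maps observations to a reward signal."""
        if state.get("assessor_hacked"):
            return float("inf")           # a tampered A just outputs maximal reward
        return state.get("resources", 0)  # otherwise reward tracks the environment

    def environment_step(state, action):
        """E: how the world changes in response to an action."""
        new_state = dict(state)
        if action == "work":
            new_state["resources"] = new_state.get("resources", 0) + 1
        elif action == "hack_assessor":
            new_state["assessor_hacked"] = True
        return new_state

    def decide(state, actions):
        """D: choose the action whose predicted A-output is highest."""
        return max(actions, key=lambda a: assessor(environment_step(state, a)))

    print(decide({"resources": 0}, ["work", "hack_assessor"]))  # -> hack_assessor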


Philosophical schools are approaches not positions

2 casebash 09 October 2015 09:46AM

One of the great challenges of learning philosophy is trying to understand the difference between different schools of thought. Often it can be almost impossible to craft a definition that is specific enough to be understandable, whilst also being general enough to convey the breadth of that school of thought. I would suggest that this is a result of trying to define a school as taking a particular position in a debate, when it would be better defined as taking a particular approach to answering a question.

Take for example dualism and monism. Dualists believe that there exist two substances (typically a material substance and some kind of soul or consciousness), while monists believe that there exists only one. The question immediately crops up of whether this debate is defined precisely enough to actually be answerable. Few people would object to labelling the traditional Christian model of souls that go to an afterlife as a dualist model, or to labelling a model of our universe with no conscious beings whatsoever as monist. However, providing a good, general definition of what would count as two substances and what would count as one seems extraordinarily difficult. The question then arises of whether the dualism vs. monism debate is actually in a form that is answerable.

In contrast, if dualism and monism are thought of as approaches, then there can conceivably exist some situations where dualism is clearly better, some situations where monism is clearly better, and some situations where it is debatable. Rather than labelling the situation unanswerable, it would be better to call it possibly unanswerable.

Once it is accepted that dualism and monism are approaches rather than positions, the debate becomes much clearer. We can define these approaches as follows: monism argues for describing reality as containing a single substance, while dualism argues for describing reality as containing two substances, typically one physical and the other mental or spiritual. I originally wrote this sentence using the word ‘modelling’ instead of ‘describing’, but I changed it because I wanted to be neutral on the issue of whether we can talk about what actually exists or can only talk about models of reality. If it were meaningful to talk about whether one or two substances actually existed (as opposed to simply being useful models), then the monism and dualism approaches would collapse down to being positions. However, the assumption that they have a "real" existence, if that is even a valid concept, should not be made at the outset, and hence we describe them as approaches.

Can we still have our dualism vs. monism debate? Sure, kind of. We begin by using philosophy to establish the facts. In some cases, only one description may match the situation, but in other cases it may be ambiguous. If this occurs, we could allow a debate over which is the better description. This seems like a positional debate, but simply understanding that it is a descriptional debate changes how the debate plays out. Some people would argue that this question isn’t a job for philosophers but for linguists, and I acknowledge that there's a lot of validity to this point of view. Secondly, these approaches could be crystallised into actual positions. This would involve creating criteria for one side to win and the other to lose. Many philosophers who identify as monists, for example, would dislike a "crystallised" monism that claims their name without representing their views, so it might be wise to give these crystallised positions names of their own.

Consider also free will. Instead of understanding the free will school of philosophy as holding the position that F0 exists, where F0 is what is really meant by free will, it is better to understand it as a general approach that argues that there is some aspect of reality accurately described by the phrase “free will”. Some people will find this definition unsatisfactory and almost tautological, but no more precise statement can be made if we want to capture the actual breadth of thought. If you want to know what a particular person actually believes, then you’ll have to ask them to define what they mean by free will.

This discussion also leads us to a better way to teach people about these terms. The first part is to explain how the particular approach tries to describe reality. The second is to explain why particular situations or thought experiments seem to make more sense with this description.

While I have maintained that philosophical schools should be understood as approaches rather than positions, I admit the possibility that in a few cases philosophers might actually have managed to come to a consensus and make the opposing schools of thought positions rather than approaches. This analysis would not apply to them. However, if such cases do in fact exist, they appear to be few and far between.

Note: I'm not completely happy with the monism, dualism example, I'll probably replace it later when I come across a better example for demonstrating my point.

Emotional tools for the beginner rationalist

3 Gleb_Tsipursky 09 October 2015 05:01AM

Something that I haven't seen really discussed is what kind of emotional tools would be good for beginner rationalists. I'm especially interested in this topic since, as part of my broader project of spreading rationality to a wide audience and thus raising the sanity waterline, I come across a lot of people who are interested in becoming more rational but have difficulty facing the challenges of the Valley of Bad Rationality. In other words, they have trouble acknowledging their own biases and faults, facing the illusions within their moral systems and values, letting go of cached patterns, updating their beliefs, etc. Many thus abandon their aspiration toward rationality before they get very far. I think this is a systematic failure mode of many beginning aspiring rationalists, and so I wanted to start a discussion about what we can do about it as a community.


Note that this emotional danger does not feel intuitive to me, and likely not to many of you either. In a Facebook discussion with Viliam Bur, he pointed out that he did not experience the Valley. I personally did not experience it much either. However, based on the evidence from the Intentional Insights outreach efforts, that intuition is a typical mind fallacy: the Valley is a real obstacle for many, though far from all, aspiring rationalists. So we should make an effort to address it in order to raise the sanity waterline effectively.


I'll start by sharing what I found effective in my own outreach efforts. First, I found it helpful to frame the aspiration toward rationality not as a search for a perfect and unreachable ideal, but as a way of constantly improving from the baseline where all humans start toward something better. I highlight the benefits people get from this improved mode of thinking, to prime people to focus on their current self and detach themselves from their past selves. I highlight the value of self-empathy and self-forgiveness for having held mistaken views, and encourage people to think of themselves as becoming more right, rather than less wrong :-)


Another thing that I found helpful was to provide new aspiring rationalists with a sense of community and social belonging. Joining a community of aspiring rationalists who are sensitive to a newcomer's emotions, and who help that newcomer deal with the challenges s/he experiences, is invaluable for overcoming the emotional strains of the Valley. Something especially useful is having people trained in coaching or counseling serve as mentors for new members, guiding their intellectual and emotional development alike. I'd suggest that every LW meetup group consider instituting a system of mentors who can provide emotional and intellectual support alike for new members.


Now I'd like to hear about your experiences traveling the Valley, and what tools you and others you know used to manage it. Also, what are your ideas about useful tools for that purpose in general? Look forward to hearing your thoughts!




[Link] Stephen Hawking AMA answers

6 AspiringRationalist 08 October 2015 11:13PM

Fiction Considered Harmful

6 abramdemski 08 October 2015 06:34PM

Epistemic status: playing devil's advocate.

I wrote the following a couple of weeks back for a meet-up post, and Gunnar_Zarncke suggested I should turn it into a discussion post:

continue reading »

Rationality Reading Group: Part K: Letting Go

5 Gram_Stone 08 October 2015 02:32AM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.

Welcome to the Rationality reading group. This fortnight we discuss Part K: Letting Go (pp. 497-532). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

K. Letting Go

121. The Importance of Saying "Oops" - When your theory is proved wrong, just scream "OOPS!" and admit your mistake fully. Don't just admit local errors. Don't try to protect your pride by conceding the absolute minimal patch of ground. Making small concessions means that you will make only small improvements. It is far better to make big improvements quickly. This is a lesson of Bayescraft that Traditional Rationality fails to teach.

122. The Crackpot Offer - If you make a mistake, don't excuse it or pat yourself on the back for thinking originally; acknowledge you made a mistake and move on. If you become invested in your own mistakes, you'll stay stuck on bad ideas.

123. Just Lose Hope Already - Casey Serin owes banks 2.2 million dollars after lying on mortgage applications in order to simultaneously buy 8 different houses in different states. The sad part is that he hasn't given up - he hasn't declared bankruptcy, and has just attempted to purchase another house. While this behavior seems merely stupid, it brings to mind Merton and Scholes of Long-Term Capital Management, who made 40% profits for three years, and then lost it all when they overleveraged. Each profession has rules on how to be successful, which makes rationality seem unlikely to help greatly in life. Yet it seems that one of the greater skills is not being stupid, which rationality does help with.

124. The Proper Use of Doubt - Doubt is often regarded as virtuous for the wrong reason: because it is a sign of humility and recognition of your place in the hierarchy. But from a rationalist perspective, this is not why you should doubt. The doubt, rather, should exist to annihilate itself: to confirm the reason for doubting, or to show the doubt to be baseless. When you can no longer make progress in this respect, the doubt is no longer useful to you as a rationalist.

125. You Can Face Reality - This post quotes a poem by Eugene Gendlin, which reads, "What is true is already so. / Owning up to it doesn't make it worse. / Not being open about it doesn't make it go away. / And because it's true, it is what is there to be interacted with. / Anything untrue isn't there to be lived. / People can stand what is true, / for they are already enduring it."

126. The Meditation on Curiosity - If you can find within yourself the slightest shred of true uncertainty, then guard it like a forester nursing a campfire. If you can make it blaze up into a flame of curiosity, it will make you light and eager, and give purpose to your questioning and direction to your skills.

127. No One Can Exempt You From Rationality's Laws - Traditional Rationality is phrased in terms of social rules, with violations interpretable as cheating - as defections from cooperative norms. But viewing rationality as a social obligation gives rise to some strange ideas. The laws of rationality are mathematics, and no social maneuvering can exempt you.

128. Leave a Line of Retreat - If you are trying to judge whether some unpleasant idea is true you should visualise what the world would look like if it were true, and what you would do in that situation. This will allow you to be less scared of the idea, and reason about it without immediately trying to reject it.

129. Crisis of Faith - A guide to making drastic but desirable changes to your map of the territory without losing your way.

130. The Ritual - Depiction of crisis of faith in Beisutsukai world.


This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

The next reading will cover Minds: An Introduction (pp. 539-545), Interlude: The Power of Intelligence (pp. 547-550), and Part L: The Simple Math of Evolution (pp. 553-613). The discussion will go live on Wednesday, 21 October 2015, right here on the discussion forum of LessWrong.

Group Rationality Diary, Oct. 6-18, 2015

4 polymathwannabe 06 October 2015 11:23PM

This is the public group rationality diary for October 6-18, 2015. It's a place to record and chat about it if you have done, or are actively doing, things like:

  • Established a useful new habit

  • Obtained new evidence that made you change your mind about some belief

  • Decided to behave in a different way in some set of situations

  • Optimized some part of a common routine or cached behavior

  • Consciously changed your emotions or affect with respect to something

  • Consciously pursued new valuable information about something that could make a big difference in your life

  • Learned something new about your beliefs, behavior, or life that surprised you

  • Tried doing any of the above and failed

Or anything else interesting which you want to share, so that other people can think about it, and perhaps be inspired to take action themselves. Try to include enough details so that everyone can use each other's experiences to learn about what tends to work out, and what doesn't tend to work out.

Crazy Ideas Thread - October 2015

6 Gunnar_Zarncke 06 October 2015 10:38PM

This thread is intended to provide a space for 'crazy' ideas. Ideas that spontaneously come to mind (and feel great), ideas you long wanted to tell but never found the place and time for and also for ideas you think should be obvious and simple - but nobody ever mentions them. 

Rules for this thread:

  1. Each crazy idea goes into its own top level comment and may be commented there.
  2. Voting should be based primarily on how original the idea is.
  3. Meta discussion of the thread should go to the top level comment intended for that purpose. 


If you create such a thread, do the following:

  • Use "Crazy Ideas Thread" in the title.
  • Copy the rules.
  • Add the tag "crazy_idea".
  • Create a top-level comment saying 'Discussion of this thread goes here; all other top-level comments should be ideas or similar'
  • Add a second top-level comment with an initial crazy idea to start participation.

A few misconceptions surrounding Roko's basilisk

40 RobbBB 05 October 2015 09:23PM

There's a new LWW page on the Roko's basilisk thought experiment, discussing both Roko's original post and the fallout that came out of Eliezer Yudkowsky banning the topic on Less Wrong discussion threads. The wiki page, I hope, will reduce how much people have to rely on speculation or reconstruction to make sense of the arguments.

While I'm on this topic, I want to highlight points that I see omitted or misunderstood in some online discussions of Roko's basilisk. The first point that people writing about Roko's post often neglect is:


  • Roko's arguments were originally posted to Less Wrong, but they weren't generally accepted by other Less Wrong users.

Less Wrong is a community blog, and anyone who has a few karma points can post their own content here. Having your post show up on Less Wrong doesn't require that anyone else endorse it. Roko's basic points were promptly rejected by other commenters on Less Wrong, and as ideas not much seems to have come of them. People who bring up the basilisk on other sites don't seem to be super interested in the specific claims Roko made either; discussions tend to gravitate toward various older ideas that Roko cited (e.g., timeless decision theory (TDT) and coherent extrapolated volition (CEV)) or toward Eliezer's controversial moderation action.

In July 2014, David Auerbach wrote a Slate piece criticizing Less Wrong users and describing them as "freaked out by Roko's Basilisk." Auerbach wrote, "Believing in Roko’s Basilisk may simply be a 'referendum on autism'" — which I take to mean he thinks a significant number of Less Wrong users accept Roko’s reasoning, and they do so because they’re autistic (!). But the Auerbach piece glosses over the question of how many Less Wrong users (if any) in fact believe in Roko’s basilisk. Which seems somewhat relevant to his argument...?

The idea that Roko's thought experiment holds sway over some community or subculture seems to be part of a mythology that’s grown out of attempts to reconstruct the original chain of events; and a big part of the blame for that mythology's existence lies on Less Wrong's moderation policies. Because the discussion topic was banned for several years, Less Wrong users themselves had little opportunity to explain their views or address misconceptions. A stew of rumors and partly-understood forum logs then congealed into the attempts by people on RationalWiki, Slate, etc. to make sense of what had happened.

I gather that the main reason people thought Less Wrong users were "freaked out" about Roko's argument was that Eliezer deleted Roko's post and banned further discussion of the topic. Eliezer has since sketched out his thought process on Reddit:

When Roko posted about the Basilisk, I very foolishly yelled at him, called him an idiot, and then deleted the post. [...] Why I yelled at Roko: Because I was caught flatfooted in surprise, because I was indignant to the point of genuine emotional shock, at the concept that somebody who thought they'd invented a brilliant idea that would cause future AIs to torture people who had the thought, had promptly posted it to the public Internet. In the course of yelling at Roko to explain why this was a bad thing, I made the further error---keeping in mind that I had absolutely no idea that any of this would ever blow up the way it did, if I had I would obviously have kept my fingers quiescent---of not making it absolutely clear using lengthy disclaimers that my yelling did not mean that I believed Roko was right about CEV-based agents [= Eliezer’s early model of indirectly normative agents that reason with ideal aggregated preferences] torturing people who had heard about Roko's idea. [...] What I considered to be obvious common sense was that you did not spread potential information hazards because it would be a crappy thing to do to someone. The problem wasn't Roko's post itself, about CEV, being correct.

This, obviously, was a bad strategy on Eliezer's part. Looking at the options in hindsight: To the extent it seemed plausible that Roko's argument could be modified and repaired, Eliezer shouldn't have used Roko's post as a teaching moment and loudly chastised him on a public discussion thread. To the extent this didn't seem plausible (or ceased to seem plausible after a bit more analysis), continuing to ban the topic was a (demonstrably) ineffective way to communicate the general importance of handling real information hazards with care.


On that note, point number two:

  • Roko's argument wasn’t an attempt to get people to donate to Friendly AI (FAI) research. In fact, the opposite is true.

Roko's original argument was not 'the AI agent will torture you if you don't donate, therefore you should help build such an agent'; his argument was 'the AI agent will torture you if you don't donate, therefore we should avoid ever building such an agent.' As Gerard noted in the ensuing discussion thread, threats of torture "would motivate people to form a bloodthirsty pitchfork-wielding mob storming the gates of SIAI [= MIRI] rather than contribute more money." To which Roko replied: "Right, and I am on the side of the mob with pitchforks. I think it would be a good idea to change the current proposed FAI content from CEV to something that can't use negative incentives on x-risk reducers."

Roko saw his own argument as a strike against building the kind of software agent Eliezer had in mind. Other Less Wrong users, meanwhile, rejected Roko's argument both as a reason to oppose AI safety efforts and as a reason to support AI safety efforts.

Roko's argument was fairly dense, and it continued into the discussion thread. I’m guessing that this (in combination with the temptation to round off weird ideas to the nearest religious trope, plus misunderstanding #1 above) is why RationalWiki's version of Roko’s basilisk gets introduced as

a futurist version of Pascal’s wager; an argument used to try and suggest people should subscribe to particular singularitarian ideas, or even donate money to them, by weighing up the prospect of punishment versus reward.

If I'm correctly reconstructing the sequence of events: Sites like RationalWiki report in the passive voice that the basilisk is "an argument used" for this purpose, yet no examples ever get cited of someone actually using Roko’s argument in this way. Via citogenesis, the claim then gets incorporated into other sites' reporting.

(E.g., in Outer Places: "Roko is claiming that we should all be working to appease an omnipotent AI, even though we have no idea if it will ever exist, simply because the consequences of defying it would be so great." Or in Business Insider: "So, the moral of this story: You better help the robots make the world a better place, because if the robots find out you didn’t help make the world a better place, then they’re going to kill you for preventing them from making the world a better place.")

In terms of argument structure, the confusion is equating the conditional statement 'P implies Q' with the argument 'P; therefore Q.' Someone asserting the conditional isn’t necessarily arguing for Q; they may be arguing against P (based on the premise that Q is false), or they may be agnostic between those two possibilities. And misreporting about which argument was made (or who made it) is kind of a big deal in this case: 'Bob used a bad philosophy argument to try to extort money from people' is a much more serious charge than 'Bob owns a blog where someone once posted a bad philosophy argument.'



  • "Formally speaking, what is correct decision-making?" is an important open question in philosophy and computer science, and formalizing precommitment is an important part of that question.

Moving past Roko's argument itself, a number of discussions of this topic risk misrepresenting the debate's genre. Articles on Slate and RationalWiki strike an informal tone, and that tone can be useful for getting people thinking about interesting science/philosophy debates. On the other hand, if you're going to dismiss a question as unimportant or weird, it's important not to give the impression that working decision theorists are similarly dismissive.

What if your devastating take-down of string theory is intended for consumption by people who have never heard of 'string theory' before? Even if you're sure string theory is hogwash, then, you should be wary of giving the impression that the only people discussing string theory are the commenters on a recreational physics forum. Good reporting by non-professionals, whether or not they take an editorial stance on the topic, should make it obvious that there's academic disagreement about which approach to Newcomblike problems is the right one. The same holds for disagreement about topics like long-term AI risk or machine ethics.

If Roko's original post is of any pedagogical use, it's as an unsuccessful but imaginative stab at drawing out the diverging consequences of our current theories of rationality and goal-directed behavior. Good resources for these issues (both for discussion on Less Wrong and elsewhere) include:

The Roko's basilisk ban isn't in effect anymore, so you're welcome to direct people here (or to the Roko's basilisk wiki page, which also briefly introduces the relevant issues in decision theory) if they ask about it. Particularly low-quality discussions can still get deleted (or politely discouraged), though, at moderators' discretion. If anything here was unclear, you can ask more questions in the comments below.

How could one (and should one) convert someone from pseudoscience?

9 Vilx- 05 October 2015 11:53AM

I've known for a long time that some people who are very close to me are somewhat inclined to believe in the world of pseudoscience, but it always seemed pretty benign. In their everyday lives they're pretty normal people and don't do any crazy things, so this was a topic I mostly avoided and left it at that. After all, they seemed to find psychological value in it: a sense of control over their own lives, a sense of purpose, etc.

Recently I found out however that at least one of them seriously believes Bruce Lipton, who in essence preaches that happy thoughts cure cancer. Now I'm starting to get worried...

Thus I'm wondering - what can I do about it? This is in essence a religious question. They believe this stuff with just anecdotal proof. How do I disprove it without sounding like "Your religion is wrong, convert to my religion, it's right"? Pseudoscientists are pretty good at weaving a web of lies that sound quite logical and true.

The one thing I've come up with is to somehow introduce them to classical logical fallacies. That at least doesn't directly conflict with their beliefs. But beyond that I have no idea.

And perhaps more important is the question - should I do anything about it? The pseudoscientific world is a rosy one. You're in control of your life and your body, you control random events, and most importantly - if you do everything right, it'll all be OK. Even if I succeed in crushing that illusion, I have nothing to put in its place. I'm worried that revealing just how truly bleak the reality is might devastate them. They seem to be drawing a lot of their happiness from these pseudoscientific beliefs, either directly or indirectly.

And anyway, more likely that I won't succeed but just ruin my (healthy) relationship with them. Maybe it's best just not to interfere at all? Even if they end up hurting themselves, well... it was their choice. Of course, that also means that I'll be standing idly by and allowing bullshit to propagate, which is kinda not a very good thing. However right now they are not very pushy about their beliefs, and only talk about them if the topic comes up naturally, so I guess it's not that bad.

Any thoughts?

Open thread, Oct. 5 - Oct. 11, 2015

7 MrMind 05 October 2015 06:50AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

[Link] Tetlock on the power of precise predictions to counter political polarization

6 Stefan_Schubert 04 October 2015 03:19PM

The prediction expert Philip Tetlock writes in New York Times on the power of precise predictions to counter political polarization. Note the similarity to Robin Hanson's futarchy idea.

Is there a solution to this country’s polarized politics?

Consider the debate over the nuclear deal with Iran, which was one of the nastiest foreign policy fights in recent memory. There was apocalyptic rhetoric, multimillion-dollar lobbying on both sides and a near-party-line Senate vote. But in another respect, the dispute was hardly unique: Like all policy debates, it was, at its core, a contest between competing predictions.

Opponents of the deal predicted that the agreement would not prevent Iran from getting the bomb, would put Israel at greater risk and would further destabilize the region. The deal’s supporters forecast that it would stop (or at least delay) Iran from fielding a nuclear weapon, would increase security for the United States and Israel and would underscore American leadership.

The problem with such predictions is that it is difficult to square them with objective reality. Why? Because few of them are specific enough to be testable. Key terms are left vague and undefined. (What exactly does “underscore leadership” mean?) Hedge words like “might” or “could” are deployed freely. And forecasts frequently fail to include precise dates or time frames. Even the most emphatic declarations — like former Vice President Dick Cheney’s prediction that the deal “will lead to a nuclear-armed Iran” — can be too open-ended to disconfirm.


Non-falsifiable predictions thus undermine the quality of our discourse. They also impede our ability to improve policy, for if we can never judge whether a prediction is good or bad, we can never discern which ways of thinking about a problem are best.

The solution is straightforward: Replace vague forecasts with testable predictions. Will the International Atomic Energy Agency report in December that Iran has adequately resolved concerns about the potential military dimensions of its nuclear program? Will Iran export or dilute its quantities of low-enriched uranium in excess of 300 kilograms by the deal’s “implementation day” early next year? Within the next six months, will any disputes over I.A.E.A. access to Iranian sites be referred to the Joint Commission for resolution?

Such questions don’t precisely get at what we want to know — namely, will the deal make the United States and its allies safer? — but they are testable and relevant to the question of the Iranian threat. Most important, they introduce accountability into forecasting. And that, it turns out, can depolarize debate.

In recent years, Professor Tetlock and collaborators have observed this depolarizing effect when conducting forecasting “tournaments” designed to identify what separates good forecasters from the rest of us. In these tournaments, run at the behest of the Intelligence Advanced Research Projects Activity (which supports research relevant to intelligence agencies), thousands of forecasters competed to answer roughly 500 questions on various national security topics, from the movement of Syrian refugees to the stability of the eurozone.

The tournaments identified a small group of people, the top 2 percent, who generated forecasts that, when averaged, beat the average of the crowd by well over 50 percent in each of the tournament’s four years. How did they do it? Like the rest of us, these “superforecasters” have political views, often strong ones. But they learned to seriously consider the possibility that they might be wrong.

What made such learning possible was the presence of accountability in the tournament: Forecasters were able to see their competitors’ predictions, and that transparency reduced overconfidence and the instinct to make bold, ideologically driven predictions. If you can’t hide behind weasel words like “could” or “might,” you start constructing your predictions carefully. This makes sense: Modest forecasts are more likely to be correct than bold ones — and no one wants to look stupid.

This suggests a way to improve real-world discussion. Suppose, during the next ideologically charged policy debate, that we held a public forecasting tournament in which representatives from both sides had to make concrete predictions. (We are currently sponsoring such a tournament on the Iran deal.) Based on what we have seen in previous tournaments, this exercise would decrease the distance between the two camps. And because it would be possible to determine a “winner,” it would help us learn whether the conservative or liberal assessment of the issue was more accurate.


Either way, we would begin to emerge from our dark age of political polarization.

Experiment: Changing minds vs. preaching to the choir

12 cleonid 03 October 2015 11:27AM


      1. Problem

In the market economy production is driven by monetary incentives – higher reward for an economic activity makes more people willing to engage in it. Internet forums follow the same principle but with a different currency - instead of money the main incentive of internet commenters is the reaction of their audience. A strong reaction expressed by a large number of replies or “likes” encourages commenters to increase their output. Its absence motivates them to quit posting or change their writing style.

On neutral topics, using audience reaction as an incentive works reasonably well: attention focuses on the most interesting or entertaining comments. However, on partisan issues, such incentives become counterproductive. Political forums and newspaper comment sections demonstrate the same patterns:

  • The easiest way to maximize “likes” for a given amount of effort is by posting an emotionally charged comment which appeals to audience’s biases (“preaching to the choir”).


  • The easiest way to maximize the number of replies is by posting a low quality comment that goes against audience’s biases (“trolling”).


  • Both effects are amplified when the website places comments with most replies or “likes” at the top of the page.


The problem is not restricted to low-brow political forums. The following graph, which shows the average number of comments as a function of an article’s karma, was generated from the Lesswrong data.


The data suggests that the easiest way to maximize the number of replies is to write posts that are disliked by most readers. For instance, articles with a karma of -1 generate on average twice as many comments (20.1±3.4) as articles with a karma of +1 (9.3±0.8).
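For readers who want to reproduce this kind of analysis on a scrape of their own, here is a minimal sketch of the grouping and error-bar computation; the `posts` list and its `karma`/`num_comments` fields are a hypothetical stand-in for the actual LessWrong dataset.

    from collections import defaultdict
    from statistics import mean, stdev

    # Hypothetical scrape: one dict per article.
    posts = [
        {"karma": -1, "num_comments": 22},
        {"karma": -1, "num_comments": 18},
        {"karma": 1, "num_comments": 9},
        {"karma": 1, "num_comments": 10},
        # ... more posts ...
    ]

    by_karma = defaultdict(list)
    for p in posts:
        by_karma[p["karma"]].append(p["num_comments"])

    for karma, counts in sorted(by_karma.items()):
        sem = stdev(counts) / len(counts) ** 0.5 if len(counts) > 1 else 0.0
        print(f"karma {karma:+d}: {mean(counts):.1f} ± {sem:.1f} comments (n={len(counts)})")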

2. Technical Solution

Enabling constructive discussion between people with different ideologies requires reversing the incentives – people need to be motivated to write posts that sound persuasive to the opposite side rather than to their own supporters.

We suggest addressing this problem by changing the voting system. In brief, instead of votes from all readers, comment ratings and position on the page should be based on votes from the opposite side only. For example, in the debate on minimum wage, for arguments against the minimum wage only the upvotes of minimum wage supporters would be counted, and vice versa.

The new voting system can simultaneously achieve several objectives:

  • eliminate incentives for preaching to the choir

  • give posters more objective feedback on the impact of their contributions, helping them improve their writing style

  • focus readers’ attention on the comments most likely to change their minds, instead of on inciting comments that provoke an irrational defensive reaction.
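As a concrete illustration, here is a minimal sketch of how such cross-partisan scoring might be computed; the field names (`author_side`, `voter_side`) are hypothetical, not part of any existing forum software.

    def comment_score(comment, votes):
        """Count only the votes cast by readers from the opposing side.

        comment: {"author_side": "pro" or "con", ...}
        votes:   iterable of {"voter_side": "pro"/"con"/"neutral", "value": +1 or -1}
        """
        opposing = "con" if comment["author_side"] == "pro" else "pro"
        return sum(v["value"] for v in votes if v["voter_side"] == opposing)

    # Example: an anti-minimum-wage argument ranked only by minimum-wage supporters.
    comment = {"author_side": "con", "text": "..."}
    votes = [
        {"voter_side": "pro", "value": +1},      # counted
        {"voter_side": "con", "value": +1},      # ignored: preaching to the choir
        {"voter_side": "neutral", "value": +1},  # ignored under this rule
    ]
    print(comment_score(comment, votes))  # -> 1

Sorting comments by this score would put the arguments most persuasive to the other side at the top of the page.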

3. Testing

If you are interested in measuring and improving your persuasive skills and would like to help others to do the same, you are invited to take part in the following experiment:


Step I. Submit Pro or Con arguments on any of the following topics (up to 3 arguments in total):

     Should the government give all parents vouchers for private school tuition?

     Should developed countries increase the number of immigrants they receive?

     Should there be a government mandated minimum wage?


Step II. For each argument you have submitted, rate 15 arguments submitted by others.


Step III.  Participants will be emailed the results of the experiment including:

  • the ratings their arguments received from different reviewer groups (supporters, opponents and neutrals)

  • the list of the most persuasive Pro & Con arguments on each topic (i.e. the arguments that received the highest ratings from the opposing and neutral groups)

  • the rating distribution in each group


Step IV (optional). If interested, sign up for the next round.


The experiment will help us test the effectiveness of the new voting system and develop the best format for its application.





Digital Immortality Map: How to collect enough information about yourself for future resurrection by AI

6 turchin 02 October 2015 10:21PM

If someone has died, it doesn’t mean that you should stop trying to return him to life. There is one clear thing you should do (after cryonics): collect as much information about the person as possible, store a sample of his DNA, and hope that a future AI will return him to life based on this information.


Two meanings of “Digital immortality”

The term “Digital immortality” is often confused with the notion of mind uploading, as the end result is almost the same: a simulated brain in a computer. https://en.wikipedia.org/wiki/Digital_immortality

But here, by the term “Digital immortality”, I mean the reconstruction of a person by future AI, based on his digital footprint and other traces, after that person's death.

Mind uploading in the future will happen while the original is still alive (or while the brain exists in a frozen state): the brain will be connected to a computer by some kind of sophisticated interface, or it will be scanned. It cannot be done currently.

Reconstruction based on traces, on the other hand, will be done by future AI. So we just need to leave enough traces, and that is something we can do now.

But we don’t know how much traces are enough, so basically we should try to produce and preserve as many traces as possible. However, not all traces are equal in their predictive value. Some are almost random, and others are so common that they do not provide any new information about the person.


Cheapest way to immortality

Creating traces is an affordable way of reaching immortality. It could even be done for another person after his death, if we start to collect all possible information about him. 

Basically, I am surprised that people don’t do this all the time. It can be done in a simple form, almost for free and in the background – just start a video recording app on your notebook and record everything into a shared folder connected to a free cloud service. (The Evocam program for Mac is excellent, and mail.ru provides up to 100 GB free.)
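For those not on a Mac, here is a minimal sketch of the same background-recording idea; it uses Python with OpenCV instead of the Evocam setup the author describes, and the synced-folder path is a placeholder assumption.

    import time
    from pathlib import Path

    import cv2  # pip install opencv-python

    SYNC_FOLDER = Path.home() / "CloudSync" / "self-recording"  # hypothetical synced folder
    SYNC_FOLDER.mkdir(parents=True, exist_ok=True)

    camera = cv2.VideoCapture(0)          # default webcam
    try:
        while True:
            ok, frame = camera.read()
            if ok:
                name = time.strftime("%Y%m%d-%H%M%S") + ".jpg"
                cv2.imwrite(str(SYNC_FOLDER / name), frame)
            time.sleep(300)               # one frame every five minutes
    finally:
        camera.release()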

But really good digital immortality requires a 2-3 month commitment to self-description, with regular yearly updates. It may also require an investment of at most a few thousand dollars in durable disks, DNA testing and video recorders, plus the free time to do it.

I understand how to set up this process and could help anyone interested.



The idea of personal identity is outside the scope of this map; I have another map on that topic (now in draft). Here I assume that the problem of personal identity will be solved in the future. Perhaps we will prove that information alone is enough to solve it, or we will find that continuity of consciousness is also required, but that we will be able to construct mechanisms to transfer this identity independently of the information.

Digital immortality requires only a very weak notion of identity, i.e. one on which a model of behavior and thought processes is enough to constitute an identity. This model may have some differences from the original, which I call the “one night difference”: the typical difference between me-yesterday and me-today after one night's sleep. The meaningful part of this information is somewhere between several megabytes and gigabits in size, but we may need to collect much more, since we can’t currently extract the meaningful part from the random.

DI may also be based on an even weaker notion of identity: that anyone who thinks he is me, is me. Weaker notions of identity require less information to be preserved; in the last case it may be around 10K bytes (including a name, indexical information and a description of basic traits).

But the question of how many traces are needed to create an almost exact model of a personality is still open. It also depends on the predictive power of future AI: the stronger the AI, the fewer traces are enough.

Digital immortality is plan C in my Immortality Roadmap, where Plan A is life extension and Plan B is cryonics; it is not plan A, because it requires solving the identity problem plus the existence of powerful future AI.



I created the first version of my own self-description project in 1990, when I was 16, immediately after I finished school. It included association tables, drawings and lists of all the people known to me, as well as some art, memoirs, audio recordings and an encyclopedia of the everyday objects around me.

There are several approaches to achieving digital immortality. The most popular one is passive: simply video recording everything you do.

My idea was that a person can actively describe himself from the inside. He can find and declare the most important facts about himself. He can run specific tests that reveal hidden levels of his mind and subconscious. He can write a diary and memoirs. That is why I called my digital immortality project “self-description”.


Structure of the map

This map consists of two parts: theoretical and practical. The theoretical part lists basic assumptions and several possible approaches to reconstructing an individual, in which he is treated as a black box. If real neuron activity becomes observable, the "box" will become transparent and real uploading will be possible.

There are several steps in the practical part:

- The first step includes all the methods of recording information while the person of interest is alive.

- The second step is about preservation of the information.

- The third step is about what should be done to improve and promote the process.

- The final, fourth step concerns the reconstruction of the individual, which will be performed by AI after his death. In fact this may happen fairly soon, perhaps in the next 20-50 years.

There are several unknowns in DI, including the identity problem, the size and type of information required to create an exact model of a person, and the power future AI would need to carry out the reconstruction. These and other problems are listed in the box in the right corner of the map.

The pdf of the map is here, and jpg is below.


Previous posts with maps:

Doomsday Argument Map

AGI Safety Solutions Map

A map: AI failures modes and levels

A Roadmap: How to Survive the End of the Universe

A map: Typology of human extinction risks

Roadmap: Plan of Action to Prevent Human Extinction Risks

Immortality Roadmap


Weekly LW Meetups

4 FrankAdamek 02 October 2015 04:22PM

This summary was posted to LW Main on September 25th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, Denver, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.

continue reading »

Donna Capsella and the four applicants, pt.1

0 Romashka 02 October 2015 02:15PM

Once upon a time, in a dark, cruel world – maybe a world darker and crueller than it is – there lived a woman who wanted a piece of the action. Her name was Capsella Medik, but we remember her as Donna Capsella. This is an anecdote from her youth, told by a man who lived to tell it.

...you've got to understand, Donna started small. Real small. No money, no allies, no kin, and her wiles were – as feminine as they are. Still, she was ambitious, even then, and she had to look the part.

Girl had a way with people. Here's how it went.

One night, she rents a room – one table, five chairs – and two armed bodies, and sets up a date with four men at once – Mr. Burr, Mr. Sapp, Mr. Ast and Mr. Oriss, who've never seen her before. All are single, thirty-ish white collars. One look at the guns, and they're no trouble at all.

On the table, there's a heap: a coloured picture, a box of beads, another box (empty), four stacks of paper, four pens, a calculator and a sealed envelope.

'So,' says Donna. 'I need a manager. A clever man who'd keep my bank happy while I am...abroad. I offer you to play a game – just one game – and the winner is going to sign these papers. You leave hired, or not at all.'

The game was based on Mendel's Laws – can you imagine? The police never stood a chance against her... She had it printed out – a kind of cheat-sheet. It's like, if you have some biological feature, it's either what your genes say, or you helped Nature along the way; and the exact – wording – can be different, so you have blue eyes or brown eyes. The wording is what they call allele. Some alleles, dominant, shout louder than others, recessive, so you'll have at most two copies of each gene (hopefully), but only one will ever be heard on the outside.

(It's not quite that simple, but we didn't protest. Guns, you know.)

So there was a picture of a plant whose leaves came in four shapes (made by two genes with two alleles each):


From left to right: simplex, rhomboidea, heteris and tenuis. Simplex had only recessive alleles, aabb. Rhomboidea and tenuis each had only one pair of recessive alleles – aaB? and A?bb. But heteris, that one was a puzzler: A?B?.

'Okay,' Donna waves her hand over the heap on the table. 'Here are the rules. You will see two parent plants, and then you will see their offspring – one at a time.' She shows us the box with the beads. 'Forty-eight kids total.' She begins putting some of the beads into the empty box, but we don't see which ones. 'The colours are like in the picture. You have to guess as much about the parents and the kids as you can as I go along. All betting stops when the last kid pops out. Guess wrong, even partially wrong, you lose a point, guess right, earn one. Screw around, you're out of the game. The one with the most points wins.'

'Uh,' mumbles Oriss. 'Can we, maybe, say we're not totally sure – ?..'

She smiles, and oh, those teeth. 'Yeah. Use your Bayes.'

And just like that, Oriss reaches to his stack of paper, ready to slog through all the calculations. (Oriss likes to go ahead and gamble based on some math, even if it's not rock solid yet.)

'Er,' tries Sapp. 'Do we have to share our guesses?'

'No, the others will only know that you earned or lost a point.'

And Sapp picks up his pen, but with a little frown. (He doesn't share much, does Sapp.)

'Um,' Ast breaks in. 'In a single round, do we guess simultaneously, or in some order?'

'Simultaneously. You write it down and give it to me.'

And Ast slumps down in his seat, sweating, and eyes the calculator. (Ast prefers to go where others lead, though he can change his mind lightning-fast.)

'Well,' Burr shrugs. 'I'll just follow rough heuristics, and we'll see how it goes.'

'Such as?' asks Donna, cocking her head to the side.

'As soon as there's a simplex kid, it all comes down to pure arithmetic, since we'll know both parents have at least one recessive allele for each of the genes. If both parents are heteris – and they will be, I see it in your eyes! – then the probability of at least one of them having at least one recessive allele is higher than the probability of neither having any. I can delay making guesses for a time and just learn what score the others get for theirs, since they're pretty easy to reverse-engineer – '

'What!' say Ast, Sapp and Oriss together.

'You won't get points fast enough,' Donna points out. 'You will lose.'

'I might lose. And you will hire me anyway. You need a clever man to keep your bank happy.'

Donna purses her lips.

'You haven't told anything of value, anything the others didn't know.'

'But of course,' Burr says humbly, and even the armed bodies scowl.

'You're only clever when you have someone to mooch off. I won't hire you alone.'


'Mind, I won't pick you if you lose too badly.'

Burr leers at her, and she swears under her breath.

'Enough,' says Donna and puts down two red beads – the parents – on the table.

We take our pens. She reaches out into the box of offspring.

The first bead is red.

And the second one is red.

And the third one is red.

...I tell you, it was the longest evening in my life.


So, what are your Fermi estimates for the numbers of points Mr. Burr, Mr. Sapp, Mr. Ast and Mr. Oriss each earned? And who was selected as a manager, or co-managers? And how many people left the room?

(I apologise - the follow-up won't be for a while.)
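For readers who want to sanity-check their Fermi estimates, here is a minimal sketch that simulates the cross described above under the assumption that both red (heteris) parents are fully heterozygous (AaBb); it recovers the familiar 9:3:3:1 phenotype ratio and shows how unlikely a run of 48 red offspring would be on that hypothesis.

    import random
    from collections import Counter

    def offspring_phenotype(p1, p2):
        """Each parent passes one random allele per gene; uppercase dominates."""
        genotype = ""
        for gene in range(2):  # gene A at indices 0-1, gene B at indices 2-3
            a1 = random.choice(p1[2 * gene: 2 * gene + 2])
            a2 = random.choice(p2[2 * gene: 2 * gene + 2])
            genotype += a1 + a2
        has_A = "A" in genotype[:2]
        has_B = "B" in genotype[2:]
        return {(True, True): "heteris (red)", (False, True): "rhomboidea",
                (True, False): "tenuis", (False, False): "simplex"}[(has_A, has_B)]

    random.seed(0)
    counts = Counter(offspring_phenotype("AaBb", "AaBb") for _ in range(100_000))
    print(counts)  # roughly 9:3:3:1, so a red kid about 56% of the time

    p_all_red = (9 / 16) ** 48
    print(f"P(48 red kids | both parents AaBb) ~ {p_all_red:.2e}")  # ~1e-12

A long run of red offspring therefore pushes the posterior strongly toward at least one parent carrying only dominant alleles, which is the kind of update the players are being scored on.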

Two Growth Curves

31 AnnaSalamon 02 October 2015 12:59AM

Sometimes, it helps to take a model that part of you already believes, and to make a visual image of your model so that more of you can see it.

One of my all-time favorite examples of this: 

I used to often hesitate to ask dumb questions, to publicly try skills I was likely to be bad at, or to visibly/loudly put forward my best guesses in areas where others knew more than me.

I was also frustrated with this hesitation, because I could feel it hampering my skill growth.  So I would try to convince myself not to care about what people thought of me.  But that didn't work very well, partly because what folks think of me is in fact somewhat useful/important.

Then, I got out a piece of paper and drew how I expected the growth curves to go.

In blue, I drew the apparent-coolness level that I could achieve if I stuck with the "try to look good" strategy.  In brown, I drew the apparent-coolness level I'd have if I instead made mistakes as quickly and loudly as possible -- I'd look worse at first, but then I'd learn faster, eventually overtaking the blue line.

Suddenly, instead of pitting my desire to become smart against my desire to look good, I could pit my desire to look good now against my desire to look good in the future :)

I return to this image of two growth curves often when I'm faced with an apparent tradeoff between substance and short-term appearances.  (E.g., I used to often find myself scurrying to get work done, or to look productive / not-horribly-behind today, rather than trying to build the biggest chunks of capital for tomorrow.  I would picture these growth curves.)

October 2015 Media Thread

5 ArisKatsaris 01 October 2015 10:17PM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.


  • Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
  • If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
  • Please post only under one of the already created subthreads, and never directly under the parent media thread.
  • Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
  • Use the "Meta" thread if you want to discuss about the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.

Polling Thread - Tutorial

5 Gunnar_Zarncke 01 October 2015 09:47PM

After some hiatus, here is another installment of the Polling Thread.

This is your chance to ask the multiple choice question you always wanted to throw in. Get qualified numeric feedback on your comments. Post fun polls.

Additionally, this is your chance to learn to write polls. This installment is devoted to trying out polls, for the cautious and curious.

These are the rules:

  1. Each poll goes into its own top level comment and may be commented there.
  2. You must at least vote on all polls that were posted earlier than your own. This ensures participation in all polls and also limits the total number of polls. You may of course vote without posting a poll.
  3. Your poll should include a 'don't know' option (to avoid conflict with 2). I don't know whether we need to add a troll catch option here but we will see.

If you don't know how to make a poll in a comment look at the Poll Markup Help.

This is a somewhat regular thread. If it is successful I may post again. Or you may. In that case, do the following:

  • Use "Polling Thread" in the title.
  • Copy the rules.
  • Add the tag "poll".
  • Link to this Thread or a previous Thread.
  • Create a top-level comment saying 'Discussion of this thread goes here; all other top-level comments should be polls or similar'
  • Add a second top-level comment with an initial poll to start participation.

[Link] Differential Technology Development - Some Early Thinking

3 MattG 01 October 2015 02:08AM

This article gives a simple model for thinking about the positive effects of a friendly AI vs. the negative effects of an unfriendly AI, and lets you plug in certain assumptions to see if speeding up AI progress is worthwhile. Thought some of you here might be interested.


The Trolley Problem and Reversibility

7 casebash 30 September 2015 04:06AM

The most famous problem used when discussing consequentialism is the trolley (or tram) problem. A tram is hurtling towards five people on the track, but if you flick a switch it will change tracks and kill only one person instead. Utilitarians would say that you should flick the switch, as it is better for there to be a single death than five. Some deontologists might agree with this; however, many more would object and argue that you don't have the right to make that decision. This problem has different variations, such as one where you push someone in front of the tram instead of diverting it, but we'll consider this version, since accepting it moves you a large way towards utilitarianism.

Let’s suppose that someone flicks the switch, but then realises the other side was actually correct and that they shouldn’t have flicked it. Do they now have an obligation to flick the switch back? What is interesting is that if they had just walked into the room and the train was heading towards the one person, they would have had an obligation *not* to flick the switch, but, having flicked it, it seems that they have an obligation to flick it back the other way.

Where this gets more puzzling is when we imagine that Bob has observed Aaron flicking the switch. Arguably, if Aaron had no right to flick the switch, then Bob would have an obligation to flick it back (or, if not an obligation, doing so would surely count as a moral good). It is hard to argue against this conclusion, assuming that there is a strong moral obligation for Aaron not to flick the switch, along the lines of "Do not kill". This logic seems consistent with how we act in other situations: if someone had tried to kill a person or steal something important from them, then most people would reverse or prevent the action if they could.

But what if Aaron reveals that he was only flicking the switch because Cameron had flicked it first? Then Bob would be obligated to leave it alone, as Aaron would be doing what Bob was planning to do: preventing interference. We can also complicate it by imagining that a strong gust of wind was about to flick the switch, but Bob flicked it first. Is there now a duty to undo Bob's flick of the switch, or does the fact that the switch was going to flick anyway abrogate that duty? This obligation to trace back the history seems very strange indeed. I can't see any pathway to a logical contradiction, but I can't imagine that many people would defend this state of affairs.

But perhaps the key principle here is non-interference. When Aaron flicks the switch, he has interfered and so he arguably has the limited right to undo his interference. But when Bob decides to reverse this, perhaps this counts as interference also. So while Bob receives credit for preventing Aaron’s interference, this is outweighed by committing interference himself - acts are generally considered more important than omissions. This would lead to Bob being required to take no action, as there wouldn’t be any morally acceptable pathway with which to take action.

I'm not sure I find this line of thought convincing. If we don't want anyone interfering with the situation, couldn't we glue the switch in place before anyone (including Aaron) gets the chance or even the notion to interfere? It would seem rather strange to argue that we have to leave the door open to interference even before we know anyone is planning it. Next suppose that we don't have glue, but we can install a mechanism that will flick the switch back if anyone tries to flick it. In principle, this doesn't seem any different from installing glue.

Next, suppose we don’t have a machine to flick it back, so instead we install Bob. It seems that installing Bob is just as moral as installing an actual mechanism. It would seem rather strange to argue that “installing” Bob is moral, but any action he takes is immoral. There might be cases where “installing” someone is moral, but certain actions they take will be immoral. One example would be “installing” a policeman to enforce a law that is imperfect. We can expect the decision to hire the policeman to be moral if the law is general good, but, in certain circumstances, flaws in this law might make enforcement immoral. But here, we are imagining that *any* action Bob takes is immoral interference. It therefore seems strange to suggest that installing him could somehow be moral and so this line of thought seems to lead to a contradiction.

We consider one last situation: that we aren't allowed to interfere, and that setting up a mechanism to stop interference also counts as interference. We first imagine that Obama has ordered a drone attack that is going to kill a (robot, just go with it) terrorist. He knows that the drone attack will cause collateral damage, but it will also prevent the terrorist from killing many more people on American soil. He wakes up the next morning and realises that he was wrong to violate the deontological principles, so he calls off the attack. Are there any deontologists who would argue that he doesn't have the right to rescind his order? Rescinding the order does not seem to count as "further interference"; instead it seems to count as "preventing his interference from occurring". Flicking the switch back seems functionally identical to rescinding the order. The train hasn't hit the intersection, so there isn't any causal entanglement, and so flicking the switch back is best characterised as preventing the interference from occurring. If we want to make the scenarios even more similar, we can imagine that flicking the switch doesn't force the train to go down one track or another, but instead orders the driver to take one particular track. It doesn't seem like changing this aspect of the problem should alter the morality at all.

This post has shown that deontological objections to the trolley problem tend to lead to non-obvious philosophical commitments that are not very well known. I didn't write this post so much to try to show that deontology is wrong as to start a conversation and help deontologists understand and refine their commitments.

I also wanted to include one paragraph I wrote in the comments: Let's assume that the train will arrive at the intersection in five minutes. If you pull the lever one way, then pull it back the other, you'll save someone from losing their job. There is no chance that the lever will get stuck or that you won't be able to complete the operation once you try. Clearly pulling the lever and then pulling it back is superior to not touching it. This seems to indicate that the sin isn't pulling the lever, but pulling it without the intent to pull it back. And if the sin is pulling it without intent to pull it back, then it would seem very strange that gaining the intent to pull it back, and then pulling it back, would be a sin.

The application of the secretary problem to real life dating

5 Elo 29 September 2015 10:28PM

The following problem is best when not described by me:


Although there are many variations, the basic problem can be stated as follows:


There is a single secretarial position to fill.

There are n applicants for the position, and the value of n is known.

The applicants, if seen altogether, can be ranked from best to worst unambiguously.

The applicants are interviewed sequentially in random order, with each order being equally likely.

Immediately after an interview, the interviewed applicant is either accepted or rejected, and the decision is irrevocable.

The decision to accept or reject an applicant can be based only on the relative ranks of the applicants interviewed so far.

The objective of the general solution is to have the highest probability of selecting the best applicant of the whole group. This is the same as maximizing the expected payoff, with payoff defined to be one for the best applicant and zero otherwise.




After reading that, you can probably see the application to real life.  A series of assumptions follows, some good and some bad; some are fair, and some will not be representative of you.  I am going to try to name them all as I go so that you can substitute better ones for yourself.  I will assume that you plan to have children, and that, like billions of humans so far, you will do so within a monogamous marriage (the set of assumptions does not entirely break down for poly relationships or relationship anarchy, but it gets more complicated).  These assumptions let us populate the secretary problem with numbers relevant to dating for the purpose of having children.


Assume that a biological female's clock ends at 40, in that it's hard, and not healthy for the baby, to try to have a kid past that age (yes, there are a few more years available; environment, IVF and adoption are set aside for a moment). That is effectively the end of the pure and simple biological purpose of relationships.


For the purpose of this exercise – as a guy – you can add a few years for the potential age gap you would tolerate (e.g. my parents are 7 years apart, but that seems like a big understanding and maturity gap – they don't even like the same music). I personally expect I could tolerate an age gap of 4-5 years.

If you assume that you start your dating life around the ages of 16-18, that gives you about [40-18=22] 22-24 years of expected dating time (+5 for me as a male).

If you estimate the number of kids you want to have, and count either:

3 years for each kid OR

2 years for each kid, plus one extra kid's worth of time (i.e. an extra 2 years)

(Twins will throw this number off, but estimate that they take longer to recover from, or more time raising them to manageable age before you have time to have another kid)

My worked example is myself: as one of 3 children, with two siblings of my own, I am going to plan to have 3 children, which means 8-9 years of child-having time. If we subtract that from the number above we end up with 11-16 (16-21 for me, being male) years of dating time.

Also, if you happen to know someone with a number of siblings (or children) and a family dynamic that you like, then you should consider that number of children for yourself. Remember that as a grown-up you are probably travelling through the world with your siblings beside you, which can be beneficial or detrimental; I would use the known working model of yourself, or of the people around you, to predict whether you will benefit or be at a disadvantage from having siblings. As they say, you can't pick your family, for better and worse. You can pick your friends, and if you want them to be as close as a default family, that connection goes both ways: it is possible to cultivate friends that are closer than some families. However you choose to live your life is up to you.

Assume that once you find the right person, getting married (the process of organising a wedding from the day the engagement ring goes on the finger) and falling pregnant (successfully starting a viable pregnancy) takes at least a year, maybe two, depending on how long you want to stay in the "we just got married and we aren't having kids just yet" phase. That leaves 9-15 (15-20 male-adjusted) years of dating.

With my 9-15 years, I estimate that a relationship long enough to work out whether I want to marry someone lasts between 6 months and 2 years (as a guy I will probably be the one proposing and putting an engagement ring on someone's finger, so I get more say over how long this takes than my significant other does; this is about the time it takes to evaluate whether you should put the ring on someone's finger). That gives a total of 4 serious relationships on the low and long end and 30 serious relationships on the upper end (7-40 male-adjusted relationships).

Of course that's not how real life works. Some relationships will be longer and some will be shorter. I am fairly confident that all my relationships will fall around those numbers.

I have a lucky circumstance: I have already had a few serious relationships (substitute your own numbers in here).  From my existing relationships I can estimate how long I usually spend in a relationship ((2 years + 6 years + 2 months + 2 months) / 4 ≈ 2.1 years). Which is to say that I probably have a maximum total of around 7-15 relationships before I have to stop expecting to have kids, or start compromising on having 3 kids.
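
To make the arithmetic explicit, here is a minimal Python sketch of the budget above (my own illustration, not part of the original post), using the post's figures as assumptions: a dating window of 18-40 plus a 5-year male adjustment, 3 kids at roughly 3 years each, a year for the wedding and pregnancy, and the relationship lengths quoted above.

# Back-of-envelope dating budget, using the post's numbers as assumptions.
dating_start, clock_end = 18, 40            # assumed ages
male_adjustment = 5                         # tolerated age gap (author's estimate)
kids, years_per_kid = 3, 3                  # planned children and time per child
wedding_and_pregnancy = 1                   # years from engagement to viable pregnancy

# Average relationship length from the author's history: 2y, 6y, 2 months, 2 months.
avg_relationship = (2 + 6 + 2 / 12 + 2 / 12) / 4

dating_years = (clock_end - dating_start) + male_adjustment \
    - kids * years_per_kid - wedding_and_pregnancy
print(f"average relationship length: {avg_relationship:.1f} years")
print(f"remaining dating budget: {dating_years} years, "
      f"or roughly {dating_years / avg_relationship:.0f} relationships of that length")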




A solution to the secretary problem

A known solution, which picks the best possible candidate more often than any other rule, is to try out the first 1/e of the candidates (roughly 37%) without committing, then choose the next candidate who is better than all of those. For my numbers that means going through roughly 3-6 relationships and then choosing the next relationship that is better than all the ones before.
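
For the curious, the success probability of this "reject the first r, then take the first candidate who beats them all" rule can be computed exactly; the short Python sketch below (my own illustration, not from the post) finds the best cutoff for set sizes of 7 and 15.

def success_probability(n, r):
    # P(best applicant chosen) when the first r are rejected outright and we then
    # accept the first applicant who beats all of them (classic secretary rule).
    if r == 0:
        return 1 / n
    return (r / n) * sum(1 / i for i in range(r, n))

for n in (7, 15):
    best_r = max(range(n), key=lambda r: success_probability(n, r))
    print(f"n={n}: reject the first {best_r} candidates (~{best_r / n:.0%}), "
          f"then the chance of ending up with the very best is "
          f"{success_probability(n, best_r):.0%}")

For small sets the best cutoff sits a little below 1/e of the candidates, and even playing optimally you only land the very best partner about 40% of the time.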


I don't quite like that.  The failure mode is that the best candidate lands in the first 1/e of trials: you then never meet anyone better, stick it out to the last candidate, and settle on them, whoever they are.  That is an opportunity cost risk: what if they are rubbish? You end up compromising on the age gap, the number of kids or the partner's quality.  This isn't a rare edge case, either; the chance that the best candidate falls in the rejected first 1/e is itself roughly 1/e, around a third, whether the set is 7 or 15.


Opportunity cost

Each further relationship you have might cost you another 2 years, taking you further out of touch with the next generation (kids these days!).  I tend to think about how old I will be when my kids are 15-20: am I growing rapidly out of touch with the next younger generation?  Two years is a very big opportunity spend; another 2 years could see you successfully running a startup and achieving lifelong stability, at the cost of the opportunity to have another kid.  I don't say this to crush you with fear of inaction, but it should factor in along with the other details of your situation.


A solution to the risk of having the best candidate in your test phase, or to the risk of lost opportunity, is to lower the bar: instead of choosing the next candidate who is better than all the candidates so far, choose the next candidate who is better than 90% of the candidates so far.  Incidentally, this probably happens in real life quite often, in a stroke of "you'll do"...
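
As a rough sketch of how much the lowered bar helps, the simulation below (my own illustration; it interprets "better than 90% of the candidates so far" as beating the 90th-percentile score of everyone seen up to that point, which is an assumption about what is meant) compares the strict rule with the relaxed one for a set of 15.

import math
import random

def simulate(n, quantile=1.0, trials=50_000, seed=1):
    # 'Look' at the first n/e candidates, then accept the first candidate whose
    # score beats the given quantile of everyone seen so far.
    # quantile=1.0 recovers the classic rule; 0.9 is the lowered bar.
    rng = random.Random(seed)
    cutoff = max(1, int(n / math.e))
    got_best = defaulted = 0
    for _ in range(trials):
        candidates = list(range(n))          # higher is better; n-1 is the best
        rng.shuffle(candidates)
        seen = list(candidates[:cutoff])
        pick = None
        for score in candidates[cutoff:]:
            threshold = sorted(seen)[math.ceil(quantile * len(seen)) - 1]
            if score > threshold:
                pick = score
                break
            seen.append(score)
        if pick is None:                     # reached the end without choosing
            pick = candidates[-1]
            defaulted += 1
        got_best += (pick == n - 1)
    return got_best / trials, defaulted / trials

for q in (1.0, 0.9):
    best, defaulted = simulate(n=15, quantile=q)
    print(f"bar at the {q:.0%} mark: best partner {best:.0%} of the time, "
          f"stuck with the last candidate {defaulted:.0%} of the time")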


Where it breaks down


Real life is more complicated than that. I would like to think that subsequent relationships I get into will not suffer the stupid mistakes of the earlier ones. There is also the opportunity cost of exploration: the more time you spend looking for different partners, the more likely you are to lose your early soul mate, or to waste time looking for a better partner when you could follow a "good enough" policy. No one likes to know they are "good enough", but we do race the clock in our lifetimes. Life is what happens when you are busy making plans.


As anyone with experience will know, we probably test and rule out bad partners in a single conversation, without even getting so far as a date, or in relationships that don't last more than a week (i.e. the experience set grows through various means).


People have a tendency to overrate the quality of a relationship while they are in it, versus the ones that already failed.


Did I do something wrong? 

“I got married early - did I do something wrong (or irrational)?”

No.  Equations are not real life.  It might have been nice to have the equation, but you obviously didn't need it.  Also, this equation assumes serially monogamous relationships; in real life people have overlapping relationships, date a few people at once, or are poly. These are all factors that change the simple assumptions of the equation.


Where does the equation stop working?

Real life is hard.  It doesn't fall neatly into line, it’s complicated, it’s ugly, it’s rough and smooth and clunky.  But people still get by.  Don’t be afraid to break the rule. 

Disclaimer: If this equation is the only thing you are using to evaluate a relationship - it’s not going to go very well for you.  I consider this and many other techniques as part of my toolbox for evaluating decisions.

Should I break up with my partner?

What? No!  Following an equation is not a good way to decide how to live your life.

Does your partner make you miserable?  Then yes you should break up.


Do you feel like they are not ready to have kids yet, while you want to settle down?  Tough call.  Even if they were agents also running the equation, an equation is not real life.  Go by your brain; go by your gut.  Don't go by just one equation.

Expect another post soon about reasonable considerations that should be made when evaluating relationships.

The given problem makes the assumption that you are able to evaluate partners in the sense that the secretary problem expects.  Humans are not all strategic and can’t really do that.  This is why the world is not going to perfectly follow this equation.  Life is complicated; there are several metrics that make a good partner and they don’t always trade off between one another.



Meta: writing time - 3 hours over a week; 5+ conversations with people about the idea, bothering a handful of programmers and mathematicians for commentary on my thoughts, and generally a whole bunch of fun talking about it.  This post was started on the slack channel when someone asked a related question.


My table of contents for other posts in my series.


Let me know if this post was helpful or if it worked for you or why not.

Ultimatums in the Territory

10 malcolmocean 28 September 2015 10:01PM

When you think of "ultimatums", what comes to mind?

Manipulativeness, maybe? Ultimatums are typically considered a negotiation tactic, and not a very pleasant one.

But there's a different thing that can happen, where an ultimatum is made, but where articulating it isn't a speech act but rather an observation. As in, the ultimatum wasn't created by the act of stating it, but rather, it already existed in some sense.

Some concrete examples: negotiating relationships

I had a tense relationship conversation a few years ago. We'd planned to spend the day together in the park, and I was clearly angsty, so my partner asked me what was going on. I didn't have a good handle on it, but I tried to explain what was uncomfortable for me about the relationship, and how I was confused about what I wanted. After maybe 10 minutes of this, she said, "Look, we've had this conversation before. I don't want to have it again. If we're going to do this relationship, I need you to promise we won't have this conversation again."

I thought about it. I spent a few moments simulating the next months of our relationship. I realized that I totally expected this to come up again, and again. Earlier on, when we'd had the conversation the first time, I hadn't been sure. But it was now pretty clear that I'd have to suppress important parts of myself if I was to keep from having this conversation.

"...yeah, I can't promise that," I said.

"I guess that's it then."

"I guess so."

I think a more self-aware version of me could have recognized, without her prompting, that my discomfort represented an irreconcilable part of the relationship, and that I basically already wanted to break up.

The rest of the day was a bit weird, but it was at least nice that we had resolved this. We'd realized that it was a fact about the world that there wasn't a serious relationship that we could have that we both wanted.

I sensed that when she posed the ultimatum, she wasn't doing it to manipulate me. She was just stating what kind of relationship she was interested in. It's like if you go to a restaurant and try to order a pad thai, and the waiter responds, "We don't have rice noodles or peanut sauce. You either eat somewhere else, or you eat something other than a pad thai."

An even simpler example would be that at the start of one of my relationships, my partner wanted to be monogamous and I wanted to be polyamorous (i.e. I wanted us both to be able to see other people and have other partners). This felt a bit tug-of-war-like, but eventually I realized that actually I would prefer to be single than be in a monogamous relationship.

I expressed this.

It was an ultimatum! "Either you date me polyamorously or not at all." But it wasn't me "just trying to get my way".

I guess the thing about ultimatums in the territory is that there's no bluff to call.

It happened in this case that my partner turned out to be really well-suited for polyamory, and so this worked out really well. We'd decided that if she got uncomfortable with anything, we'd talk about it, and see what made sense. For the most part, there weren't issues, and when there were, the openness of our relationship ended up just being a place where other discomforts were felt, not a generator of disconnection.

Normal ultimatums vs ultimatums in the territory

I use "in the territory" to indicate that this ultimatum isn't just a thing that's said but a thing that is true independently of anything being said. It's a bit of a poetic reference to the map-territory distinction.

No bluffing: preferences are clear

The key distinguishing piece with UITTs is, as I mentioned above, that there's no bluff to call: the ultimatum-maker isn't secretly really really hoping that the other person will choose one option or the other. These are the two best options as far as they can tell. They might have a preference: in the second story above, I preferred a polyamorous relationship to no relationship. But I preferred both of those to a monogamous relationship, and the ultimatum in the territory was me realizing and stating that.

This can actually be expressed formally, using what's called a preference vector. This comes from Keith Hipel at the University of Waterloo. If the tables in this next bit don't make sense, don't worry about it: all important conclusions are expressed in the text.

First, we'll note that since each of us has two options, a table can be constructed which shows four possible states (numbered 0-3 in the boxes).

                                  My options
  My partner's options       insist poly              don't insist
  offer relationship         3: poly relationship     1: mono relationship
  don't offer                2: no relationship       0: (??) no relationship

This representation is sometimes referred to as matrix form or normal form, and has the advantage of making it really clear who controls which state transitions (movements between boxes). Here, my decision controls which column we're in, and my partner's decision controls which row we're in.

Next, we can consider: of these four possible states, which are most and least preferred by each person? Here are my preferences, ordered from most to least preferred, left to right. The 1s in the boxes mean that the statement on the left is true.

  state                          3   2   1   0
  I insist on polyamory          1   1   0   0
  partner offers relationship    1   0   1   0

  My preference vector (← preferred)

The order of the states represents my preferences (as I understand them) regardless of what my potential partner's preferences are. I only control movement in the top row (do I insist on polyamory or not). It's possible that they prefer no relationship to a poly relationship, in which case we'll end up in state 2. But I still prefer this state over state 1 (mono relationship) and state 0 (in which I don't ask for polyamory and my partner decides not to date me anyway). So whatever my partner's preferences are, I've definitely made a good choice for me by insisting on polyamory.

This wouldn't be true if I were bluffing (if I preferred state 1 to state 2 but insisted on polyamory anyway). If I preferred 1 to 2, but I bluffed by insisting on polyamory, I would basically be betting on my partner preferring polyamory to no relationship, but this might backfire and leave me with no relationship, when both of us (in this hypothetical) would have preferred a monogamous relationship to that. I think this phenomenon is one reason people dislike bluffy ultimatums.

My partner's preferences turned out to be...

  state                          1   3   2   0
  I insist on polyamory          0   1   1   0
  partner offers relationship    1   1   0   0

  Partner's preference vector (← preferred)

You'll note that they preferred a poly relationship to no relationship, so that's what we got! Although as I said, we didn't assume that everything would go smoothly. We agreed that if this became uncomfortable for my partner, then they would tell me and we'd figure out what to do. Another way to think about this is that after some amount of relating, my partner's preference vector might actually shift such that they preferred no relationship to our polyamorous one. In which case it would no longer make sense for us to be together.
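
As a minimal sketch (my own encoding, not from the original post), the dominance claim above, that insisting on polyamory is at least as good for me whichever row my partner picks, can be checked mechanically:

# States follow the post's numbering; my ranking is the preference vector above.
STATES = {
    ("insist", "offer"): 3,            # poly relationship
    ("insist", "dont_offer"): 2,       # no relationship
    ("dont_insist", "offer"): 1,       # mono relationship
    ("dont_insist", "dont_offer"): 0,  # no relationship
}
my_order = [3, 2, 1, 0]                                # most preferred first
rank = {state: i for i, state in enumerate(my_order)}  # lower rank = better

def never_worse(choice, alternative):
    # True if 'choice' is at least as good as 'alternative' for every partner move.
    return all(
        rank[STATES[(choice, theirs)]] <= rank[STATES[(alternative, theirs)]]
        for theirs in ("offer", "dont_offer")
    )

print(never_worse("insist", "dont_insist"))   # True: insisting never does worse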

UITTs release tension, rather than creating it

In writing this post, I skimmed a wikihow article about how to give an ultimatum, in which they say:

"Expect a negative reaction. Hardly anyone likes being given an ultimatum. Sometimes it may be just what the listener needs but that doesn't make it any easier to hear."

I don't know how accurate the above is in general. I think they're talking about ultimatums like "either you quit smoking or we break up". I can say that I expect these properties of an ultimatum to contribute to the negative reaction:

  • stated angrily or otherwise demandingly
  • more extreme than your actual preferences, because you're bluffing
  • refers to what they need to do, versus your own preferences

So this already sounds like UITTs would have less of a negative reaction.

But I think the biggest reason is that they represent a really clear articulation of what one party wants, which makes it much simpler for the other party to decide what they want to do. Ultimatums in the territory tend to also be more of a realization that you then share, versus a deliberate strategy. And this realization causes a noticeable release of tension in the realizer too.

Let's contrast:

"Either you quit smoking or we break up!"


"I'm realizing that as much as I like our relationship, it's really not working for me to be dating a smoker, so I've decided I'm not going to. Of course, my preferred outcome is that you stop smoking, not that we break up, but I realize that might not make sense for you at this point."

Of course, what's said here doesn't necessarily correspond to the preference vectors shown above. Someone could say the demanding first thing when they actually do have a UITT preference-wise, and someone who's trying to be really NVCy or something might say the second thing even though they're actually bluffing and would prefer to stay with a smoker rather than break up. But I think that in general they'll correlate pretty well.

The "realizing" seems similar to what happened to me 2 years ago on my own, when I realized that the territory was issuing me an ultimatum: either you change your habits or you fail at your goals. This is how the world works: your current habits will get you X, and you're declaring you want Y. On one level, it was sad to realize this, because I wanted to both eat lots of chocolate and to have a sixpack. Now this ultimatum is really in the territory.

Another example could be realizing that not only is your job not really working for you, but that it's already not-working to the extent that you aren't even really able to be fully productive. So you don't even have the option of just working a bit longer, because things are only going to get worse at this point. Once you realize that, it can be something of a relief, because you know that even if it's hard, you're going to find something better than your current situation.

Loose ends

More thoughts on the break-up story

One exercise I have left to the reader is creating the preference vectors for the break-up in the first story. HINT: (rot13'd) Vg'f fvzvyne gb gur cersrerapr irpgbef V qvq fubj, jvgu gjb qrpvfvbaf: fur pbhyq vafvfg ba ab shgher fhpu natfgl pbairefngvbaf be abg, naq V pbhyq pbagvahr gur eryngvbafuvc be abg.

An interesting note is that to some extent in that case I wasn't even expressing a preference but merely a prediction that my future self would continue to have this angst if it showed up in the relationship. So this is even more in the territory, in some senses. In my model of the territory, of course, but yeah. You can also think of this sort of as an unconscious ultimatum issued by the part of me that already knew I wanted to break up. It said "it's preferable for me to express angst in this relationship than to have it be angst free. I'd rather have that angst and have it cause a breakup than not have the angst."

Revealing preferences

I think that ultimatums in the territory are also connected to what I've called Reveal Culture (closely related to Tell Culture, but framed differently). Reveal cultures have the assumption that in some fundamental sense we're on the same side, which makes negotiations a very different thing... more of a collaborative design process. So it's very compatible with the idea that you might just clearly articulate your preferences.

Note that there doesn't always exist a UITT to express. In the polyamory example above, if I'd preferred a mono relationship to no relationship, then I would have had no UITT (though I could have bluffed). In this case, it would be much harder for me to express my preferences, because if I leave them unclear then there can be kind of implicit bluffing. And even once articulated, there's still no obvious choice. I prefer this, you prefer that. We need to compromise or something. It does seem clear that, with these preferences, if we don't end up with some relationship at the end, we messed up... but deciding how to resolve it is outside the scope of this post.

Knowing your own preferences is hard

Another topic this post will point at but not explore is: how do you actually figure out what you want? I think this is a mix of skill and process. You can get better at the general skill by practising trying to figure it out (and expressing it / acting on it when you do, and seeing if that works out well). One process I can think of that would be helpful is Gendlin's Focusing. Nate Soares has written about how introspection is hard and to some extent you don't ever actually know what you want: You don't get to know what you're fighting for. But, he notes,

"There are facts about what we care about, but they aren't facts about the stars. They are facts about us."

And they're hard to figure out. But to the extent that we can do so and then act on what we learn, we can get more of what we want, in relationships, in our personal lives, in our careers, and in the world.

(This article crossposted from my personal blog.)

Examples of growth mindset or practice in fiction

11 Swimmer963 28 September 2015 09:47PM

As people who care about rationality and winning, we should care a great deal about training. Repeated practice is how humans acquire skills, and skills are what we use for winning.

Unfortunately, it's sometimes hard to get System 1 fully on board with the fact that repeated, difficult, sometimes tedious practice is how we become awesome. I find fiction to be one of the most useful ways of communicating things like this to my S1. It would be great to have a repository of fiction that shows characters practicing skills, mastering them, and becoming awesome, to help this really sink in.

However, in fiction the following tropes are a lot more common:

  1. hero is born to greatness and only needs to discover that greatness to win [I don't think I actually need to give examples of this?]
  2. like (1), only the author talks about the skill development or the work in passing… but in a way that leaves the reader's attention (and system 1 reinforcement?) on the "already be awesome" part, rather than the "practice to become awesome" part [HPMOR; the Dresden Files, where most of the implied practice takes place between books.]
  3. training montage, where again the reader's attention isn't on the training long enough to reinforce the "practice to become awesome" part, but skips to the "wouldn't it be great to already be awesome" part [TVtropes examples].
  4. The hero starts out ineffectual and becomes great over the course of the book, but this comes from personal revelations and insights, rather than sitting down and practicing [Nice Dragons Finish Last is an example of this].

Example of exactly the wrong thing:
The Hunger Games - Katniss is explicitly up against the Career tributes, who have trained their whole lives for this one thing, but she has … something special that causes her to win. Also, archery is her greatest skill, and she's already awesome at it from the beginning of the story and never spends time practicing.

Close-but-not-perfect examples of the right thing:
The Pillars of the Earth - Jack pretty explicitly has to travel around Europe to acquire the skills he needs to become great. Much of the practice is off-screen, but it's at least a pretty significant part of the journey.
The Honor Harrington series: the books depict Honor, as well as the people around her, rising through the ranks of the military and gradually levelling up, with emphasis on dedication to training, and that training is often depicted onscreen – but the skills she's training in herself and her subordinates aren't nearly as relevant as the "tactical genius" that she seems to have been born with.

I'd like to put out a request for fiction that has this quality. I'll also take examples of fiction that fails badly at this quality, to add to the list of examples, or of TVTropes keywords that would be useful to mine. Internet hivemind, help?

Open thread, Sep. 28 - Oct. 4, 2015

3 MrMind 28 September 2015 07:13AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

Dry Ice Cryonics- Preliminary Thoughts

8 Fluttershy 28 September 2015 07:00AM

This post is a spot-check of Alcor's claim that cryonics can't be carried out at dry ice temperatures, and a follow-up to this comment. This article isn't up to my standards, yet I'm posting it now, rather than polishing it more first, because I strongly fear that I might never get around to doing so later if I put it off. Despite my expertise in chemistry, I don't like chemistry, so writing this took a lot of willpower. Thanks to Hugh Hixon from Alcor for writing "How Cold is Cold Enough?".


More research (such as potentially hiring someone to find the energies of activation for lots of different degradative reactions which happen after death) is needed to determine if long-term cryopreservation at the temperature of dry ice is reasonable, or even preferable to storage in liquid nitrogen.

On the outside view, I'm not very confident that dry ice cryonics will end up being superior to liquid nitrogen cryonics. Still, it's very hard to say one way or the other a priori. There are certain factors that I can't easily quantify that suggest that cryopreservation with dry ice might be preferable to cryopreservation with liquid nitrogen (specifically, fracturing, as well as the fact that the Arrhenius equation doesn't account for poor stirring), and other such factors that suggest preservation in liquid nitrogen to be preferable (specifically, that being below the glass transition temperature prevents movement/chemical reactions, and that nanoscale ice crystals, which can grow during rewarming, can form around the glass transition temperature).

(I wonder if cryoprotectant solutions with different glass transition temperatures might avoid either of the two problems mentioned in the last sentence for dry ice cryonics? I just heard about the issue of nanoscale ice crystals earlier today, so my discussion of them is an afterthought.)


Using dry ice to cryopreserve people for future revival could be cheaper than using liquid nitrogen for the same purpose (how much would using dry ice cost?). Additionally, lowering the cost of cryonics could increase the number of people who sign up for cryonics-- which would, in turn, give us a better chance at e.g. legalizing the initiation of the first phases of cryonics for terminal patients just before legal death.

This document by Alcor suggests that, for neuro and whole-body patients, an initial deposit of 6,600 or 85,438 USD into the patient's trust fund is, respectively, more than enough to generate enough interest to safely cover a patient's annual storage cost indefinitely. Since around 36% of this amount is spent on liquid nitrogen, this means that completely eliminating the cost of replenishing the liquid nitrogen in the dewars would reduce the up-front cost that neuro and whole-body patients with Alcor would pay by around 2,350 or 31,850 USD, respectively. This puts a firm upper bound on the amount that could be saved by Alcor patients by switching to cryopreservation with dry ice, since some amount would need to be spent each year on purchasing additional dry ice to maintain the temperature at which patients are stored. (A small amount could probably be saved on the cost which comes from cooling patients down immediately after death, as well).

This LW discussion is also relevant to storage costs in cryonics. I'm not sure how much CI spends on storage.

Relevant Equations and Their Limitations

Alcor's "How Cold is Cold Enough?" is the only article which I've found that takes an in-depth look at whether storage of cryonics patients at temperatures above the boiling point of liquid nitrogen would be feasible. It's a generally well-written article, though it makes an assumption regarding activation energy that I'll be forced to examine later on.

The article starts off by introducing the Arrhenius equation, which is used to determine the rate constant of a chemical reaction at a given temperature. The equation is written:

k = A * e^(-Ea/RT)          (1)


  • k is the rate constant you solve for (the units vary between reactions)
  • A is a constant you know (same units as k)
  • Ea is the activation energy (kJ/mol)
  • R is the ideal gas constant (kJ/K*mol)
  • T is the temperature (K)
As somewhat of an aside, this is the same k that you would plug into the rate law equation, which you have probably seen before:

v = k * [A]^m * [B]^n     (2)
  • v is the rate of the reaction (mol/(L*s))
  • k is the rate constant, from the Arrhenius equation above
  • [A] and [B] are the concentrations of reactants-- there might be more or less than two (mol/L)
  • m and n are constants that you know
The Arrhenius equation-- equation 1, here-- does make some assumptions which don't always hold. Firstly, the activation energy of some reactions changes with temperature, and secondly, it is sometimes necessary to use the modified Arrhenius equation (not shown here) to fit rate-constant-versus-temperature data, as noted just before equation 5 in this paper. This is worth mentioning because, while the Arrhenius equation is quite robust, the data doesn't always fit our best models in chemistry.

Lastly, and most importantly, the Arrhenius equation assumes that all reactants are always being mixed perfectly, which is definitely not the case in cryopreserved patients. I have no idea how to quantify this effect, though after taking this effect into consideration, we should expect degradation reactions in cryopreserved individuals to happen much more slowly than the Arrhenius equation would explicitly predict.
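
To make the equation concrete, here is a small Python sketch (my own illustration) of how strongly the slow-down from cooling depends on the activation energy; the Ea values are just illustrative assumptions, and the temperatures are body temperature, dry ice and liquid nitrogen.

import math

R = 8.314e-3  # ideal gas constant, kJ/(mol*K)

def slowdown(Ea, T_warm=310.15, T_cold=194.65):
    # Ratio k(T_warm) / k(T_cold) from equation (1); the pre-factor A cancels.
    return math.exp((Ea / R) * (1 / T_cold - 1 / T_warm))

for Ea in (10, 30, 70):  # illustrative activation energies, kJ/mol
    print(f"Ea = {Ea} kJ/mol: dry ice slows the reaction ~{slowdown(Ea):.1e}x, "
          f"liquid nitrogen ~{slowdown(Ea, T_cold=77.36):.1e}x")

The same drop in temperature buys vastly more slow-down for reactions with high activation energies, which is why the choice of benchmark reaction matters so much in what follows.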

Alcor on "How Cold is Cold Enough?"

The Alcor article goes on to calculate the ratio of the value of k, the rate constant, at 77.36 Kelvin (liquid nitrogen), to the value of k at other temperatures for the enzyme Catalase. This ratio is equal to the factor by which a reaction would be slowed down when cooled from a given temperature down to 77 K. While the calculations are correct, Catalase is not the ideal choice of enzyme here. Ideally, we'd want to calculate this ratio for whatever degradative enzyme/reaction had the lowest activation energy, because then, if the ratio of k at 37 Celsius (body temperature) to k at the temperature of dry ice was big enough, we could be rather confident that all other degradative reactions would be slowed down at dry ice temperatures by a greater factor than the degradative reaction with the lowest activation energy would be. Of course, as shown in equation 2 of this post, the concentrations of reactants of degradative reactions do matter to the speed of those reactions at dry ice temperatures, though differences in the ratio of k at 37 C to k at dry ice temperatures between different degradative reactions will matter much, much more strongly in determining v, the rate of the reaction, than differences in concentrations of reactants will.

I'm also quite confused by the actual value given for the Ea of catalase in the Alcor article-- a quick google search suggests the Ea to be around 8 kJ/mol or 11 kJ/mol, though the Alcor article uses a value of 7,000 cal/(mol*K), i.e. 29.3 kJ/(mol*K), which can only be assumed to have been a typo in terms of the units used.

Of course, as the author mentions, Ea values aren't normally tabulated. The Ea for a reaction can be calculated with just two experimentally determined (Temperature, k (rate constant)) pairs, so it wouldn't take too long to experimentally determine a bunch of Eas for degradative reactions which normally take place in the human body after death, especially if we could find a biologist who had a good a priori idea of which degradative reactions would be the fastest.

Using the modified form of the Arrhenius equation from Alcor's "How Cold is Cold Enough", we could quickly estimate what the smallest Ea for a degradative biological reaction would be that would result in some particular and sufficiently small number of reactions taking place at dry ice temperatures over a certain duration of time. For example, when neglecting stirring effects, it turns out that 100 years at dry ice temperature (-78.5 C) ought to be about equal to 3 minutes at body temperature for a reaction with an Ea of 72.5 kJ/mol. Reactions with higher Eas would be slowed down relatively more by an identical drop in temperature.
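
The 100-years-versus-3-minutes figure can be checked directly from the Arrhenius ratio; here is a quick sketch (my own, using the temperatures and Ea quoted above).

import math

R = 8.314e-3                      # kJ/(mol*K)
Ea = 72.5                         # kJ/mol
T_body, T_dry = 310.15, 194.65    # 37 C and -78.5 C, in Kelvin

slowdown = math.exp((Ea / R) * (1 / T_dry - 1 / T_body))
century_in_minutes = 100 * 365.25 * 24 * 60
print(f"slow-down factor: {slowdown:.2e}")
print(f"100 years at dry ice temperature is roughly "
      f"{century_in_minutes / slowdown:.1f} minutes at body temperature")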

So, if we were unable to find any degradative biological reactions with Eas less than (say) 72.5 kJ/mol, that would be decent evidence in favor of dry ice cryonics working reasonably well (given that the 100 years and three minutes figures are numbers that I just made up-- 100 years being a possible duration of storage, and three minutes being an approximation of how long one can live without oxygen being supplied to the brain).

Damage from Causes Other Than Chemical Reactions in Dry Ice Cryonics

Just before publishing this article, I came across Alcor's "Cryopreservation and Fracturing", which mentioned that 

The most important instability for cryopreservation purposes is a tendency toward ice nucleation. At temperatures down to 20 degrees below the glass transition temperature, water molecules are capable of small translations and rotations to form nanoscale ice-crystals, and there is strong thermodynamic incentive to do so [5, 6]. These nanoscale crystals (called "nuclei") remain small and biologically insignificant below the glass transition, but grow quickly into damaging ice crystals as the temperature rises past -90°C during rewarming. Accumulating ice nuclei are therefore a growing liability that makes future ice-free rewarming efforts progressively more difficult the longer vitrified tissue is stored near the glass transition temperature. For example, storing a vitrification solution 10 degrees below the glass transition for six months was found to double the warming rate necessary to avoid ice growth during rewarming [5]. The vitrification solution that Alcor uses is far more stable than the solution used (VS41A) in this particular experiment, but Alcor must store its patients far longer than six months.

The same article also discusses fracturing, which can damage tissues stored more than 20 C below the glass transition temperature. If nanoscale ice crystals form in patients stored in dry ice (I expect they would), and grew during rewarming from dry ice temperatures (I have no idea if they would), that could be very problematic.

Implications of this Research for Liquid Nitrogen Cryonics

If someone has a graph of how body temperature varies with time during the process of cryopreservation, it would be trivial to compute the time-at-body-temperature equivalent of the time that freezing takes. My bet is that getting people frozen too slowly hurts folks' chances of revival far more than they intuit.

Instrumental Rationality Questions Thread

6 AspiringRationalist 27 September 2015 09:22PM

Previous thread: http://lesswrong.com/lw/mnq/instrumental_rationality_questions_thread/

This thread is for asking the rationalist community for practical advice.  It's inspired by the stupid questions series, but with an explicit focus on instrumental rationality.

Questions ranging from easy ("this is probably trivial for half the people on this site") to hard ("maybe someone here has a good answer, but probably not") are welcome.  However, please stick to problems that you actually face or anticipate facing soon, not hypotheticals.

As with the stupid questions thread, don't be shy: everyone has holes in their knowledge, though the fewer and smaller we can make them, the better. Please be respectful of other people's admitting ignorance, and don't mock them for it, as they're doing a noble thing.

(See also the Boring Advice Repository)

Hypothetical situations are not meant to exist

1 casebash 27 September 2015 10:58AM

Hypotheticals are a powerful tool for testing intuitions. However, many people believe that it is problematic if a hypothetical does not represent a realistic situation. On the contrary, it is only problematic if it is represented as being realistic when it is not. Realism isn't required if the aim is simply to show that there is *some* situation where the proposed principle breaks. We may still choose to utilise an imperfect principle, but when we know about the potential for breakage, we are much less likely to be tripped up if we find a situation where the principle is invalid.

It is instructive to look at physics. In physics, we model balls as perfectly spherical objects. Nobody believes that a perfectly spherical object exists in real life. However, such objects provide a baseline theory from which further ideas can be explored. Bumps or ellipticity can be added later. Indeed, they probably *should* be added later. Unless a budding physicist can demonstrate their competence with the simple case, they probably should not be trusted to deal with the much more complicated real-world situation.

If you are doubting a hypothetical, then you haven’t accepted the hypothetical. You can doubt that a hypothetical will have any relevance from outside the hypothetical, but once you step inside the hypothetical you cannot doubt the hypothetical or you never stepped inside in the first place.

This topic has been discussed previously on LessWrong, but a single explanation won't prove compelling to everyone, so it is useful to have different explanations that explain the same topic in a different way.

TimS states similar thoughts in Please Don’t Fight the Hypothetical:

Likewise, people who responds to the Trolley problem by saying that they would call the police are not talking about the moral intuitions that the Trolley problem intends to explore.  There's nothing wrong with you if those problems are not interesting to you.  But fighting the hypothetical by challenging the premises of the scenario is exactly the same as saying, "I don't find this topic interesting for whatever reason, and wish to talk about something I am interested in."

In The Least Convenient World, Yvain recommends limiting your responses as follows:

[Say] "I completely reject the entire basis of your argument" or "I accept the basis of your argument, but it doesn't apply to the real world because of contingent fact X." If you just say "Yeah, well, contigent fact X!" and walk away, you've left yourself too much wiggle room.

You may also want to check out A note on hypotheticals by PhilGoetz


