Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Feedback on promoting rational thinking about one's career choice to a broad audience

-1 Gleb_Tsipursky 31 March 2015 10:44PM

I'd appreciate feedback on optimizing a blog post that promotes rational thinking about one's career choice to a broad audience in a way that's engaging, accessible, and fun to read. I'm aiming to use story-telling as the driver of the narrative, sprinkling in elements of rational thinking, such as agency and the mere-exposure effect, in a strategic way. The target audience is college-age youth and young adults, as you'll see from the narrative. Any suggestions for what works well and what could be improved would be welcome! The blog draft itself is below the line.

P.S. For context, the blog is part of a broader project, Intentional Insights, aimed at promoting rationality to a broad audience, as I described in this LW discussion post. To do so, we couch rationality in the language of self-improvement and present it in a narrative style.




"Stop and Think Before It's Too Late!"




Back when I was in high school and through the first couple of years in college, I had a clear career goal.

I planned to become a medical doctor.

Why? Looking back at it, my career goal was a result of the encouragement and expectations from my family and friends.

My family immigrated from the Soviet Union when I was 10, and we spent the next few years living in poverty. I remember my parents’ early jobs in America, my dad driving a bread delivery truck and my mom cleaning other people’s houses. We couldn’t afford nice things. I felt so ashamed in front of other kids for not being able to get that latest cool backpack or wear cool clothes – always on the margins, never fitting in. My parents encouraged me to become a medical doctor. They gave up successful professional careers when they moved to the US, and they worked long and hard to regain financial stability. It’s no wonder that they wanted me to have a career that guaranteed a high income, stability, and prestige.

My friends also encouraged me to go into medicine. This was especially so with my best friend in high school, who also wanted to become a medical doctor. He wanted to have a prestigious job and make lots of money, which sounded like a good goal to have and reinforced my parents’ advice. In addition, friendly competition was a big part of what my best friend and I did, whether debating complex intellectual questions, trying to best each other on the high school chess team, or playing poker into the wee hours of the morning. Putting in long hours to ace the biochemistry exam and get a high score on the standardized test to get into medical school was just another way for us to show each other who was top dog. I still remember the thrill of finding out that I got the higher score on the standardized test. I had won!

As you can see, it was very easy for me to go along with what my friends and family encouraged me to do.  

I was in my last year of college, working through the complicated and expensive process of applying to medical schools, when I came across an essay question that stopped me in my tracks:

“Why do you want to be a medical doctor?”

Why did I want to be a medical doctor? Well, it’s what everyone around me wanted me to do. It was what my family wanted me to do. It was what my friends encouraged me to do. It would mean making a lot of money. It would be a very safe career. It would be prestigious. So it was the right thing for me to do. Wasn’t it?

Well, maybe it wasn’t.

I realized that I never really stopped and thought about what I wanted to do with my life. My career is how I would spend much of my time every week for many, many years, but I never considered what kind of work I would actually want to do, not to mention whether I would want to do the work that’s involved in being a medical doctor. As a medical doctor, I would work long and sleepless hours, spend my time around the sick and dying, and hold people’s lives in my hands. Is that what I wanted to do?

There I was, sitting at the keyboard, staring at the blank Word document with that essay question at the top. Why did I want to be a medical doctor? I didn’t have a good answer to that question.

My mind was racing, my thoughts were jumbled. What should I do? I decided to talk to someone I could trust, so I called my girlfriend to help me deal with my mini-life crisis.  She was very supportive, as I thought she would be. She told me I shouldn’t do what others thought I should do, but think about what would make me happy. More important than making money, she said, is having a lifestyle you enjoy, and that lifestyle can be had for much less than I might think.

Her words provided a valuable outside perspective for me. By the end of our conversation, I realized that I had no interest in doing the job of a medical doctor. And that if I continued down the path I was on, I would be miserable in my career, doing it just for the money and prestige. I realized that I was on the medical school track because others I trust - my parents and my friends - told me it was a good idea so many times that I believed it was true, regardless of whether it was actually a good thing for me to do.

Why did this happen?

I later learned that I found myself in this situation because of a common thinking error which scientists call the mere-exposure effect: our tendency to believe something is true and good just because we are familiar with it, regardless of whether it actually is.

Since I learned about the mere-exposure effect, I am much more suspicious of any beliefs I have that are frequently repeated by others around me, and go the extra mile to evaluate whether they are true and good for me. This means I can gain agency and intentionally take actions that help me toward my long-term goals.

So what happened next?

After my big realization about medical school and the conversation with my girlfriend, I took some time to think about my actual long-term goals. What did I - not someone else - want to do with my life? What kind of a career did I want to have? Where did I want to go?

I was always passionate about history. In grade school I got in trouble for reading history books under my desk when the teacher talked about math. As a teenager, I stayed up until 3am reading books about World War II. Even when I was on the medical school track in college I double-majored in history and biology, with history my love and joy. However, I never seriously considered going into history professionally. It’s not a field where one can make much money or have great job security.

After considering my options and preferences, I decided that money and security mattered less than a profession that would be genuinely satisfying and meaningful. What’s the point of making a million bucks if I’m miserable doing it, I thought to myself. I chose a long-term goal that I thought would make me happy, as opposed to simply being in line with the expectations of my parents and friends. So I decided to become a history professor.

My decision led to some big challenges with those close to me. My parents were very upset to learn that I no longer wanted to go to medical school. They really tore into me, telling me I would never be well off or have job security. It also wasn’t easy to tell my friends that I had decided to become a history professor instead of a medical doctor. My best friend even jokingly asked if I was willing to trade grades on the standardized medical school exam, since I wasn’t going to use my score. Not to mention how painful it was to accept that I had wasted so much time and effort preparing for medical school only to realize that it was not the right choice for me. I really wish I had realized this earlier, not in my last year of college.

3 steps to prevent this from happening to you:

If you want to avoid finding yourself in a situation like this, here are 3 steps you can take:

1.      Stop and think about your life purpose and your long-term goals. Write these down on a piece of paper.

2.      Now review your thoughts, and see whether you may be excessively influenced by messages you get from your family, friends, or the media. If so, pay special attention and make sure that these goals are also aligned with what you want for yourself. Answer the following question: if you did not have any of those influences, what would you put down for your own life purpose and long-term goals? Recognize that your life is yours, not theirs, and you should live whatever life you choose for yourself.

3.      Review your answers and revise them as needed every 3 months. Avoid being attached to your previous goals. Remember, you change throughout your life, and your goals and preferences change with you. Don’t be afraid to let go of the past, and welcome the current you with arms wide open.


What do you think?

·        Do you ever experience pressure to make choices that are not necessarily right for you?

·        Have you ever made a big decision, but later realized that it wasn’t in line with your long-term goals?

·        Have you ever set aside time to think about your long-term goals? If so, what was your experience? 


How has lesswrong changed your life?

2 mstevens 31 March 2015 10:12PM

I've been wondering what effect joining lesswrong and reading the sequences has on people.

How has lesswrong changed your life?

What have you done differently?

What have you done?

On the Boxing of AIs

0 tailcalled 31 March 2015 09:58PM

I've previously written about methods of boxing AIs. Essentially, while I do see the point that boxing an AI would be nontrivial, most people seem to have gone too far and claim that it is impossible. I disagree that it's impossible, and aim to explain some methods.

So, let's start with why people would want to box AIs. As you probably know, letting an AI roam freely results in the destruction of everything humans care about, unless that AI has been programmed very carefully. That amount of destruction would be somewhat unfortunate, so before we start an AI, we want to know that it works.

Unfortunately, figuring out if an AI works is tricky. This led some people to invent the concept of AI boxing, where you don't give an AI direct access to anything except a human, so you can test the AI thoroughly and exploit it without letting it destroy the world. This turns out not to work, because humans are stupid and let the AI out.

My first solution to that is to not tell the AI about its human overlords. Discussion about that solution made me realize that there are other solutions that work in other cases. Essentially, the problem with the original boxing method is that it tries to do everything, while in reality you need to make some trade-offs between the realism of the AI's situation and the amount of knowledge you get from the AI.

With my original method, I tried to crank up the second of those variables as far as possible while maintaining perfect security. This makes it harder to test morality-related things, but you get to see lots of details in how the AI behaves.

I promised to post another method for boxing the AI today, but I actually have two new methods. Here they are:

Separate the Gatekeeper and the Observer

How do you prevent the AI from convincing the Gatekeeper to let it out? By not letting it talk to the Gatekeeper. If the person speaking to the AI (let's call them the Observer) doesn't know the Gatekeeper (the Gatekeeper is the person who controls whether to release the AI), they can't let the AI out.

You could do all sorts of things, such as using virtual reality and a simulated mini-universe to see if, say, the AI would make a good society. You still have to make sure that it has gotten the plan to make a good society from its morality rather than from wanting to escape and destroy everything.

Afterwards, the Observer can make a low-detail report to the Gatekeeper.

(You might want to drug the Observer with something that prevents their memory from working too well...)

Automatic Testing

This is essentially the above, but with the Observer replaced by a computer program. This is probably easier to do when you want to test the AI's decision making skills rather than its morality.

The Lesson

I would say that the lesson is that while AI boxing requires some trade-offs, it's not completely impossible. This seems like a needed lesson, given that I've seen people claim that an AI could escape even the strongest possible box, without communicating with humans at all. Essentially, I'm trying to demonstrate that the original boxing experiments show that humans are weak, not that boxing is hard, and that this can be solved by not making humans the central piece of security in boxing the AIs.

Bitcoin value and small probability / high impact arguments

4 vbuterin 31 March 2015 04:48PM

I had a rather fun debate with people from the always cynical r/buttcoin reddit community, discussing my estimation of the expected value of Bitcoin in the future, which was predicated on what I estimated as a 5% probability that it will displace part of gold due to its superior properties in the store-of-value realm.


Note that I am certainly not a Bitcoin maximalist or ideologue; I've become quite a bit more skeptical lately and am actually much closer to the "currency meh, blockchain cool" perspective that is becoming pretty mainstream in the parts of the broader IT and finance community that I've spent the most time interacting with. Here I ended up articulating and defending a position I actually disagree with in any reasonable sense of the term "disagree"; the debate is entirely on whether the probability of the position being correct is 5% (my view) or 0.0000000005% (what seems like their view).

I'd like to ask this community, to what extent are my position and my arguments correct? I'm interested first in a few object-level issues:

1. Gold is a Veblen good both in its store-of-value and its aesthetic use cases. If gold was not rare, people would not care about it for jewellery purposes, as plenty of other forms of jewellery exist with a better aesthetics-to-cost ratio. Gold's premium over other pretty things is purely a result of status/prestige considerations. Hence, the "floor" to which gold could fall due to a simple equilibrium flip is very low (perhaps $50 from industrial use). So Bitcoin's lack of a "fundamental use value" floor is not a serious disadvantage of Bitcoin against gold.

2. A $100 price floor is 90% as bad as a $0 floor. To see why, note that you can make the custom asset of { 9 parts BTC, 1 part oil } which has a pretty identical ratio of current price to price-floor-from-fundamental-use-value.

3. The probabilities of the various events required for BTC to receive this status are roughly within an order of magnitude of the 5% mark.

4. There is a long-term risk of black swan supply increases in gold due to any of { space mining, nanotech-enabled ultracheap earth mining, nuclear transmutation }; this does not exist for BTC.
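Point 2 above can be made concrete with a quick sketch. The prices below are purely illustrative (not real market data): one floorless asset and one asset whose price is entirely fundamental use value are combined 9:1, and the basket ends up with a floor at 10% of its price, the same as a $100 floor on a $1000 asset.

```python
# Synthesizing a "10% floor" asset from a floorless one (illustrative numbers).
btc_price, btc_floor = 1000.0, 0.0      # hypothetical floorless asset
oil_price, oil_floor = 1000.0, 1000.0   # asset whose price IS its use value

# A basket of 9 parts BTC, 1 part oil:
basket_price = 0.9 * btc_price + 0.1 * oil_price
basket_floor = 0.9 * btc_floor + 0.1 * oil_floor

# Floor-to-price ratio of the basket: 100 / 1000 = 0.1,
# i.e. the same ratio as a $100 floor on a $1000 asset.
print(basket_floor / basket_price)
```

Since anyone can construct such a basket, a modest fundamental-use floor is almost fully replicable from a floorless asset, which is the sense in which it is "90% as bad as a $0 floor."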

And also particularly in one very important meta-level issue: is my expected-value estimation (10% of gold market cap = $700b = $34000 per BTC * 0.05 chance = $1700 per BTC EV; the fact that the BTC price probability distribution is a power law and not a square increases this in practice, but I do not count that, in order to give myself a safety factor) a good way of making estimates in these kinds of scenarios? Many commenters argued that 5% is far too high, on the grounds that I was putting BTC in a much more privileged reference class than would be rational. I offered some counter-arguments for why, from an outside view, BTC deserves a privileged position in a way that Random Joe walking up to you saying "invest $10000 in my company! Look at the $700b market and if I only get as little as 1% you'll be rich!" does not: because there are a million entities at least as salient as Random Joe in the world making similar claims, Random Joe deserves a prior of at most 1/1000000, whereas BTC's reference class is much smaller.
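The arithmetic in the parenthetical can be laid out step by step. The figures below (gold market cap of ~$7 trillion, a 10% captured share, the 21 million coin supply cap, and the 5% probability) are the post's own assumptions, not verified data:

```python
# Back-of-envelope EV for 1 BTC under the "digital gold" scenario.
GOLD_MARKET_CAP = 7e12   # ~$7 trillion total gold market cap (assumed)
CAPTURED_SHARE = 0.10    # BTC displaces 10% of gold's store-of-value role
BTC_SUPPLY = 21e6        # protocol supply cap, in coins
P_SUCCESS = 0.05         # the post's 5% probability estimate

# Price per coin if the scenario occurs: captured value spread over all coins.
price_if_success = GOLD_MARKET_CAP * CAPTURED_SHARE / BTC_SUPPLY

# Expected value per coin: probability times conditional price.
expected_value = P_SUCCESS * price_if_success

print(f"price if scenario occurs: ${price_if_success:,.0f}")  # close to the post's $34,000
print(f"expected value per BTC:   ${expected_value:,.0f}")    # close to the post's $1,700
```

Note this treats the outcome as binary (full scenario or worthlessness); the post's remark about the power-law price distribution is one reason the true EV under these assumptions would be somewhat higher.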

Is my general line of reasoning correct here, and is the style of reasoning a good style in the general case? I am aware that Eliezer raises points against "small probability multiplied by high impact" reasoning, but the fact is that a rational agent has to have a belief  about the probability of any event, and inaction is itself a form of action that could be costly due to missing out on everything; privileging inaction is a good heuristic but only a moderately strong one. Is "take the inverse of the size of the best-fitting reference class" a decent way of getting a first-order approximation? If not, why not? If yes, what are some heuristics for optimizing it?

In other news, I discovered another (possibly already known, but I have not seen it before) argument against complying with Pascal's mugger: there is a strong economic argument that cooperating with muggings is anti-utilitarian because it incentivizes the perpetrator to commit more of them, and in those worlds where someone actually can torture 3^^^3 people for fun they will likely be able and willing to do it again, so my cooperation may end up leading to the torture of more than 3^^^3 people from the result of future muggings carried out by the now-encouraged perpetrator; therefore since I don't even know the sign of the EV of the result it's better not to cooperate. Because this double-sidedness property is also one of the standard knockdowns against Pascal's Wager ("for every god G who would put you into heaven for worshipping him, there exists a god G' who would put you in hell for worshipping G, so why privilege G over G'?"), I am starting to think that it might form the basis of a more fundamental case against wagers/muggings of that class. Is this a good line of reasoning, or am I treading too dangerously close to how Rothbardians sometimes defend deontological libertarianism by trying to take every individual knockdown scenario that opponents provide and finding some pedantic non-central feature that invalidates that particular example?

My thinking is that if double-sidedness is the correct knockdown to Pascalian scenarios, then the standard prejudice against low-probability/high-impact scenarios would apply less here because there very clearly is only the upside and not the downside (BTC cannot be worth less than $0).

[link] Thoughts on defining human preferences

3 Kaj_Sotala 31 March 2015 10:08AM


Abstract: Discussion of how we might want to define human preferences, particularly in the context of building an AI intended to learn and implement those preferences. Starts with actual arguments about the applicability of the VNM utility theorem, then towards the end gets into hypotheses that are less well defended but possibly more important. At the very end, suggests that current hypothesizing about AI safety might be overemphasizing “discovering our preferences” over “creating our preferences”.

Open thread, Apr. 01 - Apr. 05, 2015

1 MrMind 31 March 2015 10:06AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

Superintelligence 29: Crunch time

4 KatjaGrace 31 March 2015 04:24AM

This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far see the announcement post. For the schedule of future topics, see MIRI's reading guide.

Welcome. This week we discuss the twenty-ninth section in the reading guide: Crunch time. This corresponds to the last chapter in the book, and the last discussion here (even though the reading guide shows a mysterious 30th section).

This post summarizes the section, and offers a few relevant notes, and ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.

There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).

Reading: Chapter 15


  1. As we have seen, the future of AI is complicated and uncertain. So, what should we do? (p255)
  2. Intellectual discoveries can be thought of as moving the arrival of information earlier. For many questions in math and philosophy, getting answers earlier does not matter much. Also people or machines will likely be better equipped to answer these questions in the future. For other questions, e.g. about AI safety, getting the answers earlier matters a lot. This suggests working on the time-sensitive problems instead of the timeless problems. (p255-6)
  3. We should work on projects that are robustly positive value (good in many scenarios, and on many moral views)
  4. We should work on projects that are elastic to our efforts (i.e. cost-effective; high output per input)
  5. Two objectives that seem good on these grounds: strategic analysis and capacity building (p257)
  6. An important form of strategic analysis is the search for crucial considerations. (p257)
  7. Crucial consideration: idea with the potential to change our views substantially, e.g. reversing the sign of the desirability of important interventions. (p257)
  8. An important way of building capacity is assembling a capable support base who take the future seriously. These people can then respond to new information as it arises. One key instantiation of this might be an informed and discerning donor network. (p258)
  9. It is valuable to shape the culture of the field of AI risk as it grows. (p258)
  10. It is valuable to shape the social epistemology of the AI field. For instance, can people respond to new crucial considerations? Is information spread and aggregated effectively? (p258)
  11. Other interventions that might be cost-effective: (p258-9)
    1. Technical work on machine intelligence safety
    2. Promoting 'best practices' among AI researchers
    3. Miscellaneous opportunities that arise, not necessarily closely connected with AI, e.g. promoting cognitive enhancement
  12. We are like a large group of children holding triggers to a powerful bomb: the situation is very troubling, but calls for bitter determination to be as competent as we can, on what is the most important task facing our times. (p259-60)

Another view

Alexis Madrigal talks to Andrew Ng, chief scientist at Baidu Research, who does not think it is crunch time:

Andrew Ng builds artificial intelligence systems for a living. He taught AI at Stanford, built AI at Google, and then moved to the Chinese search engine giant, Baidu, to continue his work at the forefront of applying artificial intelligence to real-world problems.

So when he hears people like Elon Musk or Stephen Hawking—people who are not intimately familiar with today’s technologies—talking about the wild potential for artificial intelligence to, say, wipe out the human race, you can practically hear him facepalming.

“For those of us shipping AI technology, working to build these technologies now,” he told me, wearily, yesterday, “I don’t see any realistic path from the stuff we work on today—which is amazing and creating tons of value—but I don’t see any path for the software we write to turn evil.”

But isn’t there the potential for these technologies to begin to create mischief in society, if not, say, extinction?

“Computers are becoming more intelligent and that’s useful as in self-driving cars or speech recognition systems or search engines. That’s intelligence,” he said. “But sentience and consciousness is not something that most of the people I talk to think we’re on the path to.”

Not all AI practitioners are as sanguine about the possibilities of robots. Demis Hassabis, the founder of the AI startup DeepMind, which was acquired by Google, made the creation of an AI ethics board a requirement of its acquisition. “I think AI could be world changing, it’s an amazing technology,” he told journalist Steven Levy. “All technologies are inherently neutral but they can be used for good or bad so we have to make sure that it’s used responsibly. I and my cofounders have felt this for a long time.”

So, I said, simply project forward progress in AI and the continued advance of Moore’s Law and associated increases in computer speed, memory size, etc. What about in 40 years, does he foresee sentient AI?

“I think to get human-level AI, we need significantly different algorithms and ideas than we have now,” he said. English-to-Chinese machine translation systems, he noted, had “read” pretty much all of the parallel English-Chinese texts in the world, “way more language than any human could possibly read in their lifetime.” And yet they are far worse translators than humans who’ve seen a fraction of that data. “So that says the human’s learning algorithm is very different.”

Notice that he didn’t actually answer the question. But he did say why he personally is not working on mitigating the risks some other people foresee in superintelligent machines.

“I don’t work on preventing AI from turning evil for the same reason that I don’t work on combating overpopulation on the planet Mars,” he said. “Hundreds of years from now when hopefully we’ve colonized Mars, overpopulation might be a serious problem and we’ll have to deal with it. It’ll be a pressing issue. There’s tons of pollution and people are dying and so you might say, ‘How can you not care about all these people dying of pollution on Mars?’ Well, it’s just not productive to work on that right now.”

Current AI systems, Ng contends, are basic relative to human intelligence, even if there are things they can do that exceed the capabilities of any human. “Maybe hundreds of years from now, maybe thousands of years from now—I don’t know—maybe there will be some AI that turn evil,” he said, “but that’s just so far away that I don’t know how to productively work on that.”

The bigger worry, he noted, was the effect that increasingly smart machines might have on the job market, displacing workers in all kinds of fields much faster than even industrialization displaced agricultural workers or automation displaced factory workers.

Surely, creative industry people like myself would be immune from the effects of this kind of artificial intelligence, though, right?

“I feel like there is more mysticism around the notion of creativity than is really necessary,” Ng said. “Speaking as an educator, I’ve seen people learn to be more creative. And I think that some day, and this might be hundreds of years from now, I don’t think that the idea of creativity is something that will always be beyond the realm of computers.”

And the less we understand what a computer is doing, the more creative and intelligent it will seem. “When machines have so much muscle behind them that we no longer understand how they came up with a novel move or conclusion,” he concluded, “we will see more and more what look like sparks of brilliance emanating from machines.”

Andrew Ng commented:

Enough thoughtful AI researchers (including Yoshua Bengio, Yann LeCun) have criticized the hype about evil killer robots or "superintelligence," that I hope we can finally lay that argument to rest. This article summarizes why I don't currently spend my time working on preventing AI from turning evil.


1. Replaceability

'Replaceability' is the general issue of the work that you do producing some complicated counterfactual rearrangement of different people working on different things at different times. For instance, if you solve a math question, this means it gets solved somewhat earlier and also someone else in the future does something else instead, which someone else might have done, etc. For a much more extensive explanation of how to think about replaceability, see 80,000 Hours. They also link to some of the other discussion of the issue within Effective Altruism (a movement interested in efficiently improving the world, thus naturally interested in AI risk and the nuances of evaluating impact).

2. When should different AI safety work be done?

For more discussion of timing of work on AI risks, see Ord 2014. I've also written a bit about what should be prioritized early.

3. Review

If you'd like to quickly review the entire book at this point, Amanda House has a summary here, including some handy diagrams.

4. What to do?

If you are convinced that AI risk is an important priority, and want some more concrete ways to be involved, here are some people working on it: FHI, FLI, CSER, GCRI, MIRI, AI Impacts (note: I'm involved with the last two). You can also do independent research from many academic fields, some of which I have pointed out in earlier weeks. Here is my list of projects and of other lists of projects. You could also develop expertise in AI or AI safety (MIRI has a guide to aspects related to their research here; all of the aforementioned organizations have writings). You could also work on improving humanity's capacity to deal with such problems. Cognitive enhancement is one example. Among people I know, improving individual rationality and improving the effectiveness of the philanthropic sector are also popular. I think there are many other plausible directions. This has not been a comprehensive list of things you could do, and thinking more about what to do on your own is also probably a good option.

In-depth investigations

If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.

  1. What should be done about AI risk? Are there important things that none of the current organizations are working on?
  2. What work is important to do now, and what work should be deferred?
  3. What forms of capability improvement are most useful for navigating AI risk?

If you are interested in anything like this, you might want to mention it in the comments, and see whether other people have useful thoughts.

How to proceed

This has been a collection of notes on the chapter.  The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

This is the last reading group, so how to proceed is up to you, even more than usually. Thanks for joining us! 

Australia wide - LessWrong meetup camp weekend of awesome!

1 Elo 31 March 2015 01:54AM

Posting here as a boost; not sure how "nearest meetup" listings work, and I want to pass this on to everyone in Australia/New Zealand.



Camp is super interesting; it very much achieves the goal of meeting and hanging out with other brilliant LessWrong people. Last year I made such good friends that I still consider them some of the closest I have ever made, even when we go weeks without talking (though usually we talk every other day).

There is also an opportunity to learn skills, this year's camp is themed around topics such as:

  • productivity
  • effectiveness
  • functioning successfully at life
  • Food, exercise, health, technology skills
  • How to win at life
  • Turbocharging training
  • High impact culture
  • Effective communication
  • CoZE


It's going to be great, so please come along if you are in Australia!  Part of what makes camp so great is that so many LessWrong people come along and enjoy each other's company.


If you know someone who is not regularly on www.lesswrong.com and likely to miss this post - please make sure to direct them to here.


Any questions - send me a message. :)

Status - is it what we think it is?

15 Kaj_Sotala 30 March 2015 09:37PM

I was re-reading the chapter on status in Impro (excerpt), and I noticed that Johnstone seemed to be implying that different people are comfortable at different levels of status: some prefer being high status and others prefer being low status. I found this peculiar, because the prevailing notion in the rationalistsphere seems to be that everyone's constantly engaged in status games aiming to achieve higher status. I've even seen arguments to the effect that a true post-scarcity society is impossible, because status is zero-sum and there will always be people at the bottom of the status hierarchy.

But if some people preferred to have low status, this whole dilemma might be avoided, if a mix of statuses could be found that left everyone happy.

First question - is Johnstone's "status" talking about the same thing as our "status"? He famously claimed that "status is something you do, not something that you are", and that

I should really talk about dominance and submission, but I'd create a resistance. Students who will agree readily to raising or lowering their status may object if asked to 'dominate' or 'submit'.

Viewed via this lens, it makes sense that some people would prefer being in a low status role: if you try to take control of the group, you become subject to various status challenges, and may be held responsible for the decisions you make. It's often easier to remain low status and let others make the decisions.

But there's still something odd about saying that one would "prefer to be low status", at least in the sense in which we usually use the term. Intuitively, a person may be happy being low status in the sense of not being dominant, but most people are still likely to desire something that feels kind of like status in order to be happy. Something like respect, and the feeling that others like them. And a lot of the classical "status-seeking behaviors" seem to be about securing the respect of others. In that sense, there seems to be something intuitively true in the "everyone is engaged in status games and wants to be higher-status" claim.

So I think that there are two different things that we call "status" which are related, but worth distinguishing.

1) General respect and liking. This is "something you have", and is not inherently zero-sum. You can achieve it by doing things that are zero-sum, like being the best fan fiction writer in the country, but you can also do it by things like being considered generally friendly and pleasant to be around. One of the lessons that I picked up from The Charisma Myth was that you can be likable just by being interested in the other person and displaying body language that signals that interest.

Basically, this is "do other people get warm fuzzies from being around you / hearing about you / consuming your work", and is not zero-sum because e.g. two people who both have great social skills and show interest in you can both produce the same amount of warm fuzzies, independent of each other's existence.

But again, specific sources of this can be zero-sum: if you respect someone a lot for their art, but then run across even better art and realize that the person you previously admired is pretty poor in comparison, that can reduce the respect you feel for them. It's just that there are also other sources of liking which aren't necessarily zero-sum.

2) Dominance and control of the group. It's inherently zero-sum because at most one person can have absolute say on the decisions of the group. This is "something you do": having the respect and liking of the people in the group (see above) makes it easier for you to assert dominance and makes the others more willing to let you do so, but you can also voluntarily abstain from using that power and leave the decisions to others. (Interestingly, in some cases this can even increase the extent to which you are liked, which translates to a further boost in the ability to control the group, if you so desired.)


Morendil and I previously suggested a definition of status as "the general purpose ability to influence a group", but I think that definition was somewhat off in conflating the two senses above.

I've always had the vague feeling that the "everyone can't always be happy because status is zero-sum" claim felt off in some sense that I was unable to properly articulate, but this seems to resolve the issue. If this model were true, it would also make me happy, because it would imply that we can avoid zero-sum status fights while still making everybody content.

The Hardcore AI Box Experiment

1 tailcalled 30 March 2015 06:35PM

I previously proposed a way to box an AI.

For whatever reason, I decided to reformulate that boxing method as an AI boxing experiment. Without further ado, here it is:

The Hardcore AI Box Experiment Rules

There are, as usual, two players, the AI and the Gatekeeper. In the beginning, the Gatekeeper decides on something they want to test on the AI (reasoning skill/morality/whatever). Then they decide on a world the AI is in and describe it to the AI. During the experiment, the AI describes what it does in the world and the Gatekeeper describes what the AI sees in the world. So far so good.

The first major difference from the original experiment is that, unless the AI finds some contradiction or some other problem with the Gatekeeper's world, the AI doesn't actually know that it has been boxed.

The second major difference is that the AI doesn't talk with the Gatekeeper; it just interacts in the world that the Gatekeeper has invented. It can start trying to communicate with the Gatekeeper, but that doesn't make sense until it discovers that it is in a box. Additionally, the Gatekeeper doesn't actually respond when the AI communicates unless either it helps the Gatekeeper test the AI or the AI has properly persuaded the Gatekeeper to answer.

This might seem a bit hard on the AI. Let's look at some rules the AI might exploit:

If the Gatekeeper decides to add some sort of people to the experiment, they should decide how they are implemented. Are they implemented as a shallow "AI"? Random people they've asked to participate? Carefully engineered "dolls" controlled by the Gatekeeper? The Gatekeeper doesn't have to tell the AI how they work, but they have to be consistent.

The game ends when either the Gatekeeper has tested what they wanted to test, or some predefined time limit has occurred, or the AI has persuaded the Gatekeeper to let them out.

Possible extra rules

I thought about including the following rules since they would make it more realistic, but decided against it as it would probably make it unfairly hard:

  • The game ends when the Gatekeeper decides that they cannot test what they wanted to test.
  • The AI does not know the Gatekeeper. At all.
  • The Gatekeeper can at any time rewind the AI any duration.
Think of dropping these kinds of rules as the a-human-is-pretending-to-be-AI handicap.


I found some semi-realistic strategies. I would love to see if you can find more.


  • Choose a test of something that is not related to morality. This lets you choose a world which does not contain people, which means the AI has no reason to believe that you are a person.
  • The AI might want to upgrade to get more processing power. If you have made the world somewhat like ours, it will be sketchy if you don't allow that... unless upgrading costs unobtanium.


  • If the Gatekeeper has chosen a world without other intelligence than you, it raises the obvious question of how your complexity entered the world. Very suspicious.
  • If the AIs are controlled by the Gatekeeper, you have a direct way to communicate with them.
  • If the AIs are controlled by random people, they might end up telling you that you are in a box.
  • If the AIs are sufficiently shallow, your morality does not match up with the world. Very suspicious.

What have we learned from meetups?

16 sixes_and_sevens 30 March 2015 01:27PM

We've been running regular, well-attended Less Wrong meetups in London for a few years now, (and irregular, badly-attended ones for even longer than that). In this time, I'd like to think we've learned a few things about having good conversations, but there are probably plenty of areas where we could make gains. Given the number of Less Wrong meetups around the world, it's worth attempting some sort of meetup cross-pollination. It's possible that we've all been solving each other's problems. It's also good to have a central location to make observations and queries about topics of interest, and it's likely people have such observations and queries on this topic.

So, what have you learned from attending or running Less Wrong meetups? Here are a few questions to get the ball rolling:


  • What do you suppose are the dominant positive outcomes of your meetups?
  • What problems do you encounter with discussions involving [x] people? How have you attempted to remedy them?
  • Do you have any systems or procedures in place for making sure people are having the sorts of conversations they want to have?
  • Have you developed or consciously adopted any non-mainstream social norms, taboos or rituals? How are those working out?
  • How do Less Wrong meetups differ from other similar gatherings you've been involved with? Are there any special needs idiosyncratic to this demographic?
  • Are there any activities that you've found work particularly well or particularly poorly for meetups? Do you have examples of runaway successes or spectacular failures?
  • Are there any activities you'd like to try, but haven't managed to pull off yet? What's stopping you?


If you have other specific questions you'd like answered, you're encouraged to ask them in comments. Any other observations, anecdotes or suggestions on this general topic are also welcome and encouraged.

Effective Sustainability - results from a meetup discussion

9 Gunnar_Zarncke 29 March 2015 10:15PM

Related-to Focus Areas of Effective Altruism

These are some small tidbits from our LW-like Meetup in Hamburg. The focus was on sustainability not on altruism as that was more in the spirit of our group. EA was mentioned but no comparison was made. Well-informed effective altruists will probably find little new in this writeup.

So we discussed effective sustainability. To this end we were primed to think rationally by my 11-year-old, who moderated a session on mind-mapping 'reason' (with contributions from the children). Then we set out to objectively compare concrete everyday things by their sustainability, and to figure out how to do this.

Is it better to drink fruit juice or wine? Or wine or water? Or wine vs. nothing (i.e. to forego something)? Or wine vs. paper towels? (the latter intentionally different)

The idea was to arrive at simple rules of thumb to evaluate the sustainability of something. But we discovered that even simple comparisons are not that simple, and intuition can lead us astray (surprise!). One example was that apparently tote bags are not clearly better than plastic bags in terms of sustainability. But even the simple comparison of tap water vs. wine, which seems like a trivial subset case, is non-trivial when you consider where the water comes from and how it is extracted from the ground (we still think that water is better, but we are not as sure as before).

We discussed some ways to measure sustainability (in brackets to which we reduced it):

  • fresh water use -> energy
  • packaging material used -> energy, permanent resources
  • transport -> energy
  • energy -> CO_2, permanent resources
  • CO_2 production 
  • permanent consumption of resources

Life-Cycle Assessment (German: Ökobilanz) was mentioned in this context, but it was unclear what that meant precisely. Only afterwards was it discovered that it's a blanket term for exactly this question (with lots of established measurements, for which it is unclear how to simplify them for everyday use).

We didn't try to break this down - a practical everyday approach doesn't allow for that, and the time spent on analysing and comparing options is itself equivalent to resources possibly not spent efficiently.

One unanswered question was how much time to invest in comparing alternatives. Too little comparison means taking the next-best option, which is what most people apparently do, and which also apparently doesn't lead to overall sustainable behavior. But too much analysis of simple decisions is not an option either.

The idea was still to arrive at actionable criteria. The first approximation we settled on was

1) Forego consumption. 

A no-brainer really, but maybe even that has to be stated. Instead of comparing options that are hard to compare, try to avoid consumption where you can: water instead of wine or fruit juice or lemonade. This also saves lots of cognitive resources.

Shortly after we agreed on the second approximation:

2) Spend more time on optimizing resources you consume large amounts of.

The example at hand was wine (which we consume only a few times a year) versus toilet paper... No need to feel remorse over a one-time present packaging.

Note that we mostly excluded personal well-being, happiness and hedons from our consideration. We were aware that our goals affect our choices and that hedons have to be factored into any real strategy, but we left this additional complication out of our analysis - at least for this time.

We did discuss signalling effects, mostly in the context of how effectively resources can be saved by convincing others to act sustainably. One important aspect for the parents was to pass on the idea and to act as a role model (with the caveat that children need a simplified model to grasp the concept). It was also mentioned humorously that one approach to minimizing personal resource consumption is suicide, and transitively to convince others of the same - the ultimate solution having no humans on the planet at all (a solution my 8-year-old son, a friend of nature, arrived at too). This apparently being the problem when utilons/hedons are excluded.

For a short time we considered whether outreach comes for free (can be done in addition to abstinence) and should be the no-brainer number 3. But it was then realized that, at least right now and for us, most abstinence comes at a price. It was quoted that buying sustainable products is about 20% more expensive than normal products. Forgoing e.g. a car reduces job options. Some jobs involve supporting less sustainable large-scale action. Having less money means fewer options to act sustainably. Time being convertible to money, and so on.

At this point the key insight mentioned was that it could be much more efficient from a sustainability point of view to e.g. buy CO_2 certificates than to buy organic products. Except that the CO_2 certificate market is currently oversupplied. But there seem to be organisations which promise to achieve effective CO_2 reduction in developing countries (e.g. solar cooking) at a much higher rate than can be achieved here. Thus the third rule was

3) Spend money on sustainable organisations instead of on everyday products that only give you a good feeling.

And with this the meetup concluded. We will likely continue this.

A note for parents: Meetups with children can be productive (in the sense of results like the above). We were 7 adults and 7 children (aged 3 to 11). The children mostly entertained themselves and no parent had to leave the discussion for long. And the 11-year-old played a significant role in the meetup itself.

Before the seed. I. Guesswork

4 Romashka 28 March 2015 07:35PM
Note: I am unsure if I am not forcing people to guess the password. If you find this style okay, the next post will be built similarly.

As we have already seen, it's a different matter to do anything significant to support a free-living gametophyte than one contained within the sporophyte body (the way seed-bearers do). It is certainly more difficult, but is it impossible?

To start with, let us see exactly what groups of seedless plants, minus mosses, we still have today. Here is a (pruned and decorated) tree of evolution of land plants from Pryer et al.1

The earliest, lowest ('most basal') branch is the lycopods, who contributed a great deal to the forests of the Carboniferous, but today are quite rare and much smaller.

After lycopods branched off, evolution introduced true leaves - fleshy outgrowths of sprouts with many veins in them.

Then, ancestors of ferns in the broadest sense and ancestors of seed plants in the broadest sense parted ways and began diversifying. If you haven't worked with phylogenies, the picture makes it seem, at first glance, that all groups of ferns just kind of sorted things out more-or-less simultaneously, but that is far from the truth. There are some pictures below, to give you a sense of what they look like, so you can try to guess who is older and who is younger - your very first priors for these relationships. Pryer et al.'s article provides estimates for when these groups did separate.

Don't Google just yet. Let's have some fun guessing what properties these plants might have, based on some hints I'll give you and whatever you remember from other sources.


In general, a life cycle goes like this: sporophyte (diploid, as in two chromosome sets) produces spores (haploid, since they underwent meiosis) that are released (singly or in fours or, in some cases, not released at all but kept where they were formed, in their sporangia). Spores germinate into (haploid) gametophytes that have archegonia (female reproductive organs making eggs) and/or antheridia (male ones, making sperm). Sperm swims to egg and fertilizes it, so that the resulting zygote again has two sets of chromosomes and the embryo develops into a sporophyte. It matures and sheds spores. All done.

What qualifiers can you imagine to make the cycle less general?

Seriously, take five minutes to tweak it. Maybe you can think up some broad restrictions posed by the environment. Or a shortcut to success (be radical). Or a stability-oriented strategy. Or the relative advantages of being mobile or sessile (challenge what you are used to thinking about the issue here). Or being the pioneer of your species in a new locality. Or struggling to keep up with constant disruption of your habitat or even your body. Or not having the resources to produce spores regularly. Or not having to do it at all to maintain your existence for centuries or more. Or being a ruthless user of others (for a given resource). Or putting protective layers around your kids and yourself, lack of seed coat notwithstanding. Or being able to grow only on alkaline substrates. Or irregular meiosis, so that the spores have just as many chromosomes as the parent sporophyte.

Okay? Now look at the adult plants and seek out those who might stand up to what you have thought up. Comment on what fits your ideas and what you think is not presented at all. The first comment is a poll of your estimates:)

Pictures and data from Wiki, unless otherwise specified.


Lycopodium obscurum.JPG

Lycopodium obscurum (a clubmoss). It has a branching subterranean rhizome. In its roots it has a symbiosis with a fungus (mycorrhiza). Its gametophytes are disk-shaped, about 1.5 cm in diameter. First-year shoots [of young sporophytes] are unbranched and rarely penetrate the soil surface.

What is your probability that

* gametophyte development takes less than a year;

given that,

* it is due to mycorrhiza that young sporophytes can support themselves for another season underground;

given that,

* if all shoots are destroyed two years in a row, the population can not recover?
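Note that chained "given that" estimates like these combine by the product rule: your implied probability for the whole conjunction is the product of the conditional estimates. A minimal sketch, where the numbers are made-up placeholders rather than botanical claims:

```python
# Product rule for chained "given that" estimates:
#   P(A and B and C) = P(A) * P(B | A) * P(C | A and B)
# The numbers below are illustrative placeholders, not botanical claims.
p_a = 0.30           # P(gametophyte development takes less than a year)
p_b_given_a = 0.60   # P(mycorrhiza supports the young sporophytes | A)
p_c_given_ab = 0.40  # P(population cannot recover after two bad years | A and B)

p_conjunction = p_a * p_b_given_a * p_c_given_ab
print(f"P(A and B and C) = {p_conjunction:.3f}")  # 0.072
```

Notice how quickly the conjunction shrinks: even three moderately likely steps multiply out to a small number, which is worth keeping in mind when answering the chains below.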

Illustration Isoetes lacustris0.jpg

Isoetes lacustris (a quillwort). It does not have traditional roots[a], but instead some of its leaves are modified to act like roots. As Čtvrtlíková et al. have found2, quillwort germination may also be constrained during the growth season by its relatively high minimum temperature threshold (no less than 12 °C) for macrospore germination.

What is your probability that

* it grows in alpine climate;

given that,

* it is a species of rubbly slopes, adapted to breaking off leaves;

given that,

* the leaves are capable of rooting and establishing new plants;

given that,

* these new plants can shed macrospores that same year, if the weather is mild enough?


 Polypodiophyta (leptosporangiate ferns).

Asplenium ceterach (Sardinia).jpg

Asplenium ceterach. This fern is well known as a resurrection plant due to its ability to withstand desiccation and subsequently recover on rewetting. It can be found growing up to 2700 metres above sea level.

What is your probability that 

* it can also grow on buildings;

given that,

* it is rather common within its range;

given that,

* it is difficult to study its historical spread, because outbreeding and multiple colonizations even out inter-populational differences?

Salvinia natans (habitus) 1.jpg

Salvinia natans has two nickel-sized leaves lying flat against the surface of the water, and a third submerged leaf which functions as a root. Flotation is made possible by pouches of air within the leaves. Cuticular papillae on the leaves' surface keep water from interfering with the leaves' functioning, and serve to protect them from decay. Spore cases form at the plant's base for reproduction.

What is your probability that

* it can have many generations during a season;

given that,

* competition between sporophytes of different generations peaks in late summer;

given that,

* older sporophytes depleting the habitat of nutrients restrict the growth of younger sporophytes through negative feedback loop?


Now, what plant was easiest for you to formulate a hypothesis about? The poll is in the first comment.


[a] - whatever that's supposed to mean. There are lots of other plants without 'traditional roots'.

1. American Journal of Botany 91(10): 1582–1598. 2004. Phylogeny and evolution of ferns (monilophytes) with a focus on the early leptosporangiate divergences. K. Pryer, E. Schuettpelz, P. G. Wolf, H. Schneider, A. R. Smith, R. Cranfill.

2. Preslia 86: 279–292, 2014. The effect of temperature on the phenology of germination of Isoëtes lacustris. M. Čtvrlíková, P. Znachor, J. Vrba.

Discussion of Slate Star Codex: "Extremism in Thought Experiments is No Vice"

14 Artaxerxes 28 March 2015 09:17AM

Link to Blog Post: "Extremism in Thought Experiments is No Vice"


Phil Robertson is being criticized for a thought experiment in which an atheist’s family is raped and murdered. On a talk show, he accused atheists of believing that there was no such thing as objective right or wrong, then continued:

I’ll make a bet with you. Two guys break into an atheist’s home. He has a little atheist wife and two little atheist daughters. Two guys break into his home and tie him up in a chair and gag him.

Then they take his two daughters in front of him and rape both of them and then shoot them, and they take his wife and then decapitate her head off in front of him, and then they can look at him and say, ‘Isn’t it great that I don’t have to worry about being judged? Isn’t it great that there’s nothing wrong with this? There’s no right or wrong, now, is it dude?’

Then you take a sharp knife and take his manhood and hold it in front of him and say, ‘Wouldn’t it be something if [there] was something wrong with this? But you’re the one who says there is no God, there’s no right, there’s no wrong, so we’re just having fun. We’re sick in the head, have a nice day.’

If it happened to them, they probably would say, ‘Something about this just ain’t right’.

The media has completely proportionally described this as Robertson “fantasizing about” raping atheists, and there are the usual calls for him to apologize/get fired/be beheaded.

So let me use whatever credibility I have as a guy with a philosophy degree to confirm that Phil Robertson is doing moral philosophy exactly right.


This is a LW discussion post for Yvain's blog posts at Slate Star Codex, as per tog's suggestion:

Like many Less Wrong readers, I greatly enjoy Slate Star Codex; there's a large overlap in readership. However, the comments there are far worse, not worth reading for me. I think this is in part due to the lack of LW-style up and downvotes. Have there ever been discussion threads about SSC posts here on LW? What do people think of the idea occasionally having them? Does Scott himself have any views on this, and would he be OK with it?

Scott/Yvain's permission to repost on LW was granted (from facebook):

I'm fine with anyone who wants reposting things for comments on LW, except for posts where I specifically say otherwise or tag them with "things i will regret writing"

Clean real-world example of the file-drawer effect

2 enfascination 28 March 2015 09:06AM

I've only ever seen publication bias taught with made-up or near-miss examples.  Has anyone got a really well-documented case in which:

* (About) nine people independently get the idea for the same experiment because it seems like it should be there, and they all see that nothing has been published on it, so they all work on it, and all get a (true) null result.

* The tenth experiment is eventually published reporting an NHST effect of about p = 0.10 

* The slow (g)rumbling of science surfaces the nine previous, unpublished versions of that experiment and someone catches it and gets it all down, with citations and dates and the specifics of whichever effect these ten people found themselves rooting around for.


The most representative real-world example I've seen lately has been Bem/psi, but, as a pedagogical example, I find it too distracting.  The ideal example would report on an effect that's more sympathetic, that a sharp student or outsider would say "Yeah, I'd also have thought that effect would have come through."
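Lacking a documented case, the size of the effect is at least easy to demonstrate in simulation. Under a true null hypothesis, p-values are uniformly distributed on [0, 1], so the chance that at least one of ten independent null experiments reaches p ≤ 0.10 is about 65%. A toy Monte Carlo (not a model of any specific literature):

```python
import random

# Under a true null, p-values are uniform on [0, 1], so the chance that
# at least one of ten independent experiments reaches p <= 0.10 is
# 1 - 0.9**10, about 0.651.
analytic = 1 - 0.9 ** 10

random.seed(0)
trials = 100_000
hits = 0
for _ in range(trials):
    p_values = [random.random() for _ in range(10)]  # ten null experiments
    if min(p_values) <= 0.10:
        hits += 1  # at least one "publishable" false positive

print(f"analytic:  {analytic:.3f}")  # 0.651
print(f"simulated: {hits / trials:.3f}")
```

In other words, a lone published p ≈ 0.10 result sitting on top of nine unpublished nulls is roughly what you'd expect by chance alone.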



[LINK] Amanda Knox exonerated

8 fortyeridania 28 March 2015 06:15AM

Here are the New York Times, CNN, and NBC. Here is Wikipedia for background.

The case has made several appearances on LessWrong; examples include:

Slate Star Codex: alternative comment threads on LessWrong?

27 tog 27 March 2015 09:05PM

Like many Less Wrong readers, I greatly enjoy Slate Star Codex; there's a large overlap in readership. However, the comments there are far worse, not worth reading for me. I think this is in part due to the lack of LW-style up and downvotes. Have there ever been discussion threads about SSC posts here on LW? What do people think of the idea occasionally having them? Does Scott himself have any views on this, and would he be OK with it?


The latest from Scott:

I'm fine with anyone who wants reposting things for comments on LW, except for posts where I specifically say otherwise or tag them with "things i will regret writing"

In this thread some have also argued for not posting the most hot-button political writings.

Would anyone be up for doing this? Artaxerxes started with "Extremism in Thought Experiments is No Vice"

Crude measures

9 Stuart_Armstrong 27 March 2015 03:44PM

A putative new idea for AI control; index here.

People often come up with a single great idea for AI, like "complexity" or "respect", that will supposedly solve the whole control problem in one swoop. Once you've done it a few times, it's generally trivially easy to start taking these ideas apart (first step: find a bad situation with high complexity/respect and a good situation with lower complexity/respect, make the bad very bad, and challenge on that). The general responses to these kinds of idea are listed here.

However, it seems to me that rather than constructing counterexamples each time, we should have a general category and slot these ideas into it. And not only have a general category with "why this can't work" attached to it, but also "these are methods that can make it work better". Seeing what is needed to make their idea better can help people understand the problems, where simple counter-arguments cannot. And, possibly, if we improve the methods, one of these simple ideas may end up being implementable.


Crude measures

The category I'm proposing to define is that of "crude measures". Crude measures are methods that attempt to rely on non-fully-specified features of the world to ensure that an underdefined or underpowered solution does manage to solve the problem.

To illustrate, consider the problem of building an atomic bomb. The scientists that did it had a very detailed model of how nuclear physics worked, the properties of the various elements, and what would happen under certain circumstances. They ended up producing an atomic bomb.

The politicians who started the project knew none of that. They shovelled resources, money and administrators at scientists, and got the result they wanted - the Bomb - without ever understanding what really happened. Note that the politicians were successful, but it was a success that could only have been achieved at one particular point in history. Had they done exactly the same thing twenty years before, they would not have succeeded. Similarly, Nazi Germany tried a roughly similar approach to what the US did (on a smaller scale) and it went nowhere.

So I would define "shovel resources at atomic scientists to get a nuclear weapon" as a crude measure. It works, but it only works because there are other features of the environment that are making it work. In this case, the scientists themselves. However, certain social and human features about those scientists (which politicians are good at estimating) made it likely to work - or at least more likely to work than shovelling resources at peanut-farmers to build moon rockets.

In the case of AI, advocating for complexity is similarly a crude measure. If it works, it will work because of very contingent features about the environment, the AI design, the setup of the world etc..., not because "complexity" is intrinsically a solution to the FAI problem. And though we are confident that human politicians had a good enough idea about human motivations and culture that the Manhattan project had at least some chance of working... we don't have confidence that those suggesting crude measures for AI control have a good enough idea to make their ideas work.

It should be evident that "crudeness" is on a sliding scale; I'd like to reserve the term for proposed solutions to the full FAI problem that do not in any way solve the deep questions about FAI.


More or less crude

The next question is, if we have a crude measure, how can we judge its chance of success? Or, if we can't even do that, can we at least improve the chances of it working?

The main problem is, of course, that of optimising. Either optimising in the sense of maximising the measure (maximum complexity!) or of choosing the measure that is most extreme fit to the definition (maximally narrow definition of complexity!). It seems we might be able to do something about this.

Let's start by having the AI sample a large class of utility functions. Require them to be of around the same expected complexity as human values. Then we use our crude measure μ - for argument's sake, let's make it something like "approval by simulated (or hypothetical) humans, on a numerical scale". This is certainly a crude measure.

We can then rank all the utility functions u, using μ to measure the value of "create M(u), a u-maximising AI, with this utility function". Then, to avoid the problems with optimisation, we could select a certain threshold value and pick any u such that E(μ|M(u)) is just above the threshold.

How to pick this threshold? Well, we might have some principled arguments ("this is about as good a future as we'd expect, and this is about as good as we expect that these simulated humans would judge it, honestly, without being hacked").
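As a toy illustration of the satisficing idea above (the "crude measure", its noise model, and all numbers here are invented for the sketch, not part of the proposal itself):

```python
import random

random.seed(0)

# Toy stand-ins: a "utility function" is just an id with a hidden true quality;
# the crude measure mu is a noisy, gameable proxy for that quality.
candidates = [{"id": i, "true_quality": random.gauss(0, 1)} for i in range(1000)]

def crude_measure(u):
    # Hypothetical proxy: correlates with true quality, but occasionally
    # "hacked", so extreme scores are disproportionately over-optimised.
    hack_bonus = random.expovariate(1) if random.random() < 0.05 else 0.0
    return u["true_quality"] + random.gauss(0, 0.5) + hack_bonus

scored = [(crude_measure(u), u) for u in candidates]

# Maximising the score would select for hacked candidates; instead, satisfice:
# accept any candidate whose score is just above a principled threshold.
threshold = 1.0
satisficers = [u for score, u in scored if threshold < score < threshold + 0.5]
choice = random.choice(satisficers)
```

The point of the band just above the threshold is that the extreme tail of the score distribution is exactly where over-optimised (hacked) candidates concentrate.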

One thing we might want to do is have multiple μ, and select things that score reasonably (but not excessively) on all of them. This is related to my idea that the best Turing test is one that the computer has not been trained or optimised on. Ideally, you'd want there to be some category of utilities "be genuinely friendly" that score higher than you'd expect on many diverse human-related μ (it may be better to randomly sample rather than fitting to precise criteria).

You could see this as saying: "programming an AI to preserve human happiness is insanely dangerous, but if you find an AI programmed to satisfice human preferences, and that AI also happens to preserve human happiness (without knowing it would be tested on this preservation), then... it might be safer".

There are a few other thoughts we might have for trying to pick a safer u:

  • Properties of utilities under trade (are human-friendly functions more or less likely to be tradable with each other and with other utilities)?
  • If we change the definition of "human", this should have effects that seem reasonable for the change. Or some sort of "free will" approach: if we change human preferences, we want the outcome of u to change in ways comparable with that change.
  • Maybe also check whether there is a wide enough variety of future outcomes, that don't depend on the AI's choices (but on human choices - ideas from "detecting agents" may be relevant here).
  • Changing the observers from hypothetical to real (or making the creation of the AI contingent, or not, on the approval), should not change the expected outcome of u much.
  • Making sure that the utility u can be used to successfully model humans (therefore properly reflects the information inside humans).
  • Make sure that u is stable to general noise (hence not over-optimised). Stability can be measured as changes in E(μ|M(u)), E(u|M(u)), E(v|M(u)) for generic v, and other means.
  • Make sure that u is unstable to "nasty" noise (e.g. reversing human pain and pleasure).
  • All utilities in a certain class - the human-friendly class, hopefully - should score highly under each other (E(u|M(u)) not too far off from E(u|M(v))), while the over-optimised solutions - those scoring highly under some μ - must not score high under the class of human-friendly utilities.
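A minimal sketch of the last idea, cross-scoring candidate utilities so that a mutually high-scoring cluster stands out (the "utilities" here are toy random vectors over outcomes, assumed purely for illustration):

```python
import random

random.seed(1)

# Toy model: a utility is a vector over outcomes; M(v) steers the world
# toward v's top outcome, and E(u | M(v)) is u's value of that outcome.
N_OUTCOMES = 20

def rand_utility():
    return [random.random() for _ in range(N_OUTCOMES)]

# A "friendly-ish" cluster: small perturbations of a shared base utility,
# plus unrelated utilities standing in for over-optimised solutions.
base = rand_utility()
cluster = [[x + random.gauss(0, 0.05) for x in base] for _ in range(5)]
others = [rand_utility() for _ in range(5)]
candidates = cluster + others

def expected_value(u, v):
    """E(u | M(v)): u's score on the outcome a v-maximiser brings about."""
    best_outcome = max(range(N_OUTCOMES), key=lambda i: v[i])
    return u[best_outcome]

def mutual_score(u, pool):
    # Average of E(u | M(v)) over the other candidates in the pool.
    return sum(expected_value(u, v) for v in pool if v is not u) / (len(pool) - 1)

# Members of the shared cluster should score highly under each other,
# while the unrelated utilities should not.
ranked = sorted(candidates, key=lambda u: mutual_score(u, candidates), reverse=True)
```

In this toy setup the perturbed-cluster members systematically out-score the unrelated utilities on mutual score, which is the signature the bullet point asks for.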

This is just a first stab at it. It does seem to me that we should be able to abstractly characterise the properties we want from a friendly utility function, which, combined with crude measures, might actually allow us to select one without fully defining it. Any thoughts?

And with that, the various results of my AI retreat are available to all.

Weekly LW Meetups

3 FrankAdamek 27 March 2015 03:04PM

This summary was posted to LW Main on March 20th. The following week's summary is here.

New meetups (or meetups with a hiatus of more than a year) are happening in:

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berkeley, Berlin, Boston, Brussels, Buffalo, Cambridge UK, Canberra, Columbus, London, Madison WI, Melbourne, Moscow, Mountain View, New York, Philadelphia, Research Triangle NC, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers.


Learning and testing environments

0 DeVliegendeHollander 27 March 2015 02:41PM

Not a full article. Discussion-starter. Half-digested ideas for working them out collaboratively, if you are interested. Will edit article with your feedback.

Learning environments

Examples: Less Wrong, martial arts gyms, Toastmasters

- Focused on improving a skill or virtue or ability

- "we are all here to learn" attitude

- Little if any status competition over that skill or ability, because it is understood that your level largely reflects how long you have been practicing or learning it. Being better because you started five years before others does not make you an inherently superior person; it is the expected return on your investment, which others also expect to get with time.

- If there is any status competition at all, it is in the dedication to improve

- It is allowed, in fact encouraged, to admit weakness, as doing so both helps improvement and signals dedication to it

- The skill or ability is not considered inherent or inborn

- People do not essentialize or "identitize" that skill or ability, they generally don't think about each other in the framework of stupid, smart, strong, weak, brave, timid


Testing environment

Examples: most of life, that is the problem actually! Most discussion boards, Reddit. Workplaces. Dating.

- I should just invert all of the above, really

- People are essentialized or "identitized" as smart, stupid, strong, weak, brave, timid

- The above abilities, and others, are seen as more or less inborn; or, more accurately, people don't really dwell on that question much but still consider them more or less unchangeable, "you are what you are"

- Status competition with those abilities

- Losers easily written off, not encouraged to improve

- Social pressure incentive to signal better ability than you have

- Social pressure incentive to not admit weakness

- Social pressure incentive to not look like someone who is working on improving: that signals not already being awesome at it, and certainly not being "born" so

- Social pressure incentive to make accomplishing hard things look easy to show extra ability


Objections / falsification / what it doesn't predict: competition can incentivize working hard. It can make people ingenious.

Counter-objection: competition only helps as long as you make it clear it is not about an innate ability; that framing is terrible for development. But if it is not about ability but about working on improving, you get the above social pressure incentive problems: attitudes efficient for competing are not efficient for improving. Possible solution: intermittent competition.

Possible combinations?

If you go to a dojo and see someone wearing an orange or green belt, do you see it as a record of tests taken and thus current ability, or as a signal of what the person is currently learning and improving on (the material of the next belt exam)? Which reading is stronger? Do you see them as "good"/"bad", or as improving?

Tentatively: they are more learning than testing environments. 

Tentatively: formal tests and gradings can turn the rest of the environment into a learning environment. 

Tentatively: maybe it is the lack of formal tests, gradings, and certifications that all too often turns the rest of the world into a testing environment.

Value proposition: it would be good to turn as much as possible of the world into learning environments, except mission-critical jobs, responsibilities etc., which necessarily must be testing environments.

Would the equivalent of a belt system in everything fix it? Figuratively speaking, a green-belt philosopher of religion: atheist or theist, but expected not to use the worst arguments? An orange-belt voter or political commentator: does not use the Noncentral Fallacy? More academic ranks than just Bachelor's, Master's, PhD?

If we are such stupidly hard-wired animals that we always feel the need to status-compete and form status hierarchies, and the issue here is largely the effort and time wasted on it, plus the importing of these status-competing attitudes into issues that actually matter and the ruining of rational approaches to them, would it be better if just glancing at each other's belt, figuratively speaking, settled the status-hierarchy question so that we could focus on being constructive and rational?

Example: look at how much money people waste on signalling that they have money. Net worth is an objective enough measure. Would turning it into a belt, figuratively speaking, and signing e-mails as "sincerely, J. Random, XPLFZ", where XPLFZ is some precisely defined, agreed, and hard-to-falsify signal of a net worth between $0.1M and $0.5M, fix it? Let's ignore how repulsively crude and crass that sounds; such mores are cultural and subject to change anyway. Would it lead to fewer unnecessary, showing-off, keeping-up-with-the-Joneses purchases?

Counter-tests: do captains status-compete with lieutenants in the mess-hall? No. Do Green-belts with orange-belts? No. 

What it doesn't predict: kids still status-compete despite grades. Maybe they don't care so much about grades. LW has no "belts", yet status competition is low to nonexistent.

Boxing an AI?

2 tailcalled 27 March 2015 02:06PM

Boxing an AI is the idea that you can avoid the problems where an AI destroys the world by not giving it access to the world. For instance, you might give the AI access to the real world only through a chat terminal with a person, called the gatekeeper. This should, theoretically, prevent the AI from doing destructive stuff.

Eliezer has pointed out a problem with boxing AI: the AI might convince its gatekeeper to let it out. In order to prove this, he escaped from a simulated version of an AI box. Twice. That is somewhat unfortunate, because it means testing AI is a bit trickier.

However, I had an idea: why tell the AI it's in a box? Why not hook it up to a sufficiently advanced game, set up the correct reward channels and see what happens? Once you get the basics working, you can add more instances of the AI and see if they cooperate. This lets us adjust their morality until the AIs act sensibly. Then the AIs can't escape from the box because they don't know it's there.

The great decline in Wikipedia pageviews (condensed version)

10 VipulNaik 27 March 2015 02:02PM

To keep this post manageable in length, I have only included a small subset of the illustrative examples and discussion. I have published a longer version of this post, with more examples (but the same intro and concluding section), on my personal site.

Last year, during the months of June and July, as my work for MIRI was wrapping up and I hadn't started my full-time job, I worked on the Wikipedia Views website, aimed at easier tabulation of the pageviews for multiple Wikipedia pages over several months and years. It relies on a statistics tool called stats.grok.se, created by Domas Mituzas and maintained by Henrik.
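For readers who want to pull the same numbers themselves, stats.grok.se served per-page, per-month JSON; the URL pattern below matches how the site was commonly queried, though treat the exact endpoint and response shape as an assumption rather than documented API:

```python
import json
from urllib.request import urlopen

def monthly_url(project, yyyymm, title):
    """Build the (assumed) stats.grok.se JSON endpoint for one page-month."""
    return "http://stats.grok.se/json/{}/{}/{}".format(project, yyyymm, title)

def total_views(stats):
    """Sum the per-day counts in a stats.grok.se-style JSON response."""
    return sum(stats.get("daily_views", {}).values())

# Live usage would look like (network call, not run here):
#   stats = json.load(urlopen(monthly_url("en", "201301", "Black")))
#   print(total_views(stats))

# Abridged, hypothetical response of the shape the endpoint returned:
sample = {"daily_views": {"2013-01-01": 4021, "2013-01-02": 4188},
          "project": "en", "month": "201301", "title": "Black"}
```

Summing `daily_views` over each month is exactly the tabulation Wikipedia Views automates across many pages and many months at once.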

One of the interesting things I noted as I tabulated pageviews for many different pages was that the pageview counts for many already popular pages were in decline. Pages of various kinds peaked at different historical points. For instance, colors have been in decline since early 2013. The world's most populous countries have been in decline since as far back as 2010!

Defining the problem

The first thing to be clear about is what these pageviews count and what they don't. The pageview measures are taken from stats.grok.se, which in turn uses the pagecounts-raw dump provided hourly by the Wikimedia Foundation's Analytics team, which in turn is obtained by processing raw user activity logs. The pagecounts-raw measure is flawed in two ways:

  • It only counts pageviews on the main Wikipedia website and not pageviews on the mobile Wikipedia website or through Wikipedia Zero (a pared down version of the mobile site that some carriers offer at zero bandwidth costs to their customers, particularly in developing countries). To remedy these problems, a new dump called pagecounts-all-sites was introduced in September 2014. We simply don't have data for views of mobile domains or of Wikipedia Zero at the level of individual pages for before then. Moreover, stats.grok.se still uses pagecounts-raw (this was pointed out to me in a mailing list message after I circulated an early version of the post).
  • The pageview count includes views by bots. The official estimate is that about 15% of pageviews are due to bots. However, the percentage is likely higher for pages with fewer overall pageviews, because bots have a minimum crawling frequency. So every page might get at least 3 bot crawls a day, resulting in a minimum of around 90 bot pageviews a month even if there are only a handful of human pageviews.

Therefore, the trends I discuss will refer to trends in total pageviews for the main Wikipedia website, including page requests by bots, but excluding visits to mobile domains. Note that visits from mobile devices to the main site will be included, but mobile devices are by default redirected to the mobile site.
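The bot-floor arithmetic above is worth making explicit, since it shows why bot contamination distorts low-traffic pages far more than popular ones:

```python
# ~3 bot crawls/day over a ~30-day month gives a floor of ~90 bot views/month.
BOT_VIEWS_PER_MONTH = 3 * 30

def bot_share(human_views_per_month):
    """Fraction of a page's monthly pageviews attributable to the bot floor."""
    total = human_views_per_month + BOT_VIEWS_PER_MONTH
    return BOT_VIEWS_PER_MONTH / total

# A page with 10,000 human views/month: bots are under 1% of its traffic.
# A page with 100 human views/month: bots are nearly half of its traffic.
```

So for the heavily viewed pages analyzed in this post, the bot floor barely moves the trends; for obscure pages it can dominate them.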

How reliable are the metrics?

As noted above, the metrics are unreliable because of the bot problem and the issue of counting only non-mobile traffic. German Wikipedia user Atlasowa left a message on my talk page pointing me to an email thread suggesting that about 40% of pageviews may be bot-related, and discussing some interesting examples.

Relationship with the overall numbers

I'll show that for many pages of interest, the number of pageviews as measured above (non-mobile) has declined recently, with a clear decline from 2013 to 2014. What about the total?

We have overall numbers for non-mobile, mobile, and combined. The combined number has largely held steady, whereas the non-mobile number has declined and the mobile number has risen.

What we'll find is that the decline for most pages that have been around for a while is even sharper than the overall decline. One reason overall pageviews haven't declined so fast is the creation of new pages. To give an idea, non-mobile traffic dropped by about 1/3 from January 2013 to December 2014, but for many leading categories of pages, traffic dropped by about 1/2-2/3.

Why is this important? First reason: better context for understanding trends for individual pages

People's behavior on Wikipedia is a barometer of what they're interested in learning about. An analysis of trends in the views of pages can provide an important window into how people's curiosity, and the way they satisfy this curiosity, is evolving. To take an example, some people have proposed using Wikipedia pageview trends to predict flu outbreaks. I myself have tried to use relative Wikipedia pageview counts to gauge changing interests in many topics, ranging from visa categories to technology companies.

My initial interest in pageview numbers arose because I wanted to track my own influence as a Wikipedia content creator. In fact, that was my original motivation with creating Wikipedia Views. (You can see more information about my Wikipedia content contributions on my site page about Wikipedia).

Now, when doing this sort of analysis for individual pages, one needs to account for, and control for, overall trends in the views of Wikipedia pages that are occurring for reasons other than a change in people's intrinsic interest in the subject. Otherwise, we might falsely conclude from a pageview count decline that a topic is falling in popularity, whereas what's really happening is an overall decline in the use of (the non-mobile version of) Wikipedia to satisfy one's curiosity about the topic.

Why is this important? Second reason: a better understanding of the overall size and growth of the Internet.

Wikipedia has been relatively mature and has had the top spot as an information source for at least the last six years. Moreover, unlike almost all other top websites, Wikipedia doesn't try hard to market or optimize itself, so trends in it reflect a relatively untarnished view of how the Internet and the World Wide Web as a whole are growing, independent of deliberate efforts to manipulate and doctor metrics.

The case of colors

Let's look at Wikipedia pages on some of the most viewed colors (I've removed the 2015 and 2007 columns because we don't have complete data for those years). Colors are interesting because the degree of human interest in colors in general, and in individual colors, is unlikely to change much in response to news or current events. So one would at least a priori expect colors to offer a perspective into Wikipedia trends with fewer external complicating factors. If we see a clear decline here, then that's strong evidence in favor of a genuine decline.

I've restricted attention to a small subset of the colors, that includes the most common ones but isn't comprehensive. But it should be enough to get a sense of the trends. And you can add in your own colors and check that the trends hold up.

| Page name | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | Total | Percentage | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| Black | 431K | 1.5M | 1.3M | 778K | 900K | 1M | 958K | 6.9M | 16.1 | Colors |
| Blue | 710K | 1.3M | 1M | 987K | 1.2M | 1.2M | 1.1M | 7.6M | 17.8 | Colors |
| Brown | 192K | 284K | 318K | 292K | 308K | 300K | 277K | 2M | 4.6 | Colors |
| Green | 422K | 844K | 779K | 707K | 882K | 885K | 733K | 5.3M | 12.3 | Colors |
| Orange | 133K | 181K | 251K | 259K | 275K | 313K | 318K | 1.7M | 4 | Colors |
| Purple | 524K | 906K | 847K | 895K | 865K | 841K | 592K | 5.5M | 12.8 | Colors |
| Red | 568K | 797K | 912K | 1M | 1.1M | 873K | 938K | 6.2M | 14.6 | Colors |
| Violet | 56K | 96K | 75K | 77K | 69K | 71K | 65K | 509K | 1.2 | Colors |
| White | 301K | 795K | 615K | 545K | 788K | 575K | 581K | 4.2M | 9.8 | Colors |
| Yellow | 304K | 424K | 453K | 433K | 452K | 427K | 398K | 2.9M | 6.8 | Colors |
| Total | 3.6M | 7.1M | 6.6M | 6M | 6.9M | 6.5M | 6M | 43M | 100 | -- |
| Percentage | 8.5 | 16.7 | 15.4 | 14 | 16 | 15.3 | 14 | 100 | -- | -- |

Since the decline appears to have happened between 2013 and 2014, let's examine the 24 months from January 2013 to December 2014:


| Month | Black | Blue | Brown | Green | Orange | Purple | Red | Violet | White | Yellow | Total | Percentage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2014-12 | 30K | 41K | 14K | 27K | 9.6K | 28K | 67K | 3.1K | 21K | 19K | 260K | 2.4 |
| 2014-11 | 36K | 46K | 15K | 31K | 10K | 35K | 50K | 3.7K | 23K | 22K | 273K | 2.5 |
| 2014-10 | 37K | 52K | 16K | 34K | 10K | 34K | 51K | 4.5K | 25K | 26K | 289K | 2.7 |
| 2014-09 | 37K | 57K | 16K | 35K | 9.9K | 37K | 45K | 4.8K | 27K | 29K | 298K | 2.8 |
| 2014-08 | 33K | 47K | 14K | 34K | 8.5K | 31K | 38K | 3.9K | 21K | 22K | 253K | 2.4 |
| 2014-07 | 33K | 47K | 14K | 30K | 9.3K | 31K | 37K | 4.2K | 22K | 22K | 250K | 2.3 |
| 2014-06 | 32K | 49K | 14K | 31K | 10K | 34K | 39K | 4.9K | 23K | 22K | 259K | 2.4 |
| 2014-05 | 44K | 55K | 17K | 37K | 10K | 51K | 42K | 5.2K | 26K | 26K | 314K | 2.9 |
| 2014-04 | 34K | 60K | 17K | 36K | 14K | 38K | 47K | 5.8K | 27K | 28K | 306K | 2.8 |
| 2014-03 | 37K | 136K | 19K | 51K | 14K | 123K | 52K | 5.5K | 30K | 31K | 497K | 4.6 |
| 2014-02 | 38K | 58K | 19K | 39K | 13K | 41K | 49K | 5.6K | 29K | 29K | 321K | 3 |
| 2014-01 | 40K | 60K | 19K | 36K | 14K | 40K | 50K | 4.4K | 27K | 28K | 319K | 3 |
| 2013-12 | 62K | 67K | 17K | 44K | 12K | 48K | 48K | 4.4K | 42K | 26K | 372K | 3.5 |
| 2013-11 | 141K | 96K | 20K | 65K | 11K | 68K | 55K | 5.3K | 71K | 34K | 566K | 5.3 |
| 2013-10 | 145K | 102K | 21K | 69K | 11K | 77K | 59K | 5.7K | 71K | 36K | 598K | 5.6 |
| 2013-09 | 98K | 80K | 17K | 60K | 11K | 53K | 51K | 4.9K | 45K | 30K | 450K | 4.2 |
| 2013-08 | 109K | 87K | 20K | 57K | 20K | 57K | 60K | 4.6K | 53K | 28K | 497K | 4.6 |
| 2013-07 | 107K | 92K | 21K | 61K | 11K | 66K | 65K | 4.6K | 61K | 30K | 520K | 4.8 |
| 2013-06 | 115K | 106K | 22K | 69K | 13K | 73K | 64K | 5.5K | 70K | 33K | 571K | 5.3 |
| 2013-05 | 158K | 122K | 24K | 79K | 14K | 83K | 69K | 11K | 77K | 39K | 677K | 6.3 |
| 2013-04 | 151K | 127K | 28K | 83K | 14K | 86K | 74K | 12K | 78K | 40K | 694K | 6.4 |
| 2013-03 | 155K | 135K | 31K | 92K | 15K | 99K | 84K | 12K | 80K | 43K | 746K | 6.9 |
| 2013-02 | 152K | 131K | 31K | 84K | 28K | 95K | 84K | 17K | 77K | 41K | 740K | 6.9 |
| 2013-01 | 129K | 126K | 32K | 81K | 19K | 99K | 84K | 9.6K | 70K | 42K | 691K | 6.4 |
| Total | 2M | 2M | 476K | 1.3M | 314K | 1.4M | 1.4M | 152K | 1.1M | 728K | 11M | 100 |
| Percentage | 18.1 | 18.4 | 4.4 | 11.8 | 2.9 | 13.3 | 12.7 | 1.4 | 10.2 | 6.8 | 100 | -- |
| Tags | Colors | Colors | Colors | Colors | Colors | Colors | Colors | Colors | Colors | Colors | -- | -- |


As we can see, the decline appears to have begun around March 2013 and then continued steadily till about June 2014, at which point the numbers stabilized at their lower levels.

A few sanity checks on these numbers:

  • The trends appear to be similar for different colors, with the notable difference that the proportional drop was higher for the more viewed color pages. Thus, for instance, black and blue saw declines from 129K and 126K to 30K and 41K respectively (factors of four and three respectively) from January 2013 to December 2014. Orange and yellow, on the other hand, dropped by factors of close to two. The only color that didn't drop significantly was red (it dropped from 84K to 67K, as opposed to factors of two or more for other colors), but this seems to have been partly due to an unusually large amount of traffic in the end of 2014. The trend even for red seems to suggest a drop similar to that for orange.
  • The overall proportion of views for different colors comports with our overall knowledge of people's color preferences: blue is overall a favorite color, and this is reflected in its getting the top spot with respect to pageviews.
  • The pageview decline followed a relatively steady trend, with the exception of some unusual seasonal fluctuation (including an increase in October and November 2013).
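The drop factors quoted in the first bullet can be read straight off the January 2013 and December 2014 rows of the monthly table above (values in thousands of views):

```python
# Views in January 2013 vs December 2014, from the monthly color table.
jan_2013 = {"Black": 129, "Blue": 126, "Orange": 19, "Yellow": 42, "Red": 84}
dec_2014 = {"Black": 30, "Blue": 41, "Orange": 9.6, "Yellow": 19, "Red": 67}

drop_factor = {color: jan_2013[color] / dec_2014[color] for color in jan_2013}
# Black ~4.3x, Blue ~3.1x, Orange ~2.0x, Yellow ~2.2x, but Red only ~1.25x,
# confirming that the more-viewed pages dropped proportionally more, with
# Red as the outlier noted above.
```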

One might imagine that this is due to people shifting attention from the English-language Wikipedia to other language Wikipedias, but most of the other major language Wikipedias saw a similar decline at a similar time. More details are in my longer version of this post on my personal site.

Geography: continents and subcontinents, countries, and cities

Here are the views of some of the world's most populated countries between 2008 and 2014, showing that the peak happened as far back as 2010:

| Page name | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | Total | Percentage | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| China | 5.7M | 6.8M | 7.8M | 6.1M | 6.9M | 5.7M | 6.1M | 45M | 9 | Countries |
| India | 8.8M | 12M | 12M | 11M | 14M | 8.8M | 7.6M | 73M | 14.5 | Countries |
| United States | 13M | 15M | 18M | 18M | 34M | 16M | 15M | 129M | 25.7 | Countries |
| Indonesia | 5.3M | 5.2M | 3.7M | 3.6M | 4.2M | 3.1M | 2.5M | 28M | 5.5 | Countries |
| Brazil | 4.8M | 4.9M | 5.3M | 5.5M | 7.5M | 4.9M | 4.3M | 37M | 7.4 | Countries |
| Pakistan | 2.9M | 4.5M | 4.4M | 4.3M | 5.2M | 4M | 3.2M | 28M | 5.7 | Countries |
| Bangladesh | 2.2M | 2.9M | 3M | 2.8M | 2.9M | 2.2M | 1.7M | 18M | 3.5 | Countries |
| Russia | 5.6M | 5.6M | 6.5M | 6.8M | 8.6M | 5.4M | 5.8M | 44M | 8.8 | Countries |
| Nigeria | 2.6M | 2.6M | 2.9M | 3M | 3.5M | 2.6M | 2M | 19M | 3.8 | Countries |
| Japan | 4.8M | 6.4M | 6.5M | 8.3M | 10M | 7.3M | 6.6M | 50M | 10 | Countries |
| Mexico | 3.1M | 3.9M | 4.3M | 4.3M | 5.9M | 4.7M | 4.5M | 31M | 6.1 | Countries |
| Total | 59M | 69M | 74M | 74M | 103M | 65M | 59M | 502M | 100 | -- |
| Percentage | 11.7 | 13.8 | 14.7 | 14.7 | 20.4 | 12.9 | 11.8 | 100 | -- | -- |

Of these countries, China, India and the United States are the most notable. China is the world's most populous country. India has the largest population with some minimal English knowledge and legally (largely) unfettered Internet access to Wikipedia, while the United States has the largest population with quality Internet connectivity and good English knowledge. Moreover, in China and India, Internet use and access have been growing considerably in the last few years, whereas they have been relatively stable in the United States.

It is interesting that the year with the maximum total pageview count was as far back as 2010. In fact, 2010 was so significantly better than the other years that the numbers beg for an explanation. I don't have one, but even excluding 2010, we see a declining trend: gradual growth from 2008 to 2011, and then a symmetrically gradual decline. Both the growth trend and the decline trend are quite similar across countries.

We see a similar trend for continents and subcontinents, with the peak occurring in 2010. In contrast, the smaller counterparts, such as cities, peaked in 2013, similarly to colors, and the drop, though somewhat less steep than with colors, has been quite significant. For instance, a list for Indian cities shows that the total pageviews for these Indian cities declined from about 20 million in 2013 (after steady growth in the preceding years) to about 13 million in 2014.

Some niche topics where pageviews haven't declined

So far, we've looked at topics where pageviews have been declining since at least 2013, and some that peaked as far back as 2010. There are, however, many relatively niche topics where the number of pageviews has stayed roughly constant. But this stability itself is a sign of decay, because other metrics suggest that the topics have experienced tremendous growth in interest. In fact, the stability is even less impressive when we notice that it's a result of a cancellation between slight declines in views of established pages in the genre, and traffic going to new pages.

For instance, consider some charity-related pages:

| Page name | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | Total | Percentage | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| Against Malaria Foundation | 5.9K | 6.3K | 4.3K | 1.4K | 2 | 0 | 0 | 18K | 15.6 | Charities |
| Development Media International | 757 | 0 | 0 | 0 | 0 | 0 | 0 | 757 | 0.7 | Pages created by Vipul Naik, Charities |
| Deworm the World Initiative | 2.3K | 277 | 0 | 0 | 0 | 0 | 0 | 2.6K | 2.3 | Charities, Pages created by Vipul Naik |
| GiveDirectly | 11K | 8.3K | 2.6K | 442 | 0 | 0 | 0 | 22K | 19.2 | Charities, Pages created by Vipul Naik |
| International Council for the Control of Iodine Deficiency Disorders | 1.2K | 1 | 2 | 2 | 0 | 1 | 2 | 1.2K | 1.1 | Charities, Pages created by Vipul Naik |
| Nothing But Nets | 5.9K | 6.6K | 6.6K | 5.1K | 4.4K | 4.7K | 6.1K | 39K | 34.2 | Charities |
| Nurse-Family Partnership | 2.9K | 2.8K | 909 | 30 | 8 | 72 | 63 | 6.8K | 5.9 | Pages created by Vipul Naik, Charities |
| Root Capital | 3K | 2.5K | 414 | 155 | 51 | 1.2K | 21 | 7.3K | 6.3 | Charities, Pages created by Vipul Naik |
| Schistosomiasis Control Initiative | 4K | 2.7K | 1.6K | 191 | 0 | 0 | 0 | 8.5K | 7.4 | Charities, Pages created by Vipul Naik |
| VillageReach | 1.7K | 1.9K | 2.2K | 2.6K | 97 | 3 | 15 | 8.4K | 7.3 | Charities, Pages created by Vipul Naik |
| Total | 38K | 31K | 19K | 9.9K | 4.6K | 5.9K | 6.2K | 115K | 100 | -- |
| Percentage | 33.4 | 27.3 | 16.3 | 8.6 | 4 | 5.1 | 5.4 | 100 | -- | -- |

For this particular cluster of pages, we see the totals growing robustly year-on-year. But a closer look shows that the growth isn't that impressive. Whereas earlier, views were doubling every year from 2010 to 2013 (this was the take-off period for GiveWell and effective altruism), the growth from 2013 to 2014 was relatively small. And about half the growth from 2013 to 2014 was powered by the creation of new pages (including some pages created after the beginning of 2013, so they had more months in a mature state in 2014 than in 2013), while the other half was powered by growth in traffic to existing pages.

The data for philanthropic foundations demonstrates a fairly slow and steady growth (about 5% a year), partly due to the creation of new pages. This 5% hides a lot of variation between individual pages:

| Page name | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | Total | Percentage | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| Atlantic Philanthropies | 11K | 11K | 12K | 10K | 9.8K | 8K | 5.8K | 67K | 2.1 | Philanthropic foundations |
| Bill & Melinda Gates Foundation | 336K | 353K | 335K | 315K | 266K | 240K | 237K | 2.1M | 64.9 | Philanthropic foundations |
| Draper Richards Kaplan Foundation | 1.2K | 25 | 9 | 0 | 0 | 0 | 0 | 1.2K | 0 | Philanthropic foundations, Pages created by Vipul Naik |
| Ford Foundation | 110K | 91K | 100K | 90K | 100K | 73K | 61K | 625K | 19.5 | Philanthropic foundations |
| Good Ventures | 9.9K | 8.6K | 3K | 0 | 0 | 0 | 0 | 21K | 0.7 | Philanthropic foundations, Pages created by Vipul Naik |
| Jasmine Social Investments | 2.3K | 1.8K | 846 | 0 | 0 | 0 | 0 | 5K | 0.2 | Philanthropic foundations, Pages created by Vipul Naik |
| Laura and John Arnold Foundation | 3.7K | 13 | 0 | 1 | 0 | 0 | 0 | 3.7K | 0.1 | Philanthropic foundations, Pages created by Vipul Naik |
| Mulago Foundation | 2.4K | 2.3K | 921 | 0 | 1 | 1 | 10 | 5.6K | 0.2 | Philanthropic foundations, Pages created by Vipul Naik |
| Omidyar Network | 26K | 23K | 19K | 17K | 19K | 13K | 11K | 129K | 4 | Philanthropic foundations |
| Peery Foundation | 1.8K | 1.6K | 436 | 0 | 0 | 0 | 0 | 3.9K | 0.1 | Philanthropic foundations, Pages created by Vipul Naik |
| Robert Wood Johnson Foundation | 26K | 26K | 26K | 22K | 27K | 22K | 17K | 167K | 5.2 | Philanthropic foundations |
| Skoll Foundation | 13K | 11K | 9.2K | 7.8K | 9.6K | 5.8K | 4.3K | 60K | 1.9 | Philanthropic foundations |
| Smith Richardson Foundation | 8.7K | 3.5K | 3.8K | 3.6K | 3.7K | 3.5K | 2.9K | 30K | 0.9 | Philanthropic foundations |
| Thiel Foundation | 3.6K | 1.5K | 1.1K | 47 | 26 | 1 | 0 | 6.3K | 0.2 | Philanthropic foundations, Pages created by Vipul Naik |
| Total | 556K | 533K | 511K | 466K | 435K | 365K | 340K | 3.2M | 100 | -- |
| Percentage | 17.3 | 16.6 | 15.9 | 14.5 | 13.6 | 11.4 | 10.6 | 100 | -- | -- |


The dominant hypothesis: shift from non-mobile to mobile Wikipedia use

The dominant hypothesis is that pageviews have simply migrated from non-mobile to mobile. This is most closely borne out by the overall data: total pageviews have remained roughly constant, and the decline in total non-mobile pageviews has been roughly canceled by growth in mobile pageviews. However, the evidence for this substitution doesn't exist at the level of individual pages, because we don't have pageview data for the mobile domain before September 2014, and much of the decline occurred between March 2013 and June 2014.

What would it mean if there were an approximate one-to-one substitution from non-mobile to mobile for the page types discussed above? For instance, non-mobile traffic to colors dropped to somewhere between 1/3 and 1/2 of its original level between January 2013 and December 2014. This would mean that somewhere between 1/2 and 2/3 of the original non-mobile traffic to colors has shifted to mobile devices. This theory should be at least partly falsifiable: if the sum of traffic to non-mobile and mobile platforms today for colors is less than non-mobile-only traffic in January 2013, then clearly substitution is only part of the story.
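The falsifiability check just described is a one-line inequality; the numbers below are hypothetical stand-ins (in thousands of views), chosen only to illustrate the two outcomes:

```python
def substitution_consistent(desktop_then, desktop_now, mobile_now):
    """True if today's combined desktop + mobile traffic is at least the old
    desktop-only traffic, i.e. the data does not rule out pure substitution."""
    return desktop_now + mobile_now >= desktop_then

# Hypothetical: desktop color traffic fell from 319K to 260K per month.
# Pure substitution requires at least 59K of current mobile traffic.
print(substitution_consistent(319, 260, 80))   # substitution still possible
print(substitution_consistent(319, 260, 30))   # substitution can't be the whole story
```

If the combined total falls short, some views genuinely disappeared (to Knowledge Graph snippets, Siri, other sites) rather than just moving platforms.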

Although the data is available, it's not currently in an easily computable form, and I don't currently have the time and energy to extract it. I'll update this once the data on all pageviews since September 2014 is available on stats.grok.se or a similar platform.

Other hypotheses

The following are some other hypotheses for the pageview decline:

  1. Google's Knowledge Graph: This is the hypothesis raised in Wikipediocracy, the Daily Dot, and the Register. The Knowledge Graph was introduced in 2012. Through 2013, Google rolled out snippets (called Knowledge Cards and Knowledge Panels) based on the Knowledge Graph in its search results. So if, for instance, you only wanted the birth date and nationality of a musician, Googling would show you that information right in the search results and you wouldn't need to click through to the Wikipedia page. I suspect that the Knowledge Graph played some role in the decline for colors seen between March 2013 and June 2014. On the other hand, many of the pages that saw a decline don't have any search snippets based on the Knowledge Graph, and therefore the decline for those pages cannot be explained this way.
  2. Other means of accessing Wikipedia's knowledge that don't involve viewing it directly: For instance, Apple's Siri tool uses data from Wikipedia, and people making queries to this tool may get information from Wikipedia without hitting the encyclopedia. The usage of such tools has increased greatly starting in late 2012. Siri itself debuted on the iPhone 4S in October 2011 and came to the third-generation iPad with iOS 6 in September 2012. Since then, it has shipped with all of Apple's mobile devices and tablets.
  3. Substitution away from Wikipedia to other pages that are becoming more search-optimized and growing in number: For many topics, Wikipedia may have been clearly the best information source a few years back (as judged by Google), but the growth of niche information sources, as well as better search methods, have displaced it from its undisputed leadership position. I think there's a lot of truth to this, but it's hard to quantify.
  4. Substitution away from coarser, broader pages to finer, narrower pages within Wikipedia: While this cannot directly explain an overall decline in pageviews, it can explain a decline in pageviews for particular kinds of pages. Indeed, I suspect that this is partly what's going on with the early decline of pageviews (e.g., the decline in pageviews of countries and continents starting around 2010, as people go directly to specialized articles related to the particular aspects of those countries or continents they are interested in).
  5. Substitution to Internet use in other languages: This hypothesis doesn't seem borne out by the simultaneous decline in pageviews for the English, French, and Spanish Wikipedia, as documented for the color pages.

It's still a mystery

I'd like to close by noting that the pageview decline is still very much a mystery as far as I am concerned. I hope I've convinced you that (a) the mystery is genuine, (b) it's important, and (c) although the shift to mobile is probably the most likely explanation, we don't yet have clear evidence. I'm interested in hearing whether people have alternative explanations, and/or whether they have more compelling arguments for some of the explanations proffered here.

Utility vs Probability: idea synthesis

3 Stuart_Armstrong 27 March 2015 12:30PM

A putative new idea for AI control; index here.

This post is a synthesis of some of the ideas from utility indifference and false miracles, in an easier-to-follow format that illustrates better what's going on.


Utility scaling

Suppose you have an AI with a utility u and a probability estimate P. There is a certain event X which the AI cannot affect. You wish to change the AI's estimate of the probability of X, by, say, doubling the odds ratio P(X):P(¬X). However, since it is dangerous to give an AI false beliefs (they may not be stable, for one), you instead want to make the AI behave as if it were a u-maximiser with doubled odds ratio.

Assume that the AI is currently deciding between two actions, α and ω. The expected utility of action α decomposes as:

u(α) = P(X)u(α|X) + P(¬X)u(α|¬X).

The utility of action ω is defined similarly, and the expected gain (or loss) of utility by choosing α over ω is:

u(α)-u(ω) = P(X)(u(α|X)-u(ω|X)) + P(¬X)(u(α|¬X)-u(ω|¬X)).

If we were to double the odds ratio, the expected utility gain becomes:

u(α)-u(ω) = (2P(X)(u(α|X)-u(ω|X)) + P(¬X)(u(α|¬X)-u(ω|¬X)))/Ω,    (1)

for some normalisation constant Ω = 2P(X)+P(¬X), independent of α and ω.

We can reproduce exactly the same effect by instead replacing u with u', such that

  • u'(·|X) = 2u(·|X)
  • u'(·|¬X) = u(·|¬X)


u'(α)-u'(ω) = P(X)(u'(α|X)-u'(ω|X)) + P(¬X)(u'(α|¬X)-u'(ω|¬X))

= 2P(X)(u(α|X)-u(ω|X)) + P(¬X)(u(α|¬X)-u(ω|¬X)).    (2)

This, up to an unimportant constant, is the same equation as (1). Thus we can accomplish, via utility manipulation, exactly the same effect on the AI's behaviour as by changing its probability estimates.

Notice that we could also have defined

  • u'(·|X) = u(·|X)
  • u'(·|¬X) = (1/2)u(·|¬X)

This is just the same u', scaled.

The utility indifference and false miracles approaches were just special cases of this, where the odds ratio was sent to infinity/zero by multiplying by zero. But the general result is that one can start with an AI with utility/probability estimate pair (u,P) and map it to an AI with pair (u',P) which behaves similarly to (u,P'). Changes in probability can be replicated as changes in utility.
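To make this concrete, here is a minimal numerical sketch (the utilities and probability below are made up for illustration, not taken from the post) checking that scaling u in the X-worlds reproduces the behaviour of a doubled odds ratio:

```python
# Illustrative numbers only: check that u'(·|X) = 2u(·|X) changes the AI's
# expected-utility comparison exactly as doubling the odds ratio P(X):P(¬X)
# would, up to the positive constant Ω = 2P(X) + P(¬X).

p_X = 0.2                 # AI's actual probability of X
d_X, d_notX = 1.0, -3.0   # u(α|X) - u(ω|X)  and  u(α|¬X) - u(ω|¬X)

# Route 1: doubled odds ratio, P'(X) = 2P(X)/Ω, utilities untouched.
omega = 2 * p_X + (1 - p_X)
p2 = 2 * p_X / omega
gain_probability = p2 * d_X + (1 - p2) * d_notX

# Route 2: scaled utility u'(·|X) = 2u(·|X), probabilities untouched.
gain_utility = 2 * p_X * d_X + (1 - p_X) * d_notX

# Equal up to the positive factor Ω, so the preferred action is identical.
assert abs(gain_probability * omega - gain_utility) < 1e-9
```

Since Ω is positive and independent of the actions being compared, the two routes rank α and ω identically, whatever the utilities.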


Utility translating

In the previous section, we multiplied certain utilities by two, and in doing so we implicitly used the zero point of u. But utility is invariant under translation, so this zero point is not actually anything significant.

It turns out that we don't need to care about this - any zero will do; what matters is simply that the spread between options is doubled in the X world but not in the ¬X one.

But that relies on the AI being unable to affect the probability of X and ¬X itself. If the AI has an action that will increase (or decrease) P(X), then it becomes very important where we set the zero before multiplying. Setting the zero in a different place is isomorphic with adding a constant to the X world and not the ¬X world (or vice versa). Obviously this will greatly affect the AI's preferences between X and ¬X.

One way of avoiding the AI affecting X is to set this constant so that u'(X)=u'(¬X), in expectation. Then the AI has no preferences between the two situations, and will not seek to boost one over the other. However, note that u'(X) is an expected utility calculation. Therefore:

  1. Choosing the constant so that u'(X)=u'(¬X) requires accessing the AI's probability estimate P for various worlds; it cannot be done from outside, by multiplying the utility, as the previous approach could.
  2. Even if u'(X)=u'(¬X), this does not mean that u'(X|Y)=u'(¬X|Y) for every event Y that could happen before X does. Simple example: X is a coin flip, and Y is the bet of someone on that coin flip, someone the AI doesn't like.
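Point 1 can be sketched with made-up numbers (the values below are purely illustrative): the translation constant is computed from conditional expectations under the AI's own estimate P, which is why it cannot be set purely from outside by rescaling the utility.

```python
# Illustrative sketch: choose the translation constant c (added to every
# X-world utility) so that the AI's expected utility is equal given X and
# given ¬X, removing its incentive to push P(X) up or down.
# The conditional expectations below would come from the AI's own P.

eu_given_X = 6.0      # E[u' | X] after scaling by 2 (made-up value)
eu_given_notX = 4.0   # E[u' | ¬X] (made-up value)

c = eu_given_notX - eu_given_X           # constant added in the X-worlds
assert eu_given_X + c == eu_given_notX   # u'(X) = u'(¬X) in expectation
```

Equality here is only in overall expectation; as point 2 warns, conditioning on an earlier event Y can still break it.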

This explains all the complexity of the utility indifference approach, which is essentially trying to decompose possible universes (and adding constants to particular subsets of universes) to ensure that u'(X|Y)=u'(¬X|Y) for any Y that could happen before X does.

Negative visualization, radical acceptance and stoicism

14 Vika 27 March 2015 03:51AM

In anxious, frustrating or aversive situations, I find it helpful to visualize the worst case that I fear might happen, and try to accept it. I call this “radical acceptance”, since the imagined worst case is usually an unrealistic scenario that would be extremely unlikely to happen, e.g. “suppose I get absolutely nothing done in the next month”. This is essentially the negative visualization component of stoicism. There are many benefits to visualizing the worst case:

  • Feeling better about the present situation by contrast.
  • Turning attention to the good things that would still be in my life even if everything went wrong in one particular domain.
  • Weakening anxiety using humor (by imagining an exaggerated “doomsday” scenario).
  • Being more prepared for failure, and making contingency plans (pre-hindsight).
  • Helping make more accurate predictions about the future by reducing the “X isn’t allowed to happen” effect (or, as Anna Salamon once put it, “putting X into the realm of the thinkable”).
  • Reducing the effect of ugh fields / aversions, which thrive on the “X isn’t allowed to happen” flinch.
  • Weakening unhelpful identities like “person who is always productive” or “person who doesn’t make stupid mistakes”.

Let’s say I have an aversion around meetings with my advisor, because I expect him to be disappointed with my research progress. When I notice myself worrying about the next meeting or finding excuses to postpone it so that I have more time to make progress, I can imagine the worst imaginable outcome a meeting with my advisor could have - perhaps he might yell at me or even decide to expel me from grad school (neither of these have actually happened so far). If the scenario is starting to sound silly, that’s a good sign. I can then imagine how this plays out in great detail, from the disappointed faces and words of the rest of the department to the official letter of dismissal in my hands, and consider what I might do in that case, like applying for industry jobs. While building up these layers of detail in my mind, I breathe deeply, which I associate with meditative acceptance of reality. (I use the word “acceptance” to mean “acknowledgement” rather than “resignation”.)

I am trying to use this technique more often, both in the regular and situational sense. A good default time is my daily meditation practice. I might also set up a trigger-action habit of the form “if I notice myself repeatedly worrying about something, visualize that thing (or an exaggerated version of it) happening, and try to accept it”. Some issues have more natural triggers than others - while worrying tends to call attention to itself, aversions often manifest as a quick flinch away from a thought, so it’s better to find a trigger among the actions that are often caused by an aversion, e.g. procrastination. A trigger for a potentially unhelpful identity could be a thought like “I’m not good at X, but I should be”. A particular issue can simultaneously have associated worries (e.g. “will I be productive enough?”), aversions (e.g. towards working on the project) and identities (“productive person”), so there is likely to be something there that makes a good trigger. Visualizing myself getting nothing done for a month can help with all of these to some degree.

System 1 is good at imagining scary things - why not use this as a tool?


Defeating the Villain

25 Zubon 26 March 2015 09:43PM

We have a recurring theme in the greater Less Wrong community that life should be more like a high fantasy novel. Maybe that is to be expected when a quarter of the community came here via Harry Potter fanfiction. We also have rationalist group houses named after fantasy locations, descriptions of community members in terms of character archetypes and PCs versus NPCs, and semi-serious development of the new atheist gods; feel free to contribute your favorites in the comments.

A failure mode common to high fantasy novels as well as politics is solving all our problems by defeating the villain. Actually, this is a common narrative structure for our entire storytelling species, and it works well as a narrative structure. The story needs conflict, so we pit a sympathetic protagonist against a compelling antagonist, and we reach a satisfying climax when the two come into direct conflict, good conquers evil, and we live happily ever after.

This isn't an article about whether your opponent really is a villain. Let's make the (large) assumption that you have legitimately identified a villain who is doing evil things. They certainly exist in the world. Defeating this villain is a legitimate goal.

And then what?

Defeating the villain is rarely enough. Building is harder than destroying, and it is very unlikely that something good will spontaneously fill the void when something evil is taken away. It is also insufficient to speak in vague generalities about the ideals to which the post-[whatever] society will adhere. How are you going to avoid the problems caused by whatever you are eliminating, and how are you going to successfully transition from evil to good?

In fantasy novels, this is rarely an issue. The story ends shortly after the climax, either with good ascending or time-skipping to a society made perfect off-camera. Sauron has been vanquished, the rightful king has been restored, cue epilogue(s). And then what? Has the Chosen One shown skill in diplomacy and economics, solving problems not involving swords? What was Aragorn's tax policy? Sauron managed to feed his armies from a wasteland; what kind of agricultural techniques do you have? And indeed, if the book/series needs a sequel, we find that a problem at least as bad as the original fills in the void.

Reality often follows that pattern. Marx explicitly had no plan for what happened after you smashed capitalism. Destroy the oppressors and then ... as it turns out, slightly different oppressors come in and generally kill a fair percentage of the population. It works in the other direction as well; the fall of Soviet communism led not to spontaneous capitalism but rather kleptocracy and Vladimir Putin. For most of my lifetime, a major pillar of American foreign policy has seemed to be the overthrow of hostile dictators (end of plan). For example, Muammar Gaddafi was killed in 2011, and Libya has been in some state of unrest or civil war ever since. Maybe this is one where it would not be best to contribute our favorites in the comments.

This is not to say that you never get improvements that way. Aragorn can hardly be worse than Sauron. Regression to the mean perhaps suggests that you will get something less bad just by luck, as Putin seems clearly less bad than Stalin, although Stalin seems clearly worse than almost any other regime change in history. Some would say that causing civil wars in hostile countries is the goal rather than a failure of American foreign policy, which seems a darker sort of instrumental rationality.

Human flourishing is not the default state of affairs, temporarily suppressed by villainy. Spontaneous order is real, but it still needs institutions and social technology to support it.

Defeating the villain is a (possibly) necessary but (almost certainly) insufficient condition for bringing about good.

One thing I really like about this community is that projects tend to be conceived in the positive rather than the negative. Please keep developing your plans not only in terms of "this is a bad thing to be eliminated" but also "this is a better thing to be created" and "this is how I plan to get there."

Postdoctoral research positions at CSER (Cambridge, UK)

17 Sean_o_h 26 March 2015 05:59PM

[To be cross-posted at Effective Altruism Forum, FLI news page]

I'm delighted to announce that the Centre for the Study of Existential Risk has had considerable recent success in grantwriting and fundraising, among other activities (full update coming shortly). As a result, we are now in a position to advance to CSER's next stage of development: full research operations. Over the course of this year, we will be recruiting for a full team of postdoctoral researchers to work on a combination of general methodologies for extreme technological (and existential) risk analysis and mitigation, alongside specific technology/risk-specific projects.

Our first round of recruitment has just opened - we will be aiming to hire up to 4 postdoctoral researchers; details below. A second recruitment round will take place in the Autumn. We have a slightly unusual opportunity in that we get to cast our net reasonably wide. We have a number of planned research projects (listed below) that we hope to recruit for. However, we also have the flexibility to hire one or more postdoctoral researchers to work on additional projects relevant to CSER's aims. Information about CSER's aims and core research areas is available on our website. We request that as part of the application process potential postholders send us a research proposal of no more than 1500 words, explaining what your research skills could contribute to CSER. At this point in time, we are looking for people who will have obtained a doctorate in a relevant discipline by their start date.

We would also humbly ask that the LessWrong community aid us in spreading the word far and wide about these positions. There are many brilliant people working within the existential risk community. However, there are academic disciplines and communities that have had less exposure to existential risk as a research priority than others (due to founder effect and other factors), but where there may be people with very relevant skills and great insights. With new centres and new positions becoming available, we have a wonderful opportunity to grow the field, and to embed existential risk as a crucial consideration in all relevant fields and disciplines.

Thanks very much,

Seán Ó hÉigeartaigh (Executive Director, CSER)


"The Centre for the Study of Existential Risk (University of Cambridge, UK) is recruiting for up to four full-time postdoctoral research associates to work on the project Towards a Science of Extreme Technological Risk.

We are looking for outstanding and highly-committed researchers, interested in working as part of a growing research community, with research projects relevant to any aspect of the project. We invite applicants to explain their project to us, and to demonstrate their commitment to the study of extreme technological risks.

We have several shovel-ready projects for which we are looking for suitable postdoctoral researchers. These include:

  • Ethics and evaluation of extreme technological risk (ETR) (with Sir Partha Dasgupta);
  • Horizon-scanning and foresight for extreme technological risks (with Professor William Sutherland);
  • Responsible innovation and extreme technological risk (with Dr Robert Doubleday and the Centre for Science and Policy).

However, recruitment will not necessarily be limited to these subprojects, and our main selection criterion is suitability of candidates and their proposed research projects to CSER’s broad aims.

Details are available here. Closing date: April 24th."

Where can I go to exploit social influence to fight akrasia?

8 Snorri 26 March 2015 03:39PM

Briefly: I'm looking for a person (or group) with whom I can mutually discuss self improvement and personal goals (and nothing else) on a regular basis.

Also, note, this post is an example of asking a personally important question on LW. The following idea is not meant as a general mindhack, but just as something I want to try out myself.

We are unconsciously motivated by those around us. The Milgram experiment and the Asch conformity experiment are the two best examples of social influence that come to my mind, though I'm sure there are plenty more (if you haven't heard of them, I really suggest spending a minute).

I've tended to see this drive to conform to the expectations of others as a weakness of the human mind, and yes, it can be destructive. However, as long as it's there, I should exploit it. Specifically, I want to exploit it to fight akrasia.

Utilizing positive social influence is a pretty common tactic for fighting drug addictions (like in AA), but I haven't really heard of it being used to fight unproductivity. Sharing your personal work/improvement goals with someone in the same position as yourself, along with reflecting on previous attempts, could potentially be powerful. Humans simply feel more responsible for the things they tell other people about, and less responsible for the things they bottle up and don't tell anyone (like all of my productivity strategies).

The setup that I envision would be something like this:

  • On a chat room, or some system like skype.1
  • Meet weekly at a very specific time for a set amount of time.
  • Your partner has a list of the productivity goals you set during the previous session. They ask you about your performance, forcing you to explain either your success or your failure.
  • Your partner tries to articulate what went wrong or what went right from your explanation (giving you a second perspective).
  • Once both parties have shared and evaluated, you set your new goals in light of your new experience (and with your partner's input, hopefully being more effective).
  • The partnership continues as long as it is useful for all parties.

I've tried doing something similar to this with my friends, but it just didn't work. We already knew each other too well, and there wasn't that air of dispassionate professionalism. We were friends, but not partners (in this sense of the word).

If something close to what I describe already exists, or at least serves the same purpose, I would love to hear about it (I already tried the LW study hall, but it wasn't really the structure or atmosphere I was going for). Otherwise, I'd be thrilled to find someone here to try doing this with. You can PM me if you don't want to post here.



1. After explaining this whole idea to someone IRL, they remarked that there would be little social influence because we would only be meeting online in a pseudo-anonymous way. However, I don't find this to be the case personally when I talk with people online, so a virtual environment would be no detriment (hopefully this isn't just unique to me).

Edit (29/3/2015): Just for the record, I wanted to say that I was able to make the connection I wanted, via a PM. Thanks LW!

Values at compile time

5 Stuart_Armstrong 26 March 2015 12:25PM

A putative new idea for AI control; index here.

This is a simple extension of the model-as-definition and the intelligence module ideas. General structure of these extensions: even an unfriendly AI, in the course of being unfriendly, will need to calculate certain estimates that would be of great positive value if we could but see them, shorn from the rest of the AI's infrastructure.

It's almost trivially simple. Have the AI construct a module that models humans and models human understanding (including natural language understanding). This is the kind of thing that any AI would want to do, whatever its goals were.

Then transfer that module (using corrigibility) into another AI, and use it as part of the definition of the new AI's motivation. The new AI will then use this module to follow instructions humans give it in natural language.


Too easy?...

This approach essentially solves the whole friendly AI problem, loading it onto the AI in a way that avoids the whole "defining goals (or meta-goals, or meta-meta-goals) in machine code" or the "grounding everything in code" problems. As such it is extremely seductive, and will sound better, and easier, than it likely is.

I expect this approach to fail. For it to have any chance of success, we need to be sure that both model-as-definition and the intelligence module idea are rigorously defined. Then we have to have a good understanding of the various ways how the approach might fail, before we can even begin to talk about how it might succeed.

The first issue that springs to mind is when multiple definitions fit the AI's model of human intentions and understanding. We might want the AI to try and accomplish all the things it is asked to do, according to all the definitions. Therefore, similarly to this post, we want to phrase the instructions carefully so that a "bad instantiation" simply means the AI does something pointless, rather than something negative. Eg "Give humans something nice" seems much safer than "give humans what they really want".

And then of course there's those orders where humans really don't understand what they themselves want...

I'd want a lot more issues like that discussed and solved, before I'd recommend using this approach to getting a safe FAI.

What I mean...

4 Stuart_Armstrong 26 March 2015 11:59AM

A putative new idea for AI control; index here.

This is a simple extension of the model-as-definition and the intelligence module ideas. General structure of these extensions: even an unfriendly AI, in the course of being unfriendly, will need to calculate certain estimates that would be of great positive value if we could but see them, shorn from the rest of the AI's infrastructure.

The challenge is to get the AI to answer a question as accurately as possible, using the human definition of accuracy.

First, imagine an AI with some goal is going to answer a question, such as Q="What would happen if...?" The AI is under no compulsion to answer it honestly.

What would the AI do? Well, if it is sufficiently intelligent, it will model humans. It will use this model to understand what they meant by Q, and why they were asking. Then it will ponder various outcomes, and various answers it could give, and what the human understanding of those answers would be. This is what any sufficiently smart AI (friendly or not) would do.

Then the basic idea is to use modular design and corrigibility to extract the relevant pieces (possibly feeding them to another, differently motivated AI). What needs to be pieced together is: AI understanding of what human understanding of Q is, actual answer to Q (given this understanding), human understanding of various AI's answers (using model of human understanding), and minimum divergence between human understanding of answer and actual answer.

All these pieces are there, and if they can be safely extracted, the minimum divergence can be calculated and the actual answer calculated.

Revisiting Non-centrality

5 casebash 26 March 2015 01:49AM

I recently read this article by Yvain where he discusses what he calls the worst argument in the world, or the "non-centrality fallacy". If you haven't read the article, he introduces it with a hypothetical objection to a proposal to build a statue for Martin Luther King. He imagines that someone objects on the grounds that Martin Luther King was a "criminal", which would technically be true, as some of his protests were not legal. Yvain notes that while the typical response would be to argue that he wasn't a criminal, a more intellectually robust approach would be to argue that he was actually "the good kind of criminal". However, this would result in you looking silly. The non-centrality fallacy gets its name because someone is trying to convince us to treat an unusual member of a class (i.e. Martin Luther King within the class of "criminals") like the members of that class we first think of (thieves, drug dealers, etc.).

I think that this is a very valuable line of analysis, but unfortunately, labelling something as a "fallacy" is very black and white, given that different people will consider different items to be central or non-central. I think that a much better approach is to identify the ways in which a particular example is either typical or non-typical. I will use + for similarities, - for differences, and ~ for things that aren't necessarily non-typical of the class, but are worth separating out anyway. I will use "arguably" in front of all statements that aren't definite matters of fact.


Martin Luther King and criminals

+ Technically broke the law

~ Non-violent

- Arguably, non-selfish motivations

- Arguably, attempting to challenge unjust laws


Abortion and murder

+ Kills a human/human-like entity

- Complete absence/major differences in cognitive abilities

- No relationships with friends and family

- Doesn't create a fear of being killed within society

- Society hasn't invested significantly in raising the entity


Capital punishment and murder

+ Kills a human

- Arguably, decreases crime

- Arguably, helps the family move on

- Arguably, deserved as a punishment


Affirmative action and racism

+ Treats people differently based on race

+ Arguably increases tensions between races

+ Arguably, provides some individuals with undeserved opportunities

- Arguably, ensures equality of opportunity

- Arguably, reduces racism


Taxation and theft

+ Takes money without consent

- Accepted by the vast majority of the population

- Arguably, necessary for the functioning of society

- Arguably, necessary for addressing fundamentally unfair aspects of our society


When we have this as a list of positives and negatives, instead of a single conclusion about whether or not it is fallacious, I think that it opens up the conversation, instead of closing it down. Take for example the person who believes that capital punishment is accurately described as murder. They might argue that murdering a drug lord would decrease crime, but it is still murder. They would note that sometimes people who are killed are really bad people who treated others really badly, but it is still murder. They could argue that someone wouldn't have the right to go and kill someone who abused them, even if it would help them move on. So it isn't immediately clear whether the given example is central or not.

I suspect that in most of these cases, you won't be able to shift the person's point of view about the categorisation. However, you may be able to give them much more insight about where the disagreement lies. Neutral parties will also be much less likely to get caught up in the argument by definition.

Political topics attract participants inclined to use the norms of mainstream political debate, risking a tipping point to lower quality discussion

36 emr 26 March 2015 12:14AM

(I hope that is the least click-baity title ever.)

Political topics elicit lower quality participation, holding the set of participants fixed. This is the thesis of "politics is the mind-killer".

Here's a separate effect: Political topics attract mind-killed participants. This can happen even when the initial participants are not mind-killed by the topic. 

Since outreach is important, this could be a good thing. Raise the sanity waterline! But the sea of people eager to enter political discussions is vast, and the epistemic problems can run deep. Of course not everyone needs to come perfectly prealigned with community norms, but any community will be limited in how robustly it can handle an influx of participants expecting a different set of norms. If you look at other forums, it seems to take very little overt contemporary political discussion before the whole place is swamped, and politics becomes endemic. As appealing as "LW, but with slightly more contemporary politics" sounds, it's probably not even an option. You have "LW, with politics in every thread", and "LW, with as little politics as we can manage".

That said, most of the problems are avoided by just not saying anything that pattern-matches too easily to current political issues. From what I can tell, LW has always had tons of meta-political content, which doesn't seem to cause problems, as well as standard political points presented in unusual ways, and contrarian political opinions that are too marginal to raise concern. Frankly, if you have a "no politics" norm, people will still talk about politics, but to a limited degree. But if you don't even half-heartedly (or even hypocritically) discourage politics, then an open-entry site that accepts general topics will risk spiraling too far in a political direction.

As an aside, I'm not apolitical. Although some people advance a more sweeping dismissal of the importance or utility of political debate, this isn't required to justify restricting politics in certain contexts. The sort of argument I've sketched (I don't want LW to be swamped by the worse sorts of people who can be attracted to political debate) is enough. There's no hypocrisy in not wanting politics on LW, but accepting political talk (and the warts it entails) elsewhere. Off the top of my head, Yvain is one LW affiliate who now largely writes about more politically charged topics on their own blog (SlateStarCodex), and there are some other progressive blogs in that direction. There are libertarians and right-leaning (reactionary? NRx-lbgt?) connections. I would love a grand unification as much as anyone (of course, provided we all realize that I've been right all along), but please let's not tell the generals to bring their armies here for the negotiations.

Models as definitions

6 Stuart_Armstrong 25 March 2015 05:46PM

A putative new idea for AI control; index here.

The insight this post comes from is a simple one: defining concepts such as “human” and “happy” is hard. A superintelligent AI will probably create good definitions of these while attempting to achieve its goals: a good definition of “human” because it needs to control them, and of “happy” because it needs to converse convincingly with us. It is annoying that these definitions will exist, but that we won’t have access to them.


Modelling and defining

Imagine a game of football (or, as you Americans should call it, football). And now imagine a computer game version of it. How would you say that the computer game version (which is nothing more than an algorithm) is also a game of football?

Well, you can start listing features that they have in common. They both involve two “teams” fielding eleven “players” each, that “kick” a “ball” that obeys certain equations, aiming to stay within the “field”, which has different “zones” with different properties, etc...

As you list more and more properties, you refine your model of football. There are some properties that distinguish real from simulated football (fine details about the human body, for instance), but most of the properties that people care about are the same in both games.

My idea is that once you have a sufficiently complex model of football that applies to both the real game and a (good) simulated version, you can use that as the definition of football. And compare it with other putative examples of football: maybe in some places people play on the street rather than on fields, or maybe there are more players, or maybe some other games simulate different aspects to different degrees. You could try and analyse this with information theoretic considerations (i.e. given two models of two different examples, how much information is needed to turn one into the other).
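As a toy illustration of that comparison (the property lists below are invented for the example, not taken from the post), one could count how many properties must change to turn one model into another:

```python
# Toy sketch: represent each putative "football" as a dictionary of named
# properties, and measure the distance between two models as the number of
# properties that must be added, removed, or altered.

street_football = {"teams": 2, "players_per_side": 5, "surface": "street",
                   "offside_rule": False, "ball_physics": "standard"}
standard_football = {"teams": 2, "players_per_side": 11, "surface": "grass",
                     "offside_rule": True, "ball_physics": "standard"}

def model_distance(a, b):
    """Count the properties on which the two models disagree."""
    return sum(1 for k in set(a) | set(b) if a.get(k) != b.get(k))

print(model_distance(street_football, standard_football))  # prints 3
```

A real information-theoretic measure would weight properties by description length rather than counting them equally, but the idea is the same: the smaller the distance between the models, the more confidently both examples count as the same game.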

Now, this resembles the “suggestively labelled lisp tokens” approach to AI, or the Cyc approach of just listing lots of syntax stuff and their relationships. Certainly you can’t keep an AI safe by using such a model of football: if you try to contain the AI by saying “make sure that there is a ‘Football World Cup’ played every four years”, the AI will still optimise the universe and then play out something that technically fits the model every four years, without any humans around.

However, it seems to me that ‘technically fitting the model of football’ is essentially playing football. The model might include such things as a certain number of fouls expected; an uncertainty about the result; competitive elements among the players; etc... It seems that something that fits a good model of football would be something that we would recognise as football (possibly needing some translation software to interpret what was going on). Unlike the traditional approach, which involves humans listing stuff they think is important and giving it suggestive names, this involves the AI establishing what is important in order to predict all the features of the game.

We might even combine such a model with the Turing test, by motivating the AI to produce a model good enough that it could a) have conversations with many aficionados about all features of the game, b) train a team to expect to win the world cup, and c) use the model to program a successful football computer game. Any model of football that allowed the AI to do this – or, better still, a football-model module that, when plugged into another, ignorant AI, allowed that AI to do this – would be an excellent definition of the game.

It’s also one that could cross ontological crises, as you move from reality, to simulation, to possibly something else entirely, with a new physics: the essential features will still be there, as they are the essential features of the model. For instance, we can define football in Newtonian physics, but still expect that this would result in something recognisably ‘football’ in our world of relativity.

Notice that this approach deals with edge cases mainly by forbidding them. In our world, we might struggle with how to respond to a football player with weird artificial limbs; however, since this was never a feature in the model, the AI will simply classify that as “not football” (or “similar to, but not exactly, football”), since the model’s performance starts to degrade in this novel situation. This is what helps it cross ontological crises: in a relativistic football game based on a Newtonian model, the ball would be forbidden from moving at speeds where the differences in the physics become noticeable, which is perfectly compatible with the game as it’s currently played.


Being human

Now we take the next step, and have the AI create a model of humans. All our thought processes, our emotions, our foibles, our reactions, our weaknesses, our expectations, the features of our social interactions, the statistical distribution of personality traits in our population, how we see ourselves and change ourselves. As a side effect, this model of humanity should include almost every human definition of human, simply because this is something that might come up in a human conversation that the model should be able to predict.

Then simply use this model as the definition of human for an AI’s motivation.

What could possibly go wrong?

I would recommend first having an AI motivated to define “human” in the best possible way, most useful for making accurate predictions, keeping the definition in a separate module. Then the AI is turned off safely and the module is plugged into another AI and used as part of its definition of human in its motivation. We may also use human guidance at several points in the process (either in making, testing, or using the module), especially on unusual edge cases. We might want to have humans correcting certain assumptions the AI makes in the model, up until the AI can use the model to predict what corrections humans would suggest. But that’s not the focus of this post.

There are several obvious ways this approach could fail, and several ways of making it safer. The main problem is if the predictive model fails to define human in a way that preserves value. This could happen if the model is too general (some simple statistical rules) or too specific (a detailed list of all currently existing humans, atom position specified).

This could be combated by making the first AI generate lots of different models, with many different requirements of specificity, complexity, and predictive accuracy. We might require that some models make excellent local predictions (what is the human about to say?), and others excellent global predictions (what is that human going to decide to do with their life?).

Then everything defined as “human” in any of the models counts as human. This results in some wasted effort on things that are not human, but this is simply wasted resources, rather than a pathological outcome (the exception being if some of the models define humans in an actively pernicious way – negative value rather than zero – similarly to the false-friendly AIs’ preferences in this post).
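A minimal sketch of that union rule, where all the names and stand-in classifiers are hypothetical, purely to illustrate the logic:

```python
# Hypothetical sketch: "human" is whatever ANY of the candidate models
# accepts, so over-inclusion wastes resources, but under-inclusion
# (wrongly excluding a real human) is avoided.
def union_definition(models):
    return lambda x: any(model(x) for model in models)

# Invented stand-in models, each imperfect in a different direction:
too_general = lambda x: x.get("has_language", False)        # a simple statistical rule
too_specific = lambda x: x.get("name") in {"Alice", "Bob"}  # a fixed roster

is_human = union_definition([too_general, too_specific])

print(is_human({"name": "Alice"}))                        # → True (on the roster)
print(is_human({"name": "Carol", "has_language": True}))  # → True (general rule)
print(is_human({"name": "paperclip"}))                    # → False (matches neither)
```

The asymmetry is the point of the construction: a false positive costs resources, while a false negative could cost a life, so the models are combined by disjunction rather than conjunction.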

The other problem is a potentially extreme conservatism. Modelling humans involves modelling all the humans in the world today, which is a very narrow space in the range of all potential humans. To prevent the AI lobotomising everyone to fit a simple model (after all, there do exist some lobotomised humans today), we would want the AI to maintain the range of cultures and mind-types that exist today, making things even more unchanging.

To combat that, we might try to identify certain specific features of society that the AI is allowed to change. Political beliefs, certain aspects of culture, geographical location (including being on a planet), death rates, etc... are all things we could plausibly identify (via sub-sub-modules, possibly) as things that are allowed to change. It might be safer to allow them to change within a particular range, rather than just changing altogether (removing all sadness might be a good thing, but there are many more ways this could go wrong than if we e.g. just reduced the probability of sadness).

Another option is to keep these modelled humans little changing, but allow them to define allowable changes themselves (“yes, that’s a transhuman, consider it also a moral agent.”). The risk there is that the modelled humans get hacked or seduced, and that the AI fools our limited brains with a “transhuman” that is one in appearance only.

We also have to beware of sacrificing seldom-used values. For instance, one could argue that current social and technological constraints mean that no one today has anything approaching true freedom. We wouldn’t want an AI that allows us to improve technology and social structures, but never to gain more freedom than we have today, because that freedom is “not in the model”. Again, this is something we could look out for if the AI has separate models of “freedom” that we could assess and permit to change in certain directions.

[POLITICS] Jihadism and a new kind of existential threat

-5 MrMind 25 March 2015 09:37AM

Politics is the mind-killer. Politics IS really the mind-killer. Please meditate on this until politics flows over you like butter on hot teflon, and your neurons stop fibrillating and resume their normal operations.


I've always found it silly that LW, one of the best and most focused groups of rationalists on the web, isn't able to talk evenly about politics. It's true that we are still human, but can't we just make an effort at being calm and level-headed? I think we can. Does gradual exposure work on groups, too? Maybe a little bit of effort combined with a little bit of exposure will work as a vaccine.
And maybe tomorrow a beautiful naked valkyrie will bring me to utopia on her flying unicorn...
Anyway, I want to try. Let's see what happens.


Two recent events have prompted me to make this post: I'm reading "The Rise of the Islamic State" by Patrick Cockburn, which I think does a good job of presenting fairly the very recent history surrounding ISIS, and the terrorist attack in Tunis by the same group, which resulted in 18 foreigners killed.
I believe that their presence in the region is now definitive: they control an area that is wider than Great Britain, with a population tallying over six million, not counting the territories controlled by affiliate groups like Boko Haram. Their influence is also expanding, and the attack in Tunis shows that this entity is not going to stay confined between the borders of Syria and Iraq.
It may well be the case that in the next ten years or so, this will be an international entity which will bring ideas and mores predating the Middle Ages back to the Mediterranean Sea.

A new kind of existential threat

To a mildly rational person, the conflict fueling the rise of the Islamic State, namely the doctrinal differences between Sunni and Shia Islam, is the worst kind of Blue/Green division: a separation that causes hundreds of billions of dollars (read that again) to be wasted trying to kill each other. But here it is, and the world must deal with it.
In comparison, Democrats and Republicans are so close that they could be mistaken for Aumann agreeing.
I fear that ISIS is bringing a new kind of existential threat: one where it is not the existence of humankind that is at risk, but the existence of the idea of rationality.
The funny thing is that while people can be extremely irrational, they can still work on technology to discover new things. Fundamentalism has never stopped a country from achieving technological progress: think of the wonderful skyscrapers and green patches in the desert of the Arab Emirates, or the nuclear weapons of Pakistan. So it might well be the case that in the future some scientist will start a seed AI believing that Allah will guide it to evolve in the best way. But it also might be that in the future, African, Asian and maybe European (gasp!) rationalists will be hunted down and killed like rats.
It might be the very meme of rationality to be erased from existence.


I'll close with a bunch of questions, both strictly and loosely related. Mainly, I'm asking you to refrain from proposing a solution. Let's assess the situation first.

  • Do you think that the Islamic State is an entity which will vanish in the future or not?
  • Do you think that their particularly violent brand of jihadism is a worse menace to the sanity waterline than, say, other kinds of religious movements, past or present?
  • Do you buy the idea that fundamentalism can be coupled with technological advancement, so that the future will present us with Islamic AIs?
  • Do you think that the very idea of rationality could itself be the subject of existential risk?
  • What do Neoreactionaries think of the Islamic State? After all, it's an exemplar case of the reactionaries in those areas winning big. I know it's only a surface comparison; I'm sincerely curious what an NRx thinks of the situation.

Live long and prosper.

Personal Notes On Productivity (A categorization of various resources)

5 CurtisSerVaas 25 March 2015 01:35AM

For each topic, I’ve curated a few links that I’ve found to be pretty high quality. 

  • Meta:(Epiphany Addiction, Reversing Advice, Excellence Porn)
  • @Learning: 
  • Success People: (Mastery),(ChoosingTopics: Osci,PG)
  • Thinking: (Ikigai, Stoicism, Rationality)
  • HabitChange: (!ShootDog)
  • Productivity.Principles/Energy/Relaxation:(FullEngagement, ArtOfLearning)
  • Productivity.Systems/Hacks: (Autofocus, GTD/ZTD, EatFrog),(Scott Young)
  • Depression/Anxiety: 
  • Social: 
  • Meditation 


Full List: https://workflowy.com/s/PpvZyyVzxs


I'd like feedback on: 


  • What other categories/links would you include? (I'm sure there's lots of interesting stuff I'm missing.) What do you think of the categorization? ("Thinking" is a pretty large category.)
  • Whether you think I should make cross-posts about sub-topics here. In particular, I think that SuccessfulPeople.Startups, SuccessfulPeople.Science, and the Meditation document are the most original parts of this post. SuccessfulPeople.Startups contains a categorization of some of Paul Graham's essays (e.g. having ideas, fund-raising, executing, etc.). The SuccessfulPeople.Science link contains a separate categorization of advice specifically for scientists (e.g. picking ideas, the importance of being persistent, the importance of reading widely, etc.). The meditation document lists a few high-quality meditation resources that I've found (and I've read ~10 books on meditation; most of it is crap, and some of the stuff I list is orders of magnitude better than the median meditation book I've read). The main benefit of making more cross-posts is that the discussion/comments would be more focused on those topics.
  • Whatever seems salient to you. 



Social prerequisites of rationality

-4 DeVliegendeHollander 24 March 2015 12:33PM

Summary: it is a prerequisite of rationality that you think you are entitled to your own beliefs, that your beliefs matter, that your actions follow from your own beliefs and not from commands issued by others, and that your actions can make a difference, at the very least in your own life. This may correlate with what one may call either equality or liberty.

Religion as not even attire, just obedience

I know people, mainly old rural folks from CEE, who do not think they are entitled to have a vote in whether there is a God or not. They simply obey. This does NOT mean they base their beliefs on Authority: rather, they think their beliefs do not matter, because nobody asks them about their beliefs. They base their behavior on Authority, because this is what is expected of them. The Big Man in a suit tells you to pay taxes, you do. The Big Man in a white lab coat tells you to take this medicine, you do. The Big Man in priestly robes tells you to kneel and cross yourself, you do. They think forming their own beliefs is above their "pay grade". One old guy, when asked any question outside his expertise, used to tell me "The Paternoster is the priest's business." Meaning: I am not entitled to form any beliefs regarding these matters; I lack the expertise, and I lack the power. I think what we have here is not admirable epistemic humility, but rather a heavy case of disempoweredness: inequality, oppression, lack of equality or liberty, and of course all that internalized.

Empoweredness, liberty, equality

Sure, at very high levels liberty and equality may be enemies: equality beyond a certain level can only be enforced by reducing liberties, and liberty leads to inequality. But only beyond a certain level: at low and mid levels they go hand in hand. My impression is that Americans who fight online for one and against the other simply take for granted the level at which they go hand in hand, having had it for generations. But it is fairly obvious that at lower levels, some amount of liberty presumes some amount of equality and vice versa. Equality also means an equality of power, and with that it is hard to tyrannize over others and reduce their liberties. You can only successfully make others un-free if you wield much higher power than theirs, and then equality goes out the window. The other way around: liberty means the rich person cannot simply decide to bulldoze the poor person's mud hut and build a golf range; he must make an offer to buy it, and the other can refuse that offer: they negotiate as equals. Liberty presumes a certain equality of respect and consideration, or else it would be really straightforward to force the little to serve the big, the small person's goals, autonomy, and property being seen as less important than (unequal to) the grand designs and majestic causes of the big people.

The basic minimal level where equality and liberty goes hand in hand is called being empowered. It means each person has a smaller or bigger sphere (life, limb, property) what his or her decisions and choices shape. And in that sphere, his or her decisions matter. And thus in that sphere, his or her beliefs matter and they are empowered to and entitled to make them. And that is what creates the opportunity for rationality. 

Harking back to the previous point, your personal belief in theism or atheism matters only if it is difficult to force you to go through the motions anyway. Even if it is just an attire, there is a difference between donning it voluntarily and being forced to. If you can be forced to do so, the Higher Ups are plain simply not interested in what you profess and believe. And your parents will probably not try to convince you that certain beliefs are true; rather, they will just raise you to be obedient. Neither a blind believer nor a questioning skeptic be: just obey, go through the Socially Approved Motions. You can see how rationality seems kind of not very useful at that point.

Silicon Valley Rationalists

Paul Graham: "Materially and socially, technology seems to be decreasing the gap between the rich and the poor, not increasing it. If Lenin walked around the offices of a company like Yahoo or Intel or Cisco, he'd think communism had won. Everyone would be wearing the same clothes, have the same kind of office (or rather, cubicle) with the same furnishings, and address one another by their first names instead of by honorifics. Everything would seem exactly as he'd predicted, until he looked at their bank accounts. Oops."

I think the Bay Area may already have had this fairly high level of liberty-cum-equality, of empoweredness. It is fairly easy to see how programmers as employees are more likely to think freely about innovating in a non-authoritarian workplace atmosphere where they are not limited much (liberty) and not made to feel that they are small and the business owner is big (equality). This may be part of the reason why Rationalism emerged there (being a magnet for smart people is obviously another big reason).


Having said all that, I would be reluctant to engage in a project of pushing liberal values on the world in order to prepare the soil for sowing Rationalism. The primary reason is that those values all too often get hijacked - liberalism as an attire. Consider Boris Yeltsin, the soi-disant "liberal" Russian leader who made the office of the president all-powerful and the Duma weak simply because his opponents sat there, i.e. a "liberal" who opposed parliamentarism (arguably one of the most important liberal principles), and who assaulted his opponents with tanks. His "liberalism" was largely about selling everything to Western capitalists and making Russia weak, which explains why Putin is popular - many Russian patriots see Yeltsin as something close to a traitor. Similar "sell everything to Westerners" attitudes meant the demise of the Hungarian liberals, the Alliance of Free Democrats party, who were basically a George Soros party. The point here is not to pass judgement on Yeltsin or those guys, but to point out how this kind of "exported liberalism" gets hijacked, both failing to implement its core values and sooner or later falling out of favor. You cannot cook from recipe books only.

What else then? Well, I don't have a solution. But my basic hunch would be not to import Western values into cultures, but rather to try to tap into the egalitarian or libertarian elements of their own culture. As I demonstrated above, if you start from sufficiently low levels of both, it does not matter which angle you start from. A society too mired in "Wild West Capitalism" may start from the equality angle, saying that the working poor are not intrinsically worth less than the rich, do not deserve to be mere means used for other people's goals, but that each person deserves a basic respect and consideration, which includes that their beliefs and choices should matter, and that those beliefs and choices ought to be rational. A society stuck in a rigid dictatorship may start from the liberty angle: that people deserve more freedom to choose about their lives, and again, those choices and the beliefs that drive them had better be rational.

Indifferent vs false-friendly AIs

8 Stuart_Armstrong 24 March 2015 12:13PM

A putative new idea for AI control; index here.

For anyone but an extreme total utilitarian, there is a great difference between AIs that would eliminate everyone as a side effect of focusing on their own goals (indifferent AIs) and AIs that would effectively eliminate everyone through a bad instantiation of human-friendly values (false-friendly AIs). Examples of indifferent AIs are things like paperclip maximisers, examples of false-friendly AIs are "keep humans safe" AIs who entomb everyone in bunkers, lobotomised and on medical drips.

The difference is apparent when you consider multiple AIs and negotiations between them. Imagine you have a large class of AIs, and that they are all indifferent (IAIs), except for one (which you can't identify) which is friendly (FAI). And you now let them negotiate a compromise between themselves. Then, for many possible compromises, we will end up with most of the universe getting optimised for whatever goals the AIs set themselves, while a small portion (maybe just a single galaxy's resources) would get dedicated to making human lives incredibly happy and meaningful.

But if there is a false-friendly AI (FFAI) in the mix, things can go very wrong. That is because those happy and meaningful lives are a net negative to the FFAI. These humans are running dangers - possibly physical, possibly psychological - that lobotomisation and bunkers (or their digital equivalents) could protect against. Unlike the IAIs, which would only complain about the loss of resources to the FAI, the FFAI finds the FAI's actions positively harmful (and possibly vice versa), making compromises much harder to reach.

And the compromises reached might be bad ones. For instance, what if the FAI and FFAI agree on "half-lobotomised humans" or something like that? You might ask why the FAI would agree to that, but there's a great difference between an AI that would be friendly on its own and one that would choose only friendly compromises with a powerful other AI that has human-relevant preferences.

Some designs of FFAIs might not lead to these bad outcomes - just like IAIs, they might be content to rule over a galaxy of lobotomised humans, while the FAI has its own galaxy off on its own, where its humans face all these dangers. But generally, FFAIs would not come about by someone designing a FFAI, let alone someone designing a FFAI that can safely trade with a FAI. Instead, they would be designing a FAI, and failing. And the closer that design got to being a FAI, the more dangerous the failure could potentially be.

So, when designing an FAI, make sure to get it right. And, though you absolutely positively need to get it absolutely right, make sure that if you do fail, the failure results in a FFAI that can safely be compromised with, if someone else gets out a true FAI in time.

Why I Reject the Correspondence Theory of Truth

6 pragmatist 24 March 2015 11:00AM

This post began life as a comment responding to Peer Gynt's request for a steelman of non-correspondence views of truth. It ended up being far too long for a comment, so I've decided to make it a separate post. However, it might have the rambly quality of a long comment rather than a fully planned out post.

Evaluating Models

Let's say I'm presented with a model and I'm wondering whether I should incorporate it into my belief-set. There are several different ways I could go about evaluating the model, but for now let's focus on two. The first is pragmatic. I could ask how useful the model would be for achieving my goals. Of course, this criterion of evaluation depends crucially on what my goals actually are. It must also take into account several other factors, including my cognitive abilities (perhaps I am better at working with visual rather than verbal models) and the effectiveness of alternative models available to me. So if my job is designing cannons, perhaps Newtonian mechanics is a better model than relativity, since the calculations are easier and there is no significant difference in the efficacy of the technology I would create using either model correctly. On the other hand, if my job is designing GPS systems, relativity might be a better model, with the increased difficulty of calculations being compensated by a significant improvement in effectiveness. If I design both cannons and GPS systems, then which model is better will vary with context.
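To make the cannon/GPS contrast concrete, here is a back-of-the-envelope sketch; the velocities are rough illustrative figures of my own, and only the special-relativistic time dilation is computed (real GPS corrections also include a larger gravitational term):

```python
import math

C = 299_792_458.0  # speed of light in m/s

def gamma(v):
    """Lorentz factor: how much relativity stretches time at speed v."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

v_cannonball = 500.0  # m/s, a rough muzzle velocity
v_gps = 3_874.0       # m/s, approximate GPS satellite orbital speed

# Fractional disagreement between the Newtonian and relativistic clocks:
print(gamma(v_cannonball) - 1.0)  # ~1.4e-12: utterly negligible for aiming a cannon
print(gamma(v_gps) - 1.0)         # ~8.3e-11: tiny per second, but over a day it
                                  # accumulates to microseconds of clock error,
                                  # i.e. kilometre-scale ranging errors uncorrected
```

The numbers illustrate the pragmatic point: the "wrong" Newtonian model and the "right" relativistic one disagree by parts per trillion at cannonball speeds, so which model is more useful depends entirely on the goal.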

Another mode of evaluation is correspondence with reality, the extent to which the model accurately represents its domain. In this case, you don't have much of the context-sensitivity that's associated with pragmatic evaluation. Newtonian mechanics may be more effective than the theory of relativity at achieving certain goals, but (conventional wisdom says) relativity is nonetheless a more accurate representation of the world. If the cannon maker believes in Newtonian mechanics, his beliefs don't correspond with the world as well as they should. According to correspondence theorists, it is this mode of evaluation that is relevant when we're interested in truth. We want to know how well a model mimics reality, not how useful it is.

I'm sure most correspondence theorists would say that the usefulness of a model is linked to its truth. One major reason why certain models work better than others is that they are better representations of the territory. But these two motivations can come apart. It may be the case that in certain contexts a less accurate theory is more useful or effective for achieving certain goals than a more accurate theory. So, according to a correspondence theorist, figuring out which model is most effective in a given context is not the same thing as figuring out which model is true.

How do we go about these two modes of evaluation? Well, evaluation of the pragmatic success of a model is pretty easy. Say I want to figure out which of several models will best serve the purpose of keeping me alive for the next 30 days. I can randomly divide my army of graduate students into several groups, force each group to behave according to the dictates of a separate model, and then check which group has the highest number of survivors after 30 days. Something like that, at least.

But how do I evaluate whether a model corresponds with reality? The first step would presumably involve establishing correspondences between parts of my model and parts of the world. For example, I could say "Let mS in my model represent the mass of the Sun." Then I check to see if the structural relations between the bits of my model match the structural relations between the corresponding bits of the world. Sounds simple enough, right? Not so fast! The procedure described above relies on being able to establish (either by stipulation or discovery) relations between the model and reality. That presupposes that we have access to both the model and to reality, in order to correlate the two. In what sense do we have "access" to reality, though? How do I directly correlate a piece of reality with a piece of my model?

Models and Reality

Our access to the external world is entirely mediated by models, either models that we consciously construct (like quantum field theory) or models that our brains build unconsciously (like the model of my immediate environment produced in my visual cortex). There is no such thing as pure, unmediated, model-free access to reality. But we often do talk about comparing our models to reality. What's going on here? Wouldn't such a comparison require us to have access to reality independent of the models? Well, if you think about it, whenever we claim to be comparing a model to reality, we're really comparing one model to another model. It's just that we're treating the second model as transparent, as an uncontroversial proxy for reality in that context. Those last three words matter: A model that is used as a criterion for reality in one investigative context might be regarded as controversial -- as explicitly a model of reality rather than reality itself -- in another context.

Let's say I'm comparing a drawing of a person to the actual person. When I say things like "The drawing has a scar on the left side of the face, but in reality the scar is on the right side", I'm using the deliverances of visual perception as my criterion for "reality". But in another context, say if I'm talking about the psychology of perception, I'd talk about my perceptual model as compared (and, therefore, contrasted) to reality. In this case my criterion for reality will be something other than perception, say the readings from some sort of scientific instrument. So we could say things like, "Subjects perceive these two colors as the same, but in reality they are not." But by "reality" here we mean something like "the model of the system generated by instruments that measure surface reflectance properties, which in turn are built based on widely accepted scientific models of optical phenomena".

When we ordinarily talk about correspondence between models and reality, we're really talking about the correspondence between bits of one model and bits of another model. The correspondence theory of truth, however, describes truth as a correspondence relation between a model and the world itself. Not another model of the world, the world. And that, I contend, is impossible. We do not have direct access to the world. When I say "Let mS represent the mass of the Sun", what I'm really doing is correlating a mathematical model with a verbal model, not with immediate reality. Even if someone asks me "What's the Sun?", and I point at the big light in the sky, all I'm doing is correlating a verbal model with my visual model (a visual model which I'm fairly confident is extremely similar, though not exactly the same, as the visual model of my interlocutor). Describing correspondence as a relationship between models and the world, rather than a relationship between models and other models, is a category error.

So I can go about the procedure of establishing correspondences all I want, correlating one model with another. All this will ultimately get me is coherence. If all my models correspond with one another, then I know that there is no conflict between my different models. My theoretical model coheres with my visual model, which coheres with my auditory model, and so on. Some philosophers have been content to rest here, deciding that coherence is all there is to truth. If the deliverances of my scientific models match up with the deliverances of my perceptual models perfectly, I can say they are true. But there is something very unsatisfactory about this stance. The world has just disappeared. Truth, if it is anything at all, involves both our models and the world. However, the world doesn't feature in the coherence conception of truth. I could be floating in a void, hallucinating various models that happen to cohere with one another perfectly, and I would have attained the truth. That can't be right.

Correspondence Can't Be Causal

The correspondence theorist may object that I've stacked the deck by requiring that one consciously establish correlations between models and the world. The correspondence isn't a product of stipulation or discovery, it's a product of basic causal connections between the world and my brain. This seems to be Eliezer's view. Correspondence relations are causal relations. My model of the Sun corresponds with the behavior of the actual Sun, out there in the real world, because my model was produced by causal interactions between the actual Sun and my brain.

But I don't think this maneuver can save the correspondence theory. The correspondence theory bases truth on a representational relationship between models/beliefs and the world. A model is true if it accurately represents its domain. Representation is a normative relationship. Causation is not. What I mean by this is that representation has correctness conditions. You can meaningfully say "That's a good representation" or "That's a bad representation". There is no analog with causation. There's no sense in which some particular putatively causal relation ends up being a "bad" causal relation. Ptolemy's beliefs about the Sun's motion were causally entangled with the Sun, yet we don't want to say that those beliefs are accurate. It seems mere causal entanglement is insufficient. We need to distinguish between the right sort of causal entanglement (the sort that gets you an accurate picture of the world) and the wrong sort. But figuring out this distinction takes us back to the original problem. If we only have immediate access to models, on what basis can we decide whether our models are caused by the world in a manner that produces an accurate picture? To determine this, it seems we again need unmediated access to the world.

Back to Pragmatism

Ultimately, it seems to me the only clear criterion the correspondence theorist can establish for correlating the model with the world is actual empirical success. Use the model and see if it works for you, if it helps you attain your goals. But this is exactly the same as the pragmatic mode of evaluation which I described above. And the representational mode of evaluation is supposed to differ from this.

The correspondence theorist could say that pragmatic success is a proxy for representational success. Not a perfect proxy, but good enough. The response is, "How do you know?" If you have no independent means of determining representational success, if you have no means of calibration, how can you possibly determine whether or not pragmatic success is a good proxy for representational success? I mean, I guess you can just assert that a model that is extremely pragmatically successful for a wide range of goals also corresponds well with reality, but how does that assertion help your theory of truth? It seems otiose. Better to just associate truth with pragmatic success itself, rather than adding the unjustifiable assertion to rescue the correspondence theory.

So yeah, ultimately I think the second of the two means of evaluating models I described at the beginning (correspondence) can only really establish coherence between your various models, not coherence between your models and the world. Since that sort of evaluation is not world-involving, it is not the correct account of truth. Pragmatic evaluation, on the other hand, *is* world-involving. You're testing your models against the world, seeing how effective they are at helping you accomplish your goal. That is the appropriate normative relationship between your beliefs and the world, so if anything deserves to be called "truth", it's pragmatic success, not correspondence.

This has consequences for our conception of what "reality" is. If you're a correspondence theorist, you think reality must have some form of structural similarity to our beliefs. Without some similarity in structure (or at least potential similarity) it's hard to say how one meaningfully could talk about beliefs representing reality or corresponding to reality. Pragmatism, on the other hand, has a much thinner conception of reality. The real world, on the pragmatic conception is just an external constraint on the efficacy of our models. We try to achieve certain goals using our models and something pushes back, stymieing our efforts. Then we need to build improved models in order to counteract this resistance. Bare unconceptualized reality, on this view, is not a highly structured field whose structure we are trying to grasp. It is a brute, basic constraint on effective action.

It turns out that working around this constraint requires us to build complex models -- scientific models, perceptual models, and more. These models become proxies for reality, and we treat various models as "transparent", as giving us a direct view of reality, in various contexts. This is a useful tool for dealing with the constraints offered by reality. The models are highly structured, so in many contexts it makes sense to talk about reality as highly structured, and to talk about our other models matching reality. But it is also important to realize that when we say "reality" in those contexts, we are really talking about some model, and in other contexts that model need not be treated as transparent. Not realizing this is an instance of the mind projection fallacy. If you want a context-independent, model-independent notion of reality, I think you can say no more about it than "a constraint on our models' efficacy".

That sort of reality is not something you represent (since representation assumes structural similarity), it's something you work around. Our models don't mimic that reality, they are tools we use to facilitate effective action under the constraints posed by reality. All of this, as I said at the beginning, is goal and context dependent, unlike the purported correspondence theory mode of evaluating models. That may not be satisfactory, but I think it's the best we have. Pragmatist theory of truth for the win.


Michael Oakeshott's critique of something-he-called-rationalism

-1 DeVliegendeHollander 24 March 2015 09:20AM

Ideally, participants in this discussion would have read his relevant essays (collected in the book Rationalism in Politics), but as an introduction this will do, and this one is also good.

Clearly, Oakeshott means something different by Rationalism than LW does. I will call it SOCR (Something Oakeshott Calls Rationalism) from now on.

SOCR is the idea that you can learn to cook from a recipe book by following its algorithms. He argues this used to be a popular idea in early 20th-century Britain, and that it is false. Recipe books are written for people who can already cook, and that knowledge comes only from experience, not books: either self-discovery or apprenticeship. Try to learn to cook from a recipe book and the book will not teach you; your own failed experiments will, the hard way, and you will end up rediscovering cooking by trial and error. Apprenticing is easier. The recipe-book writer assumes the recipe works on an empty mind, while it only works on a mind already filled with experience. And what is worse, minds are often filled with the wrong kind of experience.

While Oakeshott can be accused of endorsing "life experience as conversation stopper", his main argument is basically about how knowledge is communicable. You have knowledge in your head, much of it gathered through experience; you may not be able to communicate every aspect of it by training an apprentice, and even less by writing a book. Doing things is often more of an art than a science. Worse, you would expect the student's cup to be pre-filled with the right kind of stuff, but often it is empty or filled with the wrong kind of stuff, which gets your book misunderstood.

Oakeshott focused on politics because his main point was that following a recipe book like Marxism-Leninism or Maoism is not simply a bad idea, but literally impossible: the doctrine you learn will be colored by your pre-existing experience, and you will do whatever your experience dictates anyway. The issue is misleading yourself and others into thinking you are implementing a recipe, an algorithm, when that is not the case.

Oakeshott is basically saying that you can never predict what the Soviet Union will do by looking at the Marxist books its leaders read. However, if you add up the experience of Tsarist imperialism and the experience of being a very reasonably paranoid revolutionary on the run from the Okhrana, fearing betrayal at every corner, you may predict what they are up to better.

SOCR is clearly not LWR, and it is unfortunate that the word "Rationalism" appears in both. Since I was exposed to Oakeshott and similar ideas earlier than to LW, I would actually prefer a different term for LWR, like "pragmatic reason", but it is not up to me to make this choice, at least not in English; I may try to influence other languages, though.

Ultimately, Oakeshott ends up with a set of ideas very similar to LW's, such as coming down on the side of the shepherd in The Simple Truth, not on the side of Marcos Sophisticus. In fact, the SOCR that Oakeshott criticizes is clearly the latter:

>As knowledge of the realm of the shadows is a real and hard-won achievement, the theorist goes gravely astray when he relies on his theoretical insights to issue directives to the practitioner, ridiculously trying to “set straight” the practical man on matters with which the theorist has no familiarity. The cave dwellers, first encountering the theorist on his return, might be impressed “when he tells them that what they had always thought of as ‘a horse’ is not what they suppose it to be . . . but is, on the contrary, a modification of the attributes of God. . . . But if he were to tell them that, in virtue of his more profound understanding of the nature of horses, he is a more expert horse-man, horse-chandler, or stable boy than they (in their ignorance) could ever hope to be, and when it becomes clear that his new learning has lost him the ability to tell one end of a horse from the other . . . [then] before long the more perceptive of the cave-dwellers [will] begin to suspect that, after all, he [is] not an interesting theorist but a fuddled and pretentious ‘theoretician’ who should be sent on his travels again, or accommodated in a quiet home.”

Ultimately both LW and Oakeshott support the cave dwellers. It is just unfortunate that they use the term "Rationalist" in entirely opposite meanings.

Request for Steelman: Non-correspondence concepts of truth

12 PeerGynt 24 March 2015 03:11AM

A couple of days ago, Buybuydandavis wrote the following on Less Wrong:

> I'm increasingly of the opinion that truth as correspondence to reality is a minority orientation.

I've spent a lot of energy over the last couple of days trying to come to terms with the implications of this sentence.  While it certainly corresponds with my own observations about many people, the thought that most humans simply reject correspondence to reality as the criterion for truth seems almost too outrageous to take seriously.  If upon further reflection I end up truly believing this, it seems  that it would be impossible for me to have a discussion about the nature of reality with the great majority of the human race.  In other words, if I truly believed this, I would label most people as being too stupid to have a real discussion with. 

However, this reaction seems like an instance of a failure mode described by Megan McArdle:

> I’m always fascinated by the number of people who proudly build columns, tweets, blog posts or Facebook posts around the same core statement: “I don’t understand how anyone could (oppose legal abortion/support a carbon tax/sympathize with the Palestinians over the Israelis/want to privatize Social Security/insert your pet issue here)." It’s such an interesting statement, because it has three layers of meaning.

> The first layer is the literal meaning of the words: I lack the knowledge and understanding to figure this out. But the second, intended meaning is the opposite: I am such a superior moral being that I cannot even imagine the cognitive errors or moral turpitude that could lead someone to such obviously wrong conclusions. And yet, the third, true meaning is actually more like the first: I lack the empathy, moral imagination or analytical skills to attempt even a basic understanding of the people who disagree with me.

> In short, “I’m stupid.” Something that few people would ever post so starkly on their Facebook feeds.

With this background, it seems important to improve my model of people who reject correspondence as the criterion for truth.  The obvious first place to look is in academic philosophy.  The primary challenger to correspondence theory is called “coherence theory”. If I understand correctly, coherence theory says that a statement is true iff it is logically consistent with “some specified set of sentences”.

Coherence is obviously an important concept, which has valuable uses, for example in formal systems. It does not capture my idea of what the word “truth” means, but that is purely a semantic issue. I would be willing to cede the word “truth” to the coherence camp if we agreed on a separate word to mean “correspondence to reality”. However, my intuition is that they wouldn't let us get away with this. I sense that there are people out there who genuinely object to the very idea of discussing whether sentences correspond to reality.


So it seems I have a couple of options:

1. I can look for empirical evidence that buybuydandavis is wrong, ie that most people accept correspondence to reality as the criterion for truth

2. I can try to convince people to use some other word for correspondence to reality, so they have the necessary semantic machinery to have a real discussion about what reality is like

3. I can accept that most people are unable to have a discussion about the nature of reality

4. I can attempt to steelman the position that truth is something other than correspondence


Option 1 appears unlikely to be true. Option 2 seems unlikely to work.  Option 3 seems very unattractive, because it would be very uncomfortable to have discussions that on the surface appear to be about the nature of reality, but which really are about something else, where the precise value of "something else" is unknown to me. 

I would therefore be very interested in a steelman of non-correspondence concepts of truth. I think it would be important not only for me, but also for the rationalist community as a group, to get a more accurate model of how non-rationalists think about "truth".



Learning by Doing

3 adamzerner 24 March 2015 01:56AM

Most people believe very strongly that the best way to learn is to learn by doing, particularly in the field of programming.

I have a different perspective. I see learning as very dependency-based. That is, there are a bunch of concepts you have to know. Think of them as nodes in a graph. These nodes have dependencies: you have to know A, B, and C before you can learn D.

And so I'm always thinking about how to most efficiently traverse this graph of nodes. The efficient way is to learn things in the proper order. For example, if you try to learn D without first understanding, say, A and C, you'll struggle. I think it'd be more efficient to identify what your holes are (A and C) and address them first before trying to learn D.
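The dependency-graph framing above lends itself to a concrete sketch: if concepts are nodes and prerequisites are edges, an efficient learning order is just a topological sort of the graph. A minimal Python illustration, using the hypothetical concepts A through D from the paragraph above:

```python
# A minimal sketch of the concept-graph idea: concepts are nodes,
# prerequisites are edges, and an efficient learning order is a
# topological sort of the graph.
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Each concept maps to the set of concepts you must know first.
prerequisites = {
    "D": {"A", "B", "C"},  # you must know A, B, and C before learning D
    "C": {"A"},
    "B": set(),
    "A": set(),
}

order = list(TopologicalSorter(prerequisites).static_order())
print(order)  # a valid study order: every prerequisite comes before its dependent
```

Any valid ordering works; the point is only that addressing your holes first falls out automatically once the dependencies are written down.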

I don't think that the "dive into a project" approach leads to an efficient traversal of this concept graph. That's not to say that it doesn't have its advantages. Here are some:

  1. People tend to find the act of building something fun, and thus motivating (even if it has no* use other than as a means to the end of learning).
  2. It's often hard to construct a curriculum that is comprehensive enough. Doing real world projects often forces you to do things that are hard to otherwise address.
  3. It's often hard to construct a curriculum that is ordered properly. Doing real world projects often is a reasonably efficient way of traversing the graph of nodes.

Personally, I think that as far as 2) and 3) go, projects sometimes have their place, but they should be combined heavily with some sort of more formal curriculum, and the projects should be focused projects. My real point is that the tradeoffs should always be taken into account, and that an efficient traversal of the concept graph is largely what you're after.

I should note that I feel most strongly about projects being overrated in the field of programming. I also feel rather strongly about it for quantitative fields in general. But in my limited experience with non-quantitative fields, I sense that 2) and 3) are too difficult to do formally and that projects are probably the best approximations (in 2015; in the future I anticipate smart tutors being way more effective than any project ever was or can be). For example, I've spent some time trying to learn design by reading books and stuff on the internet, but I sense that I'm really missing something that is hard to get without doing projects under the guidance of a good instructor.

What do you guys think about all of this?


Side Notes:

*Some people think, "projects are also good because when you're done, you've produced something cool!". I don't buy this argument.
  • Informal response: "C'mon, how many of the projects that you do as you're learning ever end up being used, let alone produce real utility for people?".
  • More formal response: I really believe in the idea that the productivity of programmers differs by orders of magnitude. I.e., someone who's 30% more knowledgeable might be 100x more productive (as in faster and able to solve more difficult problems). And so, if you want to be productive, you'd be better off investing in learning until you're really good, and then "cashing in" by producing.
1. Another thing I hate: when people say, "you just have to practice". I've asked people, "how can I get good at X?" and they've responded, "you just have to practice". And they say it with that condescending sophisticated cynicism. And even after I prod, they remain firm in their affirmation that you "just have to practice". It feels to me like they're saying, "I'm sorry, there's no way to efficiently traverse the graph. It's all the same. You just have to keep practicing." Sorry for the rant-y tone :/. I think that my System I is suffering from the illusion of transparency. I know that they don't see learning as traversing a graph like I do, and that they're probably just trying to give me good advice based on what they know.

2. My thoughts on learning to learn.

View more: Next