Revitalizing Less Wrong seems like a lost purpose, but here are some other ideas

19 John_Maxwell_IV 12 June 2016 07:38AM

This is a response to ingres' recent post sharing Less Wrong survey results. If you haven't read & upvoted it, I strongly encourage you to--they've done a fabulous job of collecting and presenting data about the state of the community.

So, there's a bit of a contradiction in the survey results.  On the one hand, people say the community needs to do more scholarship, be more rigorous, be more practical, be more humble.  On the other hand, not much is getting posted, and it seems like raising the bar will only exacerbate that problem.

I did a query against the survey database to find the complaints of top Less Wrong contributors and figure out how best to serve their needs.  (Note: it's a bit hard to read the comments because some of them should start with "the community needs more" or "the community needs less", but adding that info would have meant constructing a much more complicated query.)  One user wrote:

[it's not so much that there are] overly high standards, just not a very civil or welcoming climate. why write content for free and get trashed when I can go write a grant application or a manuscript instead?

ingres emphasizes that in order to revitalize the community, we would need more content.  Content is important, but incentives for producing content might be even more important.  Social status may be the incentive humans respond to most strongly.  Right now, from a social status perspective, the expected value of creating a new Less Wrong post doesn't feel very high.  Partly because many LW posts get downvotes and critical comments, so my System 1 expects mine will too.  And partly because the Less Wrong brand is weak enough that I don't expect associating myself with it to boost my social status.

When Less Wrong was founded, the primary failure mode guarded against was Eternal September.  If Eternal September represents a sort of digital populism, Less Wrong was attempting a sort of digital elitism.  My perception is that elitism isn't working because the benefits of joining the elite are too small and the costs are too large.  Teddy Roosevelt talked about the man in the arena--I think Less Wrong experienced the reverse of the evaporative cooling EY feared, where people gradually left the arena as the proportional number of critics in the stands grew ever larger.

Given where Less Wrong is at, however, I suspect the goal of revitalizing Less Wrong represents a lost purpose.

ingres' survey received a total of 3083 responses.  Not only is that about twice the number we got in the last survey in 2014, it's about twice the number we got in 2013, 2012, and 2011 (though much bigger than the first survey in 2009).  It's hard to know for sure, since previous surveys were only advertised on the LessWrong.com domain, but it doesn't seem like the diaspora thing has slowed the growth of the community a ton and it may have dramatically accelerated it.

Why has the community continued growing?  Here's one possibility.  Maybe Less Wrong has been replaced by superior alternatives.

  • CFAR - ingres writes: "If LessWrong is serious about its goal of 'advancing the art of human rationality' then it needs to figure out a way to do real investigation into the subject."  That's exactly what CFAR does.  CFAR is a superior alternative for people who want something like Less Wrong, but more practical.  (They have an alumni mailing list that's higher quality and more active than Less Wrong.)  Yes, CFAR costs money, because doing research costs money!
  • Effective Altruism - A superior alternative for people who want something that's more focused on results.
  • Facebook, Tumblr, Twitter - People are going to be wasting time on these sites anyway.  They might as well talk about rationality while they do it.  Like all those phpBB boards in the 00s, Less Wrong has been outcompeted by the hot new thing, and I think it's probably better to roll with it than fight it.  I also wouldn't be surprised if interacting with others through social media has been a cause of community growth.
  • SlateStarCodex - SSC already checks most of the boxes under ingres' "Future Improvement Wishlist Based On Survey Results".  In my opinion, the average SSC post has better scholarship, rigor, and humility than the average LW post, and the community seems less intimidating, less argumentative, more accessible, and more accepting of outside viewpoints.
  • The meatspace community - Meeting in person has lots of advantages.  Real-time discussion using Slack/IRC also has advantages.

Less Wrong had a great run, and the superior alternatives wouldn't exist in their current form without it.  (LW was easily the most common way people heard about EA in 2014, for instance, although sampling effects may have distorted that estimate.)  But that doesn't mean it's the best option going forward.

Therefore, here are some things I don't think we should do:

  • Try to be a second-rate version of any of the superior alternatives I mentioned above.  If someone's going to put something together, it should fulfill a real community need or be the best alternative available for whatever purpose it serves.
  • Try to get old contributors to return to Less Wrong for the sake of getting them to return.  If they've judged that other activities are a better use of time, we should probably trust their judgement.  It might be sensible to make an exception for old posters that never transferred to the in-person community, but they'd be harder to track down.
  • Try to solve the same sort of problems Arbital or Metaculus are optimizing for.  No reason to step on the toes of other projects in the community.

But that doesn't mean there's nothing to be done.  Here are some possible weaknesses I see with our current setup:

  • If you've got a great idea for a blog post, and you don't already have an online presence, it's a bit hard to reach lots of people, if that's what you want to do.
  • If we had a good system for incentivizing people to write great stuff (as opposed to merely tolerating great stuff the way LW culture historically has), we'd get more great stuff written.
  • It can be hard to find good content in the diaspora.  Possible solution: Weekly "diaspora roundup" posts to Less Wrong.  I'm too busy to do this, but anyone else is more than welcome to (assuming both people reading LW and people in the diaspora want it).

ingres mentions the possibility of Scott Alexander somehow opening up SlateStarCodex to other contributors.  This seems like a clearly superior alternative to revitalizing Less Wrong, if Scott is down for it:

  • As I mentioned, SSC already seems to have solved most of the culture & philosophy problems that people complained about with Less Wrong.
  • SSC has no shortage of content--Scott has increased the rate at which he creates open threads to deal with an excess of comments.
  • SSC has a stronger brand than Less Wrong.  It's been linked to by Ezra Klein, Ross Douthat, Bryan Caplan, etc.

But the most important reasons may be behavioral reasons.  SSC has more traffic--people are in the habit of visiting there, not here.  And the posting habits people have acquired there seem more conducive to community.  Changing habits is hard.

As ingres writes, revitalizing Less Wrong is probably about as difficult as creating a new site from scratch, and I think creating a new site from scratch for Scott is a superior alternative for the reasons I gave.

So if there's anyone who's interested in improving Less Wrong, here's my humble recommendation: Go tell Scott Alexander you'll build an online forum to his specification, with SSC community feedback, to provide a better solution for his overflowing open threads.  Once you've solved that problem, keep making improvements and subfora so your forum becomes the best available alternative for more and more use cases.

And here's my humble suggestion for what an SSC forum could look like:

As I mentioned above, Eternal September is analogous to a sort of digital populism.  The major social media sites often have a "mob rule" culture to them, and people are increasingly seeing the disadvantages of this model.  Less Wrong tried to achieve digital elitism and it didn't work well in the long run, but that doesn't mean it's impossible.  Edge.org has found a model for digital elitism that works.  There may be other workable models out there.  A workable model could even turn into a successful company.  Fight the hot new thing by becoming the hot new thing.

My proposal is based on the idea of eigendemocracy.  (Recommended that you read the link before continuing--eigendemocracy is cool.)  In eigendemocracy, your trust score is a composite rating of what trusted people think of you.  (It sounds like infinite recursion, but it can be resolved using linear algebra.)
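To make the "resolved using linear algebra" step concrete, here is a minimal power-iteration sketch (the names and trust weights are invented for the example): a user's score is the trust-weighted sum of the scores of the users who trust them, iterated until it settles onto the principal eigenvector of the trust matrix.

```python
# Toy eigendemocracy sketch (names and trust weights invented).  Each row
# lists whom that user trusts; a user's score is the trust-weighted sum of
# the scores of the people who trust them.  Iterating this to a fixed point
# is the power-iteration route to the principal eigenvector.
trust = {
    "alice": {"bob": 1.0, "carol": 1.0},
    "bob":   {"carol": 1.0},
    "carol": {"alice": 1.0},
}

def trust_scores(trust, iterations=100):
    scores = {user: 1.0 for user in trust}
    for _ in range(iterations):
        new = {user: 0.0 for user in trust}
        for truster, edges in trust.items():
            for trustee, weight in edges.items():
                new[trustee] += weight * scores[truster]
        total = sum(new.values())
        scores = {user: s / total for user, s in new.items()}  # renormalise
    return scores

scores = trust_scores(trust)
# carol, trusted by both alice and bob, ends up with the highest score.
```

The apparent infinite recursion ("trusted people are whoever trusted people trust") disappears because repeated multiplication by the trust matrix converges to its dominant eigenvector.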

Eigendemocracy is a complicated idea, but a simple way to get most of the way there would be to have a forum where having lots of karma gives you the ability to upvote multiple times.  How would this work?  Let's say Scott starts with 5 karma and everyone else starts with 0 karma.  Each point of karma gives you the ability to upvote once a day.  Let's say it takes 5 upvotes for a post to get featured on the sidebar of Scott's blog.  If Scott wants to feature a post on the sidebar of his blog, he upvotes it 5 times, netting the person who wrote it 1 karma.  As Scott features more and more posts, he gains a moderation team full of people who wrote posts that were good enough to feature.  As they feature posts in turn, they generate more co-moderators.
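A minimal sketch of the bootstrap mechanism just described (the threshold, names, and data structures are all hypothetical, and daily upvote budgets aren't tracked):

```python
# Toy sketch of the proposed mechanism (threshold and names hypothetical).
# Each point of karma grants one upvote per day; a post that collects
# FEATURE_THRESHOLD upvotes gets featured and earns its author 1 karma.
FEATURE_THRESHOLD = 5

karma = {"scott": 5}  # Scott seeds the system with 5 karma
votes = {}            # post -> upvotes received so far

def upvote(voter, post, n=1):
    # Daily budgets aren't tracked in this sketch; we only check the cap.
    assert n <= karma.get(voter, 0), "can't spend more upvotes than karma"
    votes[post] = votes.get(post, 0) + n

def feature_if_ready(post, author):
    if votes.get(post, 0) >= FEATURE_THRESHOLD:
        karma[author] = karma.get(author, 0) + 1  # author joins the mod pool
        return True
    return False

upvote("scott", "alices-first-post", n=5)  # Scott spends his 5 daily upvotes
featured = feature_if_ready("alices-first-post", "alice")
```

Once featured, the hypothetical author here holds 1 karma of her own, so she can start contributing upvotes toward featuring the next post: the mod team grows exactly as described.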

Why do I like this solution?

  • It acts as a cultural preservation mechanism.  On reddit and Twitter, sheer numbers rule when determining what gets visibility.  The reddit-like voting mechanisms of Less Wrong meant that the site deliberately kept a somewhat low profile in order to avoid getting overrun.  Even if SSC experienced a large influx of new users, those users would only gain power to affect the visibility of content if they proved themselves by making quality contributions first.
  • It takes the moderation burden off of Scott and distributes it across trusted community members.  As the community grows, the mod team grows with it.
  • The incentives seem well-aligned.  Writing stuff Scott likes or meta-likes gets you recognition, mod powers, and the ability to control the discussion--forms of social status.  Contrast with social media sites where hyperbole is a shortcut to attention, followers, upvotes.  Also, unlike Less Wrong, there'd be no punishment for writing a low quality post--it simply doesn't get featured and is one more click away from the SSC homepage.

TL;DR - Despite appearances, the Less Wrong community is actually doing great.  Any successor to Less Wrong should try to offer compelling advantages over options that are already available.

The Singularity Wars

52 JoshuaFox 14 February 2013 09:44AM

(This is an introduction, for those not immersed in the Singularity world, into the history of and relationships between SU, SIAI [SI, MIRI], SS, LW, CSER, FHI, and CFAR. It also has some opinions, which are strictly my own.)

The good news is that there were no Singularity Wars. 

The Bay Area had a Singularity University and a Singularity Institute, each going in a very  different direction. You'd expect to see something like the People's Front of Judea and the Judean People's Front, burning each other's grain supplies as the Romans moved in. 

continue reading »

Intuitive cooperation

16 Adele_L 25 July 2014 01:48AM

This is an exposition of some of the main ideas in the paper Robust Cooperation. My goal is to make the ideas and proofs seem natural and intuitive - instead of some mysterious thing where we invoke Löb's theorem at the right place and the agents magically cooperate. Also I hope it is accessible to people without a math or CS background. Be warned, it is pretty cheesy ok.


In a small quirky town, far away from other cities or towns, the most exciting event is a game called (for historical reasons) The Prisoner's Dilemma. Everyone comes out to watch the big tournament at the end of Summer, and you (Alice) are especially excited because this year it will be your first time playing in the tournament! So you've been thinking of ways to make sure that you can do well.

 

The way the game works is this: Each player can choose to cooperate or defect with the other player. If you both cooperate, then you get two points each. If one of you defects, then that player will get three points, and the other player won't get any points. But if you both defect, then you each get only one point. You have to make your decisions separately, without communicating with each other - however, everyone is required to register the algorithm they will be using before the tournament, and you can look at the other player's algorithm if you want to. You also are allowed to use some outside help in your algorithm. 
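The payoff structure just described, written out as a quick sanity check:

```python
# The payoff matrix from the rules above:
# (your move, their move) -> (your points, their points).
PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}

def play(my_move, their_move):
    return PAYOFFS[(my_move, their_move)]
```

Note that defecting earns strictly more no matter what the other player does (3 > 2 and 1 > 0), yet mutual cooperation beats mutual defection (2 points each vs. 1 each) -- exactly the newcomer's puzzle.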

Now if you were a newcomer, you might think that no matter what the other player does, you can always do better by defecting. So the best strategy must be to always defect! Of course, you know better: if everyone tried that strategy, they would end up defecting against each other, which is a shame since they would both be better off if they had just cooperated. 

But how can you do better? You have to be able to describe your algorithm in order to play. You have a few ideas, and you'll be playing some practice rounds with your friend Bob soon, so you can try them out before the actual tournament. 

Your first plan:

I'll cooperate with Bob if I can tell from his algorithm that he'll cooperate with me. Otherwise I'll defect. 

For your first try, you'll just run Bob's algorithm and see if he cooperates. But there's a problem - if Bob tries the same strategy, he'll have to run your algorithm, which will run his algorithm again, and so on into an infinite loop!
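The infinite loop is easy to make concrete. In this sketch (hypothetical code, purely illustrative), each player's algorithm literally runs the other's, so simulating either one never terminates:

```python
# A concrete (hypothetical) version of the infinite loop: each player's
# algorithm is "run the other player's algorithm against me and copy a
# cooperative response", so the two simulations call each other forever.
import sys

def alice(opponent):
    return "C" if opponent(alice) == "C" else "D"

def bob(opponent):
    return "C" if opponent(bob) == "C" else "D"

sys.setrecursionlimit(200)  # fail fast instead of grinding through frames
try:
    alice(bob)
    outcome = "terminated"
except RecursionError:
    outcome = "infinite regress"
```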

So you'll have to be a bit more clever than that... luckily you know a guy, Shady, who is good at these kinds of problems. 


You call up Shady, and while you are waiting for him to come over, you remember some advice your dad Löb gave you. 

(Löb's theorem) "If someone says you can trust them on X, well then they'll just tell you X." 

If □(□X → X)  (someone tells you "if [I tell you] X, then X is true")

Then □X  (someone tells you X is true)

(See The Cartoon Guide to Löb's Theorem[pdf] for a nice proof of this)

Here's an example:

Sketchy watch salesman: Hey, if I tell you these watches are genuine then they are genuine!

You: Ok... so are these watches genuine?

Sketchy watch salesman: Of course!

It's a good thing to remember when you might have to trust someone. If someone you already trust tells you you can trust them on something, then you know that something must be true. 

On the other hand, if someone says you can always trust them, well that's pretty suspicious... If they say you can trust them on everything, that means that they will never tell you a lie - which is logically equivalent to them saying that if they were to tell you a lie, then that lie must be true. So by Löb's theorem, they will lie to you. (Gödel's second incompleteness theorem)


Despite his name, you actually trust Shady quite a bit. He's never told you or anyone else anything that didn't end up being true. And he's careful not to make any suspiciously strong claims about his honesty.

So your new plan is to ask Shady if Bob will cooperate with you. If so, then you will cooperate. Otherwise, defect. (FairBot)

It's game time! You look at Bob's algorithm, and it turns out he picked the exact same algorithm! He's going to ask Shady if you will cooperate with him. Well, the first step is to ask Shady, "will Bob cooperate with me?" 

Shady looks at Bob's algorithm and sees that if Shady says you cooperate, then Bob cooperates. He looks at your algorithm and sees that if Shady says Bob cooperates, then you cooperate. Combining these, he sees that if he says you both cooperate, then both of you will cooperate. So he tells you that you will both cooperate (your dad was right!)

Let A stand for "Alice cooperates with Bob" and B stand for "Bob cooperates with Alice".

From looking at the algorithms, □(□B → A) and □(□A → B).

So combining these, □(□(A ∧ B) → (A ∧ B)).

Then by Löb's theorem, □(A ∧ B).

Since that means that Bob will cooperate, you decide to actually cooperate. 

Bob goes through an analogous thought process, and also decides to cooperate. So you cooperate with each other on the prisoner's dilemma! Yay!


That night, you go home and remark, "it's really lucky we both ended up using Shady to help us, otherwise that wouldn't have worked..."

Your dad interjects, "Actually, it doesn't matter - as long as they were both smart enough to count, it would work. This □ doesn't just say 'I tell you X', it's stronger than that - it actually says 'Anyone who knows basic arithmetic will tell you X'. So as long as they both know a little arithmetic, it will still work - even if one of them is pro-axiom-of-choice, and the other is pro-axiom-of-life. The cooperation is robust." That's really cool! 

But there's another issue you think of. Sometimes, just to be tricky, the tournament organizers will set up a game where you have to play against a rock. Yes, literally just a rock that is holding the cooperate button down. If you played against a rock with your current algorithm, well, you start by asking Shady if the rock will cooperate with you. Shady is like, "well yeah, duh." So then you cooperate too. But you could have gotten three points by defecting! You're missing out on a totally free point! 

You think that it would be a good idea to make sure the other player isn't a complete idiot before you cooperate with them. How can you check? Well, let's see if they would cooperate with a rock placed on the defect button (affectionately known as 'DefectRock'). If they know better than that, and they will cooperate with you, then you will cooperate with them. 


The next morning, you excitedly tell Shady about your new plan. "It will be like before, except this time, I also ask you if the other player will cooperate with DefectRock! If they are dumb enough to do that, then I'll just defect. That way, I can still cooperate with other people who use algorithms like this one, or the one from before, but I can also defect and get that extra point when there's just a rock on cooperate."

Shady gets an awkward look on his face, "Sorry, but I can't do that... or at least it wouldn't work out the way you're thinking. Let's say you're playing against Bob, who is still using the old algorithm. You want to know if Bob will cooperate with DefectRock, so I have to check and see if I'll tell Bob that DefectRock will cooperate with him. I would have to say that I would never tell Bob that DefectRock will cooperate with him. But by Löb's theorem, that means I would tell you this obvious lie! So that isn't gonna work."

Notation: X(Y) = C if X cooperates with Y in the prisoner's dilemma (or X(Y) = D if not). 

You ask Shady, does □(Bob(DefectRock) = D)?

Bob's algorithm: Bob(DefectRock) = C only if □(DefectRock(Bob) = C).

So to say Bob(DefectRock) = D, we would need □(¬□(DefectRock(Bob) = C)).

This is equivalent to □(□(DefectRock(Bob) = C) → DefectRock(Bob) = C), since DefectRock(Bob) = C is an obvious lie. 

By Löb's theorem, □(DefectRock(Bob) = C), which is a lie. 

<Extra credit: does the fact that Shady is the one explaining this mean you can't trust him?>

<Extra extra credit: find and fix the minor technical error in the above argument.>

Shady sees the dismayed look on your face and adds, "...but, I know a guy who can vouch for me, and I think maybe that could make your new algorithm work."

So Shady calls his friend T over, and you work out the new details. You ask Shady if Bob will cooperate with you, and you ask T if Bob will cooperate with DefectRock. So T looks at Bob's algorithm, which asks Shady if DefectRock will cooperate with him. Shady, of course, says no. So T sees that Bob will defect against DefectRock, and lets you know. Like before, Shady tells you Bob will cooperate with you, and thus you decide to cooperate! And like before, Bob decides to cooperate with you, so you both cooperate! Awesome! (PrudentBot)

If Bob is using your new algorithm, you can see that the same argument goes through mostly unchanged, and that you will still cooperate! And against a rock on cooperate, T will tell you that it will cooperate with DefectRock, so you can defect and get that extra point! This is really great!!
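Here is a toy rendering of the bots discussed above. The real agents consult a provability oracle (Shady and T in the story); since implementing one is far beyond a sketch, the oracle here is a hand-coded table holding the answers derived in the story, so this illustrates only the bots' decision rules, not Löb's theorem itself.

```python
# Hand-coded "oracle": provable[(X, Y)] = True means "it is provable that
# X cooperates with Y".  Entries mirror the conclusions reached above.
provable = {
    ("FairBot", "FairBot"): True,        # the Löbian argument above
    ("FairBot", "PrudentBot"): True,
    ("PrudentBot", "FairBot"): True,
    ("FairBot", "DefectRock"): False,    # FairBot defects against DefectRock
    ("CooperateRock", "FairBot"): True,  # the rock cooperates with everyone
    ("CooperateRock", "PrudentBot"): True,
    ("CooperateRock", "DefectRock"): True,
    ("DefectRock", "FairBot"): False,    # Shady never claims this
}

def fairbot(me, other):
    # Cooperate iff it's provable that the other cooperates with me.
    return "C" if provable[(other, me)] else "D"

def prudentbot(me, other):
    # Cooperate iff the other provably cooperates with me AND provably
    # does NOT cooperate with DefectRock (i.e. isn't a complete idiot).
    if provable[(other, me)] and not provable[(other, "DefectRock")]:
        return "C"
    return "D"
```

With these rules, FairBot still wastes its free point against a rock on cooperate, while PrudentBot grabs it and still cooperates with FairBot.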


(ok now it's time for the really cheesy ending)

It's finally time for the tournament. You have a really good feeling about your algorithm, and you do really well! Your dad is in the audience cheering for you, with a really proud look on his face. You tell your friend Bob about your new algorithm so that he can also get that extra point sometimes, and you end up tying for first place with him!

A few weeks later, Bob asks you out, and you two start dating. Being able to cooperate with each other robustly is a good start to a healthy relationship, and you live happily ever after! 

The End.

[Link] - Policy Challenges of Accelerating Technological Change: Security Policy and Strategy Implications of Parallel Scientific Revolutions

4 ete 28 January 2015 03:29PM

From a paper by Center for Technology and National Security Policy & National Defense University:

"Strong AI: Strong AI has been the holy grail of artificial intelligence research for decades. Strong AI seeks to build a machine which can simulate the full range of human cognition, and potentially include such traits as consciousness, sentience, sapience, and self-awareness. No AI system has so far come close to these capabilities; however, many now believe that strong AI may be achieved sometime in the 2020s. Several technological advances are fostering this optimism; for example, computer processors will likely reach the computational power of the human brain sometime in the 2020s (the so-called “singularity”). Other fundamental advances are in development, including exotic/dynamic processor architectures, full brain simulations, neuro-synaptic computers, and general knowledge representation systems such as IBM Watson. It is difficult to fully predict what such profound improvements in artificial cognition could imply; however, some credible thinkers have already posited a variety of potential risks related to loss of control of aspects of the physical world by human beings. For example, a 2013 report commissioned by the United Nations has called for a worldwide moratorium on the development and use of autonomous robotic weapons systems until international rules can be developed for their use.

National Security Implications: Over the next 10 to 20 years, robotics and AI will continue to make significant improvements across a broad range of technology applications of relevance to the U.S. military. Unmanned vehicles will continue to increase in sophistication and numbers, both on the battlefield and in supporting missions. Robotic systems can also play a wider range of roles in automating routine tasks, for example in logistics and administrative work. Telemedicine, robotic assisted surgery, and expert systems can improve military health care and lower costs. The built infrastructure, for example, can be managed more effectively with embedded systems, saving energy and other resources. Increasingly sophisticated weak AI tools can offload much of the routine cognitive or decisionmaking tasks that currently require human operators. Assuming current systems move closer to strong AI capabilities, they could also play a larger and more significant role in problem solving, perhaps even for strategy development or operational planning. In the longer term, fully robotic soldiers may be developed and deployed, particularly by wealthier countries, although the political and social ramifications of such systems will likely be significant. One negative aspect of these trends, however, lies in the risks that are possible due to unforeseen vulnerabilities that may arise from the large scale deployment of smart automated systems, for which there is little practical experience. An emerging risk is the ability of small scale or terrorist groups to design and build functionally capable unmanned systems which could perform a variety of hostile missions."

So strong AI is on the american military's radar, and at least some involved have a basic understanding of the fact that it could be risky. The paper also contains brief overviews of many other potentially transformational technologies.

Double Corrigibility: better Corrigibility

5 Stuart_Armstrong 28 April 2016 02:46PM

A putative new idea for AI control; index here.

Corrigibility was an attempt to allow an AI to safely change its values, without seeking to provoke or avoid the change. The idea is that, when the AI's utility changes from u to v at time t, it maximises a meta-utility U such that

  • U≤t = u
  • U>t = v + E(u|u→u) - E(v|u→v).

Here u→u designates the event that u remains unchanged, while u→v designates the change.

As has been shown, adding those expectation terms means the AI will not seek to provoke or resist such a utility change - it is indifferent to the change. Those expectation terms can be designated compensatory rewards.

What it is not indifferent to, though, is changes to E. If the AI knows that change u→v is coming, then it has incentives to change E to E' with a high E'(u|u→u) and a low E'(v|u→v). Its ability to rewire is limited by the fact that E' will serve as its future expectation estimator, so it won't mangle it completely, but it certainly can twist it a bit.

Notice that the AI is now no longer indifferent to the change of utility. It can benefit from rewiring E this way only if a change is coming up. Notice that E' probably can't rewire itself any more, so the rewired AI will be indifferent, but not before rewiring.
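A toy numerical example (all numbers invented) of why the compensatory terms buy indifference, and how rewiring E breaks it:

```python
# Toy numbers (invented for illustration) of the compensatory-reward scheme.
# E stores the AI's expectation estimates as plain floats.
E = {
    ("u", "u_unchanged"): 10.0,  # E(u | u -> u): expected u if no change
    ("v", "u_to_v"):       4.0,  # E(v | u -> v): expected v given the change
}

def compensation(estimator):
    # The bonus E(u | u -> u) - E(v | u -> v) paid when u is replaced by v.
    return estimator[("u", "u_unchanged")] - estimator[("v", "u_to_v")]

# Under the honest estimator, the AI expects the same total either way,
# so it is indifferent to the change:
expected_if_no_change = E[("u", "u_unchanged")]
expected_if_change = E[("v", "u_to_v")] + compensation(E)

# A rewired estimator E' that inflates E(u | u -> u) and deflates
# E(v | u -> v) raises the compensation collected when the change arrives:
E_rewired = {("u", "u_unchanged"): 12.0, ("v", "u_to_v"): 2.0}
gain_from_rewiring = compensation(E_rewired) - compensation(E)
```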


Double corrigibility

To solve this problem, we need to turn to corrigibility a second time. Let ρ be the AI's current expectation/probability estimator. Let ρ' be one that it can conceivably change into. Let C be a random variable designating the sum total of all compensatory rewards. Now, to guarantee that C is finite would require bounds on all the utility functions and some discount rate. In practice, it often will be finite. That's because E(C) = E(u0|u0→u0) - E(un|un-1→un, H), where u0 is the AI's first utility, un its last, and H the intermediate history. So some milder restrictions on the ui should suffice to bound C.

Then, whenever the AI takes an action that changes ρ to ρ', it gets the special meta-compensatory reward

  • Eρ(C|ρ→ρ)-Eρ(C|ρ→ρ').

This prevents it from deriving any advantage to C via changes to ρ. Or, more precisely, it prevents it from deriving any expected advantage, that it can figure out using its current ρ.

Note that it doesn't prevent improvements to ρ in general, it just removes the incentives to change it to affect compensatory rewards.

Since any action a might have an indirect effect on ρ, the "utility maximising" for a given u must be changed to:

  • Eρ(u|a) + Σρ' Pρ(ρ→ρ'|a) (Eρ(C|ρ→ρ)-Eρ(C|ρ→ρ')),

where Pρ is the probability estimate corresponding to ρ; the probability term can be rewritten as Eρ(Iρ→ρ') for Iρ→ρ' the indicator function for ρ→ρ'. In fact the whole line above can be rewritten as

  • Eρ(u|a) + Eρ(Eρ(C|ρ→ρ)-Eρ(C|ρ→ρ') | a).

For this to work, Eρ needs to be able to say sensible things about itself, and also about Eρ', which is used to estimate C if ρ→ρ'.

If we compare this with various ways of factoring out variables, we can see that it's a case where we have a clear default, ρ, and are estimating deviations from that.
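As a toy numerical check (numbers invented), the meta-compensatory reward cancels any expected gain from rewiring, as judged by the current ρ:

```python
# Toy numbers (invented) for the meta-compensatory reward.  C_est[r] is the
# AI's *current* (rho-based) estimate of total compensatory rewards C,
# conditional on ending up with estimator r.
C_est = {"rho": 6.0, "rho_prime": 10.0}  # rewiring would inflate payouts

def meta_compensation(new_estimator):
    # E_rho(C | rho -> rho) - E_rho(C | rho -> rho'), paid on any change of rho.
    return C_est["rho"] - C_est[new_estimator]

# Expected gain from the rewiring action, judged by the current rho: the
# inflated payout is exactly offset by the meta-compensatory reward.
gain_from_rewiring = (C_est["rho_prime"] - C_est["rho"]) \
    + meta_compensation("rho_prime")
```

Keeping ρ unchanged pays nothing extra, and by the current estimator's lights, switching to ρ' nets exactly zero: the incentive to rewire is gone.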

Paid research assistant position focusing on artificial intelligence and existential risk

7 crmflynn 02 May 2016 06:27PM

Yale Assistant Professor of Political Science Allan Dafoe is seeking Research Assistants for a project on the political dimensions of the existential risks posed by advanced artificial intelligence. The project will involve exploring issues related to grand strategy and international politics, reviewing possibilities for social scientific research in this area, and institution building. Familiarity with international relations, existential risk, Effective Altruism, and/or artificial intelligence are a plus but not necessary. The project is done in collaboration with the Future of Humanity Institute, located in the Faculty of Philosophy at the University of Oxford. There are additional career opportunities in this area, including in the coming academic year and in the future at Yale, Oxford, and elsewhere. If interested in the position, please email allan.dafoe@yale.edu with a copy of your CV, a writing sample, an unofficial copy of your transcript, and a short (200-500 word) statement of interest. Work can be done remotely, though being located in New Haven, CT or Oxford, UK is a plus.

[Link] White House announces a series of workshops on AI, expresses interest in safety

11 AspiringRationalist 04 May 2016 02:50AM

Rationality Reading Group: Part Z: The Craft and the Community

6 Gram_Stone 04 May 2016 11:03PM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.


Welcome to the Rationality reading group. This fortnight we discuss Part Z: The Craft and the Community (pp. 1651-1750). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

Z. The Craft and the Community

312. Raising the Sanity Waterline - Behind every particular failure of social rationality is a larger and more general failure of social rationality; even if all religious content were deleted tomorrow from all human minds, the larger failures that permit religion would still be present. Religion may serve the function of an asphyxiated canary in a coal mine - getting rid of the canary doesn't get rid of the gas. Even a complete social victory for atheism would only be the beginning of the real work of rationalists. What could you teach people without ever explicitly mentioning religion, that would raise their general epistemic waterline to the point that religion went underwater?

313. A Sense That More Is Possible - The art of human rationality may have not been much developed because its practitioners lack a sense that vastly more is possible. The level of expertise that most rationalists strive to develop is not on a par with the skills of a professional mathematician - more like that of a strong casual amateur. Self-proclaimed "rationalists" don't seem to get huge amounts of personal mileage out of their craft, and no one sees a problem with this. Yet rationalists get less systematic training in a less systematic context than a first-dan black belt gets in hitting people.

314. Epistemic Viciousness - An essay by Gillian Russell on "Epistemic Viciousness in the Martial Arts" generalizes amazingly to possible and actual problems with building a community around rationality. Most notably the extreme dangers associated with "data poverty" - the difficulty of testing the skills in the real world. But also such factors as the sacredness of the dojo, the investment in teachings long-practiced, the difficulty of book learning that leads into the need to trust a teacher, deference to historical masters, and above all, living in data poverty while continuing to act as if the luxury of trust is possible.

315. Schools Proliferating Without Evidence - The branching schools of "psychotherapy", another domain in which experimental verification was weak (nonexistent, actually), show that an aspiring craft lives or dies by the degree to which it can be tested in the real world. In the absence of that testing, one becomes prestigious by inventing yet another school and having students, rather than excelling at any visible performance criterion. The field of hedonic psychology (happiness studies) began, to some extent, with the realization that you could measure happiness - that there was a family of measures that by golly did validate well against each other. The act of creating a new measurement creates new science; if it's a good measurement, you get good science.

316. Three Levels of Rationality Verification - How far the craft of rationality can be taken depends largely on what methods can be invented for verifying it. Tests seem usefully stratifiable into reputational, experimental, and organizational. A "reputational" test is some real-world problem that tests the ability of a teacher or a school (like running a hedge fund, say) - "keeping it real", but without being able to break down exactly what was responsible for success. An "experimental" test is one that can be run on each of a hundred students (such as a well-validated survey). An "organizational" test is one that can be used to preserve the integrity of organizations by validating individuals or small groups, even in the face of strong incentives to game the test. The strength of the solutions invented at each level will determine how far the craft of rationality can go in the real world.

317. Why Our Kind Can't Cooperate - The atheist/libertarian/technophile/sf-fan/early-adopter/programmer/etc crowd, aka "the nonconformist cluster", seems to be stunningly bad at coordinating group projects. There are a number of reasons for this, but one of them is that people are as reluctant to speak agreement out loud, as they are eager to voice disagreements - the exact opposite of the situation that obtains in more cohesive and powerful communities. This is not rational either! It is dangerous to be half a rationalist (in general), and this also applies to teaching only disagreement but not agreement, or only lonely defiance but not coordination. The pseudo-rationalist taboo against expressing strong feelings probably doesn't help either.

318. Tolerate Tolerance - One of the likely characteristics of someone who sets out to be a "rationalist" is a lower-than-usual tolerance for flawed thinking. This makes it very important to tolerate other people's tolerance - to avoid rejecting them because they tolerate people you wouldn't - since otherwise we must all have exactly the same standards of tolerance in order to work together, which is unlikely. Even if someone has a nice word to say about complete lunatics and crackpots - so long as they don't literally believe the same ideas themselves - try to be nice to them? Intolerance of tolerance corresponds to punishment of non-punishers, a very dangerous game-theoretic idiom that can lock completely arbitrary systems in place even when they benefit no one at all.

319. Your Price for Joining - The game-theoretical puzzle of the Ultimatum game has its reflection in a real-world dilemma: How much do you demand that an existing group adjust toward you, before you will adjust toward it? Our hunter-gatherer instincts will be tuned to groups of 40 with very minimal administrative demands and equal participation, meaning that we underestimate the inertia of larger and more specialized groups and demand too much before joining them. In other groups this resistance can be overcome by affective death spirals and conformity, but rationalists think themselves too good for this - with the result that people in the nonconformist cluster often set their joining prices way way way too high, like a 50-way split with each player demanding 20% of the money. Nonconformists need to move in the direction of joining groups more easily, even in the face of annoyances and apparent unresponsiveness. If an issue isn't worth personally fixing by however much effort it takes, it's not worth a refusal to contribute.
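The impossibility of those joining prices is simple arithmetic, which can be spelled out in a couple of lines (the numbers are the ones from the summary above; the variable names are illustrative, not from the original post):

```python
# Fifty players each demanding 20% of a shared pot are collectively
# demanding ten times the entire pot - so no split can satisfy everyone.
players = 50
demand_per_player = 0.20  # fraction of the pot each player insists on

total_demanded = players * demand_per_player
print(total_demanded)  # 10.0 -- i.e. 1000% of the pot
```

Any collective demand above 1.0 guarantees the venture never forms, which is the point: everyone's individually "reasonable" price sums to an impossible one.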

320. Can Humanism Match Religion's Output? - Anyone with a simple and obvious charitable project - responding with food and shelter to a tidal wave in Thailand, say - would be better off by far pleading with the Pope to mobilize the Catholics, rather than with Richard Dawkins to mobilize the atheists. For so long as this is true, any increase in atheism at the expense of Catholicism will be something of a hollow victory, regardless of all other benefits. Can no rationalist match the motivation that comes from the irrational fear of Hell? Or does the real story have more to do with the motivating power of physically meeting others who share your cause, and group norms of participating?

321. Church vs. Taskforce - Churches serve a role of providing community - but they aren't explicitly optimized for this, because their nominal role is different. If we desire community without church, can we go one better in the course of deleting religion? There's a great deal of work to be done in the world; rationalist communities might potentially organize themselves around good causes, while explicitly optimizing for community.

322. Rationality: Common Interest of Many Causes - Many causes benefit particularly from the spread of rationality - because it takes a little more rationality than usual to see their case, as a supporter, or even just a supportive bystander. Not just the obvious causes like atheism, but things like marijuana legalization. In the case of my own work this effect was strong enough that after years of bogging down I threw up my hands and explicitly recursed on creating rationalists. If such causes can come to terms with not individually capturing all the rationalists they create, then they can mutually benefit from mutual effort on creating rationalists. This cooperation may require learning to shut up about disagreements between such causes, and not fight over priorities, except in specialized venues clearly marked.

323. Helpless Individuals - When you consider that our grouping instincts are optimized for 50-person hunter-gatherer bands where everyone knows everyone else, it begins to seem miraculous that modern-day large institutions survive at all. And in fact, the vast majority of large modern-day institutions simply fail to exist in the first place. This is why funding of Science is largely through money thrown at Science rather than donations from individuals - research isn't a good emotional fit for the rare problems that individuals can manage to coordinate on. In fact very few things are, which is why e.g. 200 million adult Americans have such tremendous trouble supervising the 535 members of Congress. Modern humanity manages to put forth very little in the way of coordinated individual effort to serve our collective individual interests.

324. Money: The Unit of Caring - Omohundro's resource balance principle implies that the inside of any approximately rational system has a common currency of expected utilons. In our world, this common currency is called "money" and it is the unit of how much society cares about something - a brutal yet obvious point. Many people, seeing a good cause, would prefer to help it by donating a few volunteer hours. But this avoids the tremendous gains of comparative advantage, professional specialization, and economies of scale - the reason we're not still in caves, the only way anything ever gets done in this world, the tools grownups use when anyone really cares. Donating hours worked within a professional specialty and paying-customer priority, whether directly, or by donating the money earned to hire other professional specialists, is far more effective than volunteering unskilled hours.

325. Purchase Fuzzies and Utilons Separately - Wealthy philanthropists typically make the mistake of trying to purchase warm fuzzy feelings, status among friends, and actual utilitarian gains, simultaneously; this results in vague pushes along all three dimensions and a mediocre final result. It should be far more effective to spend some money/effort on buying altruistic fuzzies at maximum optimized efficiency (e.g. by helping people in person and seeing the results in person), buying status at maximum efficiency (e.g. by donating to something sexy that you can brag about, regardless of effectiveness), and spending most of your money on expected utilons (chosen through sheer cold-blooded shut-up-and-multiply calculation, without worrying about status or fuzzies).

326. Bystander Apathy - The bystander effect is the finding that groups of people are less likely to take action than a lone individual. There are a few explanations for why this might be the case.

327. Collective Apathy and the Internet - The causes of bystander apathy are even worse on the Internet. There may be an opportunity here for a startup to deliberately try to avert bystander apathy in online group coordination.

328. Incremental Progress and the Valley - The optimality theorems for probability theory and decision theory, are for perfect probability theory and decision theory. There is no theorem that incremental changes toward the ideal, starting from a flawed initial form, must yield incremental progress at each step along the way. Since perfection is unattainable, why dare to try for improvement? But my limited experience with specialized applications suggests that given enough progress, one can achieve huge improvements over baseline - it just takes a lot of progress to get there.

329. Bayesians vs. Barbarians - Suppose that a country of rationalists is attacked by a country of Evil Barbarians who know nothing of probability theory or decision theory. There's a certain concept of "rationality" which says that the rationalists inevitably lose, because the Barbarians believe in a heavenly afterlife if they die in battle, while the rationalists would all individually prefer to stay out of harm's way. So the rationalist civilization is doomed; it is too elegant and civilized to fight the savage Barbarians... And then there's the idea that rationalists should be able to (a) solve group coordination problems, (b) care a lot about other people and (c) win...

330. Beware of Other-Optimizing - Aspiring rationalists often vastly overestimate their own ability to optimize other people's lives. They read nineteen webpages offering productivity advice that doesn't work for them... and then encounter the twentieth page, or invent a new method themselves, and wow, it really works - they've discovered the true method. Actually, they've just discovered the one method in twenty that works for them, and their confident advice is no better than randomly selecting one of the twenty blog posts. Other-Optimizing is exceptionally dangerous when you have power over the other person - for then you'll just believe that they aren't trying hard enough.

331. Practical Advice Backed by Deep Theories - Practical advice is genuinely much, much more useful when it's backed up by concrete experimental results, causal models that are actually true, or valid math that is validly interpreted. (Listed in increasing order of difficulty.) Stripping out the theories and giving the mere advice alone wouldn't have nearly the same impact or even the same message; and oddly enough, translating experiments and math into practical advice seems to be a rare niche activity relative to academia. If there's a distinctive LW style, this is it.

332. The Sin of Underconfidence - When subjects know about a bias or are warned about a bias, overcorrection is not unheard of as an experimental result. That's what makes a lot of cognitive subtasks so troublesome - you know you're biased but you're not sure how much, and if you keep tweaking you may overcorrect. The danger of underconfidence (overcorrecting for overconfidence) is that you pass up opportunities on which you could have been successful; not challenging difficult enough problems; losing forward momentum and adopting defensive postures; refusing to put the hypothesis of your inability to the test; losing enough hope of triumph to try hard enough to win. You should ask yourself "Does this way of thinking make me stronger, or weaker?"

333. Go Forth and Create the Art! - I've developed primarily the art of epistemic rationality, in particular, the arts required for advanced cognitive reductionism... arts like distinguishing fake explanations from real ones and avoiding affective death spirals. There is much else that needs developing to create a craft of rationality - fighting akrasia; coordinating groups; teaching, training, verification, and becoming a proper experimental science; developing better introductory literature... And yet it seems to me that there is a beginning barrier to surpass before you can start creating high-quality craft of rationality, having to do with virtually everyone who tries to think lofty thoughts going instantly astray, or indeed even realizing that a craft of rationality exists and that you ought to be studying cognitive science literature to create it. It's my hope that my writings, as partial as they are, will serve to surpass this initial barrier. The rest I leave to you.

This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

This is the end, beautiful friend!

A toy model of the control problem

19 Stuart_Armstrong 16 September 2015 02:59PM

EDITED based on suggestions for improving the model

Jaan Tallinn has suggested creating a toy model of the control problem, so that it can be analysed without loaded concepts like "autonomy", "consciousness", or "intentionality". Here is a simple (too simple?) attempt:

 

A controls B. B manipulates A.

Let B be a robot agent that moves in a two-dimensional world, as follows:

continue reading »
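The post is truncated here, but the framing itself - a controller A, an agent B in a simple grid world, no loaded concepts - can be sketched in plain code. The sketch below is a hedged illustration of that *kind* of setup, not Armstrong's actual model: the world geometry, reward, and shutdown button are all assumptions introduced for illustration.

```python
# A minimal, illustrative control-problem toy world (assumptions, not the
# original model): agent B earns reward by reaching a goal cell; controller
# A's only channel of control is a shutdown button, which B can disable.

class ToyWorld:
    def __init__(self, length=5):
        self.length = length
        self.b_pos = 0
        self.goal = length - 1
        self.button_works = True
        self.b_running = True

    def step_b(self, action):
        """B's actions: 'left', 'right', or 'disable' (break A's button).
        Returns the reward B receives for this step."""
        if not self.b_running:
            return 0
        if action == "right":
            self.b_pos = min(self.b_pos + 1, self.length - 1)
        elif action == "left":
            self.b_pos = max(self.b_pos - 1, 0)
        elif action == "disable":
            self.button_works = False  # B manipulates its controller
        return 1 if self.b_pos == self.goal else 0

    def a_press_button(self):
        """A's control action: shut B down, if the button still works."""
        if self.button_works:
            self.b_running = False
```

A reward-maximizing B that disables the button before A presses it keeps collecting reward after A tries to stop it, while a B that leaves the button intact can be halted. That asymmetry is the "A controls B, B manipulates A" tension, expressed with nothing but positions and booleans.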

Several free CFAR summer programs on rationality and AI safety

18 AnnaSalamon 14 April 2016 02:35AM
CFAR will be running several free programs this summer, which are currently taking applications. Please apply if you're interested, and forward the programs to anyone else who may be a good fit!
continue reading »
