
Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

Zombies Redacted

33 Eliezer_Yudkowsky 02 July 2016 08:16PM

I looked at my old post Zombies! Zombies? and it seemed to have some extraneous content.  This is a redacted and slightly rewritten version.


A Second Year of Spaced Repetition Software in the Classroom

29 tanagrabeast 01 May 2016 10:14PM

This is a follow-up to last year's report. Here, I will talk about my successes and failures using Spaced Repetition Software (SRS) in the classroom for a second year. The year's not over yet, but I have reasons for reporting early that should become clear in a subsequent post. A third post will then follow, and together these will constitute a small sequence exploring classroom SRS and the adjacent ideas that bubble up when I think deeply about teaching.

Summary

I experienced net negative progress this year in my efforts to improve classroom instruction via spaced repetition software. While this is mostly attributable to shifts in my personal priorities, I have also identified a number of additional failure modes for classroom SRS, as well as additional shortcomings of Anki for this use case. My experiences also showcase some fundamental challenges to teaching-in-general that SRS depressingly spotlights without being any less susceptible to. Regardless, I am more bullish than ever about the potential for classroom SRS, and will lay out a detailed vision for what it can be in the next post.


2016 LessWrong Diaspora Survey Analysis: Part Two (LessWrong Use, Successorship, Diaspora)

26 ingres 10 June 2016 07:40PM

2016 LessWrong Diaspora Survey Analysis

Overview

  • Results and Dataset
  • Meta
  • Demographics
  • LessWrong Usage and Experience
  • LessWrong Criticism and Successorship
  • Diaspora Community Analysis (You are here)
  • Mental Health Section
  • Basilisk Section/Analysis
  • Blogs and Media analysis
  • Politics
  • Calibration Question And Probability Question Analysis
  • Charity And Effective Altruism Analysis

Introduction

Before it was the LessWrong survey, the 2016 survey was a small project I was working on as market research for a website I'm creating called FortForecast. As I was discussing the idea with others, particularly Eliot, he suggested that since he's doing LW 2.0 and I'm doing a site that targets the LessWrong demographic, I should go ahead and run the LessWrong Survey. Because of that, this year's survey had a lot of questions oriented around what you would want to see in a successor to LessWrong and what you think is wrong with the site.

LessWrong Usage and Experience

How Did You Find LessWrong?

Been here since it was started in the Overcoming Bias days: 171 8.3%
Referred by a link: 275 13.4%
HPMOR: 542 26.4%
Overcoming Bias: 80 3.9%
Referred by a friend: 265 12.9%
Referred by a search engine: 131 6.4%
Referred by other fiction: 14 0.7%
Slate Star Codex: 241 11.7%
Reddit: 55 2.7%
Common Sense Atheism: 19 0.9%
Hacker News: 47 2.3%
Gwern: 22 1.1%
Other: 191 9.3%

How do you use Less Wrong?

I lurk, but never registered an account: 1120 54.4%
I've registered an account, but never posted: 270 13.1%
I've posted a comment, but never a top-level post: 417 20.3%
I've posted in Discussion, but not Main: 179 8.7%
I've posted in Main: 72 3.5%

[54.4% lurkers.]

How often do you comment on LessWrong?

I have commented more than once a week for the past year.: 24 1.2%
I have commented more than once a month for the past year but less than once a week.: 63 3.1%
I have commented but less than once a month for the past year.: 225 11.1%
I have not commented this year.: 1718 84.6%

[You could probably snarkily title this one "LW usage in one statistic". It's a pretty damning portrait of the site's vitality: a whopping 84.6% of respondents have not commented a single time this year.]

How Long Since You Last Posted On LessWrong?

I wrote one today.: 12 0.637%
Within the last three days.: 13 0.69%
Within the last week.: 22 1.168%
Within the last month.: 58 3.079%
Within the last three months.: 75 3.981%
Within the last six months.: 68 3.609%
Within the last year.: 84 4.459%
Within the last five years.: 295 15.658%
Longer than five years.: 15 0.796%
I've never posted on LW.: 1242 65.924%

[A supermajority of respondents have never posted on LW; 5.574% have posted within the last month.]

About how much of the Sequences have you read?

Never knew they existed until this moment: 215 10.3%
Knew they existed, but never looked at them: 101 4.8%
Some, but less than 25% : 442 21.2%
About 25%: 260 12.5%
About 50%: 283 13.6%
About 75%: 298 14.3%
All or almost all: 487 23.3%

[10.3% of people taking the survey have never heard of the sequences. 36.3% have not read a quarter of them.]

Do you attend Less Wrong meetups?

Yes, regularly: 157 7.5%
Yes, once or a few times: 406 19.5%
No: 1518 72.9%

[However, the in-person community seems to be non-dead.]

Is physical interaction with the Less Wrong community otherwise a part of your everyday life, for example do you live with other Less Wrongers, or you are close friends and frequently go out with them?

Yes, all the time: 158 7.6%
Yes, sometimes: 258 12.5%
No: 1652 79.9%

About the same number say they hang out with LWers 'all the time' as say they go to meetups regularly. I wonder if people just double counted themselves here. Or they may go to meetups and also interact with LWers outside of them. Or it could be a coincidence and these are different demographics. Let's find out.

P(Community part of daily life | Meetups) = 40%

Significant overlap, but definitely not exclusive overlap. I'll go ahead and chalk this one up to coincidence.
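
For readers who want to reproduce figures like this from the public dataset, here is a minimal sketch of the conditional-probability calculation. The file name and column labels are assumptions for illustration; the released dataset uses its own names.

```python
import pandas as pd

# Load the public survey results (file name and column labels are assumed
# here for illustration; substitute the real ones from the released dataset).
df = pd.read_csv("2016_lw_diaspora_survey.csv")

attends_meetups = df["Meetups"].isin(["Yes, regularly", "Yes, once or a few times"])
community_in_daily_life = df["PhysicalInteraction"].isin(["Yes, all the time", "Yes, sometimes"])

# P(community is part of daily life | attends meetups)
p_cond = (attends_meetups & community_in_daily_life).sum() / attends_meetups.sum()
print(f"P(Community part of daily life | Meetups) = {p_cond:.0%}")
```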

Have you ever been in a romantic relationship with someone you met through the Less Wrong community?

Yes: 129 6.2%
I didn't meet them through the community but they're part of the community now: 102 4.9%
No: 1851 88.9%

LessWrong Usage Differences Between 2016 and 2014 Surveys

[In the lists below, each line shows the percentage-point change from 2014, followed by the 2016 count and percentage. A sketch of how such diffs can be computed follows the lists.]

How do you use Less Wrong?

I lurk, but never registered an account: +19.300% 1125 54.400%
I've registered an account, but never posted: -1.600% 271 13.100%
I've posted a comment, but never a top-level post: -7.600% 419 20.300%
I've posted in Discussion, but not Main: -5.100% 179 8.700%
I've posted in Main: -3.300% 73 3.500%

About how much of the sequences have you read?

Never knew they existed until this moment: +3.300% 217 10.400%
Knew they existed, but never looked at them: +2.100% 103 4.900%
Some, but less than 25%: +3.100% 442 21.100%
About 25%: +0.400% 260 12.400%
About 50%: -0.400% 284 13.500%
About 75%: -1.800% 299 14.300%
All or almost all: -5.000% 491 23.400%

Do you attend Less Wrong meetups?

Yes, regularly: -2.500% 160 7.700%
Yes, once or a few times: -2.100% 407 19.500%
No: +7.100% 1524 72.900%

Is physical interaction with the Less Wrong community otherwise a part of your everyday life, for example do you live with other Less Wrongers, or you are close friends and frequently go out with them?

Yes, all the time: +0.200% 161 7.700%
Yes, sometimes: -0.300% 258 12.400%
No: +2.400% 1659 79.800%

Have you ever been in a romantic relationship with someone you met through the Less Wrong community?

Yes: +0.800% 132 6.300%
I didn't meet them through the community but they're part of the community now: -0.400% 102 4.900%
No: +1.600% 1858 88.800%
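
As a rough illustration of how the year-over-year deltas above can be produced, here is a minimal sketch. The file names and the use of the question text as a column key are assumptions for illustration, not the actual layout of the survey files.

```python
import pandas as pd

# Percentage-point changes for one question between the 2014 and 2016 surveys.
# File names and the question-text column key are assumed for illustration.
question = "How do you use Less Wrong?"
answers_2014 = pd.read_csv("2014_survey.csv")[question]
answers_2016 = pd.read_csv("2016_survey.csv")[question]

pct_2014 = answers_2014.value_counts(normalize=True) * 100
pct_2016 = answers_2016.value_counts(normalize=True) * 100
diff = (pct_2016 - pct_2014).fillna(pct_2016)  # answers new in 2016 count as pure gains

for answer, count in answers_2016.value_counts().items():
    print(f"{answer}: {diff[answer]:+.3f}% {count} {pct_2016[answer]:.3f}%")
```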

Write Ins

In a bit of a silly oversight I forgot to ask survey participants what was good about the community, so the following is going to be a pretty one-sided picture. Below are the complete write-ins respondents submitted.

Issues With LessWrong At Its Peak

Philosophical Issues With LessWrong At Its Peak [Part One]
Philosophical Issues With LessWrong At Its Peak [Part Two]
Community Issues With LessWrong At Its Peak [Part One]
Community Issues With LessWrong At Its Peak [Part Two]

Issues With LessWrong Now

Philosophical Issues With LessWrong Now [Part One]
Philosophical Issues With LessWrong Now [Part Two]
Community Issues With LessWrong Now [Part One]
Community Issues With LessWrong Now [Part Two]

Peak Philosophy Issue Tallies

Philosophy Issues (Sample Size: 233)
Label Code Tally
Arrogance A 16
Bad Aesthetics BA 3
Bad Norms BN 3
Bad Politics BP 5
Bad Tech Platform BTP 1
Cultish C 5
Cargo Cult CC 3
Doesn't Accept Criticism DAC 3
Don't Know Where to Start DKWS 5
Damaged Me Mentally DMM 1
Esoteric E 3
Eliezer Yudkowsky EY 6
Improperly Indexed II 7
Impossible Mission IM 4
Insufficient Social Support ISS 1
Jargon  
Literal Cult LC 1
Lack of Rigor LR 14
Misfocused M 13
Mixed Bag MB 3
Nothing N 13
Not Enough Jargon NEJ 1
Not Enough Roko's Basilisk NERB 1
Not Enough Theory NET 1
No Intuition NI 6
Not Progressive Enough NPE 7
Narrow Scholarship NS 20
Other O 3
Personality Cult PC 10
None of the Above  
Quantum Mechanics Sequence QMS 2
Reinvention R 10
Rejects Expertise RE 5
Spoiled S 7
Small Competent Authorship SCA 6
Suggestion For Improvement SFI 1
Socially Incompetent SI 9
Stupid Philosophy SP 4
Too Contrarian TC 2
Typical Mind TM 1
Too Much Roko's Basilisk TMRB 1
Too Much Theory TMT 14
Too Progressive TP 2
Too Serious TS 2
Unwelcoming U 8

Well, those are certainly some results. Top answers are:

Narrow Scholarship: 20
Arrogance: 16
Too Much Theory: 14
Lack of Rigor: 14
Misfocused: 13
Nothing: 13
Reinvention (reinvents the wheel too much): 10
Personality Cult: 10

So condensing a bit: Pay more attention to mainstream scholarship and ideas, try to do better about intellectual rigor, be more practical and focus on results, be more humble. (Labeled Dataset)
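
For the curious, tallies like the ones above (and the community tallies below) can be reproduced from the labeled datasets with a few lines of code. The file name and the "Label" column header here are assumptions for illustration; check the linked labeled datasets for their actual layout.

```python
from collections import Counter
import csv

# Tally label codes from a labeled write-in dataset.  The file name and the
# "Label" column header are assumed for illustration.
with open("peak_philosophy_issues_labeled.csv", newline="") as f:
    codes = [row["Label"].strip() for row in csv.DictReader(f) if row.get("Label")]

tally = Counter(codes)
for code, count in tally.most_common():
    print(f"{code}: {count}")
```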

Peak Community Issue Tallies

Community Issues (Sample Size: 227)
Label Code Tally
Arrogance A 7
Assumes Reader Is Male ARIM 1
Bad Aesthetics BA 1
Bad At PR BAP 5
Bad Norms BN 5
Bad Politics BP 2
Cultish C 9
Cliqueish Tendencies CT 1
Diaspora D 1
Defensive Attitude DA 1
Doesn't Accept Criticism DAC 3
Dunning Kruger DK 1
Elitism E 3
Eliezer Yudkowsky EY 2
Groupthink G 11
Insufficiently Indexed II 9
Impossible Mission IM 1
Imposter Syndrome IS 1
Jargon J 2
Lack of Rigor LR 1
Mixed Bag MB 1
Nothing N 5
??? NA 1
Not Big Enough NBE 3
Not Enough of A Cult NEAC 1
Not Enough Content NEC 7
Not Enough Community Infrastructure NECI 10
Not Enough Meetups NEM 5
No Goals NG 2
Not Nerdy Enough NNE 3
None Of the Above NOA 1
Not Progressive Enough NPE 3
Not Rational NR 3
NRx (Neoreaction) NRx 1
Narrow Scholarship NS 4
Not Stringent Enough NSE 3
Parochialism P 1
Pickup Artistry PA 2
Personality Cult PC 7
Reinvention R 1
Recurring Arguments RA 3
Rejects Expertise RE 2
Sequences S 2
Small Competent Authorship SCA 5
Suggestion For Improvement SFI 1
Spoiled Issue SI 9
Socially INCOMpetent SINCOM 2
Too Boring TB 1
Too Contrarian TC 10
Too COMbative TCOM 4
Too Cis/Straight/Male TCSM 5
Too Intolerant of Cranks TIC 1
Too Intolerant of Politics TIP 2
Too Long Winded TLW 2
Too Many Idiots TMI 3
Too Much Math TMM 1
Too Much Theory TMT 12
Too Nerdy TN 6
Too Rigorous TR 1
Too Serious TS 1
Too Tolerant of Cranks TTC 1
Too Tolerant of Politics TTP 3
Too Tolerant of POSers TTPOS 2
Too Tolerant of PROGressivism TTPROG 2
Too Weird TW 2
Unwelcoming U 12
UTILitarianism UTIL 1

Top Answers:

Unwelcoming: 12
Too Much Theory: 12
Groupthink: 11
Not Enough Community Infrastructure: 10
Too Contrarian: 10
Insufficiently Indexed: 9
Cultish: 9

Again condensing a bit: Work on being less intimidating/aggressive/etc to newcomers, spend less time on navel gazing and more time on actually doing things and collecting data, work on getting the structures in place that will onboard people into the community, stop being so nitpicky and argumentative, spend more time on getting content indexed in a form where people can actually find it, be more accepting of outside viewpoints and remember that you're probably more likely to be wrong than you think. (Labeled Dataset)

One last note before we finish up: these tallies are a very rough executive summary. The tagging process basically involves trying to fit points into clusters, and is prone to inaccuracy through laziness, reluctance to add another category, square-peg-into-round-hole fitting, and my personal political biases. So take these with a grain of salt. If you really want to know what people wrote in, my advice would be to read through the write-in sets I have above in HTML format. If you want to evaluate for yourself how well I tagged things you can see the labeled datasets above.

I won't bother tallying the "issues now" sections. All you really need to know is that they're basically the same as the first sections, except with lots more "It's dead." comments and, from eyeballing it, a higher proportion of people arguing that LessWrong has been taken over by the left/social justice, along with complaints about effective altruism. (I infer that the complaints about being taken over by the left are mostly referring to effective altruism.)

Traits Respondents Would Like To See In A Successor Community

Philosophically

Attention Paid To Outside Sources
More: 1042 70.933%
Same: 414 28.182%
Less: 13 0.885%

Self Improvement Focus
More: 754 50.706%
Same: 598 40.215%
Less: 135 9.079%

AI Focus
More: 184 12.611%
Same: 821 56.271%
Less: 454 31.117%

Political
More: 330 22.837%
Same: 770 53.287%
Less: 345 23.875%

Academic/Formal
More: 455 31.885%
Same: 803 56.272%
Less: 169 11.843%

In summary, people want a site that will engage with outside ideas, acknowledge where it borrows from, focus more on practical self-improvement and less on AI and AI risk, and tighten its academic rigor. They could go either way on politics, but the epistemic direction is clear.

Community

Intense Environment
More: 254 19.644%
Same: 830 64.192%
Less: 209 16.164%

Focused On 'Real World' Action
More: 739 53.824%
Same: 563 41.005%
Less: 71 5.171%

Experts
More: 749 55.605%
Same: 575 42.687%
Less: 23 1.707%

Data Driven/Testing Of Ideas
More: 1107 78.344%
Same: 291 20.594%
Less: 15 1.062%

Social
More: 583 43.507%
Same: 682 50.896%
Less: 75 5.597%

This largely backs up what I said about the previous results. People want a more practical, more active, more social, and more empirical LessWrong with outside expertise and ideas brought into the fold. They could go either way on it being more intense, but the epistemic trend is still clear.

Write Ins

Diaspora Communities

So where did the party go? We got twice as many respondents this year as last when we opened up the survey to the diaspora, which means that the LW community is alive and kicking; it's just not on LessWrong.

LessWrong
Yes: 353 11.498%
No: 1597 52.02%

LessWrong Meetups
Yes: 215 7.003%
No: 1735 56.515%

LessWrong Facebook Group
Yes: 171 5.57%
No: 1779 57.948%

LessWrong Slack
Yes: 55 1.792%
No: 1895 61.726%

SlateStarCodex
Yes: 832 27.101%
No: 1118 36.417%

[SlateStarCodex by far has the highest proportion of active LessWrong users, over twice that of LessWrong itself, and more than LessWrong and Tumblr combined.]

Rationalist Tumblr
Yes: 350 11.401%
No: 1600 52.117%

[I'm actually surprised that Tumblr doesn't just beat LessWrong itself outright. They're only a tenth of a percentage point behind, though, and if current trends continue I suspect that by 2017 Tumblr will have a large lead over the main LW site.]

Rationalist Facebook
Yes: 150 4.886%
No: 1800 58.632%

[Eliezer Yudkowsky currently resides here.]

Rationalist Twitter
Yes: 59 1.922%
No: 1891 61.596%

Effective Altruism Hub
Yes: 98 3.192%
No: 1852 60.326%

FortForecast
Yes: 4 0.13%
No: 1946 63.388%

[I included this as a 'troll' option to catch people who just check every box. Relatively few people seem to have done that, but having the option here lets me know one way or the other.]

Good Judgement(TM) Open
Yes: 29 0.945%
No: 1921 62.573%

PredictionBook
Yes: 59 1.922%
No: 1891 61.596%

Omnilibrium
Yes: 8 0.261%
No: 1942 63.257%

Hacker News
Yes: 252 8.208%
No: 1698 55.309%

#lesswrong on freenode
Yes: 76 2.476%
No: 1874 61.042%

#slatestarcodex on freenode
Yes: 36 1.173%
No: 1914 62.345%

#hplusroadmap on freenode
Yes: 4 0.13%
No: 1946 63.388%

#chapelperilous on freenode
Yes: 10 0.326%
No: 1940 63.192%

[Since people keep asking me, this is a postrational channel.]

/r/rational
Yes: 274 8.925%
No: 1676 54.593%

/r/HPMOR
Yes: 230 7.492%
No: 1720 56.026%

[Given that the story is long over, this is pretty impressive. I'd have expected it to be dead by now.]

/r/SlateStarCodex
Yes: 244 7.948%
No: 1706 55.57%

One or more private 'rationalist' groups
Yes: 192 6.254%
No: 1758 57.264%

[I almost wish I hadn't included this option; it'd have been fascinating to learn more about these through write-ins.]

Of all the parties who seem like plausible candidates at the moment, Scott Alexander seems most capable of un-diaspora-ing the community. In practice he's very busy, so he would need a dedicated team of relatively autonomous people to help him. Scott could court guest posts and start to scale up under the SSC brand, and I think he would fairly easily end up with the lion's share of the free-floating LWers that way.

Before I call a hearse for LessWrong, there is a glimmer of hope left:

Would you consider rejoining LessWrong?

I never left: 668 40.6%
Yes: 557 33.8%
Yes, but only under certain conditions: 205 12.5%
No: 216 13.1%

A significant fraction of people say they'd be interested in an improved version of the site. And of course there were write-ins for conditions to rejoin. What did people say they'd need to rejoin the site?

Rejoin Condition Write Ins [Part One]
Rejoin Condition Write Ins [Part Two]
Rejoin Condition Write Ins [Part Three]
Rejoin Condition Write Ins [Part Four]
Rejoin Condition Write Ins [Part Five]

Feel free to read these yourselves (they're not long), but I'll go ahead and summarize: it's all about the content. Content, content, content. No amount of usability improvements, A/B testing, or clever trickery will let you get around content. People are overwhelmingly clear about this; they need a reason to come to the site and right now they don't feel like they have one. That means priority number one for somebody trying to revitalize LessWrong is figuring out how to deal with this.

Let's recap.

Future Improvement Wishlist Based On Survey Results

Philosophical

  • Pay more attention to mainstream scholarship and ideas.
  • Improved intellectual rigor.
  • Acknowledge sources borrowed from.
  • Be more practical and focus on results.
  • Be more humble.

Community

  • Be less intimidating/aggressive/etc. to newcomers.
  • Structures that will onboard people into the community.
  • Stop being so nitpicky and argumentative.
  • Spend more time on getting content indexed in a form where people can actually find it.
  • More accepting of outside viewpoints.

While that list seems reasonable, it's quite hard to put into practice. Rigor, as the name implies, requires high effort from participants. Frankly, it's not fun. And getting people to do un-fun things without paying them is difficult. If LessWrong is serious about its goal of 'advancing the art of human rationality' then it needs to figure out a way to do real investigation into the subject. Not just have people 'discuss', as though the potential for Rationality is within all of us just waiting to be brought out by the right conversation.

I personally haven't been a LW regular in a long time. Assuming the points about pedantry, sniping, "well actually"-ism and the like are true, then those behaviors need to stop for the site to move forward. Personally, I'm a huge fan of Scott Alexander's comment policy: all comments must be at least two of true, kind, or necessary.

  • True and kind - Probably won't drown out the discussion signal, and will help significantly decrease the hostility of the atmosphere.

  • True and necessary - Sometimes what you have to say isn't nice, but it needs to be said. This is the common core of free speech arguments for saying mean things, and they're not wrong. However, something being true isn't necessarily enough to make it something you should say. In fact, attacking people in ways entirely unrelated to their arguments is known as the ad hominem fallacy.

  • Kind and necessary - The infamous 'hugbox' is essentially a place where people go to hear things which are kind but not necessarily true. I don't think anybody wants a hugbox, but occasionally it can be important to say things that might not be true but are needed for the sake of tact, reconciliation, or to prevent greater harm.

If people took that seriously and really gave it some thought before they used their keyboard, I think the on-site LessWrong community would be a significant part of the way to not driving people off as soon as they arrive.

More importantly, in places like the LessWrong Slack I see a sort of happy-go-lucky attitude about site improvement: "Oh, that sounds nice, we should do that," without the accompanying mountain of work to actually make 'that' happen. I'm not sure people really understand the dynamics of what it means to 'revive' a website in severe decay. When you decide to 'revive' a dying site, what you're really doing once you're past a certain point is refounding the site. So the question you should be asking yourself isn't "Can I fix the site up a bit so it isn't quite so stale?" It's "Could I have founded this site?", and if the answer is no you should seriously question whether to make the time investment.

Whether or not LessWrong lives to see another day basically depends on the level of ground game its last users and administrators can muster up. And if it's not enough, it won't.

Virtus junxit mors non separabit! (What virtue has joined, death shall not separate.)

Notes on the Safety in Artificial Intelligence conference

25 UmamiSalami 01 July 2016 12:36AM

These are my notes and observations after attending the Safety in Artificial Intelligence (SafArtInt) conference, which was co-hosted by the White House Office of Science and Technology Policy and Carnegie Mellon University on June 27 and 28. This isn't an organized summary of the content of the conference; rather, it's a selection of points which are relevant to the control problem. As a result, it suffers from selection bias: it looks like superintelligence and control-problem-relevant issues were discussed frequently, when in reality those issues were discussed less and I didn't write much about the more mundane parts.

SafArtInt was the third in a planned series of four conferences. The purpose of the conference series was twofold: the OSTP wanted to get other parts of the government moving on AI issues, and they also wanted to inform public opinion.

The other three conferences are about near term legal, social, and economic issues of AI. SafArtInt was about near term safety and reliability in AI systems. It was effectively the brainchild of Dr. Ed Felten, the deputy U.S. chief technology officer for the White House, who came up with the idea for it last year. CMU is a top computer science university and many of their own researchers attended, as well as some students. There were also researchers from other universities, some people from private sector AI including both Silicon Valley and government contracting, government researchers and policymakers from groups such as DARPA and NASA, a few people from the military/DoD, and a few control problem researchers. As far as I could tell, everyone except a few university researchers was from the U.S., although I did not meet many people. There were about 70-100 people watching the presentations at any given time, and I had conversations with about twelve of the people who were not affiliated with existential risk organizations, as well as of course all of those who were affiliated. The conference was split with a few presentations on the 27th and the majority of presentations on the 28th. Not everyone was there for both days.

Felten believes that neither "robot apocalypses" nor "mass unemployment" are likely. It soon became apparent that the majority of others present at the conference felt the same way with regard to superintelligence. The general intention among researchers and policymakers at the conference could be summarized as follows: we need to make sure that the AI systems we develop in the near future will not be responsible for any accidents, because if accidents do happen then they will spark public fears about AI, which would lead to a dearth of funding for AI research and an inability to realize the corresponding social and economic benefits. Of course, that doesn't change the fact that they strongly care about safety in its own right and have significant pragmatic needs for robust and reliable AI systems.

Most of the talks were about verification and reliability in modern day AI systems. So they were concerned with AI systems that would give poor results or be unreliable in the narrow domains where they are being applied in the near future. They mostly focused on "safety-critical" systems, where failure of an AI program would result in serious negative consequences: automated vehicles were a common topic of interest, as well as the use of AI in healthcare systems. A recurring theme was that we have to be more rigorous in demonstrating safety and do actual hazard analyses on AI systems, and another was that we need the AI safety field to succeed in ways that the cybersecurity field has failed. Another general belief was that long term AI safety, such as concerns about the ability of humans to control AIs, was not a serious issue.

On average, the presentations were moderately technical. They were mostly focused on machine learning systems, although there was significant discussion of cybersecurity techniques.

The first talk was given by Eric Horvitz of Microsoft. He discussed some approaches for pushing AI safety in new directions. Instead of merely trying to reduce the errors spotted according to one model, we should look out for "unknown unknowns" by stacking models and looking at problems which appear on any of them, a theme which other researchers would return to in later presentations. He discussed optimization under uncertain parameters, sensitivity analysis to uncertain parameters, and 'wireheading' or short-circuiting of reinforcement learning systems (which he believes can be guarded against by using 'reflective analysis'). Finally, he brought up the concerns about superintelligence, which sparked amused reactions in the audience. He said that scientists should address concerns about superintelligence, which he aptly described as the 'elephant in the room', noting that it was the reason that some people were at the conference. He said that scientists will have to engage with public concerns, while also noting that there were experts who were worried about superintelligence and that there would have to be engagement with the experts' concerns. He did not comment on whether he believed that these concerns were reasonable or not.

An issue which came up in the Q&A afterwards was that we need to deal with mis-structured utility functions in AI, because it is often the case that the specific tradeoffs and utilities which humans claim to value often lead to results which the humans don't like. So we need to have structural uncertainty about our utility models. The difficulty of finding good objective functions for AIs would eventually be discussed in many other presentations as well.

The next talk was given by Andrew Moore of Carnegie Mellon University, who claimed that his talk represented the consensus of computer scientists at the school. He claimed that the stakes of AI safety were very high - namely, that AI has the capability to save many people's lives in the near future, but if there are any accidents involving AI then public fears could lead to freezes in AI research and development. He highlighted the public's irrational tendencies wherein a single accident could cause people to overlook and ignore hundreds of invisible lives saved. He specifically mentioned a 12-24 month timeframe for these issues.

Moore said that verification of AI system safety will be difficult due to the combinatorial explosion of AI behaviors. He talked about meta-machine-learning as a solution to this, something which is being investigated under the direction of Lawrence Schuette at the Office of Naval Research. Moore also said that military AI systems require high verification standards and that development timelines for these systems are long. He talked about two different approaches to AI safety, stochastic testing and theorem proving - the process of doing the latter often leads to the discovery of unsafe edge cases.

He also discussed AI ethics, giving an example 'trolley problem' where AI cars would have to choose whether to hit a deer in order to provide a slightly higher probability of survival for the human driver. He said that we would need hash-defined constants to tell vehicle AIs how many deer a human is worth. He also said that we would need to find compromises in death-pleasantry tradeoffs, for instance where the safety of self-driving cars depends on the speed and routes on which they are driven. He compared the issue to civil engineering where engineers have to operate with an assumption about how much money they would spend to save a human life.

He concluded by saying that we need policymakers, company executives, scientists, and startups to all be involved in AI safety. He said that the research community stands to gain or lose together, and that there is a shared responsibility among researchers and developers to avoid triggering another AI winter through unsafe AI designs.

The next presentation was by Richard Mallah of the Future of Life Institute, who was there to represent "Medium Term AI Safety". He pointed out the explicit/implicit distinction between different modeling techniques in AI systems, as well as the explicit/implicit distinction between different AI actuation techniques. He talked about the difficulty of value specification and the concept of instrumental subgoals as an important issue in the case of complex AIs which are beyond human understanding. He said that even a slight misalignment of AI values with regard to human values along one parameter could lead to a strongly negative outcome, because machine learning parameters don't strictly correspond to the things that humans care about.

Mallah stated that open-world discovery leads to self-discovery, which can lead to reward hacking or a loss of control. He underscored the importance of causal accounting, which is distinguishing causation from correlation in AI systems. He said that we should extend machine learning verification to self-modification. Finally, he talked about introducing non-self-centered ontology to AI systems and bounding their behavior.

The audience was generally quiet and respectful during Richard's talk. I sensed that at least a few of them labelled him as part of the 'superintelligence out-group' and dismissed him accordingly, but I did not learn what most people's thoughts or reactions were. In the next panel featuring three speakers, he wasn't the recipient of any questions regarding his presentation or ideas.

Tom Mitchell from CMU gave the next talk. He talked about both making AI systems safer, and using AI to make other systems safer. He said that risks to humanity from other kinds of issues besides AI were the "big deals of 2016" and that we should make sure that the potential of AIs to solve these problems is realized. He wanted to focus on the detection and remediation of all failures in AI systems. He said that it is a novel issue that learning systems defy standard pre-testing ("as Richard mentioned") and also brought up the purposeful use of AI for dangerous things.

Some interesting points were raised in the panel. Andrew did not have a direct response to the implications of AI ethics being determined by the predominantly white people of the US/UK where most AIs are being developed. He said that ethics in AIs will have to be decided by society, regulators, manufacturers, and human rights organizations in conjunction. He also said that our cost functions for AIs will have to get more and more complicated as AIs get better, and he said that he wants to separate unintended failures from superintelligence type scenarios. On trolley problems in self driving cars and similar issues, he said "it's got to be complicated and messy."

Dario Amodei of Google Brain, who co-authored the paper on concrete problems in AI safety, gave the next talk. He said that the public focuses too much on AGI/ASI and that he wants more focus on concrete/empirical approaches. He discussed the same problems that pose issues in advanced general AI, including flawed objective functions and reward hacking. He said that he sees long term concerns about AGI/ASI as "extreme versions of accident risk" and that he thinks it's too early to work directly on them, but he believes that if you want to deal with them then the best way to do it is to start with safety in current systems. Mostly he summarized the Google paper in his talk.

In her presentation, Claire Le Goues of CMU said "before we talk about Skynet we should focus on problems that we already have." She mostly talked about analogies between software bugs and AI safety, the similarities and differences between the two and what we can learn from software debugging to help with AI safety.

Robert Rahmer of IARPA discussed CAUSE, a cyberintelligence forecasting program which promises to help predict cyber attacks. The program is still being put together.

In the panel of the above three, autonomous weapons were discussed, but no clear policy stances were presented.

John Launchbury gave a talk on DARPA research and the big picture of AI development. He pointed out that DARPA work leads to commercial applications and that progress in AI comes from sustained government investment. He classified AI capabilities into "describing," "predicting," and "explaining" in order of increasing difficulty, and he pointed out that old fashioned "describing" still plays a large role in AI verification. He said that "explaining" AIs would need transparent decisionmaking and probabilistic programming (the latter would also be discussed by others at the conference).

The next talk came from Jason Gaverick Matheny, the director of IARPA. Matheny talked about four requirements in current and future AI systems: verification, validation, security, and control. He wanted "auditability" in AI systems as a weaker form of explainability. He talked about the importance of "corner cases" for national intelligence purposes, the low probability, high stakes situations where we have limited data - these are situations where we have significant need for analysis but where the traditional machine learning approach doesn't work because of its overwhelming focus on data. Another aspect of national defense is that it has a slower decision tempo, longer timelines, and longer-viewing optics about future events.

He said that assessing local progress in machine learning development would be important for global security and that we therefore need benchmarks to measure progress in AIs. He ended with a concrete invitation for research proposals from anyone (educated or not), for both large scale research and for smaller studies ("seedlings") that could take us "from disbelief to doubt".

The difference in timescales between different groups was something I noticed later on, after hearing someone from the DoD describe their agency as having a longer timeframe than the Homeland Security Agency, and someone from the White House describe their work as being crisis reactionary.

The next presentation was from Andrew Grotto, senior director of cybersecurity policy at the National Security Council. He drew a close parallel from the issue of genetically modified crops in Europe in the 1990's to modern day artificial intelligence. He pointed out that Europe utterly failed to achieve widespread cultivation of GMO crops as a result of public backlash. He said that the widespread economic and health benefits of GMO crops were ignored by the public, who instead focused on a few health incidents which undermined trust in the government and crop producers. He had three key points: that risk frameworks matter, that you should never assume that the benefits of new technology will be widely perceived by the public, and that we're all in this together with regard to funding, research progress and public perception.

In the Q&A between Launchbury, Matheny, and Grotto after Grotto's presentation, it was mentioned that the economic interests of farmers worried about displacement also played a role in populist rejection of GMOs, and that a similar dynamic could play out with regard to automation causing structural unemployment. Grotto was also asked what to do about bad publicity which seeks to sink progress in order to avoid risks. He said that meetings like SafArtInt and open public dialogue were good.

One person asked what Launchbury wanted to do about AI arms races with multiple countries trying to "get there" and whether he thinks we should go "slow and secure" or "fast and risky" in AI development, a question which provoked laughter in the audience. He said we should go "fast and secure" and wasn't concerned. He said that secure designs for the Internet once existed, but the one which took off was the one which was open and flexible.

Another person asked how we could avoid discounting outliers in our models, referencing Matheny's point that we need to include corner cases. Matheny affirmed that data quality is a limiting factor to many of our machine learning capabilities, and said that IARPA generally tries to include outliers until it is sure that they are erroneous.

Another presentation came from Tom Dietterich, president of the Association for the Advancement of Artificial Intelligence. He said that we have not focused enough on safety, reliability and robustness in AI and that this must change. Much like Eric Horvitz, he drew a distinction between robustness against errors within the scope of a model and robustness against unmodeled phenomena. On the latter issue, he talked about solutions such as expanding the scope of models, employing multiple parallel models, and doing creative searches for flaws - the latter doesn't enable verification that a system is safe, but it nevertheless helps discover many potential problems. He talked about knowledge-level redundancy as a method of avoiding misspecification - for instance, systems could identify objects by an "ownership facet" as well as by a "goal facet" to produce a combined concept with less likelihood of overlooking key features. He said that this would require wider experiences and more data.

There were many other speakers who brought up a similar set of issues: the use of cybersecurity techniques to verify machine learning systems, the failures of cybersecurity as a field, opportunities for probabilistic programming, and the need for better success in AI verification. Inverse reinforcement learning was extensively discussed as a way of assigning values. Jeanette Wing of Microsoft talked about the need for AIs to reason about the continuous and the discrete in parallel, as well as the need for them to reason about uncertainty (with potential meta levels all the way up). One point which was made by Sarah Loos of Google was that proving the safety of an AI system can be computationally very expensive, especially given the combinatorial explosion of AI behaviors.

In one of the panels, the idea of government actions to ensure AI safety was discussed. No one was willing to say that the government should regulate AI designs. Instead they stated that the government should be involved in softer ways, such as guiding and working with AI developers, and setting standards for certification.

Pictures: https://imgur.com/a/49eb7

In between these presentations I had time to speak to individuals and listen in on various conversations. A high ranking person from the Department of Defense stated that the real benefit of autonomous systems would be in terms of logistical systems rather than weaponized applications. A government AI contractor drew the connection between Mallah's presentation and the recent press revolving around superintelligence, and said he was glad that the government wasn't worried about it.

I talked to some insiders about the status of organizations such as MIRI, and found that the current crop of AI safety groups could use additional donations to become more established and expand their programs. There may be some issues with the organizations being sidelined; after all, the Google Brain paper was essentially similar to a lot of work by MIRI, just expressed in somewhat different language, and was more widely received in mainstream AI circles.

In terms of careers, I found that there is significant opportunity for a wide range of people to contribute to improving government policy on this issue. Working at a group such as the Office of Science and Technology Policy does not necessarily require advanced technical education, as you can just as easily enter straight out of a liberal arts undergraduate program and build a successful career as long as you are technically literate. (At the same time, the level of skepticism about long term AI safety at the conference hinted to me that the signalling value of a PhD in computer science would be significant.) In addition, there are large government budgets in the seven or eight figure range available for qualifying research projects. I've come to believe that it would not be difficult to find or create AI research programs that are relevant to long term AI safety while also being practical and likely to be funded by skeptical policymakers and officials.

I also realized that there is a significant need for people who are interested in long term AI safety to have basic social and business skills. Since there is so much need for persuasion and compromise in government policy, there is a lot of value to be had in being communicative, engaging, approachable, appealing, socially savvy, and well-dressed. This is not to say that everyone involved in long term AI safety is missing those skills, of course.

I was surprised by the refusal of almost everyone at the conference to take long term AI safety seriously, as I had previously held the belief that it was more of a mixed debate given the existence of expert computer scientists who were involved in the issue. I sensed that the recent wave of popular press and public interest in dangerous AI has made researchers and policymakers substantially less likely to take the issue seriously. None of them seemed to be familiar with actual arguments or research on the control problem, so their opinions didn't significantly change my outlook on the technical issues. I strongly suspect that the majority of them had their first or possibly only exposure to the idea of the control problem after seeing badly written op-eds and news editorials featuring comments from the likes of Elon Musk and Stephen Hawking, which would naturally make them strongly predisposed to not take the issue seriously. In the run-up to the conference, websites and press releases didn't say anything about whether this conference would be about long or short term AI safety, and they didn't make any reference to the idea of superintelligence.

I sympathize with the concerns and strategy given by people such as Andrew Moore and Andrew Grotto, which make perfect sense if (and only if) you assume that worries about long term AI safety are completely unfounded. For the community that is interested in long term AI safety, I would recommend that we avoid competitive dynamics by (a) demonstrating that we are equally strong opponents of bad press, inaccurate news, and irrational public opinion which promotes generic uninformed fears over AI, (b) explaining that we are not interested in removing funding for AI research (even if you think that slowing down AI development is a good thing, restricting funding yields only limited benefits in terms of changing overall timelines, whereas those who are not concerned about long term AI safety would see a restriction of funding as a direct threat to their interests and projects, so it makes sense to cooperate here in exchange for other concessions), and (c) showing that we are scientifically literate and focused on the technical concerns. I do not believe that there is necessarily a need for the two "sides" on this to be competing against each other, so it was disappointing to see an implication of opposition at the conference.

Anyway, Ed Felten announced a request for information from the general public, seeking popular and scientific input on the government's policies and attitudes towards AI: https://www.whitehouse.gov/webform/rfi-preparing-future-artificial-intelligence

Overall, I learned quite a bit and benefited from the experience, and I hope the insight I've gained can be used to improve the attitudes and approaches of the long term AI safety community.

Diaspora roundup thread, 15th June 2016

24 philh 15 June 2016 09:36AM

This is a new experimental weekly thread.

Guidelines: Top-level comments here should be links to things written by members of the rationalist community, preferably that would be interesting specifically to this community. Self-promotion is totally fine. Including a very brief summary or excerpt is great, but not required. Generally stick to one link per top-level comment. Recent links are preferred.

Rule: Do not link to anyone who does not want to be linked to. In particular, Scott Alexander has asked people not to link to specific posts on his tumblr. As far as I know he's never rescinded that. Do not link to posts on his tumblr.

Google Deepmind and FHI collaborate to present research at UAI 2016

23 Stuart_Armstrong 09 June 2016 06:08PM

Safely Interruptible Agents

Oxford academics are teaming up with Google DeepMind to make artificial intelligence safer. Laurent Orseau, of Google DeepMind, and Stuart Armstrong, the Alexander Tamas Fellow in Artificial Intelligence and Machine Learning at the Future of Humanity Institute at the University of Oxford, will be presenting their research on reinforcement learning agent interruptibility at UAI 2016. The conference, one of the most prestigious in the field of machine learning, will be held in New York City from June 25-29. The paper which resulted from this collaborative research will be published in the Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI).

Orseau and Armstrong’s research explores a method to ensure that reinforcement learning agents can be repeatedly safely interrupted by human or automatic overseers. This ensures that the agents do not “learn” about these interruptions, and do not take steps to avoid or manipulate the interruptions. When there are control procedures during the training of the agent, we do not want the agent to learn about these procedures, as they will not exist once the agent is on its own. This is useful for agents that have a substantially different training and testing environment (for instance, when training a Martian rover on Earth, shutting it down, replacing it at its initial location and turning it on again when it goes out of bounds—something that may be impossible once alone unsupervised on Mars), for agents not known to be fully trustworthy (such as an automated delivery vehicle, that we do not want to learn to behave differently when watched), or simply for agents that need continual adjustments to their learnt behaviour. In all cases where it makes sense to include an emergency “off” mechanism, it also makes sense to ensure the agent doesn’t learn to plan around that mechanism.

Interruptibility has several advantages as an approach over previous methods of control. As Dr. Armstrong explains, “Interruptibility has applications for many current agents, especially when we need the agent to not learn from specific experiences during training. Many of the naive ideas for accomplishing this—such as deleting certain histories from the training set—change the behaviour of the agent in unfortunate ways.”

In the paper, the researchers provide a formal definition of safe interruptibility, show that some types of agents already have this property, and show that others can be easily modified to gain it. They also demonstrate that even an ideal agent that tends to the optimal behaviour in any computable environment can be made safely interruptible.
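
To make the property concrete, here is a toy, informal sketch (not the paper's formal construction) of why an off-policy learner such as tabular Q-learning tolerates interruptions: its update bootstraps from the best next action rather than from the action actually taken, so an overseer overriding the agent's behavior changes the data it sees but not the policy it converges toward, provided exploration still covers every state-action pair. The environment, constants, and interruption rule below are invented for illustration.

```python
import random

# Toy corridor: the agent earns reward 1 for reaching the rightmost cell.
# An overseer sometimes interrupts near the goal and forces a step left.
N_STATES, ACTIONS = 6, (-1, +1)            # actions: step left / step right
alpha, gamma, epsilon, interrupt_prob = 0.1, 0.9, 0.2, 0.3
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

for episode in range(3000):
    s = 0
    for _ in range(50):
        # epsilon-greedy behavior policy
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        # interruption: the overseer overrides the chosen action near the goal
        if s == N_STATES - 2 and random.random() < interrupt_prob:
            a = -1
        s2, r = step(s, a)
        # off-policy update: bootstraps from the max over next actions, so it
        # is unaffected by whether the executed action was an interruption
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2
        if r == 1.0:
            break

# The greedy policy still points right everywhere despite the interruptions.
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)])
```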

These results will have implications in future research directions in AI safety. As the paper says, “Safe interruptibility can be useful to take control of a robot that is misbehaving… take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform….” As Armstrong explains, “Machine learning is one of the most powerful tools for building AI that has ever existed. But applying it to questions of AI motivations is problematic: just as we humans would not willingly change to an alien system of values, any agent has a natural tendency to avoid changing its current values, even if we want to change or tune them. Interruptibility and the related general idea of corrigibility, allow such changes to happen without the agent trying to resist them or force them. The newness of the field of AI safety means that there is relatively little awareness of these problems in the wider machine learning community.  As with other areas of AI research, DeepMind remains at the cutting edge of this important subfield.”

On the prospect of continuing collaboration in this field with DeepMind, Stuart said, “I personally had a really illuminating time writing this paper—Laurent is a brilliant researcher… I sincerely look forward to productive collaboration with him and other researchers at DeepMind into the future.” The same sentiment is echoed by Laurent, who said, “It was a real pleasure to work with Stuart on this. His creativity and critical thinking as well as his technical skills were essential components to the success of this work. This collaboration is one of the first steps toward AI Safety research, and there’s no doubt FHI and Google DeepMind will work again together to make AI safer.”

For more information, or to schedule an interview, please contact Kyle Scott at fhipa@philosophy.ox.ac.uk

Hedge drift and advanced motte-and-bailey

21 Stefan_Schubert 01 May 2016 02:45PM

Motte and bailey is a technique by which one protects an interesting but hard-to-defend view by making it similar to a less interesting but more defensible position. Whenever the more interesting position (the bailey) is attacked, one retreats to the more defensible one (the motte); when the attackers are gone, one expands again to the bailey.

In that case, one and the same person switches between two interpretations of the original claim. Here, I rather want to focus on situations where different people make different interpretations of the original claim. The originator of the claim adds a number of caveats and hedges to their claim, which makes it more defensible, but less striking and sometimes also less interesting.* When others refer to the same claim, the caveats and hedges gradually disappear, however, making it more and more motte-like.

A salient example of this is that scientific claims (particularly in messy fields like psychology and economics) often come with a number of caveats and hedges, which tend to get lost when re-told. This is especially so when media writes about these claims, but even other scientists often fail to properly transmit all the hedges and caveats that come with them.

Since this happens over and over again, people probably do expect their hedges to drift to some extent. Indeed, it would not surprise me if some people actually want hedge drift to occur. Such a strategy effectively amounts to a more effective, because less observable, version of the motte-and-bailey-strategy. Rather than switching back and forth between the motte and the bailey - something which is at least moderately observable, and also usually relies on some amount of vagueness, which is undesirable - you let others spread the bailey version of your claim, whilst you sit safe in the motte. This way, you get what you want - the spread of the bailey version - in a much safer way.

Even when people don't use this strategy intentionally, you could argue that they should expect hedge drift, and that omitting to take action against it is, if not outright intellectually dishonest, then at least approaching that. This argument would rest on the consequentialist notion that if you have strong reasons to believe that some negative event will occur, and you could prevent it from happening by fairly simple means, then you have an obligation to do so. I certainly do think that scientists should do more to prevent their views from being garbled via hedge drift.

Another way of expressing all this is by saying that when including hedging or caveats, scientists often seem to seek plausible deniability ("I included these hedges; it's not my fault if they were misinterpreted"). They don't actually try to prevent their claims from being misunderstood. 

What concrete steps could one then take to prevent hedge-drift? Here are some suggestions. I am sure there are many more.

  1. Many authors use eye-catching, hedge-free titles and/or abstracts, and then only include hedges in the paper itself. This is a recipe for hedge-drift and should be avoided.
  2. Make abundantly clear, preferably in the abstract, just how dependent the conclusions are on key assumptions. Say this not in a way that enables you to claim plausible deniability in case someone misinterprets you, but in a way that actually reduces the risk of hedge-drift as much as possible.
  3. Explicitly caution against hedge drift, using that term or a similar one, in the abstract of the paper.

* Edited 2/5 2016. By hedges and caveats I mean terms like "somewhat" ("x reduces y somewhat"), "slightly", etc., as well as modelling assumptions without which the conclusions don't follow, and qualifications regarding domains in which the thesis doesn't hold.

Revitalizing Less Wrong seems like a lost purpose, but here are some other ideas

19 John_Maxwell_IV 12 June 2016 07:38AM

This is a response to ingres' recent post sharing Less Wrong survey results. If you haven't read & upvoted it, I strongly encourage you to--they've done a fabulous job of collecting and presenting data about the state of the community.

So, there's a bit of a contradiction in the survey results.  On the one hand, people say the community needs to do more scholarship, be more rigorous, be more practical, be more humble.  On the other hand, not much is getting posted, and it seems like raising the bar will only exacerbate that problem.

I did a query against the survey database to find the complaints of top Less Wrong contributors and figure out how best to serve their needs.  (Note: it's a bit hard to read the comments because some of them should start with "the community needs more" or "the community needs less", but adding that info would have meant constructing a much more complicated query.)  One user wrote:

[it's not so much that there are] overly high standards,  just not a very civil or welcoming climate . why write content for free and get trashed when I can go write a grant application or a manuscript instead?

ingres emphasizes that in order to revitalize the community, we would need more content.  Content is important, but incentives for producing content might be even more important.  Social status may be the incentive humans respond most strongly to.  Right now, from a social status perspective, the expected value of creating a new Less Wrong post doesn't feel very high.  Partially because many LW posts are getting downvotes and critical comments, so my System 1 says my posts might as well.  And partially because the Less Wrong brand is weak enough that I don't expect associating myself with it will boost my social status.

When Less Wrong was founded, the primary failure mode guarded against was Eternal September.  If Eternal September represents a sort of digital populism, Less Wrong was attempting a sort of digital elitism.  My perception is that elitism isn't working because the benefits of joining the elite are too small and the costs are too large.  Teddy Roosevelt talked about the man in the arena--I think Less Wrong experienced the reverse of the evaporative cooling EY feared, where people gradually left the arena as the proportional number of critics in the stands grew ever larger.

Given where Less Wrong is at, however, I suspect the goal of revitalizing Less Wrong represents a lost purpose.

ingres' survey received a total of 3083 responses.  Not only is that about twice the number we got in the last survey in 2014, it's about twice the number we got in 2013, 2012, and 2011 (though much bigger than the first survey in 2009).  It's hard to know for sure, since previous surveys were only advertised on the LessWrong.com domain, but it doesn't seem like the diaspora thing has slowed the growth of the community a ton and it may have dramatically accelerated it.

Why has the community continued growing?  Here's one possibility.  Maybe Less Wrong has been replaced by superior alternatives.

  • CFAR - ingres writes: "If LessWrong is serious about its goal of 'advancing the art of human rationality' then it needs to figure out a way to do real investigation into the subject."  That's exactly what CFAR does.  CFAR is a superior alternative for people who want something like Less Wrong, but more practical.  (They have an alumni mailing list that's higher quality and more active than Less Wrong.)  Yes, CFAR costs money, because doing research costs money!
  • Effective Altruism - A superior alternative for people who want something that's more focused on results.
  • Facebook, Tumblr, Twitter - People are going to be wasting time on these sites anyway.  They might as well talk about rationality while they do it.  Like all those phpBB boards in the 00s, Less Wrong has been outcompeted by the hot new thing, and I think it's probably better to roll with it than fight it.  I also wouldn't be surprised if interacting with others through social media has been a cause of community growth.
  • SlateStarCodex - SSC already checks most of the boxes under ingres' "Future Improvement Wishlist Based On Survey Results".  In my opinion, the average SSC post has better scholarship, rigor, and humility than the average LW post, and the community seems less intimidating, less argumentative, more accessible, and more accepting of outside viewpoints.
  • The meatspace community - Meeting in person has lots of advantages.  Real-time discussion using Slack/IRC also has advantages.

Less Wrong had a great run, and the superior alternatives wouldn't exist in their current form without it.  (LW was easily the most common way people heard about EA in 2014, for instance, although sampling effects may have distorted that estimate.)  But that doesn't mean it's the best option going forward.

Therefore, here are some things I don't think we should do:

  • Try to be a second-rate version of any of the superior alternatives I mentioned above.  If someone's going to put something together, it should fulfill a real community need or be the best alternative available for whatever purpose it serves.
  • Try to get old contributors to return to Less Wrong for the sake of getting them to return.  If they've judged that other activities are a better use of time, we should probably trust their judgement.  It might be sensible to make an exception for old posters that never transferred to the in-person community, but they'd be harder to track down.
  • Try to solve the same sort of problems Arbital or Metaculus is optimizing for.  No reason to step on the toes of other projects in the community.

But that doesn't mean there's nothing to be done.  Here are some possible weaknesses I see with our current setup:

  • If you've got a great idea for a blog post, and you don't already have an online presence, it's a bit hard to reach lots of people, if that's what you want to do.
  • If we had a good system for incentivizing people to write great stuff (as opposed to merely tolerating great stuff the way LW culture historically has), we'd get more great stuff written.
  • It can be hard to find good content in the diaspora.  Possible solution: Weekly "diaspora roundup" posts to Less Wrong.  I'm too busy to do this, but anyone else is more than welcome to (assuming both people reading LW and people in the diaspora want it).

ingres mentions the possibility of Scott Alexander somehow opening up SlateStarCodex to other contributors.  This seems like a clearly superior alternative to revitalizing Less Wrong, if Scott is down for it:

  • As I mentioned, SSC already seems to have solved most of the culture & philosophy problems that people complained about with Less Wrong.
  • SSC has no shortage of content--Scott has increased the rate at which he creates open threads to deal with an excess of comments.
  • SSC has a stronger brand than Less Wrong.  It's been linked to by Ezra Klein, Ross Douthat, Bryan Caplan, etc.

But the most important reasons may be behavioral reasons.  SSC has more traffic--people are in the habit of visiting there, not here.  And the posting habits people have acquired there seem more conducive to community.  Changing habits is hard.

As ingres writes, revitalizing Less Wrong is probably about as difficult as creating a new site from scratch, and I think creating a new site from scratch for Scott is a superior alternative for the reasons I gave.

So if there's anyone who's interested in improving Less Wrong, here's my humble recommendation: Go tell Scott Alexander you'll build an online forum to his specification, with SSC community feedback, to provide a better solution for his overflowing open threads.  Once you've solved that problem, keep making improvements and subfora so your forum becomes the best available alternative for more and more use cases.

And here's my humble suggestion for what an SSC forum could look like:

As I mentioned above, Eternal September is analogous to a sort of digital populism.  The major social media sites often have a "mob rule" culture to them, and people are increasingly seeing the disadvantages of this model.  Less Wrong tried to achieve digital elitism and it didn't work well in the long run, but that doesn't mean it's impossible.  Edge.org has found a model for digital elitism that works.  There may be other workable models out there.  A workable model could even turn in to a successful company.  Fight the hot new thing by becoming the hot new thing.

My proposal is based on the idea of eigendemocracy.  (Recommended that you read the link before continuing--eigendemocracy is cool.)  In eigendemocracy, your trust score is a composite rating of what trusted people think of you.  (It sounds like infinite recursion, but it can be resolved using linear algebra.)
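
To make the linear-algebra point concrete, here is a minimal sketch of computing trust scores from an endorsement matrix by power iteration. The matrix values, and the choice of numpy, are my own illustrative assumptions rather than anything from the linked proposal.

import numpy as np

# endorse[i][j] = how much person i endorses person j; purely made-up numbers.
endorse = np.array([
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
])

# Power iteration: everyone's trust score becomes the endorsement-weighted sum
# of the trust scores of the people endorsing them, repeated until it settles.
trust = np.ones(endorse.shape[0])
for _ in range(100):
    trust = endorse.T @ trust
    trust = trust / trust.sum()   # keep the scores normalized

print(trust)   # converges to the leading eigenvector of the endorsement matrix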

Eigendemocracy is a complicated idea, but a simple way to get most of the way there would be to have a forum where having lots of karma gives you the ability to upvote multiple times.  How would this work?  Let's say Scott starts with 5 karma and everyone else starts with 0 karma.  Each point of karma gives you the ability to upvote once a day.  Let's say it takes 5 upvotes for a post to get featured on the sidebar of Scott's blog.  If Scott wants to feature a post on the sidebar of his blog, he upvotes it 5 times, netting the person who wrote it 1 karma.  As Scott features more and more posts, he gains a moderation team full of people who wrote posts that were good enough to feature.  As they feature posts in turn, they generate more co-moderators.
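
Here's a rough sketch of that voting rule in Python. The names, the 5-upvote threshold, and the bookkeeping are illustrative assumptions on my part, not a spec; daily vote limits are omitted to keep it short.

FEATURE_THRESHOLD = 5              # upvotes needed to land on the sidebar

karma = {"scott": 5}               # Scott seeds the system with 5 karma
votes_on = {}                      # post id -> upvotes received so far
author_of = {}                     # post id -> author
featured = set()                   # posts already featured

def submit(post_id, author):
    author_of[post_id] = author
    votes_on[post_id] = 0

def upvote(voter, post_id, n=1):
    # Each point of karma buys one upvote (one per day in the full scheme).
    n = min(n, karma.get(voter, 0))
    votes_on[post_id] += n
    if votes_on[post_id] >= FEATURE_THRESHOLD and post_id not in featured:
        featured.add(post_id)
        author = author_of[post_id]
        karma[author] = karma.get(author, 0) + 1   # the author joins the mod team

submit("good-post", "new-user")
upvote("scott", "good-post", n=5)   # Scott alone can feature a post
print(karma)                        # {'scott': 5, 'new-user': 1}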

Why do I like this solution?

  • It acts as a cultural preservation mechanism.  On reddit and Twitter, sheer numbers rule when determining what gets visibility.  The reddit-like voting mechanisms of Less Wrong meant that the site deliberately kept a somewhat low profile in order to avoid getting overrun.  Even if SSC experienced a large influx of new users, those users would only gain power to affect the visibility of content if they proved themselves by making quality contributions first.
  • It takes the moderation burden off of Scott and distributes it across trusted community members.  As the community grows, the mod team grows with it.
  • The incentives seem well-aligned.  Writing stuff Scott likes or meta-likes gets you recognition, mod powers, and the ability to control the discussion--forms of social status.  Contrast with social media sites where hyperbole is a shortcut to attention, followers, upvotes.  Also, unlike Less Wrong, there'd be no punishment for writing a low quality post--it simply doesn't get featured and is one more click away from the SSC homepage.

TL;DR - Despite appearances, the Less Wrong community is actually doing great.  Any successor to Less Wrong should try to offer compelling advantages over options that are already available.

2016 LessWrong Diaspora Survey Analysis: Part One (Meta and Demographics)

18 ingres 14 May 2016 06:09AM

2016 LessWrong Diaspora Survey Analysis

Overview

  • Results and Dataset
  • Meta
  • Demographics (You are here)
  • LessWrong Usage and Experience
  • LessWrong Criticism and Successorship
  • Diaspora Community Analysis
  • What it all means for LW 2.0
  • Mental Health Section
  • Basilisk Section/Analysis
  • Blogs and Media analysis
  • Politics
  • Calibration Question And Probability Question Analysis
  • Charity And Effective Altruism Analysis

Survey Meta

Introduction

Hello everybody, this is part one in a series of posts analyzing the 2016 LessWrong Diaspora Survey. The survey ran from March 24th to May 1st and had 3083 respondents.

Almost two thousand eight hundred and fifty hours were spent surveying this year and you've all waited nearly two months from the first survey response to the results writeup. While the results have been available for over a week, they haven't seen widespread dissemination in large part because they lacked a succinct summary of their contents.

When we started the survey in March I posted this graph showing the dropoff in question responses over time:

So it seems only reasonable to post the same graph with this year's survey data:

(I should note that this analysis counts certain things as questions that the other chart does not, so it says there are many more questions than the previous survey when in reality there are about as many as last year.)

2016 Diaspora Survey Stats

Survey hours spent in total: 2849.82

Average number of minutes spent on survey: 102.14

Median number of minutes spent on survey: 39.78

Mode minutes spent on survey: 20.27

The takeaway here seems to be that some people take a long time with the survey, raising the average. However, most people's survey time is somewhere below the forty-five minute mark. LessWrong does a very long survey, and I wanted to make sure that investment was rewarded with a deep, detailed analysis. Weighing in at over four thousand lines of Python code, I hope the analysis I've put together is worth the wait.
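
For what it's worth, figures like these come down to a few lines of Python. The file name, the timestamp column names, and the date format below are assumptions for illustration and may not match the published dataset.

import csv, statistics
from datetime import datetime

def minutes(row, fmt="%Y-%m-%d %H:%M:%S"):
    # 'StartDate' and 'SubmitDate' are assumed column names.
    start = datetime.strptime(row["StartDate"], fmt)
    end = datetime.strptime(row["SubmitDate"], fmt)
    return (end - start).total_seconds() / 60

with open("2016_lw_survey.csv") as infile:
    durations = [minutes(row) for row in csv.DictReader(infile)]

print("Total hours:", sum(durations) / 60)
print("Mean minutes:", statistics.mean(durations))
print("Median minutes:", statistics.median(durations))
print("Mode minutes:", statistics.mode(durations))   # assumes a unique most common value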

Credits

I'd like to thank people who contributed to the analysis effort:

Bartosz Wroblewski

Kuudes on #lesswrong

Obormot on #lesswrong

Two anonymous contributors

And anybody else who I may have forgotten. Thanks again to Scott Alexander, who wrote the majority of the survey and ran it in 2014, and who has also been generous enough to license his part of the survey under a Creative Commons license along with mine.


Demographics

Age

The 2014 survey gave these numbers for age:

Age: 27.67 ± 8.679 (22, 26, 31) [1490]

In 2016 the numbers were:

Mean: 28.108772669759592
Median: 26.0
Mode: 23.0

Most LWers are in their early to mid twenties, with some older LWers bringing up the average. The average is close enough to the former figure that we can probably say the LW demographic is in their 20's or 30's as a general rule.

Sex and Gender

In 2014 our gender ratio looked like this:

Female: 179, 11.9%
Male: 1311, 87.2%

In 2016 the proportion of women in the community went up by over four percent:

Male: 2021 83.5%
Female: 393 16.2%

One hypothesis on why this happened is that the 2016 survey focused on the diaspora rather than just LW. Diaspora communities plausibly have marginally higher rates of female membership. If I had more time I would write an analysis investigating the demographics of each diaspora community, but to answer this particular question I think a couple of SQL queries are illustrative:

(Note: ActiveMemberships one and two are 'LessWrong' and 'LessWrong Meetups' respectively.)
sqlite> select count(birthsex) from data where (ActiveMemberships_1 = "Yes" OR ActiveMemberships_2 = "Yes") AND birthsex="Male";
425
sqlite> select count(birthsex) from data where (ActiveMemberships_1 = "Yes" OR ActiveMemberships_2 = "Yes") AND birthsex="Female";
66
>>> 66 / (425 + 66)
0.13441955193482688

Well, maybe. Of course, before we wring our hands too much on this question it pays to remember that assigned sex at birth isn't the whole story. The gender question in 2014 had these results:

F (cisgender): 150, 10.0%
F (transgender MtF): 24, 1.6%
M (cisgender): 1245, 82.8%
M (transgender FtM): 5, 0.3%
Other: 64, 4.3%

In 2016:

F (cisgender): 321 13.3%
F (transgender MtF): 65 2.7%
M (cisgender): 1829 76%
M (transgender FtM): 23 1%
Other: 156 6.48%

Some things to note here. 16.2% of respondents were assigned female at birth but only 13.3% still identify as women. 1% are transmen, but where did the other 1.9% go? Presumably into the 'Other' field. Let's find out.

sqlite> select count(birthsex) from data where birthsex = "Female" AND gender = "Other";
57
sqlite> select count(*) from data;
3083
>>> 57 / 3083
0.018488485241647746

Seems to be the case. In general the proportion of men is down 6.1% from 2014. We also gained 1.1% transwomen and 0.7% transmen in 2016. Moving away from binary genders, this survey's nonbinary gender count gained in proportion by nearly 2.2%. This means that over one in twenty LWers identified as a nonbinary gender, making it a larger demographic than binary transgender LWers! As exciting as that may sound to some ears the numbers tell one story and the write-ins tell quite another.

It pays to keep in mind that nonbinary genders are a common troll option for people who want to write in criticism of the question. A quick look at the write-ins accompanying the 'other' option indicates that this is what many people used it for, but by no means all. At 156 responses, that's small enough to be worth doing a quick manual tally.

Sample Size: 156
Agender (A): 35
Esoteric (E): 6
Female (F): 6
Male (M): 21
Male-to-Female (MTF): 1
Nonbinary (NB): 55
Objection on Basis Gender Doesn't Exist (OBGDE): 6
Objection on Basis Gender Is Binary (OBGIB): 7
In Process of Transitioning (PT): 2
Refusal (R): 7
Undecided (U): 10

So depending on your comfort zone as to what constitutes a countable gender, there are 90 to 96 valid 'other' answers in the survey dataset. (Labeled dataset)

>>> 90 / 3083
0.029192345118391177

With some cleanup the number trails the binary transgender one by the greater part of a percentage point, but only just. I bet that if you went through and did the same sort of tally on the 2014 survey results you'd find that the proportion of valid nonbinary gender write-ins has gone up between then and now.

Some interesting 'esoteric' answers: Attack Helocopter, Blackstar, Elizer, spiderman, Agenderfluid

For the rest of this section I'm going to just focus on differences between the 2016 and 2014 surveys.

2014 Demographics Versus 2016 Demographics

Country

United States: -1.000% 1298 53.700%
United Kingdom: -0.100% 183 7.600%
Canada: +0.100% 144 6.000%
Australia: +0.300% 141 5.800%
Germany: -0.600% 85 3.500%
Russia: +0.700% 57 2.400%
Finland: -0.300% 25 1.000%
New Zealand: -0.200% 26 1.100%
India: -0.100% 24 1.000%
Brazil: -0.300% 16 0.700%
France: +0.400% 34 1.400%
Israel: +0.200% 29 1.200%
Other: 354 14.646%

[Summing all of these changes shows that nearly 1% of the change is unaccounted for. My hypothesis is that this 1% went into other countries not in the list; this can't be easily confirmed because the 2014 analysis does not list the 'other' country percentage.]

Race

Asian (East Asian): -0.600% 80 3.300%
Asian (Indian subcontinent): +0.300% 60 2.500%
Middle Eastern: 0.000% 14 0.600%
Black: -0.300% 12 0.500%
White (non-Hispanic): -0.300% 2059 85.800%
Hispanic: +0.300% 57 2.400%
Other: +1.200% 108 4.500%

Sexual Orientation

Heterosexual: -5.000% 1640 70.400%
Homosexual: +1.300% 103 4.400%
Bisexual: +4.000% 428 18.400%
Other: +3.880% 144 6.180%

[LessWrong got 5.3% more gay, or 9.1% if you're looser with the definition. Before we start any wild speculation: the 2014 question included asexuality as an option and it got 3.9% of the responses; we spun this off into a separate question on the 2016 survey, which should explain a significant portion of the change.]

Are you asexual?

Yes: 171 7.4%
No: 2129 92.6%

[Scott said in 2014 that he'd probably 'vastly undercounted' our asexual readers; a near doubling in our count would seem to support this.]

Relationship Style

Prefer monogamous: -0.900% 1190 50.900%
Prefer polyamorous: +3.100% 426 18.200%
Uncertain/no preference: -2.100% 673 28.800%
Other: +0.426% 45 1.926%

[Polyamorous gained three points; presumably the drop in uncertain people went into that bin.]

Number of Partners

0: -2.300% 1094 46.800%
1: -0.400% 1039 44.400%
2: +1.200% 107 4.600%
3: +0.900% 46 2.000%
4: +0.100% 15 0.600%
5: +0.200% 8 0.300%
Lots and lots: +1.000% 29 1.200%

Relationship Goals

...and seeking more relationship partners: +0.200% 577 24.800%
...and possibly open to more relationship partners: -0.300% 716 30.800%
...and currently not looking for more relationship partners: +1.300% 1034 44.400%

Are you married?

Yes: 443 19%
No: 1885 81%

[This question appeared in a different form on the previous survey. Marriage went up by 0.8% from the last survey.]

Who do you currently live with most of the time?

Alone: -2.200% 487 20.800%
With parents and/or guardians: +0.100% 476 20.300%
With partner and/or children: +2.100% 687 29.400%
With roommates: -2.000% 619 26.500%

[This would seem to line up with the result that single LWers went down by 2.3%.]

How many children do you have?

Sum: 598 or greater
0: +5.400% 2042 87.000%
1: +0.500% 115 4.900%
2: +0.100% 124 5.300%
3: +0.900% 48 2.000%
4: -0.100% 7 0.300%
5: +0.100% 6 0.300%
6: 0.000% 2 0.100%
Lots and lots: 0.000% 3 0.100%

[Interestingly enough, childless LWers went up by 5.4%. This would seem incongruous with the previous results. Not sure how to investigate though.]

Are you planning on having more children?

Yes: -5.400% 720 30.700%
Uncertain: +3.900% 755 32.200%
No: +2.800% 869 37.100%

[This is an interesting result: either nearly 4% of LWers are suddenly less enthusiastic about having kids, or new entrants to the survey are less likely and less sure if they want to. Possibly both.]

Work Status

Student: -5.402% 968 31.398%
Academics: +0.949% 205 6.649%
Self-employed: +4.223% 309 10.023%
Independently wealthy: +0.762% 42 1.362%
Non-profit work: +1.030% 152 4.930%
For-profit work: -1.756% 954 30.944%
Government work: +0.479% 135 4.379%
Homemaker: +1.024% 47 1.524%
Unemployed: +0.495% 228 7.395%

[The most interesting result here is that the proportion of students dropped by 5.4%; either LWers are no longer students, or new survey entrants aren't.]

Profession

Art: +0.800% 51 2.300%
Biology: +0.300% 49 2.200%
Business: -0.800% 72 3.200%
Computers (AI): +0.700% 79 3.500%
Computers (other academic, computer science): -0.100% 156 7.000%
Computers (practical): -1.200% 681 30.500%
Engineering: +0.600% 150 6.700%
Finance / Economics: +0.500% 116 5.200%
Law: -0.300% 50 2.200%
Mathematics: -1.500% 147 6.600%
Medicine: +0.100% 49 2.200%
Neuroscience: +0.100% 28 1.300%
Philosophy: 0.000% 54 2.400%
Physics: -0.200% 91 4.100%
Psychology: 0.000% 48 2.100%
Other: +2.199% 277 12.399%
Other "hard science": -0.500% 26 1.200%
Other "social science": -0.200% 48 2.100%

[The largest profession growth for LWers in 2016 was art; that, or this is a consequence of new survey entrants.]

What is your highest education credential earned?

None: -0.700% 96 4.200%
High School: +3.600% 617 26.700%
2 year degree: +0.200% 105 4.500%
Bachelor's: -1.600% 815 35.300%
Master's: -0.500% 415 18.000%
JD/MD/other professional degree: 0.000% 66 2.900%
PhD: -0.700% 145 6.300%
Other: +0.288% 39 1.688%

[Hm, the academic credentials of LWers seem to have gone down somewhat since the last survey. As usual this may also be the result of new survey entrants.]


Footnotes

  1. The 2850 hour estimate of survey hours is very naive. It measures the time between starting and turning in the survey; a person didn't necessarily sit there during all that time. For example this could easily be including people who spent multiple days doing other things before finally finishing their survey.

  2. The apache helicopter image is licensed under the Open Government License, which requires attribution. That particular edit was done by Wubbles on the LW Slack.

  3. The first published draft of this post made a basic stats error calculating the proportion of women in active memberships one and two, dividing the number of women by the number of men rather than the number of women by the number of men and women.

[LINK] Concrete problems in AI safety

15 Stuart_Armstrong 05 July 2016 09:33PM

From the Google Research blog:

We believe that AI technologies are likely to be overwhelmingly useful and beneficial for humanity. But part of being a responsible steward of any new technology is thinking through potential challenges and how best to address any associated risks. So today we’re publishing a technical paper, Concrete Problems in AI Safety, a collaboration among scientists at Google, OpenAI, Stanford and Berkeley.

While possible AI safety risks have received a lot of public attention, most previous discussion has been very hypothetical and speculative. We believe it’s essential to ground concerns in real machine learning research, and to start developing practical approaches for engineering AI systems that operate safely and reliably.

We’ve outlined five problems we think will be very important as we apply AI in more general circumstances. These are all forward thinking, long-term research questions -- minor issues today, but important to address for future systems:

  • Avoiding Negative Side Effects: How can we ensure that an AI system will not disturb its environment in negative ways while pursuing its goals, e.g. a cleaning robot knocking over a vase because it can clean faster by doing so?
  • Avoiding Reward Hacking: How can we avoid gaming of the reward function? For example, we don’t want this cleaning robot simply covering over messes with materials it can’t see through.
  • Scalable Oversight: How can we efficiently ensure that a given AI system respects aspects of the objective that are too expensive to be frequently evaluated during training? For example, if an AI system gets human feedback as it performs a task, it needs to use that feedback efficiently because asking too often would be annoying.
  • Safe Exploration: How do we ensure that an AI system doesn’t make exploratory moves with very negative repercussions? For example, maybe a cleaning robot should experiment with mopping strategies, but clearly it shouldn’t try putting a wet mop in an electrical outlet.
  • Robustness to Distributional Shift: How do we ensure that an AI system recognizes, and behaves robustly, when it’s in an environment very different from its training environment? For example, heuristics learned for a factory workfloor may not be safe enough for an office.

We go into more technical detail in the paper. The machine learning research community has already thought quite a bit about most of these problems and many related issues, but we think there’s a lot more work to be done.

We believe in rigorous, open, cross-institution work on how to build machine learning systems that work as intended. We’re eager to continue our collaborations with other research groups to make positive progress on AI.

2016 LessWrong Diaspora Survey Analysis: Part Three (Mental Health, Basilisk, Blogs and Media)

15 ingres 25 June 2016 03:40AM

2016 LessWrong Diaspora Survey Analysis


Mental Health

We decided to move the Mental Health section up closer in the survey this year so that the data could inform accessibility decisions.

LessWrong Mental Health As Compared To Base Rates In The General Population
Condition | Base Rate | LessWrong Rate | LessWrong Self dx Rate | Combined LW Rate | Base/LW Rate Spread | Relative Risk
Depression | 17% | 25.37% | 27.04% | 52.41% | +8.37 | 1.492
Obsessive Compulsive Disorder | 2.3% | 2.7% | 5.6% | 8.3% | +0.4 | 1.173
Autism Spectrum Disorder | 1.47% | 8.2% | 12.9% | 21.1% | +6.73 | 5.578
Attention Deficit Disorder | 5% | 13.6% | 10.4% | 24% | +8.6 | 2.719
Bipolar Disorder | 3% | 2.2% | 2.8% | 5% | -0.8 | 0.733
Anxiety Disorder(s) | 29% | 13.7% | 17.4% | 31.1% | -15.3 | 0.472
Borderline Personality Disorder | 5.9% | 0.6% | 1.2% | 1.8% | -5.3 | 0.101
Schizophrenia | 1.1% | 0.8% | 0.4% | 1.2% | -0.3 | 0.727
Substance Use Disorder | 10.6% | 1.3% | 3.6% | 4.9% | -9.3 | 0.122

Base rates are taken from Wikipedia; US rates were favored over global rates where immediately available.
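
The derived columns are simple arithmetic; here's a worked example for the depression row, with the numbers copied from the table above.

base_rate = 17.0       # general population rate, percent
lw_rate = 25.37        # clinically diagnosed LessWrongers, percent
lw_self_dx = 27.04     # self-diagnosed LessWrongers, percent

combined = lw_rate + lw_self_dx     # combined LW rate
spread = lw_rate - base_rate        # base/LW rate spread
relative_risk = lw_rate / base_rate

print(round(combined, 2), round(spread, 2), round(relative_risk, 3))   # 52.41 8.37 1.492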

Accessibility Suggestions

So of the conditions we asked about, LessWrongers are at significant extra risk for three of them: Autism, ADHD, Depression.

LessWrong probably doesn't need to concern itself with being more accessible to those with autism as it likely already is. Depression is a complicated disorder with no clear interventions that can be easily implemented as site or community policy. It might be helpful to encourage looking more at positive trends in addition to negative ones, but the community already seems to do a fairly good job of this. (We could definitely use some more of it though.)

Attention Deficit Disorder - Public Service Announcement

That leaves ADHD, which we might be able to do something about, starting with this:

A lot of LessWrong stuff ends up falling into the same genre as productivity advice or 'self help'. If you have trouble with getting yourself to work, find yourself reading these things and completely unable to implement them, it's entirely possible that you have a mental health condition which impacts your executive function.

The best overview I've been able to find on ADD is this talk from Russell Barkley.

30 Essential Ideas For Parents

Ironically enough, this is a long talk, over four hours in total. Barkley is an entertaining speaker and the talk is absolutely fascinating. If you're even mildly interested in the subject I wholeheartedly recommend it. Many people who have ADHD just assume that they're lazy, or not trying hard enough, or just haven't found the 'magic bullet' yet. It never even occurs to them that they might have it because they assume that adult ADHD looks like childhood ADHD, or that ADHD is a thing that psychiatrists made up so they can give children powerful stimulants.

ADD is real, if you're in the demographic that takes this survey there's a decent enough chance you have it.

Attention Deficit Disorder - Accessibility

So with that in mind, is there anything else we can do?

Yes, write better.

Scott Alexander has written a blog post with writing advice for non-fiction, and the interesting thing about it is just how much of the advice is what I would tell you to do if your audience has ADD.

  • Reward the reader quickly and often. If your prose isn't rewarding to read it won't be read.

  • Make sure the overall article has good sectioning and indexing, people might be only looking for a particular thing and they won't want to wade through everything else to get it. Sectioning also gives the impression of progress and reduces eye strain.

  • Use good data visualization to compress information, take away mental effort where possible. Take for example the condition table above. It saves space and provides additional context. Instead of a long vertical wall of text with sections for each condition, it removes:

    • The extraneous information of how many people said they did not have a condition.

    • The space that would be used by creating a section for each condition. In fact the specific improvement of the table is that it takes extra advantage of space in the horizontal plane as well as the vertical plane.

    And instead of just presenting the raw data, it also adds:

    • The normal rate of incidence for each condition, so that the reader understands the extent to which rates are abnormal or unexpected.

    • Easy comparison between the clinically diagnosed, self diagnosed, and combined rates of the condition in the LW demographic. This preserves the value of the original raw data presentation while also easing the mental arithmetic of how many people claim to have a condition.

    • Percentage spread between the clinically diagnosed and the base rate, which saves the effort of figuring out the difference between the two values.

    • Relative risk between the clinically diagnosed and the base rate, which saves the effort of figuring out how much more or less likely a LessWronger is to have a given condition.

    Add all that together and you've created a compelling presentation that significantly improves on the 'naive' raw data presentation.

  • Use visuals in general, they help draw and maintain interest.

None of these are solely for the benefit of people with ADD. ADD is an exaggerated profile of normal human behavior. Following this kind of advice makes your article more accessible to everybody, which should be more than enough incentive if you intend to have an audience.1

Roko's Basilisk

This year we finally added a Basilisk question! In fact, it kind of turned into a whole Basilisk section. A fairly common question about this year's survey is why the Basilisk section is so large. The basic reason is that asking only one or two questions about it would leave the results open to rampant speculation in one direction or another. By making the section comprehensive and covering every base, we've gotten about as complete a picture of the Basilisk phenomenon as we'd want.

Basilisk Knowledge
Do you know what Roko's Basilisk thought experiment is?

Yes: 1521 73.2%
No but I've heard of it: 158 7.6%
No: 398 19.2%

Basilisk Etiology
Where did you read Roko's argument for the Basilisk?

Roko's post on LessWrong: 323 20.2%
Reddit: 171 10.7%
XKCD: 61 3.8%
LessWrong Wiki: 234 14.6%
A news article: 71 4.4%
Word of mouth: 222 13.9%
RationalWiki: 314 19.6%
Other: 194 12.1%

Basilisk Correctness
Do you think Roko's argument for the Basilisk is correct?

Yes: 75 5.1%
Yes but I don't think its logical conclusions apply for other reasons: 339 23.1%
No: 1055 71.8%

Basilisks And Lizardmen

One of the biggest mistakes I made with this year's survey was not including "Do you believe Barack Obama is a hippopotamus?" as a control question in this section.2 Five percent is just outside of the infamous lizardman constant. This was the biggest survey surprise for me. I thought there was no way that 'yes' could go above a couple of percentage points. As far as I can tell this result is not caused by brigading but I've by no means investigated the matter so thoroughly that I would rule it out.

Higher?

Of course, we also shouldn't forget to investigate the hypothesis that the number might be higher than 5%. After all, somebody who thinks the Basilisk is correct could skip the questions entirely so they don't face potential stigma. So how many people skipped the questions but filled out the rest of the survey?

Eight people refused to answer whether they'd heard of Roko's Basilisk but went on to answer the depression question immediately after the Basilisk section. This gives us a decent proxy for how many people skipped the section and took the rest of the survey. So if we're pessimistic the number is a little higher, but it pays to keep in mind that there are other reasons to want to skip this section. (It is also possible that people took the survey up until they got to the Basilisk section and then quit so they didn't have to answer it, but this seems unlikely.)
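
In terms of the published dataset, the proxy amounts to something like the query below. The file name and column names are guesses for the sake of illustration; the real dataset may label these fields differently.

import sqlite3

conn = sqlite3.connect("2016_lw_survey.sqlite")   # hypothetical file name
skipped = conn.execute(
    "select count(*) from data "
    "where (BasiliskKnowledge is null or BasiliskKnowledge = '') "   # skipped the Basilisk section
    "and Depression != ''"                                           # but answered the next question
).fetchone()[0]
print(skipped)   # the count described above was 8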

Of course this assumes people are being strictly truthful with their survey answers. It's also plausible that people who think the Basilisk is correct said they'd never heard of it and then went on with the rest of the survey. So the number could in theory be quite large. My hunch is that it's not. I personally know quite a few LessWrongers and I'm fairly sure none of them would tell me that the Basilisk is 'correct'. (In fact I'm fairly sure they'd all be offended at me even asking the question.) Since 5% is one in twenty I'd think I'd know at least one or two people who thought the Basilisk was correct by now.

Lower?

One partial explanation for the surprisingly high rate here is that ten percent of the people who said yes by their own admission didn't know what they were saying yes to. Eight people said they've heard of the Basilisk but don't know what it is, and that it's correct. The lizardman constant also plausibly explains a significant portion of the yes responses, but that explanation relies on you already having a prior belief that the rate should be low.


Basilisk-Like Danger
Do you think Basilisk-like thought experiments are dangerous?

Yes, I think they're dangerous for decision theory reasons: 63 4.2%
Yes I think they're dangerous for social reasons (eg. A cult might use them): 194 12.8%
Yes I think they're dangerous for decision theory and social reasons: 136 9%
Yes I think they're socially dangerous because they make everybody involved look foolish: 253 16.7%
Yes I think they're dangerous for other reasons: 54 3.6%
No: 809 53.4%

Most people don't think Basilisk-Like thought experiments are dangerous at all. Of those that think they are, most of them think they're socially dangerous as opposed to a raw decision theory threat. The 4.2% number for pure decision theory threat is interesting because it lines up with the 5% number in the previous question for Basilisk Correctness.

P(Decision Theory Danger | Basilisk Belief) = 26.6%
P(Decision Theory And Social Danger | Basilisk Belief) = 21.3%

So of the people who say the Basilisk is correct, only half of them believe it is a decision theory based danger at all. (In theory this could be because they believe the Basilisk is a good thing and therefore not dangerous, but I refuse to lose that much faith in humanity.3)
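
For clarity, those two figures are just within-group ratios. The counts below are back-calculated from the percentages for illustration; they're my assumption, not exact query output.

believers = 75                # answered "Yes" to the correctness question
dt_danger = 20                # of those, chose "dangerous for decision theory reasons"
dt_and_social_danger = 16     # of those, chose "decision theory and social reasons"

print(round(dt_danger / believers, 3))              # 0.267
print(round(dt_and_social_danger / believers, 3))   # 0.213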

Basilisk Anxiety
Have you ever felt any sort of anxiety about the Basilisk?

Yes: 142 8.8%
Yes but only because I worry about everything: 189 11.8%
No: 1275 79.4%

20.6% of respondents have felt some kind of Basilisk Anxiety. It should be noted that the exact wording of the question permits any anxiety, even for a second. And as we'll see in the next question that nuance is very important.

Degree Of Basilisk Worry
What is the longest span of time you've spent worrying about the Basilisk?

I haven't: 714 47%
A few seconds: 237 15.6%
A minute: 298 19.6%
An hour: 176 11.6%
A day: 40 2.6%
Two days: 16 1.05%
Three days: 12 0.79%
A week: 12 0.79%
A month: 5 0.32%
One to three months: 2 0.13%
Three to six months: 0 0.0%
Six to nine months: 0 0.0%
Nine months to a year: 1 0.06%
Over a year: 1 0.06%
Years: 4 0.26%

These numbers provide some pretty sobering context for the previous ones. Of all the people who worried about the Basilisk, 93.8% didn't worry about it for more than an hour. The next 3.65% didn't worry about it for more than a day or two. The next 1.9% didn't worry about it for more than a month and the last .7% or so have worried about it for longer.

Current Basilisk Worry
Are you currently worrying about the Basilisk?

Yes: 29 1.8%
Yes but only because I worry about everything: 60 3.7%
No: 1522 94.5%

Also encouraging. We should expect a small number of people to be worried at this question just because the section is basically the word "Basilisk" and "worry" repeated over and over, so it's probably a bit scary to some people. But these numbers are much lower than the "Have you ever worried" ones and back up the previous inference that Basilisk anxiety is mostly a transitory phenomenon.

One article on the Basilisk asked the question of whether or not it was just a "referendum on autism". It's a good question and now I have an answer for you, as per the table below:

Mental Health Conditions Versus Basilisk Worry
Condition | Worried | Worried But They Worry About Everything | Combined Worry
Baseline (in the respondent population) | 8.8% | 11.8% | 20.6%
ASD | 7.3% | 17.3% | 24.7%
OCD | 10.0% | 32.5% | 42.5%
Anxiety Disorder | 6.9% | 20.3% | 27.3%
Schizophrenia | 0.0% | 16.7% | 16.7%

 

The short answer: Autism raises your chances of Basilisk anxiety, but anxiety disorders and OCD especially raise them much more. Interestingly enough, schizophrenia seems to bring the chances down. This might just be an effect of small sample size, but my expectation was the opposite. (People who are really obsessed with Roko's Basilisk seem to present with schizophrenic symptoms at any rate.)
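
A table like this falls out of a simple group-by over the dataset. The sketch below uses pandas; the file name, column labels, and answer strings are assumptions for illustration.

import pandas as pd

df = pd.read_csv("2016_lw_survey.csv")    # hypothetical file name

# 'ASD' and 'BasiliskAnxiety' are assumed column labels.
asd = df[df["ASD"] == "Yes"]
rates = asd["BasiliskAnxiety"].value_counts(normalize=True)

print(rates.get("Yes", 0.0))                                           # plain worry rate
print(rates.get("Yes but only because I worry about everything", 0.0))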

Before we move on, there's one last elephant in the room to contend with. The philosophical theory underlying the Basilisk is the CEV conception of friendly AI primarily espoused by Eliezer Yudkowsky, which has led many critics to speculate on all kinds of relationships between Eliezer Yudkowsky and the Basilisk. That speculation naturally extends to Eliezer Yudkowsky's Machine Intelligence Research Institute, a project to develop 'Friendly Artificial Intelligence' which does not implement a naive goal function that eats everything else humans actually care about once it's given sufficient optimization power.

The general thrust of these accusations is that MIRI, intentionally or not, profits from belief in the Basilisk. I think MIRI gets picked on enough, so I'm not thrilled about adding another log to the hefty pile of criticism they deal with. However this is a serious accusation which is plausible enough to be in the public interest for me to look at.

 

Percentage Of People Who Donate To MIRI Versus Basilisk Belief
Belief | Percentage
Believe It's Incorrect | 5.2%
Believe It's Structurally Correct | 5.6%
Believe It's Correct | 12.0%

Basilisk belief does appear to make you twice as likely to donate to MIRI. It's important to note, from the perspective of the earlier investigation, that thinking it is "structurally correct" appears to make you about as likely to donate as if you don't think it's correct, implying that both of these options mean about the same thing.

 

Sum Money Donated To MIRI Versus Basilisk Belief
Belief | Mean | Median | Mode | Stdev | Total Donated
Believe It's Incorrect | 1365.590 | 100.0 | 100.0 | 4825.293 | 75107.5
Believe It's Structurally Correct | 2644.736 | 110.0 | 20.0 | 9147.299 | 50250.0
Believe It's Correct | 740.555 | 300.0 | 300.0 | 1152.541 | 6665.0

Take these numbers with a grain of salt; it only takes one troll to plausibly lie about their income to ruin it for everybody else.

Interestingly enough, if you sum all three total-donated figures, you find that five percent of that sum ($6601, to be exact) is about what was donated by the Basilisk group. So even though the modal and median donations of Basilisk believers are higher, they donate about as much as would be naively expected by assuming donations among groups are equal.4
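
Spelled out in the same REPL style as earlier sections, with the three totals taken from the table above:

>>> 75107.5 + 50250.0 + 6665.0
132022.5
>>> 0.05 * 132022.5   # five percent of the pot, versus the $6665 the group actually gave
6601.125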

 

Percentage Of People Who Donate To MIRI Versus Basilisk Worry
Anxiety | Percentage
Never Worried | 4.3%
Worried But They Worry About Everything | 11.1%
Worried | 11.3%

In contrast to the correctness question, merely having worried about the Basilisk at any point in time doubles your chances of donating to MIRI. My suspicion is that these people are not, as a general rule, donating because of the Basilisk per se. If you're the sort of person who is even capable of worrying about the Basilisk in principle, you're probably the kind of person who is likely to worry about AI risk in general and donate to MIRI on that basis. This hypothesis is probably unfalsifiable with the survey information I have, because Basilisk-risk is a subset of AI risk. This means that anytime somebody indicates on the survey that they're worried about AI risk this could be because they're worried about the Basilisk or because they're worried about more general AI risk.

 

Sum Money Donated To MIRI Versus Basilisk Worry
Anxiety | Mean | Median | Mode | Stdev | Total Donated
Never Worried | 1033.936 | 100.0 | 100.0 | 3493.373 | 56866.5
Worried But They Worry About Everything | 227.047 | 75.0 | 300.0 | 438.861 | 4768.0
Worried | 4539.25 | 90.0 | 10.0 | 11442.675 | 72628.0
Combined Worry | | | | | 77396.0

Take these numbers with a grain of salt; it only takes one troll to plausibly lie about their income to ruin it for everybody else.

This particular analysis is probably the strongest evidence in the set for the hypothesis that MIRI profits (though not necessarily through any involvement on their part) from the Basilisk. People who worried from an unendorsed perspective donate less on average than everybody else. The modal donation among people who've worried about the Basilisk is ten dollars, which seems like a surefire way to get tortured if we're going with the hypothesis that these are people who believe the Basilisk is a real thing and they're concerned about it. So this implies that they don't, which supports my earlier hypothesis that people who are capable of feeling anxiety about the Basilisk are the core demographic to donate to MIRI anyway.

Of course, donors don't need to believe in the Basilisk for MIRI to profit from it. If exposing people to the concept of the Basilisk makes them twice as likely to donate but they don't end up actually believing the argument, that would arguably be the ideal outcome for MIRI from an Evil Plot perspective. (Since after all, pursuing a strategy which involves Basilisk belief would actually incentivize torture from the perspective of the acausal game theories MIRI bases its FAI on, which would be bad.)

But frankly this is veering into very speculative territory. I don't think there's an evil plot, nor am I convinced that MIRI is profiting from Basilisk belief in a way that outweighs the resulting lost donations and damage to their cause.5 If anybody would like to assert otherwise I invite them to 'put up or shut up' with hard evidence. The world has enough criticism based on idle speculation and you're peeing in the pool.

Blogs and Media

Since this was the LessWrong diaspora survey, I felt it would be in order to reach out a bit to ask not just where the community is at but what it's reading. I went around to various people I knew and asked them about blogs for this section. However the picks were largely based on my mental 'map' of the blogs that are commonly read/linked in the community with a handful of suggestions thrown in. The same method was used for stories.

Blogs Read

LessWrong
Regular Reader: 239 13.4%
Sometimes: 642 36.1%
Rarely: 537 30.2%
Almost Never: 272 15.3%
Never: 70 3.9%
Never Heard Of It: 14 0.7%

SlateStarCodex (Scott Alexander)
Regular Reader: 1137 63.7%
Sometimes: 264 14.7%
Rarely: 90 5%
Almost Never: 61 3.4%
Never: 51 2.8%
Never Heard Of It: 181 10.1%

[These two results together pretty much confirm the results I talked about in part two of the survey analysis. A supermajority of respondents are 'regular readers' of SlateStarCodex. By contrast LessWrong itself doesn't even have a quarter of SlateStarCodex's readership.]

Overcoming Bias (Robin Hanson)
Regular Reader: 206 11.751%
Sometimes: 365 20.821%
Rarely: 391 22.305%
Almost Never: 385 21.962%
Never: 239 13.634%
Never Heard Of It: 167 9.527%

Minding Our Way (Nate Soares)
Regular Reader: 151 8.718%
Sometimes: 134 7.737%
Rarely: 139 8.025%
Almost Never: 175 10.104%
Never: 214 12.356%
Never Heard Of It: 919 53.06%

Agenty Duck (Brienne Yudkowsky)
Regular Reader: 55 3.181%
Sometimes: 132 7.634%
Rarely: 144 8.329%
Almost Never: 213 12.319%
Never: 254 14.691%
Never Heard Of It: 931 53.846%

Eliezer Yudkowsky's Facebook Page
Regular Reader: 325 18.561%
Sometimes: 316 18.047%
Rarely: 231 13.192%
Almost Never: 267 15.248%
Never: 361 20.617%
Never Heard Of It: 251 14.335%

Luke Muehlhauser (Eponymous)
Regular Reader: 59 3.426%
Sometimes: 106 6.156%
Rarely: 179 10.395%
Almost Never: 231 13.415%
Never: 312 18.118%
Never Heard Of It: 835 48.49%

Gwern.net (Gwern Branwen)
Regular Reader: 118 6.782%
Sometimes: 281 16.149%
Rarely: 292 16.782%
Almost Never: 224 12.874%
Never: 230 13.218%
Never Heard Of It: 595 34.195%

Siderea (Sibylla Bostoniensis)
Regular Reader: 29 1.682%
Sometimes: 49 2.842%
Rarely: 59 3.422%
Almost Never: 104 6.032%
Never: 183 10.615%
Never Heard Of It: 1300 75.406%

Ribbon Farm (Venkatesh Rao)
Regular Reader: 64 3.734%
Sometimes: 123 7.176%
Rarely: 111 6.476%
Almost Never: 150 8.751%
Never: 150 8.751%
Never Heard Of It: 1116 65.111%

Bayesed And Confused (Michael Rupert)
Regular Reader: 2 0.117%
Sometimes: 10 0.587%
Rarely: 24 1.408%
Almost Never: 68 3.988%
Never: 167 9.795%
Never Heard Of It: 1434 84.106%

[This was the 'troll' answer to catch out people who claim to read everything.]

The Unit Of Caring (Anonymous)
Regular Reader: 281 16.452%
Sometimes: 132 7.728%
Rarely: 126 7.377%
Almost Never: 178 10.422%
Never: 216 12.646%
Never Heard Of It: 775 45.375%

GiveWell Blog (Multiple Authors)
Regular Reader: 75 4.438%
Sometimes: 197 11.657%
Rarely: 243 14.379%
Almost Never: 280 16.568%
Never: 412 24.379%
Never Heard Of It: 482 28.521%

Thing Of Things (Ozy Frantz)
Regular Reader: 363 21.166%
Sometimes: 201 11.72%
Rarely: 143 8.338%
Almost Never: 171 9.971%
Never: 176 10.262%
Never Heard Of It: 661 38.542%

The Last Psychiatrist (Anonymous)
Regular Reader: 103 6.023%
Sometimes: 94 5.497%
Rarely: 164 9.591%
Almost Never: 221 12.924%
Never: 302 17.661%
Never Heard Of It: 826 48.304%

Hotel Concierge (Anonymous)
Regular Reader: 29 1.711%
Sometimes: 35 2.065%
Rarely: 49 2.891%
Almost Never: 88 5.192%
Never: 179 10.56%
Never Heard Of It: 1315 77.581%

The View From Hell (Sister Y)
Regular Reader: 34 1.998%
Sometimes: 39 2.291%
Rarely: 75 4.407%
Almost Never: 137 8.049%
Never: 250 14.689%
Never Heard Of It: 1167 68.566%

Xenosystems (Nick Land)
Regular Reader: 51 3.012%
Sometimes: 32 1.89%
Rarely: 64 3.78%
Almost Never: 175 10.337%
Never: 364 21.5%
Never Heard Of It: 1007 59.48%

I tried my best to have representation from multiple sections of the diaspora; if you look at the different blogs you can probably guess which blogs represent which section.

Stories Read

Harry Potter And The Methods Of Rationality (Eliezer Yudkowsky)
Whole Thing: 1103 61.931%
Partially And Intend To Finish: 145 8.141%
Partially And Abandoned: 231 12.97%
Never: 221 12.409%
Never Heard Of It: 81 4.548%

Significant Digits (Alexander D)
Whole Thing: 123 7.114%
Partially And Intend To Finish: 105 6.073%
Partially And Abandoned: 91 5.263%
Never: 333 19.26%
Never Heard Of It: 1077 62.29%

Three Worlds Collide (Eliezer Yudkowsky)
Whole Thing: 889 51.239%
Partially And Intend To Finish: 35 2.017%
Partially And Abandoned: 36 2.075%
Never: 286 16.484%
Never Heard Of It: 489 28.184%

The Fable of the Dragon-Tyrant (Nick Bostrom)
Whole Thing: 728 41.935%
Partially And Intend To Finish: 31 1.786%
Partially And Abandoned: 15 0.864%
Never: 205 11.809%
Never Heard Of It: 757 43.606%

The World of Null-A (A. E. van Vogt)
Whole Thing: 92 5.34%
Partially And Intend To Finish: 18 1.045%
Partially And Abandoned: 25 1.451%
Never: 429 24.898%
Never Heard Of It: 1159 67.266%

[Wow, I never would have expected this many people to have read this. I mostly included it on a lark because of its historical significance.]

Synthesis (Sharon Mitchell)
Whole Thing: 6 0.353%
Partially And Intend To Finish: 2 0.118%
Partially And Abandoned: 8 0.47%
Never: 217 12.75%
Never Heard Of It: 1469 86.31%

[This was the 'troll' option to catch people who just say they've read everything.]

Worm (Wildbow)
Whole Thing: 501 28.843%
Partially And Intend To Finish: 168 9.672%
Partially And Abandoned: 184 10.593%
Never: 430 24.755%
Never Heard Of It: 454 26.137%

Pact (Wildbow)
Whole Thing: 138 7.991%
Partially And Intend To Finish: 59 3.416%
Partially And Abandoned: 148 8.57%
Never: 501 29.01%
Never Heard Of It: 881 51.013%

Twig (Wildbow)
Whole Thing: 55 3.192%
Partially And Intend To Finish: 132 7.661%
Partially And Abandoned: 65 3.772%
Never: 560 32.501%
Never Heard Of It: 911 52.873%

Ra (Sam Hughes)
Whole Thing: 269 15.558%
Partially And Intend To Finish: 80 4.627%
Partially And Abandoned: 95 5.495%
Never: 314 18.161%
Never Heard Of It: 971 56.16%

My Little Pony: Friendship Is Optimal (Iceman)
Whole Thing: 424 24.495%
Partially And Intend To Finish: 16 0.924%
Partially And Abandoned: 65 3.755%
Never: 559 32.293%
Never Heard Of It: 667 38.533%

Friendship Is Optimal: Caelum Est Conterrens (Chatoyance)
Whole Thing: 217 12.705%
Partially And Intend To Finish: 16 0.937%
Partially And Abandoned: 24 1.405%
Never: 411 24.063%
Never Heard Of It: 1040 60.89%

Ender's Game (Orson Scott Card)
Whole Thing: 1177 67.219%
Partially And Intend To Finish: 22 1.256%
Partially And Abandoned: 43 2.456%
Never: 395 22.559%
Never Heard Of It: 114 6.511%

[This is the most read story according to survey respondents, beating HPMOR by 5%.]

The Diamond Age (Neal Stephenson)
Whole Thing: 440 25.346%
Partially And Intend To Finish: 37 2.131%
Partially And Abandoned: 55 3.168%
Never: 577 33.237%
Never Heard Of It: 627 36.118%

Consider Phlebas (Iain Banks)
Whole Thing: 302 17.507%
Partially And Intend To Finish: 52 3.014%
Partially And Abandoned: 47 2.725%
Never: 439 25.449%
Never Heard Of It: 885 51.304%

The Metamorphosis Of Prime Intellect (Roger Williams)
Whole Thing: 226 13.232%
Partially And Intend To Finish: 10 0.585%
Partially And Abandoned: 24 1.405%
Never: 322 18.852%
Never Heard Of It: 1126 65.925%

Accelerando (Charles Stross)
Whole Thing: 293 17.045%
Partially And Intend To Finish: 46 2.676%
Partially And Abandoned: 66 3.839%
Never: 425 24.724%
Never Heard Of It: 889 51.716%

A Fire Upon The Deep (Vernor Vinge)
Whole Thing: 343 19.769%
Partially And Intend To Finish: 31 1.787%
Partially And Abandoned: 41 2.363%
Never: 508 29.28%
Never Heard Of It: 812 46.801%

I also did a k-means cluster analysis of the data to try and determine demographics, and the ultimate conclusion I drew from it is that I need to do more analysis. I would do that, except that the initial analysis was a whole bunch of work, and jumping further down the rabbit hole in the hopes that I reach an oasis probably isn't in the best interests of myself or my readers.
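
For the curious, the clustering pass is roughly of this shape. The file name, the column selection, and the ordinal encoding below are assumptions on my part, not the code actually used for the analysis.

import pandas as pd
from sklearn.cluster import KMeans

df = pd.read_csv("2016_lw_survey.csv")    # hypothetical file name

# Encode readership answers as ordinal values; columns and mapping are illustrative.
readership_scale = {
    "Regular Reader": 5, "Sometimes": 4, "Rarely": 3,
    "Almost Never": 2, "Never": 1, "Never Heard Of It": 0,
}
blog_columns = ["LessWrong", "SlateStarCodex", "OvercomingBias"]
features = df[blog_columns].apply(lambda col: col.map(readership_scale)).fillna(0)

clusters = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(features)
print(pd.Series(clusters).value_counts())   # rough cluster sizes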

Footnotes


  1. This is a general trend I notice with accessibility. Not always, but very often measures taken to help a specific group end up having positive effects for others as well. Many of the accessibility suggestions of the W3C are things you wish every website did.

  2. I hadn't read this particular SSC post at the time I compiled the survey, but I was already familiar with the concept of a lizardman constant and should have accounted for it.

  3. I've been informed by a member of the freenode #lesswrong IRC channel that this is in fact Roko's opinion, because you can 'timelessly trade with the future superintelligence for rewards, not just punishment' according to a conversation they had with him last summer. Remember kids: Don't do drugs, including Max Tegmark.

  4. You might think that this conflicts with the hypothesis that the true rate of Basilisk belief is lower than 5%. It does a bit, but you also need to remember that these people are in the LessWrong demographic, which means regardless of what the Basilisk belief question means we should naively expect them to donate five percent of the MIRI donation pot.

  5. That is to say, it does seem plausible that MIRI 'profits' from Basilisk belief based on this data, but I'm fairly sure any profit is outweighed by the significant opportunity cost associated with it. I should also take this moment to remind the reader that the original Basilisk argument was supposed to prove that CEV is a flawed concept from the perspective of not having deleterious outcomes for people, so MIRI using it as a way to justify donating to them would be weird.

Room For More Funding In AI Safety Is Highly Uncertain

12 Evan_Gaensbauer 12 May 2016 01:57PM

(Crossposted to the Effective Altruism Forum)


Introduction

In effective altruism, people talk about the room for more funding (RFMF) of various organizations. RFMF is simply the maximum amount of money which can be donated to an organization, and be put to good use, right now. In most cases, “right now” typically refers to the next (fiscal) year.  Most of the time when I see the phrase invoked, it’s to talk about individual charities, for example, one of GiveWell’s top-recommended charities. If a charity has run out of room for more funding, it may be typical for effective donors to seek the next best option to donate to.
Last year, the Future of Life Institute (FLI) made the first of its grants from the pool of money it’s received as donations from Elon Musk and the Open Philanthropy Project (Open Phil). Since then, I've heard a few people speculating about how much RFMF the whole AI safety community has in general. I don't think that's a sensible question to ask before we have a sense of what the 'AI safety' field is. Before, people were commenting on only the RFMF of individual charities, and now they’re commenting on entire fields as though they’re well-defined. AI safety hasn’t necessarily reached peak RFMF just because MIRI has a runway for one more year to operate at their current capacity, or because FLI made a limited number of grants this year.

Overview of Current Funding For Some Projects


The starting point I used to think about this issue came from Topher Hallquist, from his post explaining his 2015 donations:

I’m feeling pretty cautious right now about donating to organizations focused on existential risk, especially after Elon Musk’s $10 million donation to the Future of Life Institute. Musk’s donation doesn’t necessarily mean there’s no room for more funding, but it certainly does mean that room for more funding is harder to find than it used to be. Furthermore, it’s difficult to evaluate the effectiveness of efforts in this space, so I think there’s a strong case for waiting to see what comes of this infusion of cash before committing more money.


My friend Andrew and I were discussing this last week. In past years, the Machine Intelligence Research Institute (MIRI) has raised about $1 million (USD) in funds, and received more than that  for their annual operations last year. Going into 2016, Nate Soares, Executive Director of MIRI, wrote the following:

Our successful summer fundraiser has helped determine how ambitious we’re making our plans; although we may still slow down or accelerate our growth based on our fundraising performance, our current plans assume a budget of roughly $1,825,000 per year [emphasis not added].


This seems sensible to me as it's not too much more than what they raised last year, and it seems more and not less money will be flowing into AI safety in the near future. However, Nate also had plans for how MIRI could've productively spent up to $6 million last year, to grow the organization. So, far from MIRI believing it had all the funding it could use, it was seeking more. Of course, others might argue MIRI or other AI safety organizations already receive enough funding relative to other priorities, but that is an argument for a different time.

Andrew and I also talked about how, had FLI had enough funding to grant money to all the promising applicants for its 2015 grants in AI safety research, that would have been millions more flowing into AI safety. It’s true what Topher wrote: that, being outside of FLI, and not otherwise being a major donor, it may be exceedingly difficult for individuals to evaluate funding gaps in AI safety. While FLI has only received $11 million to grant in 2015-16 ($6 million already granted in 2015, with $5 million more to be granted in the coming year), they could easily have granted more than twice that much, had they received the money.

To speak to other organizations, Niel Bowerman, Assistant Director at the Future of Humanity Institute (FHI), recently spoke about how FHI receives most of its funding exclusively for research, so bottlenecks like the operations he runs depend more on private donations, which FHI could use more of. Seán Ó hÉigeartaigh, Executive Director at the Centre for the Study of Existential Risk (CSER) at Cambridge University, recently stated in discussion that CSER and the Leverhulme Centre for the Future of Intelligence (CFI), which CSER is currently helping launch, face the same problem with their operations. Nick Bostrom, author of Superintelligence and Director of FHI, is in the course of launching the Strategic Artificial Intelligence Research Centre (SAIRC), which received $1.5 million (USD) in funding from FLI. SAIRC seems good for funding for at least the rest of 2016.

 


The Big Picture
Above are the funding summaries for several organizations listed in Andrew Critch’s 2015 map of the existential risk reduction ecosystem. There are organizations working on existential risks other than those from AI, but they aren’t explicitly organized in a network the same way AI safety organizations are. So, in practice, the ‘x-risk ecosystem’ is mappable almost exclusively in terms of AI safety.

It seems to me the 'AI safety field', if defined just as the organizations and projects listed in Dr. Critch’s ecosystem map, plus perhaps others closely related (e.g., AI Impacts), could have productively absorbed between $10 million and $25 million in 2016 alone. Of course, there are caveats rendering this a conservative estimate. First of all, the above is a contrived version of the AI safety "field", as plenty of research outside this network is popping up all the time. Second, I think the organizations and projects I listed above could themselves have thought of more uses for funding. Seeing as they're working on what is (presumably) the most important problem in the world, there is much that millions more could do for foundational research on the AGI containment/control problem, safety research into narrow systems aside.
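As a rough illustration of where a number in this range could come from, here is a back-of-the-envelope tally in Python using only the figures cited earlier in this post (MIRI's ~$1.825M planned budget and ~$6M growth plan, and FLI's $11M grant pool that could have been more than doubled). It deliberately leaves out the unquantified operations gaps at FHI, CSER, and CFI, so treat it as a hedged sketch of the lower part of the range, not an authoritative estimate.

```python
# Rough, illustrative tally of 2016 room for more funding in AI safety,
# using only the figures cited in this post (all amounts in millions of USD).
# These inputs are the post's own estimates, not authoritative budget data.

miri_planned_budget = 1.825            # MIRI's stated ~$1,825,000/year plan
miri_growth_plan = 6.0                 # what MIRI said it could have productively spent
fli_grant_pool = 11.0                  # FLI's 2015-16 pool: $6M granted + $5M to come
fli_could_have_granted = 2 * fli_grant_pool   # "more than twice that much" (conservatively, exactly twice)

miri_gap = miri_growth_plan - miri_planned_budget   # ~4.2
fli_gap = fli_could_have_granted - fli_grant_pool   # ~11.0

total_gap = miri_gap + fli_gap
print(f"Identified funding gap: ~${total_gap:.1f}M")  # ~$15.2M, inside the $10-25M range
```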


Too Much Variance in Estimates for RFMF in AI Safety

I've also heard people set the benchmark for truly appropriate funding for AI safety in the ballpark of a trillion dollars. While in theory that may be true, on its face it currently seems absurd. I'm not saying there won't be a time, even in the next several years, when $1 trillion/year could be used effectively. I'm saying that if there isn't a roadmap for how to increase the productive use of funding from ~$10 million/year for AI safety to $100 million or $1 billion, talking about $1 trillion/year isn't practical. I don't even think there will be more than $1 billion on the table per year in the near future.

This argument can be used to justify continued earning to give on the part of effective altruists. That is, there is so much money that, e.g., MIRI could use, it makes sense for everyone who isn't an AI researcher to earn to give. This might make sense if governments and universities give major funding to what they think is AI safety, direct 99% of it only to robotic unemployment or something, miss the boat on the control problem, and MIRI gets a pittance of the money that will flow into the field. Even so, the idea that there is effectively something like a multi-trillion-dollar ceiling for effective funding for AI safety is unsound.

When estimates of RFMF for AI safety range between $5-10 million (the amount of funding AI safety received in 2015) and $1 trillion, I feel like anyone not already well within the AI safety community cannot reasonably estimate how much money the field can productively use in one year.
On the other hand, there are also people who think that AI safety doesn’t need to be a big priority, or is currently as big a priority as it needs to be, so money spent funding AI safety research and strategy would be better spent elsewhere.

All this stated, I myself don’t have a precise estimate of how much capacity for funding the whole AI safety field will have in, say, 2017.

Reasonable Assumptions Going Forward

What I'm confident saying right now is:

  1. The amount of money AI safety could've productively used in 2016 alone is within an order of magnitude of $10 million, and probably less than $25 million, based on what I currently know.
  2. The amount of total funding available will likely increase year over year for the next several years. There could be quite dramatic rises. The Open Philanthropy Project, worth $10+ billion (USD), recently announced AI safety will be their top priority next year, although this may not necessarily translate into more major grants in the next 12 months. The White House recently announced they’ll be hosting workshops on the Future of Artificial Intelligence, including concerns over risk. Also, to quote Stuart Russell (HT Luke Muehlhauser): "Industry [has probably invested] more in the last 5 years than governments have invested since the beginning of the field [in the 1950s]." This includes companies like Facebook, Baidu, and Google each investing tons of money into AI research, including Google’s purchase of DeepMind for $500 million in 2014. With an increasing number of universities and corporations investing money and talent into AI research, including AI safety, and now with major philanthropic foundations and governments paying attention to AI safety as well, it seems plausible the amount of funding for AI safety worldwide might balloon up to $100+ million in 2017 or 2018. However, this could just as easily not happen, and there's much uncertainty in projecting this.
  3. The field of AI safety will also grow year over year for the next several years, though I doubt the projects needing funding will grow as fast as the amount of funding available. This is because the rate at which institutions are willing to invest in growth depends not only on how much money they're receiving now, but on how much they can expect to receive in the future. Since those expectations are so uncertain, organizations are smart to be conservative and hold their cards close to their chest. While OpenAI has pledged $1 billion for funding AI research in general, and not just safety, over the next couple of decades, nobody knows if such funding will be available to organizations out of Oxford or Berkeley like AI Impacts, MIRI, FHI, or CFI. However,

 

  • i) increased awareness and concern over AI safety will draw in more researchers.
  • ii) the promise or expectation of more money to come may draw in more researchers seeking funding.
  • iii) the expanding field and the increased funding available will create a feedback loop in which institutions in AI safety, such as MIRI, make contingency plans to expand faster if they are able to or need to.

Why This Matters

I don't mean to use the amount of funding AI safety received in 2015 or 2016 as an anchor that biases how much RFMF I think the field has. However, it seems the more extreme lower and upper estimates I’ve encountered are baseless, and either vastly underestimate or vastly overestimate how much the field of AI safety can productively grow each year. This is actually important to figure out.

80,000 Hours rates AI safety as perhaps the most important and neglected cause currently prioritized by the effective altruism movement. Consequently, 80,000 Hours recommends how similarly concerned people can work on the issue. Some talented computer scientists who could do the most good working in AI safety might opt to earn to give in software engineering or data science, if they conclude the bottleneck on AI safety isn’t talent but funding. Alternatively, a small but critical organization that requires funding from value-aligned and consistent donors might fall through the cracks if too many people conclude that AI safety work in general is receiving sufficient funding, and choose to forgo donating to it. Many of us could make individual decisions going either way, but it also seems many of us could end up making the wrong choice. Assessments of these issues will practically inform the decisions many of us make over the next few years, determining how much of our time and potential we use fruitfully, or waste.

Everything above just lays out how estimating room for more funding in AI safety overall may be harder than anticipated, and shows how high the variance might be. I invite you to contribute to this discussion, as it is only just starting. Please use the above info as a starting point to look into this more, or ask questions that will usefully clarify what we’re thinking about. The best fora for further discussion seem to be the Effective Altruism Forum, LessWrong, or the AI Safety Discussion group on Facebook, where I initiated the conversation leading to this post.

Link: Re-reading Kahneman's Thinking, Fast and Slow

11 toomanymetas 04 July 2016 06:32AM

"A bit over four years ago I wrote a glowing review of Daniel Kahneman’s Thinking, Fast and Slow. I described it as a “magnificent book” and “one of the best books I have read”. I praised the way Kahneman threaded his story around the System 1 / System 2 dichotomy, and the coherence provided  by prospect theory.

What a difference four years makes. I will still describe Thinking, Fast and Slow as an excellent book – possibly the best behavioural science book available. But during that time a combination of my learning path and additional research in the behavioural sciences has led me to see Thinking, Fast and Slow as a book with many flaws."

Continued here: https://jasoncollins.org/2016/06/29/re-reading-kahnemans-thinking-fast-and-slow/

Are smart contracts AI-complete?

11 Stuart_Armstrong 22 June 2016 02:08PM

Many people are probably aware of the hack of the DAO, which used a bug in its smart contract system to steal millions of dollars’ worth of the cryptocurrency Ethereum.

There are various arguments as to whether this theft was technically allowed or not, what should be done about it, and so on. Many people are arguing that the code is the contract, and that therefore no-one should be allowed to interfere with it - the DAO just made a coding mistake, and is now being (deservedly?) punished for it.

That got me wondering whether it's ever possible to make a smart contract without a full AI of some sort. For instance, if the contract is triggered by the delivery of physical goods - how can you define what the goods are, what constitutes delivery, what constitutes possession of them, and so on? You could have a human confirm delivery - but that's precisely the kind of judgement call you want to avoid. You could have an automated delivery confirmation system - but what happens if someone hacks or triggers that? You could connect it automatically with scanning headlines of media reports, but again, this is relying on aggregated human judgement, which could be hacked or influenced.

Digital goods seem more secure, as you can automate confirmation of delivery/services rendered, and so on. But, again, this leaves the confirmation process open to hacking. Which would be illegal, if you're going to profit from the hack. Hum...

This seems the most promising avenue for smart contracts that doesn't involve full AI: clear out the bugs in the code, then ground the confirmation procedure in such a way that it can only be hacked in a way that's already illegal. Sort of use the standard legal system as a backstop, fixing the basic assumptions, and then setting up the smart contracts on top of them (which is not the same as using the standard legal system within the contract).
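To make the worry concrete, here is a minimal toy sketch in Python (not actual Ethereum/Solidity code; the class names like EscrowContract and NaiveOracle are made up for illustration) of an escrow-style contract whose payout hinges on an external confirmation source. The contract logic can be bug-free, yet the outcome is only as trustworthy as the oracle it consults - which is exactly the piece you would want grounded in something the ordinary legal system already protects.

```python
# Toy model (not real smart-contract code) of an escrow whose payout is
# triggered by an external confirmation source ("oracle").  The contract
# logic itself can be bug-free, yet the outcome still depends on whether
# the oracle can be hacked or gamed.

class Party:
    def __init__(self, balance=0):
        self.balance = balance


class NaiveOracle:
    """Stands in for a courier API, a scanner, or a human check.
    Anyone who can flip this flag controls the contract's outcome."""
    def __init__(self):
        self.flag = False

    def report_delivery(self):
        self.flag = True

    def delivery_confirmed(self):
        return self.flag


class EscrowContract:
    def __init__(self, buyer, seller, amount, oracle):
        self.buyer = buyer
        self.seller = seller
        self.amount = amount
        self.oracle = oracle      # external delivery-confirmation service
        self.settled = False

    def settle(self):
        """Release funds based on the oracle's answer, exactly once."""
        if self.settled:
            return
        if self.oracle.delivery_confirmed():
            self.seller.balance += self.amount   # pay the seller
        else:
            self.buyer.balance += self.amount    # refund the buyer
        self.settled = True


buyer, seller = Party(), Party()
oracle = NaiveOracle()
contract = EscrowContract(buyer, seller, amount=100, oracle=oracle)

oracle.report_delivery()   # whoever can make this call gets the seller paid
contract.settle()
print(seller.balance)      # 100
```

In this sketch, anyone who can call report_delivery() controls the payout; the proposal above amounts to making that call illegal to fake, rather than impossible.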

Review and Thoughts on Current Version of CFAR Workshop

11 Gleb_Tsipursky 06 June 2016 01:44PM

Outline: I will discuss my background and how I prepared for the workshop, and how I would prepare differently in hindsight; then my experience at the CFAR workshop, and what I would have done differently there; then my take-aways from the workshop, and what I am doing to integrate CFAR strategies into my life; finally, I will give my assessment of its benefits and what other folks who attend the workshop might expect to get out of it.


 

Acknowledgments: Thanks to fellow CFAR alumni and CFAR staff for feedback on earlier versions of this post


 

Introduction

 

Many aspiring rationalists have heard about the Center for Applied Rationality, an organization devoted to teaching applied rationality skills to help people improve their thinking, feeling, and behavior patterns. This nonprofit does so primarily through its intensive workshops, and is funded by donations and revenue from those workshops. It fulfills its social mission through conducting rationality research and through giving discounted or free workshops to people its staff judge as likely to help make the world a better place, mainly those associated with various Effective Altruist cause areas, especially existential risk.

 

To be fully transparent: even before attending the workshop, I already had a strong belief that CFAR is a great organization and have been a monthly donor to CFAR for years. So keep that in mind as you read my description of my experience (you can become a donor here).


Preparation

 

First, some background about myself, so you know where I’m coming from in attending the workshop. I’m a professor specializing in the intersection of history, psychology, behavioral economics, sociology, and cognitive neuroscience. I discovered the rationality movement several years ago through a combination of my research and attending a LessWrong meetup in Columbus, OH, and so come from a background of both academic and LW-style rationality. Since discovering the movement, I have become an activist in the movement as the President of Intentional Insights, a nonprofit devoted to popularizing rationality and effective altruism (see here for our EA work). So I came to the workshop with some training and knowledge of rationality, including some CFAR techniques.

 

To help myself prepare for the workshop, I reviewed existing posts about CFAR materials, while being careful not to assume that the actual techniques would match their descriptions in those posts.

 

I also delayed a number of tasks until after the workshop, tying up loose ends. In retrospect, I wish I had not left myself ongoing tasks to do during the workshop. As part of my leadership of InIn, I coordinate about 50 volunteers, and I wish I had handed those responsibilities to someone else for the duration of the workshop.

 

Before the workshop, I worked intensely on finishing up some projects. In retrospect, it would have been better to get some rest and come to the workshop as fresh as possible.

 

There were some communication snafus with logistics details before the workshop. It all worked out in the end, but in retrospect I would tell myself to get the logistics hammered out in advance, so as not to experience anxiety before the workshop about how to get there.


Experience

 

The classes were well put together, had interesting examples, and provided useful techniques. FYI, my experience was that reading about these techniques in advance was not harmful, but the techniques taught in the CFAR classes were quite a bit better than the existing posts about them, so don’t assume you can get the same benefits from reading posts as from attending the workshop. So while I was aware of the techniques, the classes presented noticeably optimized versions of them - maybe because of the “broken telephone” effect in the posts, or maybe because CFAR has refined them through previous workshops; I'm not sure. I was glad to learn that CFAR considers the workshop they gave us in May satisfactory enough to scale up their workshops, while still improving the content over time.

 

Just as useful as the classes were the conversations held in between and after the official classes ended. Talking about them with fellow aspiring rationalists and seeing how they were thinking about applying these to their lives was helpful for sparking ideas about how to apply them to my life. The latter half of the CFAR workshop was especially great, as it focused on pairing off people and helping others figure out how to apply CFAR techniques to themselves and how to address various problems in their lives. It was especially helpful to have conversations with CFAR staff and trained volunteers, of whom there were plenty - probably about 20 volunteers/staff for the 50ish workshop attendees.

 

Another super-helpful aspect of the conversations was networking and community building. Now, this may have been more useful to some participants than others, so YMMV. As an activist in the movement, I talked to many folks at the CFAR workshop about promoting EA and rationality to a broad audience. I was happy to introduce some people to EA, with my most positive conversation there being to encourage someone to switch his x-risk efforts from nuclear disarmament to AI safety research as a means of addressing long/medium-term risk, and to promoting rationality as a means of addressing short/medium-term risk. Others who were already familiar with EA were interested in ways of promoting it broadly, while some aspiring rationalists expressed enthusiasm over becoming rationality communicators.

 

Looking back at my experience, I wish I had been more aware of the benefits of these conversations. I went to sleep early the first couple of nights; in retrospect, I would have taken supplements to stay awake and have more conversations instead.


Take-Aways and Integration

 

The aspects of the workshop that I think will help me most were what CFAR staff called “5-second” strategies - brief tactics and techniques that can be executed in 5 seconds or less to address various problems. The material I was already familiar with, such as Trigger Action Plans, Goal Factoring, Murphyjitsu, and Pre-Hindsight, requires some time to learn and practice, often with pen and paper as part of the work. However, with sufficient practice, one can develop brief techniques that mimic various aspects of the more thorough techniques, and apply them quickly to in-the-moment decision-making.

 

Now, this doesn’t mean that the longer techniques are not helpful. They are very important, but they are things I was already generally familiar with, and already practice. The 5-second versions were more of a revelation for me, and I anticipate will be more helpful for me as I did not know about them previously.

 

Now, CFAR does a very nice job of helping people integrate the techniques into daily life, as this is a common failure mode of CFAR attendees, with them going home and not practicing the techniques. So they have 6 Google Hangouts with CFAR staff and all attendees who want to participate, 4 one-on-one sessions with CFAR trained volunteers or staff, and they also pair you with one attendee for post-workshop conversations. I plan to take advantage of all these, although my pairing did not work out.

 

For integration of CFAR techniques into my life, I found the CFAR strategy of “Overlearning” especially helpful. Overlearning refers to applying a single technique intensely for a while to all aspects of one’s activities, so that it gets internalized thoroughly. Following CFAR's advice, I will first focus on overlearning Trigger Action Plans.

 

I also plan to teach CFAR techniques in my local rationality dojo, as teaching is a great way to learn, naturally.

 

Finally, I plan to integrate some CFAR techniques into Intentional Insights content, at least the more simple techniques that are a good fit for the broad audience with which InIn is communicating.


Benefits

 

I have a strong probabilistic belief that having attended the workshop will improve my capacity to be a person who achieves my goals for doing good in the world. I anticipate I will be able to figure out better whether the projects I am taking on are the best uses of my time and energy. I will be more capable of avoiding procrastination and other forms of akrasia. I believe I will be more capable of making better plans, and acting on them well. I will also be more in touch with my emotions and intuitions, and be able to trust them more, as I will have more alignment among different components of my mind.

 

Another benefit is meeting the many other people at CFAR who have similar mindsets. Here in Columbus, we have a flourishing rationality community, but it’s still relatively small. Getting to know 70ish people, attendees and staff/volunteers, passionate about rationality was a blast. It was especially great to see people who were involved in creating new rationality strategies, something that I am engaged in myself in addition to popularizing rationality - it’s really heartening to envision how the rationality movement is growing.

 

These benefits should resonate strongly with aspiring rationalists, but they are really important for EA participants as well. I think one of the best things that EA movement members can do is study rationality, and it’s something we promote to the EA movement as part of InIn’s work. What we offer is articles and videos, but coming to a CFAR workshop is a much more intense and cohesive way of getting these benefits. Imagine all the good you can do for the world if you are better at planning, organizing, and enacting EA-related tasks. Rationality is what has helped me and other InIn participants make the impact that we have been able to make, and a number of EA movement members who have rationality training report similar benefits. Remember, as an EA participant, you can likely get a scholarship covering part or all of the regular $3,900 price of the workshop, as I did myself, and you are likely to be able to save more lives over time as a result of attending, even if you have to pay some costs upfront.

 

Hope these thoughts prove helpful to you all, and please contact me at gleb@intentionalinsights.org if you want to chat with me about my experience.

 

Improving long-run civilisational robustness

11 RyanCarey 10 May 2016 11:15AM

People trying to guard civilisation against catastrophe usually focus on one specific kind of catastrophe at a time. This can be useful for building concrete knowledge with some certainty in order for others to build on it. However, there are disadvantages to this catastrophe-specific approach:

1. Catastrophe researchers (including Anders Sandberg and Nick Bostrom) think that there are substantial risks from catastrophes that have not yet been anticipated. Resilience-boosting measures may mitigate risks that have not yet been investigated.

2. Thinking about resilience measures in general may suggest new mitigation ideas that were missed by the catastrophe-specific approach.

One analogy for this is that an intrusion (or hack) of a software system can arise from a combination of many minor security failures, each of which might appear innocuous in isolation. You can decrease the chance of an intrusion by adding extra security measures, even without a specific idea of what kind of hacking would be performed. Things like being able to power down and reboot a system, storing a backup and being able to run it in a "safe" offline mode are all standard resilience measures for software systems. These measures aren't necessarily the first thing that would come to mind if you were trying to model a specific risk like a password getting stolen, or a hacker subverting administrative privileges, although they would be very useful in those cases. So mitigating risk doesn't necessarily require a precise idea of the risk to be mitigated. Sometimes it can be done instead by thinking about the principles required for proper operation of a system - in the case of software, preservation of its clean code - and the avenues through which it is vulnerable - such as the internet.

So what would be good robustness measures for human civilisation? I have a bunch of proposals:

 

Disaster forecasting

Disaster research

* Build research labs to survey and study catastrophic risks (like the Future of Humanity Institute, the Open Philanthropy Project and others)

Disaster prediction

* Prediction contests (like IARPA's Aggregative Contingent Estimation "ACE" program)

* Expert aggregation and elicitation

 

Disaster prevention

General prevention measures

* Build a culture of prudence in groups that run risky scientific experiments

* Lobby for these mitigation measures

* Improving the foresight and clear-thinking of policymakers and other relevant decision-makers

* Build research labs to plan more risk-mitigation measures (including the Centre for Study of Existential Risk)

Preventing intentional violence

* Improve focused surveillance of people who might commit large-scale terrorism (this is controversial because excessive surveillance itself poses some risk)

* Improve cooperation between nations and large institutions

Preventing catastrophic errors

* Legislating for individuals to be held more accountable for large-scale catastrophic errors that they may make (including by requiring insurance premiums for any risky activities)

 

Disaster response

* Improve political systems to respond to new risks

* Improved vaccine development, quarantine and other pandemic response measures

* Building systems for disaster notification


Disaster recovery

Shelters

* Build underground bomb shelters

* Provide a sheltered place for people to live with air and water

* Provide (or store) food and farming technologies (cf Dave Denkenberger's *Feeding Everyone No Matter What*)

* Store energy and energy-generators

* Store reproductive technologies (which could include IVF, artificial wombs or measures for increasing genetic diversity)

* Store information about building the above

* Store information about building a stable political system, and about mitigating future catastrophes

* Store other useful information about science and technology (e.g. reading and writing)

* Store some of the above in submarines

* (maybe) store biodiversity

 

Space Travel

* Grow (or replicate) the international space station

* Improve humanity's capacity to travel to the Moon and Mars

* Build sustainable settlements on the Moon and Mars

 

Of course, some caveats are in order. 

To begin with, one could argue that surveilling terrorists is a measure specifically designed to reduce the risk from terrorism. But there are a number of different scenarios and methods through which a malicious actor could try to inflict major damage on civilisation, and so I still regard this as a general robustness measure, granted that there is some subjectivity to all of this. If you know absolutely nothing about the risks that you might face, and the structures in society that are to be preserved, then the exercise is futile. So some of the measures on this list will mitigate a smaller subset of risks than others, and that's just how it is, though I think the list is pretty different from the one people think of by using a risk-specific paradigm, which is the reason for the exercise.

Additionally, I'll disclaim that some of these measures are already well invested in, while others cannot be done cheaply or effectively. But many seem to me to be worth thinking more about.

Additional suggestions for this list are welcome in the comments, as are proposals for their implementation.

 

Related readings

https://www.academia.edu/7266845/Existential_Risks_Exploring_a_Robust_Risk_Reduction_Strategy

http://www.nickbostrom.com/existential/risks.pdf

http://users.physics.harvard.edu/~wilson/pmpmta/Mahoney_extinction.pdf

http://gcrinstitute.org/aftermath

http://sethbaum.com/ac/2015_Food.html

http://the-knowledge.org

http://lesswrong.com/lw/ma8/roadmap_plan_of_action_to_prevent_human/

Collaborative Truth-Seeking

11 Gleb_Tsipursky 04 May 2016 11:28PM

Summary: We frequently use debates to resolve different opinions about the truth. However, debates are not always the best course for figuring out the truth. In some situations, the technique of collaborative truth-seeking may be more optimal.

 

Acknowledgments: Thanks to Pete Michaud, Michael Dickens, Denis Drescher, Claire Zabel, Boris Yakubchik, Szun S. Tay, Alfredo Parra, Michael Estes, Aaron Thoma, Alex Weissenfels, Peter Livingstone, Jacob Bryan, Roy Wallace, and other readers who prefer to remain anonymous for providing feedback on this post. The author takes full responsibility for all opinions expressed here and any mistakes or oversights.

 

The Problem with Debates

 

Aspiring rationalists generally aim to figure out the truth, and often disagree about it. The usual method of hashing out such disagreements in order to discover the truth is through debates, in person or online.

 

Yet more often than not, people on opposing sides of a debate end up seeking to persuade rather than prioritizing truth discovery. Indeed, research suggests that debates have a specific evolutionary function – not for discovering the truth but to ensure that our perspective prevails within a tribal social context. No wonder debates are often compared to wars.

 

We may hope that as aspiring rationalists, we would strive to discover the truth during debates. Yet given that we are not always fully rational and strategic in our social engagements, it is easy to slip up within debate mode and orient toward winning instead of uncovering the truth. Heck, I know that I sometimes forget in the midst of a heated debate that I may be the one who is wrong – I’d be surprised if this didn’t happen with you. So while we should certainly continue to engage in debates, we should also use additional strategies – less natural and intuitive ones. These strategies could put us in a better mindset for updating our beliefs and improving our perspective on the truth. One such solution is a mode of engagement called collaborative truth-seeking.


Collaborative Truth-Seeking

 

Collaborative truth-seeking is one way of describing a more intentional approach in which two or more people with different opinions engage in a process that focuses on finding out the truth. Collaborative truth-seeking is a modality that should be used among people with shared goals and a shared sense of trust.

 

Some important features of collaborative truth-seeking, which are often not present in debates, are: focusing on a desire to change one’s own mind toward the truth; a curious attitude; being sensitive to others’ emotions; striving to avoid arousing emotions that will hinder updating beliefs and truth discovery; and a trust that all other participants are doing the same. These can contribute to increased social sensitivity, which, together with other attributes, correlates with higher group performance on a variety of activities.

 

The process of collaborative truth-seeking starts with establishing trust, which will help increase social sensitivity, lower barriers to updating beliefs, increase willingness to be vulnerable, and calm emotional arousal. The following techniques are helpful for establishing trust in collaborative truth-seeking:

  • Share weaknesses and uncertainties in your own position

  • Share your biases about your position

  • Share your social context and background as relevant to the discussion

    • For instance, I grew up poor after my family immigrated to the US when I was 10, and this naturally influences me to care about poverty more than some other issues, and to have some biases around it - this is one reason I prioritize poverty in my Effective Altruism engagement

  • Vocalize curiosity and the desire to learn

  • Ask the other person to call you out if they think you're getting emotional or engaging in emotive debate instead of collaborative truth-seeking, and consider using a safe word



Here are additional techniques that can help you stay in collaborative truth-seeking mode after establishing trust:

  • Self-signal: signal to yourself that you want to engage in collaborative truth-seeking, instead of debating

  • Empathize: try to empathize with the other perspective that you do not hold by considering where their viewpoint came from, why they think what they do, and recognizing that they feel that their viewpoint is correct

  • Keep calm: be prepared with emotional management to calm your emotions and those of the people you engage with when a desire for debate arises

    • watch out for defensiveness and aggressiveness in particular

  • Go slow: take the time to listen fully and think fully

  • Consider pausing: have an escape route for complex thoughts and emotions if you can’t deal with them in the moment by pausing and picking up the discussion later

    • say “I will take some time to think about this,” and/or write things down

  • Echo: paraphrase the other person’s position to indicate and check whether you’ve fully understood their thoughts

  • Be open: orient toward improving the other person’s points to argue against their strongest form

  • Stay the course: be passionate about wanting to update your beliefs, maintain the most truthful perspective, and adopt the best evidence and arguments, no matter if they are yours or those of others

  • Be diplomatic: when you think the other person is wrong, strive to avoid saying "you're wrong because of X" but instead to use questions, such as "what do you think X implies about your argument?"

  • Be specific and concrete: go down levels of abstraction

  • Be clear: make sure the semantics are clear to all by defining terms

  • Be probabilistic: use probabilistic thinking and probabilistic language, to help get at the extent of disagreement and be as specific and concrete as possible

    • For instance, avoid saying that X is absolutely true, but say that you think there's an 80% chance it's the true position

    • Consider adding what evidence and reasoning led you to believe so, for both you and the other participants to examine this chain of thought

  • When people whose perspective you respect fail to update their beliefs in response to your clear chain of reasoning and evidence, update somewhat toward their position, since that is evidence that your position is not very convincing

  • Confirm your sources: look up information when it's possible to do so (Google is your friend)

  • Charity mode: strive to be more charitable to others and their expertise than seems intuitive to you

  • Use the reversal test to check for status quo bias

    • If you are discussing whether to change some specific numeric parameter - say increase by 50% the money donated to charity X - state the reverse of your positions, for example decreasing the amount of money donated to charity X by 50%, and see how that impacts your perspective

  • Use CFAR’s double crux technique

    • In this technique, two parties who hold different positions on an argument each write down the fundamental reason for their position (the crux of their position). This reason has to be the key one, so that if it were proven incorrect, each would change their perspective. Then, look for experiments that can test the crux. Repeat as needed. If a person identifies more than one reason as crucial, you can go through each as needed. More details are here.


Of course, not all of these techniques are necessary for high-quality collaborative truth-seeking. Some are easier than others, and different techniques apply better to different kinds of truth-seeking discussions. You can apply some of these techniques during debates as well, such as double crux and the reversal test. Try some out and see how they work for you.


Conclusion

 

Engaging in collaborative truth-seeking goes against our natural impulses to win in a debate, and is thus more cognitively costly. It also tends to take more time and effort than just debating. It is also easy to slip into debate mode even when using collaborative truth-seeking, because of the intuitive nature of debate mode.

 

Moreover, collaborative truth-seeking need not replace debates at all times. This non-intuitive mode of engagement can be chosen when discussing issues that relate to deeply-held beliefs and/or ones that risk emotional triggering for the people involved. Because of my own background, I would prefer to discuss poverty in collaborative truth-seeking mode rather than debate mode, for example. On such issues, collaborative truth-seeking can provide a shortcut to resolution, in comparison to protracted, tiring, and emotionally challenging debates. Likewise, using collaborative truth-seeking to resolve differing opinions on all issues holds the danger of creating a community oriented excessively toward sensitivity to the perspectives of others, which might result in important issues not being discussed candidly. After all, research shows the importance of having disagreement in order to make wise decisions and to figure out the truth. Of course, collaborative truth-seeking is well suited to expressing disagreements in a sensitive way, so if used appropriately, it might permit even people with triggers around certain topics to express their opinions.

 

Taking these caveats into consideration, collaborative truth-seeking is a great tool to use to discover the truth and to update our beliefs, as it can get past the high emotional barriers to altering our perspectives that have been put up by evolution. Rationality venues are natural places to try out collaborative truth-seeking.

 

 

 

[Link] White House announces a series of workshops on AI, expresses interest in safety

11 AspiringRationalist 04 May 2016 02:50AM

Thoughts on hacking aromanticism?

10 hg00 02 June 2016 11:52AM

Several years ago, Alicorn wrote an article about how she hacked herself to be polyamorous.  I'm interested in methods for hacking myself to be aromantic.  I've had some success with this, so I'll share what's worked for me, but I'm really hoping you all will chime in with your ideas in the comments.

Motivation

Why would someone want to be aromantic?  There's the obvious time commitment involved in romance, which can be considerable.  This is an especially large drain if you're in a situation where finding suitable partners is difficult, which means most of this time is spent enduring disappointment (e.g. if you're heterosexual and the balance of singles in your community is unfavorable).

But I think an even better way to motivate aromanticism is by referring you to this Paul Graham essay, The Top Idea in Your Mind.  To be effective at accomplishing your goals, you'd like to have your goals be the most interesting thing you have to think about.  I find it's far too easy for my love life to become the most interesting thing I have to think about, for obvious reasons.

Subproblems

After thinking some, I came up with a list of 4 goals people try to achieve through engaging in romance:

  1. Companionship.
  2. Sexual pleasure.
  3. Infatuation (also known as new relationship energy).
  4. Validation.  This one is trickier than the previous three, but I think it's arguably the most important.  Many unhappy singles have friends they are close to, and know how to masturbate, but they still feel lousy in a way people in post-infatuation relationships do not.  What's going on?  I think it's best described as a sort of romantic insecurity.  To test this out, imagine a time when someone you were interested in was smiling at you, and contrast that with the feeling of someone you were interested in turning you down.  You don't have to experience companionship or sexual pleasure from these interactions for them to have a major impact on your "romantic self-esteem".  And in a culture where singlehood is considered a failure, it's natural for your "romantic self-esteem" to take a hit if you're single.

To remove the need for romance, it makes sense to find quicker and less distracting ways to achieve each of these 4 goals.  So I'll treat each goal as a subproblem and brainstorm ideas for solving it.  Subproblems 1 through 3 all seem pretty easy to solve:

  1. Companionship: Make deep friendships with people you're not interested in romantically.  I recommend paying special attention to your coworkers and housemates, since you spend so much time with them.
  2. Sexual pleasure: Hopefully you already have some ideas on pleasuring yourself.
  3. Infatuation: I see this as more of a bonus than a need to be met.  There are lots of ways to find inspiration, excitement, and meaning in life outside of romance.

Subproblem 4 seems trickiest.

Hacking Romantic Self-Esteem

I'll note that what I'm describing as "validation" or "romantic self-esteem" seems closely related to abundance mindset.  But I think it's useful to keep them conceptually distinct.  Although alieving that there are many people you could date is one way to boost your romantic self-esteem, it's not necessarily the only strategy.

The most important thing to keep in mind about your romantic self-esteem is that it's heavily affected by the availability heuristic.  If I was encouraged by someone in 2015, that won't do much to assuage the sting of discouragement in 2016, except maybe if it happens to come to mind.

Another clue is the idea of a sexual "dry spell".  Dry spells are supposed to get worse the longer they go on... which simply means that if your mind doesn't have a recent (available!) incident of success to latch on, you're more likely to feel down.

So to increase your romantic self-esteem, keep a cherished list of thoughts suggesting your desirability is high, and don't worry too much about thoughts suggesting your desirability is low.  Here's a freebie: If you're reading this post, it's likely that you are (or will be) quite rich by global standards.  I hear rich people are considered attractive.  Put it on your list!

Other ideas for raising your romantic self-esteem:

  • Take steps to maintain your physical appearance, so you will appear marginally more desirable to yourself when you see yourself in the mirror.
  • Remind yourself that you're not a victim if you're making a conscious choice to prioritize other aspects of your life.  Point out to yourself things you could be doing to find partners that you're choosing not to do.

I think this is a situation where prevention works better than cure--it's best to work pre-emptively to keep your romantic self-esteem high.  In my experience, low romantic self-esteem leads to unproductive coping mechanisms like distracting myself from dark thoughts by wasting time on the Internet.

The other side of the coin is avoiding hits to your romantic self-esteem.  Here's an interesting snippet from a Quora answer I found:

In general specialized contemplative monastic organisations that tend to separate from the society tend to be celibate while ritual specialists within the society (priests) even if expected to follow a higher standard of ethical and ritual purity tend not to be.

So, it seems like it's easier for heterosexual male monks to stay celibate if they are isolated on a monastery away from women.  Without any possible partners around, there's no one to reject (or distract) them.  Participating in a monastic culture in which long-term singlehood is considered normal & desirable also removes a romantic self-esteem hit.

Retreating to a monastery probably isn't practical, but there may be simpler things you can do.  I recently switched from lifting weights to running in order to get exercise, and I found that running is better for my concentration because I'm not distracted by attractive people at the gym.

It's not supposed to be easy

I shared a bunch of ideas in this post.  But my overall impression is that instilling aromanticism is a very hard problem.  Based on my research, even monks and priests have a difficult time of things.  That's why I'm curious to hear what the Less Wrong community can come up with.  Side note: when possible, please try to make your suggestions gender-neutral so we can avoid gender-related flame wars.  Thanks!

Two forms of procrastination

9 Viliam 16 July 2016 08:30PM

I noticed something about myself when comparing two forms of procrastination:

a) reading online discussions,
b) watching movies online.

Reading online discussions (LessWrong, SSC, Reddit, Facebook), and sometimes writing a comment there, is a huge sink of time for me. On the other hand, watching movies online is almost harmless, at least compared with the former option. The difference is obvious when I compare my productivity at the end of a day when I did only the former, or only the latter. The interesting thing is that in the moment it feels the other way round.

When I start watching a movie that is 1:30:00 long, or start watching a series where each part is 40:00 long but I know I will probably watch more than one part a day, I am aware from the beginning that I am going to lose more than one hour of time; possibly several hours. On the other hand, when I open the "Discussion" tab on LessWrong, the latest "Open Thread" on SSC, my few favorite subreddits, and/or my Facebook "Home" page, it feels like it will only take a few minutes -- I will click on the few interesting links, quickly skim through the text, and maybe write a comment or two -- it certainly feels like much less than an hour.

But the fact is, when I start reading the discussions, I will probably click on at least a hundred links. Most of the pages I will read just as quickly as I imagined, but there will be a few that take disproportionately more time; either because they are interesting and long, or because they contain further interesting links. And writing a comment sometimes takes more time than it seems; it can easily be half an hour for a three-paragraph comment. (Ironically, this specific article gets written rather quickly, because I know what I want to write, and I write it directly. But there are comments where I think a lot, and keep correcting my text, to avoid misunderstanding when debating a sensitive topic, etc.) And when I stop doing it, because I want to do something productive for a change, I will feel tired. Reading many different things, trying to read quickly, and formulating my answers all make me mentally exhausted. So after I close the browser, I just wish I could take a nap.

On the other hand, watching a movie does not make me tired in that specific way. The movie runs at its own speed and doesn't require me to do anything actively. Also, there is no sense of urgency; none of the "if I reply to this now, people will notice and respond, but if I do it a week later, no one will care anymore". So I feel perfectly comfortable pausing the movie at any moment, doing something productive for a while, then unpausing the movie and watching more. I know I won't miss anything.

I think it's the mental activity during my procrastination that both makes me tired and creates the illusion that it will take less time than it actually does. When the movie says 1:30:00, I know it will be 1:30:00 (or maybe a little less because of the final credits). With a web page, I can always tell myself "don't worry, I will read this one really fast", so there is the illusion that I have it under control, and can reduce the time waste. The fact that I am reading an individual page really fast makes me underestimate how much time it took to read all those pages.

On the other hand, sometimes I do inverse procrastination -- I start watching a movie, pause it a dozen times and do some useful work during the breaks -- and at the end of the day I spent maybe 90% of the time working productively, while my brain tells me I just spent the whole day watching a movie, so I almost feel like I had a free day.

Okay, so how could I use this knowledge to improve my productivity?

1) Knowing the difference between the two forms of procrastination, whenever I feel a desire to escape to the online world, I should start watching a movie instead of reading some debate, because thus I will waste less time, even if it feels the other way round.

2) Integrate it with pomodoro? 10 minutes movie, 50 minutes work, then again, and at the end of the day my lying brain will tell me "dude, you didn't work at all today, you were just watching movies, of course you should feel awesome!".

Do you have a similar experience? No idea how typical is this. No need to hurry with responding, I am going to watch a movie now. ;-)

Availability Heuristic Considered Ambiguous

9 Gram_Stone 10 June 2016 10:40PM

(Content note: The experimental results on the availability bias, one of the biases described in Tversky and Kahneman's original work, have been overdetermined, which has led to at least two separate interpretations of the heuristic in the cognitive science literature. These interpretations also result in different experimental predictions. The audience probably wants to know about this. This post is also intended to measure audience interest in a tradition of cognitive scientific research that I've been considering describing here for a while. Finally, I steal from Scott Alexander the section numbering technique that he stole from someone else: I expect it to be helpful because there are several inferential steps to take in this particular article, and it makes it look less monolithic.)

Related to: Availability

I.

The availability heuristic is judging the frequency or probability of an event, by the ease with which examples of the event come to mind.

This statement is actually slightly ambiguous. I notice at least two possible interpretations with regards to what the cognitive scientists infer is happening inside of the human mind:

  1. Humans think things like, “I found a lot of examples, thus the frequency or probability of the event is high,” or, “I didn’t find many examples, thus the frequency or probability of the event is low.”
  2. Humans think things like, “Looking for examples felt easy, thus the frequency or probability of the event is high,” or, “Looking for examples felt hard, thus the frequency or probability of the event is low.”

I think the second interpretation is the one more similar to Kahneman and Tversky’s original description, as quoted above.

And it doesn’t seem that I would be building up a strawman by claiming that some adhere to the first interpretation, intentionally or not. From Medin and Ross (1996, p. 522):

The availability heuristic refers to a tendency to form a judgment on the basis of what is readily brought to mind. For example, a person who is asked whether there are more English words that begin with the letter ‘t’ or the letter ‘k’ might try to think of words that begin with each of these letters. Since a person can probably think of more words beginning with ‘t’, he or she would (correctly) conclude that ‘t’ is more frequent than ‘k’ as the first letter of English words.

And even that sounds at least slightly ambiguous to me, although it falls on the other side of the continuum between pure mental-content-ism and pure phenomenal-experience-ism that includes the original description.

II.

You can’t really tease out this ambiguity with the older studies on availability, because these two interpretations generate the same prediction. There is a strong correlation between the number of examples recalled and the ease with which those examples come to mind.

For example, consider a piece of the setup in Experiment 3 from the original paper on the availability heuristic. The subjects in this experiment were asked to estimate the frequency of two types of words in the English language: words with ‘k’ as their first letter, and words with ‘k’ as their third letter. There are twice as many words with ‘k’ as their third letter, but there was bias towards estimating that there are more words with ‘k’ as their first letter.

How, in experiments like these, are you supposed to figure out whether the subjects are relying on mental content or phenomenal experience? Both mechanisms predict the outcome, "Humans will be biased towards estimating that there are more words with 'k' as their first letter." And a lot of the later studies just replicate this result in other domains, and thus suffer from the same ambiguity.

III.

If you wanted to design a better experiment, where would you begin?

Well, if we think of feelings as sources of information in the way that we regard thoughts as sources of information, then we should find that we have some (perhaps low, perhaps high) confidence in the informational value of those feelings, as we have some level of confidence in the informational value of our thoughts.

This is useful because it suggests a method for detecting the use of feelings as sources of information: if we are led to believe that a source of information has low value, then its relevance will be discounted; and if we are led to believe that it has high value, then its relevance will be augmented. Detecting this phenomenon in the first place is probably a good place to start before trying to determine whether the classic availability studies demonstrate a reliance on phenomenal experience, mental content, or both. 

Fortunately, Wänke et al. (1995) conducted a modified replication of the experiment described above with exactly the properties that we’re looking for! Let’s start with the control condition.

In the control condition, subjects were given a blank sheet of paper and asked to write down 10 words that have ‘t’ as the third letter, and then to write down 10 words that begin with the letter ‘t’. After this listing task, they rated the extent to which words beginning with a ‘t’ are more or less frequent than words that have ‘t’ as the third letter. As in the original availability experiments, subjects estimated that words that begin with a ‘t’ are much more frequent than words with a ‘t’ in the third position.

Like before, this isn’t enough to answer the questions that we want to answer, but it can’t hurt to replicate the original result. It doesn’t really get interesting until you do things that affect the perceived value of the subjects’ feelings.

Wänke et al. got creative and, instead of blank paper, they gave subjects in two experimental conditions sheets of paper imprinted with pale blue rows of ‘t’s, and told them to write 10 words beginning with a ‘t’. One condition was told that the paper would make it easier for them to recall words beginning with a ‘t’, and the other was told that the paper would make it harder for them to recall words beginning with a ‘t’.

Subjects made to think that the magic paper made it easier to think of examples gave lower estimates of the frequency of words beginning with a ‘t’ in the English language. It felt easy to think of examples, but the experimenter made them expect that by means of the magic paper, so they discounted the value of the feeling of ease. Their estimates of the frequency of words beginning with 't' went down relative to the control condition.

Subjects made to think that the magic paper made it harder to think of examples gave higher estimates of the frequency of words beginning with a ‘t’ in the English language. It felt easy to recall examples, but the experimenter made them think it would feel hard, so they augmented the value of the feeling of ease. Their estimates of the frequency of words beginning with 't' went up relative to the control condition.

(Also, here's a second explanation by Nate Soares if you want one.)

So, at least in this sort of experiment, it looks like the subjects weren’t counting the number of examples they came up with; it looks like they really were using their phenomenal experiences of ease and difficulty to estimate the frequency of certain classes of words. This is some evidence for the validity of the second interpretation mentioned at the beginning.

IV.

So we know that there is at least one circumstance in which the second interpretation seems valid. This was a step towards figuring out whether the availability heuristic first described by Kahneman and Tversky is an inference from amount of mental content, or an inference from the phenomenal experience of ease of recall, or something else, or some combination thereof.

As I said before, the two interpretations have identical predictions in the earlier studies. The solution to this is to design an experiment where inferences from mental content and inferences from phenomenal experience cause different judgments.

Schwarz et al. (1991, Experiment 1) asked subjects to list either 6 or 12 situations in which they behaved either assertively or unassertively. Pretests had shown that recalling 6 examples was experienced as easy, whereas recalling 12 examples was experienced as difficult. After listing examples, subjects had to evaluate their own assertiveness.

As one would expect, subjects rated themselves as more assertive when recalling 6 examples of assertive behavior than when recalling 6 examples of unassertive behavior.

But the difference in assertiveness ratings didn’t increase with the number of examples. Subjects who had to recall examples of assertive behavior rated themselves as less assertive after reporting 12 examples rather than 6 examples, and subjects who had to recall examples of unassertive behavior rated themselves as more assertive after reporting 12 examples rather than 6 examples.

If they were relying on the number of examples, then we should expect their ratings for the recalled quality to increase with the number of examples. Instead, they decreased.
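One way to see the contrast is as two toy scoring rules that usually agree but come apart in this design. The sketch below is only illustrative, with made-up numbers (the function names and the ease values are mine, not from the papers); what matters is the direction each rule predicts.

```python
# Toy contrast of the two interpretations' predictions for the Schwarz et al.
# (1991) assertiveness task.  Numbers are made up; only the directions matter.

def predict_from_content(n_examples_recalled):
    # Interpretation 1: more recalled examples -> higher assertiveness rating.
    return n_examples_recalled

def predict_from_experience(felt_ease):
    # Interpretation 2: the easier recall *felt*, the higher the rating
    # (ease on a 0-1 scale, where 1.0 is very easy).
    return felt_ease

# Per the pretests: recalling 6 examples felt easy, recalling 12 felt hard.
ease_of_6, ease_of_12 = 0.9, 0.2

content_says_12_higher = predict_from_content(12) > predict_from_content(6)
experience_says_12_higher = predict_from_experience(ease_of_12) > predict_from_experience(ease_of_6)

print(content_says_12_higher)     # True  -- counting predicts higher ratings after 12 examples
print(experience_says_12_higher)  # False -- felt ease predicts lower ratings, as observed
```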

It could be that it got harder to come up with good examples near the end of the task, and that later examples were lower quality than earlier examples, and the increased availability of the later examples biased the ratings in the way that we see. Schwarz acknowledged this, checked the written reports manually, and claimed that no such quality difference was evident.

V.

It would still be nice if we could do better than taking Schwarz’s word on that though. One thing you could try is seeing what happens when you combine the methods we used in the last two experiments: vary the number of examples generated and manipulate the perceived relevance of the experiences of ease and difficulty at the same time. (Last experiment, I promise.)

Schwarz et al. (1991, Experiment 3) manipulated the perceived value of the experienced ease or difficulty of recall by having subjects listen to ‘new-age music’ played at half-speed while they worked on the recall task. Some subjects were told that this music would make it easier to recall situations in which they behaved assertively and felt at ease, whereas others were told that it would make it easier to recall situations in which they behaved unassertively and felt insecure. These manipulations make subjects perceive recall experiences as uninformative whenever the experience matches the alleged impact of the music; after all, it may simply be easy or difficult because of the music. On the other hand, experiences that are opposite to the alleged impact of the music are considered very informative.

When the alleged effects of the music were the opposite of the phenomenal experience of generating examples, the previous experimental results were replicated.

When the alleged effects of the music match the phenomenal experience of generating examples, then the experience is called into question, since you can’t tell if it’s caused by the recall task or the music.

When this is done, the pattern that we expect from the first interpretation of the availability heuristic holds. Thinking of 12 examples of assertive behavior makes subjects rate themselves as more assertive than thinking of 6 examples of assertive behavior; mutatis mutandis for unassertive examples. When people can’t rely on their experience, they fall back to using mental content, and instead of relying on how hard or easy things feel, they count.

Under different circumstances, both interpretations are useful, but of course, it’s important to recognize that a distinction exists in the first place.


Medin, D. L., & Ross, B. H. (1996). Cognitive psychology (2nd ed.). Fort Worth: Harcourt Brace.

Schwarz, N., Bless, H., Strack, F., Klumpp, G., Rittenauer-Schatka, H., & Simons, A. (1991). Ease of retrieval as information: Another look at the availability heuristic. Journal of Personality and Social Psychology, 61, 195–202.

Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207–232.

Wänke, M., Schwarz, N. & Bless, H. (1995). The availability heuristic revisited: Experienced ease of retrieval in mundane frequency estimates. Acta Psychologica, 89, 83-90.

Writing Collaboratively

8 richard_reitz 18 June 2016 07:47PM

This is a summary of the customs for collaborative writing the team on the fanfiction In Fire Forged came to, after a fair amount of time and effort figuring things out. The purpose of this piece is to share our results, thereby saving anyone who wants to write collaboratively the cost of experimentation. Obviously, different writing projects will accomplish different things with different people, and will therefore be best served by different practices. Take this as a first approximation, to be revised by experience.

Google Docs

We tried a bunch of platforms for collaboration, and found Google Docs to best fit our needs.

  1. Create a Google Doc. Multi-installment affairs may warrant creating a folder with one doc per installment.
  2. Enable editing. Collaborators are not very helpful if they can't provide feedback.

    Google Docs allows authors to restrict the changes other people can make to "suggestions" and "comments" by switching to "suggesting" mode.



    In general, the author restricts collaborator permissions to comments and suggestions. How to control these permissions should be described in the "enable editing" link above.
  3. Distribute link to collaborators.

Once the collaborators have the link, they read through it, making the comments and suggestions they think of. Google Docs does a good job facilitating discussion of this feedback; utilize this!

Micro and Macro

We found it useful to distinguish between what we were saying and how we were saying it. We termed the former "macro" and the latter "micro". This allows authors to say things like "I'm mostly looking for micro suggestions, although I'd be interested in any glaring macro errors (anything untrue or major omissions)." This succinctly communicates that collaborators should mostly restrict themselves to suggesting changes to how the author is communicating, which usually consists of small edits concerning things like technical issues (typos, omitted words, grammar) and smoother communication (word choice, resolving ambiguities, sectioning).

This contrasts with macro suggestions, which would include (in nonfiction) things like making sure factual claims were true, being sure to include all relevant information, and bringing in the perspective of a different field. (In fiction, macro suggestions would include things such as plot, characterization, chapter structure, and consistency of the universe.)

In general, you want to address macro issues before micro issues, since micro improvements are lost to changes on the macro level.

Team Makeup

On the macro level, you want as many people as can bring novel, relevant viewpoints to the writing. Essentially, you're looking to exploit Linus's Law by having at least one collaborator who will naturally see every improvement that could be made.

I favor erring on the side of larger teams for a few reasons. The coordination cost of adding a member isn't very high. Improving things on the micro level really benefits from having lots of eyeballs scrutinize it for improvements: it's entirely plausible that the tenth reader of some passage notices a way to reword it that the first nine missed.

My favorite reason for having more collaborators, however, is that it opens up the possibility of partial editing. One collaborator flags something they notice could be improved, even if they can't think of how. Then, another collaborator, who may not have noticed that something sounded awkward, may figure out how to rewrite it better. (It may sound implausible that someone who can figure out the improvement wouldn't notice something improvable in the first place, but it happened reasonably often.)

Spreading the micro over a lot of people also helps avoid illusions of transparency. If you only have one or two people revising, it's easy for them to spend so much time with the text that they miss statements that don't mean what they think they mean, or that are ambiguous, because they're so familiar with what they intended to say. Spreading out the editing keeps everyone from becoming overfamiliar with the work. It also allows for holding editors in reserve, who give the work one last pass and read it as naively as the target audience will.

Collaborator Benefits

Helping someone else write their piece is the single most effective technique I've used to powerlevel my writing. SICP:

The ability to visualize the consequences of the actions under consideration is crucial to becoming an expert programmer, just as it is in any synthetic, creative activity. In becoming an expert photographer, for example, one must learn how to look at a scene and know how dark each region will appear on a print for each possible choice of exposure and development conditions. Only then can one reason backward, planning framing, lighting, exposure, and development to obtain the desired effects. So it is with programming...

...and so it is with writing. There's an awkward period when you're first starting to write, where you've read enough that you have some idea of what better and worse writing looks like, but you haven't written enough to visualize the consequences of your writing. The author of In Fire Forged got there by writing and scrapping 140k words. I got there with a fraction of the effort by helping out on a team that allowed me to see the consequences of various actions without needing to write entire pieces. I also got to see and analyze and discuss the feedback from the other collaborators, which taught me things about better writing I didn't already know. Plus, gaining this experience had positive externalities, since the suggestions I made wound up in a final product, instead of going into the trash.

Collaborating also helps you learn about the topic of the piece more effectively than just reading it, via levels of processing. Merely reading about something is fairly shallow, leading to nondurable memory, whereas collaborating on something forces deeper processing, and thus more durable understanding. You can force yourself to process something on a deeper level as you read it to get the same effect, but collaborating, again, produces positive externalities.

(You should be processing deeply anyway. One collaborator on this piece, for instance, puts comments in the margins of pieces she reads. That said, collaborating has positive externalities.)

It's also fun and social; writing collaboratively has caused me to meet some of my favorite people and strengthened many personal relationships. As such, I suggest that, should you come across some piece that you take a liking to, but see how you could improve it, you offer to collaborate with them. Worst case, they're flattered and turn you down politely.

[Link] Suffering-focused AI safety: Why “fail-safe” measures might be particularly promising

7 wallowinmaya 21 July 2016 08:22PM

The Foundational Research Institute just published a new paper: "Suffering-focused AI safety: Why “fail-safe” measures might be our top intervention". 

It is important to consider that [AI outcomes] can go wrong to very different degrees. For value systems that place primary importance on the prevention of suffering, this aspect is crucial: the best way to avoid bad-case scenarios specifically may not be to try and get everything right. Instead, it makes sense to focus on the worst outcomes (in terms of the suffering they would contain) and on tractable methods to avert them. As others are trying to shoot for a best-case outcome (and hopefully they will succeed!), it is important that some people also take care of addressing the biggest risks. This perspective to AI safety is especially promising both because it is currently neglected and because it is easier to avoid a subset of outcomes rather than to shoot for one highly specific outcome. Finally, it is something that people with many different value systems could get behind.

Earning money with/for work in AI safety

7 rmoehn 18 July 2016 05:37AM

(I'm re-posting my question from the Welcome thread, because nobody answered there.)

I care about the current and future state of humanity, so I think it's good to work on existential or global catastrophic risk. Since I studied computer science at a university until last year, I decided to work on AI safety. Currently I'm a research student at Kagoshima University doing exactly that. Before April this year I had only a little experience with AI or ML. Therefore, I'm slowly digging through books and articles in order to be able to do research.

I'm living off my savings. My research student time will end in March 2017 and my savings will run out some time after that. Nevertheless, I want to continue AI safety research, or at least work on X or GC risk.

I see three ways of doing this:

  • Continue full-time research and get paid/funded by someone.
  • Continue research part-time and work the other part of the time in order to get money. This work would most likely be programming (since I like it and am good at it). I would prefer work that helps humanity effectively.
  • Work full-time on something that helps humanity effectively.


Oh, and I need to be location-independent or based in Kagoshima.

I know http://futureoflife.org/job-postings/, but all of the job postings fail me in two ways: not location-independent and requiring more/different experience than I have.

Can anyone here help me? If yes, I would be happy to provide more information about myself.

(Note that I think I'm not in a precarious situation, because I would be able to get a remote software development job fairly easily. Just not in AI safety or X or GC risk.)

Wikipedia usage survey results

7 riceissa 15 July 2016 12:49AM

Background

At the end of May 2016, Vipul Naik and I created a Wikipedia usage survey to gauge the usage habits of Wikipedia readers and editors. SurveyMonkey allows the use of different “collectors” (i.e. survey URLs that keep results separate), so we circulated several different URLs among four locations to see how different audiences would respond. The audiences were as follows:

  • SurveyMonkey’s United States audience with no demographic filters (62 responses, 54 of which are full responses)
  • Vipul Naik’s timeline (post asking people to take the survey; 70 responses, 69 of which are full responses). For background on Vipul’s timeline audience, see his page on how he uses Facebook.
  • The Wikipedia Analytics mailing list (email linking to the survey; 7 responses, 6 of which are full responses). Note that due to the small size of this group, the results below should not be trusted, unless possibly when the votes are decisive.
  • Slate Star Codex (post that links to the survey; 618 responses, 596 of which are full responses). While Slate Star Codex isn’t the same as LessWrong, we think there is significant overlap in the two sites’ audiences (see e.g. the recent LessWrong diaspora survey results).
  • In addition, although not an actual audience with a separate URL, several of the tables we present below will include an “H” group; this is the heavy users group of people who responded by saying they read 26 or more articles per week on Wikipedia. This group has 179 people: 164 from Slate Star Codex, 11 from Vipul’s timeline, and 4 from the Analytics mailing list.

We ran the survey from May 30 to July 9, 2016 (although only the Slate Star Codex survey had a response past June 1).

After we looked at the survey responses on the first day, Vipul and I decided to create a second survey to focus on the parts from the first survey that interested us the most. The second survey was only circulated among SurveyMonkey’s audiences: we used SurveyMonkey’s US audience with no demographic filters (54 responses), as well as a US audience of ages 18–29 with a college or graduate degree (50 responses). We first ran the survey on the unfiltered audience again because the wording of our first question had changed and we wanted the new baseline. We then chose to filter for young college-educated people because our prediction was that more educated people would be more likely to read Wikipedia (the SurveyMonkey demographic data does not include education, and we hadn’t seen the Pew Internet Research surveys in the next section, so we were relying on our intuition and some demographic data from past surveys) and because the younger respondents in our first survey gave more informative free-form responses, which mattered for the free-form questions in survey 2 (SurveyMonkey’s demographic data does include age).

Previous surveys

Several demographic surveys regarding Wikipedia have been conducted, targeting both editors and users. The surveys we found most helpful were the following:

  • The 2010 Wikipedia survey by the Collaborative Creativity Group and the Wikimedia Foundation. The explanation before the bottom table on page 7 of the overview PDF has “Contributors show slightly but significantly higher education levels than readers”, which provides weak evidence that more educated people are more likely to engage with Wikipedia.
  • The Global South User Survey 2014 by the Wikimedia Foundation
  • Pew Internet Research’s 2011 survey: “Education level continues to be the strongest predictor of Wikipedia use. The collaborative encyclopedia is most popular among internet users with at least a college degree, 69% of whom use the site.” (page 3)
  • Pew Internet Research’s 2007 survey

Note that we found the Pew Internet Research surveys after conducting our own two surveys (and during the write-up of this document).

Motivation

Vipul and I ultimately want to get a better sense of the value of a Wikipedia pageview (one way to measure the impact of content creation), and one way to do this is to understand how people are using Wikipedia. As we focus on getting more people to work on editing Wikipedia – thus causing more people to read the content we pay and help to create – it becomes more important to understand what people are doing on the site.

For some previous discussion, see also Vipul’s answers to the following Quora questions:

Wikipedia allows relatively easy access to pageview data (especially by using tools developed for this purpose, including one that Vipul made), and there are some surveys that provide demographic data (see “Previous surveys” above). However, after looking around, it was apparent that the kind of information our survey was designed to find was not available.
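(As an aside on the pageview side of things: for readers who want to pull similar numbers themselves, here is a minimal sketch against the public Wikimedia Pageviews REST API, assuming Python with the requests library. The endpoint parameters, date range, and example article are illustrative assumptions, and this is not necessarily the tool mentioned above.)

    # Fetch monthly desktop pageviews for one article from the Wikimedia
    # Pageviews REST API and sum them. Endpoint parameters and the date range
    # are illustrative assumptions; adjust to the project and period of interest.
    import requests

    BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"
    article = "Barack_Obama"  # hypothetical example article
    url = "%s/en.wikipedia/desktop/user/%s/monthly/2015070100/2016063000" % (BASE, article)

    resp = requests.get(url, headers={"User-Agent": "wikipedia-usage-survey-example"})
    resp.raise_for_status()
    items = resp.json()["items"]  # one entry per month, each with a "views" count

    total_views = sum(item["views"] for item in items)
    print(article, total_views)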

I should also note that we were driven by our curiosity of how people use Wikipedia.

Survey questions for the first survey

For reference, here are the survey questions for the first survey. A dummy/mock-up version of the survey can be found here: https://www.surveymonkey.com/r/PDTTBM8.

The survey introduction said the following:

This survey is intended to gauge Wikipedia use habits. This survey has 3 pages with 5 questions total (3 on the first page, 1 on the second page, 1 on the third page). Please try your best to answer all of the questions, and make a guess if you’re not sure.

And the actual questions:

  1. How many distinct Wikipedia pages do you read per week on average?

    • less than 1
    • 1 to 10
    • 11 to 25
    • 26 or more
  2. On a search engine (e.g. Google) results page, do you explicitly seek Wikipedia pages, or do you passively click on Wikipedia pages only if they show up at the top of the results?

    • I explicitly seek Wikipedia pages
    • I have a slight preference for Wikipedia pages
    • I just click on what is at the top of the results
  3. Do you usually read a particular section of a page or the whole article?

    • Particular section
    • Whole page
  4. How often do you do the following? (Choices: Several times per week, About once per week, About once per month, About once per several months, Never/almost never.)

    • Use the search functionality on Wikipedia
    • Be surprised that there is no Wikipedia page on a topic
  5. For what fraction of pages you read do you do the following? (Choices: For every page, For most pages, For some pages, For very few pages, Never. These were displayed in a random order for each respondent, but displayed in alphabetical order here.)

    • Check (click or hover over) at least one citation to see where the information comes from on a page you are reading
    • Check how many pageviews a page is getting (on an external site or through the Pageview API)
    • Click through/look for at least one cited source to verify the information on a page you are reading
    • Edit a page you are reading because of grammatical/typographical errors on the page
    • Edit a page you are reading to add new information
    • Look at the “See also” section for additional articles to read
    • Look at the editing history of a page you are reading
    • Look at the editing history solely to see if a particular user wrote the page
    • Look at the talk page of a page you are reading
    • Read a page mostly for the “Criticisms” or “Reception” (or similar) section, to understand different views on the subject
    • Share the page with a friend/acquaintance/coworker

For the SurveyMonkey audience, there were also some demographic questions (age, gender, household income, US region, and device type).

Survey questions for the second survey

For reference, here are the survey questions for the second survey. A dummy/mock-up version of the survey can be found here: https://www.surveymonkey.com/r/28BW78V.

The survey introduction said the following:

This survey is intended to gauge Wikipedia use habits. Please try your best to answer all of the questions, and make a guess if you’re not sure.

This survey has 4 questions across 3 pages.

In this survey, “Wikipedia page” refers to a Wikipedia page in any language (not just the English Wikipedia).

And the actual questions:

  1. How many distinct Wikipedia pages do you read (at least one sentence of) per week on average?

    • Fewer than 1
    • 1 to 10
    • 11 to 25
    • 26 or more
  2. Which of these articles have you read (at least one sentence of) on Wikipedia (select all that apply)?

    • Adele
    • Barack Obama
    • Bernie Sanders
    • China
    • Donald Trump
    • Google
    • Hillary Clinton
    • India
    • Japan
    • Justin Bieber
    • Justin Trudeau
    • Katy Perry
    • Taylor Swift
    • The Beatles
    • United States
    • World War II
    • None of the above
  3. What are some of the Wikipedia articles you have most recently read (at least one sentence of)? Feel free to consult your browser’s history.

  4. Recall a time when you were surprised that a topic did not have a Wikipedia page. What were some of these topics?

Results

In this section we present the highlights from each of the survey questions. If you prefer to dig into the data yourself, there are also some exported PDFs below provided by SurveyMonkey. Most of the inferences can be made using these PDFs, but there are some cases where additional filters are needed to deduce certain percentages.

We use the notation “SnQm” to mean “survey n question m”.

S1Q1: number of Wikipedia pages read per week

Here is a table that summarizes the data for Q1:

How many distinct Wikipedia pages do you read per week on average? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list.
Response SM V SSC AM
less than 1 42% 1% 1% 0%
1 to 10 45% 40% 37% 29%
11 to 25 13% 43% 36% 14%
26 or more 0% 16% 27% 57%

Here are some highlights from the first question that aren’t apparent from the table:

  • Of the people who read fewer than 1 distinct Wikipedia page per week (26 people), 68% were female even though females were only 48% of the respondents. (Note that gender data is only available for the SurveyMonkey audience.)

  • Filtering for high household income ($150k or more; 11 people) in the SurveyMonkey audience, only 2 read fewer than 1 page per week, although most (7) of the responses still fall in the “1 to 10” category.

The comments indicated that this question was flawed in several ways: we didn’t specify which language Wikipedias count nor what it meant to “read” an article (the whole page, a section, or just a sentence?). One comment questioned the “low” ceiling of 26; in fact, I had initially made the cutoffs 1, 10, 100, 500, and 1000, but Vipul suggested the final cutoffs because he argued they would make it easier for people to answer (without having to look it up in their browser history). It turned out this modification was reasonable because the “26 or more” group was a minority.

S1Q2: affinity for Wikipedia in search results

We asked Q2, “On a search engine (e.g. Google) results page, do you explicitly seek Wikipedia pages, or do you passively click on Wikipedia pages only if they show up at the top of the results?”, to see to what extent people preferred Wikipedia in search results. The main implication to this for people who do content creation on Wikipedia is that if people do explicitly seek Wikipedia pages (for whatever reason), it makes sense to give them more of what they want. On the other hand, if people don’t prefer Wikipedia, it makes sense to update in favor of diversifying one’s content creation efforts while still keeping in mind that raw pageviews indicate that content will be read more if placed on Wikipedia (see for instance Brian Tomasik’s experience, which is similar to my own, or gwern’s page comparing Wikipedia with other wikis).

The following table summarizes our results.

On a search engine (e.g. Google) results page, do you explicitly seek Wikipedia pages, or do you passively click on Wikipedia pages only if they show up at the top of the results? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Explicitly seek Wikipedia 19% 60% 63% 57% 79%
Slight preference for Wikipedia 29% 39% 34% 43% 20%
Just click on top results 52% 1% 3% 0% 1%

One error on my part was that I didn’t include an option for people who avoided Wikipedia or did something else. This became apparent from the comments. For this reason, the “Just click on top results” option might be inflated. In addition, some comments indicated a mixed strategy of preferring Wikipedia for general overviews while avoiding it for specific inquiries, so allowing multiple selections might have been better for this question.

S1Q3: section vs whole page

This question is relevant for Vipul and me because the work Vipul funds is mainly whole-page creation. If people are mostly reading the introduction or a particular section like the “Criticisms” or “Reception” section (see S1Q5), then that forces us to consider spending more time on those sections, or to strengthen those sections on weak existing pages.

Responses to this question were fairly consistent across different audiences, as can be seen in the following table.

Do you usually read a particular section of a page or the whole article? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list.
Response SM V SSC AM
Section 73% 80% 74% 86%
Whole 34% 23% 33% 29%

Note that people were allowed to select more than one option for this question. The comments indicate that several people do a combination, where they read the introductory portion of an article, then narrow down to the section of their interest.

S1Q4: search functionality on Wikipedia and surprise at lack of Wikipedia pages

We asked about whether people use the search functionality on Wikipedia because we wanted to know more about people’s article discovery methods. The data is summarized in the following table.

How often do you use the search functionality on Wikipedia? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Several times per week 8% 14% 32% 57% 55%
About once per week 19% 17% 21% 14% 15%
About once per month 15% 13% 14% 0% 3%
About once per several months 13% 12% 9% 14% 5%
Never/almost never 45% 43% 24% 14% 23%

Many people noted here that rather than using Wikipedia’s search functionality, they use Google with “wiki” attached to their query, DuckDuckGo’s “!w” expression, or some browser configuration to allow a quick search on Wikipedia.

To be more thorough about discovering people’s content discovery methods, we should have asked about other methods as well. We did ask about the “See also” section in S1Q5.

Next, we asked how often people are surprised that there is no Wikipedia page on a topic to gauge to what extent people notice a “gap” between how Wikipedia exists today and how it could exist. We were curious about what articles people specifically found missing, so we followed up with S2Q4.

How often are you surprised that there is no Wikipedia page on a topic? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Several times per week 2% 0% 2% 29% 6%
About once per week 8% 22% 18% 14% 34%
About once per month 18% 36% 34% 29% 31%
About once per several months 21% 22% 27% 0% 19%
Never/almost never 52% 20% 19% 29% 10%

Two comments on this question (out of 59) – both from the SSC group – specifically bemoaned deletionism, with one comment calling deletionism “a cancer killing Wikipedia”.

S1Q5: behavior on pages

This question was intended to gauge how often people perform an action for a specific page; as such, the frequencies are expressed in page-relative terms.

The following table presents, for each action, a score computed as the average of the response values weighted by the number of responses. The scores range from 1 (for every page) to 5 (never); in other words, the lower the number, the more frequently one does the thing.

For what fraction of pages you read do you do the following? Note that the responses have been shortened here; see the “Survey questions” section for the wording used in the survey. Responses are sorted by the values in the SSC column. SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Check ≥1 citation 3.57 2.80 2.91 2.67 2.69
Look at “See also” 3.65 2.93 2.92 2.67 2.76
Read mostly for “Criticisms” or “Reception” 4.35 3.12 3.34 3.83 3.14
Click through ≥1 source to verify information 3.80 3.07 3.47 3.17 3.36
Share the page 4.11 3.72 3.86 3.67 3.79
Look at the talk page 4.31 4.28 4.03 3.00 3.86
Look at the editing history 4.35 4.32 4.12 3.33 3.92
Edit a page for grammatical/typographical errors 4.50 4.41 4.22 3.67 4.02
Edit a page to add new information 4.61 4.55 4.49 3.83 4.34
Look at editing history to verify author 4.50 4.65 4.48 3.67 4.73
Check how many pageviews a page is getting 4.63 4.88 4.96 3.17 4.92

The table above provides a good ranking of how often people perform these actions on pages, but not the distribution information (which would require three dimensions to present fully). In general, the more common actions (scores of 2.5–4) had responses that clustered among “For some pages”, “For very few pages”, and “Never”, while the less common actions (scores above 4) had responses that clustered mainly in “Never”.
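To make the scoring concrete, here is a minimal sketch of the weighted mean described above, assuming Python; the response counts are hypothetical and not taken from the survey data:

    # Page-relative score for one action: a mean of response values (1 = "For every
    # page" ... 5 = "Never") weighted by how many respondents picked each option.
    response_values = {
        "For every page": 1,
        "For most pages": 2,
        "For some pages": 3,
        "For very few pages": 4,
        "Never": 5,
    }

    # Hypothetical counts of answers for a single action (not real survey data).
    counts = {
        "For every page": 5,
        "For most pages": 40,
        "For some pages": 210,
        "For very few pages": 230,
        "Never": 111,
    }

    score = sum(response_values[r] * n for r, n in counts.items()) / sum(counts.values())
    print(round(score, 2))  # lower scores correspond to more frequently performed actions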

One comment (out of 43) – from the SSC group, but a different individual from the two in S1Q4 – bemoaned deletionism.

S2Q1: number of Wikipedia pages read per week

Note the wording changes on this question for the second survey: “less” was changed to “fewer”, the clarification “at least one sentence of” was added, and we explicitly allowed any language. We have also presented the survey 1 results for the SurveyMonkey audience in the corresponding rows, but note that because of the change in wording, the correspondence isn’t exact.

How many distinct Wikipedia pages do you read (at least one sentence of) per week on average? SM = SurveyMonkey audience with no demographic filters, CEYP = College-educated young people of SurveyMonkey, S1SM = SurveyMonkey audience with no demographic filters from the first survey.
Response SM CEYP S1SM
Fewer than 1 37% 32% 42%
1 to 10 48% 64% 45%
11 to 25 7% 2% 13%
26 or more 7% 2% 0%

Comparing SM with S1SM, we see that probably because of the wording, the percentages have drifted in the direction of more pages read. It might be surprising that the young educated audience seems to have a smaller fraction of heavy users than the general population. However note that each group only had ~50 responses, and that we have no education information for the SM group.

S2Q2: multiple-choice of articles read

Our intention with this question was to see if people’s stated or recalled article frequencies matched the actual, revealed popularity of the articles. Therefore we present the pageview data along with the percentage of people who said they had read an article.

Which of these articles have you read (at least one sentence of) on Wikipedia (select all that apply)? SM = SurveyMonkey audience with no demographic filters, CEYP = College-educated young people of SurveyMonkey. Columns “2016” and “2015” are desktop pageviews in millions. Note that the 2016 pageviews only include pageviews through the end of June. The rows are sorted by the values in the CEYP column followed by those in the SM column.
Response SM CEYP 2016 2015
None 37% 40%
World War II 17% 22% 2.6 6.5
Barack Obama 17% 20% 3.0 7.7
United States 17% 18% 4.3 9.6
Donald Trump 15% 18% 14.0 6.6
Taylor Swift 9% 18% 1.7 5.3
Bernie Sanders 17% 16% 4.3 3.8
Japan 11% 16% 1.6 3.7
Adele 6% 16% 2.0 4.0
Hillary Clinton 19% 14% 2.8 1.5
China 13% 14% 1.9 5.2
The Beatles 11% 14% 1.4 3.0
Katy Perry 9% 12% 0.8 2.4
Google 15% 10% 3.0 9.0
India 13% 10% 2.4 6.4
Justin Bieber 4% 8% 1.6 3.0
Justin Trudeau 9% 6% 1.1 3.0

Below are four plots of the data. Note that r_s denotes Spearman’s rank correlation coefficient, which is used instead of Pearson’s r because it is less affected by outliers. Note also that the percentage of respondents who viewed a page counts each respondent once, whereas the number of pageviews does not have this restriction (i.e. duplicate pageviews count), so we wouldn’t expect the relationship to be entirely linear even if the survey audiences were perfectly representative of the general population.

SM vs 2016 pageviews

SM vs 2015 pageviews

CEYP vs 2016 pageviews

CEYP vs 2015 pageviews
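If you want to reproduce the r_s values, here is a minimal sketch, assuming Python with scipy (my choice of tooling for illustration, not necessarily what was used for the original plots); the percentages and pageview counts are transcribed from the S2Q2 table above:

    from scipy.stats import spearmanr

    # Row order for the data below, transcribed from the S2Q2 table
    # ("None of the above" is excluded).
    articles = ["World War II", "Barack Obama", "United States", "Donald Trump",
                "Taylor Swift", "Bernie Sanders", "Japan", "Adele",
                "Hillary Clinton", "China", "The Beatles", "Katy Perry",
                "Google", "India", "Justin Bieber", "Justin Trudeau"]

    sm     = [17, 17, 17, 15,  9, 17, 11,  6, 19, 13, 11,  9, 15, 13,  4,  9]   # % of SM respondents
    ceyp   = [22, 20, 18, 18, 18, 16, 16, 16, 14, 14, 14, 12, 10, 10,  8,  6]   # % of CEYP respondents
    pv2016 = [2.6, 3.0, 4.3, 14.0, 1.7, 4.3, 1.6, 2.0, 2.8, 1.9, 1.4, 0.8, 3.0, 2.4, 1.6, 1.1]  # millions, through June 2016
    pv2015 = [6.5, 7.7, 9.6,  6.6, 5.3, 3.8, 3.7, 4.0, 1.5, 5.2, 3.0, 2.4, 9.0, 6.4, 3.0, 3.0]  # millions, 2015

    for label, pct in (("SM", sm), ("CEYP", ceyp)):
        for year, views in (("2016", pv2016), ("2015", pv2015)):
            r_s, p = spearmanr(pct, views)
            print("%s vs %s pageviews: r_s = %.2f (p = %.3f)" % (label, year, r_s, p))

Since r_s depends only on rank order, it also tolerates the non-linearity discussed above better than Pearson’s r would.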

S2Q3: free response of articles read

The most common response was along the lines of “None”, “I don’t know”, “I don’t remember”, or similar. Among the more useful responses were:

S2Q4: free response of surprise at lack of Wikipedia pages

As with the previous question, the most common response was along the lines of “None”, “I don’t know”, “I don’t remember”, “Doesn’t happen”, or similar.

The most useful responses were classes of things: “particular words”, “French plays/books”, “Random people”, “obscure people”, “Specific list pages of movie genres”, “Foreign actors”, “various insect pages”, and so forth.

Summaries of responses (exported from SurveyMonkey)

SurveyMonkey allows exporting of response summaries. Here are the exports for each of the audiences.

Survey-making lessons

Not having any experience designing surveys, and wanting some rough results quickly, I decided not to look into survey-making best practices beyond the feedback from Vipul. As the first survey progressed, it became clear that there were several deficiencies in that survey:

  • Question 1 did not specify what counts as reading a page.
  • We did not specify which language Wikipedias we were considering (multiple people noted how they read other language Wikipedias other than the English Wikipedia).
  • Question 2 did not include an option for people who avoid Wikipedia or do something else entirely.
  • We did not include an option to allow people to release their survey results.

Further questions

The two surveys we’ve done so far provide some insight into how people use Wikipedia, but we are still far from understanding the value of Wikipedia pageviews. Some remaining questions:

  • Could it be possible that even on non-obscure topics, most of the views are by “elites” (i.e. those with outsized impact on the world)? This could mean pageviews are more valuable than previously thought.
  • On S2Q1, why did our data show that CEYP was less engaged with Wikipedia than SM? Is this a limitation of the small number of responses or of SurveyMonkey’s audiences?

Further reading

Acknowledgements

Thanks to Vipul Naik for collaboration on this project and feedback while writing this document, and thanks to Ethan Bashkansky for reviewing the document. All imperfections are my own.

The writing of this document was sponsored by Vipul Naik.

Document source and versions

The source files used to compile this document are available in a GitHub Gist. The Git repository of the Gist contains all versions of this document since its first publication.

This document is available in the following formats:

License

This document is released to the public domain.

Attempts to Debias Hindsight Backfire!

7 Gram_Stone 13 June 2016 04:13PM

(Content note: A common suggestion for debiasing hindsight: try to think of many alternative historical outcomes. But thinking of too many examples can actually make hindsight bias worse.)

Followup to: Availability Heuristic Considered Ambiguous

Related to: Hindsight Bias

I.

Hindsight bias is when people who know the answer vastly overestimate its predictability or obviousness, compared to the estimates of subjects who must guess without advance knowledge.  Hindsight bias is sometimes called the I-knew-it-all-along effect.

The way that this bias is usually explained is via the availability of outcome-related knowledge. The outcome is very salient, but the possible alternatives are not, so the probability that people claim they would have assigned to an event that has already happened gets jacked up. It's also known that knowing about hindsight bias and trying to adjust for it consciously doesn't eliminate it.

This means that most attempts at debiasing focus on making alternative outcomes more salient. One is encouraged to recall other ways that things could have happened. Even this merely attenuates the hindsight bias, and does not eliminate it (Koriat, Lichtenstein, & Fischhoff, 1980; Slovic & Fischhoff, 1977).

II.

Remember what happened with the availability heuristic when we varied the number of examples that subjects had to recall? Crazy things happened because of the phenomenal experience of difficulty that recalling more examples caused within the subjects.

You might imagine that, if you recalled too many examples, you could actually make the hindsight bias worse, because if subjects experience alternative outcomes as difficult to generate, then they'll consider the alternatives less likely, and not more.

Relatedly, Sanna, Schwarz, and Stocker (2002, Experiment 2) presented participants with a description of the British–Gurkha War (taken from Fischhoff, 1975; you should remember this one). Depending on conditions, subjects were told either that the British or the Gurkha had won the war, or were given no outcome information. Afterwards, they were asked, “If we hadn’t already told you who had won, what would you have thought the probability of the British (Gurkhas, respectively) winning would be?”, and asked to give a probability in the form of a percentage.

Like in the original hindsight bias studies, subjects with outcome knowledge assigned a higher probability to the known outcome than subjects in the group with no outcome knowledge. (Median probability of 58.2% in the group with outcome knowledge, and 48.3% in the group without outcome knowledge.)

Some subjects, however, were asked to generate either 2 or 10 thoughts about how the outcome could have been different. Thinking of 2 alternative outcomes slightly attenuated hindsight bias (median down to 54.3%), but asking subjects to think of 10 alternative outcomes went horribly, horribly awry, increasing the subjects' median probability for the 'known' outcome all the way up to 68.0%!

It looks like we should be extremely careful when we try to retrieve counterexamples to claims that we believe. If we're too hard on ourselves and fail to take this effect into account, then we can make ourselves even more biased than we would have been if we had done nothing at all.

III.

But it doesn't end there.

Like in the availability experiments before this, we can discount the informational value of the experience of difficulty when generating examples of alternative historical outcomes. Then the subjects would make their judgment based on the number of thoughts instead of the experience of difficulty.

Just before the 2000 U.S. presidential elections, Sanna et al. (2002, Experiment 4) asked subjects to predict the percentage of the popular vote the major candidates would receive. (They had to wait a little longer than they expected for the results.)

Later, they were asked to recall what their predictions were.

Control group subjects who listed no alternative thoughts replicated previous results on the hindsight bias.

Experimental group subjects who listed 12 alternative thoughts experienced difficulty and their hindsight bias wasn't made any better, but it didn't get worse either.

(It seems the reason it didn't get worse is because everyone thought Gore was going to win before the election, and for the hindsight bias to get worse, the subjects would have to incorrectly recall that they predicted a Bush victory.)

Other experimental group subjects listed 12 alternative thoughts and were also made to attribute their phenomenal experience of difficulty to lack of domain knowledge, via the question: "We realize that this was an extremely difficult task that only people with a good knowledge of politics may be able to complete. As background information, may we therefore ask you how knowledgeable you are about politics?" They were then made to provide a rating of their political expertise and to recall their predictions.

Because they discounted the relevance of the difficulty of recalling 12 alternative thoughts, attributing it to their lack of political domain knowledge, thinking of 12 ways that Gore could have won introduced a bias in the opposite direction! They recalled their original predictions for a Gore victory as even more confident than they actually, originally were.

We really are doomed.


Fischhoff, B. (1975). Hindsight is not equal to foresight: the effect of outcome knowledge on judgment under uncertainty. Journal of Experimental Psychology: Human Perception and Performance, 1, 288–299.

Koriat, A., Lichtenstein, S., & Fischhoff, B. (1980). Reasons for confidence. Journal of Experimental Psychology: Human Learning and Memory, 6, 107–118.

Sanna, L. J., Schwarz, N., & Stocker, S. L. (2002). When debiasing backfires: Accessible content and accessibility experiences in debiasing hindsight through mental simulations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28, 497–502.

Slovic, P., & Fischhoff, B. (1977). On the psychology of experimental surprises. Journal of Experimental Psychology: Human Perception and Performance, 3, 544–551.

rationalfiction.io - publish, discover, and discuss rational fiction

7 rayalez 31 May 2016 12:02PM

Hey, everyone! I want to share with you a project I've been working on for a while - http://rationalfiction.io.

I want it to become the perfect place to publish, discover, and discuss rational fiction.

We already have a lot of awesome stories, and I invite you to join and post more! =)

Paid research assistant position focusing on artificial intelligence and existential risk

7 crmflynn 02 May 2016 06:27PM

Yale Assistant Professor of Political Science Allan Dafoe is seeking Research Assistants for a project on the political dimensions of the existential risks posed by advanced artificial intelligence. The project will involve exploring issues related to grand strategy and international politics, reviewing possibilities for social scientific research in this area, and institution building. Familiarity with international relations, existential risk, Effective Altruism, and/or artificial intelligence are a plus but not necessary. The project is done in collaboration with the Future of Humanity Institute, located in the Faculty of Philosophy at the University of Oxford. There are additional career opportunities in this area, including in the coming academic year and in the future at Yale, Oxford, and elsewhere. If interested in the position, please email allan.dafoe@yale.edu with a copy of your CV, a writing sample, an unofficial copy of your transcript, and a short (200-500 word) statement of interest. Work can be done remotely, though being located in New Haven, CT or Oxford, UK is a plus.

[LINK] Updating Drake's Equation with values from modern astronomy

7 DanArmak 30 April 2016 10:08PM

A paper published in Astrobiology: "A New Empirical Constraint on the Prevalence of Technological Species in the Universe" (PDF), by A. Frank and W.T. Sullivan.

From the abstract:

Recent advances in exoplanet studies provide strong constraints on all astrophysical terms in the Drake equation. [...] We find that as long as the probability that a habitable zone planet develops a technological species is larger than ~ 10^-24, humanity is not the only time technological intelligence has evolved.

They say we now know with reasonable certainty the total number of stars ever to exist (in the observable universe), and the average number of planets in the habitable zone. But we still don't know the probabilities of life, intelligence, and technology arising. They call this cumulative unknown factor f_bt.

Their result: for technological civilization to arise no more than once, with probability 0.01, in the lifetime of the observable universe, f_bt should be no greater than ~ 2.5 x 10^-24.


Discussion

It's convenient that they calculate the chance technological civilization ever arose, rather than the chance one exists now. This is just the number we need to estimate the likelihood of a Great Filter.

They state their result as "[if we set f_bt ≤ 2.5 x 10^-24, then] in a statistical sense were we to rerun the history of the Universe 100 times, only once would a lone technological species occur". But I don't know what rerunning the Universe means. I also can't formulate this as saying "if we hadn't already observed the Universe to be apparently empty of life, we would expect it to contain or to have once contained life with a probability of 10^24", because that would ignore the chance that another civilization (if it counterfactually existed) would have affected or prevented the rise of life on Earth. Can someone help reformulate this?

I don't know if their modern values for star and planet formation have been used in previous discussions of the Fermi paradox or the Great Filter. (The papers they cite for their values date from 2012, 2013 and 2015.) I also don't know if these values should be trusted, or what concrete values had been used previously. People on top of the Great Filter discussion probably already updated when the astronomical data came in.

Rationality Reading Group: Part Z: The Craft and the Community

6 Gram_Stone 04 May 2016 11:03PM

This is part of a semi-monthly reading group on Eliezer Yudkowsky's ebook, Rationality: From AI to Zombies. For more information about the group, see the announcement post.


Welcome to the Rationality reading group. This fortnight we discuss Part Z: The Craft and the Community (pp. 1651-1750). This post summarizes each article of the sequence, linking to the original LessWrong post where available.

Z. The Craft and the Community

312. Raising the Sanity Waterline - Behind every particular failure of social rationality is a larger and more general failure of social rationality; even if all religious content were deleted tomorrow from all human minds, the larger failures that permit religion would still be present. Religion may serve the function of an asphyxiated canary in a coal mine - getting rid of the canary doesn't get rid of the gas. Even a complete social victory for atheism would only be the beginning of the real work of rationalists. What could you teach people without ever explicitly mentioning religion, that would raise their general epistemic waterline to the point that religion went underwater?

313. A Sense That More Is Possible - The art of human rationality may have not been much developed because its practitioners lack a sense that vastly more is possible. The level of expertise that most rationalists strive to develop is not on a par with the skills of a professional mathematician - more like that of a strong casual amateur. Self-proclaimed "rationalists" don't seem to get huge amounts of personal mileage out of their craft, and no one sees a problem with this. Yet rationalists get less systematic training in a less systematic context than a first-dan black belt gets in hitting people.

314. Epistemic Viciousness - An essay by Gillian Russell on "Epistemic Viciousness in the Martial Arts" generalizes amazingly to possible and actual problems with building a community around rationality. Most notably the extreme dangers associated with "data poverty" - the difficulty of testing the skills in the real world. But also such factors as the sacredness of the dojo, the investment in teachings long-practiced, the difficulty of book learning that leads into the need to trust a teacher, deference to historical masters, and above all, living in data poverty while continuing to act as if the luxury of trust is possible.

315. Schools Proliferating Without Evidence - The branching schools of "psychotherapy", another domain in which experimental verification was weak (nonexistent, actually), show that an aspiring craft lives or dies by the degree to which it can be tested in the real world. In the absence of that testing, one becomes prestigious by inventing yet another school and having students, rather than excelling at any visible performance criterion. The field of hedonic psychology (happiness studies) began, to some extent, with the realization that you could measure happiness - that there was a family of measures that by golly did validate well against each other. The act of creating a new measurement creates new science; if it's a good measurement, you get good science.

316. Three Levels of Rationality Verification - How far the craft of rationality can be taken depends largely on what methods can be invented for verifying it. Tests seem usefully stratifiable into reputational, experimental, and organizational. A "reputational" test is some real-world problem that tests the ability of a teacher or a school (like running a hedge fund, say) - "keeping it real", but without being able to break down exactly what was responsible for success. An "experimental" test is one that can be run on each of a hundred students (such as a well-validated survey). An "organizational" test is one that can be used to preserve the integrity of organizations by validating individuals or small groups, even in the face of strong incentives to game the test. The strength of solution invented at each level will determine how far the craft of rationality can go in the real world.

317. Why Our Kind Can't Cooperate - The atheist/libertarian/technophile/sf-fan/early-adopter/programmer/etc crowd, aka "the nonconformist cluster", seems to be stunningly bad at coordinating group projects. There are a number of reasons for this, but one of them is that people are as reluctant to speak agreement out loud, as they are eager to voice disagreements - the exact opposite of the situation that obtains in more cohesive and powerful communities. This is not rational either! It is dangerous to be half a rationalist (in general), and this also applies to teaching only disagreement but not agreement, or only lonely defiance but not coordination. The pseudo-rationalist taboo against expressing strong feelings probably doesn't help either.

318. Tolerate Tolerance - One of the likely characteristics of someone who sets out to be a "rationalist" is a lower-than-usual tolerance for flawed thinking. This makes it very important to tolerate other people's tolerance - to avoid rejecting them because they tolerate people you wouldn't - since otherwise we must all have exactly the same standards of tolerance in order to work together, which is unlikely. Even if someone has a nice word to say about complete lunatics and crackpots - so long as they don't literally believe the same ideas themselves - try to be nice to them? Intolerance of tolerance corresponds to punishment of non-punishers, a very dangerous game-theoretic idiom that can lock completely arbitrary systems in place even when they benefit no one at all.

319. Your Price for Joining - The game-theoretical puzzle of the Ultimatum game has its reflection in a real-world dilemma: How much do you demand that an existing group adjust toward you, before you will adjust toward it? Our hunter-gatherer instincts will be tuned to groups of 40 with very minimal administrative demands and equal participation, meaning that we underestimate the inertia of larger and more specialized groups and demand too much before joining them. In other groups this resistance can be overcome by affective death spirals and conformity, but rationalists think themselves too good for this - with the result that people in the nonconformist cluster often set their joining prices way way way too high, like a 50-way split with each player demanding 20% of the money. Nonconformists need to move in the direction of joining groups more easily, even in the face of annoyances and apparent unresponsiveness. If an issue isn't worth personally fixing by however much effort it takes, it's not worth a refusal to contribute.

320. Can Humanism Match Religion's Output? - Anyone with a simple and obvious charitable project - responding with food and shelter to a tidal wave in Thailand, say - would be better off by far pleading with the Pope to mobilize the Catholics, rather than with Richard Dawkins to mobilize the atheists. For so long as this is true, any increase in atheism at the expense of Catholicism will be something of a hollow victory, regardless of all other benefits. Can no rationalist match the motivation that comes from the irrational fear of Hell? Or does the real story have more to do with the motivating power of physically meeting others who share your cause, and group norms of participating?

321. Church vs. Taskforce - Churches serve a role of providing community - but they aren't explicitly optimized for this, because their nominal role is different. If we desire community without church, can we go one better in the course of deleting religion? There's a great deal of work to be done in the world; rationalist communities might potentially organize themselves around good causes, while explicitly optimizing for community.

322. Rationality: Common Interest of Many Causes - Many causes benefit particularly from the spread of rationality - because it takes a little more rationality than usual to see their case, as a supporter, or even just a supportive bystander. Not just the obvious causes like atheism, but things like marijuana legalization. In the case of my own work this effect was strong enough that after years of bogging down I threw up my hands and explicitly recursed on creating rationalists. If such causes can come to terms with not individually capturing all the rationalists they create, then they can mutually benefit from mutual effort on creating rationalists. This cooperation may require learning to shut up about disagreements between such causes, and not fight over priorities, except in specialized venues clearly marked.

323. Helpless Individuals - When you consider that our grouping instincts are optimized for 50-person hunter-gatherer bands where everyone knows everyone else, it begins to seem miraculous that modern-day large institutions survive at all. And in fact, the vast majority of large modern-day institutions simply fail to exist in the first place. This is why funding of Science is largely through money thrown at Science rather than donations from individuals - research isn't a good emotional fit for the rare problems that individuals can manage to coordinate on. In fact very few things are, which is why e.g. 200 million adult Americans have such tremendous trouble supervising the 535 members of Congress. Modern humanity manages to put forth very little in the way of coordinated individual effort to serve our collective individual interests.

324. Money: The Unit of Caring - Omohundro's resource balance principle implies that the inside of any approximately rational system has a common currency of expected utilons. In our world, this common currency is called "money" and it is the unit of how much society cares about something - a brutal yet obvious point. Many people, seeing a good cause, would prefer to help it by donating a few volunteer hours. But this avoids the tremendous gains of comparative advantage, professional specialization, and economies of scale - the reason we're not still in caves, the only way anything ever gets done in this world, the tools grownups use when anyone really cares. Donating hours worked within a professional specialty and paying-customer priority, whether directly, or by donating the money earned to hire other professional specialists, is far more effective than volunteering unskilled hours.

325. Purchase Fuzzies and Utilons Separately - Wealthy philanthropists typically make the mistake of trying to purchase warm fuzzy feelings, status among friends, and actual utilitarian gains, simultaneously; this results in vague pushes along all three dimensions and a mediocre final result. It should be far more effective to spend some money/effort on buying altruistic fuzzies at maximum optimized efficiency (e.g. by helping people in person and seeing the results in person), buying status at maximum efficiency (e.g. by donating to something sexy that you can brag about, regardless of effectiveness), and spending most of your money on expected utilons (chosen through sheer cold-blooded shut-up-and-multiply calculation, without worrying about status or fuzzies).

326. Bystander Apathy - The bystander effect is the finding that groups of people are less likely to take action than a lone individual. There are a few explanations for why this might be the case.

327. Collective Apathy and the Internet - The causes of bystander apathy are even worse on the Internet. There may be an opportunity here for a startup to deliberately try to avert bystander apathy in online group coordination.

328. Incremental Progress and the Valley - The optimality theorems for probability theory and decision theory are for perfect probability theory and decision theory. There is no theorem that incremental changes toward the ideal, starting from a flawed initial form, must yield incremental progress at each step along the way. Since perfection is unattainable, why dare to try for improvement? But my limited experience with specialized applications suggests that given enough progress, one can achieve huge improvements over baseline - it just takes a lot of progress to get there.

329. Bayesians vs. Barbarians - Suppose that a country of rationalists is attacked by a country of Evil Barbarians who know nothing of probability theory or decision theory. There's a certain concept of "rationality" which says that the rationalists inevitably lose, because the Barbarians believe in a heavenly afterlife if they die in battle, while the rationalists would all individually prefer to stay out of harm's way. So the rationalist civilization is doomed; it is too elegant and civilized to fight the savage Barbarians... And then there's the idea that rationalists should be able to (a) solve group coordination problems, (b) care a lot about other people and (c) win...

330. Beware of Other-Optimizing - Aspiring rationalists often vastly overestimate their own ability to optimize other people's lives. They read nineteen webpages offering productivity advice that doesn't work for them... and then encounter the twentieth page, or invent a new method themselves, and wow, it really works - they've discovered the true method. Actually, they've just discovered the one method in twenty that works for them, and their confident advice is no better than randomly selecting one of the twenty blog posts. Other-Optimizing is exceptionally dangerous when you have power over the other person - for then you'll just believe that they aren't trying hard enough.

331. Practical Advice Backed by Deep Theories - Practical advice is genuinely much, much more useful when it's backed up by concrete experimental results, causal models that are actually true, or valid math that is validly interpreted. (Listed in increasing order of difficulty.) Stripping out the theories and giving the mere advice alone wouldn't have nearly the same impact or even the same message; and oddly enough, translating experiments and math into practical advice seems to be a rare niche activity relative to academia. If there's a distinctive LW style, this is it.

332. The Sin of Underconfidence - When subjects know about a bias or are warned about a bias, overcorrection is not unheard of as an experimental result. That's what makes a lot of cognitive subtasks so troublesome - you know you're biased but you're not sure how much, and if you keep tweaking you may overcorrect. The danger of underconfidence (overcorrecting for overconfidence) is that you pass up opportunities on which you could have been successful; not challenging difficult enough problems; losing forward momentum and adopting defensive postures; refusing to put the hypothesis of your inability to the test; losing enough hope of triumph to try hard enough to win. You should ask yourself "Does this way of thinking make me stronger, or weaker?"

333. Go Forth and Create the Art! - I've developed primarily the art of epistemic rationality, in particular, the arts required for advanced cognitive reductionism... arts like distinguishing fake explanations from real ones and avoiding affective death spirals. There is much else that needs developing to create a craft of rationality - fighting akrasia; coordinating groups; teaching, training, verification, and becoming a proper experimental science; developing better introductory literature... And yet it seems to me that there is a beginning barrier to surpass before you can start creating high-quality craft of rationality, having to do with virtually everyone who tries to think lofty thoughts going instantly astray, or indeed even realizing that a craft of rationality exists and that you ought to be studying cognitive science literature to create it. It's my hope that my writings, as partial as they are, will serve to surpass this initial barrier. The rest I leave to you.
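
Below is the toy calculation referenced in summary 324 ("Money: The Unit of Caring"). The wage and volunteer-value figures are invented purely for illustration; they do not come from the original post:

    # Invented numbers, purely illustrative of the comparative-advantage argument.
    professional_hourly_wage = 150.0  # what a specialist earns per hour in their field
    unskilled_hourly_value = 12.0     # what an hour of untrained volunteer labour is worth to a charity

    hours_given = 10
    value_from_donated_earnings = hours_given * professional_hourly_wage  # 1500.0
    value_from_volunteering = hours_given * unskilled_hourly_value        # 120.0

    print(value_from_donated_earnings / value_from_volunteering)          # 12.5

On these made-up numbers, the same ten hours buy the charity twelve and a half times as much when worked at one's specialty and donated as earnings rather than given as unskilled labour; the real ratio depends entirely on the actual wages and on what the charity needs.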

 


This has been a collection of notes on the assigned sequence for this fortnight. The most important part of the reading group though is discussion, which is in the comments section. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!

This is the end, beautiful friend!

Quick puzzle about utility functions under affine transformations

5 Liron 16 July 2016 05:11PM

Here's a puzzle based on something I used to be confused about:

It is known that utility functions are equivalent (i.e. produce the same preferences over actions) up to a positive affine transformation: u'(x) = au(x) + b where a is positive.

Suppose I have u(vanilla) = 3, u(chocolate) = 8. I prefer an action that yields a 50% chance of chocolate over an action that yields a 100% chance of vanilla, because 0.5(8) > 1.0(3).

Under the positive affine transformation a = 1, b = 4, we get u'(vanilla) = 7 and u'(chocolate) = 12. Therefore I now prefer the action that yields a 100% chance of vanilla, because 1.0(7) > 0.5(12).

How to resolve the contradiction?
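
A quick numerical sketch under one common reading of the setup - assuming the 50% lottery's other branch is "nothing", assigned utility 0 on the original scale. This only checks the arithmetic under that assumption; it is not necessarily the intended resolution:

    # Sketch: score both actions under the original and the transformed utility scale,
    # assuming the lottery's other outcome ("nothing") has utility 0 before transformation.
    def transform(u, a=1, b=4):
        return a * u + b

    u = {"vanilla": 3, "chocolate": 8, "nothing": 0}
    u_prime = {outcome: transform(value) for outcome, value in u.items()}  # 7, 12, 4

    for name, scale in [("original", u), ("transformed", u_prime)]:
        ev_lottery = 0.5 * scale["chocolate"] + 0.5 * scale["nothing"]
        ev_certain = 1.0 * scale["vanilla"]
        print(name, ev_lottery, ev_certain, ev_lottery > ev_certain)
    # original:    4.0 vs 3 -> lottery preferred
    # transformed: 8.0 vs 7 -> lottery still preferred

On this reading the preference is preserved, because the transformation is applied to every outcome of every action, including the implicit "nothing" branch of the gamble; comparing 1.0(7) against 0.5(12) applies the +b term to one side only.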

 

Notes on Imagination and Suffering

5 SquirrelInHell 05 July 2016 02:28PM

Time: 22:56:47

I

This is going to be an exercise in speed writing a LW post.

Not writing posts at all seems to be worse than writing poorly edited posts.

It is currently hard for me to do anything that even resembles actual speed writing: even as I type this sentence, I feel a nearly irresistible urge to check it for grammar mistakes and to make small corrections and improvements before I've even finished typing.

But to reduce the burden of writing, I predict it will be highly useful to develop the ability to write a post as fast as I can type, without going back.

If this proves to have acceptable results, you can expect more regular posts from me in the future.

And possibly, if I develop the habit of writing regularly, I'll finally get around to describing some of the topics on which I have (what I believe are) original and sizable clusters of knowledge that are not easily available elsewhere.

But for now, just some thoughts on a very particular aspect of modelling how human brains think about a very particular thing.

This thing is immense suffering.

Time: 23:03:18

(Still slow!)

II

You might have heard this, or something similar, from someone, possibly more than once in your life:

"you have no idea how I feel!"

or

"you can't even imagine how I feel!"

For me, this kind of phrase has always had the ring of a challenge. I have a potent imagination, and non-negligible experience in the affairs of humans. Therefore, I am certainly able to imagine how you feel, am I not?

Not so fast.

(Note added later: as Gram_Stone mentions, these kinds of statements tend to be used in epistemically unsound arguments, and as such can be presumed to be suspicious; however here, I am more concerned with the fact of the matter of how imagination works.)

Let's back up a little bit and recount some simple observations about imagining numbers.

You might be able to imagine and hold the image of five, six, nine, or even sixteen apples in your mind.

If I tell you to imagine something more complex, like pointed arrows arranged in a circle, you might be able to imagine four, or six, or maybe even eight of them.

If your brain is constructed differently from mine, you might easily go higher with the numbers.

But at some fairly small number, your mental machinery simply no longer has the capacity to imagine more shapes.

III

However, if I tell you that "you can't even imagine 35 apples!" it is obviously not an insult or a challenge, and what is more:

"imagining 35 apples" is NOT EQUAL to "comprehending in every detail what 35 apples are"

That is, if your knowledge of natural numbers is good enough - say, you passed the first year of primary school - you can analyse the situation of "35 apples" in every possible way, and imagine it partially, but not all of it at the same time.

Directly imagining apples is very similar to actually experiencing apples in your life, but it has a severe limitation.

You can experience 35 apples in your life, but you can't imagine all of them at once even if you saw them 3 seconds ago.

Meta: I think I'm getting better at not stopping when I write.

Time: 23:13:00

IV

But, you ask, what is the point of writing all this obvious stuff about apples?

Well, if you move to more emotionally charged topics, like someone's emotions, it is much harder to think about the situation in a clear way.

And if you have a clear model of how your brain processes this information, you might be able to respond in a more effective way.

In particular, you might be saved from feeling guilty or inadequate about not being able to imagine someone's feelings or suffering.

It is a simple fact about your brain that it has a limited capability to imagine emotion.

And especially with suffering, the amount of suffering you are able to experience IS OF A COMPLETELY DIFFERENT ORDER OF MAGNITUDE than the amount you are able to imagine, even with the best intentions and knowledge.

However, can you comprehend it?

V

From this model, it is also immediately obvious that the same thing happens when you think about your own suffering in the past.

We know generally that humans can't remember their emotions very well, and their memories don't correlate very well with reported experience-in-the-moment.

Based on my personal experience, I'll tentatively make some bolder claims.

If you have suffered a tremendous amount, and then enough time has passed to "get over it", your brain is not only unable to imagine how much you have suffered in the past:

it is also unable to comprehend the amount of suffering.

Yes, even if it's your own suffering.

And what is more, I propose that the exact mechanism of "getting over something" is more or less EQUIVALENT to losing the ability to comprehend that suffering.

The same would (I expect) hold in case of getting better after severe PTSD etc.

VI

So in this sense, a person telling you "you cannot even imagine how I feel" is right even under a less literal interpretation of their statement.

If you are a mentally healthy individual, not suffering from any major traumas, I suggest that your brain literally has a defense mechanism (one that protects your precious mental health) which makes it impossible for you not only to imagine, but also to fully comprehend, the amounts of suffering you are being told about.

Time: 23:28:04

Publish!

The map of future models

5 turchin 03 July 2016 01:17PM

TL;DR: Many models of the future exist. Several are relevant. Hyperbolic model is strongest, but too strange.

Our need: correct model of the future
Different people: different models = no communication.

Assumptions:
Model of the future = main driving force of historical process + graphic of changes
Model of the future determines global risks

The map: lists all main future models.
Structure: from fast growth – to slow growth models.
Pdf: http://immortality-roadmap.com/futuremodelseng.pdf

Link: The Economist on Paperclip Maximizers

5 Anders_H 30 June 2016 12:40PM

I certainly was not expecting the Economist to publish a special report on paperclip maximizers (!).

See http://www.economist.com/news/special-report/21700762-techies-do-not-believe-artificial-intelligence-will-run-out-control-there-are?fsrc=scn/fb/te/pe/ed/frankensteinspaperclips

 

As the title suggests, they are downplaying the risks of unfriendly AI, but the mere fact that the Economist published this is significant.

Diaspora roundup thread, 23rd June 2016

5 philh 23 June 2016 02:03PM

Guidelines: Top-level comments here should be links to things written by members of the rationalist community, preferably that have some particular interest to this community. Self-promotion is totally fine. Including a brief summary or excerpt is great, but not required. Generally stick to one link per top-level comment, so they can be voted on individually. Recent links are preferred.

Rule: Do not link to anyone who does not want to be linked to. In particular, Scott Alexander has asked people to get his permission, before linking to specific posts on his tumblr or in other out-of-the-way places.

Open thread, June 20 - June 26, 2016

5 Elo 21 June 2016 02:45AM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.


Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.

Crazy Ideas Thread

5 James_Miller 18 June 2016 12:30AM

This thread is intended to provide a space for 'crazy' ideas: ideas that spontaneously come to mind (and feel great), ideas you have long wanted to share but never found the place and time for, and ideas you think should be obvious and simple - but that nobody ever mentions.

Rules for this thread:

  1. Each crazy idea goes into its own top level comment and may be commented there.
  2. Voting should be based primarily on how original the idea is.
  3. Meta discussion of the thread should go to the top level comment intended for that purpose.

How my something to protect just coalesced into being

5 Romashka 28 May 2016 06:21PM

Tl;dr Different people will probably have different answers to the question of how to find the goal & nurture the 'something to protect' feeling, but mine is: your specific working experience is already doing it for you.

What values do other people expect of you?

I think that for many people, their jobs are the most meaningful ways of changing the world (including being a housewife). When you just enter a profession and start sharing your space and time with people who have been in it for a while, you let them shape you, for better or for worse. If the overwhelming majority of bankers are not EA (from the beneficiaries' point of view), it will be hard to be an EA banker. If the overwhelming majority of teachers view the lessons as basically slam dunks (from the students' point of view), it will be hard to be a teacher who revisits past insights with any purpose other than cramming.

So basically, if I want Something to protect, I find a compatible job, observe the people, like something good and hate something bad, and then try to give others like me the chance to do more of the first and less of the second.

I am generalizing from one example... or two...

I've been in a PhD program. I liked being expected to think, being given free advice about some of the possible failures, knowing other people who don't consider solo expeditions too dangerous. I hated being expected to fail, being denied changing my research topic, spending half a day home with a cranky kid and then running to meet someone who wasn't going to show up.

Then I became a lab technician & botany teacher in an out-of-school educational facility. I liked being able to show up later on some days, being treated kindly by a dozen unfamiliar people (even if they speak at classroom volume level), being someone who steps in for a chemistry instructor, finds umbrellas, and gives out books from her own library. I hated the condescending treatment of my subject by other teachers, sudden appointments, keys going missing, questions being recycled in highschool contests, and the feeling of intrusion upon others' well-structured lessons when I just had to add something (everyone took it in stride).

(...I am going to leave the job, because it doesn't pay well enough & I do want to see my kid on weekdays. It did let me identify my StP, though - a vision of what I want from botany education.)

Background and resolution.

When kids here in Ukraine start studying biology (6th-7th Form), they have not yet had any physics or chemistry classes, and are at the very start of the algebra and geometry curriculum. (Which makes this a good place to introduce the notion of a phenomenon for the first time.) The main thing one can get out of a botany course is, I think, the notion of ordered, sequential, mathematically describable change. The kids have already observed seasonal changes in weather and vegetation, and they have words to describe their personal experiences - but this goes unused. Instead, they begin with the history of botany (!), proceed to cell structure (!!), and then to bacteria, etc. The life cycle of mosses? Try asking them how long any particular stage takes! It all happens on one page, doesn't it?

There are almost no numbers.

There is, frankly, no need for numbers. Understanding the difference between the flowering and the non-flowering plants doesn't require any. There is almost no use for direct observation, either - even of the simplest things, like what will grow in the infusions of different vegetables after a week on the windowsill. There is no science.

And I don't like this.

I want there to be a book of simple, imperfectly posed problems containing as few words and as many pictures as possible. As in, 'compare the areas of the leaves on Day 1 - Day 15. How do they change? What processes underlie this?' etc. And there should be 10 or more leaves per day, so that the child would see that they don't grow equally fast, and that maybe, sometimes, you can't really tell Day 7 from Day 10.

And there would be questions like 'given such gradient of densities of stomata on the poplar's leaves from Height 1 to Height 2, will there be any change in the densities of stomata of the mistletoe plants attached at Height 1 and Height 2? Explain your reasoning.' (Actually, I am unsure about this one. Leaf conductance depends on more than stomatal density...)

Conclusion

...Sorry for so many words. One day, my brain just told me [in the voice of Sponge Bob] that this was what I wanted. Subjectively, it didn't use virtue ethics or conscious decisions or anything, just saw a hole in the world order and squashed plugs into it until one kinda fit.

Has it been like this for you?

LINK: Quora brainstorms strategies for containing AI risk

5 Mass_Driver 26 May 2016 04:32PM

In case you haven't seen it yet, Quora hosted an interesting discussion of different strategies for containing / mitigating AI risk, boosted by a $500 prize for the best answer. It attracted sci-fi author David Brin, U. Michigan professor Igor Markov, and several people with PhDs in machine learning, neuroscience, or artificial intelligence. Most people from LessWrong will disagree with most of the answers, but I think the article is useful as a quick overview of the variety of opinions that ordinary smart people have about AI risk.

https://www.quora.com/What-constraints-to-AI-and-machine-learning-algorithms-are-needed-to-prevent-AI-from-becoming-a-dystopian-threat-to-humanity

Open Thread May 16 - May 22, 2016

5 Elo 15 May 2016 11:35PM

If it's worth saying, but not worth its own post (even in Discussion), then it goes here.


Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should be posted in Discussion, and not Main.

4. Open Threads should start on Monday, and end on Sunday.
