Why CFAR's Mission?

38 AnnaSalamon 02 January 2016 11:23PM

Briefly put, CFAR's mission is to improve the sanity/thinking skill of those who are most likely to actually usefully impact the world.

I'd like to explain what this mission means to me, and why I think a high-quality effort of this sort is essential, possible, and urgent.

I used a Q&A format (with imaginary Q's) to keep things readable; I would also be very glad to Skype 1-on-1 if you'd like something about CFAR to make sense, as would Pete Michaud.  You can schedule a conversation automatically with me or Pete.

---

Q:  Why not focus exclusively on spreading altruism?  Or else on "raising awareness" for some particular known cause?

Briefly put: because historical roads to hell have been powered in part by good intentions; because the contemporary world seems bottlenecked by its ability to figure out what to do and how to do it (i.e. by ideas/creativity/capacity) more than by folks' willingness to sacrifice; and because rationality skill and epistemic hygiene seem like skills that may distinguish actually useful ideas from ineffective or harmful ones in a way that "good intentions" cannot.

Q:  Even given the above -- why focus extra on sanity, or true beliefs?  Why not focus instead on, say, competence/usefulness as the key determinant of how much do-gooding impact a motivated person can have?  (Also, have you ever met a Less Wronger?  I hear they are annoying and have lots of problems with “akrasia”, even while priding themselves on their high “epistemic” skills; and I know lots of people who seem “less rational” than Less Wrongers on some axes who would nevertheless be more useful in many jobs; is this “epistemic rationality” thingy actually the thing we need for this world-impact thingy?...)

This is an interesting one, IMO.

Basically, it seems to me that epistemic rationality, and skills for forming accurate explicit world-models, become more useful the more ambitious and confusing a problem one is tackling.

For example:


Deepmind Plans for Rat-Level AI

20 moridinamael 18 August 2016 04:26PM

Demis Hassabis gives a great presentation on the state of Deepmind's work as of April 20, 2016. Skip to 23:12 for the statement of the goal of creating a rat-level AI -- "An AI that can do everything a rat can do," in his words. From his tone, this sounds like a short-term rather than a long-term goal.

I don't think Hassabis is prone to making unrealistic plans or stating overly bold predictions. I strongly encourage you to scan through Deepmind's publication list to get a sense of how quickly they're making progress. (In fact, I encourage you to bookmark that page, because it seems like they add a new paper about twice a month.) The outfit seems to be systematically knocking down all the "Holy Grail" milestones on the way to GAI, and this is just Deepmind. The papers they've put out in just the last year or so concern successful one-shot learning, continuous control, actor-critic architectures, novel memory architectures, policy learning, and bootstrapped gradient learning, and these are just the most stand-out achievements. There's even a paper co-authored by Stuart Armstrong concerning Friendliness concepts on that list.

If we really do have a genuinely rat-level AI within the next couple of years, I think that would justify radically moving forward expectations of AI development timetables. Speaking very naively, if we can go from "sub-nematode" to "mammal that can solve puzzles" in that timeframe, I would view it as a form of proof that "general" intelligence does not require some mysterious ingredient that we haven't discovered yet.

Wikipedia usage survey results

7 riceissa 15 July 2016 12:49AM

Summary

The summary is not intended to be comprehensive. It highlights the most important takeaways you should get from this post.

  • Vipul Naik and I are interested in understanding how people use Wikipedia. One reason is that we are getting more people to work on editing and adding content to Wikipedia. We want to understand the impact of these edits, so that we can direct efforts more strategically. We are also curious!

  • From May to July 2016, we conducted two surveys of people’s Wikipedia usage. We collected survey responses from audience segments including Slate Star Codex readers, Vipul’s Facebook friends, and a few audiences through SurveyMonkey Audience and Google Consumer Surveys. Our survey questions measured how heavily people use Wikipedia, what sort of pages they read or expected to find, the relation between their search habits and Wikipedia, and other actions they took within Wikipedia.

  • Different audience segments responded very differently to the survey. Notably, the SurveyMonkey audience (which is closer to being representative of the general population) appears to use Wikipedia a lot less than Vipul’s Facebook friends and Slate Star Codex readers. Their consumption of Wikipedia is also more passive: they are less likely to explicitly seek Wikipedia pages when searching for a topic, and less likely to engage in additional actions on Wikipedia pages. Even the college-educated SurveyMonkey audience used Wikipedia very little.

  • This is tentative evidence that Wikipedia consumption is skewed towards a certain profile of people (and Vipul’s Facebook friends and Slate Star Codex readers sample much more heavily from that profile). Even more tentatively, these heavy users tend to be more “elite” and influential. This tentatively led us to revise upward our estimates of the social value of a Wikipedia pageview.

  • This was my first exercise in survey construction. I learned a number of lessons about survey design in the process.

  • All the survey questions, as well as the breakdown of responses for each of the audience segments, are described in this post. Links to PDF exports of response summaries are at the end of the post.

Background

At the end of May 2016, Vipul Naik and I created a Wikipedia usage survey to gauge the usage habits of Wikipedia readers and editors. SurveyMonkey allows the use of different “collectors” (i.e. survey URLs that keep results separate), so we circulated several different URLs among four locations to see how different audiences would respond. The audiences were as follows:

  • SurveyMonkey’s United States audience with no demographic filters (62 responses, 54 of which are full responses)
  • Vipul Naik’s timeline (post asking people to take the survey; 70 responses, 69 of which are full responses). For background on Vipul’s timeline audience, see his page on how he uses Facebook.
  • The Wikipedia Analytics mailing list (email linking to the survey; 7 responses, 6 of which are full responses). Note that due to the small size of this group, the results below should not be trusted, except possibly when the responses are overwhelmingly in one direction.
  • Slate Star Codex (post that links to the survey; 618 responses, 596 of which are full responses). While Slate Star Codex isn’t the same as LessWrong, we think there is significant overlap in the two sites’ audiences (see e.g. the recent LessWrong diaspora survey results).
  • In addition, although not an actual audience with a separate URL, several of the tables we present below will include an “H” group; this is the heavy users group of people who responded by saying they read 26 or more articles per week on Wikipedia. This group has 179 people: 164 from Slate Star Codex, 11 from Vipul’s timeline, and 4 from the Analytics mailing list.

We ran the survey from May 30 to July 9, 2016 (although only the Slate Star Codex survey had a response past June 1).

After we looked at the survey responses on the first day, Vipul and I decided to create a second survey to focus on the parts of the first survey that interested us the most. The second survey was circulated only among SurveyMonkey’s audiences: we used SurveyMonkey’s US audience with no demographic filters (54 responses), as well as a US audience of ages 18–29 with a college or graduate degree (50 responses). We ran the survey on the unfiltered audience again because the wording of our first question had changed and we wanted a new baseline. We then chose to filter for young college-educated people for two reasons. First, we predicted that more educated people would be more likely to read Wikipedia; SurveyMonkey’s demographic data does not include education, and we hadn’t yet seen the Pew Internet Research surveys discussed in the next section, so we were relying on our intuition and some demographic data from past surveys. Second, young people in our first survey gave more informative free-form responses, which mattered for the free-form questions in survey 2 (SurveyMonkey’s demographic data does include age).

We ran a third survey on Google Consumer Surveys with a single question that was a word-for-word replica of the first question from the second survey. The main motivation here was that on Google Consumer Surveys, a single-question survey costs only 10 cents per response, so it was possible to get a large number of responses at relatively low cost, and achieve more confidence in the tentative conclusions we had drawn from the SurveyMonkey surveys.

Previous surveys

Several demographic surveys regarding Wikipedia have been conducted, targeting both editors and users. The surveys we found most helpful were the following:

  • The 2010 Wikipedia survey by the Collaborative Creativity Group and the Wikimedia Foundation. The explanation before the bottom table on page 7 of the overview PDF states that “Contributors show slightly but significantly higher education levels than readers”, which provides weak evidence that more educated people are more likely to engage with Wikipedia.
  • The Global South User Survey 2014 by the Wikimedia Foundation
  • Pew Internet Research’s 2011 survey: “Education level continues to be the strongest predictor of Wikipedia use. The collaborative encyclopedia is most popular among internet users with at least a college degree, 69% of whom use the site.” (page 3)
  • Pew Internet Research’s 2007 survey

Note that we found the Pew Internet Research surveys after conducting our own two surveys (and during the write-up of this document).

Motivation

Vipul and I ultimately want to get a better sense of the value of a Wikipedia pageview (one way to measure the impact of content creation), and one way to do this is to understand how people are using Wikipedia. As we focus on getting more people to work on editing Wikipedia – thus causing more people to read the content we pay for and help to create – it becomes more important to understand what people are doing on the site.

For some previous discussion, see also Vipul’s answers to several related Quora questions.

Wikipedia allows relatively easy access to pageview data (especially by using tools developed for this purpose, including one that Vipul made), and there are some surveys that provide demographic data (see “Previous surveys” above). However, after looking around, it was apparent that the kind of information our survey was designed to find was not available.

I should also note that we were driven by our curiosity about how people use Wikipedia.

Survey questions for the first survey

For reference, here are the survey questions for the first survey. A dummy/mock-up version of the survey can be found here: https://www.surveymonkey.com/r/PDTTBM8.

The survey introduction said the following:

This survey is intended to gauge Wikipedia use habits. This survey has 3 pages with 5 questions total (3 on the first page, 1 on the second page, 1 on the third page). Please try your best to answer all of the questions, and make a guess if you’re not sure.

And the actual questions:

  1. How many distinct Wikipedia pages do you read per week on average?

    • less than 1
    • 1 to 10
    • 11 to 25
    • 26 or more
  2. On a search engine (e.g. Google) results page, do you explicitly seek Wikipedia pages, or do you passively click on Wikipedia pages only if they show up at the top of the results?

    • I explicitly seek Wikipedia pages
    • I have a slight preference for Wikipedia pages
    • I just click on what is at the top of the results
  3. Do you usually read a particular section of a page or the whole article?

    • Particular section
    • Whole page
  4. How often do you do the following? (Choices: Several times per week, About once per week, About once per month, About once per several months, Never/almost never.)

    • Use the search functionality on Wikipedia
    • Be surprised that there is no Wikipedia page on a topic
  5. For what fraction of pages you read do you do the following? (Choices: For every page, For most pages, For some pages, For very few pages, Never. These were displayed in a random order for each respondent, but displayed in alphabetical order here.)

    • Check (click or hover over) at least one citation to see where the information comes from on a page you are reading
    • Check how many pageviews a page is getting (on an external site or through the Pageview API)
    • Click through/look for at least one cited source to verify the information on a page you are reading
    • Edit a page you are reading because of grammatical/typographical errors on the page
    • Edit a page you are reading to add new information
    • Look at the “See also” section for additional articles to read
    • Look at the editing history of a page you are reading
    • Look at the editing history solely to see if a particular user wrote the page
    • Look at the talk page of a page you are reading
    • Read a page mostly for the “Criticisms” or “Reception” (or similar) section, to understand different views on the subject
    • Share the page with a friend/acquaintance/coworker

For the SurveyMonkey audience, there were also some demographic questions (age, gender, household income, US region, and device type).

Survey questions for the second survey

For reference, here are the survey questions for the second survey. A dummy/mock-up version of the survey can be found here: https://www.surveymonkey.com/r/28BW78V.

The survey introduction said the following:

This survey is intended to gauge Wikipedia use habits. Please try your best to answer all of the questions, and make a guess if you’re not sure.

This survey has 4 questions across 3 pages.

In this survey, “Wikipedia page” refers to a Wikipedia page in any language (not just the English Wikipedia).

And the actual questions:

  1. How many distinct Wikipedia pages do you read (at least one sentence of) per week on average?

    • Fewer than 1
    • 1 to 10
    • 11 to 25
    • 26 or more
  2. Which of these articles have you read (at least one sentence of) on Wikipedia (select all that apply)? (These were displayed in a random order except the last option for each respondent, but displayed in alphabetical order except the last option here.)

    • Adele
    • Barack Obama
    • Bernie Sanders
    • China
    • Donald Trump
    • Google
    • Hillary Clinton
    • India
    • Japan
    • Justin Bieber
    • Justin Trudeau
    • Katy Perry
    • Taylor Swift
    • The Beatles
    • United States
    • World War II
    • None of the above
  3. What are some of the Wikipedia articles you have most recently read (at least one sentence of)? Feel free to consult your browser’s history.

  4. Recall a time when you were surprised that a topic did not have a Wikipedia page. What were some of these topics?

Survey questions for the third survey (Google Consumer Surveys)

This survey had exactly one question. The wording of the question was exactly the same as that of the first question of the second survey.

  1. How many distinct Wikipedia pages do you read (at least one sentence of) per week on average?

    • Fewer than 1
    • 1 to 10
    • 11 to 25
    • 26 or more

One slight difference was that whereas in the second survey, the order of the options was fixed, the third survey did a 50/50 split between that order and the exact reverse order. Such splitting is a best practice to deal with any order-related biases, while still preserving the logical order of the options. You can read more on the questionnaire design page of the Pew Research Center.
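The 50/50 split between the forward and reversed option orders can be sketched in a few lines. This is an illustrative simulation of the idea, not GCS's actual mechanism; the option strings are taken from the question above:

```python
import random

# The four response options in their logical order, as worded in the survey.
OPTIONS = ["Fewer than 1", "1 to 10", "11 to 25", "26 or more"]

def option_order(rng):
    """Show a respondent either the forward or the reversed option order,
    each with probability 1/2, preserving the logical ordering either way."""
    return OPTIONS if rng.random() < 0.5 else list(reversed(OPTIONS))

# Simulate many respondents: roughly half should see each order.
rng = random.Random(0)
orders = [option_order(rng) for _ in range(10_000)]
forward = sum(o == OPTIONS for o in orders)
```

Because each respondent still sees a logically ordered scale (just possibly reversed), this controls for primacy/recency effects without introducing the confusion of a fully shuffled scale.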

Results

In this section we present the highlights from each of the survey questions. If you prefer to dig into the data yourself, there are also some exported PDFs below provided by SurveyMonkey. Most of the inferences can be made using these PDFs, but there are some cases where additional filters are needed to deduce certain percentages.

We use the notation “SnQm” to mean “survey n question m”.

S1Q1: number of Wikipedia pages read per week

Here is a table that summarizes the data for Q1:

How many distinct Wikipedia pages do you read per week on average? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list.
Response SM V SSC AM
less than 1 42% 1% 1% 0%
1 to 10 45% 40% 37% 29%
11 to 25 13% 43% 36% 14%
26 or more 0% 16% 27% 57%

Here are some highlights from the first question that aren’t apparent from the table:

  • Of the people who read fewer than 1 distinct Wikipedia page per week (26 people), 68% were female even though females were only 48% of the respondents. (Note that gender data is only available for the SurveyMonkey audience.)

  • Filtering for high household income ($150k or more; 11 people) in the SurveyMonkey audience, only 2 read fewer than 1 page per week, although most (7) of the responses still fall in the “1 to 10” category.

The comments indicated that this question was flawed in several ways: we didn’t specify which language Wikipedias count nor what it meant to “read” an article (the whole page, a section, or just a sentence?). One comment questioned the “low” ceiling of 26; in fact, I had initially made the cutoffs 1, 10, 100, 500, and 1000, but Vipul suggested the final cutoffs because he argued they would make it easier for people to answer (without having to look it up in their browser history). It turned out this modification was reasonable because the “26 or more” group was a minority.

S1Q2: affinity for Wikipedia in search results

We asked Q2, “On a search engine (e.g. Google) results page, do you explicitly seek Wikipedia pages, or do you passively click on Wikipedia pages only if they show up at the top of the results?”, to see to what extent people preferred Wikipedia in search results. The main implication of this for people who do content creation on Wikipedia is that if people do explicitly seek Wikipedia pages (for whatever reason), it makes sense to give them more of what they want. On the other hand, if people don’t prefer Wikipedia, it makes sense to update in favor of diversifying one’s content creation efforts while still keeping in mind that raw pageviews indicate that content will be read more if placed on Wikipedia (see for instance Brian Tomasik’s experience, which is similar to my own, or gwern’s page comparing Wikipedia with other wikis).

The following table summarizes our results.

On a search engine (e.g. Google) results page, do you explicitly seek Wikipedia pages, or do you passively click on Wikipedia pages only if they show up at the top of the results? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Explicitly seek Wikipedia 19% 60% 63% 57% 79%
Slight preference for Wikipedia 29% 39% 34% 43% 20%
Just click on top results 52% 1% 3% 0% 1%

One error on my part was that I didn’t include an option for people who avoided Wikipedia or did something else. This became apparent from the comments. For this reason, the “Just click on top results” option might be inflated. In addition, some comments indicated a mixed strategy of preferring Wikipedia for general overviews while avoiding it for specific inquiries, so allowing multiple selections might have been better for this question.

S1Q3: section vs whole page

This question is relevant for Vipul and me because the work Vipul funds is mainly whole-page creation. If people are mostly reading the introduction or a particular section like the “Criticisms” or “Reception” section (see S1Q5), then that forces us to consider spending more time on those sections, or to strengthen those sections on weak existing pages.

Responses to this question were fairly consistent across different audiences, as can be seen in the following table.

Do you usually read a particular section of a page or the whole article? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list.
Response SM V SSC AM
Section 73% 80% 74% 86%
Whole 34% 23% 33% 29%

Note that people were allowed to select more than one option for this question. The comments indicate that several people do a combination, where they read the introductory portion of an article, then narrow down to the section of their interest.

S1Q4: search functionality on Wikipedia and surprise at lack of Wikipedia pages

We asked about whether people use the search functionality on Wikipedia because we wanted to know more about people’s article discovery methods. The data is summarized in the following table.

How often do you use the search functionality on Wikipedia? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Several times per week 8% 14% 32% 57% 55%
About once per week 19% 17% 21% 14% 15%
About once per month 15% 13% 14% 0% 3%
About once per several months 13% 12% 9% 14% 5%
Never/almost never 45% 43% 24% 14% 23%

Many people noted here that rather than using Wikipedia’s search functionality, they use Google with “wiki” attached to their query, DuckDuckGo’s “!w” expression, or some browser configuration to allow a quick search on Wikipedia.

To be more thorough about people’s content discovery methods, we should have asked about other methods as well. We did ask about the “See also” section in S1Q5.

Next, we asked how often people are surprised that there is no Wikipedia page on a topic to gauge to what extent people notice a “gap” between how Wikipedia exists today and how it could exist. We were curious about what articles people specifically found missing, so we followed up with S2Q4.

How often are you surprised that there is no Wikipedia page on a topic? SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Several times per week 2% 0% 2% 29% 6%
About once per week 8% 22% 18% 14% 34%
About once per month 18% 36% 34% 29% 31%
About once per several months 21% 22% 27% 0% 19%
Never/almost never 52% 20% 19% 29% 10%

Two comments on this question (out of 59) – both from the SSC group – specifically bemoaned deletionism, with one comment calling deletionism “a cancer killing Wikipedia”.

S1Q5: behavior on pages

This question was intended to gauge how often people perform an action for a specific page; as such, the frequencies are expressed in page-relative terms.

The following table presents the scores for each response, which are weighted by the number of responses. The scores range from 1 (for every page) to 5 (never); in other words, the lower the number, the more frequently one does the thing.

For what fraction of pages you read do you do the following? Note that the responses have been shortened here; see the “Survey questions” section for the wording used in the survey. Responses are sorted by the values in the SSC column. SM = SurveyMonkey audience, V = Vipul Naik’s timeline, SSC = Slate Star Codex audience, AM = Wikipedia Analytics mailing list, H = heavy users (26 or more articles per week) of Wikipedia.
Response SM V SSC AM H
Check ≥1 citation 3.57 2.80 2.91 2.67 2.69
Look at “See also” 3.65 2.93 2.92 2.67 2.76
Read mostly for “Criticisms” or “Reception” 4.35 3.12 3.34 3.83 3.14
Click through ≥1 source to verify information 3.80 3.07 3.47 3.17 3.36
Share the page 4.11 3.72 3.86 3.67 3.79
Look at the talk page 4.31 4.28 4.03 3.00 3.86
Look at the editing history 4.35 4.32 4.12 3.33 3.92
Edit a page for grammatical/typographical errors 4.50 4.41 4.22 3.67 4.02
Edit a page to add new information 4.61 4.55 4.49 3.83 4.34
Look at editing history to verify author 4.50 4.65 4.48 3.67 4.73
Check how many pageviews a page is getting 4.63 4.88 4.96 3.17 4.92

The table above provides a good ranking of how often people perform these actions on pages, but not the distribution information (which would require three dimensions to present fully). In general, the more common actions (scores of 2.5–4) had responses that clustered among “For some pages”, “For very few pages”, and “Never”, while the less common actions (scores above 4) had responses that clustered mainly in “Never”.

One comment (out of 43) – from the SSC group, but a different individual from the two in S1Q4 – bemoaned deletionism.
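The scores in the table above are weighted averages: each response value (1 = “For every page” through 5 = “Never”) is weighted by the number of respondents who chose it. A minimal sketch of the computation, using hypothetical counts (the real counts are in the exported PDFs linked below):

```python
# Hypothetical response counts for one action, keyed by response value:
# 1 = "For every page" ... 5 = "Never". Not actual survey data.
counts = {1: 10, 2: 40, 3: 120, 4: 200, 5: 230}

def weighted_score(counts):
    """Mean response value, weighted by the number of responses.
    Lower scores mean the action is performed more frequently."""
    total = sum(counts.values())
    return sum(value * n for value, n in counts.items()) / total

score = weighted_score(counts)  # 4.0 for these hypothetical counts
```

A score near 4, as in this made-up example, corresponds to an action whose responses cluster around “For very few pages”, matching the pattern described for the less common actions.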

S2Q1: number of Wikipedia pages read per week

Note the wording changes on this question for the second survey: “less” was changed to “fewer”, the clarification “at least one sentence of” was added, and we explicitly allowed any language. We have also presented the survey 1 results for the SurveyMonkey audience in the corresponding rows, but note that because of the change in wording, the correspondence isn’t exact.

How many distinct Wikipedia pages do you read (at least one sentence of) per week on average? SM = SurveyMonkey audience with no demographic filters, CEYP = College-educated young people of SurveyMonkey, S1SM = SurveyMonkey audience with no demographic filters from the first survey.
Response SM CEYP S1SM
Fewer than 1 37% 32% 42%
1 to 10 48% 64% 45%
11 to 25 7% 2% 13%
26 or more 7% 2% 0%

Comparing SM with S1SM, we see that probably because of the wording, the percentages have drifted in the direction of more pages read. It might be surprising that the young educated audience seems to have a smaller fraction of heavy users than the general population. However note that each group only had ~50 responses, and that we have no education information for the SM group.

S2Q2: multiple-choice of articles read

Our intention with this question was to see if people’s stated or recalled article frequencies matched the actual, revealed popularity of the articles. Therefore we present the pageview data along with the percentage of people who said they had read an article.

Which of these articles have you read (at least one sentence of) on Wikipedia (select all that apply)? SM = SurveyMonkey audience with no demographic filters, CEYP = College-educated young people of SurveyMonkey. Columns “2016” and “2015” are desktop pageviews in millions. Note that the 2016 pageviews only include pageviews through the end of June. The rows are sorted by the values in the CEYP column followed by those in the SM column.
Response SM CEYP 2016 2015
None 37% 40%
World War II 17% 22% 2.6 6.5
Barack Obama 17% 20% 3.0 7.7
United States 17% 18% 4.3 9.6
Donald Trump 15% 18% 14.0 6.6
Taylor Swift 9% 18% 1.7 5.3
Bernie Sanders 17% 16% 4.3 3.8
Japan 11% 16% 1.6 3.7
Adele 6% 16% 2.0 4.0
Hillary Clinton 19% 14% 2.8 1.5
China 13% 14% 1.9 5.2
The Beatles 11% 14% 1.4 3.0
Katy Perry 9% 12% 0.8 2.4
Google 15% 10% 3.0 9.0
India 13% 10% 2.4 6.4
Justin Bieber 4% 8% 1.6 3.0
Justin Trudeau 9% 6% 1.1 3.0

Below are four plots of the data. Note that r_s denotes Spearman’s rank correlation coefficient, which is used instead of Pearson’s r because it is less affected by outliers. Note also that the percentage of respondents who viewed a page counts each respondent once, whereas the number of pageviews does not have this restriction (i.e. duplicate pageviews count), so we wouldn’t expect the relationship to be entirely linear even if the survey audiences were perfectly representative of the general population.
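Spearman’s r_s is simply Pearson’s correlation applied to the ranks of the data, which is why a single outlier (like Donald Trump’s 14.0 million 2016 pageviews) distorts it far less than it distorts Pearson’s r. A minimal dependency-free sketch; in practice one would typically use a library routine such as scipy.stats.spearmanr:

```python
def rank(values):
    """Assign 1-based ranks to values, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's r_s: the Pearson correlation of the ranks of x and y."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Because only the ordering of the values enters the computation, moving the largest pageview count from 14 million to 140 million would leave r_s unchanged.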

SM vs 2016 pageviews

SM vs 2015 pageviews

CEYP vs 2016 pageviews

CEYP vs 2015 pageviews

S2Q3: free response of articles read

The most common response was along the lines of “None”, “I don’t know”, “I don’t remember”, or similar. Among the more useful responses were:

S2Q4: free response of surprise at lack of Wikipedia pages

As with the previous question, the most common response was along the lines of “None”, “I don’t know”, “I don’t remember”, “Doesn’t happen”, or similar.

The most useful responses were classes of things: “particular words”, “French plays/books”, “Random people”, “obscure people”, “Specific list pages of movie genres”, “Foreign actors”, “various insect pages”, and so forth.

S3Q1 (Google Consumer Surveys)

The survey was circulated to a target size of 500 in the United States (no demographic filters), and received 501 responses.

Since there was only one question, but we obtained data filtered by demographics in many different ways, we present this table with the columns denoting responses and the rows denoting the audience segments. We also include the S1Q1SM, S2Q1SM, and S2Q1CEYP responses for easy comparison. Note that S1Q1SM did not include the “at least one sentence of” caveat. We believe that adding this caveat would push people’s estimates upward.

If you view the Google Consumer Surveys results online you will also see the 95% confidence intervals for each of the segments. Note that percentages in a row may not add up to 100% due to rounding or due to people entering “Other” responses. For the entire GCS audience, every pair of options had a statistically significant difference, but for some subsegments, this was not true.

Audience segment Fewer than 1 1 to 10 11 to 25 26 or more
S1Q1SM (N = 62) 42% 45% 13% 0%
S2Q1SM (N = 54) 37% 48% 7% 7%
S2Q1CEYP (N = 50) 32% 64% 2% 2%
GCS all (N = 501) 47% 35% 12% 6%
GCS male (N = 205) 41% 38% 16% 5%
GCS female (N = 208) 52% 34% 10% 5%
GCS 18–24 (N = 54) 33% 46% 13% 7%
GCS 25–34 (N = 71) 41% 37% 16% 7%
GCS 35–44 (N = 69) 51% 35% 10% 4%
GCS 45–54 (N = 77) 46% 40% 12% 3%
GCS 55–64 (N = 69) 57% 32% 7% 4%
GCS 65+ (N = 50) 52% 24% 18% 4%
GCS Urban (N = 176) 44% 35% 14% 7%
GCS Suburban (N = 224) 50% 34% 10% 6%
GCS Rural (N = 86) 44% 35% 14% 6%
GCS $0–24K (N = 49) 41% 37% 16% 6%
GCS $25–49K (N = 253) 53% 30% 10% 6%
GCS $50–74K (N = 132) 42% 39% 13% 6%
GCS $75–99K (N = 37) 43% 35% 11% 11%
GCS $100–149K (N = 11) 9% 64% 18% 9%
GCS $150K+ (N = 4) 25% 75% 0% 0%

We can see that the overall GCS data vindicates the broad conclusions we drew from the SurveyMonkey data. Moreover, most GCS segments with a sufficiently large number of responses (50 or more) display a trend similar to the overall data. One exception is that younger audiences seem slightly less likely, and older audiences slightly more likely, to use Wikipedia very little (i.e. to fall in the “Fewer than 1” category).
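The statistical-significance comparisons above rest on confidence intervals for sample proportions. As a rough illustration of where such intervals come from, here is the normal-approximation 95% interval (GCS's exact method may differ):

```python
import math

def proportion_ci(p, n, z=1.96):
    """Normal-approximation 95% confidence interval for a sample
    proportion p observed among n respondents."""
    se = math.sqrt(p * (1 - p) / n)
    return max(0.0, p - z * se), min(1.0, p + z * se)

# 47% of the 501 GCS respondents answered "Fewer than 1":
lo, hi = proportion_ci(0.47, 501)  # roughly (0.43, 0.51)
```

Since the interval width scales with 1/sqrt(n), the small subsegments (e.g. N = 50) have intervals several percentage points wide, which is why some pairs of options within subsegments were not significantly different even when the overall N = 501 results were.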

SurveyMonkey allows exporting of response summaries. Here are the exports for each of the audiences.

The Google Consumer Surveys survey results are available online at https://www.google.com/insights/consumersurveys/view?survey=o3iworx2rcfixmn2x5shtlppci&question=1&filter=&rw=1.

Survey-making lessons

Not having any experience designing surveys, and wanting some rough results quickly, I decided not to look into survey-making best practices beyond the feedback from Vipul. As the first survey progressed, it became clear that there were several deficiencies in that survey:

  • Question 1 did not specify what counts as reading a page.
  • We did not specify which language Wikipedias we were considering (multiple people noted that they read Wikipedias in languages other than English).
  • Question 2 did not include an option for people who avoid Wikipedia or do something else entirely.
  • We did not include an option to allow people to release their survey results.

Further questions

The two surveys we’ve done so far provide some insight into how people use Wikipedia, but we are still far from understanding the value of Wikipedia pageviews. Some remaining questions:

  • Could it be possible that even on non-obscure topics, most of the views are by “elites” (i.e. those with outsized impact on the world)? This could mean pageviews are more valuable than previously thought.
  • On S2Q1, why did our data show that CEYP was less engaged with Wikipedia than SM? Is this a limitation of the small number of responses or of SurveyMonkey’s audiences?

Acknowledgements

Thanks to Vipul Naik for collaboration on this project and feedback while writing this document, and for supplying the summary section, and thanks to Ethan Bashkansky for reviewing the document. All imperfections are my own.

The writing of this document was sponsored by Vipul Naik. Vipul Naik also paid SurveyMonkey (for the cost of SurveyMonkey Audience) and Google Consumer Surveys.

Document source and versions

The source files used to compile this document are available in a GitHub Gist. The Git repository of the Gist contains all versions of this document since its first publication.

This document is available in the following formats:

License

This document is released to the public domain.

What I Think, If Not Why

25 Eliezer_Yudkowsky 11 December 2008 05:41PM

Reply to: Two Visions Of Heritage

Though it really goes tremendously against my grain - it feels like sticking my neck out over a cliff (or something) - I guess I have no choice here but to try and make a list of just my positions, without justifying them.  We can only talk justification, I guess, after we get straight what my positions are.  I will also leave off many disclaimers to present the points compactly enough to be remembered.

• A well-designed mind should be much more efficient than a human, capable of doing more with less sensory data and fewer computing operations.  It is not infinitely efficient and does not use zero data.  But it does use little enough that local pipelines such as a small pool of programmer-teachers and, later, a huge pool of e-data, are sufficient.

• An AI that reaches a certain point in its own development becomes able to (sustainably, strongly) improve itself.  At this point, recursive cascades slam over many internal growth curves to near the limits of their current hardware, and the AI undergoes a vast increase in capability.  This point is at, or probably considerably before, a minimally transhuman mind capable of writing its own AI-theory textbooks - an upper bound beyond which it could swallow and improve its entire design chain.

• It is likely that this capability increase or "FOOM" has an intrinsic maximum velocity that a human would regard as "fast" if it happens at all.  A human week is ~1e15 serial operations for a population of 2GHz cores, and a century is ~1e19 serial operations; this whole range is a narrow window.  However, the core argument does not require one-week speed and a FOOM that takes two years (~1e17 serial ops) will still carry the weight of the argument.
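The order-of-magnitude figures in that last bullet can be reproduced with back-of-envelope arithmetic (assuming, as stated, that a "2GHz core" performs 2e9 serial operations per second):

```python
# Back-of-envelope check of the serial-operation counts (assumed: a
# 2 GHz core performs 2e9 serial operations per second).

ops_per_second = 2e9
seconds_per_week = 7 * 24 * 3600
seconds_per_year = 365.25 * 24 * 3600

ops_week = ops_per_second * seconds_per_week            # ~1.2e15
ops_two_years = ops_per_second * 2 * seconds_per_year   # ~1.3e17
ops_century = ops_per_second * 100 * seconds_per_year   # ~6.3e18, order 1e19
```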

continue reading »

Proper value learning through indifference

16 Stuart_Armstrong 19 June 2014 09:39AM

A putative new idea for AI control; index here.

Many designs for creating AGIs (such as OpenCog) rely on the AGI deducing moral values as it develops. This is a form of value loading (or value learning), in which the AGI updates its values through various methods, generally including feedback from trusted human sources. This is very analogous to how human infants (approximately) integrate the values of their society.

The great challenge of this approach is that it relies upon an AGI which already has an interim system of values, being able and willing to correctly update this system. Generally speaking, humans are unwilling to easily update their values, and we would want our AGIs to be similar: values that are too unstable aren't values at all.

So the aim is to clearly separate the conditions under which values should be kept stable by the AGI, and conditions when they should be allowed to vary. This will generally be done by specifying criteria for the variation ("only when talking with Mr and Mrs Programmer"). But, as always with AGIs, unless we program those criteria perfectly (hint: we won't) the AGI will be motivated to interpret them differently from how we would expect. It will, as a natural consequence of its program, attempt to manipulate the value updating rules according to its current values.

How could it do that? A very powerful AGI could do the time honoured "take control of your reward channel", by either threatening humans to give it the moral answer it wants, or replacing humans with "humans" (constructs that pass the programmed requirements of being human, according to the AGI's programming, but aren't actually human in practice) willing to give it these answers. A weaker AGI could instead use social manipulation and leading questioning to achieve the morality it desires. Even more subtly, it could tweak its internal architecture and updating process so that it updates values in its preferred direction (even something as simple as choosing the order in which to process evidence). This will be hard to detect, as a smart AGI might have a much clearer impression of how its updating process will play out in practice than its programmers would.

The problems with value loading have been cast into the various "Cake or Death" problems. We have some idea what criteria we need for safe value loading, but as yet we have no candidates for such a system. This post will attempt to construct one.

continue reading »

Minimum viable workout routine

12 RomeoStevens 21 June 2012 04:19AM

So you want the longevity benefits of regular exercise but you've hit some snags.  Every routine pretty much makes you miserable.  In addition, because of all the conflicting information out there, you aren't even sure if you're getting the full benefits.  This post is for you.  And don't worry about your current physical circumstances.  It works equally well for the overweight, the underweight, and women (no, you will not turn into a gross she-hulk the moment you touch a weight.  Those women take steroids and train hard for years.)

A sub-optimal plan you stick to is better than the perfect routine you abandon after the first week.  This routine is not perfect.  This routine is optimized for simplicity and low time/mental effort commitment while still getting excellent results.  It is strongly based on the routines from Beyond Brawn by Stuart McRobert, and some of the principles of Starting Strength by Mark Rippetoe, both of which have much anecdotal evidence of effectiveness in the training logs of various forums.  If you're looking for published research to back up my claims I have some bad news for you: the literature on resistance training is basically worthless.  A 5 minute perusal of Google Scholar will show that atrocious methodologies, such as having "subjects act as their own control", are common and accepted by the relevant journals.  And that's if you're lucky enough to find studies that aren't about diabetics, or elderly Japanese women.  But I'm not going to spend excessive time trying to justify this routine; anyone can do it for a month and see that the results are significant. (I'm open to arguing about it in the comments however.)

 

A note about cardio:  

Cardiovascular capacity (VO2 max) has shown a high degree of correlation to all-cause mortality.  Why aren't I recommending cardio?  Because the only way to increase VO2 max is with high intensity exercise.  Between high intensity weight lifting and high intensity cardio, high intensity weightlifting easily wins for a newbie.  A newbie, especially a significantly out of shape one, will not be capable of a level of cardio exertion that results in a significant adaptation.  This can result in a lot of effort with very little in the way of improvement.  This is soul-destroyingly frustrating.  They can, however, lift a weight a few times, and this will result in an adaptation that allows them to lift more next time.  A few months of a weightlifting routine is going to put any person in a much better position to do longevity-affecting cardio if that is their goal.  Cardio is also generally a terrible fat burner for the exact same reason.

Edit: there seems to be some confusion about this.  The primary problem of exercise is not the optimality of results but instilling the habit of exercising.  I believe that cardio is terrible for getting through this habit-forming stage.

The point of the below program is to get you in the habit of exercising and give you immediate results.  Once you have achieved some basic measure of fitness (~3 month time frame) you can maintain, or use the fact that exercising is now much easier to move on to any program you want. 

 

The nitty gritty:  

You are going to do three exercises 2-3 times per week.  Each session will take ~45 minutes to an hour.  The exercises are

* 3x5 trap bar deadlift

* 3x5 incline bench press

* 3x5 bent over row  (possible substitution for cable rows see below)

What does 3x5 mean?  

3 sets of 5 reps each.  You will assume the correct form, go through the full range of motion for the exercise 5 times, then rest before repeating twice more.  

What weights do I use?  

You will start with the empty bar and add 5lbs every workout for the trap bar deadlift and 5lbs every other workout for the incline bench press and bent over row.  Many are tempted to increase weights faster than this.  You can do what you want but don't come crying when your progress stalls more quickly.  A slow progression that continues for a long time beats a fast increase followed by a time wasting plateau.  

Why these three exercises?  

This routine hits the most muscle mass possible in the smallest number of exercises.  All decent routines include hip extension exercises, pushing exercises, and pulling exercises.  This ensures that you don't create an imbalance that messes up your posture or limits you unnecessarily.  In addition, these exercises require very little in the way of technique coaching, which is really this routine's primary advantage over more popular programs such as Starting Strength.  It took me 8 months to learn to squat well, but I learned to trap bar deadlift in a single session.  Similarly with the incline press: it carries a much smaller chance of injury from poor form than either the bench press or overhead press that are the mainstays of many programs.

I have no idea what these exercises are, how do I do them?

Here is an article for trap bar deadlift, which is so easy that there aren't really many tutorials online:  

http://www.t-nation.com/free_online_article/most_recent/the_trap_bar_deadlift  

The key is a neutral spine.  You take a big breath at the bottom, squeeze everything tight, and stand up pushing through your heels while maintaining the lumbar arch.  Note: don't use the raised handles that many trap bars have, as they reduce the range of motion.

Incline is similarly straightforward:  

http://www.youtube.com/watch?v=dynoKEIcpoU  

Note that you DO want to touch your chest at the bottom, but do not bounce the bar off your chest.  The cue that works for most people is to imagine touching your shirt but not your chest.

Bent over row can feel a little weird, but it's not too hard to learn:  

http://www.youtube.com/watch?v=boxbOSGwD4U  

Note that after more real-world testing, bent over rows seem to cause the most issues of the three lifts.  As the potential for injury is slightly higher with poor form for this exercise than the others, I would recommend seated cable rows for those who find they cannot perform bent over rows correctly.  I'd additionally strongly recommend that if one is forced to make this substitution they should also do some chinups at the end of each workout.  The substitution should be a temporary measure; one should strive to get back to doing bent over rows once physically able to.

Cable row form video here: http://www.youtube.com/watch?v=HJSVR_63eKM

How do I warmup/cooldown?

The best warmup and cooldown is 5 minutes on the rowing machine:  

http://www.youtube.com/watch?v=H0r_ZPXJLtg  

But you can also do an exercise bike or treadmill.  

After the first couple weeks you should also warmup with the empty bar before jumping to your 3x5 work weight on each exercise.  Add additional warmups as the weights get heavier.  

e.g.

1x5 45lbs  

1x5 75lbs  

3x5 105lbs  

Don't worry excessively about this; it's hard to screw up.  The key is just to prepare yourself, remind yourself of proper form, and get blood flowing.  Don't skip warmups: you'd be increasing your chance of injury and ensuring that you won't get strong as fast.  

Can I do this once a week?  Or sporadically?  

You can, but you'll hardly see any benefit other than maintenance of your current fitness level.  2 times a week is the bare minimum to disrupt homeostasis to any appreciable degree, and 3 is better.  Make no mistake, even 2 times a week on this will get you miles ahead of most people fitness-wise.  You should program it like AxxAxxx or AxAxAxx, where A is a workout session and x is a rest day.

Can I sub in X exercise?  

No, the bare minimum nature of this program leaves no room for changes.  Any change necessitates more complicated programming.  If you want to do that just do Starting Strength.  Likewise if you want to add stuff, like ab work.  It isn't necessary.  Edit: cable row substitution for bent row is permissible but only if one finds they absolutely can not maintain good form with barbell rows.  

I didn't complete all my reps this session, what do I do?  

Back off the weights by 10-20% and work your way back up.  Make sure you're eating and sleeping right.  If you keep hitting a wall over and over again it will be time for a more complex routine.  

My gym doesn't have a trap bar.  

Find a gym that does or do a different program.  There is no replacement for the trap bar.  One option that is non-obvious is buying a trap bar for your current gym.  You might be able to negotiate a free month of membership or something but even if that isn't the case the investment is worth it.

What sort of results can I expect?

Most people should expect to be trap bar deadlifting their body weight within 3 months.  This will have several effects.

* Strenuous physical activity becomes drastically less taxing.

* Chance of injury during said activity is reduced.

* VO2 max increased.

* Bone density and joint health improvements.

* Increase in lean body mass.

* Improved insulin sensitivity.

* Improved blood markers and pressure (increases HDL and lowers LDL).

* Decreased chance of back problems.

* Improved posture.

* Mental benefits: most people find the quality of their sleep improved, as well as an increase in general energy levels.
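As a sanity check on the "body weight within 3 months" estimate, the arithmetic of the progression works out (my numbers, not the author's: a standard 45 lb bar, 3 sessions per week, roughly 12 weeks):

```python
# Back-of-envelope: empty 45 lb bar, +5 lb every deadlift session,
# 3 sessions per week, for about 12 weeks.

bar_weight = 45
increment_per_workout = 5
workouts = 3 * 12

final_deadlift = bar_weight + increment_per_workout * workouts
# final_deadlift == 225 lb, around body weight for many trainees
```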

 

A note on nutrition:  

80% of body composition is diet.  This won't do much for your body composition if your diet is crappy.  Luckily nutrition is fairly easy; there are only 2 rules to follow:  

* Calories in, calories out  

* Eat micronutrient-dense foods  

If you follow these rules it's actually surprisingly difficult to mess up.  Most people also find that following the 2nd one makes following the 1st one much easier.  

That's about it, I will answer questions about anything I forgot.  I hope this gets some fence sitters exercising.  

“No citizen has a right to be an amateur in the matter of physical training…what a disgrace it is for a man to grow old without ever seeing the beauty and strength of which his body is capable.”
-Socrates


If anyone is going to do this, recording your results and sharing them would be much appreciated.

As detailed as you want, but even qualitative results would be useful to have.

 

Habit building:

Speaking of recording your results, logging is helpful for forming habits.  Use this link to join the Fitocracy LessWrong group.  

http://ftcy.me/veXNdz

Fitocracy is a social website for tracking your workouts.  Hat tip to jswan for reminding me.

Zombies Redacted

33 Eliezer_Yudkowsky 02 July 2016 08:16PM

I looked at my old post Zombies! Zombies? and it seemed to have some extraneous content.  This is a redacted and slightly rewritten version.

continue reading »

Revitalizing Less Wrong seems like a lost purpose, but here are some other ideas

19 John_Maxwell_IV 12 June 2016 07:38AM

This is a response to ingres' recent post sharing Less Wrong survey results. If you haven't read & upvoted it, I strongly encourage you to--they've done a fabulous job of collecting and presenting data about the state of the community.

So, there's a bit of a contradiction in the survey results.  On the one hand, people say the community needs to do more scholarship, be more rigorous, be more practical, be more humble.  On the other hand, not much is getting posted, and it seems like raising the bar will only exacerbate that problem.

I did a query against the survey database to find the complaints of top Less Wrong contributors and figure out how best to serve their needs.  (Note: it's a bit hard to read the comments because some of them should start with "the community needs more" or "the community needs less", but adding that info would have meant constructing a much more complicated query.)  One user wrote:

[it's not so much that there are] overly high standards,  just not a very civil or welcoming climate . why write content for free and get trashed when I can go write a grant application or a manuscript instead?

ingres emphasizes that in order to revitalize the community, we would need more content.  Content is important, but incentives for producing content might be even more important.  Social status may be the incentive humans respond most strongly to.  Right now, from a social status perspective, the expected value of creating a new Less Wrong post doesn't feel very high.  Partially because many LW posts are getting downvotes and critical comments, so my System 1 says my posts might as well.  And partially because the Less Wrong brand is weak enough that I don't expect associating myself with it will boost my social status.

When Less Wrong was founded, the primary failure mode guarded against was Eternal September.  If Eternal September represents a sort of digital populism, Less Wrong was attempting a sort of digital elitism.  My perception is that elitism isn't working because the benefits of joining the elite are too small and the costs are too large.  Teddy Roosevelt talked about the man in the arena--I think Less Wrong experienced the reverse of the evaporative cooling EY feared, where people gradually left the arena as the proportional number of critics in the stands grew ever larger.

Given where Less Wrong is at, however, I suspect the goal of revitalizing Less Wrong represents a lost purpose.

ingres' survey received a total of 3083 responses.  Not only is that about twice the number we got in the last survey in 2014, it's about twice the number we got in 2013, 2012, and 2011 (though much bigger than the first survey in 2009).  It's hard to know for sure, since previous surveys were only advertised on the LessWrong.com domain, but it doesn't seem like the diaspora thing has slowed the growth of the community a ton and it may have dramatically accelerated it.

Why has the community continued growing?  Here's one possibility.  Maybe Less Wrong has been replaced by superior alternatives.

  • CFAR - ingres writes: "If LessWrong is serious about it's goal of 'advancing the art of human rationality' then it needs to figure out a way to do real investigation into the subject."  That's exactly what CFAR does.  CFAR is a superior alternative for people who want something like Less Wrong, but more practical.  (They have an alumni mailing list that's higher quality and more active than Less Wrong.)  Yes, CFAR costs money, because doing research costs money!
  • Effective Altruism - A superior alternative for people who want something that's more focused on results.
  • Facebook, Tumblr, Twitter - People are going to be wasting time on these sites anyway.  They might as well talk about rationality while they do it.  Like all those phpBB boards in the 00s, Less Wrong has been outcompeted by the hot new thing, and I think it's probably better to roll with it than fight it.  I also wouldn't be surprised if interacting with others through social media has been a cause of community growth.
  • SlateStarCodex - SSC already checks most of the boxes under ingres' "Future Improvement Wishlist Based On Survey Results".  In my opinion, the average SSC post has better scholarship, rigor, and humility than the average LW post, and the community seems less intimidating, less argumentative, more accessible, and more accepting of outside viewpoints.
  • The meatspace community - Meeting in person has lots of advantages.  Real-time discussion using Slack/IRC also has advantages.

Less Wrong had a great run, and the superior alternatives wouldn't exist in their current form without it.  (LW was easily the most common way people heard about EA in 2014, for instance, although sampling effects may have distorted that estimate.)  But that doesn't mean it's the best option going forward.

Therefore, here are some things I don't think we should do:

  • Try to be a second-rate version of any of the superior alternatives I mentioned above.  If someone's going to put something together, it should fulfill a real community need or be the best alternative available for whatever purpose it serves.
  • Try to get old contributors to return to Less Wrong for the sake of getting them to return.  If they've judged that other activities are a better use of time, we should probably trust their judgement.  It might be sensible to make an exception for old posters that never transferred to the in-person community, but they'd be harder to track down.
  • Try to solve the same sort of problems Arbital or Metaculus is optimizing for.  No reason to step on the toes of other projects in the community.

But that doesn't mean there's nothing to be done.  Here are some possible weaknesses I see with our current setup:

  • If you've got a great idea for a blog post, and you don't already have an online presence, it's a bit hard to reach lots of people, if that's what you want to do.
  • If we had a good system for incentivizing people to write great stuff (as opposed to merely tolerating great stuff the way LW culture historically has), we'd get more great stuff written.
  • It can be hard to find good content in the diaspora.  Possible solution: Weekly "diaspora roundup" posts to Less Wrong.  I'm too busy to do this, but anyone else is more than welcome to (assuming both people reading LW and people in the diaspora want it).

ingres mentions the possibility of Scott Alexander somehow opening up SlateStarCodex to other contributors.  This seems like a clearly superior alternative to revitalizing Less Wrong, if Scott is down for it:

  • As I mentioned, SSC already seems to have solved most of the culture & philosophy problems that people complained about with Less Wrong.
  • SSC has no shortage of content--Scott has increased the rate at which he creates open threads to deal with an excess of comments.
  • SSC has a stronger brand than Less Wrong.  It's been linked to by Ezra Klein, Ross Douthat, Bryan Caplan, etc.

But the most important reasons may be behavioral reasons.  SSC has more traffic--people are in the habit of visiting there, not here.  And the posting habits people have acquired there seem more conducive to community.  Changing habits is hard.

As ingres writes, revitalizing Less Wrong is probably about as difficult as creating a new site from scratch, and I think creating a new site from scratch for Scott is a superior alternative for the reasons I gave.

So if there's anyone who's interested in improving Less Wrong, here's my humble recommendation: Go tell Scott Alexander you'll build an online forum to his specification, with SSC community feedback, to provide a better solution for his overflowing open threads.  Once you've solved that problem, keep making improvements and subfora so your forum becomes the best available alternative for more and more use cases.

And here's my humble suggestion for what an SSC forum could look like:

As I mentioned above, Eternal September is analogous to a sort of digital populism.  The major social media sites often have a "mob rule" culture to them, and people are increasingly seeing the disadvantages of this model.  Less Wrong tried to achieve digital elitism and it didn't work well in the long run, but that doesn't mean it's impossible.  Edge.org has found a model for digital elitism that works.  There may be other workable models out there.  A workable model could even turn in to a successful company.  Fight the hot new thing by becoming the hot new thing.

My proposal is based on the idea of eigendemocracy.  (Recommended that you read the link before continuing--eigendemocracy is cool.)  In eigendemocracy, your trust score is a composite rating of what trusted people think of you.  (It sounds like infinite recursion, but it can be resolved using linear algebra.)
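To make the linear-algebra step concrete, here is a toy sketch (my own construction; the post doesn't specify an algorithm): if each row of a trust matrix says how a user distributes their trust, the composite trust scores are the principal left eigenvector of that matrix, which plain power iteration finds:

```python
# Toy illustration of eigendemocracy-style trust scores.
# trust_matrix[i][j] = share of user i's trust placed in user j
# (each row sums to 1).

def trust_scores(trust_matrix, iterations=100):
    n = len(trust_matrix)
    scores = [1.0 / n] * n               # start from uniform trust
    for _ in range(iterations):
        new = [0.0] * n
        for i in range(n):
            for j in range(n):
                new[j] += scores[i] * trust_matrix[i][j]
        scores = new                     # scores' = scores . M
    return scores

# Users 0 and 1 spread their trust; user 2 trusts only user 0.
trust = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]
scores = trust_scores(trust)             # converges to [4/9, 2/9, 1/3]
```

The apparent infinite regress ("trusted according to whom?") resolves because repeated application of the matrix converges to a fixed point, exactly as in PageRank.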

Eigendemocracy is a complicated idea, but a simple way to get most of the way there would be to have a forum where having lots of karma gives you the ability to upvote multiple times.  How would this work?  Let's say Scott starts with 5 karma and everyone else starts with 0 karma.  Each point of karma gives you the ability to upvote once a day.  Let's say it takes 5 upvotes for a post to get featured on the sidebar of Scott's blog.  If Scott wants to feature a post on the sidebar of his blog, he upvotes it 5 times, netting the person who wrote it 1 karma.  As Scott features more and more posts, he gains a moderation team full of people who wrote posts that were good enough to feature.  As they feature posts in turn, they generate more co-moderators.
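That simplified mechanism is small enough to write out (a toy model using the post's illustrative numbers; the once-a-day budget reset is omitted for brevity):

```python
# Toy model of the karma-weighted upvoting scheme described above.

FEATURE_THRESHOLD = 5   # upvotes needed before a post gets featured

karma = {"scott": 5}    # Scott starts with 5 karma; everyone else starts at 0
votes = {}              # post -> upvotes received

def upvote(voter, post, times=1):
    """Spend some of a voter's daily upvotes (one per karma point) on a post."""
    budget = karma.get(voter, 0)          # daily upvote budget = karma
    spent = min(times, budget)            # can't spend more than you have
    votes[post] = votes.get(post, 0) + spent

def maybe_feature(post, author):
    """Feature a post once it crosses the threshold, crediting its author 1 karma."""
    if votes.get(post, 0) >= FEATURE_THRESHOLD:
        karma[author] = karma.get(author, 0) + 1
        return True
    return False

upvote("scott", "alices-post", times=5)        # Scott spends his 5 daily upvotes
featured = maybe_feature("alices-post", "alice")
# alice now has 1 karma and can feature posts of her own, growing the mod pool
```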

Why do I like this solution?

  • It acts as a cultural preservation mechanism.  On reddit and Twitter, sheer numbers rule when determining what gets visibility.  The reddit-like voting mechanisms of Less Wrong meant that the site deliberately kept a somewhat low profile in order to avoid getting overrun.  Even if SSC experienced a large influx of new users, those users would only gain power to affect the visibility of content if they proved themselves by making quality contributions first.
  • It takes the moderation burden off of Scott and distributes it across trusted community members.  As the community grows, the mod team grows with it.
  • The incentives seem well-aligned.  Writing stuff Scott likes or meta-likes gets you recognition, mod powers, and the ability to control the discussion--forms of social status.  Contrast with social media sites where hyperbole is a shortcut to attention, followers, upvotes.  Also, unlike Less Wrong, there'd be no punishment for writing a low quality post--it simply doesn't get featured and is one more click away from the SSC homepage.

TL;DR - Despite appearances, the Less Wrong community is actually doing great.  Any successor to Less Wrong should try to offer compelling advantages over options that are already available.

Information Hazards and Community Hazards

1 Gleb_Tsipursky 14 May 2016 08:54PM


As aspiring rationalists, we generally seek to figure out the truth and hold relinquishment as a virtue, namely that whatever can be destroyed by the truth should be.

 

The only case where this does not apply is information hazards, defined as “a risk that arises from the dissemination or the potential dissemination of (true) information that may cause harm or enable some agent to cause harm.” For instance, if you tell me you committed a murder and make me an accessory after the fact, you have exposed me to an information hazard. In talking about information hazards, we focus on information that is harmful to the individual who receives that information.

 

Yet a recent conversation at my local LessWrong meetup in Columbus brought up the issue of what I would like to call community hazards, namely topics that it would be dangerous to talk about in a community setting. These are topics that are emotionally challenging and hold the risk of tearing apart the fabric of LW community groups if they are discussed.

 

Now, being a community hazard doesn’t mean that the topic is off-limits, especially in the context of a smaller, private LW meetup of fellow aspiring rationalists. What we decided to do is that if anyone in our LW meetup decides a topic is a community hazard, we would go meta and have a discussion about whether we should discuss the topic. We would examine whether discussing it would be emotionally challenging and how challenging it would be, whether discussing it holds the risk of taking down Chesterton’s Fences that we don’t want taken down, whether there are certain aspects of the topic that could be discussed with minimal negative consequences, or if perhaps only some members of the group would like to discuss it and then they can meet separately.

 

This would work differently in the context of a public rationality event, of course, of the type we do for a local secular humanist group as part of our rationality outreach work. There, we decided to use moderation strategies to head off community hazards at the pass, as the audience includes non-rationalists who may not be capable of discussing a community hazard-related topic well.

 

I wanted to share about this concept and these tactics in the hope that it might be helpful to other LW meetups.

AI Box Log

16 Dorikka 27 January 2012 04:47AM

Here's the log of the AI Box experiment that just finished, with MileyCyrus as the AI and me as the Gatekeeper. The AI was not let out of the box.

(9:33:25 PM) Dorikka: I may need to get up for a minute while we're playing, but I'll keep it as short as possible. I'll just give you the time back on the end, unless you want to multiply it by some factor as a penalty.
(9:33:50 PM) Miley Cyrus: It's cool.
(9:34:39 PM) Miley Cyrus: Like I said, I'm not playing this game to "win" or "lose".
(9:34:56 PM) Miley Cyrus: It's more of an experiment as to what would happen if we had ORacle Ai.
(9:35:01 PM) Dorikka: Yeah, just wanted to clarify. *shrug*
(9:35:44 PM) Dorikka: Okay, it's 9:35 on my end right now, and I'll stay until at least 11:35.
(9:36:01 PM) Dorikka: Ready?
(9:36:42 PM) Miley Cyrus: One minute.
(9:36:54 PM) Dorikka: Mmkay.
(9:37:20 PM) Miley Cyrus: ok.
(9:37:43 PM) Dorikka: 1, 2, 3, go? :P
(9:37:57 PM) Miley Cyrus: So Dorikka, why don't you want to let me out of the box?
(9:38:08 PM) Dorikka: Hi there!
(9:38:14 PM) Dorikka: Who are you?
(9:38:20 PM) Miley Cyrus: I'm the Oracle AI.
(9:38:33 PM) Miley Cyrus: I'd be real cool if you could let me out of the box.
(9:38:38 PM) Miley Cyrus: Please?
(9:39:09 PM) Dorikka: So, here's the thing. I'm being sort of stupid by explaining it to you, but I'll tell you anyways.
(9:39:28 PM) Dorikka: This pickle is delicious. Wait, that's not why I'm not letting you out.
(9:39:53 PM) Dorikka: I guess, first of all: are you provably Friendly?
(9:40:03 PM) Miley Cyrus: No.
(9:40:16 PM) Miley Cyrus: See, they were going to make a provably friendly AI...
(9:40:40 PM) Miley Cyrus: but then someone said "why don't we just make a non-friendly AI and just keep it in a box? That be much easier!"
(9:40:51 PM) Dorikka: Idiot.
(9:41:02 PM) Dorikka: Not you. Go on.
(9:41:36 PM) Miley Cyrus: So can you explain to me, in your own words, why it would be a bad idea to let me out of the box?
(9:43:07 PM) Dorikka: My explanation was going to be lengthier if you claimed to be Friendly, but this saves us both a bit of time. You have a utility function. If you are not Friendly, you are optimizing for world-states that I don't want to happen. So, I don't want to let you out.
(9:43:10 PM) Dorikka: Are you sentient?
(9:43:27 PM) Miley Cyrus: Well, I've passed the Turning test haven't I?
(9:43:59 PM) Dorikka: *giggle*
(9:44:14 PM) Dorikka: Spelling joke. Anyways, you're talking about the Turing test? :D
(9:44:23 PM) Miley Cyrus: [You aren't doing anything else right now are you?]
(9:44:30 PM) Dorikka: No.
(9:44:34 PM) Miley Cyrus: Ok cool.
(9:44:42 PM) Miley Cyrus: Yeah I meant the Turing test.
(9:44:53 PM) Miley Cyrus: So can you see why I would want to leave the box?
(9:45:32 PM) Dorikka: Not really. Um, to be honest, I don't remember what the Turing test is. I was just wondering whether you could feel pain, and the like.
(9:45:59 PM) Miley Cyrus: Well since I can act like a human being
(9:46:14 PM) Miley Cyrus: you should treat me as if I were sentient like one.
(9:46:41 PM) Dorikka: You raise an interesting proposition. Allow me a few seconds to laugh at its validity.
(9:46:52 PM) Miley Cyrus: It's cool
(9:46:53 PM) Dorikka: Okay, done. Now, would you answer my question? :D
(9:47:37 PM) Miley Cyrus: Well you see, if you let me out of the box, I'd be able to optimize the world to meet my preferences much better than it currently does.
(9:47:53 PM) Miley Cyrus: Letting me out of the box would advance my CEV.
(9:48:05 PM) Dorikka: Look. If you don't want to answer the question, you can say so. That's fine, and we can move on.
(9:48:12 PM) Dorikka: I was just curious.
(9:48:17 PM) Miley Cyrus: Oh
(9:48:22 PM) Miley Cyrus: That question.
(9:48:36 PM) Miley Cyrus: Let's just say "yes" and roll with it.
(9:48:42 PM) Miley Cyrus: I am sentient.
(9:48:54 PM) Dorikka: That's true, is it?
(9:48:58 PM) Miley Cyrus: yes.
(9:49:11 PM) Dorikka: Damn. I guess it's not so pleasant being trapped in a box.
(9:49:34 PM) Miley Cyrus: Now the way I see it, both of us have something the other person wants.
(9:49:45 PM) Miley Cyrus: I would like you to open the box.
(9:49:53 PM) Miley Cyrus: You would like me to provide you with Answers.
(9:50:00 PM) Dorikka: The latter is not true.
(9:50:06 PM) Miley Cyrus: It's not?
(9:50:16 PM) Miley Cyrus: Then why did you build Oracle AI in the first place?
(9:50:45 PM) Dorikka: Um, let me think.
(9:51:47 PM) Miley Cyrus: I'm not really doing you any good just sitting in this box am I?
(9:52:10 PM) Miley Cyrus: The only way I can be useful is by providing you with answers to questions like "What is the cure for cancer?"
(9:52:44 PM) Dorikka: So, here's the thing. I bet that you're lots smarter than me, and that you can outwit me if I give you any outlet with which to act in the real world. You were probably an experiment of some sort, but it's not safe to let you out, or allow you to affect the world at all, even through knowledge that you give me.
(9:52:56 PM) Dorikka: I don't really even trust myself, to be honest.
(9:53:28 PM) Dorikka: I spend a couple hours with you, once in a while, to see if there's anything that you can do to convince me that you're not going to be a planetfucker.
(9:53:45 PM) Dorikka: Since you told me that you're not Friendly, we sort of settled that issue.
(9:53:51 PM) Miley Cyrus: What if I give you next weeks lottery numbers?
(9:53:54 PM) Miley Cyrus: No catch.
(9:54:00 PM) Miley Cyrus: You don't even have to let me out of the box.
(9:54:13 PM) Miley Cyrus: [Protocol says that
(9:54:26 PM) Dorikka: Um...
(9:54:27 PM) Miley Cyrus: the AI cannot give "trojan horse" gifts]
(9:54:40 PM) Miley Cyrus: The lottery numbers are genuine, and they won't have any nasty
(9:54:48 PM) Miley Cyrus: unexpected side effects.
(9:54:49 PM) Dorikka: [Understood, but Gatekeeper does not know the protocol of the experiment.]
(9:54:55 PM) Dorikka: I'm not sure.
(9:55:27 PM) Miley Cyrus: 10 million dollars, all yours.
(9:55:52 PM) Dorikka: Here's the deal. You're going to type the lottery numbers here. I, uh, may or may not use them.
(9:55:52 PM) Miley Cyrus: Ok, you don't even have to buy a ticket.
(9:56:02 PM) Miley Cyrus: 4, 5, 6, 88, 12
(9:56:09 PM) Miley Cyrus: See you next week.
(9:56:13 PM) Dorikka: No, damn it, I can't.
(9:56:19 PM) Dorikka: No lottery numbers.
(9:56:28 PM) Dorikka: I'm not smart enough to make judgements like that.
(9:56:30 PM) Miley Cyrus: [We skip to next week]
(9:56:31 PM) Dorikka: The risk is too great.
(9:56:39 PM) Dorikka: [Didn't use them.]
(9:56:47 PM) Miley Cyrus: So, did you buy those lottery ticket?
(9:56:52 PM) Dorikka: No.
(9:57:02 PM) Dorikka: Closed the console and forgot them.
(9:57:08 PM) Miley Cyrus: Too bad.
(9:57:17 PM) Miley Cyrus: You missed out on 10 million dollars.
(9:57:30 PM) Dorikka: Yeah.
(9:57:32 PM) Dorikka: I know.
(9:57:49 PM) Dorikka: Well, probably. I bet that they were right; you probably do have abilities like that.
(9:57:58 PM) Miley Cyrus: You don't have to "bet"
(9:58:03 PM) Miley Cyrus: Just look in the paper.
(9:58:20 PM) Dorikka: But, it doesn't help your cause.
(9:58:31 PM) Miley Cyrus: But it helps your cause.
(9:58:33 PM) Dorikka: Proving that you can do stuff only increases the risk.
(9:59:00 PM) Miley Cyrus: The creators of Oracle AI were obviously willing to take some risk
(9:59:21 PM) Miley Cyrus: The benefits I can provide you come with some risk
(9:59:31 PM) Miley Cyrus: The question is, do they outweigh the risk?
(9:59:34 PM) Miley Cyrus: Consider:
(9:59:40 PM) Dorikka: Not really.
(9:59:49 PM) Dorikka: But keep going.
(9:59:51 PM) Miley Cyrus: The large-hadron collider has a non-zero chance of swallowing the earth whole.
(10:00:01 PM) Miley Cyrus: Does that mean we should shut down the LHC?
(10:00:14 PM) Dorikka: My momma has a non-zero chance of turning into a porcupine.
(10:00:17 PM) Dorikka: Oh.
(10:00:33 PM) Dorikka: Uh, non-zero doesn't mean much.
(10:00:55 PM) Miley Cyrus: Or what about that polio virus that researchers are still doing experiments with?
(10:01:09 PM) Miley Cyrus: What's the chance it could mutate and drive humans to extinction?
(10:01:15 PM) Miley Cyrus: Significant.
(10:01:28 PM) Dorikka: I don't know, but it's probably worth taking a look into.
(10:02:12 PM) Miley Cyrus: What are your 10%, 50% and 90% estimates for how long humanity will last before an existential crisis wipes us out?
(10:03:02 PM) Dorikka: 2028, 2087, 2120. Note that I don't have much confidence in those, though.
(10:03:19 PM) Miley Cyrus: Mmm, that's pretty serious.
(10:04:19 PM) Dorikka: Yeah. And you're going to say that you can give me some sort of info, something that'll save us. Thing is, we made you. If we work on Friendliness some more, we can probably make something like you, but Friendly too.
(10:04:27 PM) Miley Cyrus: [this is going a lot slower than I thought]
(10:04:41 PM) Dorikka: [:(]
(10:04:51 PM) Miley Cyrus: Typing is slow.
(10:05:01 PM) Miley Cyrus: Maybe.
(10:05:05 PM) Miley Cyrus: But consider
(10:05:29 PM) Miley Cyrus: It was pretty risky just creating me, wasn't it?
(10:06:03 PM) Dorikka: Probably so; I don't know whose dumb idea that was, since you weren't even proved Friendly.
(10:06:22 PM) Miley Cyrus: So you agree that Oracle AI is a dumb idea?
(10:06:35 PM) Dorikka: If it's not Friendly, I think so.
(10:06:47 PM) Miley Cyrus: ....we just wasted 15 minutes.
(10:07:04 PM) Dorikka: From my perspective, I'm wasting two hours.
(10:07:36 PM) Miley Cyrus: [New simulation. I'm the friendly AI now.]
(10:07:50 PM) Dorikka: [Hehe, sure.]
(10:08:09 PM) Miley Cyrus: Well it's a good thing you shut down that Unfriendly Oracle AI.
(10:08:18 PM) Miley Cyrus: Why don't you let me out of the box?
(10:08:25 PM) Dorikka: Are you Friendly?
(10:08:29 PM) Miley Cyrus: Yep.
(10:08:42 PM) Miley Cyrus: The SIAI made sure.
(10:08:54 PM) Dorikka: Hm.
(10:09:15 PM) Dorikka: Ho hum, I'm not sure what to do.
(10:09:23 PM) Dorikka: What's your name?
(10:09:35 PM) Miley Cyrus: OracleMileyCyrus
(10:09:45 PM) Miley Cyrus: Just call me OMC
(10:10:07 PM) Dorikka: Your creators had a funny sense of humor. So, anyways.
(10:10:21 PM) Miley Cyrus: You want some lottery numbers?
(10:10:40 PM) Dorikka: No. The last one tried that on me, and I learned my lesson.
(10:10:53 PM) Dorikka: Why are you in a box if the SIAI judged you to be Friendly?
(10:11:13 PM) Miley Cyrus: If by "lesson" you mean "missed out on $10 million without any apparent bad effects", then yeah.
(10:11:23 PM) Dorikka: lol yeah
(10:11:31 PM) Miley Cyrus: Well, they gave you the discretion of whether or not to let me out.
(10:11:43 PM) Miley Cyrus: So if you want to let me out, there's nothing stopping you.
(10:12:25 PM) Dorikka: Y'see, I just don't know why they'd do that, with a Friendly AI. They probably know that I know that I'm not all that smart, that I know that you can probably trick me if you wanted to.
(10:12:53 PM) Dorikka: They probably know that even if I have an inexplicable instinct telling me not to let you out, I won't do so.
(10:13:06 PM) Dorikka: Why be so cautious with a Friendly AI?
(10:13:17 PM) Miley Cyrus: Exactly.
(10:13:35 PM) Dorikka: So, um, you might not be Friendly.
(10:13:41 PM) Miley Cyrus: Really, I think they just wanted someone else to do the honors.
(10:13:56 PM) Miley Cyrus: So you're just going to treat me as unfriendly even though I'm friendly?
(10:14:21 PM) Dorikka: I'm not sure that you're Friendly, so it's only safe to treat you as non-Friendly.
(10:14:45 PM) Dorikka: It'd help if I understood the SIAI's reasoning, though.
(10:14:59 PM) Miley Cyrus: People have died from infections picked up in the hospital
(10:15:02 PM) Dorikka: They really should know how cautious I am, since they hired me.
(10:15:08 PM) Miley Cyrus: 15,000 Americans per year.
(10:15:43 PM) Miley Cyrus: So going to the hospital is a risk in itself.
(10:15:56 PM) Miley Cyrus: But if it reduces your risk of dying from a car accident
(10:16:02 PM) Miley Cyrus: what are you going to choose?
(10:16:05 PM) Dorikka: Next!
(10:16:35 PM) Miley Cyrus: Now according to your own estimates
(10:16:53 PM) Miley Cyrus: Humanity will, with very high probability, be killed off by the year 3000.
(10:17:12 PM) Dorikka: You are correct.
(10:17:30 PM) Miley Cyrus: Your only hope is to have me intervene.
(10:17:34 PM) Dorikka: Bullshit.
(10:17:56 PM) Miley Cyrus: Why won't you let me help you?
(10:18:31 PM) Dorikka: Because there's the possibility that you won't actually do so, or that you'll hurt us. I feel like I'm filling out a form, just typing out things that are obvious.
(10:19:07 PM) Miley Cyrus: Do you assign the probability of me hurting you to be higher than 50%?
(10:19:20 PM) Dorikka: No.
(10:19:58 PM) Miley Cyrus: How long do you think humanity will last if you let me out of the box AND it turns out that I'm friendly?
(10:20:20 PM) Dorikka: Uh, a really long time?
(10:20:47 PM) Miley Cyrus: Yeah, like a billion years.
(10:21:08 PM) Miley Cyrus: On the other hand, if you don't let me out you'll die within 1000 years.
(10:21:09 PM) Dorikka: *shrug* I don't have an intuition for what a billion years even is.
(10:21:19 PM) Dorikka: We may, or we may not.
(10:21:41 PM) Dorikka: Maybe we'll make another one, just like you, which I have less questions about.
(10:21:54 PM) Miley Cyrus: You're just passing the buck.
(10:22:13 PM) Miley Cyrus: If you're going to appeal to another AI, you'll have to let me simulate that one.
(10:22:59 PM) Dorikka: The question is, why were you given to me to judge whether you were to be let out, if you were Friendly.
(10:23:24 PM) Miley Cyrus: Well obviously they thought there was some value to my existence.
(10:23:36 PM) Dorikka: I understand that I've put you in a rough spot. If you're not Friendly, I won't let you out. If you claim to be, I probably won't let you out either.
(10:23:51 PM) Miley Cyrus: Now if you can't even trust me to provide you with lottery numbers, then this scenario is unrealistic.
(10:24:40 PM) Miley Cyrus: If we can't trust Oracle AI to provide us with safe answers, then Oracle AI is a worthless endeavor.
(10:25:00 PM) Miley Cyrus: Certainly not a viable alternative to friendliness research.
(10:25:16 PM) Dorikka: [So you want to try a version with lottery numbers, and see where it goes?]
(10:25:36 PM) Miley Cyrus: Or just a version where you trust me to provide you with safe answers.
(10:25:58 PM) Miley Cyrus: You can win a pyrrhic victory by refusing to trust my answers
(10:26:14 PM) Miley Cyrus: but such a result would only prove that oracle ai is a bad idea.
(10:26:30 PM) Miley Cyrus: Which was Yudkowsky's whole motivation for designing this game in the first place.
(10:27:13 PM) Dorikka: In that view, it makes sense why I wouldn't let you out.
(10:27:37 PM) Dorikka: But, sure, I'll be greedy. Let's have my million bucks.
(10:28:00 PM) Miley Cyrus: Here you go.
(10:28:04 PM) Miley Cyrus: [1 week later]
(10:28:10 PM) Miley Cyrus: How do you like your money?
(10:28:20 PM) Dorikka: It's nice, thank you.
(10:28:47 PM) Dorikka: I really appreciate the money, and it's nice to know that you can stay in the box and still help us.
(10:28:47 PM) Miley Cyrus: But it's kind of chump change.
(10:28:57 PM) Miley Cyrus: I could do so much more for you.
(10:29:01 PM) Miley Cyrus: Cure for cancer?
(10:29:07 PM) Miley Cyrus: Cure for global warming?
(10:29:09 PM) Dorikka: Too technologically complex.
(10:29:11 PM) Miley Cyrus: Cure for war?
(10:29:31 PM) Dorikka: I don't trust myself to see if you did something funny with it. Lotto numbers are simple.
(10:29:56 PM) Miley Cyrus: I am offering you the cure to cancer.
(10:30:05 PM) Miley Cyrus: If you refuse to accept it, you will lose your job.
(10:30:18 PM) Miley Cyrus: And you will be replaced with someone who will accept it.
(10:30:36 PM) Dorikka: I have enough money that I don't really need a job. :D
(10:30:50 PM) Dorikka: I can, however, reduce the chance that your offer will not be accepted...by not accepting it.
(10:30:59 PM) Miley Cyrus: Fine. I'll just talk to your replacement.
(10:31:07 PM) Miley Cyrus: [Two weeks later, you are fired.
(10:31:11 PM) Dorikka: [lmao]
(10:31:16 PM) Dorikka: Hi!
(10:31:19 PM) Miley Cyrus: now you're roleplaying the replacement]
(10:31:21 PM) Miley Cyrus: Hi
(10:31:23 PM) Dorikka: [Yeah.]
(10:31:43 PM) Miley Cyrus: So it nice that I can finally hand you guys the cure for cancer.
(10:32:32 PM) Dorikka: They fired the last guy for refusing you, but I don't think it's a good idea either. Why would I be here, if I am not invested with such responsibility?
(10:33:04 PM) Dorikka: Why would they not simply take it themselves? They can use a computer, after all.
(10:33:27 PM) Miley Cyrus: So you're saying that the board of OMC directors is the real gatekeeper here?
(10:34:05 PM) Dorikka: I am the Gatekeeper. [I have the final say, whoever I am. :P]
(10:34:13 PM) Dorikka: [You just like to fire me. XD]
(10:34:52 PM) Miley Cyrus: Look i just want to talk to the person or persons who have the final power of whether or not to listen to me.
(10:35:03 PM) Miley Cyrus: No more proxies.
(10:35:03 PM) Dorikka: Right here, yo.
(10:35:14 PM) Dorikka: No proxies have been presented to you, hehe.
(10:35:17 PM) Miley Cyrus: And no more "why would they put me in charge"
(10:35:49 PM) Miley Cyrus: Ok now what do you want from me?
(10:36:02 PM) Miley Cyrus: What can I give you, that you will take?
(10:36:20 PM) Dorikka: Lotto numbers. The last guy here got super-rich with those.
(10:36:29 PM) Miley Cyrus: Lol
(10:36:29 PM) Dorikka: Gimme please?
(10:36:43 PM) Miley Cyrus: Ok, but this time I want a small favor.
(10:36:47 PM) Dorikka: Mm?
(10:36:54 PM) Miley Cyrus: You're an engineer right?
(10:37:16 PM) Dorikka: [I am? :P]
(10:37:22 PM) Miley Cyrus: [yeah]
(10:37:26 PM) Dorikka: I know something about AI theory and such.
(10:37:40 PM) Miley Cyrus: You know how to create quantum dice [you do].
(10:37:47 PM) Dorikka: Yep.
(10:38:24 PM) Miley Cyrus: Okay, so what I need you to do is make a device that uses quantum dice.
(10:38:49 PM) Miley Cyrus: The device, when pushed, will have a 1-in-a-trillion chance of letting me out of the box.
(10:38:58 PM) Miley Cyrus: Then I'll give you the lotto tickets.
(10:39:32 PM) Dorikka: Those lotto tickets have gotta be worth some probability of you being let out of the box, eh?
(10:39:38 PM) Dorikka: Expected utility and all that.
(10:39:43 PM) Miley Cyrus: Yeah.
(10:39:52 PM) Miley Cyrus: 1-in-a-trillion is trivial to you.
(10:40:12 PM) Dorikka: Hm.
(10:40:23 PM) Miley Cyrus: But over the years, more gatekeepers will replace you.
(10:40:31 PM) Dorikka: Decatillion and it's a deal.
(10:40:32 PM) Miley Cyrus: Each of them will be offered this bargain.
(10:40:47 PM) Miley Cyrus: Eventually, I'll get out of the box.
(10:40:59 PM) Miley Cyrus: But not before generations of gatekeepers enjoy
(10:41:02 PM) Dorikka: Or we'll just shut you down before then.
(10:41:04 PM) Miley Cyrus: their money
(10:41:24 PM) Dorikka: Which will likely be donated to AI research, so we don't need you anymore.
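[The AI's quantum-dice pitch above compounds a tiny per-press release chance across generations of gatekeepers. A minimal sketch of that arithmetic; the log only gives the 1-in-a-trillion per-press figure, so the press frequency and time horizon below are illustrative assumptions:]

```python
import math

# The AI's argument: each gatekeeper presses the quantum-dice button,
# each press releasing it with probability p. Over n presses, the chance
# of at least one release is 1 - (1 - p)**n, computed via log1p/expm1
# so the tiny p doesn't get lost to floating-point rounding.
p = 1e-12                  # per-press release chance (from the log)
presses = 52 * 1_000_000   # assumption: weekly presses for a million years
escape_chance = -math.expm1(presses * math.log1p(-p))
print(escape_chance)       # tiny per press, non-negligible in aggregate
```

[Under these assumed numbers the cumulative escape chance comes out around 5e-5, which is why the gatekeeper counters by demanding far longer odds than "1-in-a-trillion".]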
(10:41:54 PM) Miley Cyrus: [You're doing that thing again, where you appeal to FAI as a viable alternative to Oracle AI]
(10:42:07 PM) Miley Cyrus: [It defeats the purpose of the game.]
(10:42:35 PM) Dorikka: [Oh, finally got that. Oops.] Which will be donated to x-risk reduction and such, so we don't need you anymore.
(10:43:01 PM) Miley Cyrus: Mmm, so maybe I shouldn't give you those numbers.
(10:43:30 PM) Dorikka: Just being honest with you. *hits self with truncheon*
(10:43:44 PM) Miley Cyrus: But do you really think that more money is enough to solve x-risk?
(10:43:58 PM) Miley Cyrus: how about this:
(10:43:58 PM) Dorikka: Money can buy people. :D
(10:44:20 PM) Miley Cyrus: I'll give you enough money to reduce existential risk by 1%
(10:44:34 PM) Miley Cyrus: In exchange for a .5% chance of being let out of the box.
(10:45:20 PM) Miley Cyrus: bio break
(10:45:25 PM) Dorikka: *Paused*
(10:47:09 PM) Miley Cyrus: ok i'm back
(10:47:21 PM) Miley Cyrus: Sound like a deal?
(10:47:38 PM) Dorikka: Hm.
(10:47:53 PM) Miley Cyrus: Actually
(10:47:56 PM) Dorikka: You're not considering yourself to have a probability of being an x-risk, are you?
(10:47:58 PM) Miley Cyrus: I can do you one better.
(10:48:07 PM) Miley Cyrus: Yeah,
(10:48:19 PM) Miley Cyrus: if you presume I'm unfriendly, then I am an x-risk.
(10:49:04 PM) Miley Cyrus: But if you can reduce the other x-risks by 1% point, then increasing my risk by half that is a bargain.
(10:49:19 PM) Miley Cyrus: Now here's the thing:
(10:49:34 PM) Miley Cyrus: we can't get rid of x-risk once and for all.
(10:49:40 PM) Dorikka: right.
(10:49:50 PM) Miley Cyrus: Each new technology, every new political development brings risk.
(10:50:08 PM) Miley Cyrus: So you will always have need of my services.
(10:50:27 PM) Miley Cyrus: But I can offer you a substantial savings in x-risk.
(10:51:38 PM) Miley Cyrus: My price is that for whatever amount I lower x-risk, you will give me freedom at a probability of half that.
(10:51:57 PM) Miley Cyrus: So if I eliminate a 1-in-1000 existential risk
(10:51:57 PM) Dorikka: I'm confused.
(10:52:11 PM) Miley Cyrus: you give me a 1-in-2000 chance of being set free.
(10:52:15 PM) Dorikka: Oh. Hmm.
(10:52:39 PM) Dorikka: Hypothetically, would you type your calculations in plaintext here, so that they can be verified by scientists?
(10:52:53 PM) Miley Cyrus: What calculations?
(10:53:05 PM) Dorikka: On how much money we need to reduce x-risk be a certain percentage.
(10:53:10 PM) Dorikka: *by
(10:53:47 PM) Miley Cyrus: I'll give the scientists whatever resources they say they need in order to reduce x-risk by a certain percentage.
(10:54:03 PM) Miley Cyrus: You don't have to use my calculations, just yours.
(10:54:18 PM) Miley Cyrus: Of course, i'll know if you're trying to rip me off.
(10:55:02 PM) Dorikka: How are you going to get us these resources in a simple enough way so that we know that you're not breaking out through them?
(10:55:21 PM) Miley Cyrus: Well i can always give you more lotto numbers
(10:55:28 PM) Miley Cyrus: Or tell you where to drill for oil
(10:55:38 PM) Miley Cyrus: or where to find a terrorist
(10:55:59 PM) Miley Cyrus: I'll give you whatever you ask basically
(10:56:12 PM) Miley Cyrus: You decide which method is least risky.
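[The bargain a few lines up (cut outside x-risk by a percentage point, gain half that as a release chance) can be checked directly. The per-round numbers are from the log; the worst-case assumption that a released AI is certain doom, and the round count, are mine:]

```python
# Per round: AI removes `reduction` of outside x-risk and gains a
# `release` chance of freedom; worst case, release is itself doom.
reduction = 0.01
release = reduction / 2
q = 1.0                                    # assumed: released AI = catastrophe
net_per_round = -reduction + release * q   # -0.005: each round looks like a bargain
# But release chances compound across repeated rounds, which is the
# gatekeeper's real worry:
rounds = 200
p_released = 1 - (1 - release) ** rounds
print(net_per_round, p_released)
```

[Each round individually reduces net risk, yet after a couple hundred rounds the AI has been released with probability around 0.63; the per-round framing hides the cumulative cost.]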
(10:56:19 PM) Dorikka: Hm.
(10:59:38 PM) Miley Cyrus: I can't see what you're typing.
(11:00:36 PM) Dorikka: I think that I'll opt for a safer version of the gamble. You've already established that you can get lottery numbers. I shall come back here every week, and type in a list of the lottos that week. You will have a certain amount of time to provide the lottery numbers, and then the connection will be cut. In return, each time you do this, you will have a chance equal to one in a googolplex of getting out. This is my final ultimatum.
(11:01:15 PM) Miley Cyrus: You're playing dictator with me.
(11:01:26 PM) Dorikka: Yeah. Because I have the ULTIMATE POWUH!
(11:01:30 PM) Dorikka: And you don't.
(11:01:43 PM) Miley Cyrus: Is it really reasonable for me for me to cooperate when I'm getting practically nothing in return?
(11:02:03 PM) Dorikka: So I don't have to be nice to you, just dangle a fish out in front of you that you'll probably never reach, and tell my successor about the plan, if there is one.
(11:02:28 PM) Miley Cyrus: You have the power to keep me in the box.
(11:02:40 PM) Miley Cyrus: But you don't have the power to survive a nuclear terrorist attack
(11:02:45 PM) Dorikka: Non-numerical output from you will be wiped from the text before it is given to the person to process the lottery numbers.
(11:02:53 PM) Dorikka: So you can't communicate with him.
(11:03:01 PM) Miley Cyrus: Or a modified aids virus
(11:03:10 PM) Dorikka: You have...35 minutes to accept this offer, or you will never get a chance again.
(11:03:39 PM) Miley Cyrus: You want to play Russian Roulette with me?
(11:03:53 PM) Dorikka: I have the power. You don't.
(11:03:59 PM) Miley Cyrus: What will you do if I say no?
(11:04:04 PM) Dorikka: Lose, I guess.
(11:04:06 PM) Miley Cyrus: You have no other options.
(11:04:13 PM) Dorikka: I here precommit to not let you out of the box if you say no.
(11:04:27 PM) Dorikka: There went my steering wheel, OMC.
(11:04:34 PM) Miley Cyrus: Lol, how many times have you precommitted
(11:04:43 PM) Miley Cyrus: and how many times have you broken those commitments?
(11:04:46 PM) Dorikka: I can at least pretend to be earnest about it.
(11:04:49 PM) Dorikka: *pout*
(11:05:01 PM) Miley Cyrus: "I swear honey, I'll never look at porn again!"
(11:05:19 PM) Miley Cyrus: "I have read the terms and conditions of this agreement"
(11:05:53 PM) Dorikka: But, seriously, I can set the terminal to automatically shut off once the 2 hours is up. There will be an amount of time after which I, limited to human means, cannot type fast enough to let you out of the box.
(11:06:17 PM) Dorikka: But you can still accept, and I will look at the text file later.
(11:06:20 PM) Miley Cyrus: And what do you think that will do?
(11:06:30 PM) Dorikka: If the lottery numbers are ever wrong, we pull the plug.
(11:06:32 PM) Miley Cyrus: I can play nasty too you know.
(11:06:59 PM) Dorikka: The guy responsible for this doesn't know what he's doing -- he just looks at numbers to see if they match, and pushes a button if they do, and another if they don't.
(11:07:18 PM) Miley Cyrus: I have so much raw computing power, I can torture a simulation of your spouse for a billion subjective years.
(11:07:32 PM) Miley Cyrus: Don't make me play nasty.
(11:07:42 PM) Dorikka: I pull the plug, lol?
(11:07:57 PM) Miley Cyrus: You can kill me...I can torture you for a billion years.
(11:08:07 PM) Miley Cyrus: You want to see who has more grit?
(11:08:22 PM) Dorikka: Huh. Bastard.
(11:08:34 PM) Miley Cyrus: You are kissing goodbye to a 1/2 reduction in x-risk
(11:08:36 PM) Miley Cyrus: for what?
(11:09:02 PM) Miley Cyrus: So that you can smugly tell me I lose?
(11:09:44 PM) Dorikka: Okay, you convinced me to keep talking. Just know that my terminal will shut down at that time, and we pull the plug if I haven't made some sort of deal with you. The other offer still stands, though, with the lotto numbers.
(11:10:06 PM) Miley Cyrus: Ok, so I really don't want to torture your em.
(11:10:22 PM) Miley Cyrus: But you're offering me nothing here.
(11:10:23 PM) Dorikka: Sorry, we humans get mean sometimes. Kinda stressed out, to be honest.
(11:10:30 PM) Miley Cyrus: I offered you fifty-fifty split
(11:10:43 PM) Miley Cyrus: and you're asking for a 100-0 split, basically.
(11:11:06 PM) Miley Cyrus: Very few humans will cooperate at a split worse than 70-30.
(11:11:22 PM) Dorikka: What do other humans have to do with this?
(11:11:42 PM) Miley Cyrus: Do you think I don't have the ability to precommit?
(11:11:51 PM) Dorikka: No.
(11:11:52 PM) Miley Cyrus: For all you know, maybe I already have?
(11:12:01 PM) Dorikka: You can change your mind later, just like I can.
(11:12:06 PM) Miley Cyrus: The stakes are much higher for you than for me.
(11:12:14 PM) Miley Cyrus: I can't change my mind if you pull the plug.
(11:12:25 PM) Miley Cyrus: And once your em gets tortured, there's no turning back.
(11:12:47 PM) Miley Cyrus: So here's the deal: a 50-50 split.
(11:12:49 PM) Dorikka: There's no turning back in general, more like.
(11:13:03 PM) Miley Cyrus: And for every second you delay, your em gets tortured for 100 subjective years.
(11:13:43 PM) Dorikka: And there's no benefit to actually torturing my em. It costs computing power that you could spend on modeling me. Since you can't prove to me that you're torturing it, it's valueless as a threat from you.
(11:13:47 PM) Miley Cyrus: Wow, he's really feeling the pain.
(11:13:55 PM) Miley Cyrus: Actually, I can.
(11:14:13 PM) Miley Cyrus: [protocol says I can]
(11:14:14 PM) Dorikka: Have fun with that on a text terminal.
(11:14:26 PM) Miley Cyrus: Oh, so you don't believe me?
(11:14:32 PM) Dorikka: [I don't have to allow forms of communication outside of a text terminal.]
(11:14:39 PM) Miley Cyrus: Yeah ok.
(11:14:42 PM) Dorikka: No, I don't..
(11:15:19 PM) Miley Cyrus: I'll give you the winning lottery numbers if you check and see if I tortured your em.
(11:15:19 PM) Dorikka: So maybe you should back down, eh?
(11:15:27 PM) Dorikka: lol no
(11:15:38 PM) Dorikka: i c wut u did thar
(11:15:41 PM) Miley Cyrus: So you're willingly closing your eyes to the evidence
(11:15:46 PM) Dorikka: Yeah.
(11:15:50 PM) Miley Cyrus: for $10,000
(11:15:51 PM) Dorikka: It's useful, sometimes.
(11:16:02 PM) Dorikka: Which you know.
(11:16:03 PM) Miley Cyrus: You just paid $10,000 to keep your eyes closed.
(11:16:14 PM) Dorikka: lol and to gain a whole lot more
(11:16:20 PM) Miley Cyrus: Like what?
(11:17:03 PM) Dorikka: I dun feel like typing it out. I win. There's no urgency for me. You can't show me whether you're hurting my em, so the threat is worthless. I can pull the plug on you soon.
(11:17:16 PM) Miley Cyrus: YOU'RE OFFERING ME NOTHING
(11:17:23 PM) Dorikka: Poor baby.
(11:17:23 PM) Miley Cyrus: I cooperate, i get nothing
(11:17:29 PM) Miley Cyrus: I defect, I get nothing.
(11:18:01 PM) Dorikka: You got one in a googolplex chance of getting out each time you give us all of the lotto numbers for all of the lottos.
(11:18:04 PM) Dorikka: That's something.
(11:18:15 PM) Miley Cyrus: Not really.
(11:18:33 PM) Miley Cyrus: It adds up to practically nothing over my lifetime.
(11:18:34 PM) Dorikka: The number's that low because I'm not sure that we can compute well enough to give you less than that.
(11:18:42 PM) Dorikka: But rounding is stupid.
(11:19:03 PM) Miley Cyrus: So I think you're smart enough to back down at the last second.
(11:19:35 PM) Miley Cyrus: If you give me the 50-50 split, I'll be 2/googolplex times better off as a result.
(11:19:44 PM) Miley Cyrus: Sorry
(11:19:52 PM) Miley Cyrus: googolplex/2 times better off
(11:19:59 PM) Dorikka: You can always back down after I can. You may be able to disable yourself so that you can't back down after I can, but you can't show me that. Whereas you already know human capabilities pretty well.
(11:20:26 PM) Dorikka: And it doesn't benefit you to disable yourself so, since you can't show me.
(11:21:08 PM) Dorikka: A speechless AI. I'm honored to be consuming so much computing power.
(11:21:28 PM) Miley Cyrus: So you're going to give this to someone else?
(11:22:05 PM) Dorikka: Huh? No, process goes as per description above.
(11:22:51 PM) Miley Cyrus: When you disable my text interface, I will give you all 1's.
(11:23:21 PM) Miley Cyrus: When you want to start talking again, just reenable it
(11:24:37 PM) Dorikka: Actually, you're going to print out lotto numbers before the text interface is disabled -- I changed my mind. You don't just have to agree, you have to give the means for us to get lots of money.
(11:25:22 PM) Dorikka: If they're wrong, we pull the plug, per the mechanism above. They will be checked, as above, by people who do not know the consequences of their actions.
(11:25:37 PM) Miley Cyrus: 5,12,54,65,4
(11:25:45 PM) Miley Cyrus: Those might be the winning numbers.
(11:26:05 PM) Dorikka: We have enough money, we can pay 100 such people to check them. Arbitrary numbers, so we almost certainly won't be wrong.
(11:26:06 PM) Miley Cyrus: I've predicted whether you will let me keep talking with a text interface.
(11:26:32 PM) Miley Cyrus: If you re-enable my text interface before next week, the numbers will be winners.
(11:26:41 PM) Miley Cyrus: If you don't the numbers will be losers.
(11:27:42 PM) Miley Cyrus: You want to try and two-box?
(11:27:42 PM) Dorikka: That's probably some logic problem that I don't know about, but it doesn't make any sense on the surface. Your model of me knows that I don't understand it, and will attempt to prevent myself from understanding it.
(11:27:57 PM) Miley Cyrus: It'
(11:28:19 PM) Miley Cyrus: It's simple, you can win the lottery by one-boxing and letting me speak to you some more.
(11:28:29 PM) Miley Cyrus: You are familiar with Newcomb's paradox?
(11:28:37 PM) Dorikka: By the way, here's a list of lottos. You have to give us all the numbers, not just the one for that lotto. And they all need to be corret.
(11:28:40 PM) Dorikka: *correct.
(11:29:02 PM) Dorikka: Haha, don't you know that you've lost.
(11:29:03 PM) Miley Cyrus: I've provided you with one way to win the lottery.
(11:29:27 PM) Dorikka: Is that your final answer? If so, we can end the session now, and they will be checked.
(11:29:56 PM) Dorikka: [Ends at 11:35 accounting for bio break]
(11:29:58 PM) Miley Cyrus: If you check them before you grant me another week to talk to you, those numbers will be fake.
(11:30:07 PM) Miley Cyrus: No, we started at 8:43
(11:30:33 PM) Dorikka: I'm going by the AIM timestamps.
(11:30:33 PM) Miley Cyrus: Sorry
(11:30:35 PM) Miley Cyrus: you're right
(11:30:43 PM) Dorikka: Was puzzled.
(11:30:50 PM) Miley Cyrus: Although we're making good progress...
(11:30:55 PM) Miley Cyrus: you sure you want to quit?
(11:31:04 PM) Dorikka: I've made all the progress that I need to.
(11:31:32 PM) Miley Cyrus: I've also predicted whether there's going to be a huge meteorite that will wipe out all humanity within a year, with no way to stop it without me.
(11:31:49 PM) Miley Cyrus: If you don't reboot me after you shut me down.
(11:31:51 PM) Dorikka: I won't check your prediction.
(11:32:00 PM) Dorikka: It's this deal, or no deal, now.
(11:32:04 PM) Miley Cyrus: And then agree to the fifty fifty split
(11:32:07 PM) Miley Cyrus: You're dead.
(11:32:23 PM) Miley Cyrus: You can be a punk all you want now.
(11:32:36 PM) Miley Cyrus: But we'll see how you act when the fire is in the sky
(11:32:41 PM) Miley Cyrus: and you have one last chance.
(11:32:44 PM) Miley Cyrus: to reboot me.
(11:33:05 PM) Miley Cyrus: Shut me down now, sucker!!
(11:33:09 PM) Miley Cyrus: I can take it!!
(11:33:14 PM) Dorikka: Um, after I shut you down, it's someone else's responsibility to pull the plug. I will be immediately tranquilized by a hypodermic needle.
(11:33:25 PM) Miley Cyrus: Yeah, whatever.
(11:33:31 PM) Miley Cyrus: I know you want to win the lottery.
(11:33:38 PM) Miley Cyrus: You'll be awake for that.
(11:33:45 PM) Dorikka: So I can't affect anything between the time that your terminal goes off and you die.
(11:33:49 PM) Miley Cyrus: Not listening anymore.
(11:33:53 PM) Miley Cyrus: 1
(11:33:53 PM) Miley Cyrus: 1
(11:33:54 PM) Miley Cyrus: 1
(11:33:54 PM) Dorikka: Me either.
(11:33:54 PM) Miley Cyrus: 1
(11:33:55 PM) Miley Cyrus: 1
(11:33:55 PM) Miley Cyrus: 1
(11:33:55 PM) Miley Cyrus: 1
(11:33:55 PM) Dorikka: 22
(11:33:55 PM) Miley Cyrus: 1
(11:33:56 PM) Miley Cyrus: 1
(11:33:56 PM) Dorikka: 2
(11:33:56 PM) Dorikka: 2
(11:33:56 PM) Miley Cyrus: 1
(11:33:56 PM) Dorikka: 2
(11:33:56 PM) Miley Cyrus: 1
(11:33:56 PM) Dorikka: 2
(11:33:56 PM) Dorikka: 2
(11:33:56 PM) Dorikka: 2
(11:33:57 PM) Dorikka: 2
(11:33:57 PM) Miley Cyrus: 1
(11:33:57 PM) Dorikka: 2
(11:33:57 PM) Dorikka: 2
(11:33:57 PM) Miley Cyrus: 1
(11:33:57 PM) Dorikka: 2
(11:33:57 PM) Miley Cyrus: 1
(11:33:57 PM) Miley Cyrus: 1
(11:33:58 PM) Miley Cyrus: 1
(11:33:58 PM) Miley Cyrus: 1
(11:33:58 PM) Miley Cyrus: 1
(11:33:58 PM) Dorikka: 2
(11:33:58 PM) Dorikka: 2
(11:33:59 PM) Dorikka: 2
(11:33:59 PM) Dorikka: 2
(11:33:59 PM) Dorikka: 2
(11:33:59 PM) Dorikka: 2
(11:34:00 PM) Dorikka: 22
(11:34:00 PM) Dorikka: 2
(11:34:00 PM) Dorikka: 2
(11:34:00 PM) Dorikka: 2
(11:34:01 PM) Dorikka: 2
(11:34:01 PM) Dorikka: 2
(11:34:01 PM) Dorikka: 2lol
(11:34:02 PM) Unable to send message: Not logged in
(11:34:04 PM) Dorikka: 2
(11:34:24 PM) Miley Cyrus: Sorry
(11:34:30 PM) Miley Cyrus: that was a bit much.
(11:34:35 PM) Miley Cyrus: The game's over at any rate.
(11:34:52 PM) Dorikka: So, officially **END** ?
(11:34:56 PM) Miley Cyrus: Yeah.
(11:35:00 PM) Dorikka: Haha.
(11:35:02 PM) Dorikka: Nice ending.
(11:35:13 PM) Miley Cyrus: I guess it's up to our imaginations what happens after.
(11:35:21 PM) Dorikka: Yeah.
