Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
These are my notes and observations after attending the Safety in Artificial Intelligence (SafArtInt) conference, which was co-hosted by the White House Office of Science and Technology Policy and Carnegie Mellon University on June 27 and 28. This isn't an organized summary of the content of the conference; rather, it's a selection of points which are relevant to the control problem. As a result, it suffers from selection bias: it looks like superintelligence and control-problem-relevant issues were discussed frequently, when in reality those issues were discussed less and I didn't write much about the more mundane parts.
SafArtInt has been the third out of a planned series of four conferences. The purpose of the conference series was twofold: the OSTP wanted to get other parts of the government moving on AI issues, and they also wanted to inform public opinion.
The other three conferences are about near term legal, social, and economic issues of AI. SafArtInt was about near term safety and reliability in AI systems. It was effectively the brainchild of Dr. Ed Felten, the deputy U.S. chief technology officer for the White House, who came up with the idea for it last year. CMU is a top computer science university and many of their own researchers attended, as well as some students. There were also researchers from other universities, some people from private sector AI including both Silicon Valley and government contracting, government researchers and policymakers from groups such as DARPA and NASA, a few people from the military/DoD, and a few control problem researchers. As far as I could tell, everyone except a few university researchers were from the U.S., although I did not meet many people. There were about 70-100 people watching the presentations at any given time, and I had conversations with about twelve of the people who were not affiliated with existential risk organizations, as well as of course all of those who were affiliated. The conference was split with a few presentations on the 27th and the majority of presentations on the 28th. Not everyone was there for both days.
Felten believes that neither "robot apocalypses" nor "mass unemployment" are likely. It soon became apparent that the majority of others present at the conference felt the same way with regard to superintelligence. The general intention among researchers and policymakers at the conference could be summarized as follows: we need to make sure that the AI systems we develop in the near future will not be responsible for any accidents, because if accidents do happen then they will spark public fears about AI, which would lead to a dearth of funding for AI research and an inability to realize the corresponding social and economic benefits. Of course, that doesn't change the fact that they strongly care about safety in its own right and have significant pragmatic needs for robust and reliable AI systems.
Most of the talks were about verification and reliability in modern day AI systems. So they were concerned with AI systems that would give poor results or be unreliable in the narrow domains where they are being applied in the near future. They mostly focused on "safety-critical" systems, where failure of an AI program would result in serious negative consequences: automated vehicles were a common topic of interest, as well as the use of AI in healthcare systems. A recurring theme was that we have to be more rigorous in demonstrating safety and do actual hazard analyses on AI systems, and another was that we need the AI safety field to succeed in ways that the cybersecurity field has failed. Another general belief was that long term AI safety, such as concerns about the ability of humans to control AIs, was not a serious issue.
On average, the presentations were moderately technical. They were mostly focused on machine learning systems, although there was significant discussion of cybersecurity techniques.
The first talk was given by Eric Horvitz of Microsoft. He discussed some approaches for pushing into new directions in AI safety. Instead of merely trying to reduce the errors spotted according to one model, we should look out for "unknown unknowns" by stacking models and looking at problems which appear on any of them, a theme which would be presented by other researchers as well in later presentations. He discussed optimization under uncertain parameters, sensitivity analysis to uncertain parameters, and 'wireheading' or short-circuiting of reinforcement learning systems (which he believes can be guarded against by using 'reflective analysis'). Finally, he brought up the concerns about superintelligence, which sparked amused reactions in the audience. He said that scientists should address concerns about superintelligence, which he aptly described as the 'elephant in the room', noting that it was the reason that some people were at the conference. He said that scientists will have to engage with public concerns, while also noting that there were experts who were worried about superintelligence and that there would have to be engagement with the experts' concerns. He did not comment on whether he believed that these concerns were reasonable or not.
An issue which came up in the Q&A afterwards was that we need to deal with mis-structured utility functions in AI, because it is often the case that the specific tradeoffs and utilities which humans claim to value often lead to results which the humans don't like. So we need to have structural uncertainty about our utility models. The difficulty of finding good objective functions for AIs would eventually be discussed in many other presentations as well.
The next talk was given by Andrew Moore of Carnegie Mellon University, who claimed that his talk represented the consensus of computer scientists at the school. He claimed that the stakes of AI safety were very high - namely, that AI has the capability to save many people's lives in the near future, but if there are any accidents involving AI then public fears could lead to freezes in AI research and development. He highlighted the public's irrational tendencies wherein a single accident could cause people to overlook and ignore hundreds of invisible lives saved. He specifically mentioned a 12-24 month timeframe for these issues.
Moore said that verification of AI system safety will be difficult due to the combinatorial explosion of AI behaviors. He talked about meta-machine-learning as a solution to this, something which is being investigated under the direction of Lawrence Schuette at the Office of Naval Research. Moore also said that military AI systems require high verification standards and that development timelines for these systems are long. He talked about two different approaches to AI safety, stochastic testing and theorem proving - the process of doing the latter often leads to the discovery of unsafe edge cases.
He also discussed AI ethics, giving an example 'trolley problem' where AI cars would have to choose whether to hit a deer in order to provide a slightly higher probability of survival for the human driver. He said that we would need hash-defined constants to tell vehicle AIs how many deer a human is worth. He also said that we would need to find compromises in death-pleasantry tradeoffs, for instance where the safety of self-driving cars depends on the speed and routes on which they are driven. He compared the issue to civil engineering where engineers have to operate with an assumption about how much money they would spend to save a human life.
He concluded by saying that we need policymakers, company executives, scientists, and startups to all be involved in AI safety. He said that the research community stands to gain or lose together, and that there is a shared responsibility among researchers and developers to avoid triggering another AI winter through unsafe AI designs.
The next presentation was by Richard Mallah of the Future of Life Institute, who was there to represent "Medium Term AI Safety". He pointed out the explicit/implicit distinction between different modeling techniques in AI systems, as well as the explicit/implicit distinction between different AI actuation techniques. He talked about the difficulty of value specification and the concept of instrumental subgoals as an important issue in the case of complex AIs which are beyond human understanding. He said that even a slight misalignment of AI values with regard to human values along one parameter could lead to a strongly negative outcome, because machine learning parameters don't strictly correspond to the things that humans care about.
Mallah stated that open-world discovery leads to self-discovery, which can lead to reward hacking or a loss of control. He underscored the importance of causal accounting, which is distinguishing causation from correlation in AI systems. He said that we should extend machine learning verification to self-modification. Finally, he talked about introducing non-self-centered ontology to AI systems and bounding their behavior.
The audience was generally quiet and respectful during Richard's talk. I sensed that at least a few of them labelled him as part of the 'superintelligence out-group' and dismissed him accordingly, but I did not learn what most people's thoughts or reactions were. In the next panel featuring three speakers, he wasn't the recipient of any questions regarding his presentation or ideas.
Tom Mitchell from CMU gave the next talk. He talked about both making AI systems safer, and using AI to make other systems safer. He said that risks to humanity from other kinds of issues besides AI were the "big deals of 2016" and that we should make sure that the potential of AIs to solve these problems is realized. He wanted to focus on the detection and remediation of all failures in AI systems. He said that it is a novel issue that learning systems defy standard pre-testing ("as Richard mentioned") and also brought up the purposeful use of AI for dangerous things.
Some interesting points were raised in the panel. Andrew did not have a direct response to the implications of AI ethics being determined by the predominantly white people of the US/UK where most AIs are being developed. He said that ethics in AIs will have to be decided by society, regulators, manufacturers, and human rights organizations in conjunction. He also said that our cost functions for AIs will have to get more and more complicated as AIs get better, and he said that he wants to separate unintended failures from superintelligence type scenarios. On trolley problems in self driving cars and similar issues, he said "it's got to be complicated and messy."
Dario Amodei of Google Deepbrain, who co-authored the paper on concrete problems in AI safety, gave the next talk. He said that the public focus is too much on AGI/ASI and wants more focus on concrete/empirical approaches. He discussed the same problems that pose issues in advanced general AI, including flawed objective functions and reward hacking. He said that he sees long term concerns about AGI/ASI as "extreme versions of accident risk" and that he thinks it's too early to work directly on them, but he believes that if you want to deal with them then the best way to do it is to start with safety in current systems. Mostly he summarized the Google paper in his talk.
In her presentation, Claire Le Goues of CMU said "before we talk about Skynet we should focus on problems that we already have." She mostly talked about analogies between software bugs and AI safety, the similarities and differences between the two and what we can learn from software debugging to help with AI safety.
Robert Rahmer of IARPA discussed CAUSE, a cyberintelligence forecasting program which promises to help predict cyber attacks. It is a program which is still being put together.
In the panel of the above three, autonomous weapons were discussed, but no clear policy stances were presented.
John Launchbury gave a talk on DARPA research and the big picture of AI development. He pointed out that DARPA work leads to commercial applications and that progress in AI comes from sustained government investment. He classified AI capabilities into "describing," "predicting," and "explaining" in order of increasing difficulty, and he pointed out that old fashioned "describing" still plays a large role in AI verification. He said that "explaining" AIs would need transparent decisionmaking and probabilistic programming (the latter would also be discussed by others at the conference).
The next talk came from Jason Gaverick Matheny, the director of IARPA. Matheny talked about four requirements in current and future AI systems: verification, validation, security, and control. He wanted "auditability" in AI systems as a weaker form of explainability. He talked about the importance of "corner cases" for national intelligence purposes, the low probability, high stakes situations where we have limited data - these are situations where we have significant need for analysis but where the traditional machine learning approach doesn't work because of its overwhelming focus on data. Another aspect of national defense is that it has a slower decision tempo, longer timelines, and longer-viewing optics about future events.
He said that assessing local progress in machine learning development would be important for global security and that we therefore need benchmarks to measure progress in AIs. He ended with a concrete invitation for research proposals from anyone (educated or not), for both large scale research and for smaller studies ("seedlings") that could take us "from disbelief to doubt".
The difference in timescales between different groups was something I noticed later on, after hearing someone from the DoD describe their agency as having a longer timeframe than the Homeland Security Agency, and someone from the White House describe their work as being crisis reactionary.
The next presentation was from Andrew Grotto, senior director of cybersecurity policy at the National Security Council. He drew a close parallel from the issue of genetically modified crops in Europe in the 1990's to modern day artificial intelligence. He pointed out that Europe utterly failed to achieve widespread cultivation of GMO crops as a result of public backlash. He said that the widespread economic and health benefits of GMO crops were ignored by the public, who instead focused on a few health incidents which undermined trust in the government and crop producers. He had three key points: that risk frameworks matter, that you should never assume that the benefits of new technology will be widely perceived by the public, and that we're all in this together with regard to funding, research progress and public perception.
In the Q&A between Launchbury, Matheny, and Grotto after Grotto's presentation, it was mentioned that the economic interests of farmers worried about displacement also played a role in populist rejection of GMOs, and that a similar dynamic could play out with regard to automation causing structural unemployment. Grotto was also asked what to do about bad publicity which seeks to sink progress in order to avoid risks. He said that meetings like SafArtInt and open public dialogue were good.
One person asked what Launchbury wanted to do about AI arms races with multiple countries trying to "get there" and whether he thinks we should go "slow and secure" or "fast and risky" in AI development, a question which provoked laughter in the audience. He said we should go "fast and secure" and wasn't concerned. He said that secure designs for the Internet once existed, but the one which took off was the one which was open and flexible.
Another person asked how we could avoid discounting outliers in our models, referencing Matheny's point that we need to include corner cases. Matheny affirmed that data quality is a limiting factor to many of our machine learning capabilities. At IARPA, we generally try to include outliers until they are sure that they are erroneous, said Matheny.
Another presentation came from Tom Dietterich, president of the Association for the Advancement of Artificial Intelligence. He said that we have not focused enough on safety, reliability and robustness in AI and that this must change. Much like Eric Horvitz, he drew a distinction between robustness against errors within the scope of a model and robustness against unmodeled phenomena. On the latter issue, he talked about solutions such as expanding the scope of models, employing multiple parallel models, and doing creative searches for flaws - the latter doesn't enable verification that a system is safe, but it nevertheless helps discover many potential problems. He talked about knowledge-level redundancy as a method of avoiding misspecification - for instance, systems could identify objects by an "ownership facet" as well as by a "goal facet" to produce a combined concept with less likelihood of overlooking key features. He said that this would require wider experiences and more data.
There were many other speakers who brought up a similar set of issues: the user of cybersecurity techniques to verify machine learning systems, the failures of cybersecurity as a field, opportunities for probabilistic programming, and the need for better success in AI verification. Inverse reinforcement learning was extensively discussed as a way of assigning values. Jeanette Wing of Microsoft talked about the need for AIs to reason about the continuous and the discrete in parallel, as well as the need for them to reason about uncertainty (with potential meta levels all the way up). One point which was made by Sarah Loos of Google was that proving the safety of an AI system can be computationally very expensive, especially given the combinatorial explosion of AI behaviors.
In one of the panels, the idea of government actions to ensure AI safety was discussed. No one was willing to say that the government should regulate AI designs. Instead they stated that the government should be involved in softer ways, such as guiding and working with AI developers, and setting standards for certification.
In between these presentations I had time to speak to individuals and listen in on various conversations. A high ranking person from the Department of Defense stated that the real benefit of autonomous systems would be in terms of logistical systems rather than weaponized applications. A government AI contractor drew the connection between Mallah's presentation and the recent press revolving around superintelligence, and said he was glad that the government wasn't worried about it.
I talked to some insiders about the status of organizations such as MIRI, and found that the current crop of AI safety groups could use additional donations to become more established and expand their programs. There may be some issues with the organizations being sidelined; after all, the Google Deepbrain paper was essentially similar to a lot of work by MIRI, just expressed in somewhat different language, and was more widely received in mainstream AI circles.
In terms of careers, I found that there is significant opportunity for a wide range of people to contribute to improving government policy on this issue. Working at a group such as the Office of Science and Technology Policy does not necessarily require advanced technical education, as you can just as easily enter straight out of a liberal arts undergraduate program and build a successful career as long as you are technically literate. (At the same time, the level of skepticism about long term AI safety at the conference hinted to me that the signalling value of a PhD in computer science would be significant.) In addition, there are large government budgets in the seven or eight figure range available for qualifying research projects. I've come to believe that it would not be difficult to find or create AI research programs that are relevant to long term AI safety while also being practical and likely to be funded by skeptical policymakers and officials.
I also realized that there is a significant need for people who are interested in long term AI safety to have basic social and business skills. Since there is so much need for persuasion and compromise in government policy, there is a lot of value to be had in being communicative, engaging, approachable, appealing, socially savvy, and well-dressed. This is not to say that everyone involved in long term AI safety is missing those skills, of course.
I was surprised by the refusal of almost everyone at the conference to take long term AI safety seriously, as I had previously held the belief that it was more of a mixed debate given the existence of expert computer scientists who were involved in the issue. I sensed that the recent wave of popular press and public interest in dangerous AI has made researchers and policymakers substantially less likely to take the issue seriously. None of them seemed to be familiar with actual arguments or research on the control problem, so their opinions didn't significantly change my outlook on the technical issues. I strongly suspect that the majority of them had their first or possibly only exposure to the idea of the control problem after seeing badly written op-eds and news editorials featuring comments from the likes of Elon Musk and Stephen Hawking, which would naturally make them strongly predisposed to not take the issue seriously. In the run-up to the conference, websites and press releases didn't say anything about whether this conference would be about long or short term AI safety, and they didn't make any reference to the idea of superintelligence.
I sympathize with the concerns and strategy given by people such as Andrew Moore and Andrew Grotto, which make perfect sense if (and only if) you assume that worries about long term AI safety are completely unfounded. For the community that is interested in long term AI safety, I would recommend that we avoid competitive dynamics by (a) demonstrating that we are equally strong opponents of bad press, inaccurate news, and irrational public opinion which promotes generic uninformed fears over AI, (b) explaining that we are not interested in removing funding for AI research (even if you think that slowing down AI development is a good thing, restricting funding yields only limited benefits in terms of changing overall timelines, whereas those who are not concerned about long term AI safety would see a restriction of funding as a direct threat to their interests and projects, so it makes sense to cooperate here in exchange for other concessions), and (c) showing that we are scientifically literate and focused on the technical concerns. I do not believe that there is necessarily a need for the two "sides" on this to be competing against each other, so it was disappointing to see an implication of opposition at the conference.
Anyway, Ed Felten announced a request for information from the general public, seeking popular and scientific input on the government's policies and attitudes towards AI: https://www.whitehouse.gov/webform/rfi-preparing-future-artificial-intelligence
Overall, I learned quite a bit and benefited from the experience, and I hope the insight I've gained can be used to improve the attitudes and approaches of the long term AI safety community.
I certainly was not expecting the Economist to publish a special report on paperclip maximizers (!).
As the title suggests, they are downplaying the risks of unfriendly AI, but just the fact that the Economist published this is significant
In theory, the free market and democracy both work because suppliers are incentivized to provide products and services that people want. Economists consider it a perverse situation when the market does not provide what people want, and look for explanations such as government regulation.
The funny thing is that sometimes the market doesn't work, and I look and look for the reason why, and all I can come up with is, People are stupid.
I've written before about the market's apparent failure to provide cup holders in cars. I saw another example this week in the latest Wired magazine, a piece on page 42 about a start-up called Thinx to make re-usable women's underwear that absorbs menstrual fluid--all of it, so women don't have to slip out of the middle of meetings to change tampons. The piece's angle was that venture capitalists rejected the idea because they were mostly men and so didn't "get it".
I'd guess they "got it". It isn't a complicated idea. The thing is, there are already 3 giant companies battling for that market. The first thing a VC would say when you tell him you're going to make something better than a tampon is, "Why haven't Playtex, Kotex, or Tampax already done that?"
So, Thinx did a kickstarter and has now sold hundreds of thousands of thousands of absorbent underwear for about $30 each.
The failure in this case is not that VCs are sexist, but that Playtex, etc., never developed this product, although there evidently is a demand for it, and there is no evident reason it couldn't have been produced 20 years ago. The belief that the market doesn't fail then almost led to a further failure, the failure to develop the product at the present time, because the belief that the market doesn't fail implied the product could not be profitable.
I just now came across an even clearer case of market failure: Sugar-free Tums.
Basically: How does one pursue the truth when direct engagement with evidence is infeasible?
I came to this question while discussing GMO labeling. In this case I am obviously not in a position to experiment for myself, but furthermore: I do not have the time to build up the bank of background understanding to engage vigorously with the study results themselves. I can look at them with a decent secondary education's understanding of experimental method, genetics, and biology, but that is the extent of it.
In this situation I usually find myself reduced to weighing the proclamations of authorities:
- I review aggregations of authority from one side and then the other--because finding a truly unbiased source for contentious issues is always a challenge, and usually says more about the biases of whoever is anointing the source "unbiased."
- Once I have reviewed the authorities, I do at least some due diligence on each authority so that I can modulate my confidence if a particular authority is often considered partisan on an issue. This too can present a bias spiral checking for bias in the source pillorying the authority as partisan ad infinitum.
- Once I have some known degree of confidence in the authorities of both sides, I can form some level of confidence in a statement like: "I am ~x% confident that the scientific consensus is on Y's side" or "I am ~Z% confident that there is not scientific consensus on Y"
What do LWers think about this concept? What do you think is the main rationale for this idea, and do you think it is a good policy?
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.
2016 LessWrong Diaspora Survey Analysis
- Results and Dataset
- LessWrong Usage and Experience
- LessWrong Criticism and Successorship
- Diaspora Community Analysis
- Mental Health Section
- Basilisk Section/Analysis
- Blogs and Media analysis (You are here)
- Calibration Question And Probability Question Analysis
- Charity And Effective Altruism Analysis
We decided to move the Mental Health section up closer in the survey this year so that the data could inform accessibility decisions.
|Condition||Base Rate||LessWrong Rate||LessWrong Self dx Rate||Combined LW Rate||Base/LW Rate Spread||Relative Risk|
|Obsessive Compulsive Disorder||2.3%||2.7%||5.6%||8.3%||+0.4||1.173|
|Autism Spectrum Disorder||1.47%||8.2%||12.9%||21.1%||+6.73||5.578|
|Attention Deficit Disorder||5%||13.6%||10.4%||24%||+8.6||2.719|
|Borderline Personality Disorder||5.9%||0.6%||1.2%||1.8%||-5.3||0.101|
|Substance Use Disorder||10.6%||1.3%||3.6%||4.9%||-9.3||0.122|
Base rates are taken from Wikipedia, US rates were favored over global rates where immediately available.
So of the conditions we asked about, LessWrongers are at significant extra risk for three of them: Autism, ADHD, Depression.
LessWrong probably doesn't need to concern itself with being more accessible to those with autism as it likely already is. Depression is a complicated disorder with no clear interventions that can be easily implemented as site or community policy. It might be helpful to encourage looking more at positive trends in addition to negative ones, but the community already seems to do a fairly good job of this. (We could definitely use some more of it though.)
Attention Deficit Disorder - Public Service Announcement
That leaves ADHD, which we might be able to do something about, starting with this:
A lot of LessWrong stuff ends up falling into the same genre as productivity advice or 'self help'. If you have trouble with getting yourself to work, find yourself reading these things and completely unable to implement them, it's entirely possible that you have a mental health condition which impacts your executive function.
The best overview I've been able to find on ADD is this talk from Russell Barkely.
Ironically enough, this is a long talk, over four hours in total. Barkely is an entertaining speaker and the talk is absolutely fascinating. If you're even mildly interested in the subject I wholeheartedly recommend it. Many people who have ADHD just assume that they're lazy, or not trying hard enough, or just haven't found the 'magic bullet' yet. It never even occurs to them that they might have it because they assume that adult ADHD looks like childhood ADHD, or that ADHD is a thing that psychiatrists made up so they can give children powerful stimulants.
ADD is real, if you're in the demographic that takes this survey there's a decent enough chance you have it.
Attention Deficit Disorder - Accessibility
So with that in mind, is there anything else we can do?
Yes, write better.
Scott Alexander has written a blog post with writing advice for non-fiction, and the interesting thing about it is just how much of the advice is what I would tell you to do if your audience has ADD.
Reward the reader quickly and often. If your prose isn't rewarding to read it won't be read.
Make sure the overall article has good sectioning and indexing, people might be only looking for a particular thing and they won't want to wade through everything else to get it. Sectioning also gives the impression of progress and reduces eye strain.
Use good data visualization to compress information, take away mental effort where possible. Take for example the condition table above. It saves space and provides additional context. Instead of a long vertical wall of text with sections for each condition, it removes:
The extraneous information of how many people said they did not have a condition.
The space that would be used by creating a section for each condition. In fact the specific improvement of the table is that it takes extra advantage of space in the horizontal plane as well as the vertical plane.
And instead of just presenting the raw data, it also adds:
The normal rate of incidence for each condition, so that the reader understands the extent to which rates are abnormal or unexpected.
Easy comparison between the clinically diagnosed, self diagnosed, and combined rates of the condition in the LW demographic. This preserves the value of the original raw data presentation while also easing the mental arithmetic of how many people claim to have a condition.
Percentage spread between the clinically diagnosed and the base rate, which saves the effort of figuring out the difference between the two values.
Relative risk between the clinically diagnosed and the base rate, which saves the effort of figuring out how much more or less likely a LessWronger is to have a given condition.
Add all that together and you've created a compelling presentation that significantly improves on the 'naive' raw data presentation.
Use visuals in general, they help draw and maintain interest.
None of these are solely for the benefit of people with ADD. ADD is an exaggerated profile of normal human behavior. Following this kind of advice makes your article more accessible to everybody, which should be more than enough incentive if you intend to have an audience.1
This year we finally added a Basilisk question! In fact, it kind of turned into a whole Basilisk section. A fairly common question about this years survey is why the Basilisk section is so large. The basic reason is that asking only one or two questions about it would leave the results open to rampant speculation in one direction or another. By making the section comprehensive and covering every base, we've pretty much gotten about as complete of data as we'd want on the Basilisk phenomena.
Do you know what Roko's Basilisk thought experiment is?
Yes: 1521 73.2%
No but I've heard of it: 158 7.6%
No: 398 19.2%
Where did you read Roko's argument for the Basilisk?
Roko's post on LessWrong: 323 20.2%
Reddit: 171 10.7%
XKCD: 61 3.8%
LessWrong Wiki: 234 14.6%
A news article: 71 4.4%
Word of mouth: 222 13.9%
RationalWiki: 314 19.6%
Other: 194 12.1%
Do you think Roko's argument for the Basilisk is correct?
Yes: 75 5.1%
Yes but I don't think it's logical conclusions apply for other reasons: 339 23.1%
No: 1055 71.8%
Basilisks And Lizardmen
One of the biggest mistakes I made with this years survey was not including "Do you believe Barack Obama is a hippopotamus?" as a control question in this section.2 Five percent is just outside of the infamous lizardman constant. This was the biggest survey surprise for me. I thought there was no way that 'yes' could go above a couple of percentage points. As far as I can tell this result is not caused by brigading but I've by no means investigated the matter so thoroughly that I would rule it out.
Of course, we also shouldn't forget to investigate the hypothesis that the number might be higher than 5%. After all, somebody who thinks the Basilisk is correct could skip the questions entirely so they don't face potential stigma. So how many people skipped the questions but filled out the rest of the survey?
Eight people refused to answer whether they'd heard of Roko's Basilisk but went on to answer the depression question immediately after the Basilisk section. This gives us a decent proxy for how many people skipped the section and took the rest of the survey. So if we're pessimistic the number is a little higher, but it pays to keep in mind that there are other reasons to want to skip this section. (It is also possible that people took the survey up until they got to the Basilisk section and then quit so they didn't have to answer it, but this seems unlikely.)
Of course this assumes people are being strictly truthful with their survey answers. It's also plausible that people who think the Basilisk is correct said they'd never heard of it and then went on with the rest of the survey. So the number could in theory be quite large. My hunch is that it's not. I personally know quite a few LessWrongers and I'm fairly sure none of them would tell me that the Basilisk is 'correct'. (In fact I'm fairly sure they'd all be offended at me even asking the question.) Since 5% is one in twenty I'd think I'd know at least one or two people who thought the Basilisk was correct by now.
One partial explanation for the surprisingly high rate here is that ten percent of the people who said yes by their own admission didn't know what they were saying yes to. Eight people said they've heard of the Basilisk but don't know what it is, and that it's correct. The lizardman constant also plausibly explains a significant portion of the yes responses, but that explanation relies on you already having a prior belief that the rate should be low.
Do you think Basilisk-like thought experiments are dangerous?
Yes, I think they're dangerous for decision theory reasons: 63 4.2%
Yes I think they're dangerous for social reasons (eg. A cult might use them): 194 12.8%
Yes I think they're dangerous for decision theory and social reasons: 136 9%
Yes I think they're socially dangerous because they make everybody involved look foolish: 253 16.7%
Yes I think they're dangerous for other reasons: 54 3.6%
No: 809 53.4%
Most people don't think Basilisk-Like thought experiments are dangerous at all. Of those that think they are, most of them think they're socially dangerous as opposed to a raw decision theory threat. The 4.2% number for pure decision theory threat is interesting because it lines up with the 5% number in the previous question for Basilisk Correctness.
P(Decision Theory Danger | Basilisk Belief) = 26.6%
P(Decision Theory And Social Danger | Basilisk Belief) = 21.3%
So of the people who say the Basilisk is correct, only half of them believe it is a decision theory based danger at all. (In theory this could be because they believe the Basilisk is a good thing and therefore not dangerous, but I refuse to lose that much faith in humanity.3)
Have you ever felt any sort of anxiety about the Basilisk?
Yes: 142 8.8%
Yes but only because I worry about everything: 189 11.8%
No: 1275 79.4%
20.6% of respondents have felt some kind of Basilisk Anxiety. It should be noted that the exact wording of the question permits any anxiety, even for a second. And as we'll see in the next question that nuance is very important.
Degree Of Basilisk Worry
What is the longest span of time you've spent worrying about the Basilisk?
I haven't: 714 47%
A few seconds: 237 15.6%
A minute: 298 19.6%
An hour: 176 11.6%
A day: 40 2.6%
Two days: 16 1.05%
Three days: 12 0.79%
A week: 12 0.79%
A month: 5 0.32%
One to three months: 2 0.13%
Three to six months: 0 0.0%
Six to nine months: 0 0.0%
Nine months to a year: 1 0.06%
Over a year: 1 0.06%
Years: 4 0.26%
These numbers provide some pretty sobering context for the previous ones. Of all the people who worried about the Basilisk, 93.8% didn't worry about it for more than an hour. The next 3.65% didn't worry about it for more than a day or two. The next 1.9% didn't worry about it for more than a month and the last .7% or so have worried about it for longer.
Current Basilisk Worry
Are you currently worrying about the Basilisk?
Yes: 29 1.8%
Yes but only because I worry about everything: 60 3.7%
No: 1522 94.5%
Also encouraging. We should expect a small number of people to be worried at this question just because the section is basically the word "Basilisk" and "worry" repeated over and over so it's probably a bit scary to some people. But these numbers are much lower than the "Have you ever worried" ones and back up the previous inference that Basilisk anxiety is mostly a transitory phenomena.
One article on the Basilisk asked the question of whether or not it was just a "referendum on autism". It's a good question and now I have an answer for you, as per the table below:
|Condition||Worried||Worried But They Worry About Everything||Combined Worry|
|Baseline (in the respondent population)||8.8%||11.8%||20.6%|
The short answer: Autism raises your chances of Basilisk anxiety, but anxiety disorders and OCD especially raise them much more. Interestingly enough, schizophrenia seems to bring the chances down. This might just be an effect of small sample size, but my expectation was the opposite. (People who are really obsessed with Roko's Basilisk seem to present with schizophrenic symptoms at any rate.)
Before we move on, there's one last elephant in the room to contend with. The philosophical theory underlying the Basilisk is the CEV conception of friendly AI primarily espoused by Eliezer Yudkowsky. Which has led many critics to speculate on all kinds of relationships between Eliezer Yudkowsky and the Basilisk. Which of course obviously would extend to Eliezer Yudkowsky's Machine Intelligence Research Institute, a project to develop 'Friendly Artificial Intelligence' which does not implement a naive goal function that eats everything else humans actually care about once it's given sufficient optimization power.
The general thrust of these accusations is that MIRI, intentionally or not, profits from belief in the Basilisk. I think MIRI gets picked on enough, so I'm not thrilled about adding another log to the hefty pile of criticism they deal with. However this is a serious accusation which is plausible enough to be in the public interest for me to look at.
|Believe It's Incorrect||5.2%|
|Believe It's Structurally Correct||5.6%|
|Believe It's Correct||12.0%|
Basilisk belief does appear to make you twice as likely to donate to MIRI. It's important to note from the perspective of earlier investigation that thinking it is "structurally correct" appears to make you about as likely as if you don't think it's correct, implying that both of these options mean about the same thing.
|Believe It's Incorrect||1365.590||100.0||100.0||4825.293||75107.5|
|Believe It's Structurally Correct||2644.736||110.0||20.0||9147.299||50250.0|
|Believe It's Correct||740.555||300.0||300.0||1152.541||6665.0|
Take these numbers with a grain of salt, it only takes one troll to plausibly lie about their income to ruin it for everybody else.
Interestingly enough, if you sum all three total donated counts and divide by a hundred, you find that five percent of the sum is about what was donated by the Basilisk group. ($6601 to be exact) So even though the modal and median donations of Basilisk believers are higher, they donate about as much as would be naively expected by assuming donations among groups are equal.4
|Worried But They Worry About Everything||11.1%|
In contrast to the correctness question, merely having worried about the Basilisk at any point in time doubles your chances of donating to MIRI. My suspicion is that these people are not, as a general rule, donating because of the Basilisk per se. If you're the sort of person who is even capable of worrying about the Basilisk in principle, you're probably the kind of person who is likely to worry about AI risk in general and donate to MIRI on that basis. This hypothesis is probably unfalsifiable with the survey information I have, because Basilisk-risk is a subset of AI risk. This means that anytime somebody indicates on the survey that they're worried about AI risk this could be because they're worried about the Basilisk or because they're worried about more general AI risk.
|Worried But They Worry About Everything||227.047||75.0||300.0||438.861||4768.0|
Take these numbers with a grain of salt, it only takes one troll to plausibly lie about their income to ruin it for everybody else.
This particular analysis is probably the strongest evidence in the set for the hypothesis that MIRI profits (though not necessarily through any involvement on their part) from the Basilisk. People who worried from an unendorsed perspective donate less on average than everybody else. The modal donation among people who've worried about the Basilisk is ten dollars, which seems like a surefire way to torture if we're going with the hypothesis that these are people who believe the Basilisk is a real thing and they're concerned about it. So this implies that they don't, which supports my earlier hypothesis that people who are capable of feeling anxiety about the Basilisk are the core demographic to donate to MIRI anyway.
Of course, donors don't need to believe in the Basilisk for MIRI to profit from it. If exposing people to the concept of the Basilisk makes them twice as likely to donate but they don't end up actually believing the argument that would arguably be the ideal outcome for MIRI from an Evil Plot perspective. (Since after all, pursuing a strategy which involves Basilisk belief would actually incentivize torture from the perspective of the acausal game theories MIRI bases its FAI on, which would be bad.)
But frankly this is veering into very speculative territory. I don't think there's an evil plot, nor am I convinced that MIRI is profiting from Basilisk belief in a way that outweighs the resulting lost donations and damage to their cause.5 If anybody would like to assert otherwise I invite them to 'put up or shut up' with hard evidence. The world has enough criticism based on idle speculation and you're peeing in the pool.
Blogs and Media
Since this was the LessWrong diaspora survey, I felt it would be in order to reach out a bit to ask not just where the community is at but what it's reading. I went around to various people I knew and asked them about blogs for this section. However the picks were largely based on my mental 'map' of the blogs that are commonly read/linked in the community with a handful of suggestions thrown in. The same method was used for stories.
Regular Reader: 239 13.4%
Sometimes: 642 36.1%
Rarely: 537 30.2%
Almost Never: 272 15.3%
Never: 70 3.9%
Never Heard Of It: 14 0.7%
SlateStarCodex (Scott Alexander)
Regular Reader: 1137 63.7%
Sometimes: 264 14.7%
Rarely: 90 5%
Almost Never: 61 3.4%
Never: 51 2.8%
Never Heard Of It: 181 10.1%
[These two results together pretty much confirm the results I talked about in part two of the survey analysis. A supermajority of respondents are 'regular readers' of SlateStarCodex. By contrast LessWrong itself doesn't even have a quarter of SlateStarCodexes readership.]
Overcoming Bias (Robin Hanson)
Regular Reader: 206 11.751%
Sometimes: 365 20.821%
Rarely: 391 22.305%
Almost Never: 385 21.962%
Never: 239 13.634%
Never Heard Of It: 167 9.527%
Minding Our Way (Nate Soares)
Regular Reader: 151 8.718%
Sometimes: 134 7.737%
Rarely: 139 8.025%
Almost Never: 175 10.104%
Never: 214 12.356%
Never Heard Of It: 919 53.06%
Agenty Duck (Brienne Yudkowsky)
Regular Reader: 55 3.181%
Sometimes: 132 7.634%
Rarely: 144 8.329%
Almost Never: 213 12.319%
Never: 254 14.691%
Never Heard Of It: 931 53.846%
Eliezer Yudkowsky's Facebook Page
Regular Reader: 325 18.561%
Sometimes: 316 18.047%
Rarely: 231 13.192%
Almost Never: 267 15.248%
Never: 361 20.617%
Never Heard Of It: 251 14.335%
Luke Muehlhauser (Eponymous)
Regular Reader: 59 3.426%
Sometimes: 106 6.156%
Rarely: 179 10.395%
Almost Never: 231 13.415%
Never: 312 18.118%
Never Heard Of It: 835 48.49%
Gwern.net (Gwern Branwen)
Regular Reader: 118 6.782%
Sometimes: 281 16.149%
Rarely: 292 16.782%
Almost Never: 224 12.874%
Never: 230 13.218%
Never Heard Of It: 595 34.195%
Siderea (Sibylla Bostoniensis)
Regular Reader: 29 1.682%
Sometimes: 49 2.842%
Rarely: 59 3.422%
Almost Never: 104 6.032%
Never: 183 10.615%
Never Heard Of It: 1300 75.406%
Ribbon Farm (Venkatesh Rao)
Regular Reader: 64 3.734%
Sometimes: 123 7.176%
Rarely: 111 6.476%
Almost Never: 150 8.751%
Never: 150 8.751%
Never Heard Of It: 1116 65.111%
Bayesed And Confused (Michael Rupert)
Regular Reader: 2 0.117%
Sometimes: 10 0.587%
Rarely: 24 1.408%
Almost Never: 68 3.988%
Never: 167 9.795%
Never Heard Of It: 1434 84.106%
[This was the 'troll' answer to catch out people who claim to read everything.]
The Unit Of Caring (Anonymous)
Regular Reader: 281 16.452%
Sometimes: 132 7.728%
Rarely: 126 7.377%
Almost Never: 178 10.422%
Never: 216 12.646%
Never Heard Of It: 775 45.375%
GiveWell Blog (Multiple Authors)
Regular Reader: 75 4.438%
Sometimes: 197 11.657%
Rarely: 243 14.379%
Almost Never: 280 16.568%
Never: 412 24.379%
Never Heard Of It: 482 28.521%
Thing Of Things (Ozy Frantz)
Regular Reader: 363 21.166%
Sometimes: 201 11.72%
Rarely: 143 8.338%
Almost Never: 171 9.971%
Never: 176 10.262%
Never Heard Of It: 661 38.542%
The Last Psychiatrist (Anonymous)
Regular Reader: 103 6.023%
Sometimes: 94 5.497%
Rarely: 164 9.591%
Almost Never: 221 12.924%
Never: 302 17.661%
Never Heard Of It: 826 48.304%
Hotel Concierge (Anonymous)
Regular Reader: 29 1.711%
Sometimes: 35 2.065%
Rarely: 49 2.891%
Almost Never: 88 5.192%
Never: 179 10.56%
Never Heard Of It: 1315 77.581%
The View From Hell (Sister Y)
Regular Reader: 34 1.998%
Sometimes: 39 2.291%
Rarely: 75 4.407%
Almost Never: 137 8.049%
Never: 250 14.689%
Never Heard Of It: 1167 68.566%
Xenosystems (Nick Land)
Regular Reader: 51 3.012%
Sometimes: 32 1.89%
Rarely: 64 3.78%
Almost Never: 175 10.337%
Never: 364 21.5%
Never Heard Of It: 1007 59.48%
I tried my best to have representation from multiple sections of the diaspora, if you look at the different blogs you can probably guess which blogs represent which section.
Harry Potter And The Methods Of Rationality (Eliezer Yudkowsky)
Whole Thing: 1103 61.931%
Partially And Intend To Finish: 145 8.141%
Partially And Abandoned: 231 12.97%
Never: 221 12.409%
Never Heard Of It: 81 4.548%
Significant Digits (Alexander D)
Whole Thing: 123 7.114%
Partially And Intend To Finish: 105 6.073%
Partially And Abandoned: 91 5.263%
Never: 333 19.26%
Never Heard Of It: 1077 62.29%
Three Worlds Collide (Eliezer Yudkowsky)
Whole Thing: 889 51.239%
Partially And Intend To Finish: 35 2.017%
Partially And Abandoned: 36 2.075%
Never: 286 16.484%
Never Heard Of It: 489 28.184%
The Fable of the Dragon-Tyrant (Nick Bostrom)
Whole Thing: 728 41.935%
Partially And Intend To Finish: 31 1.786%
Partially And Abandoned: 15 0.864%
Never: 205 11.809%
Never Heard Of It: 757 43.606%
The World of Null-A (A. E. van Vogt)
Whole Thing: 92 5.34%
Partially And Intend To Finish: 18 1.045%
Partially And Abandoned: 25 1.451%
Never: 429 24.898%
Never Heard Of It: 1159 67.266%
[Wow, I never would have expected this many people to have read this. I mostly included it on a lark because of its historical significance.]
Synthesis (Sharon Mitchell)
Whole Thing: 6 0.353%
Partially And Intend To Finish: 2 0.118%
Partially And Abandoned: 8 0.47%
Never: 217 12.75%
Never Heard Of It: 1469 86.31%
[This was the 'troll' option to catch people who just say they've read everything.]
Whole Thing: 501 28.843%
Partially And Intend To Finish: 168 9.672%
Partially And Abandoned: 184 10.593%
Never: 430 24.755%
Never Heard Of It: 454 26.137%
Whole Thing: 138 7.991%
Partially And Intend To Finish: 59 3.416%
Partially And Abandoned: 148 8.57%
Never: 501 29.01%
Never Heard Of It: 881 51.013%
Whole Thing: 55 3.192%
Partially And Intend To Finish: 132 7.661%
Partially And Abandoned: 65 3.772%
Never: 560 32.501%
Never Heard Of It: 911 52.873%
Ra (Sam Hughes)
Whole Thing: 269 15.558%
Partially And Intend To Finish: 80 4.627%
Partially And Abandoned: 95 5.495%
Never: 314 18.161%
Never Heard Of It: 971 56.16%
My Little Pony: Friendship Is Optimal (Iceman)
Whole Thing: 424 24.495%
Partially And Intend To Finish: 16 0.924%
Partially And Abandoned: 65 3.755%
Never: 559 32.293%
Never Heard Of It: 667 38.533%
Friendship Is Optimal: Caelum Est Conterrens (Chatoyance)
Whole Thing: 217 12.705%
Partially And Intend To Finish: 16 0.937%
Partially And Abandoned: 24 1.405%
Never: 411 24.063%
Never Heard Of It: 1040 60.89%
Ender's Game (Orson Scott Card)
Whole Thing: 1177 67.219%
Partially And Intend To Finish: 22 1.256%
Partially And Abandoned: 43 2.456%
Never: 395 22.559%
Never Heard Of It: 114 6.511%
[This is the most read story according to survey respondents, beating HPMOR by 5%.]
The Diamond Age (Neal Stephenson)
Whole Thing: 440 25.346%
Partially And Intend To Finish: 37 2.131%
Partially And Abandoned: 55 3.168%
Never: 577 33.237%
Never Heard Of It: 627 36.118%
Consider Phlebas (Iain Banks)
Whole Thing: 302 17.507%
Partially And Intend To Finish: 52 3.014%
Partially And Abandoned: 47 2.725%
Never: 439 25.449%
Never Heard Of It: 885 51.304%
The Metamorphosis Of Prime Intellect (Roger Williams)
Whole Thing: 226 13.232%
Partially And Intend To Finish: 10 0.585%
Partially And Abandoned: 24 1.405%
Never: 322 18.852%
Never Heard Of It: 1126 65.925%
Accelerando (Charles Stross)
Whole Thing: 293 17.045%
Partially And Intend To Finish: 46 2.676%
Partially And Abandoned: 66 3.839%
Never: 425 24.724%
Never Heard Of It: 889 51.716%
A Fire Upon The Deep (Vernor Vinge)
Whole Thing: 343 19.769%
Partially And Intend To Finish: 31 1.787%
Partially And Abandoned: 41 2.363%
Never: 508 29.28%
Never Heard Of It: 812 46.801%
I also did a k-means cluster analysis of the data to try and determine demographics and the ultimate conclusion I drew from it is that I need to do more analysis. Which I would do, except that the initial analysis was a whole bunch of work and jumping further down the rabbit hole in the hopes I reach an oasis probably isn't in the best interests of myself or my readers.
This is a general trend I notice with accessibility. Not always, but very often measures taken to help a specific group end up having positive effects for others as well. Many of the accessibility suggestions of the W3C are things you wish every website did.↩
I hadn't read this particular SSC post at the time I compiled the survey, but I was already familiar with the concept of a lizardman constant and should have accounted for it.↩
I've been informed by a member of the freenode #lesswrong IRC channel that this is in fact Roko's opinion, because you can 'timelessly trade with the future superintelligence for rewards, not just punishment' according to a conversation they had with him last summer. Remember kids: Don't do drugs, including Max Tegmark.↩
You might think that this conflicts with the hypothesis that the true rate of Basilisk belief is lower than 5%. It does a bit, but you also need to remember that these people are in the LessWrong demographic, which means regardless of what the Basilisk belief question means we should naively expect them to donate five percent of the MIRI donation pot.↩
That is to say, it does seem plausible that MIRI 'profits' from Basilisk belief based on this data, but I'm fairly sure any profit is outweighed by the significant opportunity cost associated with it. I should also take this moment to remind the reader that the original Basilisk argument was supposed to prove that CEV is a flawed concept from the perspective of not having deleterious outcomes for people, so MIRI using it as a way to justify donating to them would be weird.↩
Lately, I’ve been musing on the nature of self-improvement in general. When I notice that something I’ve been doing-- be it mental or physical, the next immediate chain of thought is “Okay, how do I improve my life now, knowing this phenomena exists?” In doing so, I’ve recently realized that this is missing a crucial distinction that can lead to more confusion later down the road.
This important divide is the question of optimizing around, or powering through. So before figuring out what actions I should be taking, it seems important to ask myself, “What am I trying to optimize for?” If the negative biases and habits I manage to identify are rocks, then the question is whether or not the best plan of action is to plan around these rocks, or crush them entirely. This is far from a clear-cut division, however. It appears that breaking bad habits--powering through is going to be more costly in terms of resources spent. Additionally, a successful plan for overcoming these errors will probably have a mix of these, especially if ridding oneself of the tendency entirely is the goal.
For an example of how these two are often blurred, take the planning fallacy:
One strategy may be to overestimate times when planning, pushing through the “it feels wrong” feeling to develop a better sense of how long things take. To augment this, there are also planning techniques, like Murphyjitsu designed to get you considering “hidden factors”. It’s far from clear how much actions that compensate for biases by countering their effects actually reduce the bias entirely, especially if the helpful action also becomes second nature.
But overall, I think this is an important distinction to keep in mind, because I’ll often be stuck asking myself “Should I work around X, or should I actively try to defeat X?”
Does anyone have experience trying to go specifically in one way or the other to counter their biases?
New meetups (or meetups with a hiatus of more than a year) are happening in:
Irregularly scheduled Less Wrong meetups are taking place in:
- Baltimore Weekly Meetup: 26 June 2016 08:00PM
- European Community Weekend: 02 September 2016 03:35PM
- San Antonio Meetup: 26 June 2016 02:00PM
- Sao Paulo - Meetup de junho: 25 June 2016 02:00PM
The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:
- Moscow: Words of estimative probability, transparency illusion, belief investigation, rational games: 26 June 2016 02:00PM
- [Moscow] Games in Kocherga club: FallacyMania, Tower of Chaos, Training game: 29 June 2016 07:40PM
- Moscow LW meetup in "Nauchka" library: 01 July 2016 07:50AM
- [Moscow] Role playing game based on HPMOR in Moscow: 16 July 2016 03:00PM
- [NYC] Summer Solstice Party, NYC: 25 June 2016 03:00PM
- San Francisco Meetup: Projects: 27 June 2016 06:15PM
- Sydney Rationality Dojo - July: 03 July 2016 04:00PM
- Washington, D.C.: Fermi Estimates: 26 June 2016 03:30PM
Locations with regularly scheduled meetups: Austin, Berlin, Boston, Brussels, Buffalo, Canberra, Columbus, Denver, Kraków, London, Madison WI, Melbourne, Moscow, New Hampshire, New York, Philadelphia, Research Triangle NC, San Francisco Bay Area, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers and a Slack channel for daily discussion and online meetups on Sunday night US time.
Guidelines: Top-level comments here should be links to things written by members of the rationalist community, preferably that have some particular interest to this community. Self-promotion is totally fine. Including a brief summary or excerpt is great, but not required. Generally stick to one link per top-level comment, so they can be voted on individually. Recent links are preferred.
Rule: Do not link to anyone who does not want to be linked to. In particular, Scott Alexander has asked people to get his permission, before linking to specific posts on his tumblr or in other out-of-the-way places.
View more: Next