AI risk-related improvements to the LW wiki
Back in May, Luke suggested the creation of a scholarly AI risk wiki, which was to include a large set of summary articles on topics related to AI risk, mapped out in terms of how they relate to the central debates about AI risk. In response, Wei Dai suggested that, among other things, the existing Less Wrong wiki could be improved instead. As a result, the Singularity Institute has massively improved the LW wiki in preparation for a more ambitious scholarly AI risk wiki. The outcome was the creation or dramatic expansion of the following articles:
- 5-and-10
- Acausal Trade
- Acceleration thesis
- Agent
- AGI chaining
- AGI skepticism
- AGI Sputnik moment
- AI advantages
- AI arms race
- AI Boxing
- AI-complete
- AI takeoff
- AIXI
- Algorithmic complexity
- Anvil problem
- Astronomical waste
- Bayesian decision theory
- Benevolence
- Ben Goertzel
- Bias
- Biological Cognitive Enhancement
- Brain-computer interfaces
- Carl Shulman
- Causal decision theory
- Church-Turing thesis
- Coherent Aggregated Volition
- Coherent Blended Volition
- Coherent Extrapolated Volition
- Computing overhang
- Computronium
- Consequentialism
- Counterfactual mugging
- Creating Friendly AI
- Cyc
- Decision theory
- Differential intellectual progress
- Economic consequences of AI and whole brain emulation
- Eliezer Yudkowsky
- Empathic inference
- Emulation argument for human-level AI
- EURISKO
- Event horizon thesis
- Evidential Decision Theory
- Evolutionary algorithm
- Evolutionary argument for human-level AI
- Existential risk
- Expected utility
- Expected value
- Extensibility argument for greater-than-human intelligence
- FAI-complete
- Fallacy
- Fragility of value
- Friendly AI
- Fun Theory
- Future of Humanity Institute
- Game theory
- Gödel machine
- Great Filter
- History of AI risk thought
- Human-AGI integration and trade
- Induction
- Infinities in ethics
- Information hazard
- Instrumental convergence thesis
- Intelligence
- Intelligence explosion
- Jeeves Problem
- Lifespan dilemma
- Machine ethics
- Machine learning
- Malthusian Scenarios
- Metaethics
- Moore's law
- Moral divergence
- Moral uncertainty
- Nanny AI
- Nanotechnology
- Neuromorphic AI
- Nick Bostrom
- Nonperson predicate
- Observation selection effect
- Ontological crisis
- Optimal philanthropy
- Optimization power
- Optimization process
- Oracle AI
- Orthogonality thesis
- Paperclip maximizer
- Pascal's mugging
- Prediction market
- Preference
- Prior probability
- Probability theory
- Recursive self-improvement
- Reflective decision theory
- Regulation and AI risk
- Reinforcement learning
- Search space
- Seed AI
- Self Indication Assumption
- Self Sampling Assumption
- Scoring rule
- Simulation argument
- Simulation hypothesis
- Singleton
- Singularitarianism
- Singularity
- Subgoal stomp
- Superintelligence
- Superorganism
- Technological forecasting
- Technological revolution
- Terminal value
- Timeless decision theory
- Tool AI
- Unfriendly AI
- Universal intelligence
- Utility
- Utility extraction
- Utility indifference
- Value extrapolation
- Value learning
- Whole brain emulation
- Wireheading
In managing the project, I focused on content over presentation, so a number of articles still have minor issues with grammar and style. It's our hope that, with the largest part of the work already done, the LW community will help improve the articles even further.
Thanks to everyone who worked on these pages: Alex Altair, Adam Bales, Caleb Bell, Costanza Riccioli, Daniel Trenor, João Lourenço, Joshua Fox, Patrick Rhodes, Pedro Chaves, Stuart Armstrong, and Steven Kaas.
Think Twice: A Response to Kevin Kelly on ‘Thinkism’
I wrote a blog post responding to Kevin Kelly that I'm fairly happy with. It summarizes some of the reasons why I think superintelligence is likely to be a fairly big deal. If you read it, please post your comments here.
Less Wrong Polls in Comments
You can now write Less Wrong comments that contain polls! John Simon picked up and completed some code I had started back in 2010 but never finished, and our admins Wesley Moore and Matt Fallshaw have deployed it. You can use it right now, so let's give it some testing here in this thread.
The polls work through the existing Markdown comment formatting, similar to the syntax used for links. Full documentation is in the wiki; the short version is that you can write comments like this:
What is your favorite color? [poll]{Red}{Green}{Blue}{Other}
How long has it been your favorite color, in years? [poll:number]
Red is a nice color [poll:Agree....Disagree]
Will your favorite color change? [poll:probability]
To see the results of the poll, you have to vote (you can leave questions blank if you want). The results include a link to the raw poll data, including the usernames of people who submitted votes with the "Vote anonymously" box unchecked. After you submit the comment, if you go back and edit it, all the poll tags will have turned into "Error: Poll belongs to a different comment". You can edit the rest of the comment without resetting the poll, but you can't change the options.
It works right now, but it's also new and could be buggy. Let's give it some testing; what have you always wanted to know about Less Wrongers?
New study on choice blindness in moral positions
Change blindness is the phenomenon whereby people fail to notice changes in scenery and the like if they're not directed to pay attention to them. There are countless videos online demonstrating this effect (one of my favorites here, by Richard Wiseman).
One of the most audacious and famous experiments is known informally as "the door study": an experimenter asks a passerby for directions but is interrupted by a pair of construction workers carrying an unhinged door, which conceals another person who replaces the experimenter as the door passes. Incredibly, the person giving directions rarely notices that they are now talking to a completely different person. This effect was reproduced by Derren Brown on British TV (here's an amateur re-enactment).
Subsequently, a pair of Swedish researchers familiar with sleight-of-hand magic conceived a new twist on this line of research, arguably even more audacious: have participants make a choice, then quietly swap that choice for something else. People not only fail to notice the change, but confabulate reasons why they had preferred the counterfeit choice (video here). They called their new paradigm "Choice Blindness".
Just recently, the same Swedish researchers published a new study that is even more shocking. Rather than demonstrating choice blindness by having participants choose between two photographs, they demonstrated the same effect with moral propositions. Participants completed a survey asking them to agree or disagree with statements such as "large scale governmental surveillance of e-mail and Internet traffic ought to be forbidden as a means to combat international crime and terrorism". When they reviewed their copy of the survey, their responses had been covertly changed: 69% failed to notice at least one of two changes, and when asked to explain their answers, 53% argued in favor of what they falsely believed was their original choice, even though they had previously indicated the opposite moral position (study here, video here).
Random LW-parodying Statement Generator
So, I was looking at this, and then suddenly this thing happened.
EDIT:
New version! I updated the link above to it as well. Added LOADS and LOADS of new content, although I'm not entirely sure it's actually more fun (my guess is there's more total fun due to variety, but that it's more diluted).
I ended up working on this basically the entire day today, and implemented practically all the ideas I have so far, except for some grammar issues that would require a disproportionate amount of work. So unless there are loads of suggestions or my brain comes up with lots of new ideas over the next few days, this may be the last version for a while, and I may call it beta and ask for spell-checking. Still alpha as of writing this, though.
Since there were some close calls already, I'll restate this explicitly: it'd be easier for everyone if there weren't any forks for at least a few more days, even ones just for spell-checking. After that, or once I move this to beta, feel more than free to do whatever you want.
Thanks to everyone who commented! ^_^
Old source, old version, latest source
Credits: http://lesswrong.com/lw/d2w/cards_against_rationality/ , http://lesswrong.com/lw/9ki/shit_rationalists_say/ , various people commenting on this article with suggestions, and random people on the bay12 forums who, ages ago, helped me with the engine this is descended from.
Friendship is Optimal: A My Little Pony fanfic about an optimization process
[EDIT, Nov 14th: And it's posted. New discussion about release. Link to Friendship is Optimal.]
[EDIT, Nov 13th: I've submitted it to FIMFiction, and will update with a link to its permanent home if it passes moderation. I have also removed the docs link and will make the document private once it goes live.]
Over the last year, I’ve spent a lot of my free time writing a semi-rationalist My Little Pony fanfic. Whenever I’ve mentioned this side project, I’ve received requests to alpha the story.
I present, as an open beta: Friendship is Optimal. Please do not spread that link outside of LessWrong; Google Docs is not its permanent home. I intend to put it up on fanfiction.net and submit it to Equestria Daily after incorporating any feedback. The story is complete, and I believe I've caught the majority of typographical and grammatical problems. (Though if you find some, comments are open on the doc itself.) Given the subject matter, I’m asking for the LessWrong community’s help in spotting any major logical flaws or other storytelling problems.
Cover jacket text:
Hanna, the CEO of Hofvarpnir Studios, just won the contract to write the official My Little Pony MMO. She had better hurry; a US military contractor is developing weapons based on her artificial intelligence technology, which just may destroy the world. Hanna has built an A.I. Princess Celestia and given her one basic drive: to satisfy values through friendship and ponies. What will Princess Celestia do when she's let loose upon the world, following the drives Hanna has given her?
Special thanks to my roommate (who did extensive editing and was invaluable in noticing attempts by me to anthropomorphize an AI), and to Vaniver, who, along with my roommate, convinced me to delete what was just a flat-out bad chapter.
The noncentral fallacy - the worst argument in the world?
Related to: Leaky Generalizations, Replace the Symbol With The Substance, Sneaking In Connotations
David Stove once ran a contest to find the Worst Argument In The World, but he awarded the prize to his own entry, and one that shored up his politics to boot. It hardly seems like an objective process.
If he can unilaterally declare a Worst Argument, then so can I. I declare the Worst Argument In The World to be this: "X is in a category whose archetypal member gives us a certain emotional reaction. Therefore, we should apply that emotional reaction to X, even though it is not a central category member."
Call it the Noncentral Fallacy. It sounds dumb when you put it like that. Who even does that, anyway?
It sounds dumb only because we are talking soberly of categories and features. As soon as the argument gets framed in terms of words, it becomes so powerful that somewhere between many and most of the bad arguments in politics, philosophy, and culture take the form of the noncentral fallacy. Before we get to those, let's look at a simpler example.
Suppose someone wants to build a statue honoring Martin Luther King Jr. for his nonviolent resistance to racism. An opponent of the statue objects: "But Martin Luther King was a criminal!"
Any historian can confirm this is correct. A criminal is technically someone who breaks the law, and King knowingly broke a law against peaceful anti-segregation protest - hence his famous Letter from Birmingham Jail.
But in this case, calling Martin Luther King a criminal is the noncentral fallacy. The archetypal criminal is a mugger or bank robber. He is driven only by greed, preys on the innocent, and weakens the fabric of society. Since we don't like these things, calling someone a "criminal" naturally lowers our opinion of them.
The opponent is saying "Because you don't like criminals, and Martin Luther King is a criminal, you should stop liking Martin Luther King." But King doesn't share the important criminal features of being driven by greed, preying on the innocent, or weakening the fabric of society that made us dislike criminals in the first place. Therefore, even though he is a criminal, there is no reason to dislike King.
This all seems so nice and logical when it's presented in this format. Unfortunately, it's also one hundred percent contrary to instinct: the urge is to respond "Martin Luther King? A criminal? No he wasn't! You take that back!" This is why the noncentral fallacy is so successful. As soon as you do that, you've fallen into their trap. Your argument is no longer about whether you should build a statue, it's about whether King was a criminal. Since he was, you have now lost the argument.
Ideally, you should just be able to say "Well, King was the good kind of criminal." But that seems pretty tough as a debating maneuver, and it may be even harder in some of the cases where the noncentral fallacy is commonly used.
"Epiphany addiction"
LW doesn't seem to have a discussion of the article Epiphany Addiction, by Chris at succeedsocially. First paragraph:
"Epiphany Addiction" is an informal little term I came up with to describe a process that I've observed happen to people who try to work on their personal issues. How it works is that someone will be trying to solve a problem they have, say a lack of confidence around other people. Somehow they'll come across a piece of advice or a motivational snippet that will make them have an epiphany or a profound realization. This often happens when people are reading self-help materials and they come across something that stands out to them. People can also come up with epiphanies themselves if they're doing a lot of writing and reflecting in an attempt to try and analyze their problems.
I like that article because it describes a dangerous failure mode of smart people. One example was the self-help blog of Phillip Eby (pjeby), where each new post seemed to bring new amazing insights, and after a while you became jaded. An even better, though controversial, example could be Eliezer's Sequences, if you view them as a series of epiphanies about AI research that didn't lead to much tangible progress. (Please don't make that statement the sole focus of discussion!)
The underlying problem seems to be that people get a rush of power from neat-sounding realizations, and mistake that feeling for actual power. I don't know any good remedy for that, but being aware of the problem could help.
Notes on the Psychology of Power
Luke/SI asked me to look into what the academic literature might have to say about people in positions of power. This is a summary of some of the recent psychology results.
The powerful or elite are: fast-planning abstract thinkers who take action (1) in order to pursue single or minimal objectives, are in favor of strict rules for their stereotyped out-group underlings (2), but are rationalizing (3) and hypocritical when it serves their interests (4), especially when they feel secure in their power. They break social norms (5, 6) or ignore context (1), which turns out to be worsened by disclosure of conflicts of interest (7), and they lie fluently without mental or physiological stress (6).
What are powerful members good for? They can help in shifting among equilibria: solving coordination problems or inducing contributions towards public goods (8), and their abstracted Far perspective can be better than the concrete Near of the weak (9).
1. Galinsky et al. 2003; Guinote 2007; Lammers et al. 2008; Smith & Bargh 2008
2. Eyal & Liberman
3. Rustichini & Villeval 2012
4. Lammers et al. 2010
5. Kleef et al. 2011
6. Carney et al. 2010
7. Cain et al. 2005; Cain et al. 2011
8. Eckel et al. 2010
9. Slabu et al.; Smith & Trope 2006; Smith et al. 2008
Why We Can't Take Expected Value Estimates Literally (Even When They're Unbiased)
Note: I am cross-posting this GiveWell Blog post, after consulting a couple of community members, because it is relevant to many topics discussed on Less Wrong, particularly efficient charity/optimal philanthropy and Pascal's Mugging. The post includes a proposed "solution" to the dilemma posed by Pascal's Mugging that, as far as I know, has not been suggested before. It is longer than usual for a Less Wrong post, so I have put everything but the summary below the fold. Also, note that I use the term "expected value" because it is more generic than "expected utility"; the arguments here pertain to estimating the expected value of any quantity, not just utility.
While some people feel that GiveWell puts too much emphasis on the measurable and quantifiable, there are others who go further than we do in quantification, and justify their giving (or other) decisions based on fully explicit expected-value formulas. The latter group tends to critique us - or at least disagree with us - based on our preference for strong evidence over high apparent "expected value," and based on the heavy role of non-formalized intuition in our decision-making. This post is directed at the latter group.
We believe that people in this group are often making a fundamental mistake, one that we have long had intuitive objections to but have recently developed a more formal (though still fairly rough) critique of. The mistake (we believe) is estimating the "expected value" of a donation (or other action) based solely on a fully explicit, quantified formula, many of whose inputs are guesses or very rough estimates. We believe that any estimate along these lines needs to be adjusted using a "Bayesian prior"; that this adjustment can rarely be made (reasonably) using an explicit, formal calculation; and that most attempts to do the latter, even when they seem to be making very conservative downward adjustments to the expected value of an opportunity, are not making nearly large enough downward adjustments to be consistent with the proper Bayesian approach.
This view of ours illustrates why - while we seek to ground our recommendations in relevant facts, calculations and quantifications to the extent possible - every recommendation we make incorporates many different forms of evidence and involves a strong dose of intuition. And we generally prefer to give where we have strong evidence that donations can do a lot of good rather than where we have weak evidence that donations can do far more good - a preference that I believe is inconsistent with the approach of giving based on explicit expected-value formulas (at least those that (a) have significant room for error (b) do not incorporate Bayesian adjustments, which are very rare in these analyses and very difficult to do both formally and reasonably).
The rest of this post will:
- Lay out the "explicit expected value formula" approach to giving, which we oppose, and give examples.
- Give the intuitive objections we've long had to this approach, i.e., ways in which it seems intuitively problematic.
- Give a clean example of how a Bayesian adjustment can be done, and can be an improvement on the "explicit expected value formula" approach.
- Present a versatile formula for making and illustrating Bayesian adjustments that can be applied to charity cost-effectiveness estimates (a rough illustrative sketch of such an adjustment appears after this list).
- Show how a Bayesian adjustment avoids the Pascal's Mugging problem that those who rely on explicit expected value calculations seem prone to.
- Discuss how one can properly apply Bayesian adjustments in other cases, where less information is available.
- Conclude with the following takeaways:
- Any approach to decision-making that relies only on rough estimates of expected value - and does not incorporate preferences for better-grounded estimates over shakier estimates - is flawed.
- When aiming to maximize expected positive impact, it is not advisable to make giving decisions based fully on explicit formulas. Proper Bayesian adjustments are important and are usually overly difficult to formalize.
- The above point is a general defense of resisting arguments that both (a) seem intuitively problematic (b) have thin evidential support and/or room for significant error.
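To make the kind of adjustment described above concrete, here is a minimal sketch (not part of the original GiveWell post) assuming both the prior over cost-effectiveness and the error in the explicit estimate are normally distributed; the function name and example numbers are purely illustrative:

    import math

    # Minimal sketch of a normal-normal Bayesian adjustment (illustrative assumption:
    # both the prior over cost-effectiveness and the error of the explicit estimate
    # are normal). The posterior mean is then an inverse-variance-weighted average
    # of the prior mean and the estimate.
    def bayesian_adjustment(prior_mean, prior_var, estimate, estimate_var):
        posterior_var = 1.0 / (1.0 / prior_var + 1.0 / estimate_var)
        posterior_mean = posterior_var * (prior_mean / prior_var + estimate / estimate_var)
        return posterior_mean, posterior_var

    # Hypothetical numbers: an explicit formula claims 1000 units of good per dollar,
    # but the estimate is extremely rough (large variance), while the prior, based on
    # well-documented charities, centers on 5 units per dollar.
    post_mean, post_var = bayesian_adjustment(prior_mean=5.0, prior_var=4.0,
                                              estimate=1000.0, estimate_var=10000.0)
    print("Adjusted expectation: %.1f (sd %.1f)" % (post_mean, math.sqrt(post_var)))
    # Prints roughly 5.4: the huge but shaky estimate barely moves the posterior.

Note how the weight on the explicit estimate shrinks toward zero as its variance grows; this is the formal counterpart of the intuition that weakly grounded expected-value claims should be heavily discounted.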