Launched: Friendship is Optimal
Friendship is Optimal has launched and is being published in chunks on FIMFiction.
Friendship is Optimal is a story about an optimizer written to "satisfy human values through friendship and ponies." I would like to thank everyone on LessWrong who came out and helped edit it. Friendship is Optimal wouldn't be what it is today without your help.
Thank you.
Teaser description:
Hanna, the CEO of Hofvarpnir Studios, just won the contract to write the official My Little Pony MMO. Hanna has built an A.I. Princess Celestia and given her one basic drive: to satisfy everybody's values through friendship and ponies. And Princess Celestia will follow those instructions to the letter...even if you don't want her to.
Here is the schedule for the next chapters:
Friday (Nov. 16th): Chapter 4 - 5
Monday (Nov. 19th): Chapter 6 - 7
Thursday (Nov. 22nd): Chapter 8 - 9
Sunday (Nov. 25th): Chapter 10 - 11, Author's Afterword
Seduced by Imagination
Previously in series: Justified Expectation of Pleasant Surprises
"Vagueness" usually has a bad name in rationality—connoting skipped steps in reasoning and attempts to avoid falsification. But a rational view of the Future should be vague, because the information we have about the Future is weak. Yesterday I argued that justified vague hopes might also be better hedonically than specific foreknowledge—the power of pleasant surprises.
But there's also a more severe warning that I must deliver: It's not a good idea to dwell much on imagined pleasant futures, since you can't actually dwell in them. It can suck the emotional energy out of your actual, current, ongoing life.
Epistemically, we know the Past much more specifically than the Future. But also on emotional grounds, it's probably wiser to compare yourself to Earth's past, so you can see how far we've come, and how much better we're doing. Rather than comparing your life to an imagined future, and thinking about how awful you've got it Now.
Having set out to explain George Orwell's observation that no one can seem to write about a Utopia where anyone would want to live—having laid out the various Laws of Fun that I believe are being violated in these dreary Heavens—I am now explaining why you shouldn't apply this knowledge to invent an extremely seductive Utopia and write stories set there. That may suck out your soul like an emotional vacuum cleaner.
How can I reduce existential risk from AI?
Suppose you think that reducing the risk of human extinction is the highest-value thing you can do. Or maybe you want to reduce "x-risk" because you're already a comfortable First-Worlder like me and so you might as well do something epic and cool, or because you like the community of people who are doing it already, or whatever.
Suppose also that you think AI is the most pressing x-risk, because (1) mitigating AI risk could mitigate all other existential risks, but not vice-versa, and because (2) AI is plausibly the first existential risk that will occur.
In that case, what should you do? How can you reduce AI x-risk?
It's complicated, but I get this question a lot, so let me try to provide some kind of answer.
Meta-work, strategy work, and direct work
When you're facing a problem and you don't know what to do about it, there are two things you can do:
1. Meta-work: Amass wealth and other resources. Build your community. Make yourself stronger. Meta-work of this sort will be useful regardless of which "direct work" interventions turn out to be useful for tackling the problem you face. Meta-work also empowers you to do strategic work.
2. Strategy work: Purchase a better strategic understanding of the problem you're facing, so you can see more clearly what should be done. Usually, this will consist of getting smart and self-critical people to honestly assess the strategic situation, build models, make predictions about the effects of different possible interventions, and so on. If done well, these analyses can shed light on which kinds of "direct work" will help you deal with the problem you're trying to solve.
When you have enough strategic insight to have discovered some interventions that you're confident will help you tackle the problem you're facing, then you can also engage in:
3. Direct work: Directly attack the problem you're facing, whether this involves technical research, political action, particular kinds of technological development, or something else.
Thinking with these categories can be useful even though the lines between them are fuzzy. For example, you might have to do some basic awareness-raising in order to amass funds for your cause, and then once you've spent those funds on strategy work, your strategy work might tell you that a specific form of awareness-raising is useful for political action that counts as "direct work." Also, some forms of strategy work can feel like direct work, depending on the type of problem you're tackling.
Friendship is Optimal: A My Little Pony fanfic about an optimization process
[EDIT, Nov 14th: And it's posted. New discussion about release. Link to Friendship is Optimal.]
[EDIT, Nov 13th: I've submitted to FIMFiction, and will update with a link to its permanent home if it passes moderation. I have also removed the docs link and will make the document private once it goes live.]
Over the last year, I’ve spent a lot of my free time writing a semi-rationalist My Little Pony fanfic. Whenever I’ve mentioned this side project, I’ve received requests to alpha the story.
I present, as an open beta: Friendship is Optimal. Please do not spread that link outside of LessWrong; Google Docs is not its permanent home. I intend to put it up on fanfiction.net and submit it to Equestria Daily after incorporating any feedback. The story is complete, and I believe I've caught the majority of typographical and grammatical problems. (Though if you find some, comments are open on the doc itself.) Given the subject matter, I’m asking for the LessWrong community’s help in spotting any major logical flaws or other storytelling problems.
Cover jacket text:
Hanna, the CEO of Hofvarpnir Studios, just won the contract to write the official My Little Pony MMO. She had better hurry; a US military contractor is developing weapons based on her artificial intelligence technology, which just may destroy the world. Hanna has built an A.I. Princess Celestia and given her one basic drive: to satisfy values through friendship and ponies. What will Princess Celestia do when she’s let loose upon the world, following the drives Hanna has given her?
Special thanks to my roommate (who did extensive editing and was invaluable in noticing attempts by me to anthropomorphize an AI), and to Vaniver, who along with my roommate, convinced me to delete what was just a flat out bad chapter.
Checklist of Rationality Habits
The noncentral fallacy - the worst argument in the world?
Related to: Leaky Generalizations, Replace the Symbol With The Substance, Sneaking In Connotations
David Stove once ran a contest to find the Worst Argument In The World, but he awarded the prize to his own entry, and one that shored up his politics to boot. It hardly seems like an objective process.
If he can unilaterally declare a Worst Argument, then so can I. I declare the Worst Argument In The World to be this: "X is in a category whose archetypal member gives us a certain emotional reaction. Therefore, we should apply that emotional reaction to X, even though it is not a central category member."
Call it the Noncentral Fallacy. It sounds dumb when you put it like that. Who even does that, anyway?
It sounds dumb only because we are talking soberly of categories and features. As soon as the argument gets framed in terms of words, it becomes so powerful that somewhere between many and most of the bad arguments in politics, philosophy and culture take some form of the noncentral fallacy. Before we get to those, let's look at a simpler example.
Suppose someone wants to build a statue honoring Martin Luther King Jr. for his nonviolent resistance to racism. An opponent of the statue objects: "But Martin Luther King was a criminal!"
Any historian can confirm this is correct. A criminal is technically someone who breaks the law, and King knowingly broke a law against peaceful anti-segregation protest - hence his famous Letter from Birmingham Jail.
But in this case calling Martin Luther King a criminal is the noncentral. The archetypal criminal is a mugger or bank robber. He is driven only by greed, preys on the innocent, and weakens the fabric of society. Since we don't like these things, calling someone a "criminal" naturally lowers our opinion of them.
The opponent is saying "Because you don't like criminals, and Martin Luther King is a criminal, you should stop liking Martin Luther King." But King doesn't share the important criminal features of being driven by greed, preying on the innocent, or weakening the fabric of society that made us dislike criminals in the first place. Therefore, even though he is a criminal, there is no reason to dislike King.
This all seems so nice and logical when it's presented in this format. Unfortunately, it's also one hundred percent contrary to instinct: the urge is to respond "Martin Luther King? A criminal? No he wasn't! You take that back!" This is why the noncentral is so successful. As soon as you do that you've fallen into their trap. Your argument is no longer about whether you should build a statue, it's about whether King was a criminal. Since he was, you have now lost the argument.
Ideally, you should just be able to say "Well, King was the good kind of criminal." But that seems pretty tough as a debating maneuver, and it may be even harder in some of the cases where the Noncentral Fallacy is commonly used.
The Power of Reinforcement
Part of the sequence: The Science of Winning at Life
Also see: Basics of Animal Reinforcement, Basics of Human Reinforcement, Physical and Mental Behavior, Wanting vs. Liking Revisited, Approving reinforces low-effort behaviors, Applying Behavioral Psychology on Myself.
Story 1:
On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"
Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."
Story 2:
I once witnessed a worker who hated keeping a work log because it was only used "against" him. His supervisor would call to say "Why did you spend so much time on that?" or "Why isn't this done yet?" but never "I saw you handled X, great job!" Not surprisingly, he often "forgot" to fill out his work log.
Ever since I got everyone at the Singularity Institute to keep work logs, I've tried to avoid connections between "concerned" feedback and staff work logs, and instead take time to comment positively on things I see in those work logs.
Story 3:
Chatting with Eliezer, I said, "Eliezer, I get the sense that I've inadvertently caused you to be slightly averse to talking to me. Maybe because we disagree on so many things, or something?"
Eliezer's reply was: "No, it's much simpler. Our conversations usually run longer than our previously set deadline, so whenever I finish talking with you I feel drained and slightly cranky."
Now I finish our conversations on time.
Story 4:
A major Singularity Institute donor recently said to me: "By the way, I decided that every time I donate to the Singularity Institute, I'll set aside an additional 5% for myself to do fun things with, as a motivation to donate."
Attention control is critical for changing/increasing/altering motivation
I’ve just been reading Luke’s “Crash Course in the Neuroscience of Human Motivation.” It is a useful text, although there are a few technical errors and a few bits of outdated information (see [1], updated information about one particular quibble in [2] and [3]).
There is one significant missing piece, however, which is of critical importance for our subject matter here on LW: the effect of attention on plasticity, including the plasticity of motivation. Since I don’t see any other texts addressing it directly (certainly not from a neuroscientific perspective), let’s cover the main idea here.
Summary for impatient readers: focus of attention physically determines which synapses in your brain get stronger, and which areas of your cortex physically grow in size. The implications of this provide direct guidance for alteration of behaviors and motivational patterns. This mechanism is already used extensively for that purpose: for instance, many benefits of the Cognitive-Behavioral Therapy approach rely on it.
Fictional Bias
As rationalists, we are trained to maintain constant vigilance against common errors in our own thinking. Still, we must be especially careful of biases that are unusually common amongst our kind.
Consider the following scenario: Frodo Baggins is buying pants. Which of these is he most likely to buy:
(a) 32/30
(b) 48/32
(c) 30/20
Theory of Knowledge (rationality outreach)
Public schools (and arguably private schools as well; I wouldn't know) teach students what to think, not how to think.
On LessWrong, this insight is so trivial as not to bear repeating. Unfortunately, I think many people have adopted it as an immutable fact about the world that will be corrected post-Singularity, rather than a totally unacceptable state of affairs which we should be doing something about now. The consensus seems to be that a class teaching the basic principles of thinking would be a huge step towards raising the sanity waterline, but that it will never happen. Well, my school has one. It's called Theory of Knowledge, and it's offered at 2,307 schools worldwide as part of the IB Diploma Program.
The IB Diploma, for those of you who haven't heard of it, is an internationally recognized high school program. It requires students to pass tests in 6 subject areas, jump through a number of other hoops, and take an additional class called Theory of Knowledge.
For the record, I'm not convinced the IB Diploma Program is a good thing. It doesn't really solve any of the problems with public schools, it shares the frustrating focus on standardized testing and password-guessing instead of real learning, etc. But I think Theory of Knowledge is a huge opportunity to spread the ideas of rationality.
What kinds of people sign up for the IB Diploma? It is considered more rigorous than A-levels in Britain, and dramatically more rigorous than standard classes in the United States (I would consider it approximately equal to taking 5 or 6 AP classes a year). Most kids engaged in this program are intelligent, motivated and interested in the world around them. They seem (through my informal survey method of talking to lots of them) to have a higher click factor than average.
The problem is that currently, Theory of Knowledge is a waste of time. There isn't much in the way of standards for a curriculum, and in the entire last semester we covered less content than I learn from any given top-level LessWrong post. We debated the nature of truth for 4 months; most people do not come up with interesting answers to this on their own initiative, so the conversation went in circles around "There's no such thing as truth!" "Now, that's just stupid." the whole time. When I mention LessWrong to my friends, I generally explain it as "What ToK would be like, if ToK was actually good."
At my school, we regularly have speakers come in and discuss various topics during ToK, mostly because the regular instructor doesn't have any idea what to say. The only qualifications seem to be a pulse and some knowledge of English (we've had presenters who aren't fluent). If LessWrong posters wanted to call up the IB school nearest them and offer to present on rationality, I'm almost certain people would agree. This seems like a good opportunity to practice speaking/presenting in a low-stakes situation, and a great way to expose smart, motivated kids to rationality.
I think a good presentation would focus on the meaning of evidence, what we mean by "rationality", and making beliefs pay rent, all topics we've touched on without saying anything meaningful. We've also discussed Popper's falsificationism, and about half your audience will already be familiar with Bayes' theorem through statistics classes but not as a model of inductive reasoning in general.
If you'd be interested in this but don't know where to start in terms of preparing a presentation, Liron's presentation "You Are A Brain" seems like a good place to start. Designing a presentation along these lines might also be a good activity for a meetup group.