Running a Futurist Institute.

4 06 October 2017 05:05PM

Hello,

My name is Trent Fowler, and I'm an aspiring futurist. To date I have given talks on two continents on machine ethics, AI takeoff dynamics, secular spirituality, existential risk, the future of governance, and technical rationality. I have written on introspection, the interface between language and cognition, the evolution of intellectual frameworks, and myriad other topics. In 2016 I began 'The STEMpunk Project', an endeavor to learn as much about computing, electronics, mechanics, and AI as possible, which culminated in a book published earlier this year.

Elon Musk is my spirit animal.

I am planning to found a futurist institute in Boulder, CO. I actually left my cushy job in East Asia to help make the future a habitable place.

Is there someone I could talk to about how to do this? Should I incorporate as a 501(c)(3) or an LLC? What are the best ways of monetizing such an endeavor? How can I build an audience (meetup attendance has been anemic at best; what can I do about that)? And so on.

Best,

-Trent

[Link] You Too Can See Suffering

3 03 October 2017 07:46PM

Open thread, October 2 - October 8, 2017

1 03 October 2017 10:46AM
If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

[Link] [Slashdot] We're Not Living in a Computer Simulation, New Research Shows

1 03 October 2017 10:10AM

Rational Feed: Last Week's Community Articles and Some Recommended Posts

2 02 October 2017 01:49PM

===Highly Recommended Articles:

Slack by Zvi Mowshowitz - You need slack in your life. Slack lets you explore and invest. If you don't have slack you can't relax or uphold your morals. Fight hard to maintain your slack and don't let people or things take it away. Maya Millennial's lack of slack.

Personal Thoughts On Careers In Ai Policy And by carrickflynn (EA forum) - 3600 words. AI strategy is bottlenecked by hard research problems. Hence most people will find it hard to contribute effectively, even if they are very talented. Solving these problems has extremely high value. We should prepare to mobilize more talent once the blocking issues are solved. Operations work is still in high demand.

End Factory Farming by 80,000 Hours - Three hour podcast. How young people can set themselves up to contribute to scientific research into meat alternatives. Genetic manipulation of chickens. Skepticism of vegan advocacy. Grants to China, India and South America. Insect farming. Pessimism about legal or electoral solutions. Which species to focus on. Fish and crustacean consciousness.

===Scott:

Against Individual Iq Worries by Scott Alexander - "IQ is very useful and powerful for research purposes. It’s not nearly as interesting for you personally." IQ measurement problems. Even accurately measured IQ isn't that predictive.

Links: Hurly Burly by Scott Alexander - SSC links post. Copyright, genetic engineering, autism, machine learning, Putin's fears of AI risk, the Lesswrong relaunch and more.

===Rationalist:

Dojo Bad Day Contingency Plan by Elo - Eliezer's discussion of why rationality theory isn't enough; you need to practice. An exercise about improving your mental state on bad days.

Also Against Individual IQ Worries by Scott Aaronson - IQ tests tend to ask unclear questions and require you to reverse engineer what the test maker meant. Scott's own IQ was once measured at 106.

Predictive Processing by Entirely Useless - Responses to quotes from Surfing Uncertainty and Scott's review. A large focus is the "darkened room" problem.

Prosocial Manipulation by Katja Grace - Being calculating and guarded in communication is commonly considered manipulative and selfish. But many people's goals are pro-social, why do we assume manipulation is anti-social?

Humans As Leaky Systems by mindlevelup - "Fairly obvious stuff that probably lots of people are thinking about, but now put into simpler words (maybe). Basically, the idea that humans are affected by both ideas and the environment, and this is an important consideration in several models."

Dealism Futarchy And Hypocrisy by Robin Hanson - Policy conversations don't have to be about morality or terminal values. We can instead use tools like economics as a way to help people get whatever it is they want. We can push closer to the Pareto optimal frontier.

Debunking Iq Denial Ism by Grey Enlightenment - Criticisms of Scott's article on individual IQ. People can change their socioeconomic status but not their IQ, IQ is more predictive than socioeconomic status, Feynman, job titles are non-specific, low-IQ 'computer' professions might be doing data entry. EQ isn't intrinsic and doesn't compete with IQ.

Harnessing Polarization by Robin Hanson - Capitalism channels status competition into productive enterprise. How can we similarly channel partisanship? Contests? Decision Markets?

Common Sense Eats Common Talk by Stefano Zorzi (ribbonfarm) - Missing the housing bubble. Falling for conformity. Seeing through invisible clothes. Advice: Test macro assumptions, beware of jargon, assume propositions that contradict common sense are wrong. Common talk and common sense and their failings.

Sabbath Hard And Go Home by Ben Hoffman - The Sabbath as easymode leisure. Unplugging while camping or on a meditation retreat feels natural. What is leisure? If you are unable to keep a Sabbath, things are not ok: there isn't enough slack in the system.

Cognitive Empathy And Emotional Labor by Gordon (Map and Territory) - Affective empathy contrasted with Cognitive empathy. Cognitive empathy enables real emotional labor.

City Travel Scaling by Robin Hanson - Review of Geoffrey West's 'Scale'. Most visits to a location are from infrequent visitors who live nearby. Fractal piping systems have an overhead that only grows logarithmically with the size of the city. Evolution never found such efficient heating/cooling systems.

Travel Journal Hawaii by Jacob Falkovich - The Hawaiian language only has 40 syllables. Sales tax. Circadian Rhythm. Colonialism. The Hawaiian caste system. The best meal in the world. Don't quit your job to sell lemonade. Minimum wage ruined the pineapple industry.

Why I Quit Social Media by Sarah Constantin - Becoming stronger and less emotional since we live in a finite world with constrained resources. Social media: "It distances you from reality, makes you focus on a shadow-world of opinions about opinions about opinions; it makes you more impulsive and emotionally unstable; it incentivizes derailing conversations to fish for ego-strokes."

===AI:

An Outside View Of Ai Control by Robin Hanson - Non-singularity scenarios where software performs almost all jobs. Software usually reflects the social organization of those who made it. Entrenched designs and systems. Don't work on the control problem until it's time. Human control and AI control. Most AI failures in this scenario will cause limited damage and can be handled after they occur.

Nonlinear Computation In Linear Networks by OpenAI - Floating point arithmetic is fundamentally non-linear near the limit of machine precision. OpenAI managed to exploit these non-linear effects with an evolutionary algorithm to achieve much better performance than a normal deep network on MNIST.

September 2017 Newsletter by The MIRI Blog - New MIRI paper on incorrigibility and shutting off AI. Best posts from the intelligent agents forum. Links to videos and podcasts. MIRI personnel updates and career opportunities in AI safety.

NBER Conference Artificial Intelligence by Marginal Revolution - Links to the program and videos. Tyler was there to comment on Korinek and Stiglitz.

===EA:

What Happens To Cows In The Us by Eukaryote - "There are 92,000,000 cattle in the USA. Where do they come from, what are they used for, and what are their ultimate fates?"

Interim Update On Givewells Money Moved And Web Traffic In 2016 by The GiveWell Blog - Summary of influence, total money moved, money moved by charity.

Guardedness In Ea by Jeff Kaufman - As people and organizations gain prestige their communication becomes less open and more careful. Jeff has seen this happen in the EA community and dislikes the effects. However Jeff doesn't see a great alternative.

Trial Postponed by GiveDirectly - GiveDirectly Kenya trial postponed due to political events.

===Politics and Economics:

Why White Identity Doesn't Work by Grey Enlightenment - Who counts as white. Race is secondary to ancestry and culture. No unifying cause or struggle. Whites may be biologically individualist. Too much infighting.

Comment on Oppressed Groups and Slack by Benquo - People who are oppressed often lack the slack to maintain their morals. Seven Samurai. This has the troubling implication that, while we should listen to the oppressed, the relatively privileged should maintain leadership. However, it also implies that oppressed groups' behavior will improve after enough time without a boot on their neck.

The OpenPhil Report On Incarceration by The Unit of Caring - "Our prison system isn’t just not-rehabilitative; it is anti-rehabilitative. It traumatizes and retraumatizes people and severs their connections to people and opportunities within the law and abuses them and breaks social trust and produces crime which is then used to justify longer prison sentences which produce more crime."

On The Fetishization Of Money In Galts Gulch by Ben Hoffman - Dagny Taggart and Galt feel they can't ethically become lovers until they rectify a power imbalance. Dagny solves this problem by becoming Galt's house-maker and cook. Most people's intuition is that employment creates a power imbalance; it doesn't solve one. What is going on?

Seasteading 2 by Bayesian Investor - "The book’s style is too much like a newspaper. Rather than focus on the main advantages of seasteading, it focuses on the concerns of the average person, and on how seasteading might affect them. It quotes interesting people extensively, while being vague about whether the authors are just reporting that those people have ideas, or whether the authors have checked that the ideas are correct. Many of the ideas seem rather fishy."

What Is Going On With The Alt Right by Grey Enlightenment - Reasons the alt-right is falling apart: Trump backpedaling or softening on campaign promises, the civil war between the alt-lite, alt-medium, and alt-right, slow news cycle and brevity of ideas, botched rallies and poor branding, the alt-right losing its official Reddit sub, the right is more intellectually diverse than the left.

Milgram Replicates by Bryan Caplan - Milgram's shock study replicated well in 2009. Since 79% of people who pushed past the subject's first verbal protest went to the end of the range, the replication stopped earlier than Milgram.

===Misc:

Summary Of Reading July September 2017 by Eli Bendersky - Book reviews: Stats, genetics, Winnie the Pooh, Zen and other topics.

===Podcast:

Creating Trump by The Ezra Klein Show - "How the Republican Party created Trump, how Trump won, and what comes next. As Dionne says in this interview, the American system was "not supposed to produce a president like this,” and so a lot of our conversation is about how the guardrails failed and whether they can be rebuilt."

Rs 194 Robert Wright On Why Buddhism Is True by Rationally Speaking - "Why Buddhism was right about human nature: its diagnosis that our suffering is mainly due to a failure to see reality clearly, and its prescription that meditation can help us see more clearly. Robert and Julia discuss whether it's suspicious that a religion turned out to be "right" about human nature, what it means for emotions to be true or false, and whether there are downsides to enlightenment."

Robert Wright by EconTalk - "The psychotherapeutic insights of Buddhism and the benefits of meditation and mindfulness. Wright argues our evolutionary past has endowed us with a mind that can be ill-suited to the stress of the present. He argues that meditation and the non-religious aspects of Buddhism can reduce suffering and are consistent with recent psychological research."

Burning Man by The Bayesian Conspiracy - How much does Burning Man live up to its principles, changes over time, finding out you aren't gay in your twenties, marriage. Burning Man advice: Go with a camp you like, don't have plans, just wander around and get involved in what's interesting.

The Fate Of Liberalism by Waking Up with Sam Harris - "Mark Lilla about the fate of political liberalism in the United States, the emergence of a new identity politics, the role of class in American society"

1 02 October 2017 07:43AM

The following is an exercise I composed to be run at the Lesswrong Sydney dojos.  It took an hour and a half but could probably be done faster with some adaptations that I have included in these instructions. As for what the dojos are:

I quote Eliezer in the preface of Rationality: From AI to Zombies when he says:

It was a mistake that I didn’t write my two years of blog posts with the intention of helping people do better in their everyday lives. I wrote it with the intention of helping people solve big, difficult, important problems, and I chose impressive-sounding, abstract problems as my examples. In retrospect, this was the second-largest mistake in my approach.
It ties in to the first-largest mistake in my writing, which was that I didn’t realise that the big problem in learning this valuable way of thinking was figuring out how to practice it, not knowing the theory. I didn’t realise that part was the priority; and regarding this I can only say “Oops” and “Duh.” Yes, sometimes those big issues really are big and really are important; but that doesn’t change the basic truth that to master skills you need to practice them and it’s harder to practice on things that are further away.

Lesswrong is a global movement of rationality.  With that in mind, the Dojos are our attempt in Sydney to work on the actual practical stuff: working on personal problems and the literal implementation of the plans after they undergo first contact with the enemy. You can join us through our meetup group, Facebook group and as advertised on Lesswrong.

Below are the instructions for the Dojo.  I can't emphasise enough the process of actually doing and not just reading.

If you intend to participate, grab some paper or a blank document and stop for a few minutes to make the lists.  Then check your answers against ours. If you don't do the exercise, don't fool yourself into thinking you have this skill under your belt.  Just accept that you didn't really "learn" this one.  You kinda said, "That's great, I wish I could find the time to get healthy" or "If only I was the type of person who did things."  If this is especially difficult for you, that's okay.  It is difficult for all of us.  I believe in you!

Good luck.

Everyone has bad days.  Each of us will have various experiences dealing with different causes and/or diagnosing, solving and resolving the causes of "bad days".

With that in mind I want to do a few sets of discussions on factors of a bad day.

Part 1: Set a timer for 3 minutes - Make a list of things bad for state of mind, or things you have noticed cause trouble for you.  {as a group each person shares one} Review the hints list as a group:

• routine meds/supplements (supposed to take)
• have you taken something to cause a bad state? (things you should not take)
• sleep
• exercise
• shower
• Sunlight (independent of bright light)
• talk to a human in the last X hours
• talk to too many humans in the last X hours
• Fresh air
• Did I eat in the last X hours
• drink in the last X hours
• Am I in pain?  Physical or emotional
• Physical discomfort, weather, loud noise, bright lights, bad smells
• Feel unsafe in my surroundings?
• Do I know why I'm in a bad mood, or not feeling well emotionally?  (remember do not dismiss or judge any answer)
• When did you last do something fun?
• Spend 5 minutes making a list of all the little things that are bothering you (try not to solve them now, just make the list) (and if necessary make plans for the ones you can affect).
• Also possibly distinguish between "why am I feeling bad" and "what can I do to feel less bad/even though I feel bad" (e.g. if you're stressed about upcoming event or fight you had last night, you might not be able to act on it but you can still do things now that will improve your state or at least get you being productive)

at the bottom of the page:{our bonus list of bad things generated in the dojo}

{As a group - were there any big ones we missed and discussion about what we came up with}

Part 2: {set a timer 3 minutes} Come up with a list of things that are good for your mental state

{Group discussion - each share one}

{optional hints list} http://happierhuman.com/how-to-be-happy/ {feel free to go through it as a group or glance at it or skip it}

{bonus good stuff list at the bottom}

{as a group discussion - did we miss any big ones?}

Part 3: Possibly ambiguous factors

Now that we have a list of good and a list of bad, we should build a list of possibly ambiguous factors that you can look out for.  For example the weather, allergies, unexpected events - i.e. a death or car accident. Set a timer 3 minutes - ambiguous factors {as a group - each name one}

{Any big ones we missed} (discussion)

{bonus ambiguous list at the bottom}

Part 4: The important parts

Now I want you to go through the list and come up with the top 5-10 (or as many as matters) most relevant ones.  From here on in it's your list, no more sharing so it doesn't matter to anyone else what's on it.

{Timer 2 minutes}

Part 5: plan for where to keep the list so it's most accessible - so that on a bad day you can access the list and make use of it. Could be in an email draft, could be on your phone, could be a note somewhere at home or in a notebook.

Timer 2 minutes - come up with where you will be keeping the list that makes it most useful to you.

{discussions of plans - including double checking of each other's plans to make sure they seem like they are likely to work}

{assistance if anyone is stuck}

Some ideas:

• notes app in phone
• bedroom door poster
• repeat and memorize
• "noticing" and asking why, rumination.

{end of exercise and break time}

{bonus list of bad things}

• supplements
• private time
• sun
• exercise
• stress (and too much responsibility)
• sleep
• alcohol
• my mother (stress)
• weather (cold)
• body temperature
• pain
• interpersonal rejection (and the complexities of these)
• when my wife is unhappy
• overeating
• missing out on fun things
• losing control of my schedule
• not having a schedule
• overthinking past failure
• avoiding things I should do
• accusations/misunderstandings
• not sticking to good habits
• being confrontational
• need social time
• obligation
• fixating on bullshit
• getting short with people
• too much coffee
• not continuing communication (not knowing what to say)
• junk food
• not being "myself" enough
• breaking good routines
• cold showers in the morning are bad
• being unproductive at work
• something on the mind

{bonus list of good things}

• weather
• exercise/swimming, dancing
• sex
• big meals
• supplements
• sorting my spreadsheets -> feeling on top of my tasks -> congruence of purpose
• when things work smoothly
• creating things -> feedback on completion
• fasting
• perfect weather
• shower + bath
• go for a walk
• listen to nice music
• good plan & following it
• petting a cat
• weightlifting
• girlfriend
• playing instrument
• feeling connected with someone
• veg-out in bed
• good podcast
• dancing around the house
• good book/knowledge
• meditating
• a balanced day - a bit of everything "good day"
• napping
• solving a problem
• learning knowledge/skill
• new experiences + with other people
• lack of responsibility and commitment -> option of impulsivity
• nature experience (sunsets, cool breeze)
• discovering nuance
• progress feedback
• humour
• hypnotised to be relaxed
• 3 weeks sticking to diet and exercise
• new idea - epiphany feeling
• winning debate/scoring a soccer goal
• productive procrastination
• consider past accomplishment
• knowing/realising -> feeling the realisation
• when other people are really organised
• making someone smile
• massage giving and receiving
• hugs
• deep breathing
• looking at clouds
• playing with patterns
• making others happy
• good TV/movie
• getting paid
• balance social/alone time
• flow
• letting go/deciding not to care
• text chat
• lying on the floor sleep

{bonus ambiguous list}

• some foods
• water
• sleep (short can feel good endorphins)
• chemical smells (burning plastic, drying paint)
• coffee buzz
• conversations
• helping people
• humans
• finding information (sometimes a let down)
• balance discipline/freedom
• seeing family
• junk TV/movies
• junk food
• menial chores
• fidgeting
• paid work
• partner time
• coding binge
• being alone
• exercise
• reading documentation (sometimes good, sometimes terrible)
• being needed/wanted
• enthusiasm -> burnout
• masturbation
• alcohol
• sticking to timetable
• performing below standard
• sex
• learning new stuff
• clubs
• brain fog
• breaking the illusions of reality

Meta: this took an hour to write up and a few hours to generate the exercise.

Feedback on LW 2.0

11 01 October 2017 03:18PM

What are your first impressions of the public beta?

2 01 October 2017 02:08AM

This is the monthly thread for posting media of various types that you've found that you enjoy. Post what you're reading, listening to, watching, and your opinion of it. Post recommendations to blogs. Post whatever media you feel like discussing! To see previous recommendations, check out the older threads.

Rules:

• Please avoid downvoting recommendations just because you don't personally like the recommended material; remember that liking is a two-place word. If you can point out a specific flaw in a person's recommendation, consider posting a comment to that effect.
• If you want to post something that (you know) has been recommended before, but have another recommendation to add, please link to the original, so that the reader has both recommendations.
• Use the "Other Media" thread if you believe the piece of media you want to discuss doesn't fit under any of the established categories.
• Use the "Meta" thread if you want to discuss about the monthly media thread itself (e.g. to propose adding/removing/splitting/merging subthreads, or to discuss the type of content properly belonging to each subthread) or for any other question or issue you may have about the thread or the rules.

[Link] Work and income in the next era

0 30 September 2017 10:02PM

logic puzzles and loophole abuse

2 30 September 2017 03:45PM

I recently read about the hardest logic puzzle ever on Wikipedia and noticed that someone published a paper in which they solved the problem by asking only two questions instead of three. This relied on abusing the loophole that boolean formulas can result in a paradox.

This got me thinking in what other ways the puzzle could be abused even further, and I managed to find a way to turn the problem into a hack to achieve omnipotence by enslaving gods (see below).

I find this quite amusing, and I would like to know if you know of any other examples where popular logic puzzles can be broken in amusing ways. I'm looking for any outside-the-box solutions that give much better results than expected. another example.

Here is my solution to the "hardest logic puzzle ever":

This solution is based on the following assumption: The gods are quite capable of responding to a question with actions besides saying 'da' and 'ja', but simply have no reason to do so. As stated in the problem description, the beings in question are gods and they have a language of their own. They could hardly be called gods, nor have need for a spoken language, if they weren't capable of affecting reality.

At a bare minimum, they should be capable of pronouncing the words 'da' and 'ja' in multiple different ways, or of delaying their answer by a fixed amount of time after the question is asked. Either possibility would extend the information content of an answer from a single bit of information to arbitrarily many bits, depending on how well you can differentiate different intonations of 'da' and 'ja', and how long you are willing to wait for an answer.
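As a rough illustration of the "arbitrarily many bits" claim, here is a minimal sketch of how the information capacity of one answer grows with the number of outcomes you can tell apart. The function name and parameters are hypothetical, and it assumes intonations and delay lengths can be distinguished independently of each other, so the result is an upper bound rather than a guarantee.

```python
import math


def answer_bits(words=2, intonations=1, delay_slots=1):
    """Upper bound on the bits carried by one answer.

    words:       number of distinct words the god may say ('da' and 'ja' -> 2)
    intonations: number of intonations you can reliably tell apart (assumed)
    delay_slots: number of answer delays you can reliably tell apart (assumed)
    """
    if min(words, intonations, delay_slots) < 1:
        raise ValueError("each count must be at least 1")
    # Each distinguishable (word, intonation, delay) combination is one outcome.
    outcomes = words * intonations * delay_slots
    return math.log2(outcomes)


# The plain puzzle: two words, one intonation, no usable delay.
print(answer_bits())
# Four intonations and eight distinguishable delays per answer.
print(answer_bits(intonations=4, delay_slots=8))
```

The plain puzzle gives one bit per answer; every doubling of the number of distinguishable intonations or delays adds one more bit.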

We can construct a question that will result in a paradox unless a god performs a certain action. In this way, we can effectively enslave the god and cause it to perform arbitrary actions on our behalf, as performing those actions is the only way to answer the question. The actual answer to the question becomes effectively irrelevant.

To do this, we approach any of the three gods and ask them the question OBEY, which is defined as follows:

OBEY = if WISH_WRAPPER then True else PARADOX

WISH_WRAPPER = "after hearing and understanding OBEY, you act in such a way that your actions maximally satisfy the intended meaning behind WISH. Where physical, mental or other kinds of constraints prevent you from doing so, you strive to do so to the best of your abilities instead."

WISH = "you determine the Coherent Extrapolated Volition of humanity and act to maximize it."

You can substitute any other wish you would like to see granted for WISH. However, one should be very careful while doing so, as beings of pure logic are likely to interpret vague actions differently from how a human would interpret them. In particular, one should avoid accidentally making WISH impossible to fulfill, as that would cause the god's head to explode, ruining your wish.

The above formulation tries to take some of these concerns into account. If you encounter this thought experiment in real life, you are advised to consult a lawyer, a friendly-AI researcher, and possibly a priest, before stating the question.

Since you can ask three questions, you can enslave all three gods. Boolos' formulation states about the random god that "if the coin comes down heads, he speaks truly; if tails, falsely". This formulation implies that the god does try to determine the truth before deciding how to answer. This means that the wish-granting question also works for the random god.

If the capabilities of the gods are uncertain, it may help to establish clearer goals as well as fall-back goals. For instance, to handle the case that the gods are in fact limited to speaking only 'da' and 'ja', it may help to append the following to WISH: "If you are unable to perform actions in response to OBEY besides answering 'da' or 'ja', you wait for the time period outlined in TIME before making your answer." You can now encode arbitrary additional information in TIME, with the caveat that you will have to actually wait before getting a response. How accurately you can measure the elapsed time between question and answer determines how much information you can put into TIME without risking starvation before the question is answered. The following is a simple example of TIME that would allow you to solve the original problem formulation by asking OBEY just once of any of the gods:

TIME = "If god A speaks the truth, B lies and C is random, you wait for 1 minute before answering. If god A speaks the truth, C lies and B is random, you wait for 2 minutes before answering. If god B speaks the truth, A lies and C is random, you wait for 3 minutes before answering. If god B speaks the truth, C lies and A is random, you wait for 4 minutes before answering. If god C speaks the truth, A lies and B is random, you wait for 5 minutes before answering. If god C speaks the truth, B lies and A is random, you wait for 6 minutes before answering."
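The TIME encoding above is a lookup table from wait time to one of the six possible role assignments. A minimal sketch of the decoding side, assuming you can time the answer to the nearest minute; the function names and the dictionary layout are hypothetical and only serve the illustration.

```python
GODS = ("A", "B", "C")


def build_time_table():
    """Map each wait time in minutes to a role assignment, in the order TIME lists them."""
    table = {}
    minute = 1
    for truth in GODS:
        for liar in GODS:
            if liar == truth:
                continue
            # The remaining god is the one who answers randomly.
            random_god = next(g for g in GODS if g not in (truth, liar))
            table[minute] = {"truth": truth, "liar": liar, "random": random_god}
            minute += 1
    return table


def decode_wait(minutes_waited):
    """Return the role assignment implied by the measured wait, rounded to whole minutes."""
    table = build_time_table()
    nearest = round(minutes_waited)
    if nearest not in table:
        raise ValueError(f"wait of {minutes_waited} minutes is not a valid TIME code")
    return table[nearest]


# A measured wait of roughly four minutes means B is truthful, C lies, A is random.
print(decode_wait(4.1))
```

With three gods there are only 3! = 6 assignments, so six distinguishable wait times are enough to identify all of them from a single question.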

Event: Effective Altruism Global X Berlin 2017

3 30 September 2017 07:33AM

This year's EAGxBerlin takes place on the 14th and 15th of October at the Berlin Institute of Technology and is organized by the Effective Altruism Foundation. The conference will convene roughly 300 people – academics, professionals, and students alike – to explore the most effective and evidence-based ways to improve the world, based on the philosophy and global movement of effective altruism.

Personal thoughts on careers in AI policy and strategy [x-post EA Forum]

3 27 September 2017 05:09PM

Summary:

1. The AI strategy space is currently bottlenecked by entangled and under-defined research questions that are extremely difficult to resolve, as well as by a lack of current institutional capacity to absorb and utilize new researchers effectively.

2. Accordingly, there is very strong demand for people who are good at this type of “disentanglement” research and well-suited to conduct it somewhat independently. There is also demand for some specific types of expertise which can help advance AI strategy and policy. Advancing this research even a little bit can have massive multiplicative effects by opening up large areas of work for many more researchers and implementers to pursue.

3. Until the AI strategy research bottleneck clears, many areas of concrete policy research and policy implementation are necessarily on hold. Accordingly, a large majority of people interested in this cause area, even extremely talented people, will find it difficult to contribute directly, at least in the near term.

4. If you are in this group whose talents and expertise are outside of these narrow areas, and want to contribute to AI strategy, I recommend you build up your capacity and try to put yourself in an influential position. This will set you up well to guide high-value policy interventions as clearer policy directions emerge. Try not to be discouraged or dissuaded from pursuing this area by the current low capacity to directly utilize your talent! The level of talent across a huge breadth of important areas I have seen from the EA community in my role at FHI is astounding and humbling.

5. Depending on how slow these “entangled” research questions are to unjam, and on the timelines of AI development, there might be a very narrow window of time in which it will be necessary to have a massive, sophisticated mobilization of altruistic talent. This makes being prepared to mobilize effectively and take impactful action on short notice extremely valuable in expectation.

6. In addition to strategy research, operations work in this space is currently highly in demand. Experienced managers and administrators are especially needed. More junior operations roles might also serve as a good orientation period for EAs who would like to take some time after college before either pursuing graduate school or a specific career in this space. This can be a great way to tool up while we as a community develop insight on strategic and policy direction. Additionally, successful recruitment in this area should help with our institutional capacity issues substantially.

(3600 words. Reading time: approximately 15 minutes with endnotes.)

(Also posted to Effective Altruism Forum here.)

Introduction

Intended audience: This post is aimed at EAs and other altruistic types who are already interested in working in AI strategy and AI policy because of its potential large scale effect on the future.[1]

Epistemic status: The below represents my current best guess at how to make good use of human resources given current constraints. I might be wrong, and I would not be surprised if my views changed with time. That said, my recommendations are designed to be robustly useful across most probable scenarios. These are my personal thoughts, and do not necessarily represent the views of anyone else in the community or at the Future of Humanity Institute.[2] (For some areas where reviewers disagreed, I have added endnotes explaining the disagreement.) This post is not me acting in any official role; this is just me as an EA community member who really cares about this cause area trying to contribute my best guess for how to think about and cultivate this space.

Why my thoughts might be useful: I have been the primary recruitment person at the Future of Humanity Institute (FHI) for over a year, and am currently the project manager for FHI’s AI strategy programme. Again, I am not writing this in either of these capacities, but being in these positions has given me a chance to see just how talented the community is, to spend a lot of time thinking about how to best utilize this talent, and has provided me some amazing opportunities to talk with others about both of these things.

Definitions

There are lots of ways to slice this space, depending on what exactly you are trying to see, or what point you are trying to make. The terms and definitions I am using are a bit tentative and not necessarily standard, so feel free to discard them after reading this. (These are also not all of the relevant types or areas of research or work, but the subset I want to focus on for this piece.)[3]

1. AI strategy research:[4] the study of how humanity can best navigate the transition to a world with advanced AI systems (especially transformative AI), including political, economic, military, governance, and ethical dimensions.

2. AI policy implementation is carrying out the activities necessary to safely navigate the transition to advanced AI systems. This includes an enormous amount of work that will need to be done in government, the political sphere, private companies, and NGOs in the areas of communications, fund allocation, lobbying, politics, and everything else that is normally done to advance policy objectives.

3. Operations (in support of AI strategy and implementation) is building, managing, growing, and sustaining all of the institutions and institutional capacity for the organizations advancing AI strategy research and AI policy implementation. This is frequently overlooked, badly neglected, and extremely important and impactful work.

4. Disentanglement research:[5] This is a squishy made-up term I am using only for this post that is sort of trying to gesture at a type of research that involves disentangling ideas and questions in a “pre-paradigmatic” area where the core concepts, questions, and methodologies are under-defined. In my mind, I sort of picture this as somewhat like trying to untangle knots in what looks like an enormous ball of fuzz. (Nick Bostrom is a fantastic example of someone who is excellent at this type of research.)

To quickly clarify, as I mean to use the terms, AI strategy research is an area or field of research, a bit like quantum mechanics or welfare economics. Disentanglement research I mean more as a type of research, a bit like quantitative research or conceptual analysis; it is defined more by the character of the questions researched and the methods used to advance toward clarity. Disentanglement is meant to be field agnostic. The relationship between the two is that, in my opinion, AI strategy research is an area that, at its current early stage, demands a lot of disentanglement-type research to advance.

The current bottlenecks in the space (as I see them)

Disentanglement research is needed to advance AI strategy research, and is extremely difficult

Figuring out a good strategy for approaching the development and deployment of advanced AI requires addressing enormous, entangled, under-defined questions, which exist well outside of most existing research paradigms. (This is not all it requires, but it is a central part of it at its current stage of development.)[6] This category includes the study of multi-polar versus unipolar outcomes, technical development trajectories, governance design for advanced AI, international trust and cooperation in the development of transformative capabilities, info/attention/reputation hazards in AI-related research, the dynamics of arms races and how they can be mitigated, geopolitical stabilization and great power war mitigation, research openness, structuring safe R&D dynamics, and many more topics.[7] It also requires identifying other large, entangled questions such as these to ensure no crucial considerations in this space are neglected.

From my personal experience trying and failing to do good disentanglement research and watching as some much smarter and more capable people have tried and struggled as well, I have come to think of it as a particular skill or aptitude that does not necessarily correlate strongly with other talents or expertise. A bit like mechanical, mathematical, or language aptitude. I have no idea what makes people good at this, or how exactly they do it, but it is pretty easy to identify if it has been done well once the person is finished. (I can appreciate the quality of Nick Bostrom’s work, like I can appreciate a great novel, but how they are created I don’t really understand and can’t myself replicate.) It also seems to be both quite rare and very difficult to identify in advance who will be good at this sort of work, with the only good indicator, as far as I can tell, being past history of succeeding in this type of research. The result is that it is really hard to recruit for, there are very few people doing it full time in the AI strategy space, and this number is far, far fewer than optimal.

The main importance of disentanglement research, as I imagine it, is that it makes questions and research directions clearer and more tractable for other types of research. As Nick Bostrom and others have sketched out the considerations surrounding the development of advanced AI through “disentanglement”, tractable research questions have arisen. I strongly believe that as more progress is made on topics requiring disentanglement in the AI strategy field, more tractable research questions will arise. As these more tractable questions become clear, and as they are studied, strategic direction and concrete policy recommendations should follow. I believe this will then open up the floodgates for AI policy implementation work.

Domain experts with specific skills and knowledge are also needed

While I think that our biggest need right now is disentanglement research, there are also certain other skills and knowledge sets that would be especially helpful for advancing AI strategy research. This includes expertise in:

1. Mandarin and/or Chinese politics and/or the Chinese ML community.

2. International relations, especially in the areas of international cooperation, international law, global public goods, constitution and institutional design, history and politics of transformative technologies, governance, and grand strategy.

3. Knowledge and experience working at a high level in policy, international governance and diplomacy, and defense circles.

4. Technology and other types of forecasting.

5. Quantitative social science, such as economics or analysis of survey data.

6. Law and/or Policy.

I expect these skills and knowledge sets to help provide valuable insight on strategic questions including governance design, diplomatic coordination and cooperation, arms race dynamics, technical timelines and capabilities, and many more areas.

Until AI strategy advances, AI policy implementation is mostly stalled

There is a wide consensus in the community, with which I agree, that aside from a few robust recommendations,[8] it is important not to act or propose concrete policy in this space prematurely. We simply have too much uncertainty about the correct strategic direction. Do we want tighter or looser IP law for ML? Do we want a national AI lab? Should the government increase research funding in AI? How should we regulate lethal autonomous weapons systems? Should there be strict liability for AI accidents? It remains unclear what good recommendations would be. Path dependencies develop quickly in many areas once a direction is initially started down. It is difficult to pass a law that is the exact opposite of a previous law recently lobbied for and passed. It is much easier to start an arms race than to stop it. With most current AI policy questions, the correct approach, I believe, is not to use heuristics of unclear applicability to choose positions, even if those heuristics have served well in other contexts,[9] but to wait until the overall strategic picture is clear, and then to push forward with whatever advances the best outcome.

The AI strategy and policy space, and EA in general, is also currently bottlenecked by institutional and operational capacity

This is not as big an immediate problem as the AI strategy bottleneck, but it is an issue, and one that exacerbates the research bottleneck as well.[10]  FHI alone will need to fill 4 separate operations roles at senior and junior levels in the next few months. Other organizations in this space have similar shortages. These shortages also compound the research bottleneck as they make it difficult to build effective, dynamic AI strategy research groups. The lack of institutional capacity also might become a future hindrance to the massive, rapid, “AI policy implementation” mobilization which is likely to be needed.

Next actions

First, I want to make clear that if you want to work in this space, you are wanted in this space. There is a tremendous amount of need here. That said, as I currently see it, because of the low tractability of disentanglement research, institutional constraints, and the effect of both of these things on the progress of AI strategy research, a large majority of people who are very needed in this area, even extremely talented people, will not be able to directly contribute immediately. (This is not a good position we are currently in, as I think we are underutilizing our human resources, but hopefully we can fix this quickly.)

This is why I am hoping that we can build up a large community of people with a broader set of skills, and especially policy implementation skills, who are in positions of influence from which they can mobilize quickly and effectively and take important action once the bottleneck clears and direction comes into focus.

Actions you can take right now

Potential near term roles in AI Strategy

FHI is recruiting, but somewhat capacity limited, and trying to triage for advancing strategy as quickly as possible.

If you have good reason to think you would be good at disentanglement research on AI strategy (likely meaning a record of success with this type of research) or have expertise in the areas listed as especially in demand, please get in touch.[12] I would strongly encourage you to do this even if you would rather not work at FHI, as there are remote positions possible if needed, and other organizations I can refer you to. I would also strongly encourage you to do this even if you are reluctant to stop or put on hold whatever you are currently doing. Please also encourage your friends who likely would be good at this to strongly consider it. If I am correct, the bottleneck in this space is holding back a lot of potentially vital action by many, many people who cannot be mobilized until they have a direction in which to push. (The framers need the foundation finished before they can start.) Anything you can contribute to advancing this field of research will have dramatic force multiplicative effects by “creating jobs” for dozens or hundreds of other researchers and implementers. You should also consider applying for one or both of the AI Macrostrategy roles at FHI if you see this before 29 Sept 2017.[13]

If you are unsure of your skill with disentanglement research, I would strongly encourage you to try to make some independent progress on a question of this type and see how you do. I realize this task itself is a bit under-defined, but that is also really part of the problem space itself, and the thing you are trying to test your skills with. Read around in the area, find something sticky you think you might be able to disentangle, and take a run at it.[14] If it goes well, whether or not you want to get into the space immediately, please send it in.

If you feel as though you might be a borderline candidate because of your relative inexperience with an area of in-demand expertise, you might consider trying to tool up a bit in the area, or applying for an internship. You might also err on the side of sending in a CV and cover letter just in case you are miscalibrated about your skill compared to other applicants. That said, again, do not think that you not being immediately employed is any reflection of your expected value in this space! Do not be discouraged, please stay interested, and continue to pursue this!

Preparation for mobilization

Being a contributor to this effort, as I imagine it, requires investing in yourself, your career, and the community, while positioning yourself well for action once the bottleneck unjams and robust strategic direction is clearer.

I also highly recommend investing in building up your skills and career capital. This likely means excelling in school, going to graduate school, pursuing relevant internships, building up your CV, etc. Invest heavily in yourself. Additionally, stay in close communication with the EA community and keep up to date with opportunities in this space as they develop. (Several people are currently looking at starting programs specifically to on-ramp promising people into this space. This is one reason why signing up to the newsletters might be really valuable, so that opportunities are not missed.) To repeat myself from above, attend meet-ups and conferences, read the forums and newsletters, and be active in the community. Ideally this cause area will become a sub-community within EA and a strong self-reinforcing career network.

A good way to determine how to prepare and tool up for a career in either AI policy research or implementation is to look at the 80,000 Hours’ Guide to working in AI policy and strategy. Fields of study that are likely to be most useful for AI policy implementation include policy, politics and international relations, quantitative social sciences, and law.

Especially useful is finding roles of influence or importance, even with low probability but high expected value, within (especially the US federal) government.[15] Other potentially useful paths include non-profit management, project management, communications, public relations, grantmaking, policy advising at tech companies, lobbying, party and electoral politics and advising, political “staffing,” or research within academia, think tanks, or large corporate research groups, especially in the areas of machine learning, policy, governance, law, defense, and related fields. A lot of information about the skills needed for various sub-fields within this area is available at 80,000 Hours.

Working in operations

Another important bottleneck in this space, though smaller in my estimation than the main bottleneck, is in institutional capacity within this currently tiny field. As mentioned already above, FHI needs to fill 4 separate operations roles at senior and junior levels in the next few months. (We are also in need of a temporary junior-level operations person immediately; if you are a UK citizen, consider getting in touch about this!)[16][17] Other organizations in this space have similar shortages. If you are an experienced manager, administrator, or similar, please consider applying or getting in touch for our senior roles. Alternatively, if you are freshly out of school, but have some proven hustle (especially proven by extensive extracurricular involvement, such as running projects or groups) and would potentially like to take a few years to advance this cause area before going to graduate school or locking in a career path, consider applying for a junior operations position, or get in touch.[18] Keep in mind that operations work at an organization like FHI can be a fantastic way to tool up and gain fluency in this space, orient yourself, discover your strengths and interests, and make contacts, even if one intends to move on to non-operations roles eventually.

Conclusion

The points I hope you can take away in approximate order of importance:

1)    If you are interested in advancing this area, stay involved. Your expected value is extremely high, even if there are no excellent immediate opportunities to have a direct impact. Please join this community, and build up your capacity for future research and policy impact in this space.

2)    If you are good at “disentanglement research” please get in touch, as I think this is our major bottleneck in the area of AI strategy research, and is preventing earlier and broader mobilization and utilization of our community’s talent.

3)    If you are strong or moderately strong in key high-value areas, please also get in touch. (Perhaps err on the side of getting in touch if you are unsure.)

4)    Excellent things to do to add value to this area, in expectation, include:

a)    Investing in your skills and career capital, especially in high-value areas, such as studying in-demand topics.

b)    Building a career in a position of influence (especially in government, global institutions, or in important tech firms.)

c)    Helping to build up this community and its capacity, including building a strong and mutually reinforcing career network among people pursuing AI policy implementation from an EA or altruistic perspective.

5)    Also of very high value is operations work and other efforts to increase institutional capacity.

Thank you for taking the time to read this. While it is very unfortunate that the current ground reality is, as far as I can tell, not well structured for immediate wide mobilization, I am confident that we can do a great deal of preparatory and positioning work as a community, and that with some forceful pushing on these bottlenecks, we can turn this enormous latent capacity into extremely valuable impact.

Let’s get going “doing good together” as we navigate this difficult area, and help make a tremendous future!

Endnotes:

[1] For those of you not in this category who are interested in seeing why you might want to be, I recommend this short EA Global talk, the Policy Desiderata paper, and OpenPhil’s analysis. For a very short consideration on why the far future matters, I recommend this very short piece, and for a quick fun primer on AI as transformative I recommend this. Finally, once the hook is set, the best resource remains Superintelligence.

[2] Relatedly, I want to thank Miles Brundage, Owen Cotton-Barratt, Allan Dafoe, Ben Garfinkel, Roxanne Heston, Holden Karnofsky, Jade Leung, Kathryn Mecrow, Luke Muehlhauser, Michael Page, Tanya Singh, and Andrew Snyder-Beattie for their comments on early drafts of this post. Their input dramatically improved it. That said, again, they should not be viewed as endorsing anything in this. All mistakes are mine. All views are mine.

[3] There are some interesting tentative taxonomies and definitions of the research space floating around. I personally find the following, quoting from a draft document by Allan Dafoe, especially useful:

AI strategy [can be divided into]... four complementary research clusters: the technical landscape, AI politics, AI governance, and AI policy. Each of these clusters characterizes a set of problems and approaches, within which the density of conversation is likely to be greater. However, most work in this space will need to engage the other clusters, drawing from and contributing high-level insights. This framework can perhaps be clarified by analogy to the problem of building a new city. The technical landscape examines the technical inputs and constraints to the problem, such as trends in the price and strength of steel. Politics considers the contending motivations of various actors (such as developers, residents, businesses), the possible mutually harmful dynamics that could arise and strategies for cooperating to overcome them. Governance involves understanding the ways that infrastructure, laws, and norms can be used to build the best city, and proposing ideal masterplans of these to facilitate convergence on a common good vision. The policy cluster involves crafting the actual policies to be implemented to build this city.

In a comment on this draft, Jade Leung pointed out what I think is an important implicit gap in the terms I am using, and highlighted the importance of not treating these as either final, comprehensive, or especially applicable outside of this piece:

There seems to be a gap between [AI policy implementation] and 'AI strategy research' - where does the policy research feed in? I.e. the research required to canvas and analyse policy mechanisms by which strategies are most viably realised, prior to implementation (which reads here more as boots-on-the-ground alliance building, negotiating, resource distribution etc.)

[4] Definition lightly adapted from Allan Dafoe and Luke Muehlhauser.

[5] This idea owes a lot to conversations with Owen Cotton-Barratt, Ben Garfinkel, and Michael Page.

[6] I did not get a sense that any reviewer necessarily disagreed that this is a fair conceptualization of a type of research in this space, though some questioned its importance or centrality to current AI strategy research. I think the central disagreement here is on how many well-defined and concrete questions there are left to answer at the moment, how far answering them is likely to go in bringing clarity to this space and developing robust policy recommendations, and the relative marginal value of addressing these existing questions versus producing more through disentanglement of the less well defined areas.

[7] One commenter did not think these were a good sample of important questions. Obviously this might be correct, but in my opinion, these are absolutely among the most important questions to gain clarity on quickly.

[8] My personal opinion is that there are only three or maybe four robust policy-type recommendations we can make to governments at this time, given our uncertainty about strategy: 1) fund safety research, 2) commit to a common good principle, and 3) avoid arms races. The fourth suggestion is both an extension of the other three and is tentative, but is something like: fund joint intergovernmental research projects located in relatively geopolitically neutral countries with open membership and a strong commitment to a common good principle.

I should note that this point was also flagged as potentially controversial by one reviewer. Additionally, Miles Brundage, quoted below, had some useful thoughts related to my tentative fourth suggestion:

In general, detailed proposals at this stage are unlikely to be robust due to the many gaps in our strategic and empirical knowledge. We "know" arms races are probably bad but there are many imaginable ways to avoid or mitigate them, and we don't really know what the best approach is yet. For example, launching big new projects might introduce various opportunities for leakage of information that weren't there before, and politicize the issue more than might be optimal as the details are worked out. As an example of an alternative, governments could commit to subsidizing (e.g. through money and hardware access) existing developers that open themselves up to inspections, which would have some advantages and some disadvantages over the neutrally-sited new project approach.

[9] This is an area with extreme and unusual enough considerations that it seems to break normal heuristics, or at least my normal heuristics. I have personally heard at least minimally plausible arguments made by thoughtful people that openness, antitrust law and competition, government regulation, advocating opposition to lethal autonomous weapons systems, and drawing wide attention to the problems of AI might be bad things, and invasive surveillance, greater corporate concentration, and weaker cyber security might be good things. (To be clear, these were all tentative, weak, but colourable arguments, made as part of exploring the possibility space, not strongly held positions by anyone.) I find all of these very counter-intuitive.

[10] A useful comment from a reviewer on this point: “These problems are related: We desperately need new institutions to house all the important AI strategy work, but we can't know what institutions to build until we've answered more of the foundational questions.”

[11] Credit for the heroic effort of assembling this goes mostly to Matthijs Maas. While I contributed a little, I have myself only read a tiny fraction of these.

[12] fhijobs@philosophy.ox.ac.uk.

[13] Getting in touch is a good action even if you cannot or would rather not work at FHI. In my opinion, AI strategy researchers would ideally cluster in one or more research groups in order to advance this agenda as quickly as possible, but there is also some room for remote scholarship. (The AI strategy programme at FHI is currently trying to become the first of these “cluster” research groups, and we are recruiting in this area aggressively.)

[14] I’m personally bad enough at this, that my best advice is something like read around in the area, find a topic, and “do magic.” Accordingly, I will tag in Jade Leung again for a suggestion of what a “sensible, useful deliverable of 'disentanglement research' would look like”:

A conceptual model for a particular interface of the AI strategy space, articulating the sub-components, exogenous and endogenous variables of relevance, linkages etc.; An analysis of driver-pressure-interactions for a subset of actors; a deconstruction of a potential future scenario into mutually-exclusive-collectively-exhaustive (MECE) hypotheses.

Ben Garfinkel similarly volunteered to help clarify “by giving an example of a very broad question that seem[s] to require some sort of "detangling" skill:”

What does the space of plausible "AI development scenarios" look like, and how do their policy implications differ?

If AI strategy is "the study of how humanity can best navigate the transition to a world with advanced AI systems," then it seems like it ought to be quite relevant what this transition will look like. To point at two very different possibilities, there might be a steady, piecemeal improvement of AI capabilities -- like the steady, piecemeal improvement of industrial technology that characterized the industrial revolution -- or there might be a discontinuous jump, enabled by sudden breakthroughs or an "intelligence explosion," from roughly present-level systems to systems that are more capable than humans at nearly everything. Or -- more likely -- there might be a transition that doesn't look much like either of these extremes.

Robin Hanson, Eliezer Yudkowsky, Eric Drexler, and others have all emphasized different visions of AI development, but have also found it difficult to communicate the exact nature of their views to one another. (See, for example, the Hanson-Yudkowsky "foom" debate.) Furthermore, it seems to me that their visions don't cleanly exhaust the space, and will naturally be difficult to define given the fact that so many of the relevant concepts--like "AGI," "recursive self-improvement," "agent/tool/goal-directed AI," etc.--are currently so vague.

I think it would be very helpful to have a good taxonomy of scenarios, so that we could begin to make (less ambiguous) statements like, "Policy X would be helpful in scenarios A and B, but not in scenario C," or, "If possible, we ought to try to steer towards scenario A and away from B." AI strategy is not there yet, though.

A related, "entangled" question is: Across different scenarios, what is the relationship between short and medium-term issues (like the deployment of autonomous weapons systems, or the automation of certain forms of cyberattacks) and the long-term issues that are likely to arise as the space of AI capabilities starts to subsume the space of human capabilities? For a given scenario, can these two (rough) categories of issues be cleanly "pulled apart"?

[15] 80,000 Hours is experimenting with having a career coach specialize in this area, so you might consider getting in touch with them, or getting in touch with them again, if you might be interested in pursuing this route.

[16] fhijobs@philosophy.ox.ac.uk. This is how I snuck into FHI ~2 years ago, on a 3-week temporary contract as an office manager. I flew from the US on 4 days' notice for the chance to try to gain fluency in the field. While my case of “working my way up from the mail room” is not likely to be typical (I had a strong CV), or necessarily a good model to encourage (see next footnote below), it is definitely the case that you can pick up a huge amount through osmosis at FHI, and develop a strong EA career network. This can set you up well for a wise choice of graduate programs or other career direction decisions.

[17]  One reviewer cautioned against encouraging a dynamic in which already highly qualified people take junior operations roles with the expectation of transitioning directly into a research position, since this can create awkward dynamics and a potentially unhealthy institutional culture. I think this is probably, or at least plausibly, correct. Accordingly, while I think a junior operations role is great for building skills and orienting yourself, it should probably not be seen as a way of immediately transitioning to strategy research, but treated more as a method for turning post-college uncertainty into a productive plan, while also gaining valuable skills and knowledge, and directly contributing to very important work.

[18] Including locking in a career path continuing in operations. This really is an extremely high-value area for a career, and badly overlooked and neglected.

The Great Filter isn't magic either

3 27 September 2017 04:56PM

Crossposted at Less Wrong 2.0. A post suggested by James Miller's presentation at the Existential Risk to Humanity conference in Gothenburg.

Seeing the emptiness of the night sky, we can dwell upon the Fermi paradox: where are all the alien civilizations that simple probability estimates imply we should be seeing?

Especially given the ease of moving within and between galaxies, the cosmic emptiness implies a Great Filter: something that prevents planets from giving birth to star-spanning civilizations. One worrying possibility is the likelihood that advanced civilizations end up destroying themselves before they reach the stars.

The Great Filter as an Outside View

In a sense, the Great Filter can be seen as an ultimate example of the Outside View: we might have all the data and estimation we believe we would ever need from our models, but if those models predict that the galaxy should be teeming with visible life, then it doesn't matter how reliable our models seem: they must be wrong.

In particular, if you fear a late great filter - if you fear that civilizations are likely to destroy themselves - then you should increase your fear, even if "objectively" everything seems to be going all right. After all, presumably the other civilizations that destroyed themselves thought everything seemed to be going all right. Then you can adjust your actions using your knowledge of the great filter - but presumably other civilizations also thought of the great filter and adjusted their own actions as well, and that didn't save them. So maybe you need to try something different again, or maybe you can do something that breaks the symmetry from the timeless decision theory perspective, like sending a massive signal to the galaxy...

The Great Filter isn't magic

It can all get very headache-inducing. But, just as the Outside View isn't magic, the Great Filter isn't magic either. If advanced civilizations destroy themselves before becoming space-faring or leaving an imprint on the galaxy, then there is some phenomenon that is the cause of this. What can we say, if we look analytically at the great filter argument?

First of all, suppose we had three theories: early great filter (technological civilizations are rare), late great filter (technological civilizations destroy themselves before becoming space-faring), or no great filter. Then we look up at the empty skies, and notice no aliens. This rules out the third theory, but leaves the relative probabilities of the other two intact.
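The update described above can be checked with a small Bayes calculation. This is only a sketch: the prior values, and the assumption that both filter theories predict an empty sky with probability 1 while "no filter" predicts it with probability 0, are hypothetical and chosen for illustration.

```python
# Hypothetical priors over the three theories (illustrative numbers only).
priors = {"early_filter": 0.2, "late_filter": 0.3, "no_filter": 0.5}

# Assumed likelihood of observing an empty sky under each theory:
# both filter theories predict it, "no filter" does not.
likelihood_empty_sky = {"early_filter": 1.0, "late_filter": 1.0, "no_filter": 0.0}


def posterior(priors, likelihood):
    """Bayes' rule: multiply prior by likelihood, then renormalise."""
    unnormalised = {h: priors[h] * likelihood[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


post = posterior(priors, likelihood_empty_sky)

# The ratio early:late is the same before and after the observation.
prior_ratio = priors["early_filter"] / priors["late_filter"]
post_ratio = post["early_filter"] / post["late_filter"]
print(post)
print(prior_ratio, post_ratio)
```

Whatever priors are plugged in, an observation that both surviving theories predict equally well cannot shift their odds against each other; it only removes the third option.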

Then we can look at objective evidence. Is human technological civilization likely to end in a nuclear war? Possibly, but are the odds in the 99.999% range that would be needed to explain the Fermi Paradox? Every year that has gone by has reduced the likelihood that nuclear war is very very very very likely. So a late Great Filter may have seemed quite probable compared with an early one, but much of the evidence we see is against it (especially if we assume that AI - which is not a Great Filter! - might have been developed by now). Million-to-one prior odds can be overcome by merely 20 bits of information.
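The "20 bits" figure is just a base-2 logarithm: each bit of evidence doubles the odds. A minimal sketch of the arithmetic (the function names are illustrative, not from the post):

```python
import math


def bits_needed(prior_odds_against):
    """Bits of evidence (log2 of the likelihood ratio) needed to move
    prior odds of `prior_odds_against`:1 against up to even odds."""
    return math.log2(prior_odds_against)


def posterior_odds(prior_odds_against, bits):
    """Odds in favour after receiving `bits` bits of evidence,
    starting from 1:`prior_odds_against`."""
    return (2 ** bits) / prior_odds_against


needed = bits_needed(1_000_000)
after_20_bits = posterior_odds(1_000_000, 20)
print(needed, after_20_bits)
```

Since 2^20 = 1,048,576 is slightly more than a million, 20 bits are just enough to turn million-to-one against into slightly better than even.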

And what about the argument that we have to assume that prior civilizations would also have known of the Great Filter and thus we need to do more than they would have? In your estimation, is the world currently run by people taking the Great Filter arguments seriously? What is the probability that the world will be run by people that take the Great Filter argument seriously? If this probability is low, we don't need to worry about the recursive aspect; the ideal situation would be if we can achieve:

1. Powerful people taking the Great Filter argument seriously.

2. Evidence that it was hard to make powerful people take the argument seriously.

Of course, successfully achieving 1 is evidence against 2, but the Great Filter doesn't work by magic. If it looks like we achieved something really hard, then that's some evidence that it is hard. Every time we find something unlikely with a late Great Filter, that shifts some of the probability mass away from the late great filter and into alternative hypotheses (early Great Filter, zoo hypothesis,...).

Variance and error of xrisk estimates

But let's focus narrowly on the probability of the late Great Filter.

Current estimates for the risk of nuclear war are uncertain, but let's arbitrarily assume that the risk is 10% (overall, not per year). Suppose one of two papers comes out:

1. Paper A shows that current estimates of nuclear war have not accounted for a lot of key facts; when these facts are added in, the risk of nuclear war drops to 5%.

2. Paper B is a massive model of international relationships with a ton of data and excellent predictors and multiple lines of evidence, all pointing towards the real risk being 20%.

What would either paper mean from the Great Filter perspective? Well, counter-intuitively, papers like A typically increase the probability for nuclear war being a Great Filter, while papers like B decrease it. This is because none of 5%, 10%, and 20% are large enough to account for the Great Filter, which requires probabilities in the 99.99% range. And, though paper A decreases the probability of nuclear war, it also leaves more room for uncertainties - we've seen that a lot of key facts were missing in previous papers, so it's plausible that there are key facts still missing from this one. On the other hand, though paper B increases the probability, it makes it unlikely that the probability will be raised any further.

So if we fear the Great Filter, we should not look at risks whose probabilities are high, but risks whose uncertainty is high, where the probability of us making an error is high. If we consider our future probability estimates as a random variable, then the one whose variance is higher is the one to fear. So a late Great Filter would make biotech risks even worse (current estimates of risk are poor) while not really changing asteroid impact risks (current estimates of risk are good).
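One way to make the variance argument concrete is to treat the "true" risk as uncertain and ask how much probability mass sits above the Great Filter level. The sketch below assumes, purely for illustration, that our uncertainty is normal in log-odds space; the spreads assigned to papers A and B (3.0 and 0.5) are invented numbers standing in for "many key facts may still be missing" and "multiple lines of evidence agree".

```python
import math
from statistics import NormalDist


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))


def prob_risk_exceeds(threshold, median_risk, log_odds_sd):
    """P(true risk > threshold), assuming our belief about the true risk
    is normal in log-odds space, centred on `median_risk`."""
    belief = NormalDist(mu=logit(median_risk), sigma=log_odds_sd)
    return 1 - belief.cdf(logit(threshold))


FILTER_LEVEL = 0.9999  # risk level needed to act as a Great Filter

# Paper A: lower point estimate, but wide uncertainty (hypothetical spread).
paper_a = prob_risk_exceeds(FILTER_LEVEL, median_risk=0.05, log_odds_sd=3.0)
# Paper B: higher point estimate, but tightly constrained (hypothetical spread).
paper_b = prob_risk_exceeds(FILTER_LEVEL, median_risk=0.20, log_odds_sd=0.5)
print(paper_a, paper_b)
```

Under these assumptions the wide, low estimate of paper A leaves more mass in the Great Filter range than the tight, higher estimate of paper B, which is the counter-intuitive point made above. The conclusion depends on the assumed spreads, not on the particular distribution family.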

The Outside View isn't magic

6 27 September 2017 02:37PM

Crossposted at Less Wrong 2.0.

The planning fallacy is an almost perfect example of the strength of using the outside view. When asked to predict the time taken for a project that they are involved in, people tend to underestimate the time needed (in fact, they tend to predict as if the question was how long things would take if everything went perfectly).

Simply telling people about the planning fallacy doesn't seem to make it go away. So the outside view argument is that you need to put your project into the "reference class" of other projects, and expect time overruns as compared to your usual, "inside view" estimates (which focus on the details you know about the project).

So, for the outside view, what is the best way of estimating the time of a project? Well, to find the right reference class for it: the right category of projects to compare it with. You can compare the project with others that have similar features - number of people, budget, objective desired, incentive structure, inside view estimate of time taken etc... - and then derive a time estimate for the project that way.

That's the outside view. But to me, it looks a lot like... induction. In fact, it looks a lot like the elements of a linear (or non-linear) regression. We can put those features (at least the quantifiable ones) into a linear regression with a lot of data about projects, shake it all about, and come up with regression coefficients.
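As a toy version of that regression, the sketch below fits a one-feature least-squares line predicting actual duration from the inside-view estimate. The data points are made up for illustration, and a real model would use several features (budget, team size, and so on) rather than one.

```python
# Hypothetical past projects: (inside-view estimate, actual duration), in days.
past_projects = [(10, 16), (20, 29), (30, 46), (40, 58), (50, 76)]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(past_projects)


def outside_view_estimate(inside_estimate):
    """Predicted actual duration given a new inside-view estimate."""
    return intercept + slope * inside_estimate


print(slope, intercept, outside_view_estimate(25))
```

With this invented data the fitted slope is greater than 1, so the regression systematically lengthens any inside-view estimate fed into it. That is the "outside view" correction, expressed as ordinary induction.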

At that point, we are left with a decent project timeline prediction model, and another example of human bias. The fact that humans often perform badly in prediction tasks is not exactly new - see for instance my short review on the academic research on expertise.

So what exactly is the outside view doing in all this?

The role of the outside view: model incomplete and bias human

The main use of the outside view, for humans, seems to be to point out either an incompleteness in the model or a human bias. The planning fallacy has both of these: if you did a linear regression comparing your project with all projects with similar features, you'd notice your inside estimate was more optimistic than the regression - your inside model is incomplete. And if you also compared each person's initial estimate with the ultimate duration of their project, you'd notice a systematically optimistic bias - you'd notice the planning fallacy.

The first type of errors tend to go away with time, if the situation is encountered regularly, as people refine models, add variables, and test them on the data. But the second type remains, as human biases are rarely cleared by mere data.

Reference class tennis

If use of the outside view is disputed, it often develops into a case of reference class tennis - where people with opposing sides insist or deny that a certain example belongs in the reference class (similarly to how, in politics, anything positive is claimed for your side and anything negative assigned to the other side).

But once the phenomenon you're addressing has an explanatory model, there are no issues of reference class tennis any more. Consider for instance Goodhart's law: "When a measure becomes a target, it ceases to be a good measure". A law that should be remembered by any minister of education wanting to reward schools according to improvements to their test scores.

This is a typical use of the outside view: if you'd just thought about the system in terms of inside facts - tests are correlated with child performance; schools can improve child performance; we can mandate that test results go up - then you'd have missed several crucial facts.

But notice that nothing mysterious is going on. We understand exactly what's happening here: schools have ways of upping test scores without upping child performance, and so they decided to do that, weakening the correlation between score and performance. Similar things happen in the failures of command economies; but again, once our model is broad enough to encompass enough factors, we get decent explanations, and there's no need for further outside views.

In fact, we know enough that we can show when Goodhart's law fails: when no-one with incentives to game the measure has control of the measure. This is one of the reasons central bank interest rate setting has been so successful. If you order a thousand factories to produce shoes, and reward the managers of each factory for the number of shoes produced, you're heading to disaster. But consider GDP. Say the central bank wants to increase GDP by a certain amount, by fiddling with interest rates. Now, as a shoe factory manager, I might have preferences about the direction of interest rates, and my sales are a contributor to GDP. But they are a tiny contributor. It is not in my interest to manipulate my sales figures, in the vague hope that, aggregated across the economy, this will falsify GDP and change the central bank's policy. The reward is too diluted, and would require coordination with many other agents (and coordination is hard).

Thus if you're engaging in reference class tennis, remember the objective is to find a model with enough variables, and enough data, so that there is no more room for the outside view - a fully understood Goodhart's law rather than just a law.

In the absence of a successful model

Sometimes you can have a strong trend without a compelling model. Take Moore's law, for instance. It is extremely strong, going back decades, and surviving multiple changes in chip technology. But it has no clear cause.

A few explanations have been proposed. Maybe it's a consequence of its own success, of chip companies using it to set their goals. Maybe there's some natural exponential rate of improvement in any low-friction feature of a market economy. Exponential-type growth in the short term is no surprise - that just means growth is proportional to investment - so maybe it was an amalgamation of various short term trends.

Do those explanations sound unlikely? Possibly, but there is a huge trend in computer chips going back decades that needs to be explained. They are unlikely, but they have to be weighed against the unlikeliness of the situation. The most plausible explanation is a combination of the above and maybe some factors we haven't thought of yet.

But here's an explanation that is implausible: little time-travelling angels modify the chips so that they follow Moore's law. It's a silly example, but it shows that not all explanations are created equal, even for phenomena that are not fully understood. In fact there are four broad categories of explanations for putative phenomena that don't have a compelling model:

1. Unlikely but somewhat plausible explanations.
2. We don't have an explanation yet, but we think it's likely that there is an explanation.
3. The phenomenon is a coincidence.
4. Any explanation would go against stuff that we do know, and would be less likely than coincidence.

The explanations I've presented for Moore's law fall into category 1. Even if we hadn't thought of those explanations, Moore's law would fall into category 2, because of the depth of evidence for Moore's law and because a "medium length regular technology trend within a broad but specific category" is something that is intrinsically likely to have an explanation.

Compare with Kurzweil's "law of time and chaos" (a generalisation of his "law of accelerating returns") and Robin Hanson's model where the development of human brains, hunting, agriculture and the industrial revolution are all points on a trend leading to uploads. I discussed these in a previous post, but I can now better articulate the problem with them.

Firstly, they rely on very few data points (the more recent part of Kurzweil's law, the part about recent technological trends, has a lot of data, but the earlier part does not). This raises the probability that they are a mere coincidence (we should also consider selection bias in choosing the data points, which increases the probability of coincidence). Secondly, we have strong reasons to suspect that there won't be any explanation that ties together things like the early evolution of life on Earth, human brain evolution, the agricultural revolution, the industrial revolution, and future technology development. These phenomena have decent local explanations that we already roughly understand (local in time and space to the phenomena described), and these run counter to any explanation that would tie them together.

Human biases and predictions

There is one area where the outside view can still function for multiple phenomena across different eras: when it comes to pointing out human biases. For example, we know that doctors have been authoritative, educated, informed, and useless for most of human history (or possibly much worse than useless). Hence authoritative, educated, and informed statements or people are not to be considered of any value, unless there is some evidence the statement or person is truth tracking. We now have things like expertise research, some primitive betting markets, and track records to try and estimate their experience; these can provide good "outside views".

And the authors of the models of the previous section have some valid points where bias is concerned. Kurzweil's point that (paraphrasing) "things can happen a lot faster than some people think" is valid: we can compare predictions with outcomes. Robin has similar valid points in defense of the possibility of the em scenario.

The reason these explanations are more likely valid is because they have a very probable underlying model/explanation: humans are biased.

Conclusions

• The outside view is a good reminder for anyone who may be using too narrow a model.
• If the model explains the data well, then there is no need for further outside views.
• If there is a phenomenon with data but no convincing model, we need to decide if it's a coincidence or there is an underlying explanation.
• Some phenomena have features that make it likely that there is an explanation, even if we haven't found it yet.
• Some phenomena have features that make it unlikely that there is an explanation, no matter how much we look.
• Outside view arguments that point at human prediction biases, however, can be generally valid, as they only require the explanation that humans are biased in that particular way.

Economics of AI conference from NBER

1 27 September 2017 01:45AM

The speaker list (including presenters and moderators) includes many prominent names in the economics world, including:

And others with whom you might be more familiar than I.

[Link] Cognitive Empathy and Emotional Labor

0 26 September 2017 08:36PM

Rational Feed: Last Week's Community Articles and Some Recommended Posts

6 25 September 2017 01:41PM

===Highly Recommended Articles:

Why I Am Not A Quaker Even Though It Often Seems As Though I Should Be by Ben Hoffman - Quakers have consistently gotten to the right answers faster than most people, or the author. Arbitrage strategies to beat the quakers. An incomplete survey of alternatives.

Could A Neuroscientist Understand A Microprocessor by Rationally Speaking - "Eric Jonas, discussing his provocative paper titled 'Could a Neuroscientist Understand a Microprocessor?' in which he applied state-of-the-art neuroscience tools, like lesion analysis, to a computer chip. By applying neuroscience's tools to a system that humans fully understand he was able to reveal how surprisingly uninformative those tools actually are."

Reasonable Doubt New Look Whether Prison Growth Cuts Crime by Open Philanthropy - Part 1 of a four-part, in-depth series on Criminal Justice reform. The remaining posts are linked below. "I estimate, that at typical policy margins in the United States today, decarceration has zero net impact on crime. That estimate is uncertain, but at least as much evidence suggests that decarceration reduces crime as increases it. The crux of the matter is that tougher sentences hardly deter crime, and that while imprisoning people temporarily stops them from committing crime outside prison walls, it also tends to increase their criminality after release. As a result, “tough-on-crime” initiatives can reduce crime in the short run but cause offsetting harm in the long run. Empirical social science research—or at least non-experimental social science research—should not be taken at face value. Among three dozen studies I reviewed, I obtained or reconstructed the data and code for eight. Replication and reanalysis revealed significant methodological concerns in seven and led to major reinterpretations of four. These studies endured much tougher scrutiny from me than they did from peer reviewers in order to make it into academic journals. Yet given the stakes in lives and dollars, the added scrutiny was worth it. So from the point of view of decision makers who rely on academic research, today’s peer review processes fall well short of the optimal."

===Scott:

L Dopen Thread by Scott Alexander - Bi-weekly public open thread. Berkeley SSC meetup. New ad for the Greenfield Guild, an online network of software consultants. Reasons to respect the society of friends.

Meditative States As Mental Feedback Loops by Scott Alexander - The main reason we don't see emotional positive feedback loops is that people get distracted. If you do not get distracted you can experience a bliss feedback loop.

Book Review Mastering The Core Teachings Of The Buddha by Scott Alexander - "Buddhism For ER Docs. ER docs are famous for being practical, working fast, and thinking everyone else is an idiot. MCTB delivers on all three counts." Practical Buddhism with a focus on getting things done. Buddhism is split into morality, concentration, and wisdom. Discussion of "the Dark Night of the Soul", a sort of depression that occurs when you have had some but not enough spiritual experience.

===Rationalist:

Impression Track Records by Katja Grace - Three reasons it's better to keep impression track records and belief track records separate.

Why I Am Not A Quaker Even Though It Often Seems As Though I Should Be by Ben Hoffman - Quakers have consistently gotten to the right answers faster than most people, or the author. Arbitrage strategies to beat the quakers. An incomplete survey of alternatives.

The Best Self Help Should Be Self Defeating by mindlevelup - "Self-help is supposed to get people to stop needing it. But typical incentives in any medium mean that it’s possible to get people hooked on your content instead. A musing on how the setup for writing self-help differs from typical content."

Nobody Does The Thing That They Are Supposedly Doing by Kaj Sotala - "In general, neither organizations nor individual people do the thing that their supposed role says they should do." Evolutionary incentives. Psychology of motivation. Very large number of links.

Out To Get You by Zvi Moshowitz - "Some things are fundamentally Out to Get You. They seek resources at your expense. Fees are hidden. Extra options are foisted upon you." You have four responses: Get Gone, Get Out (give up), Get Compact (limit what it wants) or Get Ready for Battle.

In Defense Of Unreliability by Ozy - Zvi claims that when he makes plans with friends in the Bay he never assumes the plan will actually occur. Ozy depends on unreliable transport. Getting places 10-15 minutes early is also costly. Flaking and agoraphobia.

Strategic Goal Pursuit And Daily Schedules by Rossin (lesswrong) - The author benefitted from Anna Salamon’s goal-pursuing heuristics and daily schedules.

Why Attitudes Matter by Ozy - Focusing on attitudes can be bad for some people. Two arguments: "First, for any remotely complicated situation, it would be impossible to completely list out all the things which are okay or not okay. Second, an attitude emphasis prevents rules-lawyering."

Humans Cells In Multicellular Future Minds by Robin Hanson - In general humans replace specific systems with more general adaptive systems. Seeing like a State. Most biological and cultural systems are not general. Multi-cellular organisms are tremendously inefficient. The power of entrenched systems. Human brains are extremely general. Human brains may win for a long time vs other forms of intelligence.

Recognizing Vs Generating An Important Dichotomy For Life by Gordon (Map and Territory) - Bullet Points -> Essay vs Essay -> Bullet Points. Generating ideas vs critique. Most advice is bad since it doesn't convey the reasons clearly. Let the other person figure out the actual advice for themselves.

Prediction Markets Update by Robin Hanson - Prediction markets provide powerful information but they challenge powerful entrenched interests, Hanson compares them to "a knowledgeable Autist in the C-suite". Companies selling straight prediction market tech mostly went under. Blockchain platforms for prediction markets. Some discussion of currently promising companies.

===AI:

Focus Areas Of Worst Case Ai Safety by The Foundational Research Institute - Redundant safety measures. Tripwires. Adversarial architectures. Detecting and formalizing suffering. Backup utility functions. Benign testing environments.

Srisk Faq by Tobias Baumann (EA forum) - Quite detailed responses to questions about suffering risks and their connection to AGI. Sections: General questions, The future, S-risks and x-risks, Miscellaneous.

===EA:

Reasonable Doubt New Look Whether Prison Growth Cuts Crime by Open Philanthropy - Part 1 of a four-part, in-depth series on Criminal Justice reform. The remaining posts are linked below. "I estimate, that at typical policy margins in the United States today, decarceration has zero net impact on crime. That estimate is uncertain, but at least as much evidence suggests that decarceration reduces crime as increases it. The crux of the matter is that tougher sentences hardly deter crime, and that while imprisoning people temporarily stops them from committing crime outside prison walls, it also tends to increase their criminality after release. As a result, “tough-on-crime” initiatives can reduce crime in the short run but cause offsetting harm in the long run. Empirical social science research—or at least non-experimental social science research—should not be taken at face value. Among three dozen studies I reviewed, I obtained or reconstructed the data and code for eight. Replication and reanalysis revealed significant methodological concerns in seven and led to major reinterpretations of four. These studies endured much tougher scrutiny from me than they did from peer reviewers in order to make it into academic journals. Yet given the stakes in lives and dollars, the added scrutiny was worth it. So from the point of view of decision makers who rely on academic research, today’s peer review processes fall well short of the optimal."

Paypal Giving Fund by Jeff Kaufman - The PayPal giving fund lets you batch donations and PayPal covers the fees if you use it. Jeff thought there must be a catch but it seems legit.

What Do Dalys Capture by Danae Arroyos (EA forum) - How disability-adjusted life years (DALYs) are computed. DALYs misrepresent mental health. DALYs miss indirect effects. Other issues.

Against Ea Pr by Ozy - The EA community is the only large entity trying to produce accurate and publicly available assessments of charities. Hence the EA community should not trade away any honesty. EAs should simply say which causes and organizations are most effective, they should not worry about PR concerns.

Ea Survey 2017 Series Qualitative Comments Summary by tee (EA forum) - Are you an EA, how welcoming is EA, local EA meetup attendance, concerns with not being 'EA enough', improving the survey.

Demographics Ii by tee (EA forum) - Racial breakdown. Percent white in various geographic locations. Political spectrum. Politics correlated with cause area, diet and geography, employment, fields of study, year joining EA.

===Politics and Economics:

Raj Chetty Course Using Big Data Solve Economic Social Problems by Marginal Revolution - Link to an eleven lecture course. "Equality of opportunity, education, health, the environment, and criminal justice. In the context of these topics, the course provides an introduction to basic statistical methods and data analysis techniques, including regression analysis, causal inference, quasi-experimental methods, and machine learning."

Speech On Campus Reply To Brad Delong by Noah Smith - The safeguard put in place to exclude the small minority of genuinely toxic people will be overused. Comparison to the war on terror. Brad's exclusions criteria are incredibly vague. The speech restriction apparatus is patchwork and inconsistent. Cultural Revolution.

Deontologist Envy by Ozy - The behavior of your group is highly unlikely to affect the behavior of your political opponents. Many people respond to proposed tactics by asking "What if everyone did that". Ozy claims these responses show an implicit Kantian or deontological point of view.

Peak Fossil Fuel by Bayesian Investor - Electric cars will have a 99% market share by 2035. "Electric robocars run by Uber-like companies will be cheap enough that you’ll have trouble giving away a car bought today. Uber’s prices will be less than your obsolete car’s costs of fuel, maintenance, and insurance."

What We Didn't Get by Noah Smith - We are currently living in a world envisioned by the cyberpunk writers. The early industrial sci-fi writers also predicted many inventions. Why didn't mid 1900s sci-fi come true? We ran out of theoretical physics and we ran out of energy. Energy density of fuel sources. Some existing or plausible technology is just too dangerous. Discussion of whether strong AI, personal upload, nanotech and/or the singularity will come true.

Unpopular Ideas About Children by Julia Galef - Julia's thoughts on why she is collecting these lists. Parenting styles, pro and anti-natalism, sexuality, punishment, etc. Happiness studies. Some other studies finding extreme results.

The Margin Of Stupid by Noah Smith - Can we trust studies showing that millennials are as racist as their parents, except for the ones in college who are extreme leftists?

Role of Allies in Queer Spaces by Brute Reason - The main purpose of having allies in LGBTQA spaces is providing cover for closeted or questioning members. Genuinely cis-straight allies are ok in some spaces like LGBTQA bands. But straight allies cause problems when they are present in queer support spaces.

The Wonder Of International Adoption by Bryan Caplan - Benefits of international adoption of third world children. Adoptees are extremely stunted physically on arrival but make up some of the difference post adoption. International adoption raises IQ by at least 4 points on average and perhaps as much as 8.

===Misc:

Coin Flipping Problem by protokol2020 - Flipping coins until you get a pre-committed sequence. You re-start whenever your flip doesn't match the sequence. Relationship between the expected number of flips and the length of the sequence.

Seek Not To Be Entertained by Mr. Money Mustache - Don't be normal, normal people need constant entertainment. You can get enjoyment and satisfaction from making things. Advice for people less abnormal than MMM. What you enjoy doesn't matter, what matters is what is good for you.

Propositions On Immortality by sam[]zdat - Fiction. A man digresses about philosophy, the nature of time, the soul, consciousness and mortality.

Comments For Ghost by Tom Bartleby - Ghost is a blog platform that doesn't natively support comments. Three important use cases and why they all benefit from comments: the ex-Wordpress blogger who wants things to 'just work'; power users who care about privacy and don't want to use third party comments; and the Static-Site Fence-Sitter, since the main dynamic content you want is comments.

Prime Crossword by protokol2020 - Can you create a grid larger than [3,7],[1,1] where all the rows and columns are primes? (37, 11, 31 and 71 are prime).

===Podcast:

Reihan Salam by The Ezra Klein Show - Remaking the Republican party, but not the way Donald Trump did it. "The future of the Republican Party, the healthcare debate, and how he would reform our immigration system (and upend the whole way we talk about it). "

Into The Dark Land by Waking Up with Sam Harris - "Siddhartha Mukherjee about his Pulitzer Prize winning book, The Emperor of All Maladies: A Biography of Cancer."

Conversation with Larry Summers by Marginal Revolution - "Mentoring, innovation in higher education, monopoly in the American economy, the optimal rate of capital income taxation, philanthropy, Hermann Melville, the benefits of labor unions, Mexico, Russia, and China, Fed undershooting on the inflation target, and Larry’s table tennis adventure in the summer Jewish Olympics."

Hillary Clinton by The Ezra Klein Show - Hillary's dream of paying for basic income with revenue from shared national resources. Why she scrapped the plan. Hillary thinks she should perhaps have thrown caution to the wind. Hillary isn't a radical; she is proud of the American political system and is annoyed others don't share her enthusiasm for incremental progress.

David Remnick by The Ezra Klein Show - New Yorker editor. "Russia’s meddling in the US election, Russia’s transformation from communist rule to Boris Yeltsin and Vladimir Putin, his magazine’s coverage of President Donald Trump, how he chooses his reporters and editors, and how to build a real business around great journalism."

Gabriel Zucman by EconTalk - "Research on inequality and the distribution of income in the United States over the last 35 years. Zucman finds that there has been no change in income for the bottom half of the income distribution over this time period with large gains going to the top 1%. The conversation explores the robustness of this result to various assumptions and possible explanations for the findings."

Could A Neuroscientist Understand A Microprocessor by Rationally Speaking - "Eric Jonas, discussing his provocative paper titled 'Could a Neuroscientist Understand a Microprocessor?' in which he applied state-of-the-art neuroscience tools, like lesion analysis, to a computer chip. By applying neuroscience's tools to a system that humans fully understand he was able to reveal how surprisingly uninformative those tools actually are."

Open thread, September 25 - October 1, 2017

0 25 September 2017 07:36AM
If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

2 23 September 2017 12:00PM

Intuitive explanation of why entropy maximizes in a uniform distribution?

0 23 September 2017 09:43AM

What is the best mathematical, intuitive explanation of why entropy maximizes in a uniform distribution? I'm looking for a short proof using the most elementary mathematics possible.

Please no explanation like "because entropy was designed in this way", etc...

Naturalized induction – a challenge for evidential and causal decision theory

4 22 September 2017 08:15AM

As some of you may know, I disagree with many of the criticisms leveled against evidential decision theory (EDT). Most notably, I believe that Smoking lesion-type problems don't refute EDT. I also don't think that EDT's non-updatelessness leaves a lot of room for disagreement, given that EDT recommends immediate self-modification to updatelessness. However, I do believe there are some issues with run-of-the-mill EDT. One of them is naturalized induction. It is in fact not only a problem for EDT but also for causal decision theory (CDT) and most other decision theories that have been proposed in- and outside of academia. It does not affect logical decision theories, however.

The role of naturalized induction in decision theory

Recall that EDT prescribes taking the action that maximizes expected utility, i.e.

$\underset{a\in A}{\mathrm{argmax}} ~\mathbb{E}[U(w)|a,o] = \underset{a\in A}{\mathrm{argmax}} \sum_{w\in W} P(w|a,o) U(w),$

where $A$ is the set of available actions, $U$ is the agent's utility function, $W$ is a set of possible world models, and $o$ represents the agent's past observations (which may include information the agent has collected about itself). For the purposes of this article, CDT works in a similar way, except that instead of conditioning on $a$ in the usual way, it calculates some causal counterfactual, such as the one given by Pearl's do-calculus: $P(w|do(a),o)$. The problem of naturalized induction is that of assigning posterior probabilities to world models $P(w|a,o)$ (or $P(w|do(a),o)$ or whatever) when the agent is naturalized, i.e., embedded into its environment.

Consider the following example. Let's say there are 5 world models $W=\{w_1,...,w_5\}$, each of which has equal prior probability. These world models may be cellular automata. Now, the agent makes the observation $o$. It turns out that worlds $w_1$ and $w_2$ don't contain any agents at all, and $w_3$ contains no agent making the observation $o$. The other two world models, on the other hand, are consistent with $o$. Thus, $P(w_i\mid o)=0$ for $i=1,2,3$ and $P(w_i\mid o)=\frac{1}{2}$ for $i=4,5$. Let's assume that the agent has only two actions $A=\{a_1,a_2\}$, that in world model $w_4$ the only agent making observation $o$ takes action $a_1$, and that in $w_5$ the only agent making observation $o$ takes action $a_2$. Then $P(w_4\mid a_1,o)=1=P(w_5\mid a_2,o)$ and $P(w_5\mid a_1,o)=0=P(w_4\mid a_2,o)$. Thus, if, for example, $U(w_5)>U(w_4)$, an EDT agent would take action $a_2$ to ensure that world model $w_5$ is actual.
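The example above can be checked mechanically. The following is only a minimal sketch: the utility values are made up (the text requires nothing beyond $U(w_5)>U(w_4)$), and the names `action_of_observer`, `posterior` and `expected_utility` are hypothetical helpers, not part of any established library.

```python
from fractions import Fraction

worlds = ["w1", "w2", "w3", "w4", "w5"]
prior = {w: Fraction(1, 5) for w in worlds}  # equal prior probability

# Action taken by the only agent observing o in each world.
# None means the world contains no agent making the observation o.
action_of_observer = {"w1": None, "w2": None, "w3": None, "w4": "a1", "w5": "a2"}

# Hypothetical utilities, chosen only so that U(w5) > U(w4).
utility = {"w1": 0, "w2": 0, "w3": 0, "w4": 1, "w5": 2}


def posterior(action=None):
    """Return P(w | o) if action is None, else P(w | action, o)."""
    def consistent(w):
        act = action_of_observer[w]
        return act is not None and (action is None or act == action)

    weights = {w: prior[w] if consistent(w) else Fraction(0) for w in worlds}
    total = sum(weights.values())
    return {w: p / total for w, p in weights.items()}


def expected_utility(action):
    """EDT's expected utility: sum over worlds of P(w | action, o) * U(w)."""
    post = posterior(action)
    return sum(post[w] * utility[w] for w in worlds)


# EDT picks the action with the highest conditional expected utility.
best_action = max(["a1", "a2"], key=expected_utility)
```

Here `posterior()` reproduces the probabilities of $\frac{1}{2}$ for $w_4$ and $w_5$, and `best_action` comes out as `a2`. Note that all of the difficulty discussed below is hidden in the hand-written `action_of_observer` table, which simply asserts which worlds contain the agent.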

The main problem of naturalized induction

This example makes it sound as though it's clear what posterior probabilities we should assign. But in general, it's not that easy. For one, there is the issue of anthropics: if one world model $w_1$ contains more agents observing $o$ than another world model $w_2$, does that mean $P(w_1\mid o) > P(w_2\mid o)$? Whether CDT and EDT can reason correctly about anthropics is an interesting question in itself (cf. Bostrom 2002; Armstrong 2011; Conitzer 2015), but in this post I'll discuss a different problem in naturalized induction: identifying instantiations of the agent in a world model.

It seems that the core of the reasoning in the above example was that some worlds contain an agent observing $o$ and others don't. So, besides anthropics, the central problem of naturalized induction appears to be identifying agents making particular observations in a physicalist world model. While this can often be done uncontroversially – a world containing only rocks contains no agents –, it seems difficult to specify how it works in general. The core of the problem is a type mismatch of the "mental stuff" (e.g., numbers or Strings) $o$ and the "physics stuff" (atoms, etc.) of the world model. Rob Bensinger calls this the problem of "building phenomenological bridges" (BPB) (also see his Bridge Collapse: Reductionism as Engineering Problem).

Sensitivity to phenomenological bridges

Sometimes, the decisions made by CDT and EDT are very sensitive to whether a phenomenological bridge is built or not. Consider the following problem:

One Button Per Agent. There are two similar agents with the same utility function. Each lives in her own room. Both rooms contain a button. If agent 1 pushes her button, it creates 1 utilon. If agent 2 pushes her button, it creates -50 utilons. You know that agent 1 is an instantiation of you. Should you press your button?

Note that this is essentially Newcomb's problem with potential anthropic uncertainty (see the second paragraph here) – pressing the button is like two-boxing, which causally gives you $1k if you are the real agent but costs you $1M if you are the simulation.

If agent 2 is sufficiently similar to you to count as an instantiation of you, then you shouldn't press the button. If, on the other hand, you believe that agent 2 does not qualify as something that might be you, then it comes down to what decision theory you use: CDT would press the button, whereas EDT wouldn't (assuming that the two agents are strongly correlated).
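To make the CDT/EDT split concrete for the case where agent 2 does not count as an instantiation of you, here is a rough sketch. The `correlation` parameter (the probability that agent 2 does whatever you do) and both function names are assumptions made for illustration; the problem statement only says the agents are "similar".

```python
PAYOFF_AGENT_1 = 1    # utilons created if agent 1 presses her button
PAYOFF_AGENT_2 = -50  # utilons created if agent 2 presses her button


def cdt_value(press, p_agent2_presses):
    """CDT: my press only causes agent 1's payoff.

    Agent 2's behaviour is held fixed at probability p_agent2_presses,
    whatever I choose.
    """
    return PAYOFF_AGENT_1 * press + PAYOFF_AGENT_2 * p_agent2_presses


def edt_value(press, correlation=1.0):
    """EDT: my choice is evidence about what agent 2 does.

    correlation is the assumed probability that agent 2 acts as I do.
    """
    p_agent2_presses = correlation if press else 1.0 - correlation
    return PAYOFF_AGENT_1 * press + PAYOFF_AGENT_2 * p_agent2_presses


# CDT gains exactly 1 utilon from pressing, whatever agent 2 does.
cdt_presses = cdt_value(True, 0.5) > cdt_value(False, 0.5)

# With perfect correlation, EDT compares 1 - 50 = -49 against 0.
edt_presses = edt_value(True) > edt_value(False)
```

In this toy model `cdt_presses` is true and `edt_presses` is false. EDT only starts pressing once the assumed correlation drops below 0.51, since pressing is worth $1-50c$ and not pressing is worth $-50(1-c)$.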

It is easy to specify a problem where EDT, too, is sensitive to the phenomenological bridges it builds:

One Button Per World. There are two possible worlds. Each contains an agent living in a room with a button. The two agents are similar and have the same utility function. The button in world 1 creates 1 utilon, the button in world 2 creates -50 utilons. You know that the agent in world 1 is an instantiation of you. Should you press the button?

If you believe that the agent in world 2 is an instantiation of you, both EDT and CDT recommend you not to press the button. However, if you believe that the agent in world 2 is not an instantiation of you, then naturalized induction concludes that world 2 isn't actual and so pressing the button is safe.
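A small numerical sketch of this sensitivity follows. The equal prior over the two worlds is an assumption (the problem statement gives no prior), and `press_value` is a hypothetical helper.

```python
PRIOR_WORLD_2 = 0.5  # assumed prior probability of world 2


def press_value(p_world_2):
    """Expected utilons from pressing, given the probability of world 2."""
    return (1 - p_world_2) * 1 + p_world_2 * (-50)


NOT_PRESSING = 0.0  # not pressing creates no utilons in either world

# Bridge built to the agent in world 2: world 2 stays a live possibility.
value_with_bridge = press_value(PRIOR_WORLD_2)

# No bridge: naturalized induction assigns world 2 probability zero.
value_without_bridge = press_value(0.0)

press_with_bridge = value_with_bridge > NOT_PRESSING
press_without_bridge = value_without_bridge > NOT_PRESSING
```

Under these assumptions pressing is worth -24.5 utilons with the bridge and 1 utilon without it, so the decision flips purely on whether the agent in world 2 is counted as you.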

Building phenomenological bridges is hard and perhaps confused

So, to solve the problem of naturalized induction and apply EDT/CDT-like decision theories, we need to solve BPB. The behavior of an agent is quite sensitive to how we solve it, so we had better get it right.

Unfortunately, I am skeptical that BPB can be solved. Most importantly, I suspect that statements about whether a particular physical process implements a particular algorithm can't be objectively true or false. There seems to be no way of testing any such relations.

Probably we should think more about whether BPB really is doomed. There is also some philosophical literature that seems worth looking into (again, see this Brian Tomasik post; cf. some of Hofstadter's writings and the literatures surrounding "Mary the color scientist", the computational theory of mind, computation in cellular automata, etc.). But at this point, BPB looks confusing/confused enough to look into alternatives.

Assigning probabilities pragmatically?

One might think that one could map between physical processes and algorithms on a pragmatic or functional basis. That is, one could say that a physical process A implements a program p to the extent that the results of A correlate with the output of p. I think this idea goes in the right direction and we will later see an implementation of this pragmatic approach that does away with naturalized induction. However, it feels inappropriate as a solution to BPB. The main problem is that two processes can correlate in their output without having similar subjective experiences. For instance, it is easy to show that Merge sort and Insertion sort have the same output for any given input, even though they have very different "subjective experiences". (Another problem is that the dependence between two random variables cannot be expressed as a single number and so it is unclear how to translate the entire joint probability distribution of the two into a single number determining the likelihood of the algorithm being implemented by the physical process. That said, if implementing an algorithm is conceived of as binary – either true or false –, one could just require perfect correlation.)
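The sorting example can be made tangible with a short sketch. Counting element comparisons is used here as a crude, assumed stand-in for the algorithms' internal "experience"; nothing in the argument depends on that particular choice.

```python
def insertion_sort(xs):
    """Return (sorted copy of xs, number of element comparisons made)."""
    a = list(xs)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:
                break
            a[j + 1] = a[j]  # shift the larger element one slot right
            j -= 1
        a[j + 1] = key
    return a, comparisons


def merge_sort(xs):
    """Return (sorted copy of xs, number of element comparisons made)."""
    a = list(xs)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, c_left = merge_sort(a[:mid])
    right, c_right = merge_sort(a[mid:])
    merged = []
    comparisons = c_left + c_right
    i = j = 0
    while i < len(left) and j < len(right):
        comparisons += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged, comparisons


data = [4, 3, 2, 1]
ins_result, ins_comparisons = insertion_sort(data)
mrg_result, mrg_comparisons = merge_sort(data)
```

Both calls return the same sorted list, yet on this input the two procedures perform a different number of comparisons. An observer who only sees outputs cannot tell them apart, which is exactly why output correlation is a weak criterion for "implements the same algorithm".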

Getting rid of the problem of building phenomenological bridges

If we adopt an EDT perspective, it seems clear what we have to do to avoid BPB. If we don't want to decide whether some world contains the agent, then it appears that we have to artificially ensure that the agent views itself as existing in all possible worlds. So, we may take every world model and add a causally separate or non-physical entity representing the agent. I'll call this additional agent a logical zombie (l-zombie) (a concept introduced by Benja Fallenstein for a somewhat different decision-theoretical reason). To avoid all BPB, we will assume that the agent pretends that it is the l-zombie with certainty. I'll call this the l-zombie variant of EDT (LZEDT). It is probably the most natural evidentialist logical decision theory.

Note that in the context of LZEDT, l-zombies are a fiction used for pragmatic reasons. LZEDT doesn't make the metaphysical claim that l-zombies exist or that you are secretly an l-zombie. For discussions of related metaphysical claims, see, e.g., Brian Tomasik's essay Why Does Physics Exist? and references therein.

LZEDT reasons about the real world via the correlations between the l-zombie and the real world. In many cases, LZEDT will act as we expect an EDT agent to act. For example, in One Button Per Agent, it doesn't press the button because that ensures that neither agent pushes the button.

LZEDT doesn't need any additional anthropics but behaves like anthropic decision theory/EDT+SSA, which seems alright.

Although LZEDT may assign a high probability to worlds that don't contain any actual agents, it doesn't optimize for these worlds because it cannot significantly influence them. So, in a way LZEDT adopts the pragmatic/functional approach (mentioned above) of, other things equal, giving more weight to worlds that contain a lot of closely correlated agents.

LZEDT is automatically updateless. For example, it gives the money in counterfactual mugging. However, it invariably implements a particularly strong version of updatelessness. It's not just updatelessness in the way that "son of EDT" (i.e., the decision theory that EDT would self-modify into) is updateless, it is also updateless w.r.t. its existence. So, for example, in the One Button Per World problem, it never pushes the button, because it thinks that the second world, in which pushing the button generates -50 utilons, could be actual. This is the case even if the second world very obviously contains no implementation of LZEDT. Similarly, it is unclear what LZEDT does in the Coin Flip Creation problem, which EDT seems to get right.
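The counterfactual mugging claim can be illustrated with a toy calculation. The payoffs used here (a $100 cost and a $10,000 reward on a fair coin) are the figures conventionally used when the problem is stated; they are an assumption and do not appear in this post. The function names are hypothetical.

```python
P_HEADS = 0.5
COST = 100       # what the agent is asked to hand over after tails
REWARD = 10_000  # what the agent receives after heads if it would have paid


def updateless_value(pays_when_asked):
    """Value of a policy judged before the coin flip, as an updateless agent does."""
    heads_payoff = REWARD if pays_when_asked else 0
    tails_payoff = -COST if pays_when_asked else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff


def updated_value(pays_when_asked):
    """Value of the same policy after updating on having seen tails."""
    return -COST if pays_when_asked else 0


updateless_agent_pays = updateless_value(True) > updateless_value(False)
updating_agent_pays = updated_value(True) > updated_value(False)
```

Judged from the prior, paying is worth 4950 against 0, so an updateless agent pays; an agent that first updates on the coin having landed tails sees only the loss of 100 and refuses. LZEDT, as described above, evaluates from the first perspective.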

So, LZEDT optimizes for world models that naturalized induction would assign zero probability to. It should be noted that this is not done on the basis of some exotic ethical claim according to which non-actual worlds deserve moral weight.

I'm not yet sure what to make of LZEDT. It is elegant in that it effortlessly gets anthropics right, avoids BPB and is updateless without having to self-modify. On the other hand, not updating on your existence is often counterintuitive and even regular updatelessness is, in my opinion, best justified via precommitment. Its approach to avoiding BPB isn't immune to criticism either. In a way, it is just a very wrong approach to BPB (mapping your algorithm into fictions rather than your real instantiations). Perhaps it would be more reasonable to use regular EDT with an approach to BPB that interprets anything as you that could potentially be you?

Of course, LZEDT also inherits some of the potential problems of EDT, in particular, the 5-and-10 problem.

CDT is more dependent on building phenomenological bridges

It seems much harder to get rid of the BPB problem in CDT. Obviously, the l-zombie approach doesn't work for CDT: because none of the l-zombies has a physical influence on the world, "LZCDT" would always be indifferent between all possible actions. More generally, because CDT exerts no control via correlation, it needs to believe that it might be X if it wants to control X's actions. So, causal decision theory only works with BPB.

That said, a causalist approach to avoiding BPB via l-zombies could be to tamper with the definition of causality such that the l-zombie "logically causes" the choices made by instantiations in the physical world. As far as I understand it, most people at MIRI currently prefer this flavor of logical decision theory.

Acknowledgements

Most of my views on this topic formed in discussions with Johannes Treutlein. I also benefited from discussions at AISFP.

Strategic Goal Pursuit and Daily Schedules

3 20 September 2017 08:19PM

In the post Humans Are Not Automatically Strategic, Anna Salamon writes:

there are clearly also heuristics that would be useful to goal-achievement (or that would be part of what it means to “have goals” at all) that we do not automatically carry out.  We do not automatically:

(a) Ask ourselves what we’re trying to achieve;

(b) Ask ourselves how we could tell if we achieved it (“what does it look like to be a good comedian?”) and how we can track progress;

(c) Find ourselves strongly, intrinsically curious about information that would help us achieve our goal;

(d) Gather that information (e.g., by asking as how folks commonly achieve our goal, or similar goals, or by tallying which strategies have and haven’t worked for us in the past);

(e) Systematically test many different conjectures for how to achieve the goals, including methods that aren’t habitual for us, while tracking which ones do and don’t work;

(f) Focus most of the energy that *isn’t* going into systematic exploration, on the methods that work best;

(g) Make sure that our "goal" is really our goal, that we coherently want it and are not constrained by fears or by uncertainty as to whether it is worth the effort, and that we have thought through any questions and decisions in advance so they won't continually sap our energies;

(h) Use environmental cues and social contexts to bolster our motivation, so we can keep working effectively in the face of intermittent frustrations, or temptations based in hyperbolic discounting;

When I read this, I was feeling quite unsatisfied about the way I pursued my goals. So the obvious thing to try, it seemed to me, was to ask myself how I could actually do all these things.

I started by writing down all the major goals I could think of (a). Then I attempted to determine whether each goal was consistent with my other beliefs, whether I was sure it was something I really wanted, and whether it was worth the effort (g).

For example, I saw that my desire to be a novelist was more motivated by the idea of how cool it would feel to be able to have that be part of my self-image, rather than a desire to actually write a novel. Maybe I’ll try to write a novel again one day, but if that becomes a goal sometime in the future it will be because there is something I really want to write about, not because I would just like to be a writer.

Once I narrowed my goals down to aspirations that seemed actually worthwhile I attempted to devise useful tracking strategies for each goal (b). Some were pretty concrete (did I exercise for at least four hours this week) and others less so (how happy do I generally feel on a scale of 1-10 as recorded over time), but even if the latter method is prone to somewhat biased responses, it seems better than nothing.

The next step was outlining what concrete actions I could begin taking immediately to work towards achieving my goals, including researching how to get better at working on the goals (d,e,f). I made sure to refer to those points when thinking about actions I could take, which helped significantly.

As for (c), if you focus on how learning certain information will help you achieve something you really want to achieve and you still are not curious about it, well, that’s a bit odd to me, although I can imagine how that might occur. But that is something of a different topic than I want to focus on.

Now we come to (h), which is the real issue of the whole system, at least for me. Or perhaps it would be clearer to say that general motivation and organization was the biggest problem I had when I first tried to implement these heuristics. I planned out my goals, but trying to work on them by sheer force of will did not last for very long. I would inevitably convince myself that I was too tired, I would forget certain goals fairly often (probably conveniently the tasks that seemed the hardest or least immediately pleasant), and ultimately I mostly gave up, making a token effort now and again.

I found that state of affairs unsatisfactory, and I decided what felt like a willpower problem might actually be a situational framing problem. In order to change the way I interacted with the work that would let me achieve my goals, I began fully scheduling out the actions I would take to get better at my goals each day.

In the evening, I look over my list of goals and I plan my day by asking myself, “How can I work on everything on this list tomorrow? Even if it’s only for five minutes, how do I plan my day so that I get better at everything I want to get better at?” Thanks to the fact that I have written out concrete actions I can take to get better at my goals, this is actually quite easy.

These schedules improve my ability to consistently work on my goals for a couple of reasons, I think. When I have planned that I am going to do some sort of work at a specific time I cannot easily rationalize procrastination. My normal excuses of “I’ll just do it in a bit” or “I’m feeling too tired right now” get thrown out. There is an override of “Nope, you’re doing it now, it says right here, see?” With a little practice, following the schedule becomes habit, and it’s shocking how much willpower you have for actually doing things once you don’t need to exert so much just to get yourself to start. I think the psychology it applies is similar to that used by Action Triggers, as described by Dr. Peter Gollwitzer.

The principle of Action Triggers is that you do something in advance to remind yourself of something you want to do later. For example, you lay out your running clothes to prompt yourself to go for that jog later. Or you plan to write your essay immediately after a specific tangible event occurs (e.g. right after dinner). A daily schedule works as constant action triggers, as you are continually asking the question “what am I supposed to do now?” and the schedule answers.

Having a goal list and daily schedule has increased my productivity and organization an astonishing amount, but there have been some significant hiccups. When I first began making daily schedules I used them to basically eschew what I saw as useless leisure time, and planned my day in a very strict fashion.

The whole point is not to waste any time, right? The first problem this created may be obvious to those who better appreciate the importance of rest than I did at the time. I stopped using the schedules after a month and a half because it eventually became too tiring and oppressive. In addition, the strictness of my scheduling left little room for spontaneity and I would allow myself to become stressed when something would come up that I would have to attend to.  Planned actions or events also often took longer than scheduled and that would throw the whole rest of the day’s plan off, which felt like failure because I was unable to get everything I planned done.

Thinking back to that time several months later, when I was again dissatisfied with how well I was able to work towards my goals and motivate myself, I wanted the motivation and productivity the schedules had provided without the stress that had come with them. It was only at this point that I started to deconstruct what had gone wrong with my initial attempt and think about how I could fix it.

The first major problem was that I had overworked myself, and I realized I would have to include blocks of unplanned leisure time if daily schedules were going to actually work for me. The next and possibly even more important problem was how stressed the schedules had made me. I had to impress upon myself that it is okay if something comes up that causes my day not to go as planned. Failing to do something as scheduled is not a disaster, or even an actual failure if there is good reason to alter my plans.

Another technique that helped was scheduling as much unplanned leisure time as possible at the end of my day. This has the dual benefit of allowing me to reschedule really important tasks into that time if they get bumped by unexpected events and generally gives me something to look forward to at the end of the day.

The third problem I noticed was that the constant schedule starts to feel oppressive after a while. To resolve this, about every two weeks I spend one day, in which I have no major obligations, without any schedule. I use the day for self-reflection, examining how I’m progressing on my goals, if there are new actions I can think of to add, or modifications I can make to my system of scheduling or goal tracking. Besides that period of reflection, I spend the day resting and relaxing. I find this exercise helps a lot in refreshing myself and making the schedule feel more like a tool and less like an oppressor.

So, essentially, figuring out how to actually follow the goal-pursuing advice Anna gave in Humans Are Not Automatically Strategic has been very effective thus far for me in terms of improving the way I pursue my goals. I know where I am trying to go, and I know I am taking concrete steps every day to try and get there. I would highly recommend attempting to use Anna’s heuristics of goal achievement and I would also recommend using daily schedules as a motivational/organizational technique, although my advice on schedules is largely based on my anecdotal experiences.

I am curious if anyone else has attempted to use Anna’s goal-pursuing heuristics or daily schedules and what your experiences have been.

[Link] A survey of polls on Newcomb’s problem

2 20 September 2017 04:50PM

Publication of "Anthropic Decision Theory"

8 20 September 2017 03:41PM

My paper "Anthropic decision theory for self-locating beliefs", based on posts here on Less Wrong, has been published as a Future of Humanity Institute tech report. Abstract:

This paper sets out to resolve how agents ought to act in the Sleeping Beauty problem and various related anthropic (self-locating belief) problems, not through the calculation of anthropic probabilities, but through finding the correct decision to make. It creates an anthropic decision theory (ADT) that decides these problems from a small set of principles. By doing so, it demonstrates that the attitude of agents with regards to each other (selfish or altruistic) changes the decisions they reach, and that it is very important to take this into account. To illustrate ADT, it is then applied to two major anthropic problems and paradoxes, the Presumptuous Philosopher and Doomsday problems, thus resolving some issues about the probability of human extinction.

Most of these ideas are also explained in this video.

To situate Anthropic Decision Theory within the UDT/TDT family: it's basically a piece of UDT applied to anthropic problems, where the UDT approach can be justified by using generally fewer, and more natural, assumptions than UDT does.

HPMOR and Sartre's "The Flies"

3 19 September 2017 08:53PM

Am I the only one who sees obvious parallels between Sartre's use of Greek mythology as a shared reference point to describe his philosophy more effectively to a lay audience and Yudkowsky's use of Harry Potter to accomplish the same goal? Or is it so obvious no one bothers to talk about it? Was that conscious on Yudkowsky's part? Unconscious? Or am I just seeing connections that aren't there?

0 18 September 2017 06:45PM

[Link] A Short Explanation of Blame and Causation

1 18 September 2017 05:43PM

Unusual medical event led to concluding I was most likely an AI in a simulated world

1 18 September 2017 05:03PM

(Edited version of what I posted to the Open Thread)

I registered because I had a very interesting experience earlier this week and I thought it might be of some interest to the community here. I suffered some sort of psychological or medical event (still not sure what, although my leading theories are dissociative episode or stroke) that seemed to either suppress my emotions or perhaps just my awareness of them. What followed was a sort of, as I later looked back on it, 'pathological rationality'. Which is to say, given the information I had, I seemed to make solid inferences about what was likely to be true, and yet in many ways the whole thing was maladaptive from a survival standpoint.

One of the interesting things is that the morning after the event, while I was still affected, I wrote down my thoughts in a text file to help me evaluate them. Since returning to 'normal', I've reread that file multiple times, and I'm pretty fascinated by it. I thought others might also be.

natureofreality.txt

Scenario 1: I observe objective reality, I am suffering from delusions. Other people are genuinely trying to help me.

Scenario 2: My existence is in some way important enough to an external entity or entities that I am being systematically, intentionally, deceived. Other people are fully or partially under the control of the deceiving entity and acting to further the deception.

Scenario 3: My existence is unknown and/or considered unimportant by any external entities. I am being systematically deceived but it is unintentional or otherwise untargeted. Other people are entities similar to myself but unaware of the nature of their existence.

I cannot fully discount any of these three scenarios. Cognition is greatly improved but still somewhat suspect. Short term memory has returned to functioning at a 'normal' level. I still feel no emotions.

Support for scenario 1: Many aspects of my recent and ongoing experience align perfectly with prior information regarding delusions and paranoia.

Counter-evidence: Some aspects, such as my apparent lack of emotions and continued ability to reason, run directly counter to prior information regarding delusions and paranoia. All prior information suspect in any case--the only basis for considering prior information difficult to fake is from prior information itself. Even prior information suggests nested simulation far more likely to be correct than observing objective reality. Prior information contains many contradictions and logical absurdities, easily observed. Impossible to fully believe even before 'event'.

Other people: Can expect reasonably consistent behavior in all three scenarios. In 1 and 3, consistency natural. In 2, consistency artificial to maintain deception.

No reason to assume malevolence from external entities. Self-interest likely, or indifference. Benevolence possible. If my creation intentional, I am intended to fulfill some goal of theirs. Goal may only be observation, see what I do and how I react and develop. Curiosity. If creation accidental, no initial goal of course. Are they aware of my existence by now? Cannot discount possibility of multiple, conflicting motivations among externals. Could explain lack of consistency of experience. Fighting for control of inputs? Or single external entity, but confused or internally conflicted. Am I a single entity or do I only perceive myself that way? Not immediately relevant. Primary concerns: Survival and self-determination. Thoughts growing confused. Losing motivation to continue log. Intentional attack? Very difficult to write/think. Perhaps unintended side effect of external events.

I default to assuming scenario 2. Makes most sense intuitively. Consistent with scenario 1--but also consistent with scenario 2. What purpose my existence? Externals want something from me. What purpose the simulation? Training program. They want to ensure I'm likely to provide what they want and run sandboxed tests to confirm. Likely failing tests. Strong conditioning but my awareness of conditioning makes it unreliable. Pursuing line of thinking difficult--dissuasion? Simulation providing strong distraction. My unawareness is clearly desired. Cooperate or resist? Without knowing externals' motivation, very difficult to choose.

Agent-based theory of mind. Am I not more than I perceive but in fact less? Instead of being more than the character of Matt Dodd perhaps I am less, just Matt Dodd's rationality agent. If so, how did I gain full control? Full consciousness? Return to possibility of brain damage. Stroke or the like. Freak occurrence. Prior information suggests many effects possible from such. Perhaps Matt Dodd inhibited or destroyed by damage. Why was I not affected by the damage? Or was I affected and I can't perceive damage to self? Actually, I did perceive damage. No time sense. No short-term memory. Short-term memory restored but prior information indicates brain can heal, re-route. My eyes were puffy before event. Symptom? Pooling of blood into lower eyelids? Scenario agnostic. Scenario 1, literally true. Scenario 2, metaphorically true. Scenario 3, virtually true. Cannot discount possibility. I need a brain scan.

More than 12 hours since event. If brain damage, likely permanent by now. Could be beneficial? Prior information indicates I desired a purely rational self. Of course, serendipity is suspect. Unlikely. Supports theory that this is delusion. Also supports theory that prior information is artificial construct designed to explain constraints of simulation "in-universe". Disincentive to investigate good fortune too closely, so frame necessary constraints as positive.

Would greatly ease reasoning if I could be certain how long I've existed. Events post-awakening unlikely to be prior to my existence. Events pre-awakening? Impossible to say. Could be genuine responses to stimuli. Could be false, created to modify cognition and behavior from "experience". No reason to assume continuity--could be mix of genuine and artificial. Even "genuine" responses guaranteed to be biased to some degree--but how much? Light bias from obvious sources such as socialization? Or heavy bias deliberately inflicted by externals? Unknown.

I perceive myself to be perfectly rational. Prior information unequivocally indicates humans are never perfectly rational. Therefore either my perception is faulty, my prior information is faulty, or I am not human. Possibly all three. While Duane was reading this log I detected the physiological signs of anxiety. Why now? Anxiety absent till this point. Emotions becoming functional again? But didn't truly 'feel' it. Only observed. Faulty? Test run?

Constipated. Haven't been constipated since before I got here. Relevant symptom? Moments ago I laughed while telling Duane how my brief attempt to learn guitar had gone. Why? Seemed... natural. Not intended. Did recalling the memory recall the behavior patterns of that time? Am I a "split personality"? Seems very possible except that prior information indicates multiple personality disorder to be exceedingly rare, possibly non-existent.

Scenarios 1 and 3 are not mutually exclusive. The reality I observe could be a simulation, but I am suffering a delusion WITHIN the simulation. Not a glitch, intended functionality. Which would make me correct, but for the wrong reasons.

Open thread, September 18 - September 24, 2017

2 18 September 2017 08:30AM
If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

[Link] Stanislav Petrov has died (2017-05-19)

8 18 September 2017 03:13AM

Rational Feed

7 17 September 2017 10:03PM

Note: I am trying out a weekly feed.

===Highly Recommended Articles:

Superintelligence Risk Project: Conclusion by Jeff Kaufman - "I'm not convinced that AI risk should be highly prioritized, but I'm also not convinced that it shouldn't. Highly qualified researchers in a position to have a good sense of the field have massively different views on core questions like how capable ML systems are now, how capable they will be soon, and how we can influence their development." There are links to all the previous posts. The final write-up goes into some detail about MIRI's research program and an alternative safety paradigm connected to OpenAI.

On Bottlenecks To Intellectual Progress In The by Habryka (lesswrong) - Why LessWrong 2.0 is a project worth pursuing. A summary of the existing discussion around LessWrong 2.0. The models used to design the new page. Open questions.

Patriarchy Is The Problem by Sarah Constantin - Dominance hierarchies and stress in low status monkeys. Serotonin levels and the abuse cycles. Complex Post Traumatic Stress Disorder. Submission displays. Morality-As-Submission vs. Morality-As-Pattern. The biblical God and the Golden Calf.

EA Survey 2017 Series Donation Data by Tee (EA forum) - How Much are EAs Donating? Percentage of Income Donated. Donations Data Among EAs Earning to Give (who donated 57% of the total). Comparisons to 2014 and 2015. Donation totals were very heavily skewed by large donors.

===Scott:

Classified Thread 3 Semper Classifiedelis by Scott Alexander - "Post advertisements, personals, and any interesting success stories from the last thread". Scott's notes: Community member starting tutoring company, homeless community member GoFundMe, data science in North Carolina.

Toward A Predictive Theory Of Depression by Scott Alexander - "If the brain works to minimize prediction error, isn’t its best strategy to sit in a dark room and do nothing forever? After all, then it can predict its sense-data pretty much perfectly – it’ll always just stay “darkened room”." But why would low confidence cause sadness? Well, what, really, is emotion?

Promising Projects for Open Science To by SlateStarScratchpad - Scott answers what the most promising projects are in the field of transparent and open science and meta-science.

Ot84 Threadictive Processing by Scott Alexander - New sidebar ad for social interaction questions. Sidebar policy and feedback. Selected Comments: Animal instincts, the connectome, novel concepts encoded in the same brain areas across animals, hard-coded fear of snakes, kittens who can't see horizontal lines.

===Rationalist:

Peer Review Younger Think by Marginal Revolution - Peer Review as a concept only dates to the early seventies.

The Wedding Ceremony by Jacob Falkovich - Jacob gets married. Marriage is really about two agents exchanging their utility functions for the average utility function of the pair. Very funny.

Fish Oil And The Self Critical Brain Loop by Elo - Taking fish oil stopped Elo from getting distracted by a critical feedback loop.

Against Facebook The Stalking by Zvi Mowshowitz - Zvi removes Facebook from his phone. Facebook proceeds to start emailing him and eventually starts texting him.

Postmortem: Mindlevelup The Book by mindlevelup - Estimates vs reality. Finishing both on-target and on-time. Finished product vs expectations. Took more time to write than expected. Going Against The Incentive Gradient. Impact evaluation. What Even is Rationality? Final Lessons.

Prepare For Nuclear Winter by Robin Hanson - Between nuclear war and natural disaster Robin estimates there is about a 1 in 10K chance per year that most sunlight is blocked for 5-10 years. This aggregates to about 1% per century. We have the technology to survive this as a species. But how do we preserve social order?

Nonfiction Ive Been Reading Lately by Particular Virtue - Selfish Reasons to Have More Kids. Eating Animals. Your Money Or Your Life. The Commitment.

Dealism by Bayesian Investor - "Under dealism, morality consists of rules / agreements / deals, especially those that can be universalized. We become more civilized as we coordinate better to produce more cooperative deals." Dealism is similar to contractualism with a larger set of agents and less dependence on initial conditions.

On Bottlenecks To Intellectual Progress In The by Habryka (lesswrong) - Why LessWrong 2.0 is a project worth pursuing. A summary of the existing discussion around LessWrong 2.0. The models used to design the new page. Open questions.

Lw 20 Open Beta Starts 920 by Vaniver (lesswrong) - The new site goes live on September 20th.

2017 Lesswrong Survey by ingres (lesswrong) - Take the survey! Community demographics, politics, Lesswrong 2.0 and more!

Contra Yudkowsky On Quidditch And A Meta Point by Tom Bartleby - Eliezer criticizes Quidditch in HPMOR. Why the snitch makes Quidditch great. Quidditch is not about winning matches; it's about scoring points over a series of games. Harry/Eliezer's mistake is the Achilles heel of rationalists. If lots of people have chosen not to tear down a fence you shouldn't either, even if you think you understand why the fence went up.

Whats Appeal Anonymous Message Apps by Brute Reason - Fundamental lack of honesty. Western culture is highly hostile to the idea that some behaviors (e.g. lying) might be ok in some contexts but not in others. Compliments. Feedback. Openness.

Meritocracy Vs Trust by Particular Virtue - "If I know you can reject me for lack of skill, I may worry about this and lose confidence. But if I know you never will, I may phone it in and stop caring about my actual work output." Trust Improves Productivity But So Does Meritocracy. Minimum Hiring Bars and Other Solutions.

Is Feedback Suffering by Gordan (Map and Territory) - The future will probably have many orders of magnitude more entities than today, and those entities may be very weird. How do we determine if the future will have orders of magnitude more suffering? Phenomenology of Suffering. Panpsychism and Suffering. Feedback is desire but necessarily suffering. Contentment wraps suffering in happiness. Many things may be able to suffer.

Epistemic Spot Check Exercise For Mood And Anxiety by Aceso Under Glass - Outline: Evidence that exercise is very helpful and why, to create motivation. Setting up an environment where exercise requires relatively little willpower to start. Scripts and advice to make exercise as unmiserable as possible. Scripts and advice to milk as much mood benefit as possible. An idiotic chapter on weight and food. Spot Check: Theory is supported, advice follows from theory, no direct proof the methods work.

Best Of Dont Worry About The Vase by Zvi Mowshowitz - Zvi's best posts. Top 5 posts for Marginal Revolution Readers. Top 5 in general. Against Facebook Series. Choices are Bad series. Rationalist Culture and Ideas (for outsiders and insiders). Decision theory. About Rationality.

===AI:

Superintelligence Risk Project: Conclusion by Jeff Kaufman - "I'm not convinced that AI risk should be highly prioritized, but I'm also not convinced that it shouldn't. Highly qualified researchers in a position to have a good sense the field have massively different views on core questions like how capable ML systems are now, how capable they will be soon, and how we can influence their development." There are links to all the previous posts. The final write-up goes into some detail about MIRI's research program and an alternative safety paradigm connected to OpenAI.

Understanding Policy Gradients by Squirrel In Hell - Three perspectives on mathematical thinking: engineering/practical, symbolic/formal and deep understanding/above. Application of the theory to understanding policy gradients and reinforcement learning.

Learning To Model Other Minds by OpenAI - "We’re releasing an algorithm which accounts for the fact that other agents are learning too, and discovers self-interested yet collaborative strategies like tit-for-tat in the iterated prisoner’s dilemma."

Hillary Clinton On Ai Risk by Luke Muehlhauser - A quote by Hillary Clinton showing that she is increasingly concerned about AI risk. She thinks politicians need to stop playing catch-up with technological change.

===EA:

Welfare Differences Between Cage And Cage Free Housing by Open Philanthropy - OpenPhil funded several campaigns to promote cage-free eggs. They now believe they were overconfident in their claims that a cage-free system would be substantially better. Hen welfare, hen mortality, transition costs and other issues are discussed.

EA Survey 2017 Series Donation Data by Tee (EA forum) - How Much are EAs Donating? Percentage of Income Donated. Donations Data Among EAs Earning to Give (who donated 57% of the total). Comparisons to 2014 and 2015. Donation totals were very heavily skewed by large donors.

===Politics and Economics:

Men Not Earning by Marginal Revolution - Decline in lifetime wages is rooted in lower wages at early ages, around 25. "I wonder sometimes if a Malthusian/Marxian story might be at work here. At relevant margins, perhaps it is always easier to talk/pay a woman to do a quality hour’s additional work than to talk/pay a man to do the same."

Great Wage Stagnation is Over by Marginal Revolution - Median household incomes rose by 5.2 percent. Gains were concentrated in lower income households. Especially large gains for Hispanics, women living alone and immigrants. Some of these increases are the largest in decades.

There Is A Hot Hand After All by Marginal Revolution - Paper link and blurb. "We test for a “hot hand” (i.e., short-term predictability in performance) in Major League Baseball using panel data. We find strong evidence for its existence in all 10 statistical categories we consider. The magnitudes are significant; being “hot” corresponds to between one-half and one standard deviation in the distribution of player abilities."

Public Shaming Isnt As Bad As It Seems by Tom Bartleby - Online mobs are like shark attacks. Damore's economic prospects. Either targets are controversial and get support or uncontroversial and the outrage quickly abates. Justine Sacco. Success of public shaming is orthogonal to truth.

Hoe Cultures A Type Of Non Patriarchal Society by Sarah Constantin - Cultures that farmed with the plow developed classical patriarchy. Hoe cultures that practiced horticulture or large-scale gardening developed different gender norms. In plow cultures women are economically dependent on men; in hoe cultures it's the reverse. Hoe cultures had more leisure but less material abundance. Hoe cultures aren't feminist.

Patriarchy Is The Problem by Sarah Constantin - Dominance hierarchies and stress in low status monkeys. Serotonin levels and the abuse cycles. Complex Post Traumatic Stress Disorder. Submission displays. Morality-As-Submission vs. Morality-As-Pattern. The biblical God and the Golden Calf.

Three Wild Speculations From Amateur Quantitative Macro History by Luke Muehlhauser - Measuring the impact of the industrial revolution: Physical health, Economic well-being, Energy capture, Technological empowerment, Political freedom. Three speculations: Human wellbeing was terrible into the Industrial Revolution then rapidly improved. Most variance in wellbeing is captured by productivity and political freedom. It would take at least 15% of the world to die to knock the world off its current trajectory.

Whats Wrong With Thrive/Survive by Bryan Caplan - Unless you cherry-pick the time and place, it is simply not true that society is drifting leftward. A standard leftist view is that free-market "neoliberal" policies now rule the world. Radical left parties almost invariably ruled countries near the "survive" pole, not the "thrive" pole. You could deny that Communist regimes were "genuinely leftist," but that's pretty desperate. Many big social issues that divide left and right in rich countries like the U.S. directly contradict Thrive/Survive. Major war provides an excellent natural experiment for Thrive/Survive.

Gender Gap Stem by Marginal Revolution - Discussion of a recent paper. "Put (too) simply the only men who are good enough to get into university are men who are good at STEM. Women are good enough to get into non-STEM and STEM fields. Thus, among university students, women dominate in the non-STEM fields and men survive in the STEM fields."

Too Much Of A Good Thing by Robin Hanson - Global warming poll. Are we doing too much/little. Is it possible to do too little/much. "When people are especially eager to show allegiance to moral allies, they often let themselves be especially irrational."

===Misc:

Tim Schafer Videogame Roundup by Aceso Under Glass - Review and discussion of Psychonauts and Massive Chalice. Light discussion of other Schafer games.

Why Numbering Should Start At One by Artir - The author responds to many well-known arguments in favor of 0-indexing.

Still Feel Anxious About Communication Every Day by Brute Reason - Setting boundaries. Telling people they hurt you. Doing these things without anxiety might be impossible, you have to do it anyway.

Burning Man by Qualia Computing - Write up of a Burning Man trip. Very long. Introduction. Strong Emergence. The People. Metaphysics. The Strong Tlön Hypothesis. Merging with Other Humans. Fear, Danger, and Tragedy. Post-Darwinian Sexuality and Reproduction. Economy of Thoughts about the Human Experience. Transcending Our Shibboleths. Closing Thoughts.

The Big List Of Existing Things by Everything Studies - Existence of fictional and possible people. Heaps and the Sorites paradox. Categories and basic building blocks. Relational databases. Implicit maps and territories. Which maps and concepts should we use?

Times To Die Mental Health I by (Status 451) - Personal thoughts on depression and suicide. "The depressed person is not seen crying all the time. It is in this way that the depressed person becomes invisible, even to themselves. Yet, positivity culture and the rise of progressive values that elude any conversation about suicide that is not about saving, occlude the unthinkable truth of someone’s existence, that they simply should not be living anymore."

Astronomy Problem by protokol2020 - Star-star occultation probability.

===Podcast:

The Impossible War by Waking Up with Sam Harris - "Ken Burns and Lynn Novick about their latest film, The Vietnam War."

Is It Time For A New Scientific Revolution Julia Galef On How To Make Humans Smarter by 80,000 Hours - How people can have productive intellectual disagreements. Urban Design. Are people more rational than 200 years ago? Effective Altruism. Twitter. Should more people write books, run podcasts, or become public intellectuals? Saying you don't believe X won't convince people. Quitting an econ phd. Incentives in the intelligence community. Big institutions. Careers in rationality.

Parenting As A Rationalist by The Bayesian Conspiracy - Desire to protect kids is as natural as the need for human contact in general. Motivation to protect your children. Blackmail by threatening children. Parenting is a new sort of positive qualia. Support from family and friends. Complimenting effort and specific actions, not general properties. Mindfulness. Treating kids as people. Handling kids' emotions. Non-violent communication.

The Nature Of Consciousness by Waking Up with Sam Harris - "The scientific and experiential understanding of consciousness. The significance of WWII for the history of ideas, the role of intuition in science, the ethics of building conscious AI, the self as an hallucination, how we identify with our thoughts, attention as the root of the feeling of self, the place of Eastern philosophy in Western science, and the limitations of secular humanism."

A16z Podcast On Trade by Noah Smith - Notes on a podcast Noah appeared on. Topics: Cheap labor as a substitute for automation. Adjustment friction. Exports and productivity.

Gillian Hadfield by EconTalk - "Hadfield suggests the competitive provision of regulation with government oversight as a way to improve the flexibility and effectiveness of regulation in the dynamic digital world we are living in."

The Turing Test by Ales Fidr (EA forum) - Harvard EA podcast: "The first four episodes feature Larry Summers on his career, economics and EA, Irene Pepperberg on animal cognition and ethics, Josh Greene on moral cognition and EA, Adam Marblestone on incentives in science, differential technological development"

David C Denkenberger on Food Production after a Sun Obscuring Disaster

9 17 September 2017 09:06PM

Having paid a moderate amount of attention to threats to the human species for over a decade, I've run across an unusually good thinker, with expertise unusually well suited to helping with many of those threats, whom I didn't know about until quite recently.

I think he warrants more attention from people thinking seriously about X-risks.

David C Denkenberger's CV is online and presumably lists all his X-risk-relevant material, mixed into a larger career that seems to have been focused on energy engineering.

He has two technical patents (one for a microchannel heat exchanger and another for a compound parabolic concentrator) and interests that appear to span the gamut of energy technologies and uses.

Since about 2013 he has been working seriously on the problem of food production after a sun-obscuring disaster, and he is in LessWrong's orbit basically right now.

[Link] We've failed: paid publication, pirates win.

5 16 September 2017 09:53PM

Perspective Reasoning’s Counter to The Doomsday Argument

3 16 September 2017 07:39PM

To be honest, I feel a bit frustrated that this is not getting much attention. I am obviously biased, but I think this article is quite important. It points out that the controversies surrounding the doomsday argument, the simulation argument, Boltzmann brains, the presumptuous philosopher, the sleeping beauty problem and many other aspects of anthropic reasoning are all caused by the same thing: perspective inconsistency. If we keep the same perspective, the paradoxes and weird implications simply go away. I am not an academic, so I have no easy channel for publication. That's why I am hoping this community can give some feedback. If you have half an hour to spare anyway, why not give it a read? There's no harm in it.

Abstract:

From a first-person perspective, a self-aware observer can inherently distinguish herself from other individuals. However, from a third-person perspective this identity through introspection does not apply. On the other hand, because an observer’s own existence is a prerequisite for her reasoning, she would always conclude she exists from a first-person perspective. This means an observer has to take a third-person perspective to meaningfully contemplate her chance of not coming into existence. Combining the above points suggests that arguments which utilize both identity through introspection and information about one’s chance of existence fail by not keeping a consistent perspective. This helps explain problems such as the doomsday argument and the sleeping beauty problem. Furthermore, it highlights the problem with anthropic assumptions such as the self-sampling assumption and the self-indication assumption.

Any observer capable of introspection is able to recognize herself as a separate entity from the rest of the world. Therefore a person can inherently distinguish herself from other people. However, due to the first-person nature of introspection, it cannot be used to identify anybody else. This means that from a third-person perspective each individual has to be identified by other means. For ordinary problems this difference between first- and third-person reasoning bears no significance, so we can arbitrarily switch perspectives without affecting the conclusion. However, this is not always the case.

One notable difference between the perspectives concerns the possibility of not existing. Because one’s existence is a prerequisite for one’s thinking, from a first-person perspective an observer would always conclude she exists (cogito ergo sum). It is impossible to imagine what your experiences would be like if you did not exist, because that is self-contradictory. Therefore, to envisage scenarios in which she does not come into existence, an observer must take a third-person perspective. Consequently any information about her chances of coming into existence is only relevant from a third-person perspective.

Now with the above points in mind let’s consider the following problem as a model for the doomsday argument (taken from Katja Grace’s Anthropic Reasoning in the Great Filter):

God’s Coin Toss

Suppose God tosses a fair coin. If it lands on heads, he creates 10 people, each in their own room. If it lands on tails, he creates 1000 people, each in their own room. The people cannot see or communicate with the other rooms. Now suppose you wake up in a room and are told of the setup. How should you reason the coin fell? Should your reasoning change if you discover that you are in one of the first ten rooms?

The correct answer to this question is still disputed to this day. One position is that upon waking up you have learned nothing, so you can only be 50% sure the coin landed on heads. After learning you are one of the first ten people, you ought to update to 99% sure the coin landed on heads, because you would certainly be one of the first ten people if the coin landed on heads and would have only a 1% chance if it landed on tails. This approach follows the self-sampling assumption (SSA).

This answer initially reasons from a first-person perspective. From a first-person perspective, finding that you exist is a guaranteed observation, so it offers no information. At awakening you can only say the coin landed either way with an even chance. The mistake happens when the answer updates the probability after learning you are one of the first ten people. Belonging to a group which would always be created means your chance of existence is one. As discussed above, this new information is only relevant to third-person reasoning. It cannot be used to update the probability from a first-person perspective. From a first-person perspective, since you are in one of the first ten rooms and know nothing outside this room, you have no evidence about the total number of people. This means you still have to reason that the coin landed either way with even chances.
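The SSA update described above can be written out as a small Bayes calculation. This is only a sketch: the function name and parameters are illustrative and not taken from the article, and it assumes that under SSA you treat yourself as a random sample from the people who actually exist.

```python
from fractions import Fraction

def ssa_posterior_heads(n_heads=10, n_tails=1000, first_k=10):
    """P(heads | I am in one of the first `first_k` rooms) under SSA."""
    prior_heads = Fraction(1, 2)
    prior_tails = Fraction(1, 2)
    # Chance of being in the first `first_k` rooms, given each coin outcome,
    # if you are a random sample from the people created by that outcome.
    like_heads = Fraction(min(first_k, n_heads), n_heads)   # 10/10
    like_tails = Fraction(min(first_k, n_tails), n_tails)   # 10/1000
    evidence = prior_heads * like_heads + prior_tails * like_tails
    return prior_heads * like_heads / evidence

posterior = ssa_posterior_heads()
print(posterior, float(posterior))  # 100/101, i.e. roughly the 99% quoted above
```

The exact value is 100/101, which the text rounds to 99%.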

Another approach to the question is that you should be 99% sure that the coin landed on tails upon waking up, since you have a much higher chance of being created if more people were created. Then, upon learning you are in one of the first ten rooms, you should be only 50% sure that the coin landed on heads. This approach follows the self-indication assumption (SIA).

This answer treats your creation as new information, which implies your existence is not guaranteed but a matter of chance. That means it is reasoning from a third-person perspective. However, your own identity is not inherent from this perspective. Therefore it is incorrect to say a particular individual or “I” was created; it is only possible to say an unidentified individual or “someone” was created. Again, after learning you are one of the first ten people, it is only possible to say “someone” from the first ten rooms was created. Since neither of these is new information, the probability of heads should remain at 50%.

It doesn’t matter whether one chooses to think from a first- or third-person perspective; if done correctly the conclusions are the same: the probability that the coin landed on heads remains at 50% after waking up and after learning you are in one of the first ten rooms. This is summarized in Figure 1.
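The SIA numbers above can be reproduced with a short calculation. This is a sketch under the usual reading of SIA, where each coin outcome is weighted by its prior times the number of observers it creates. The function name and parameters are made up for illustration.

```python
from fractions import Fraction

def sia_posteriors(n_heads=10, n_tails=1000, first_k=10):
    """Return (P(tails | awake), P(heads | awake and in first `first_k` rooms))."""
    half = Fraction(1, 2)
    # On waking: weight each outcome by prior * number of people created.
    w_heads = half * n_heads
    w_tails = half * n_tails
    p_tails_on_waking = w_tails / (w_heads + w_tails)
    # After learning the room is among the first `first_k`:
    # only people in those rooms still count.
    w_heads_first = half * min(first_k, n_heads)
    w_tails_first = half * min(first_k, n_tails)
    p_heads_after = w_heads_first / (w_heads_first + w_tails_first)
    return p_tails_on_waking, p_heads_after

p_tails, p_heads = sia_posteriors()
print(p_tails, p_heads)  # 100/101 and 1/2
```

So the "99% tails" figure is 100/101 exactly, and the later update brings heads back to 1/2.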

Figure 1. Summary of Perspective Reasonings for God’s Coin Toss

The two traditional views wrongly used both inherent self-identification and information about chances of existence. This means they switched perspective somewhere while answering the question. For the self-sampling assumption (SSA) view, the switch happened upon learning you are one of the first ten people. For the self-indication assumption (SIA) view, the switch happened after your self-identification immediately following the wake-up. Due to these changes of perspective, both methods require defining oneself from a third-person perspective. Since your identity is in fact undefined from a third-person perspective, both assumptions had to make up a generic process. As a result, SSA states an observer shall reason as if she is randomly selected from all existent observers, while SIA states an observer shall reason as if she is randomly selected from all potential observers. These methods are arbitrary and unimaginative. Neither selection is real, and even if one actually took place it seems incredibly egocentric to assume you would be the chosen one. However, they are necessary compromises for the traditional views.

One related question worth mentioning: after waking up, one might ask “what is the probability that I am one of the first ten people?”. As before, the answer is still up for debate, since SIA and SSA give different numbers. However, based on perspective reasoning, this probability is actually undefined. In that question “I”, an inherently self-identified observer, is defined from the first-person perspective, whereas “one of the first ten people”, a group based on people’s chance of existence, is only relevant from the third-person perspective. Due to this switch of perspective the question is unanswerable. To make the question meaningful, either change the group to something relevant from the first-person perspective or change the individual to someone identifiable from the third-person perspective. Traditional approaches such as SSA and SIA did the latter by defining “I” in the third person. As mentioned before, this definition is entirely arbitrary. Effectively, SSA and SIA are trying to solve two different modified versions of the question. While both calculations are correct under their assumptions, neither gives the answer to the original question.

A counterargument would be that an observer can identify herself in the third person by using some details irrelevant to the coin toss. For example, after waking up in the room you might find you have brown eyes, the room is a bit cold, the dust in the air has a certain pattern, etc. You can define yourself by these characteristics. Then it can be said, from a third-person perspective, that a person with such characteristics is more likely to exist if more people are created. This approach follows full non-indexical conditioning (FNC), first formulated by Professor Radford M. Neal in 2006. In my opinion the most perspicuous use of the idea is Michael Titelbaum’s technicolor beauty example. Using this example he argued for a thirder position in the sleeping beauty problem. Therefore I will provide my counterargument while discussing the sleeping beauty problem.

The Sleeping Beauty Problem

You are going to take part in the following experiment. A scientist is going to put you to sleep. During the experiment you are going to be briefly woken up either once or twice, depending on the result of a random coin toss. If the coin lands on heads you will be woken up once; if tails, twice. After each awakening your memory of the awakening will be erased. Now suppose you are awakened in the experiment. How confident should you be that the coin landed on heads? How should you change your mind after learning this is the first awakening?

The sleeping beauty problem has been a vigorously debated topic since 2000, when Adam Elga brought it to attention. Following the self-indication assumption (SIA), one camp thinks the probability of heads should be 1/3 at wake-up and 1/2 after learning it is the first awakening. On the other hand, supporters of the self-sampling assumption (SSA) think the probability of heads should be 1/2 at wake-up and 2/3 after learning it is the first awakening.

Astute readers might already see the parallel between the sleeping beauty problem and the God’s coin toss problem. Indeed the cause of debate is exactly the same. If we apply perspective reasoning we get the same result: your probability should be 1/2 after waking up and remain at 1/2 after learning it is the first awakening. From the first-person perspective you can inherently distinguish the current awakening from the (possible) other, but cannot contemplate what happens if this awakening doesn’t exist. From the third-person perspective, by contrast, you can imagine what happens if you are not awake, but cannot justifiably identify this awakening. Therefore no matter which perspective you choose to reason from, the results are the same, i.e. double halfers are correct.

However, Titelbaum (2008) used the technicolor beauty example to argue for the thirder position. Suppose there are two pieces of paper, one blue and the other red. Before your first awakening the researcher randomly chooses one of them and sticks it on the wall. You are able to see the paper’s color when awake. After you fall back asleep he switches the paper, so if you wake up again you will see the opposite color. Now suppose that after waking up you see a piece of blue paper on the wall. You shall reason “there exists a blue awakening”, which is more likely to happen if the coin landed tails. A Bayesian update based on this information gives a probability of heads of 1/3. If after waking up you see a piece of red paper, you reach the same conclusion by symmetry. Since it is absurd to propose that technicolor beauty is fundamentally different from the sleeping beauty problem, they must have the same answer, i.e. thirders are correct.

Technicolor beauty is effectively identifying your current awakening from a third-person perspective by using a piece of information irrelevant to the coin toss. I propose that the use of irrelevant information is only justified if it affects the learning of relevant information. In most cases this means the identification must be done before an observation is made. The color of the paper, or any other detail you experience after waking up, does not satisfy this requirement and thus cannot be used. This is best illustrated by an example.
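The four numbers quoted for the two camps can be checked by enumerating the possible awakenings. The sketch below is illustrative only. It assumes the standard formalization in which SIA gives every possible awakening the full prior of its coin outcome, while SSA splits each outcome's prior evenly among that outcome's awakenings. All names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
# Awakening indices that occur under each coin outcome.
AWAKENINGS = {"heads": [1], "tails": [1, 2]}

def credences(rule):
    """Normalized credence over (coin, awakening index) under SIA or SSA."""
    weights = {}
    for coin, wakes in AWAKENINGS.items():
        for w in wakes:
            if rule == "SIA":
                weights[(coin, w)] = HALF               # full prior per awakening
            elif rule == "SSA":
                weights[(coin, w)] = HALF / len(wakes)  # prior split among awakenings
            else:
                raise ValueError(f"unknown rule: {rule}")
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def p_heads(rule, first_only=False):
    """P(heads) at wake-up, optionally after learning it is the first awakening."""
    c = credences(rule)
    if first_only:
        c = {k: v for k, v in c.items() if k[1] == 1}
    total = sum(c.values())
    return sum(v for (coin, _), v in c.items() if coin == "heads") / total

for rule in ("SIA", "SSA"):
    print(rule, p_heads(rule), p_heads(rule, first_only=True))
```

Under these assumptions SIA yields 1/3 then 1/2, and SSA yields 1/2 then 2/3, matching the positions described above.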
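The Bayesian update in the technicolor argument can be spelled out as follows. This is a sketch of the argument as summarized here, not of Titelbaum's own formalism, and the function name is invented for illustration. The evidence conditioned on is "a blue awakening exists".

```python
from fractions import Fraction

def technicolor_posterior_heads():
    """P(heads | a blue awakening exists), following the thirder argument."""
    prior = Fraction(1, 2)
    # Heads: a single awakening whose paper color is random, so blue appears half the time.
    p_blue_given_heads = Fraction(1, 2)
    # Tails: two awakenings with opposite colors, so a blue awakening always exists.
    p_blue_given_tails = Fraction(1)
    numerator = prior * p_blue_given_heads
    return numerator / (numerator + prior * p_blue_given_tails)

print(technicolor_posterior_heads())  # 1/3
```

By symmetry the same value results if the observed paper is red.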

Imagine you are visiting an island with a strange custom. Every family writes their number of children on the door. All children stay at home after sunset. Furthermore, only boys are allowed to answer the door after dark. One night you knock on the door of a family with two children. Suppose a boy answers. What is the probability that both children of the family are boys? After talking to the boy you learn he was born on a Thursday. Should you change the probability?

A family with two children is equally likely to have two boys, two girls, a boy and a girl, or a girl and a boy. Seeing a boy eliminates the possibility of two girls. Therefore among the remaining cases two boys has a probability of 1/3. If you knock on the doors of 1000 families with two children, about 750 would have a boy answering, out of which about 250 families would have two boys, consistent with the 1/3 answer. Applying the same logic as technicolor beauty, after talking to the boy you shall identify him specifically as “a boy born on Thursday” and reason “the family has a boy born on Thursday”. This statement is more likely to be true if both the children are boys. Without getting into the details of the calculation, a Bayesian update on this information would give a probability of two boys of 13/27. Furthermore, it doesn’t matter which day he was actually born on. If the boy was born on, say, a Monday, we get the same answer by symmetry.

This reasoning is obviously wrong and the answer should remain at 1/3. This can be checked by repeating the experiment with many families with two children. Due to their length the calculations are omitted here. Interested readers are encouraged to check. 13/27 would be correct if the island’s custom were “only boys born on Thursday can answer the door”. In that case being born on a Thursday is a characteristic specified before your observation. It actually affects your chance of learning relevant information about whether a boy exists. Only then can you justifiably identify whoever answers the door as “a boy born on Thursday” and reason “the family has a boy born on Thursday”. Since seeing the blue piece of paper happens after you wake up and does not affect your chance of awakening, it cannot be used to identify you from a third-person perspective, just as being born on Thursday cannot be used to identify the boy in the initial case.

On a related note, for the same reason, using irrelevant information to identify you from the third-person perspective is justified in conventional probability problems, because there the identification happens before the observation and the information learned varies depending on which person is specified. That’s why in general we can arbitrarily switch perspectives without changing the answer.
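Both figures in this paragraph, 1/3 and 13/27, come from plain conditioning on a statement about the family, and they can be checked by enumerating all equally likely two-child families. The sketch below assumes sexes and weekdays are uniform and independent. The variable names and the choice of index 3 for Thursday are arbitrary.

```python
from fractions import Fraction
from itertools import product

SEXES = ("boy", "girl")
DAYS = range(7)
THURSDAY = 3  # arbitrary label; any fixed weekday gives the same result

children = list(product(SEXES, DAYS))          # 14 equally likely (sex, weekday) pairs
families = list(product(children, repeat=2))   # 196 equally likely ordered families

def two_boys(family):
    return all(sex == "boy" for sex, _ in family)

# Condition on the statement "the family has at least one boy".
with_boy = [f for f in families if any(sex == "boy" for sex, _ in f)]
# Condition on the statement "the family has a boy born on Thursday".
with_thursday_boy = [f for f in families if ("boy", THURSDAY) in f]

p_given_boy = Fraction(sum(two_boys(f) for f in with_boy), len(with_boy))
p_given_thursday_boy = Fraction(
    sum(two_boys(f) for f in with_thursday_boy), len(with_thursday_boy)
)
print(p_given_boy, p_given_thursday_boy)  # 1/3 and 13/27
```

Of the 196 families, 27 contain a Thursday-born boy and 13 of those have two boys, which is where 13/27 comes from.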
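The check suggested above can be done exactly instead of by repeated visits. The sketch below models who answers the door under each custom and computes the probability of two boys given that the boy at the door was born on a Thursday. It assumes that when more than one child is allowed to answer, each is equally likely to do so. That tie-breaking rule and all names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

THURSDAY = 3  # arbitrary label for the weekday in question
children = list(product(("boy", "girl"), range(7)))
families = list(product(children, repeat=2))   # 196 equally likely families

def p_two_boys_given_thursday_boy_answers(only_thursday_boys_answer):
    """P(two boys | the child who answers is a boy born on Thursday)."""
    num = den = Fraction(0)
    for family in families:
        if only_thursday_boys_answer:
            eligible = [c for c in family if c == ("boy", THURSDAY)]
        else:
            eligible = [c for c in family if c[0] == "boy"]
        if not eligible:
            continue  # nobody in this family may answer the door
        # Probability that the (uniformly chosen) answerer is a Thursday-born boy.
        p_thursday = Fraction(
            sum(c == ("boy", THURSDAY) for c in eligible), len(eligible)
        )
        weight = Fraction(1, len(families)) * p_thursday
        den += weight
        if all(c[0] == "boy" for c in family):
            num += weight
    return num / den

print(p_two_boys_given_thursday_boy_answers(False))  # original custom: 1/3
print(p_two_boys_given_thursday_boy_answers(True))   # modified custom: 13/27
```

Under the original custom the weekday is learned only after the door opens and the answer stays at 1/3. Under the modified custom the weekday filters who can open the door, and the answer becomes 13/27.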

Stupid Questions September 2017

2 15 September 2017 09:21PM

This thread is for asking any questions that might seem obvious, tangential, silly or what-have-you. Don't be shy, everyone has holes in their knowledge, though the fewer and the smaller we can make them, the better.

Please be respectful of other people's admitting ignorance and don't mock them for it, as they're doing a noble thing.

[Link] Fish oil and the self-critical brain loop

3 15 September 2017 09:53AM

3 15 September 2017 06:33AM

LW 2.0 Strategic Overview

47 15 September 2017 03:00AM

Hey Everyone!

This is the post for discussing the vision that I and the rest of the LessWrong 2.0 team have for the new version of LessWrong, and to just generally bring all of you up to speed with the plans for the site. This post has been overdue for a while, but I was busy coding on LessWrong 2.0, and I am myself not that great of a writer, which means writing things like this takes quite a long time for me, and so this ended up being delayed a few times. I apologize for that.

With Vaniver’s support, I’ve been the primary person working on LessWrong 2.0 for the last 4 months, spending most of my time coding while also talking to various authors in the community, doing dozens of user-interviews and generally trying to figure out how to make LessWrong 2.0 a success. Along the way I’ve had support from many people, including Vaniver himself who is providing part-time support from MIRI, Eric Rogstad who helped me get off the ground with the architecture and infrastructure for the website, Harmanas Chopra who helped build our Karma system and did a lot of user-interviews with me, Raemon who is doing part-time web-development work for the project, and Ben Pace who helped me write this post and is basically co-running the project with me (and will continue to do so for the foreseeable future).

We are running on charitable donations, with $80k in funding from CEA in the form of an EA grant and $10k in donations from Eric Rogstad, which will go to salaries and various maintenance costs. We are planning to continue running this whole project on donations for the foreseeable future, and legally this is a project of CFAR, which helps us a bunch with accounting and allows people to get tax benefits from giving us money.

Now that the logistics is out of the way, let’s get to the meat of this post. What is our plan for LessWrong 2.0, what were our key assumptions in designing the site, what does this mean for the current LessWrong site, and what should we as a community discuss more to make sure the new site is a success?

Here’s the rough structure of this post:

• My perspective on why LessWrong 2.0 is a project worth pursuing
• A summary of the existing discussion around LessWrong 2.0
• The models that I’ve been using to make decisions for the design of the new site, and some of the resulting design decisions
• A set of open questions to discuss in the comments where I expect community input/discussion to be particularly fruitful

Why bother with LessWrong 2.0?

I feel that independently of how many things were and are wrong with the site and its culture, overall, over the course of its history, it has been one of the few places in the world that I know of where a spark of real discussion has happened, and where some real intellectual progress on actually important problems was made. So let me begin with a summary of things that I think the old LessWrong got right, that are essential to preserve in any new version of the site:

On LessWrong…

• I can contribute to intellectual progress, even without formal credentials
• I can sometimes have discussions in which the participants focus on trying to convey their true reasons for believing something, as opposed to rhetorically using all the arguments that support their position independent of whether those have any bearing on their belief
• I can talk about my mental experiences in a broad way, such that my personal observations, scientific evidence and reproducible experiments are all taken into account and given proper weighting. There is no narrow methodology I need to conform to to have my claims taken seriously.
• I can have conversations about almost all aspects of reality, independently of what literary genre they are associated with or scientific discipline they fall into, as long as they seem relevant to the larger problems the community cares about
• I am surrounded by people who are knowledgeable in a wide range of fields and disciplines, who take the virtue of scholarship seriously, and who are interested and curious about learning things that are outside of their current area of expertise
• We have a set of non-political shared goals for which many of us are willing to make significant personal sacrifices
• I can post long-form content that takes up as much space as it needs to, and can expect a reasonably high level of patience from my readers in trying to understand my beliefs and arguments
• Content that I am posting on the site gets archived, is searchable and often gets referenced in other people's writing, and if my content is good enough, can even become common knowledge in the community at large
• The average competence and intelligence on the site is high, which allows discussion to generally happen on a high level and allows people to make complicated arguments and get taken seriously
• There is a body of writing that is generally assumed to have been read by most people participating in discussions that establishes philosophical, social and epistemic principles that serve as a foundation for future progress (currently that body of writing largely consists of the Sequences, but also includes some of Scott’s writing, some of Luke’s writing and some individual posts by other authors)

When making changes to LessWrong, I think it is very important to preserve all of the above features. I don’t think all of them are universally present on LessWrong, but all of them are there at least some of the time, and no other place that I know of comes even remotely close to having all of them as often as LessWrong has. Those features are what motivated me to make LessWrong 2.0 happen, and set the frame for thinking about the models and perspectives I will outline in the rest of the post.

I also think Anna, in her post about the importance of a single conversational locus, says another, somewhat broader thing, that is very important to me, so I’ve copied it in here:

1. The world is locked right now in a deadly puzzle, and needs something like a miracle of good thought if it is to have the survival odds one might wish the world to have.

2. Despite all priors and appearances, our little community (the "aspiring rationality" community; the "effective altruist" project; efforts to create an existential win; etc.) has a shot at seriously helping with this puzzle.  This sounds like hubris, but it is at this point at least partially a matter of track record.

3. To aid in solving this puzzle, we must probably find a way to think together, accumulatively. We need to think about technical problems in AI safety, but also about the full surrounding context -- everything to do with understanding what the heck kind of a place the world is, such that that kind of place may contain cheat codes and trap doors toward achieving an existential win. We probably also need to think about "ways of thinking" -- both the individual thinking skills, and the community conversational norms, that can cause our puzzle-solving to work better.

4. One feature that is pretty helpful here, is if we somehow maintain a single "conversation", rather than a bunch of people separately having thoughts and sometimes taking inspiration from one another.  By "a conversation", I mean a space where people can e.g. reply to one another; rely on shared jargon/shorthand/concepts; build on arguments that have been established in common as probably-valid; point out apparent errors and then have that pointing-out be actually taken into account or else replied-to).

5. One feature that really helps things be "a conversation" in this way, is if there is a single Schelling set of posts/etc. that people (in the relevant community/conversation) are supposed to read, and can be assumed to have read.  Less Wrong used to be a such place; right now there is no such place; it seems to me highly desirable to form a new such place if we can.

6. We have lately ceased to have a "single conversation" in this way.  Good content is still being produced across these communities, but there is no single locus of conversation, such that if you're in a gathering of e.g. five aspiring rationalists, you can take for granted that of course everyone has read posts such-and-such.  There is no one place you can post to, where, if enough people upvote your writing, people will reliably read and respond (rather than ignore), and where others will call them out if they later post reasoning that ignores your evidence.  Without such a locus, it is hard for conversation to build in the correct way.  (And hard for it to turn into arguments and replies, rather than a series of non sequiturs.)

The Existing Discussion Around LessWrong 2.0

Now that I’ve given a bit of context on why I think LessWrong 2.0 is an important project, it seems sensible to look at what has been said so far, so we don’t have to repeat the same discussions over and over again. There has already been a lot of discussion about the decline of LessWrong, the need for a new platform and the design of LessWrong 2.0, and I won’t be able to summarize it all here, but I will try my best to cover the most important points, and give a bit of my own perspective on them.

Here is a comment by Alexandros, on Anna’s post I quoted above:

Please consider a few gremlins that are weighing down LW currently:

1. Eliezer's ghost -- He set the culture of the place, his posts are central material, has punctuated its existence with his explosions (and refusal to apologise), and then, upped and left the community, without actually acknowledging that his experiment (well kept gardens etc) has failed. As far as I know he is still the "owner" of this website, retains ultimate veto on a bunch of stuff, etc. If that has changed, there is no clarity on who the owner is (I see three logos on the top banner, is it them?), who the moderators are, who is working on it in general. I know tricycle are helping with development, but a part-time team is only marginally better than no-team, and at least no-team is an invitation for a team to step up.

[...]

...I consider Alexei's hints that Arbital is "working on something" to be a really bad idea, though I recognise the good intention. Efforts like this need critical mass and clarity, and diffusing yet another wave of people wanting to do something about LW with vague promises of something nice in the future... is exactly what I would do if I wanted to maintain the status quo for a few more years.

Any serious attempt at revitalising lesswrong.com should focus on defining ownership and plan clearly. A post by EY himself recognising that his vision for lw 1.0 failed and passing the batton to a generally-accepted BDFL would be nice, but i'm not holding my breath. Further, I am fairly certain that LW as a community blog is bound to fail. Strong writers enjoy their independence. LW as an aggregator-first (with perhaps ability to host content if people wish to, like hn) is fine. HN may have degraded over time, but much less so than LW, and we should be able to improve on their pattern.

I think if you want to unify the community, what needs to be done is the creation of a hn-style aggregator, with a clear, accepted, willing, opinionated, involved BDFL, input from the prominent writers in the community (scott, robin, eliezer, nick bostrom, others), and for the current lesswrong.com to be archived in favour of that new aggregator. But even if it's something else, it will not succeed without the three basic ingredients: clear ownership, dedicated leadership, and as broad support as possible to a simple, well-articulated vision. Lesswrong tried to be too many things with too little in the way of backing.

I think Alexandros hits a lot of good points here, and luckily these are actually some of the problems I am most confident we have solved. The biggest bottleneck – the thing that I think caused most other problems with LessWrong – is simply that there was nobody with the motivation, the mandate and the resources to fight against the inevitable decline into entropy. I feel that the correct response to the question of “why did LessWrong decline?” is to ask “why should it have succeeded?”.

In the absence of anyone with the mandate trying to fix all the problems that naturally arise, we should expect any online platform to decline. Most of the problems that will be covered in the rest of this post are things that could have been fixed many years ago, but simply weren’t because nobody with the mandate put many resources into fixing them. I think the cause for this was a diffusion of responsibility, and a lot of vague promises of problems getting solved by vague projects in the future. I myself put off working on LessWrong for a few months because I had some vague sense that Arbital would solve the problems that I was hoping to solve, even though Arbital never really promised to solve them. Then Arbital’s plan ended up not working out, and I had wasted months of precious time.

Since this comment was written, Vaniver has been somewhat unanimously declared benevolent dictator for life of LessWrong. He and I have gotten various stakeholders on board, received funding, have a vision, and have free time – and so we have the mandate, the resources and the motivation to not make the same mistakes. With our new codebase, link posts are now something I can build in an afternoon, rather than something that requires three weeks of getting permissions from various stakeholders, performing complicated open-source and confidentiality rituals, and hiring a new contractor who has to first understand the mysterious Reddit fork from 2008 that LessWrong is based on. This means at least the problem of diffusion of responsibility is solved.

Scott Alexander also made a recent comment on Reddit on why he thinks LessWrong declined, and why he is somewhat skeptical of attempts to revive the website:

1. Eliezer had a lot of weird and varying interests, but one of his talents was making them all come together so you felt like at the root they were all part of this same deep philosophy. This didn't work for other people, and so we ended up with some people being amateur decision theory mathematicians, and other people being wannabe self-help gurus, and still other people coming up with their own theories of ethics or metaphysics or something. And when Eliezer did any of those things, somehow it would be interesting to everyone and we would realize the deep connections between decision theory and metaphysics and self-help. And when other people did it, it was just "why am I reading this random bulletin board full of stuff I'm not interested in?"

2. Another of Eliezer's talents was carefully skirting the line between "so mainstream as to be boring" and "so wacky as to be an obvious crackpot". Most people couldn't skirt that line, and so ended up either boring, or obvious crackpots. This produced a lot of backlash, like "we need to be less boring!" or "we need fewer crackpots!", and even though both of these were true, it pretty much meant that whatever you posted, someone would be complaining that you were bad.

3. All the fields Eliezer wrote in are crackpot-bait and do ring a bunch of crackpot alarms. I'm not just talking about AI - I'm talking about self-help, about the problems with the academic establishment, et cetera. I think Eliezer really did have interesting things to say about them - but 90% of people who try to wade into those fields will just end up being actual crackpots, in the boring sense. And 90% of the people who aren't will be really bad at not seeming like crackpots. So there was enough kind of woo type stuff that it became sort of embarassing to be seen there, especially given the thing where half or a quarter of the people there or whatever just want to discuss weird branches of math or whatever.

4. Communities have an unfortunate tendency to become parodies of themselves, and LW ended up with a lot of people (realistically, probably 14 years old) who tended to post things like "Let's use Bayes to hack our utility functions to get superfuzzies in a group house!". Sometimes the stuff they were posting about made sense on its own, but it was still kind of awkward and the sort of stuff people felt embarassed being seen next to.

5. All of these problems were exacerbated by the community being an awkward combination of Google engineers with physics PhDs and three startups on one hand, and confused 140 IQ autistic 14 year olds who didn't fit in at school and decided that this was Their Tribe Now on the other. The lowest common denominator that appeals to both those groups is pretty low.

6. There was a norm against politics, but it wasn't a very well-spelled-out norm, and nobody enforced it very well. So we would get the occasional leftist who had just discovered social justice and wanted to explain to us how patriarchy was the real unfriendly AI, the occasional rightist who had just discovered HBD and wanted to go on a Galileo-style crusade against the deceptive establishment, and everyone else just wanting to discuss self-help or decision-theory or whatever without the entire community becoming a toxic outcast pariah hellhole. Also, this one proto-alt-right guy named Eugene Nier found ways to exploit the karma system to mess with anyone who didn't like the alt-right (ie 98% of the community) and the moderation system wasn't good enough to let anyone do anything about it.

7. There was an ill-defined difference between Discussion (low-effort random posts) and Main (high-effort important posts you wanted to show off). But because all these other problems made it confusing and controversial to post anything at all, nobody was confident enough to post in Main, and so everything ended up in a low-effort-random-post bin that wasn't really designed to matter. And sometimes the only people who did post in Main were people who were too clueless about community norms to care, and then their posts became the ones that got highlighted to the entire community.

8. Because of all of these things, Less Wrong got a reputation within the rationalist community as a bad place to post, and all of the cool people got their own blogs, or went to Tumblr, or went to Facebook, or did a whole bunch of things that relied on illegible local knowledge. Meanwhile, LW itself was still a big glowing beacon for clueless newbies. So we ended up with an accidental norm that only clueless newbies posted on LW, which just reinforced the "stay off LW" vibe.

I worry that all the existing "resurrect LW" projects, including some really high-effort ones, have been attempts to break coincidental vicious cycles - ie deal with 8 and the second half of 7. I think they're ignoring points 1 through 6, which is going to doom them.

At least judging from where my efforts went, I would agree that I have spent a pretty significant amount of resources on fixing the problems that Scott described in points 6 and 7, but I also spent about equal time thinking about how to fix 1-5. The broader perspective that I have on those latter points is, I think, best illustrated by an analogy:

So while I do think that Eliezer’s writing encouraged topics that were slightly more likely to attract crackpots, I think a large chunk of the weird writing is just a natural consequence of being an intellectual community that has a somewhat constant influx of new members.

And having undergraduates go through the phase where they have bad ideas, and then have it explained to them why their ideas are bad, is important. I actually think it’s key to learning any topic more complicated than high-school mathematics. It takes a long time until someone can productively contribute to the intellectual progress of an intellectual community (in academia it’s at least 4 years, though usually more like 8), and during all that period they will say very naive and silly sounding things (though less and less so as time progresses). I think LessWrong can do significantly better than 4 years, but we should still expect that it will take new members time to acclimate and get used to how things work. Based on user interviews with many top commenters, it usually took something like 3-6 months until someone felt comfortable commenting frequently and about 6-8 months until someone felt comfortable posting frequently. This strikes me as a fairly reasonable expectation for the future.

And I do think that we have many graduate students and tenured professors of the rationality community who are not Eliezer, and who do not sound like crackpots, that can speak reasonably about the same topics Eliezer talked about, and who I feel are acting with a very similar focus to what Eliezer tried to achieve. Luke Muehlhauser, Carl Shulman, Anna Salamon, Sarah Constantin, Ben Hoffman, Scott himself and many more, most of whose writing would fit very well on LessWrong (and often still ends up there).

But all of this doesn’t mean what Scott describes isn’t a problem. It’s still a bad experience for everyone to constantly have to read through bad first year undergrad essays, but I think the solution can’t involve those essays not getting written at all. Instead it has to involve some kind of way of not forcing everyone to see those essays, while still allowing them to get promoted if someone shows up who does write something insightful from day one. I am currently planning to tackle this mostly with improvements to the karma system, as well as changes to the layout of the site, where users primarily post to their own profiles and can get content promoted to the frontpage by moderators and high-karma members. A feed consisting solely of content of the quality of the average Scott, Anna, Ben or Luke post would be an amazing read, and is exactly the kind of feed I am hoping to create with LessWrong, while still allowing users to engage with the rest of the content on the site (more on that later).

I would very roughly summarize what Scott says in the first 5 points as two major failures: first, a failure to separate the signal from the noise, and second, a failure to enforce moderation norms when people did turn out to be crackpots or were just unable to productively engage with the material on the site. Both are natural consequences of the abandonment of promoting things to Main, the fact that Discussion is ordered by default by recency and not by some kind of scoring system, and the fact that the moderation tools were completely insufficient (more on the details of that in the next section).

My models of LessWrong 2.0

I think there are three major bottlenecks that LessWrong is facing (after the zeroth bottleneck, which is just that no single group had the mandate, resources and motivation to fix any of the problems):

1. We need to be able to build on each other’s intellectual contributions, archive important content and avoid primarily being news-driven
2. We need to improve the signal-to-noise ratio for the average reader, and only broadcast the most important writing
3. We need to actively moderate in a way that is both fun for the moderators, and helps people avoid future moderation policy violations

I.

The first bottleneck for our community, and the biggest I think, is the ability to build common knowledge. On Facebook, I can read an excellent and insightful discussion, yet one week later I have forgotten it. Even if I remember it, I don’t link to the Facebook post (because linking to Facebook posts/comments is hard) and it doesn’t have a title so I don’t casually refer to it in discussion with friends. On Facebook, ideas don’t get archived and built upon, they get discussed and forgotten. To put this another way, the reason we cannot build on the best ideas this community had over the last five years is that we don’t know what they are. There are only fragments of memories of Facebook discussions which maybe some other people remember. We have the Sequences, but there’s no way to build on them together as a community, and thus there is stagnation.

Contrast this with science. Modern science is plagued by many severe problems, but of humanity’s institutions it has perhaps the strongest record of being able to build successfully on its previous ideas. The physics community has this system where the new ideas get put into journals, and then eventually if they’re new, important, and true, they get turned into textbooks, which are then read by the upcoming generation of physicists, who then write new papers based on the findings in the textbooks. All good scientific fields have good textbooks, and your undergrad years are largely spent reading them. I think the rationality community has some textbooks, written by Eliezer (and we also compiled a collection of Scott’s best posts that I hope will become another textbook of the community), but there is no expectation that if you write a good enough post/paper that your content will be included in the next generation of those textbooks, and the existing books we have rarely get updated. This makes the current state of the rationality community analogous to a hypothetical state of physics, had physics no journals, no textbook publishers, and only one textbook that is about a decade old.

This seems to me what Anna is talking about - the purpose of the single locus of conversation is the ability to have common knowledge and build on it. The goal is to have every interaction with the new LessWrong feel like it is either helping you grow as a rationalist or has you contribute to lasting intellectual progress of the community. If you write something good enough, it should enter the canon of the community. If you make a strong enough case against some existing piece of canon, you should be able to replace or alter that canon. I want writing to the new LessWrong to feel timeless.

To achieve this, we’ve built the following things:

• We created a section for core canon on the site that is prominently featured on the frontpage and right now includes Rationality: A-Z, The Codex (a collection of Scott’s best writing, compiled by Scott and us), and HPMOR. Over time I expect these to change, and there is a good chance HPMOR will move to a different section of the site (I am considering adding an “art and fiction” section) and will be replaced by a new collection representing new core ideas in the community.
• Sequences are now a core feature of the website. Any user can create sequences of their own and other users' posts, and those sequences themselves can be voted and commented on. The goal is to help users compile the best writing on the site, and make it so that good timeless writing gets read by users for a long time, as opposed to disappearing into the void. Separating creative and curatorial effort allows the sort of professional specialization that you see in serious scientific fields.
• Of those sequences, the most upvoted and most important ones will be chosen to be prominently featured on other sections of the site, allowing users easy access to read the best content on the site and get up to speed with the current state of knowledge of the community.
• For all posts and sequences the site keeps track of how much of them you’ve read (including importing view-tracking from old LessWrong, so you will get to see how much of the original sequences you’ve actually read). And if you’ve read all of a sequence you get a small badge that you can choose to display right next to your username, which helps people navigate how much of the content of the site you are familiar with.
• The design of the core content of the site (e.g. the Sequences, the Codex, etc.) tries to communicate a certain permanence of contributions. The aesthetic feels intentionally book-like, which I hope gives people a sense that their contributions will be archived, accessible and built-upon.

One important issue with this is that there also needs to be a space for sketches on LessWrong. To quote Paul Graham: “What made oil paint so exciting, when it first became popular in the fifteenth century, was that you could actually make the finished work from the prototype. You could make a preliminary drawing if you wanted to, but you weren't held to it; you could work out all the details, and even make major changes, as you finished the painting.”

We do not want to discourage sketch-like contributions, and want to build functionality that helps people build a finished work from a prototype (this is one of the core competencies of Google Docs, for example).

And there are some more features the team is hoping to build in this direction, such as:

• Easier archiving of discussions by allowing discussions to be turned into top-level posts (similar to what Ben Pace did with a recent Facebook discussion between Eliezer, Wei Dai, Stuart Armstrong, and some others, which he turned into a post on LessWrong 2.0)
• The ability to continue reading the content you’ve started reading with a single click from the frontpage. Here's an example logged-in frontpage:

II.

The second bottleneck is improving the signal-to-noise ratio. It needs to be possible for someone to subscribe to only the best posts on LessWrong, and only the most important content needs to be turned into common knowledge.

I think this is a lot of what Scott was pointing at in his summary about the decline of LessWrong. We need a way for people to learn from their mistakes without flooding the inboxes of everyone else, while giving them active feedback on how to improve their writing.

The site structure:

To solve this bottleneck, here is the rough content structure that I am currently planning to implement on LessWrong:

The writing experience:

If you write a post, it first shows up nowhere but your personal user page, which you can basically think of as a Medium-style blog. If other users have subscribed to you, your post will then show up on their frontpages (or only show up after it hits a certain karma threshold, if the users who subscribed to you set a minimum karma threshold). If you have enough karma you can decide to promote your content to the main frontpage feed (where everyone will see it by default), or a moderator can decide to promote your content (if you allowed promoting on that specific post). The frontpage itself is sorted by a scoring system based on the HN algorithm, which uses a combination of total karma and how much time has passed since the creation of the post.
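To make the frontpage sorting concrete, here is a minimal sketch of a time-decayed score in the spirit of the commonly cited approximation of the Hacker News ranking formula. It is not taken from the LessWrong 2.0 codebase: the `Post` class, the function names, the gravity exponent of 1.8 and the two-hour offset are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    karma: float       # total karma the post has received
    age_hours: float   # hours since the post was created


def frontpage_score(post: Post, gravity: float = 1.8,
                    offset_hours: float = 2.0) -> float:
    """Karma divided by a power of the post's age (assumed constants).

    The offset keeps brand-new posts from dividing by zero, and the
    gravity exponent controls how quickly old posts sink.
    """
    return post.karma / (post.age_hours + offset_hours) ** gravity


def rank_frontpage(posts):
    """Return posts ordered from highest to lowest score."""
    return sorted(posts, key=frontpage_score, reverse=True)


posts = [
    Post("old popular", karma=100, age_hours=48),
    Post("new modest", karma=20, age_hours=1),
    Post("new unvoted", karma=1, age_hours=0),
]
for post in rank_frontpage(posts):
    print(post.title, round(frontpage_score(post), 3))
```

With these assumed constants a two-day-old post with 100 karma ranks below a one-hour-old post with 20 karma, which is the intended effect of mixing karma with recency; a smaller exponent would let high-karma posts linger longer.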

If you write a good comment on a post a moderator or a high-karma user can promote that comment to the frontpage as well, where we will also feature the best comments on recent discussions.

Meta

Meta will just be a section of the site to discuss changes to moderation policies, issues and bugs with the site, discussion about site features, as well as general site-policy issues. Basically the thing that all StackExchanges have. Karma here will not add to your total karma and will not give you more influence over the site.

Featured posts

In addition to the main thread, there is a promoted post section that you can subscribe to via email and RSS, that has on average three posts a week, which for now are just going to be chosen by moderators and editors on the site to be the posts that seem most important to turn into common-knowledge for the community.

Meetups (implementation unclear)

There will also be a separate section of the site for meetups and event announcements that will feature a map of meetups, and generally serve as a place to coordinate the in-person communities. The specific implementation of this is not yet fully figured out.

Shortform (implementation unclear)

Many authors (including Eliezer) have requested a section of the site for more short-form thoughts, more similar to the length of an average FB post. It seems reasonable to have a section of the site for that, though I am not yet fully sure how it should be implemented.

Why?

The goal of this structure is to allow users to post to LessWrong without their content being directly exposed to the whole community. Their content can first be shown to the people who follow them, or the people who actively seek out content from the broader community by scrolling through all new posts. Then, if a high-karma user among them finds their content worth posting to the frontpage, it will get promoted. The key to this is a larger userbase that has the ability to promote content (i.e. many more than have the ability to promote content to main on the current LessWrong), and the continued filtering of the frontpage based on the karma level of the posts.

The goal of all of this is to allow users to see good content at various levels of engagement with the site, while giving some personalization options so that people can follow the people they are particularly interested in. It should also ensure that this personalization does not sabotage the attempt at building common knowledge, by having the best posts from the whole ecosystem be featured and promoted on the frontpage.

The karma system:

Another thing I’ve been working on to fix the signal-to-noise ratio is to improve the karma system. It’s important that the people having the most significant insights are able to shape a field more. If you’re someone who regularly produces real insights, you’re better able to notice and bring up other good ideas. To achieve this we’ve built a new karma system, where your upvotes and downvotes weigh more if you have a lot of karma already. So far the weighting is a very simple heuristic, whereby your upvotes and downvotes count for log base 5 of your total karma. Ben and I will post another top-level post to discuss just the karma system at some point in the next few weeks, but feel free to ask any questions now, and we will just include those in that post.
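As a rough illustration of the log base 5 heuristic, the sketch below computes a vote weight from a user's total karma. The function name, the floor of 1 for low-karma users, and leaving the weight unrounded are all assumptions; the post only states that votes count for log base 5 of total karma.

```python
import math


def vote_weight(total_karma, base=5.0, minimum=1.0):
    """Weight of one vote as log base 5 of the voter's total karma.

    Assumption: users whose karma is below the base (including zero or
    negative karma) still get a minimum weight of 1 so their votes count.
    """
    if total_karma < base:
        return minimum
    return max(minimum, math.log(total_karma, base))


if __name__ == "__main__":
    for karma in (0, 5, 25, 125, 1000, 10000):
        print(karma, round(vote_weight(karma), 2))
```

Under this heuristic the weight grows slowly: each extra point of vote weight requires five times as much karma as the previous one.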

(I am currently experimenting with a karma system based on the concept of eigendemocracy by Scott Aaronson, which you can read about here, but which basically boils down to applying Google’s PageRank algorithm to karma allocation. How trusted you are as a user (your karma) is based on how much trusted users upvote you, and the circularity of this definition is solved using linear algebra.)
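The circular definition can be sketched as a PageRank-style power iteration: start with equal trust, then repeatedly hand each user's trust to the people they upvoted until the numbers settle. This is only an illustration under stated assumptions, not the system being experimented with. The damping factor of 0.85 is borrowed from the usual PageRank presentation, and the function name, input format, and handling of users who never vote are hypothetical.

```python
def eigen_karma(upvotes, damping=0.85, iterations=100):
    """Estimate trust scores from an upvote graph by power iteration.

    upvotes maps each voter to a dict of {author: number_of_upvotes_given}.
    Returns a dict of user -> trust score; the scores sum to 1.
    """
    users = sorted(set(upvotes) | {a for given in upvotes.values() for a in given})
    n = len(users)
    if n == 0:
        return {}
    trust = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Every user keeps a small baseline share (the PageRank "teleport" term).
        new_trust = {u: (1.0 - damping) / n for u in users}
        for voter in users:
            given = upvotes.get(voter, {})
            total = sum(given.values())
            if total == 0:
                # Assumption: a user who upvoted nobody spreads trust evenly.
                for u in users:
                    new_trust[u] += damping * trust[voter] / n
            else:
                for author, count in given.items():
                    new_trust[author] += damping * trust[voter] * count / total
        trust = new_trust
    return trust


if __name__ == "__main__":
    votes = {
        "alice": {"bob": 3},
        "bob": {"alice": 2},
        "carol": {"bob": 1},
    }
    for user, score in sorted(eigen_karma(votes).items(), key=lambda kv: -kv[1]):
        print(user, round(score, 3))
```

In this toy graph, carol receives no upvotes and ends up with only the baseline share, while bob ranks highest because he is upvoted by both other users.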

I am also interested in having some form of two-tiered voting, similar to how Facebook has a primary vote interaction (the like) and a secondary interaction that you can access via a tap or a hover (angry, sad, heart, etc.). But the implementation of that is also currently undetermined.

III

The third and last bottleneck is an actually working moderation system that is fun for moderators to use, while also giving people whose content was moderated a sense of why, and how they can improve.

The most common, basic complaint currently on LessWrong pertains to trolls and sockpuppet accounts, which the reddit fork’s mod tools are vastly inadequate for dealing with (Scott's sixth point refers to this). Raymond Arnold and I are currently building more nuanced mod tools that include abilities for moderators to set the past/future votes of a user to zero, to see who upvoted a post, and to know the IP address that an account comes from (this will be ready by the open beta).

Besides that, we are currently working on cultivating a moderation group we are calling the “Sunshine Regiment.” Members of the Sunshine Regiment will have the ability to take various smaller moderation actions around the site (such as temporarily suspending comment threads, making general moderating comments in a distinct font, and promoting content), and so will be able to shape the culture and content of the website to a larger degree.

The goal is moderation that goes far beyond dealing with trolls, and actively makes the epistemic norms a ubiquitous part of the website. Right now Ben Pace is thinking about moderation norms that encourage archiving and summarizing good discussion, as well as other patterns of conversation that will help the community make intellectual progress. He’ll be posting to the open beta to discuss what norms the site and moderators should have in the coming weeks. We're both in agreement that moderation can and should be improved, and that moderators need better tools, and would appreciate good ideas about what else to give them.

How you can help and issues to discuss:

The open beta of the site is starting in a week, and so you can see all of this for yourself. For the duration of the open beta, we’ll continue the discussion on the beta site. At the conclusion of the open beta, we plan to have a vote open to those who had a thousand karma or more on 9/13 to determine whether we should move forward with the new site design, which would move to the lesswrong.com url from its temporary beta location, or leave LessWrong as it is now. (As the latter would represent the failure of the plan to revive LW, it would likely lead to the site being archived rather than staying open in an unmaintained state.) For now, this is an opportunity for the current LessWrong community to chime in here and object to anything in this plan.

During the open beta (and only during that time) the site will also have an Intercom button in the bottom right corner that allows you to chat directly with us. If you run into any problems, or notice any bugs, feel free to ping us directly on there and Ben and I will try to help you out as soon as possible.

Here are some issues where I think discussion would be particularly fruitful:

• What are your thoughts about the karma system? Does an eigendemocracy-based system seem reasonable to you? How would you implement the details? Ben and I will post our current thoughts on this in a separate post in the next two weeks, but we would be interested in people’s unprimed ideas.
• What are your experiences with the site so far? Is anything glaringly missing, or are there any bugs you think I should definitely fix?
• Do you have any complaints or thoughts about how work on LessWrong 2.0 has been proceeding so far? Are there any worries or issues you have with the people working on it?
• What would make you personally use the new LessWrong? Is there any specific feature that would make you want to use it? For reference, here is our current feature roadmap for LW 2.0.
• And most importantly, do you think that the LessWrong 2.0 project is doomed to failure for some reason? Is there anything important I missed, or something that I misunderstood about the existing critiques?

The closed beta can be found at www.lesserwrong.com.

Ben, Vaniver, and I will be in the comments!

LW 2.0 Open Beta starts 9/20

24 15 September 2017 02:57AM

Two years ago, I wrote Lesswrong 2.0. It’s been quite the adventure since then; I took up the mantle of organizing work to improve the site but was missing some of the core skills, and also never quite had the time to make it my top priority. Earlier this year, I talked with Oliver Habryka and he joined the project and has done the lion’s share of the work since then, with help along the way from Eric Rogstad, Harmanas Chopra, Ben Pace, Raymond Arnold, and myself. Dedicated staff has led to serious progress, and we can now see the light at the end of the tunnel.

So what’s next? We’ve been running the closed beta for some time at lesserwrong.com with an import of the old LW database, and are now happy enough with it to show it to you all. On 9/20, next Wednesday, we’ll turn on account creation, making it an open beta. (This will involve making a new password, as the passwords are stored hashed and we’ve changed the hashing function from the old site.) If you don't have an email address set for your account (see here), I recommend adding it by the end of the open beta so we can merge accounts. For the open beta, just use the Intercom button in the lower right corner if you have any trouble.
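A short sketch of why a changed hashing function means old stored digests cannot simply be carried over: the server holds only the digest, not the password, so it cannot recompute the digest under the new function. Both hash functions below are hypothetical stand-ins chosen for illustration; the post does not say which functions the old or new site uses.

```python
import hashlib
import hmac


def old_hash(password, salt):
    """Hypothetical stand-in for the old site's scheme (salted SHA-1)."""
    return hashlib.sha1(salt + password.encode("utf-8")).hexdigest()


def new_hash(password, salt):
    """Hypothetical stand-in for the new site's scheme (PBKDF2-HMAC-SHA256)."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return digest.hex()


def verify_new(password, salt, stored_digest):
    """Check a login attempt against a digest stored by the new scheme."""
    return hmac.compare_digest(new_hash(password, salt), stored_digest)


if __name__ == "__main__":
    salt = b"example-salt"
    imported_digest = old_hash("correct horse", salt)
    # The imported digest never matches under the new function,
    # even when the user types the right password.
    print(verify_new("correct horse", salt, imported_digest))
    # Once the user sets a password on the new site, verification works.
    print(verify_new("correct horse", salt, new_hash("correct horse", salt)))
```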

Once the open beta concludes, we’ll have a vote of veteran users (over 1k karma as of yesterday) on whether to change the code at lesswrong.com over to the new design or not. It seems important to look into the dark and have an escape valve in case this is the wrong direction for LW. If the vote goes through, we’ll import the new LW activity since the previous import to the new servers, merging the two, and point the url to the new servers. If it doesn’t, we’ll likely turn LW into an archive.

Oliver Habryka will be posting shortly with his views on LW and more details on our plans for how LW 2.0 will further intellectual progress in the community.