Supported by Rethink Priorities
This is part of a weekly series - you can see the full collection here. The first post includes some details on purpose and methodology.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell for producing these! Subscribe on your favorite podcast app by searching for 'Effective Altruism Forum Podcast'.
Top / Curated Readings
Designed for those without the time to read all the summaries. Everything here is also within the relevant sections later on so feel free to skip if you’re planning to read it all.
Lessons learned from talking to >100 academics about AI safety
by mariushobbhahn
The author talked to 100-200 people in academia about AI safety, from bachelor’s students to senior faculty, over several years. This post summarizes the key learnings.
Author’s tl;dr (lightly edited): “Academics are increasingly open to arguments about AI risk and I’d recommend having lots of these chats. I underestimated how much work related to aspects of AI safety (eg. interpretability) already exists in academia - we sometimes reinvent the wheel. Messaging matters, e.g. technical discussions got more interest than alarmism, and explaining the problem rather than trying to actively convince someone received better feedback.”
Measuring Good Better
by MichaelPlant, GiveWell, Jason Schukraft, Matt_Lerner, Innovations for Poverty Action
Transcripts of 5-minute lightning talks by various orgs on their approach to measuring ‘good’. A very short summary of each is below:
GiveWell uses moral weights to compare different units (eg. doubling incomes vs. saving the life of a child under 5). These are based 60% on donor surveys, 30% on a 2019 survey of 2K people in Kenya and Ghana, and 10% on staff opinion.
Open Philanthropy’s global health and wellbeing team uses the unit of ‘a single dollar to someone making 50K per year’ and then compares everything to that. Eg. Averting a DALY is worth 100K of these units.
Happier Lives Institute focuses on wellbeing, measuring WELLBYs. One WELLBY is a one-point increase on a 0-10 life satisfaction scale for one year.
Founders Pledge values cash at $199 per WELLBY. They have conversion rates from WELLBYs to Income Doublings to Deaths Avoided to DALYs Avoided, using work from some of the orgs above. This means they can get a dollar figure they’re willing to spend for each of these measures.
Innovations for Poverty Action asks different questions depending on the project stage (eg. idea, pilot, measuring, scaling). Early questions can be eg. if it’s the right solution for the audience, and only down the line can you ask ‘does it actually save more lives?’
Metaculus Launches the 'Forecasting Our World In Data' Project to Probe the Long-Term Future
by christian, EdMathieu
Forecasting Our World In Data is a tournament that will deliver predictions on technological advancement, global development, and social progress using Our World in Data metrics. There is a 20K prize pool for accurate forecasts on 1-3 year time horizons and cogent analysis on 10-100 year horizons. The first questions have opened, with more to come on 19th and 26th Oct.
EA Forum
Philosophy and Methodologies
We can do better than argmax
by Jan_Kulveit, Gavin
Author’s tl;dr (lightly edited): a common prioritization method in EA is putting all resources on your top option (argmax). But this can be foolish, so we deviate in ad-hoc ways. We describe a principled softmax approach, allocating resources to several options by confidence. This works well when a whole community collaborates on impact; when some opportunities are fleeting or initially unknown; or when large actors are in play.
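To make the contrast concrete, here’s a minimal sketch (my own illustration with made-up impact estimates, not code from the post): a very low softmax temperature recovers argmax, while a higher temperature spreads resources across options by confidence.

```python
# Illustrative sketch only - hypothetical expected-impact numbers, not from the post.
import numpy as np

def softmax_allocation(expected_values, temperature=1.0):
    """Allocate budget shares in proportion to softmax(EV / temperature)."""
    v = np.array(expected_values, dtype=float) / temperature
    w = np.exp(v - v.max())   # subtract the max for numerical stability
    return w / w.sum()

evs = [10.0, 8.0, 3.0]  # hypothetical expected-impact estimates for three options
print(softmax_allocation(evs, temperature=0.01))  # ~argmax: nearly everything on option 1
print(softmax_allocation(evs, temperature=2.0))   # softmax: spread across options by confidence
```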
Parfit + Singer + Aliens = ?
by Maxwell Tabarrok
If we observe even simple (eg. single-celled) alien life, the chance of intelligent and morally relevant alien life existing somewhere increases drastically. In this case, human extinction isn’t as bad - the difference between eg. 95% and 100% of humans dying matters much less. This makes risky moves like advancing AI or biotech (which could either destroy us or be hugely positive) more positive on balance, and implies we should upweight higher-volatility paths.
Measuring Good Better
by MichaelPlant, GiveWell, Jason Schukraft, Matt_Lerner, Innovations for Poverty Action
Transcripts of 5-minute lightning talks by various orgs on their approach to measuring ‘good’. A very short summary of each is below:
GiveWell uses moral weights to compare different units (eg. doubling incomes vs. saving the life of a child under 5). These are based 60% on donor surveys, 30% on a 2019 survey of 2K people in Kenya and Ghana, and 10% on staff opinion.
Open Philanthropy’s global health and wellbeing team uses the unit of ‘a single dollar to someone making 50K per year’ and then compares everything to that. Eg. Averting a DALY is worth 100K of these units.
Happier Lives Institute focuses on wellbeing, measuring WELLBYs. One WELLBY is a one-point increase on a 0-10 life satisfaction scale for one year.
Founders Pledge values cash at $199 per WELLBY. They have conversion rates from WELLBYs to Income Doublings to Deaths Avoided to DALYs Avoided, using work from some of the orgs above. This means they can get a dollar figure they’re willing to spend for each of these measures (see the illustrative sketch after this list).
Innovations for Poverty Action asks different questions depending on the project stage (eg. idea, pilot, measuring, scaling). Early questions can be eg. if it’s the right solution for the audience, and only down the line can you ask ‘does it actually save more lives?’
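As a toy illustration of how conversion rates like these chain into dollar figures (my own sketch - only the $199 per WELLBY figure comes from the talks; the other rates are placeholder assumptions):

```python
# Hypothetical sketch of chaining conversion rates into willingness-to-pay figures.
# Only DOLLARS_PER_WELLBY is taken from the summary above; the other rates are
# made-up placeholders purely for illustration.
DOLLARS_PER_WELLBY = 199.0
WELLBYS_PER_INCOME_DOUBLING = 2.0            # placeholder assumption
INCOME_DOUBLINGS_PER_DEATH_AVERTED = 100.0   # placeholder assumption

dollars_per_income_doubling = DOLLARS_PER_WELLBY * WELLBYS_PER_INCOME_DOUBLING
dollars_per_death_averted = dollars_per_income_doubling * INCOME_DOUBLINGS_PER_DEATH_AVERTED

print(f"${dollars_per_income_doubling:,.0f} per income doubling")   # $398 with these placeholders
print(f"${dollars_per_death_averted:,.0f} per death averted")       # $39,800 with these placeholders
```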
Object Level Interventions / Reviews
Introducing the EA Good Governance Project
by Grayden
Author’s tl;dr: “I believe good governance is important and often underrated within EA. I'm launching the EA Good Governance Project. Its first initiative will be a directory of EA Board candidates. If you have skills and experience to offer to an EA Board, please add your profile.”
They also plan to add practical resources for Boards (eg. how to measure impact and set appropriate policies), and are looking for contributors to these.
Why I think there's a one-in-six chance of an imminent global nuclear war
by Tegmark
The author predicts a 30% chance of Russia launching nukes; in that case, an 80% chance that NATO responds with conventional weapons; and in that case, a 70% chance of a global nuclear war. This equates to roughly a ⅙ chance of global nuclear war from today’s state (0.30 × 0.80 × 0.70 ≈ 0.17).
They argue that Putin will not accept a full loss without going nuclear, because he’d likely be jailed / killed. The other alternative, de-escalation, seems disfavored in the West because Ukraine is winning. And if a nuke is used in Ukraine, escalation to eventual nuclear war seems likely because the countries involved have a long history of nuclear near misses, and have made retaliation threats already.
These predictions are significantly more pessimistic than the community average. Metaculus currently gives the first stage (nuclear use in Ukraine) 7% odds this year, while Samotsvety gives 16% for the next year, and 1.6% for nuclear use escalating beyond that in the next year.
Lessons learned from talking to >100 academics about AI safety
by mariushobbhahn
The author talked to 100-200 people in academia about AI safety, from bachelor’s students to senior faculty, over several years. This post summarizes the key learnings.
Author’s tl;dr (lightly edited): “Academics are increasingly open to arguments about AI risk and I’d recommend having lots of these chats. I underestimated how much work related to aspects of AI safety (eg. interpretability) already exists in academia - we sometimes reinvent the wheel. Messaging matters, e.g. technical discussions got more interest than alarmism, and explaining the problem rather than trying to actively convince someone received better feedback.”
Anonymous advice: If you want to reduce AI risk, should you take roles that advance AI capabilities?
by Benjamin Hilton, 80000_Hours
Work which increases AI capabilities is intertwined with certain types of safety work, and can be a good way to skill-build for safety work. However, it can also be harmful by accelerating dangerous AI. 80K anonymously asked 22 experts for their views on this balance, and received 10 responses published in full in this post.
How many experts leant each way and the common arguments for those leanings are summarized below.
Mostly Yes (2)
It Depends / 'Yes if careful' (4)
Mostly No (3)
Strong No (1)
Sheltering humanity against x-risk: report from the SHELTER weekend
by Janne M. Korhonen
Write-up of learnings from a participant in the August SHELTER weekend, an event for gaining clarity on what’s needed to build civilizational shelters.
Key takeaways included:
Responding to recent critiques of iron fortification in India
by e19brendan
Fortify Health co-founder Brendan responds to forum posts suggesting that recent studies on the prevalence of anemia in India, and the proportion attributable to iron deficiency, should lower cost-effectiveness estimates for fortification. The posts also suggested using more targeted treatment and changing anemia cut-offs.
Brendan reviewed the provided studies, and found that:
Counterarguments to the basic AI risk case
by Katja_Grace
Counters to the argument that goal-directed AIs are likely and hard to align to good goals, and that there is therefore significant x-risk:
The US expands restrictions on AI exports to China. What are the x-risk effects?
by Stephen Clare
Last week the Biden administration announced regulations that make it illegal for US companies to export certain AI-related products and services to China, including high-end chips and semiconductor equipment.
The author raises questions about the impact on China’s AI trajectory, whether this will increase the likelihood of conflict, and how these rising tensions might affect cooperation on other global risks.
Opportunities
SERI MATS Program - Winter 2022 Cohort
by Ryan Kidd
Applications open until Oct 24th for the MATS program, which supports aspiring alignment researchers to do independent research with funding (via LTFF), mentorship and training. The winter cohort will run Nov 7 - Feb 23rd, and be in person in Berkeley from Jan 3rd.
Book a corporate event for Giving Season
by Jack Lewars, Luke Freeman, Federico Speziali
Author’s tl;dr: “following the success of previous corporate talks, One for the World, Giving What We Can and High Impact Professionals are collaborating to offer a range of corporate talks this giving season. Use our contact form to learn more or book a talk.” GWWC is also offering workshops for a more interactive experience.
Alignment 201 curriculum
by richard_ngo
A follow-up to the Alignment Fundamentals curriculum, this 9-week curriculum aims to give enough knowledge to understand the frontier of current research discussions. It’s targeted at those who have taken the previous course, in addition to having some knowledge of deep learning and reinforcement learning.
Metaculus Launches the 'Forecasting Our World In Data' Project to Probe the Long-Term Future
by christian, EdMathieu
Forecasting Our World In Data is a tournament that will deliver predictions on technological advancement, global development, and social progress using Our World in Data metrics. There is a 20K prize pool for accurate forecasts on 1-3 year time horizons and cogent analysis on 10-100 year horizons. The first questions have opened, with more to come on 19th and 26th Oct.
Growth Theory Reading List
by LuisMota
List of readings on economic growth theory, broken down into 10 sub-topics such as long-run historical growth, AI and growth, stagnation, and growth and happiness.
EAGxVirtual: A virtual venue, timings, and other updates
by Alex Berezhnoi
Applications due before 19th October (conference from 21st October). >600 applicants so far from >60 countries. The post highlights content to expect, platforms used, and puts out a call for volunteers.
Community & Media
Why defensive writing is bad for community epistemics
by Emrik
Defensive writing is optimizing your writing to make sure no one forms a bad impression of you. This can become a norm when readers try to make inferences about the author rather than just learning from the content (‘judgemental reading’). Both make communication inefficient and writing scary.
The author suggests being clear as a writer about the purpose of your writing, and if it’s helping your readers. As a reader, he suggests interpreting things charitably, rewarding confidence, and not punishing people for what they don’t know.
When reporting AI timelines, be clear who you're (not) deferring to
by Sam Clarke
It’s common to ask people’s AI timelines, and also common for responses not to include whether they’re independent impressions or based on others’ views. This can lead to these timelines feeling more robust than they are, and to groups of EAs converging on the same timelines without good reason.
The author suggests if you haven’t formed an independent impression, always say who you’re deferring to. If you’re asking about someone’s timelines, always ask how they got to them. They’ve also put up a survey to work out who people are deferring to most.
On absurdity
by OllieBase
What we’re doing is absurdly ambitious. Looking at things through the absurdity lens can help us step back, get energy, and be kinder to ourselves and others (particularly when we fail). For instance, realizing that ‘trying to work out the world’s biggest problem with my two college friends’ or ‘running for office with no political background to single-handedly influence the Senate on global health security’ are absurd takes off some of the pressure - while still remembering it’s worth a shot!
Some Carl Sagan quotations
by finm
Carl Sagan (1934 - 1996) was an astronomer and science communicator who captured many ideas related to longtermism and existential risk poetically. This article is a collection of some of the author’s favorite quotes from him.
Counterproductive EA mental health advice (and what to say instead)
by Ada-Maaria Hyvärinen
Some well-meaning advice is counter-productive. This includes telling people:
Instead, say things in a way that reflects that you care about the person for their intrinsic value, not just the impact they can have. This can be important in self-talk too.
Cultural EA considerations for Nordic folks
by Ada-Maaria Hyvärinen
Cultural information about EA that contrasts with the norm from a Finnish / Nordic perspective.
Some key topics:
Changes to EA Giving Tuesday for 2022
by Giving What We Can, mjamer, GraceAdams, Jack Lewars
Giving What We Can and One For The World volunteered to manage EA Giving Tuesday for 2022, somewhat scaled back (~25% of the charities included previously, and minimal testing / revision of donation strategy). If you’d like to participate, sign up for email updates here.
An EA's Guide to Berkeley and the Bay Area
by ES, Vaidehi Agarwalla
A guide for newcomers, covering positives and negatives. Most helpful if you’re already planning or seriously considering coming to Berkeley and want to get more context on the community and culture.
Pineapple Operations is expanding to include all operations talent (Oct '22 Update)
by Vaidehi Agarwalla, Alexandra Malikova
The Pineapple Operations database of candidates now includes all operations talent, not just PAs/ExAs. The post links to where you can list yourself or search the 100+ candidates.
Ask Charity Entrepreneurship Anything
by Ula, KarolinaSarek, Joey
Some top comments at time of summarizing:
Didn’t Summarize
Let me blind myself to Forum post authors by Will Payne (forum feature request)
Scout Mindset Poster by Anthony Fleming (printable poster)
LW Forum
AI Related
Possible miracles
by Akash, Thomas Larsen
Eliezer’s List of Lethalities is a list of ways we could fail with regards to AGI. This post is a brainstorm of ways we might win - intended as an exercise for others to try too.
The author suggests it could be helpful to backchain from these brainstorms to come up with new project ideas.
QAPR 4: Inductive biases
by Quintin Pope
A roundup of 16 alignment papers focused on the inductive biases of stochastic gradient descent. Links, quotes, and the author’s opinion are provided for each.
Niceness is unnatural
by So8res
There’s an argument that it might be easy to make AIs ‘nice’, because prosocial behavior is advantageous in multi-agent settings.
The author argues this is unlikely, because:
1. The role of niceness in selection pressures can be replaced with other strategies like ‘merge with local potential allies immediately’.
2. Humans’ ‘niceness’ is detailed, eg. we only do it sometimes, have limited patience, and have differing sensitivity to various types of cheating. An AI might have a different set of details no longer recognizable as ‘niceness’.
3. Related skills like empathy might occur because our self-models are the same as our other-models (we’re both human). This doesn’t apply for AIs.
4. The AI might display nice behaviors while they’re useful, and then reflect and drop them when they’re not.
Help out Redwood Research’s interpretability team by finding heuristics implemented by GPT-2 small
by Haoxing Du, Buck
Some of Redwood’s research involves interpretability on specific behaviors language models exhibit. They’re considering scaling up that line of research, so they’re asking commenters for more behaviors to investigate! They’ve put up a web app that lets people use GPT-2 to identify behaviors.
An example is acronyms - GPT-2 Small consistently follows the heuristic “string together the first letter of each capitalized word, and then close the parentheses” when asked to generate acronyms.
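For readers who want to try this outside Redwood’s web app, here’s a minimal sketch using the Hugging Face transformers library (the prompt and expected completion are my own illustration, not from the post):

```python
# Minimal sketch of probing GPT-2 small for the acronym heuristic described above.
# Uses the public Hugging Face transformers library, not Redwood's web app.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The International Monetary Fund ("   # illustrative prompt
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=5,
    do_sample=False,                       # greedy decoding: the model's top continuation
    pad_token_id=tokenizer.eos_token_id,
)
completion = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:])
print(completion)  # if the heuristic holds, expect something like "IMF)"
```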
Rationality Related
Consider your appetite for disagreements
by Adam Zerner
4 illustrated examples of people disagreeing on a point because of minor differences. For instance, arguing whether a poker hand should have been folded, when both people believe it was a marginal call either way. Or arguing if a basketball player is the third or fifth best, when you both agree they’re top ten.
The author advocates that you have a small ‘appetite’ for these sorts of marginal disagreements, and spend little time on them. They also recommend communicating you have a large appetite for important and substantial disagreements.
The Balto/Togo theory of scientific development
by Elizabeth
In 1925, a relay of 150 dogs and 20 humans ran antitoxin serum across Alaska to end a diphtheria outbreak in Nome. The dog on the final and easiest leg was Balto, who became famous for it. The dog who ran the longest and hardest leg was Togo, who got comparatively little media attention.
A similar dynamic happens in science. Alfred Wegener is credited with discovering continental drift, but did no data collection and little synthesis of evidence (the idea already existed in some papers). Still, people remember him, and he inspired further research by advocating for an unproven idea. The author wonders how important this popularizing function is in general.
Calibration of a thousand predictions
by KatjaGrace
The author has made predictions in a spreadsheet for 4 years - as of now, ~1K are resolved. They created a calibration curve for ~630 predictions not about their own behavior, with 11 buckets, and found an average miscalibration error of only 3%. (These were primarily everyday life predictions such as if they'll be paid by x date, or invited to a certain party.) The accuracy was surprising because their internal experience of the predictions was ‘pulling a number out of thin air’. Accuracy for predicting their own behavior was much lower, particularly for the 35-55% range.
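For reference, a calibration curve like this can be computed in a few lines (a generic sketch using synthetic data, not the author’s spreadsheet or results):

```python
# Generic calibration-curve sketch with synthetic data (not the author's predictions).
import numpy as np

def calibration_curve(probs, outcomes, n_buckets=11):
    """Return (mean predicted prob, observed frequency, count) per probability bucket."""
    probs, outcomes = np.asarray(probs, dtype=float), np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, n_buckets + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs >= lo) & (probs <= hi) if hi == 1.0 else (probs >= lo) & (probs < hi)
        if mask.any():
            rows.append((probs[mask].mean(), outcomes[mask].mean(), int(mask.sum())))
    return rows

# Synthetic example: a well-calibrated forecaster over 1,000 predictions.
rng = np.random.default_rng(0)
p = rng.uniform(size=1000)
y = rng.uniform(size=1000) < p
for pred, obs, n in calibration_curve(p, y):
    print(f"predicted {pred:.2f}  observed {obs:.2f}  (n={n})")
# Average miscalibration error = mean absolute gap between the two columns.
```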
A common failure for foxes
by Rob Bensinger
In the parable of the Hedgehog and the Fox, the fox knows many things while the hedgehog knows one thing well. The author argues that people who see themselves as foxes often focus too much on RCTs over informal arguments, even when the RCT isn’t that relevant. This is because they want to feel like they ‘know things’ for sure, progress quickly in learning, and look intellectually modest (ie. ‘I’m just deferring to the data’).
Other
That one apocalyptic nuclear famine paper is bunk & Actually, All Nuclear Famine Papers are Bunk by Lao Mein
Some bloggers cite a study from Nature Food on why a full US <-> Russia nuclear exchange might collapse civilization. The paper assumes a 10C drop in temperatures from nuclear winter will reduce farm yields by 90%. However, it also assumes no adaptation by humans - that we’ll keep the same crop selection and crop locations. Lao argues this is not realistic and makes the conclusions irrelevant.
There have also been claims by people such as Peter Zeihan that the world only has ~2 months' worth of food in reserve. Similarly, the paper above assumes that all food stores will be used up in the first year after an attack. By examining US grain reserves, Lao finds there is enough to last the US ~half a decade, even without considering other food sources.
Transformative VR Is Likely Coming Soon
by jimrandomh
The author estimates 2.5 years until VR is better than in person for most meetings, given that Oculus announced a new headset last week which tackles many of the issues with previous VR meetings (eg. not being able to see the real world, hidden facial expressions, and audio latency). They expect the shift to be sudden and impactful - particularly on organizational structures and remote work.
Towards a comprehensive study of potential psychological causes of the ordinary range of variation of affective gender identity in males
by tailcalled
Someone who identifies as male might vary from being distressed by the idea of being a woman, to being neutral about it, to being positive about it. The author studies this variation via surveys of cis men who don’t identify as trans or gender-questioning, and tries to correlate it with other factors such as gender conservatism or extraversion.
Didn’t Summarize
Six (and a half) intuitions for KL divergence by TheMcDouglas
Prettified AI Safety Game Cards by abramdemski
Contra shard theory, in the context of the diamond maximizer problem by So8res
This Week on Twitter
AI
2022’s State of AI report is live. Key trends (aggregated in this tweet) include:
DeepMind released a paper about self-supervised training on video instead of image datasets (richer data). (tweet)
EA
The US Supreme Court heard arguments on whether or not to uphold Prop 12 - a California law banning the sale of pork from pigs kept in spaces too small for them to turn around. This could have implications for the types of further laws that can be passed. Decision due in late June 2023. (tweet) (article)
New paper addressing how natural risks might be higher because of civilization. Eg. pandemics are riskier because of travel, and space weather is riskier because it can affect technology such as power grids. (tweet) (paper)
National Security
New US national security strategy includes explicit commitments to strengthening the BWC (biological weapons convention) and the need for more focus on deliberate + accidental threat mitigation. (tweet)
Putin said that more missile strikes against Ukraine are ‘not necessary’ and that the aim isn’t to destroy the country. (tweet)
Iran sending drones and missiles to Russia. The drones are already being used by Russia against Ukraine. (tweet) (article)
Science
New ‘Our World in Data’ section on which countries routinely administer vaccines. (tweet) (page)