Supported by Rethink Priorities
Also posted on the EA forum.
This is part of a weekly series - you can see the full collection here. The first post includes some details on purpose and methodology.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell who is now producing these! The first episode is now up, and this week's will be up soon. Subscribe on your favorite podcast app by searching for 'Effective Altruism Forum Podcast'.
Top Readings / Curated
Designed for those without the time to read all the summaries. Everything here is also within the relevant sections later on so feel free to skip if you’re planning to read it all.
Announcing the Future Fund's AI Worldview Prize
by Nick_Beckstead, leopold, William_MacAskill, ketanrama, ab
The Future Fund believes they may be significantly under- or overestimating catastrophic risk from AI, or focusing on the wrong aspects of it. Since this affects the distribution of hundreds of millions of dollars, they are offering prizes of up to $1.5M for arguments that significantly shift their credences on when and how AGI will be developed and its chance of causing a catastrophe. A secondary goal is to test the efficacy of prizes at motivating important new insights. Smaller prizes are also available. Entries are due Dec 23rd.
Announcing the Rethink Priorities Special Projects Program
by Rachel Norman
Rethink Priorities is launching a program to help start promising EA initiatives. Some will be incubated internally and later spun off into independent organizations; others, run externally, will be fiscally sponsored and given operations support. The team is currently incubating / sponsoring eight projects ranging from insect welfare to community building in Brazil to AI alignment research, and is looking for expressions of interest from those interested in fiscal sponsorship or project incubation.
Nearcast-based "deployment problem" analysis
by HoldenKarnofsky
The AI deployment problem is the question of how and when to (attempt to) build and deploy powerful AI when unsure about safety and about how close other actors are to deploying. This post assumes a major AI company (‘Magma’) believes it is 6 months to 2 years away from transformative AI (TAI) via human feedback on diverse tasks (HFDT), and asks what both Magma and an organization dedicated to tracking and censoring dangerous AI (‘IAIA’) would ideally do in this situation.
Holden splits this into three phases, with suggestions in a longer summary under the LessWrong -> AI section below. This forecast implies actions we should take today, such as creating an IAIA equivalent, sharing information selectively, and labs doing outreach and advocacy (a non-comprehensive list of Holden’s suggestions).
EA Forum
Philosophy and Methodologies
Just Look At The Thing! – How The Science of Consciousness Informs Ethics
by algekalipso
Ethical theories often contain background assumptions about consciousness, personal identity, and valence. Instead of arguing over these assumptions or theorizing around them, we can test them against experience. For instance, mixed valence states (pain and pleasure together) are relevant to negative utilitarianism. From real life, we can observe that a pleasant experience (eg. music) occurring during a painful experience (eg. a stomach ache) can mute the pain. This helps us refine the negative utilitarian view, and opens up new questions, such as whether this still holds for extensive pains, or whether they are always net negative - which we can also test empirically.
Other examples are given, including high-dose DMT as an experience that isn’t successfully captured by many philosophical frameworks. The author argues this approach becomes more important as we open up futures where more complex valences and states of consciousness are available.
Object Level Interventions & New Projects
Quantified Intuitions: An epistemics training website including a new EA-themed calibration app
by Sage
Quantified Intuitions helps users practice assigning credences to outcomes with a quick feedback loop. Currently it includes a calibration game, and pastcasting (forecasting on resolved questions you don’t know about already).
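For readers unfamiliar with calibration scoring, below is a minimal sketch of how forecasts with stated probabilities might be scored using the Brier score. It is purely illustrative - the forecasts are made up and this is not Quantified Intuitions' actual code or scoring rule.

```python
# Minimal sketch of calibration scoring (illustrative only - not Quantified Intuitions' code).
# Each entry: (stated probability that the answer is true, whether it was actually true).
forecasts = [
    (0.9, True), (0.7, True), (0.7, False), (0.6, True),
    (0.8, True), (0.3, False), (0.3, True), (0.95, True),
]

# Brier score: mean squared error between stated probability and outcome (0 or 1).
brier = sum((p - (1.0 if outcome else 0.0)) ** 2 for p, outcome in forecasts) / len(forecasts)
print(f"Brier score: {brier:.3f}  (0 = perfect, 0.25 = always saying 50%)")

# Crude calibration check: how often do ~60-80% claims come true?
bucket = [outcome for p, outcome in forecasts if 0.6 <= p <= 0.8]
print(f"'60-80%' answers correct: {sum(bucket)}/{len(bucket)}")
```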
Announcing the NYU Mind, Ethics, and Policy Program
by Sofia_Fogel
The NYU Mind, Ethics, and Policy Program (MEP) will conduct and support foundational research about the nature and intrinsic value of nonhuman minds (biological and artificial). Several projects are currently underway, including a free public talk on whether large language models are sentient (signup here), and an award and workshop seeking published papers on animal and AI consciousness (details here).
The Next EA Global Should Have Safe Air
By joshcmorrison
Indoor air quality is a promising biosafety intervention, but experimental evidence on which methods to use is sparse. EA being an early adopter of interventions like air filters or UV lights, and recording outcomes, would help assess what works - and help the community reduce infections at the same time. EAG is a natural candidate for this due to its size. Other possibilities include EA office / coworking spaces or group houses.
Announcing the Rethink Priorities Special Projects Program
by Rachel Norman
Rethink Priorities is launching a program to help start promising EA initiatives. Some will be incubated internally and later spun off into independent organizations; others, run externally, will be fiscally sponsored and given operations support. The team is currently incubating / sponsoring eight projects ranging from insect welfare to community building in Brazil to AI alignment research, and is looking for expressions of interest from those interested in fiscal sponsorship or project incubation.
Announcing EA Pulse, large monthly US surveys on EA
by David_Moss, Jamie Elsey
Rethink Priorities is running a monthly survey of the US public aimed at understanding perceptions of Effective Altruism and related cause areas, funded by the FTX Future Fund.
This includes tracking general attitudes over time, as well as ad-hoc questions such as testing support for particular policies or EA messaging. Requests for both sections are welcome - ideally by October 20th.
The Hundred Billion Dollar Opportunity that EAs mostly ignore
by JeremiahJohnson
Individual charitable donations are in the hundreds of billions: last year, individuals, bequests, foundations, and corporations gave an estimated $484.85 billion to charities. The largest shares go to religious donations and to educational donations to existing 3-4 year colleges. Only 6% is donated internationally. More public outreach, approachable arguments and asks, specialized charity evaluators, and public attacks on practices like Ivy League university endowments could help direct this money more effectively.
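For scale, a quick back-of-the-envelope on the figures quoted above (the $484.85 billion total and 6% international share come from the summary; the code just does the multiplication):

```python
# Rough arithmetic on the giving figures quoted above.
total_giving = 484.85e9   # estimated charitable giving last year, per the post
international_share = 0.06

international = total_giving * international_share
domestic = total_giving - international
print(f"International: ~${international / 1e9:.1f}B")
print(f"Everything else: ~${domestic / 1e9:.1f}B")
```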
EA’s brain-over-body bias, and the embodied value problem in AI alignment
by Geoffrey Miller
AI alignment might benefit from thinking about our bodies’ values, not just our brains’. Human behavior is often an interplay between the two: hunger and eating, for example, involve some cognition, but also the stomach, gut, hunger-related hormones, etc., which in some sense have ‘goals’, such as keeping blood glucose within certain bounds.
Aligning AI with only our cognitive values has failure modes - eg. many people like donuts, but our bodies usually prefer leafy greens. If an AI is trained to care for human bodies the way the body ‘wants’, and to keep its systems running smoothly, this makes us safer from these failure modes. It also allows us to include the preferences of infants, fetuses, those in comas, and others unable to communicate cognitively. We can learn these body values and goals by pushing forward evolutionary biology.
Shahar Avin on How to Strategically Regulate Advanced AI Systems
by Michaël Trazzi
Excerpts and links to a discussion with Shahar Avin, a senior researcher at the Center for the Study of Existential Risk. Key points from the excerpts include:
A lot of cutting-edge AI research is probably private.
Security failures are unavoidable with big enough systems.
Companies should be paying the red-tape cost of proving their systems are safe, secure, and aligned. Big companies are used to paying to meet regulations.
We should regulate now, but make it ‘future ready’ and updateable.
Two reasons we might be closer to solving alignment than it seems
by Kat Woods, Amber Dawn
There’s a lot of pessimism in the AI safety community. However, keep in mind:
All of the arguments saying that it’s hard to be confident that transformative AI isn’t just around the corner also apply to safety research progress.
It’s still early days, and we’ve had about as much progress as you’d predict given that until recently only double-digit numbers of people were working on the problem.
Opportunities
Announcing the Future Fund's AI Worldview Prize
by Nick_Beckstead, leopold, William_MacAskill, ketanrama, ab
The Future Fund believes they may be significantly under- or overestimating catastrophic risk from AI, or focusing on the wrong aspects of it. Since this affects the distribution of hundreds of millions of dollars, they are offering prizes of up to $1.5M for arguments that significantly shift their credences on when and how AGI will be developed and its chance of causing a catastrophe. A secondary goal is to test the efficacy of prizes at motivating important new insights. Smaller prizes are also available. Entries are due Dec 23rd.
CEA's Events Team is Hiring!
by Amy Labenz
Hiring for an EA Global Events Associate, Retreats Associate, and Community Events Associate. The team has grown rapidly and is on track to facilitate 4x as many connections in the EA community this year as in 2021. Apply by October 11th.
$13,000 of prizes for changing our minds about who to fund (Clearer Thinking Regrants Forecasting Tournament)
by spencerg
$13K of prizes are up for grabs. Win some by either changing Clearer Thinking’s mind about which of 28 finalists to fund (and by how much), or by being a top-20 forecaster of which projects they end up funding.
[Open position] S-Risk Community Manager at CLR
by stefan.torges
This role will be across areas like event & project management, 1:1 outreach & advising calls, setting up & improving IT infrastructure, writing, giving talks, and attending in-person networking events. Previous community building experience is helpful but not required. Deadline Oct 16th.
$5k challenge to quantify the impact of 80,000 hours' top career paths
by NunoSempere
$5k prize pool for quantitative estimates of the value of some or all of 80,000 hours' top 10 career paths. Flexibility on units (eg. QALYs, % x-risk reduction) and methodology. Deadline Nov 1st.
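To give a flavor of what such an estimate might look like, here is a minimal Monte Carlo sketch with invented toy numbers - not a figure from the challenge or from 80,000 Hours - showing how uncertainty might be propagated through a simple 'probability of success times impact' model:

```python
import random

# Toy Monte Carlo estimate of the value of one career path (illustrative numbers only).
random.seed(0)

def sample_career_value():
    p_success = random.uniform(0.05, 0.3)            # chance the career has its intended effect
    impact_if_success = random.lognormvariate(8, 1)  # QALYs produced if it does (toy distribution)
    return p_success * impact_if_success

samples = sorted(sample_career_value() for _ in range(100_000))
mean = sum(samples) / len(samples)
print(f"mean ~{mean:,.0f} QALYs, median ~{samples[len(samples) // 2]:,.0f}, "
      f"90th pct ~{samples[int(0.9 * len(samples))]:,.0f}")
```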
The $100,000 Truman Prize: Rewarding Anonymous EA Work
by Drew Spartz
The Truman Prize, now live on EA prize platform Superlinear, recognizes Effective Altruists with $5,000-$10,000 prizes for declining credit in order to increase their impact, in ways that can't be publicized directly. Submissions are now open.
Community & Media
Let's advertise infrastructure projects
by Arepo
There are many projects providing free or cheap support to EAs globally, but they aren’t well known. This post and its comment section aim to collect them. Multiple examples are linked in each of the following categories: coworking and socializing spaces, professional services, coaching, and financial support.
Levels of donation
by vipulnaik
Personas of donors by donation amount - eg. a retail donor (<$1K) is less likely to care about transaction costs or to investigate charities. Each level is separated by a factor of 10. The author also discusses how a person might move up levels (donate substantially more), eg. via increasing income or pooling donations with others.
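Since levels are separated by factors of 10, a donor's level is essentially the order of magnitude of their donations. A minimal sketch (the cutoff labels are ours and only loosely follow the post's personas):

```python
import math

# Map an annual donation amount to an order-of-magnitude "level".
# Labels are illustrative; the post defines its own personas.
LABELS = {2: "retail (<$1K)", 3: "$1K-$10K", 4: "$10K-$100K", 5: "$100K-$1M", 6: "$1M+"}

def donation_level(amount_usd: float) -> str:
    magnitude = int(math.log10(max(amount_usd, 1)))
    return LABELS.get(min(magnitude, 6), LABELS[2])

for amount in [300, 5_000, 75_000, 2_000_000]:
    print(f"${amount:>10,}: {donation_level(amount)}")
```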
EA for people with non-technical skillsets
by Ronja
EA is big, and we need a wide range of skills. This isn’t always obvious due to the visibility of AI safety and biosecurity discussions. Examples include operations management, design, communications, policy, and historical or behavioral research. There are also EAs using their existing niches, eg. Kikiope Oluwarore, a veterinarian who co-founded Healthier Hens, or Liv Boeree, a poker player who uses her network to encourage other poker players to donate.
The author suggests increasing the visibility of EAs doing this wide range of work, eg. via a ‘humans of EA’ monthly feature.
Author Rutger Bregman about effective altruism and philanthropy
by Sebastian Schwiecker
Linkpost for an interview with Rutger Bregman on his personal view of philanthropy. He’s a historian and author who is good at reaching people outside the EA community - for example, he was mentioned more than any other individual in Effektiv Spenden’s post-donation survey.
The Mistakes of Focusing on Student Outreach in EA
by DavidNash
EA started in universities, and community building efforts have heavily focused there. Even if students are on average more receptive and flexible, continuing this trend can be a mistake: it risks casting EA as a ‘youth movement’, loses out on the skills and networks of experienced professionals, and creates a lack of mentorship.
The author suggests redirecting resources at the margin, more strongly encouraging students toward general community building, skill building, or object-level work (rather than becoming university group organizers).
Criticism of the 80k job board listing strategy
by Yonatan Cale
The 80K job board doesn’t list only highly impactful jobs - it also lists jobs that are good for career capital. In an informal Twitter poll, 55% of respondents weren’t aware of this and considered it important. The author suggests only including high-impact jobs, allowing community discussion of impact level, and better communicating the current state.
Kush_kan from 80K comments that they plan to visually distinguish top-impact roles, update the job board tagline and FAQ, link organizations’ EA Forum pages, and add a new feedback form to help address this. They also note most roles are there for a mix of impact and career capital reasons.
What Do AI Safety Pitches Not Get About Your Field?
by Aris Richardson
Arguments that misunderstand a field can reduce credibility and put experts from those fields off. The question author wants to collect examples to minimize this happening with AI safety. Responses include:
Psychology (there is no clear definition of ‘intelligence’ that encompasses eg. cultural intelligence)
Economics (forecasting double-digit GDP growth based on AI feels dubious)
Anthropology (the ‘AI is to us what we are to chimps’ analogy misunderstands how humans acquired power and knowledge, ie. via cumulative cultural changes over time)
Summarizing the comments on William MacAskill's NYT opinion piece on longtermism
by West
Numerical summary of 300 comments on MacAskill’s NYT piece on longtermism: 60 were skeptical and 42 were positive. The most common skepticism themes were ‘Our broken culture prevents us from focusing on the long-term’ (20) and ‘We're completely doomed, there's no point’ (16). Few commenters engaged on biorisk or AI, instead associating concern for the long-term future with environmental concern. Many also assumed ‘long-term’ referred to 2-7 generations.
Optimizing seed:pollen ratio to spread ideas
by Holly_Elmore
EA community building often talks about the ‘funnel’, and more recently the focus has been on creating core EAs within that funnel. Another model: outreach can be seed-like (putting lots of resources into a few promising proto-EAs) or pollen-like (low resource per person, but spread widely). Like plants, we need to get the ratio right. The author suggests we’re currently too weighted towards seeds and should be doing more pollen-like efforts - spreading ideas to a wide audience. Sometimes it will stick and someone will become a core EA, despite not having had much dedicated support.
Announcing “Effective Dropouts”
by Yonatan Cale, Gavin, Vardev, Jonny Spicer
Author’s tl;dr: “Effective Dropouts” is meant to be a casual fun excuse for reminding people that dropping out of a degree can be a good decision, as a counter to all the existing pressure for degrees being the default/only/obvious way. The rest of the post is mostly a joke.
Establishing Oxford’s AI Safety Student Group: Lessons Learnt and Our Model
by CharlieGriffin, juliakarbing
The group aimed to increase the number and quality of technical people pursuing a career in AI safety research. Key lessons included:
Targeting an AI audience instead of an EA one greatly increased the technical audience. AI safety was still an easy sell without EA / longtermist philosophy.
Socials after talks were of high value.
Expert supervisors providing research questions and limited (~1hr per week) support to groups of students was an effective development opportunity.
ETGP 2022: materials, feedback, and lessons for 2023
by trammell
Lecture slides and exercises from the course ‘Topics in Economic Theory and Global Prioritization’, designed primarily for economics graduate students considering careers in global priorities research. The program included lunches, social opportunities, and shared accommodations, and will run again in 2023.
All participants were very satisfied or satisfied with the course, and over half (18/34) wrote that it may have or definitely changed their plans. Social aspects were the favorite portion.
My Personal Story of Rejection from EA Global
by Constance Li
The author has been involved with EA for over a decade, making significant life-path changes such as earning to give (in medicine), launching a $60K+ per month business and donating the profits, and running successful cage-free egg campaigns. They’ve felt alternately welcomed (EAGx Virtual, local groups, and a supportive CEA member) and disillusioned (multiple EAG rejections with little context and unsatisfactory replies) by the EA community.
The author suggests improvements to the EAG admissions process:
Consider how rejection can feel like a judgment of applicants’ worth. Send better responses, which could include:
Links to sensitive explanations of common reasons for rejection.
A semi-automated system for feedback, eg. flagging applications by primary reason for rejection and sending templated emails based on that.
Training CEA staff on authentic relating / non-violent communication methods.
Analyze the cost of rejection, experiment with interventions to reduce it, and publish the results.
Address the power of a small group (CEA) in admissions via transparency, blinding admissions, and potentially renaming EAG (eg. to CEA Global).
Why Wasting EA Money is Bad
by Jordan Arel
EA spends on luxuries to save time, increase productivity, or build community (eg. flights, Uber Eats, fancy retreats). This is usually justified with the argument that some EA work can be incredibly impactful - to the point where 30 minutes of time might be worth more than a year of someone’s life (which costs ~$200 to add via donations). However, the author argues frugality is important to EA’s internal and external reputation and to a healthy community, and can be more motivating than luxury.
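To make the trade-off concrete, here is the arithmetic implied by the '30 minutes vs. a year of life' comparison above. The ~$200 per life-year figure is the summary's; the 2,000-hour work year is our assumption:

```python
# Back-of-the-envelope on the time-vs-donations trade-off described above.
cost_per_life_year = 200          # ~$ to buy a year of life via donations, per the summary
minutes_claimed_equivalent = 30   # the claim: 30 minutes of some EA work is worth more than that
work_hours_per_year = 2_000       # assumption: a full-time work year

implied_value_per_hour = cost_per_life_year / (minutes_claimed_equivalent / 60)
print(f"Implied value of that person's time: >${implied_value_per_hour:,.0f}/hour")
print(f"Over a work year: >${implied_value_per_hour * work_hours_per_year / 1e6:.1f}M of donation-equivalent value")
```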
CERI Research Symposium Presentations (incl. Youtube links)
by Will Aldred
Links to talks by 22 fellows from CERI (Cambridge Existential Risk Initiative)’s research symposiums over the past 2 years, split by subject areas. These were given by summer research fellows, and cover AI risk (technical & governance), biorisk, nuclear risk, extreme climate change, and meta x-risk.
Learning from Matching Co-Founders for an AI-Alignment Startup
by Patrick Gruban, LRudL, Maris Sala, simeon_c, Marc Carauleanu
Beth Barnes shared an idea and proposal for human data for AI alignment with multiple people at EAG London. Six interested people then self-organized cofounder selection.
Their steps were to ask CE for advice, answer ‘50 questions to explore with cofounders’, work weekly in pairs on test tasks for five weeks, and meet for an in-person workshop to finalize preferences and choose a co-founding pair.
Participants rated the pair work highest, and suggested focusing it on customer interviews to better define the intervention (in this case, the final cofounder pair dropped the project after doing this stage, post cofounder matching). The reveal of preferences on who to co-found with was also highly rated and successfully selected one pair. Other stages could have been cut or done informally to speed up the process.
LW Forum
AI Related
Quintin's alignment papers roundup - week 2
by Quintin Pope
Weekly themed round-ups of papers, published each Monday. They include links, abstracts and Quintin’s thoughts. This week’s theme is the structure/redundancy of trained models, as well as linear interpolations through parameter space.
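For readers unfamiliar with the 'linear interpolations through parameter space' idea, the sketch below shows the basic operation on two toy weight vectors - evaluating a loss at points along the straight line between them. It is a generic illustration with a made-up loss, not code from any paper in the roundup:

```python
import numpy as np

# Toy illustration of linear interpolation between two sets of model parameters.
rng = np.random.default_rng(0)
theta_a = rng.normal(size=10)   # stand-ins for two trained models' flattened weights
theta_b = rng.normal(size=10)

def loss(theta):
    # Stand-in loss surface; real work would evaluate the network on data.
    return float(np.sum((theta - 1.0) ** 2))

for alpha in np.linspace(0.0, 1.0, 5):
    theta = (1 - alpha) * theta_a + alpha * theta_b   # point on the line between the two models
    print(f"alpha={alpha:.2f}  loss={loss(theta):.2f}")
```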
Summaries: Alignment Fundamentals Curriculum
by Leon Lang
Linkpost for the author’s summaries of most core readings, and many further readings, from the alignment fundamentals curriculum composed by Richard Ngo.
Alignment Org Cheat Sheet
by Akash, Thomas Larsen
Describes the work of AI Safety researchers, research organizations, and training programs in one sentence each. Not intended to be comprehensive, but includes 15 researchers / research orgs, and 8 training / mentoring programs.
Understanding Infra-Bayesianism: A Beginner-Friendly Video Series
by Jack Parker, Connall Garrod
New video series covering what infra-Bayesianism (IB) is at a high level and how it's supposed to help with alignment, assuming no prior knowledge. It also targets those who want to gain mastery of the technical details behind IB so that they can apply it to their own alignment research, and is good preparation for more technical sources like the original IB sequences.
Nearcast-based "deployment problem" analysis
by HoldenKarnofsky
The AI deployment problem is the question of how and when to (attempt to) build and deploy powerful AI when unsure about safety and about how close other actors are to deploying. This post assumes a major AI company (‘Magma’) believes it is 6 months to 2 years away from transformative AI (TAI) via human feedback on diverse tasks (HFDT), and asks what both Magma and an organization dedicated to tracking and censoring dangerous AI (‘IAIA’) would ideally do in this situation.
Holden splits this into three phases, with suggestions below. This forecast implies actions we should take today, such as creating an IAIA equivalent, sharing information selectively, and labs doing outreach and advocacy (a non-comprehensive list of Holden’s suggestions).
Phase 1: Before Magma develops aligned TAI
Magma should:
Focus on developing aligned TAI asap, before other actors.
Reduce risk from other actors - prioritize internal security, reduce ‘race’ pressure by making deals with other AI companies, and educate other players on the risks of misaligned AI.
IAIA should:
Monitor all major AI development projects, looking for signs of dangerous AI, ensuring sufficient safety measures, information security, and selective info sharing, and taking action where they find issues.
Serve as a hub for public goods such as education on AI alignment or coordinating deals between different orgs.
Selective info sharing is important for both parties. ‘Dual-use’ information (helpful both for avoiding misaligned AI and for making powerful AI) should be shared more readily with cautious parties.
Phase 2: Magma has developed aligned TAI, but other actors may develop misaligned TAI
Magma & IAIA should:
Focus on deploying AI systems that can reduce the risk of other actors causing a catastrophe (eg. systems that can align more powerful systems, patch cybersecurity holes, cover more uses to reduce the space available for other AIs, detect and obstruct misaligned AIs, enforce safety procedures, or offer general guidance).
Reduce misuse risk of the aligned TAI, eg. deploy with appropriate oversight, ensure users have good intentions, and bake in some resistance to misuse.
If, despite this, dangerous AI is catching up, Magma & IAIA should consider drastic action:
Recommend governments suppress AI development by any means necessary.
Develop AI capabilities to persuade or threaten actors to achieve the above - even if that holds some misalignment risk in itself.
Phase 3: No actors are likely to be able to develop misaligned TAI
Magma & IAIA should:
Focus on avoiding lock-in of bad worlds where one player has seized global power.
Broker peaceful compromises or coalitions.
Design AIs that may help humans with moral progress.
Towards deconfusing wireheading and reward maximization
by leogao
A response to “Reward is not the optimization target”. The author argues that it’s not possible for reinforcement learning policies to care about “reward” in an embedded setting, but wireheading in RL agents is still possible because wireheading doesn’t mean “the policy has reward as its objective”.
Public-facing Censorship Is Safety Theater, Causing Reputational Damage
by Yitz
Censorship in AI (for instance, preventing the user from viewing model results that contain swear words or real faces) is an issue because:
a) Like other forms of social censorship, it can have negative social effects - eg. see the controversy over social media censorship.
b) It confuses the term ‘AI safety’, such that people associate it with censorship under a false banner of ‘safety’ and take a negative view of the field overall.
The author suggests reducing censorship in public-facing models, using differentiated terminology for different types of AI safety, and more media interaction to shape opinion. A top comment notes that ‘safety’ in the online context is already strongly associated with the prevention of things like pornography and racism, and that ‘AI security’ may be a better term to use.
Toy Models of Superposition
by evhub
Linkpost for a new Anthropic paper exploring superposition - where one neuron in a network is used to capture several unrelated concepts, which is a problem for interpretability.
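As a rough intuition pump (not the paper's actual setup), the sketch below squeezes more features than dimensions into a small vector space; reading any one feature back out then picks up interference from the others, which is the basic superposition phenomenon:

```python
import numpy as np

# Toy illustration of superposition: 8 features squeezed into 4 dimensions.
rng = np.random.default_rng(0)
n_features, n_dims = 8, 4

# Each feature gets a unit direction in the low-dimensional space.
directions = rng.normal(size=(n_features, n_dims))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Represent an input where only feature 0 is active.
feature_values = np.zeros(n_features)
feature_values[0] = 1.0
hidden = feature_values @ directions          # shape (n_dims,)

# Reading features back out via dot products shows interference between features.
recovered = directions @ hidden
print(np.round(recovered, 2))  # element 0 is ~1, the others are nonzero "interference"
```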
Research & Productivity Advice
How my team at Lightcone sometimes gets stuff done
by jacobjacob
Lightcone’s founder on practices that seem to have helped with high output. These primarily revolve around removing blockers, and include:
Have a single decision-maker to resolve ties.
Have clear top priorities, and set aside time where everyone is focused on just them (and time specifically for other stuff, eg. a day for less important meetings).
Work together (team in the same room, work in pairs or trios sometimes, no-one remote, everyone full time; if you’re blocked ask for help immediately, and if you’re tagged respond immediately).
Keep context high (lots of 1-1s; never DM in Slack - use a public channel named for your 1-1; make time for chit-chat).
Losing the root for the tree
by Adam Zerner
Story-like examples of getting lost in the details, and why it’s worthwhile to step back and ask ‘what was this aiming at? Is it the most effective way to get it?’ Hierarchical tree representations with lines weighted on impact can be a clear way to visualize this.
You Are Not Measuring What You Think You Are Measuring
by johnswentworth
When you measure a metric, you usually don’t learn what you think you do. For instance, a company measures click-through rate (CTR) on two sign-up flows to determine which info is more important to put upfront, but really the CTR changed due to latency differences between the two options. Solution: measure lots of things, and you might figure out what’s really going on. Some tools are better at this (eg. a microscope gives heaps of info; an A/B test gives just one data point). In addition to your own experimental design, keep this in mind when reading others’ research - don’t just take in the abstract and the p-values; look at the data and cross-reference other papers to build a picture of what’s going on.
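A minimal simulation of the sign-up-flow example (all numbers invented) shows how a latency difference alone can masquerade as a difference in how compelling the content is:

```python
import random

# Simulate the A/B test described above: the two flows differ in latency, not persuasiveness.
random.seed(42)

def simulate_flow(n_visitors, base_ctr, extra_latency_s):
    # Assumption: each extra second of load time loses ~5% of would-be clickers.
    effective_ctr = base_ctr * (1 - 0.05 * extra_latency_s)
    clicks = sum(random.random() < effective_ctr for _ in range(n_visitors))
    return clicks / n_visitors

ctr_a = simulate_flow(20_000, base_ctr=0.10, extra_latency_s=0.0)  # flow A: fast
ctr_b = simulate_flow(20_000, base_ctr=0.10, extra_latency_s=2.0)  # flow B: same content value, slower

print(f"Flow A CTR: {ctr_a:.3%}   Flow B CTR: {ctr_b:.3%}")
print("Same underlying preference for the content - the gap is entirely latency.")
```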
Some notes on solving hard problems
by Joe Rocca
Tips for solving hard problems:
Start simple
Question assumptions
Make it easier to hold in your head
Twitter Polls: Evidence is Evidence
by Zvi
Breakdown of an argument on Twitter about Twitter polls. One user argues they aren’t representative and contain response bias, and therefore offer actively ‘bad’ (ie. misleading) evidence and shouldn’t be used. The post author argues that any evidence is good as long as you update on it properly, considering the context (such as the sample and any likely bias), and therefore we should do more Twitter polls since they’re cheap and neat. There are methods to help with bias, such as comparing results between polls by the same user.
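A small sketch of the 'update on it properly' point: treat a biased poll as a noisy signal and fold it into a Bayesian update rather than discarding it. The likelihoods below are invented purely for illustration:

```python
# Toy Bayesian update on a biased Twitter poll about whether most people prefer X.
# Hypotheses: H = "a majority really prefers X" vs not-H.
prior_h = 0.5

# Invented likelihoods: the poll says "X wins". Because the poll audience is skewed toward X fans,
# a pro-X result is fairly likely even if H is false - but still more likely if H is true.
p_poll_says_x_given_h = 0.9
p_poll_says_x_given_not_h = 0.6

posterior_h = (p_poll_says_x_given_h * prior_h) / (
    p_poll_says_x_given_h * prior_h + p_poll_says_x_given_not_h * (1 - prior_h)
)
print(f"Posterior P(H) after the biased poll: {posterior_h:.2f}")  # 0.60 - a modest, not zero, update
```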
A game of mattering
by KatjaGrace
A productivity hack for when the number of to-do tasks feels overwhelming. The author organizes the tasks into time boxes during the day and tries to ‘collect’ as many as possible by completing them before the end of their box. If they complete one early, they can go back and try to retrieve an earlier one. This helps by focusing them on one task at a time and making every minute feel meaningful.
Methodological Therapy: An Agenda For Tackling Research Bottlenecks
by adamShimi, Lucas Teixeira, remember
Interviews with AI alignment researchers found they reported the following as difficult and costly: running experiments, formalizing intuitions, unifying disparate insights into a coherent frame, and proving theorems. Conjecture’s epistemology team questions whether these four things are the right approaches, or just ‘cached patterns’ of how to do research that could be adjusted to better suit the context. They plan to consult with more alignment researchers; to question, refine, and replace cached patterns that aren’t serving their purpose; and thereby to improve alignment research. They call this ‘methodological therapy’, and have developed some framing questions to kick off the work (eg. ‘what are the researchers trying to accomplish?’).
Scraping training data for your mind
by Henrik Karlsson
“To get good at something—parenting, writing code, doing research—you need to internalize examples of prime achievements in that field.” Look for awards, review articles of scientific fields, who is cited etc. to generate a list of those at the top of your domain. Then study those people’s processes - your brain will learn from the training data.
Or in short: Identify your aim. Locate the prime achievements in this field, and observe them in a messy living context. Reflect on how your understanding of the world has changed. Is this a fitting method for you? If not, course correct.
Fake qualities of mind
by Kaj_Sotala
If you’re not feeling motivated, you might push yourself through something instead. But if that becomes the norm, you can forget that ‘pushing yourself’ was a substitute for true motivation to begin with. This pattern can happen in many areas, eg. ‘doing’ empathy vs. ‘having’ empathy. A common trigger is that stress blocks the useful / happy mind state, so you substitute a forced one, stay stressed, and can’t get the original back. The first step to fixing this is to notice it.
Other
Gene drives: why the wait?
by Metacelsus
Gene drives work by transmitting an allele to 100% of offspring, so it soon covers the whole population. In 2018, CRISPR allowed the Crisanti lab to create a gene drive in a lab environment that suppressed all reproduction in a major malaria-transmitting mosquito species. However, it has still not been released in the wild, for two reasons:
Possibility of the mosquitoes generating resistant alleles - so the population is not eliminated, and future gene drives become harder. (Being addressed via larger tests to ensure this doesn’t occur.)
If done without good local and government buy-in, it could cause a backlash that restricts the development of other gene drives. (Being addressed via building community consensus.)
In the meantime, ~1.6K people die from malaria every day. Is the wait worth it?
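To illustrate the '100% transmission' mechanism described above, here is a toy simulation comparing allele spread under ordinary Mendelian inheritance (a carrier parent passes the allele on half the time) with an idealized gene drive (passed on every time). It deliberately ignores fitness costs, resistance alleles, and population structure - the very complications the post discusses:

```python
import random

# Toy comparison: fraction of the population carrying an allele over generations.
random.seed(1)

def spread(transmission_prob, generations=12, pop_size=10_000, initial_carriers=100):
    carrier_freq = initial_carriers / pop_size
    history = [carrier_freq]
    for _ in range(generations):
        new_carriers = 0
        for _ in range(pop_size):
            parent_a_carrier = random.random() < carrier_freq
            parent_b_carrier = random.random() < carrier_freq
            if parent_a_carrier and parent_b_carrier:
                inherits = True
            elif parent_a_carrier or parent_b_carrier:
                inherits = random.random() < transmission_prob
            else:
                inherits = False
            new_carriers += inherits
        carrier_freq = new_carriers / pop_size
        history.append(carrier_freq)
    return history

mendelian = spread(transmission_prob=0.5)   # stays near its starting frequency
gene_drive = spread(transmission_prob=1.0)  # roughly doubles each generation until it saturates
print("Mendelian carrier frequency: ", [f"{x:.2f}" for x in mendelian])
print("Gene-drive carrier frequency:", [f"{x:.2f}" for x in gene_drive])
```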
LW Petrov Day 2022 (Monday, 9/26)
by Ruby
Petrov Day commemorates Stanislav Petrov avoiding a possible nuclear war by reporting sensor warnings of a nuclear attack as a false alarm. LW will celebrate with a button that can bring down the site’s frontpage for everyone, available anonymously to all existing users with non-negative karma.
Announcing $5,000 bounty for ending malaria
by lc
The author will give $5K to anyone who reduces malaria by 95%+ without causing negative effects that outweigh that, because it seems like such an achievement should have a reward.
Orexin and the quest for more waking hours
by ChristianKl
People with the DEC2-P384R mutation produce more prepro-orexin and have a reduced need for sleep. Plausible reasons it wasn’t selected for evolutionarily include a higher food need and metabolism, and less anxiety / more risk-taking. Because orexin is a natural human protein it can’t be patented, so it hasn’t been studied in detail. Trials are underway for orexin agonists (which activate its receptors) for use in treating narcolepsy. The author suggests funding direct orexin supplementation studies - initially for narcolepsy to get FDA approval, and then for reducing general sleep needs.
Didn’t Summarize
Do bamboos set themselves on fire? by Malmesbury
The Redaction Machine by Ben (original fiction story about a world where you can wind someone back to a prior state)
Dath Ilan's Views on Stopgap Corrigibility by David Udell
Interpreting Neural Networks through the Polytope Lens by Sid Black, Lee Sharkey, Connor Leahy, beren, CRG, merizian, EricWinsor, Dan Braun
Funding is All You Need: Getting into Grad School by Hacking the NSF GRFP Fellowship
by hapanin
This Week on Twitter
AI
OpenAI trained an AI that “approaches human-level robustness and accuracy on English speech recognition”. (tweet) Other tweets note it works well even with fast speech and back-tracking. (tweet)
DeepMind released a new chatbot, Sparrow, trained with human feedback to follow rules like not impersonating humans, and to search the internet for helpful info. (tweet)
CSET shares that China is building “cyber ranges” to allow cybersecurity teams to test new tools and practice attack and defense. (article)
National Security
Vladimir Putin has threatened the use of nuclear weapons, saying Russia had ‘lots of weapons to reply’ to threats and that he was ‘not bluffing’. (article) The US and its allies have threatened catastrophic consequences if nuclear weapons are used. (article)
Current Metaculus forecasts put the chance of Russia launching a nuclear weapon before 2023 at 3%, after a brief period directly after the announcement when it rose to 6%. (metaculus forecast) The Metaculus forecast for the chance of a non-test nuclear detonation before 2035 (anywhere globally) has risen to 27%.
Russia has initiated partial conscription, mobilizing ~300K of its reserve forces. Thousands have fled Russia in response. (metaculus forecast) (article)
Science
NASA successfully crashed a spacecraft into an asteroid, to test the ability to deflect a problematic one. (tweet)
Scientists engineered mosquitoes that slow the growth of malaria-causing parasites in their guts, preventing transmission to humans - but this reduces mosquito lifespan, so it is likely not viable in practice. (tweet)
The White House announced $2B funding to launch a National Biotechnology and Biomanufacturing Initiative which aims to foster innovation, strengthen supply chains, mitigate biological risks and improve health outcomes. (tweet)