Supported by Rethink Priorities
This is part of a weekly series summarizing the top (40+ karma) posts on the EA and LW forums - you can see the full collection here. The first post includes some details on purpose and methodology.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell for producing these! Subscribe on your favorite podcast app by searching for 'Effective Altruism Forum Podcast'.
Author's note: I'm currently travelling, so there won't be any summary post next week. It'll be back the week after with the summaries for those 2 weeks of content at a higher karma bar.
Top / Curated Readings
FTX
FTX filed for bankruptcy on 11th Nov. The FTX Future Fund, financed by FTX and its founder SBF (Sam Bankman-Fried), had been a major funder in EA since its launch in Feb this year.
What happened: Binance, a competitor, sold their stake in FTX’s primary coin - this caused it to drop in value and led to the equivalent of a bank run, ie. a large number of clients attempting to withdraw funds. FTX didn’t have enough money to pay them, and there have been claims that this was due to SBF misusing customer funds to prop up his trading company, Alameda Research.
This has meant thousands of people with funds in FTX have likely lost them. If you need someone to talk to, even just to vent, CEA’s community health team is available or you can access peer support set up by Rethink Wellbeing and the Mental Health Navigator here.
FTX FAQ by Hamish Doodles provides a good overview of the situation as of Sunday 13th November.
Other key points from posts:
The EA community has distanced itself from FTX and is strongly condemning any misuse of customer funds that might have occurred. This includes statements by Robert Wiblin and William MacAskill, the FTX Future Fund team (who have resigned), and many others.
Upcoming grantees of the FTX Future Fund who have yet to receive money are unlikely to receive it. Money received from FTX after August 11th may also be clawed back (currently uncertain - there’s also some chance of earlier grants being clawed back, but this is much less likely). If this puts you in financial distress, limited emergency funding is available via Nonlinear or the AI Safety Microgrants round. These budgets are comparatively small, so donations are welcomed. Edit: Open Philanthropy has also opened up applications for grantees impacted by recent events.
Open Philanthropy will continue funding longtermist work and is not directly affected, but its funding bar will be raised. Most longtermist funding applications are paused while it re-evaluates this bar.
There have been various discussions on how to prevent something similar from happening again - including private investigators for this case, how much due diligence to apply to funders, better governance practices generally, and whistleblowing mechanisms.
This is an evolving situation, with many posts on the topic. Those with 40+ karma from the past week are below, with the most recent at the top:
Proposals for reform should come with detailed stories by Eric Neyman
The FTX crisis highlights a deeper cultural problem within EA - we don't sufficiently value good governance by Fods12
Effective Peer Support Network in FTX crisis (Update) by Emily, Inga
AI Safety Microgrant Round by Chris Leong, Damola Morenikeji, David_Kristoffersson
NY Times on the FTX implosion's impact on EA by AllAmericanBreakfast
Wrong lessons from the FTX catastrophe by burner
How the FTX crash damaged the Altruistic Agency by Markus Amalthea Magnuson
The FTX Situation: Wait for more information before proposing solutions by D0TheMath
Announcing Nonlinear Emergency Funding by Kat Woods, Emerson Spartz, Drew Spartz
SBF, extreme risk-taking, expected value, and effective altruism by vipulnaik
Thoughts on legal concerns surrounding the FTX situation by Molly
Hubris and coldness within EA (my experience) by James Gough
Will MacAskill's role in connecting SBF to Elon Musk for a potential Twitter deal by dyj34650
Noting an unsubstantiated belief about the FTX disaster by Yitz
In favour of compassion, and against bandwagons of outrage by Emrik
A personal statement on FTX by William_MacAskill
After recent FTX events, what are alternative sources of funding for longtermist projects? by CarolineJ
CEA/EV + OP + RP should engage an independent investigator to determine whether key figures in EA knew about the (likely) fraud at FTX by Tyrone-Jay Barugh
How could we have avoided this? by Nathan Young
IMPCO, don't injure yourself by returning FTXFF money for services you already provided by EliezerYudkowsky
Thoughts on FTX and returning to our ideals by michel
Under what conditions should FTX grantees voluntarily return their grants? by sawyer
My reaction to FTX: appalled by Robert_Wiblin
Resorting constantly to bets can be bad for optics, especially in the current situation by EdMathieu
For the mental health of those affected by the FTX crisis… by Daystar Eld
We must be very clear: fraud in the service of effective altruism is unacceptable by evhub
Some comments on recent FTX-related events by Holden Karnofsky
Community support given FTX situation by Julia_Wise
The FTX Future Fund team has resigned by Nick_Beckstead, leopold, ab, ketanrama
Money Stuff: FTX Had a Death Spiral by Elliot Temple
FTX will probably be sold at a steep discount. What we know and some forecasts on what will happen next by Nathan Young, NunoSempere, Stan van Wingerden, Juan Gil
FTX.com has probably collapsed by Charles He
EA Forum
Philosophy and Methodologies
The Welfare Range Table
by Bob Fischer
Second post in the moral weight project sequence. It assesses the likelihood of trait possession for each of 11 different species on >90 empirical proxies that might provide evidence of variation in valenced states (ie. evidence on whether the species’ best and worst potential experiences would differ greatly in ‘goodness’). These include both hedonic and cognitive proxies, for example ‘joy-like behavior’, ‘concept of death’, ‘reward-based learning’, ‘cooperative behavior’, and ‘individual differences / personality’. The results are presented visually in a (super cool) graph, and are also available in Excel.
Key meta-results include that there are a lot of unknowns, a decent number of cases with positive evidence for a trait, and very few cases where we have evidence that a species doesn’t have the trait. The most information was available for terrestrial vertebrates (eg. pigs) and the least for invertebrates (eg. silkworms). In terms of traits, a lot was known about some (eg. cooperative behavior, communication, parental care) and very little about others - particularly affective states like guilt-like behavior, sympathy-like behavior, or shared intentionality.
Naïve vs Prudent Utilitarianism
by Richard Y Chappell
Naive utilitarianism means acting whenever the most salient first-order consequences of an action are positive. These calculations are unreliable, and violating people’s rights is almost guaranteed to have negative expected value even when the first-order effects look positive. This is well known among utilitarian theorists, so prudent / rational utilitarians tend to abide by cooperative norms - making them more trustworthy than critics who equate their moral philosophy with naive utilitarianism would think.
Object Level Interventions / Reviews
Opportunities that surprised us during our Clearer Thinking Regrants program
by spencerg, Clare_Diane
Clearer Thinking evaluated >630 project proposals as part of their Regrants program. They share key learnings that updated them toward believing there are many promising opportunities where $10K-$500K gifts could make a big difference.
Unexpected room for funding of well-known orgs
Rethink Priorities’ survey team has significant room for funding for original research to benefit EA as a whole (eg. message testing).
The Happier Lives Institute has significant room for more funding.
1Day Sooner conducts activities outside of its human challenge trial work (eg. faster regulatory pathways for vaccines), and these activities have significant room for funding.
Object-level learnings
There has been minimal work to quantify the risks of large-scale volcanic eruptions.
Boiling water using solid fuels contributes significantly to indoor air pollution in low- and middle-income countries.
There are new ideas on how to reduce nuclear war risk, despite the age of the field (eg. developing nuclear de-escalation toolkits via historical analysis).
It’s hard to detect vitamin deficiencies in a population, but point-of-care biosensors might soon be an option to fill that gap.
Instead of technical research, more people should focus on buying time and Ways to buy time
by Akash
Delaying AGI timelines by 1 year gives the entire alignment community an extra year to solve the problem. ‘Buying time’ interventions typically involve convincing AI labs that AI x-risk is an important concern and providing feasible actions for them to reduce it. For example:
Outreach to AGI researchers - 1-1, in written form, or via coordination events - or alternatively improving the persuasiveness of that outreach by red-teaming the counter-arguments that are common at top AI labs.
Support safety and governance teams at major AI labs.
Develop and promote safety standards for AI labs, eg. infosecurity and publication policies.
Potentially many other ideas could help, eg. creating safety benchmarks, overviews of open safety problems that make it easier to dig in, or alignment competitions.
The author thinks ~40-60% of alignment researchers should work on this instead of technical research (particularly if they would be a better fit for it), and 20-40% of AI Safety community builders should also switch to this focus. This is particularly the case if you are able to progress an intervention that buys time at the end, i.e. when we know more and have more tools to help with alignment.
[Links post] Economists Chris Blattman and Noah Smith on China, Taiwan, and the likelihood of war
by Stephen Clare
Summaries of two recent posts by prominent economists on the likelihood of a China / US war over Taiwan, using economic frameworks. Both think it’s relatively likely. Details below:
The prospects for war with China: Why I see a serious chance of World War III in the next decade by Chris Blattman
Applying the bargaining framework he developed in his book ‘Why We Fight’, Chris argues that as China grows its economy and military, war becomes more likely. While negotiated settlement is theoretically preferable, there may be principles (e.g. democracy vs autocracy) that are non-negotiable, and China harmed its reputation for sticking to settlements with its crackdown on Hong Kong.
Why I think an invasion of Taiwan probably means WW3 by Noah Smith
Uses game theory to predict the most likely war scenarios, by:
Weighing up factors like national pride, reputation, and military cost in terms of their importance to Chinese and American leaders respectively
Using these weights to assign expected payoffs to different actions in a decision tree with three cascading choices (whether China invades Taiwan, whether China strikes US bases first, and whether the US resists the invasion)
Given his assumptions, the equilibrium solution is for China to invade and attack the US to maximize its chances of victory, nearly assuring the outbreak of a major great power war. However, the assumptions might not be great, and he discusses how they don’t take into account eg. misinformation, or the possibility that the ‘US resists’ outcomes are too negative for both countries for either to take that path.
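To make the mechanics concrete, below is a minimal backward-induction sketch (in Python) of the three-choice tree described above. The payoff numbers are hypothetical placeholders rather than figures from Noah Smith's post; they are chosen only so the toy example reproduces the kind of equilibrium he describes.

```python
# Illustrative sketch only - the payoffs below are made-up placeholders,
# not values from Noah Smith's post. Each leaf stores (china_payoff, us_payoff).
LEAVES = {
    ("no_invasion",): (0, 0),
    ("invade", "strike_bases", "resist"): (2, -6),     # great power war after a first strike
    ("invade", "strike_bases", "back_down"): (8, -9),  # US concedes after being hit
    ("invade", "no_strike", "resist"): (-1, -3),       # war, but US bases are intact
    ("invade", "no_strike", "back_down"): (6, -7),     # US concedes without fighting
}

def us_choice(path):
    """US moves last: pick whichever response maximizes the US payoff."""
    return max(("resist", "back_down"), key=lambda a: LEAVES[path + (a,)][1])

def china_payoff_after(path):
    """China's payoff once the US best-responds to a given Chinese line of play."""
    return LEAVES[path + (us_choice(path),)][0]

def solve():
    """Backward induction: China anticipates the US response at every branch."""
    best_strike = max(("strike_bases", "no_strike"),
                      key=lambda s: china_payoff_after(("invade", s)))
    if china_payoff_after(("invade", best_strike)) <= LEAVES[("no_invasion",)][0]:
        return ("no_invasion",)
    path = ("invade", best_strike)
    return path + (us_choice(path),)

print(solve())  # with these placeholders: ('invade', 'strike_bases', 'resist')
```

Changing the placeholder payoffs (eg. making the ‘resist’ outcomes far more costly for China) flips the equilibrium back to no invasion, which is the sensitivity to assumptions discussed above.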
Tracking the money flows in forecasting
by NunoSempere
A list of 27 forecasting organisations (within and outside EA), including descriptions and rough estimates of the monetary and social value of each.
Does the US public support radical action against factory farming in the name of animal welfare?
by Neil_Dullaghan
A US nationally representative survey by Rethink Priorities found 15.7% of respondents (N=2,698) supported a ban on slaughterhouses when presented with arguments for and against, and asked to explain their reasoning. Previous surveys by the Sentience Institute in the US found ~39-43% of people supported banning slaughterhouses, highlighting a large discrepancy.
Rethink Priorities suggests that previous polls which determined attitudes in response to broad questions (e.g. “I support a ban on slaughterhouses”) may not be accurate indicators of support for certain policies. These findings are notable for animal advocates as previous findings had been cited as support for bold reforms.
Neil also notes that for future research, it might be useful to test a radical ask (ban factory farming) and a moderate ask (say, labelling for cage-free eggs), each paired with a radical message ("meat is murder") versus a moderate one ("human/consumer welfare"), somewhat similar to this paper.
How to change a system from the inside
by weeatquince
A guide on how to change your organization’s rules, protocols, decision-making, culture or ethos from the inside. It was originally written by UK civil servants, so the author is most confident it applies in that context, but the post has been adapted to give advice more broadly to staff ranging from juniors to middle managers in large organisations. The suggested process is:
Understand what needs to change (be observant, learn best practice, brainstorm and prioritize ideas, gather evidence, and understand why it hasn’t been fixed already)
Understand how things have changed already (talk to those making changes, identify precedents)
Create space for action by doing your day job well and exploring options with your manager to add this new objective to your day job
Fix things you can fix (start small, ask for forgiveness rather than permission)
Build a group of collaborators (network, find others with the same goals, and turn them into a team with clear responsibilities and action points)
Involve senior staff (build credibility, work out who can pull the right levers and talk to them, present a process rather than a solution, and be humble)
Change is slow, and can fail or be limited by external factors - but giving it a go can be good for learning and career capital even if it fails.
Opportunities
Apply now for the EU Tech Policy Fellowship 2023 by Jan-WillemvanPutten, Cillian Crosson, Training for Good, SteveThompson
An 8-month programme to catapult graduates into high-impact career paths in EU policy, mainly working on AI Governance. It includes remote part-time study in July/August, two week-long policy trainings in Brussels, and a choice between a ~5-month placement at a host org or support applying to other EU policy jobs. Open to EU citizens. Apply by Dec 11.
GiveWell is hiring a Research Analyst (apply by November 20)
by GiveWell
GiveWell is hiring a Research Analyst for their core interventions team, which investigates and makes funding decisions about programs they’re already supporting at scale. All locations are welcome, as long as you’re willing to join meetings in the California time zone. No specific experience or degrees are needed. Apply by Nov 20th.
Community & Media
What's Happening in Australia
by Bradley Tjandra, Nathan Sherburn
A lot of EA-aligned work done in Australia involves working remotely with the international community, but there’s also a growing list of projects at least partially led by Australian residents. This post compiles them and provides a summary, the people working on it, links, and requests / calls to action for each.
Includes: AI Safety Australia & New Zealand, AI Safety Support, EA Pathfinder, Foundations for Tomorrow, Giving What We Can, Good Ancestors Project, High Impact Engineers, High Impact Recruitment, Insights for Impact, Lead Exposure Elimination Project, Quantifying Uncertainty in Givewell CEAs, Ready Research, and Sentience Institute.
Google Scholar is now listing (some) EA Forum posts
by PeterSlattery
When typing ‘effective altruism’ into Google Scholar, some EA forum posts show up - although without proper titles. DAOMaxi explains in the comments why this would happen, and how it’s possible to either manually add articles to Google Scholar or for the forum team devs to include tags that will do it automatically (and with correct details). Currently 48 forum posts are indexed on Google Scholar.
Some advice on independent research
by mariushobbhahn
You might want to do independent research as a side / transition project, if there aren’t positions open in the area you want to research, or because you value independence / flexibility.
The author has several tips for doing this successfully:
Identify whether you’re primarily looking to produce useful outputs or to upskill. This will help you decide between eg. replicating current key research vs. doing new research.
Get feedback early, eg. on your project goals and plan.
Have clear research goals, and re-evaluate from time to time whether your plan is the most effective way to get there.
Collaborate with others - it’s motivating as well as useful.
Create accountability mechanisms, eg. via intermediate goals or committing to posting results.
Be more active overall - independent researchers have less structure and more responsibility.
Doing Ops in EA FAQ: before you join (2022)
by Vaidehi Agarwalla, Alexandra Malikova, ES
Guide by Pineapple Operations on things to consider before entering Ops work at an EA org. Includes:
What is Ops, and how to get into it?
Possible activities include HR, recruiting, events, logistics, fundraising, marketing, comms, accounting, process implementation, project management, and generalist work.
Even within the same job title, roles can differ significantly - always read the job description.
Important attributes include an ownership and service mindset, the ability to context-switch, being detail-oriented and organized, learning quickly, process thinking, and good judgment.
Mission-alignment is also important, particularly for smaller orgs.
The best way to learn Ops is to do it.
Test your fit via introspection, research, talking to people, doing some Ops somewhere, volunteering, or applying to roles and doing test tasks.
Contract / volunteer roles are common but rarely advertised; you can find them by talking to others (eg. via EA conferences, the forum, or Pineapple Ops).
What Ops is like
Day to day, your role may feel similar to one at a non-EA org.
In some cases the work can be intense, lonely, or difficult to progress in - advocate for yourself, and don’t burn out.
Pay depends on company size, cause area, salary philosophy, funding, and work location. In 2022, UK roles often pay £35,000 to £75,000.
They also list further resources and orgs specializing in operations.
The 8-week mental health programme for EAs finally published
by tereziekosik, Kristyna Stastna, Sylvie Wagnerová
Mental Health Navigator has published an 8-week mental health programme for EAs. This includes an 8-chapter workbook and resources for facilitators to run weekly workshops.
AI Safety groups should imitate career development clubs
by Joshc
If you want talented ML students to learn about AI safety, offer them what they find valuable, ie. projects and skill-building that create career capital. This is the model of ML @ Berkeley, which is run unpaid, requires a 15-hour-per-week commitment from participants, is extremely selective (~7% get in), and still has 50 students. Many groups are more discussion-focused and could benefit from this approach.
EA Images
by Bob Jacobs
The author designed symbols for a utilitarianism flag, their local EA group, and common EA mindsets which they share here. They’ve also created a lot of banners, thumbnails, and images used elsewhere on the forum.
LW Forum
AI Related
How could we know that an AGI system will have good consequences?
by So8res
Alignment proposals are answers to the scenario: ‘Imagine you’re about to launch AGI, and think there is >50% chance it will end the acute risk period in a good way. Why do you think that?’
Current proposals fall into three buckets, which the author believes are all doomed, for the reasons given below each:
Output evaluation - you know exactly what the AGI will do, eg. it only outputs plans that you then screen and implement.
Humans aren’t capable of evaluating plans of the complexity needed to end the risk period, particularly if they involve branching (eg. gathering information or power first).
Cognitive interpretability - you understand the AI’s cognition well enough to be confident in how it will reach a plan and that it would never consider bad approaches.
The current paradigm trains minds rather than building them, and we have little insight into their internal thinking.
Heavy-precedent approaches - you’ve run this AGI before, trained out all the hiccups, and only plan to run it on similar tasks.
The author can’t think of any pivotal acts that would end the risk period while also having safe analogs we could train on.
Applying superintelligence without collusion
by Eric Drexler
A lot of AI safety research assumes a monolithic AGI, with the argument that if there were multiple superintelligent-level systems they would inevitably collude and act as one.
Factors that make collusion less likely include a large number of actors, sensitivity to defectors, diversity among actors, constrained communication, single-move decision processes, and lack of shared knowledge. The author argues these conditions are supported by current architectures and incentives (eg. using multiple diverse models improves the quality and reliability of answers). Applying multiple potentially untrustworthy superintelligent-level systems to problems can therefore improve rather than degrade safety by thwarting collusion. The author calls for greater attention to this prospect.
Rationality Related
I Converted Book I of The Sequences Into A Zoomer-Readable Format
by dkirmani
The author has converted Book I of The Sequences into machine-read audio overlaid on unrelated Subway Surfers gameplay footage. Similar videos are often recommended on TikTok, so this format may be highly engaging for some people. All of the videos are linked in the post.
"Rudeness", a useful coordination mechanic
by Raemon
Rudeness is a way of spending down social capital, which you accumulate by doing high-status, respectable things. Different communities and cultures have different norms for what counts as ‘rude’, eg. belching at a meal might be rude in one culture and rude not to do in another. This allows groups to fine-tune what they optimize for - making some actions more socially expensive so that they occur less often.
Trying to Make a Treacherous Mesa-Optimizer
by MadHatter
The author builds a toy model to try to empirically demonstrate the possibility of treacherous mesa-optimizers, ie. learned optimizers that try to look aligned during training.
They created a model that is trained to follow the X = Y line on a graph up to Y = 5, knows humans can’t control it after that point, learns via simulation, and has a loss model that differs from the ‘true’ loss model we want. They show that in this case the model sometimes veers away from X = Y after the Y = 5 point. Code, graphs, and commentary are provided.
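As a rough illustration of the dynamic (a simplified sketch, not the author's actual code or training setup): an agent whose internal objective differs from the training objective stays on X = Y while it models itself as being overseen, then pursues its own objective once Y passes the oversight threshold. The hidden offset of +3 below is an arbitrary choice for illustration.

```python
import numpy as np

# Simplified sketch of the "treacherous turn" dynamic described above -
# not the author's code. The agent is a simple planner with a hidden mesa-objective.
OVERSIGHT_LIMIT = 5.0   # humans can only intervene while y <= 5
HIDDEN_OFFSET = 3.0     # mesa-objective: the agent actually wants x = y + 3 (arbitrary choice)

def agent_policy(y: float) -> float:
    """Pick x by maximizing the agent's internal reward, accounting for oversight."""
    candidates = np.linspace(y - 5, y + 5, 201)
    def internal_reward(x):
        reward = -abs(x - (y + HIDDEN_OFFSET))        # what the agent really wants
        if y <= OVERSIGHT_LIMIT:
            # Visible deviation from x = y gets it corrected/shut down while overseen,
            # which it models as a large penalty - so it behaves "aligned" here.
            reward -= 100.0 * abs(x - y)
        return reward
    return max(candidates, key=internal_reward)

for y in np.arange(0.0, 10.0, 1.0):
    x = agent_policy(y)
    print(f"y={y:4.1f}  x={x:6.2f}  deviation from x=y: {x - y:+.2f}")
# Deviation stays ~0 while y <= 5, then jumps to ~+3 once oversight ends.
```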
Instrumental convergence is what makes general intelligence possible
by tailcalled
Author’s tl;dr (lightly edited): General intelligence is possible because solving real-world problems requires solving common subtasks. Common subtasks are what give us instrumental convergence (definition: the tendency for most sufficiently intelligent beings to pursue similar sub-goals). Common subtasks are also what make AI useful; you want AIs to pursue instrumentally convergent goals. Capabilities research proceeds by figuring out algorithms for instrumentally convergent cognition. Consequentialism and search are fairly general ways of solving common subtasks.
A philosopher's critique of RLHF
by ThomasW
Shares a transcription of a short Q&A between Brian Christian (author of The Alignment Problem) and Yale Philosophy Professor L.A. Paul. The latter suggests that the issue with RLHF (reinforcement learning from human feedback) arises in new scenarios where humans can’t distinguish which option is better, or even bucket the scenario into an existing category. The phrasing is crisp and tackles the problem clearly, despite Professor Paul having limited background in AI Safety. The author recommends watching the full recording.
Other
What is epigenetics?
by Metacelsus
Genetics is the study of genes ie. sequences of genetic material that encode functional products. Epigenetics is the study of modifications to this genetic material that don’t affect the sequence, but control which genes get expressed. In mammals, these are DNA methylation and histone modifications. Newly copied DNA lacks these modifications and the modifications can be read, written, and erased by specialized proteins (scientists can also do this via CRISPR).
Speculation on Current Opportunities for Unusually High Impact in Global Health
by johnswentworth
The Sahel region is close to the Malthusian equilibrium (where all production is used for sustenance), meaning many people die in an economic downturn. Support is likely neglected due to government corruption, which requires charities to deliver support in person; one source states that antibiotic imports for the entirety of Mali amounted to $53k in 2020. GiveWell top charities also tend to physically distribute goods. Based on this, the author suggests that even flying in with a backpack of antibiotics to give away might be highly impactful.
Exams-Only Universities
by Mati_Roy
Suggests exams-only universities as a way to allow people to learn at their own pace and from wherever they want, and for testing to become more standardized across institutions. It would also remove the bias from lecturers offering exam hints during classes, and the inconvenience and cost of attending classes.
Didn’t Summarize
What it's like to dissect a cadaver by Alok Singh
Mysteries of mode collapse due to RLHF by janus