I don't usually give content warnings, but you probably don't want to read this if sentient animals being cooked alive is too much.
The octopus thing reminds me of a pretty common custom in a lot of BBQ places here. Honestly, I've only seen it in Korean/Japanese BBQ joints, but that's probably because I go to those a lot. Anyways, they would put an octopus in a bowl and cut off its legs one by one. The legs would be cooked on the grill and seasoned while the octopus watched. When we finished eating the legs, the body of the octopus would finally be placed on the grill and cooked alive.
The whole process takes around 10 minutes. It honestly didn't taste that good.
I don't have a strong way to end this post, but I feel like stopping this specific practice might be a good idea.
Supported by Rethink Priorities
This is part of a weekly series summarizing the top posts on the EA and LW forums - you can see the full collection here. The first post includes some details on purpose and methodology. Feedback, thoughts, and corrections are welcomed.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell for producing these! Subscribe on your favorite podcast app by searching for 'EA Forum Podcast (Summaries)'. Note this is a new feed, part of a coordinated effort to create more narrated EA content - check out the announcement post here.
Author's note: FTX-related posts can be found in their own section. There's also a temporary new section for 'Giving Recommendations and Year-End Posts' as we hit giving season and many organizations reflect on the year just gone.
Top / Curated Readings
Designed for those without the time to read all the summaries. Everything here is also within the relevant sections later on so feel free to skip if you’re planning to read it all. These are picked by the summaries’ author and don’t reflect the forum ‘curated’ section.
Announcing FTX Community Response Survey
by Conor McGurk, WillemSleegers, David_Moss
Author's TL;DR: Let us know how you are feeling about EA post-FTX crisis by filling out the EA survey. If you've already responded to the EA survey, you can take the extra questionnaire here.
The LessWrong 2021 Review (Intellectual Circle Expansion)
by Ruby, Raemon
The annual LessWrong Review is starting on 1st December. It involves nominating and voting on the best posts of 2021, and for the first time off-site content can also be nominated. If you have an account registered before January 2021, you can participate by casting preliminary votes until 14th December - a UI for this will show up at the top of all 2021 posts.
Physical books with the winners of the 2020 review will likely be ready by January.
Why Neuron Counts Shouldn't Be Used as Proxies for Moral Weight
by Adam Shriver
The fourth post in the Moral Weight Project Sequence.
Neuron counts are sometimes proposed as a proxy for moral weight because of their relation to intelligence, because of an argument that they result in “more valenced consciousness”, or because a minimal information capacity is claimed to be required for morally relevant cognitive abilities.
The author challenges this on three bases:
They conclude neuron counts should not be used as a sole proxy for moral weight, but cannot be dismissed entirely. Rather, neuron counts should be combined with other metrics in an overall weighted score that includes information about whether different species have welfare-relevant capacities.
EA Forum
Philosophy and Methodologies
Why Neuron Counts Shouldn't Be Used as Proxies for Moral Weight
by Adam Shriver
The fourth post in the Moral Weight Project Sequence.
Neuron counts are sometimes proposed as a proxy for moral weight because of their relation to intelligence, because of an argument that they result in “more valenced consciousness”, or because a minimal information capacity is claimed to be required for morally relevant cognitive abilities.
The author challenges this on three bases:
They conclude neuron counts should not be used as a sole proxy for moral weight, but cannot be dismissed entirely. Rather, neuron counts should be combined with other metrics in an overall weighted score that includes information about whether different species have welfare-relevant capacities.
Object Level Interventions / Reviews
Altruistic kidney donation in the UK: my experience
by RichArmitage
Around 250 people on the UK kidney waiting list die each year. Donating your kidney via the UK Living Kidney Sharing Scheme can potentially kick off altruistic chains of donor-recipient pairs ie. multiple donations. Donor and recipient details are kept confidential.
The process takes ~12-18 months and involves consultations, tests, surgery, and, for the author, 3 days of hospital recovery. In the week since discharge, most problems have cleared up, they can slowly walk several miles, and they encountered no serious complications.
Banding Together to Ban Octopus Farming
by Tessa @ ALI
Summary by Hamish Doodles (somewhat edited):
"Approximately 500 billion aquatic animals are farmed annually in high-suffering conditions and [...] there is negligible advocacy aimed at improving [their] welfare conditions.”
Octopus and squid are highly intelligent, but demand for them is growing, and so they are likely to be factory farmed in the future. Aquatic Life Institute (ALI) is a non-profit trying to prevent this by campaigning to ban octopus farming in countries / regions where it is being considered (ie. Spain, Mexico, the EU). ALI "will work with corporations on procurement policies banning the purchase of farmed octopus" and "support research to compare potential welfare interventions". So far, ALI has sent letters to government officials, organised a tweet campaign, planned a couple of protests, run online events, and started the Aquatic Animal Alliance (AAA) coalition with 100 animal protection organisations.
ALI currently has five welfare concerns for farmed octopuses: environmental enrichment, feed composition, stock density & space requirements, water quality, and stunning / slaughter.
The deathprint of replacing beef by chicken and insect meat
by Stijn
A recent study (Bressler, 2021) estimated that for every 4,000 tonnes of CO2 emitted today, there will be one extra premature human death before 2100. The post author converts this into human deaths per kilogram of meat produced (based on the CO2 emissions for that species), and pairs this with the number of animals of that species that need to be slaughtered to produce 1kg of meat.
After weighting by neurons per animal, their key findings suggest that switching from beef to chicken or insect meat reduces climate change but significantly increases animal suffering, and so might be bad overall. They suggest prioritizing a reduction in chicken meat consumption, and that policymakers stop subsidizing research on insect meat, tax meat based on its climate and suffering externalities, and start subsidizing plant- and cell-based meat.
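To make the conversion concrete, here is a minimal sketch of the arithmetic described above. All input figures are illustrative placeholders rather than the post's actual numbers, and the neuron weighting shown (relative to human neuron count) is just one possible scheme; see the original post for its inputs and findings.

```python
# Illustrative sketch of the "deathprint" conversion described above.
# NOTE: every number below is a placeholder for illustration, NOT taken from the post.

CO2_KG_PER_DEATH = 4_000 * 1_000  # Bressler (2021): ~4,000 tonnes CO2 -> 1 premature death by 2100


def climate_deaths_per_kg(co2e_kg_per_kg_meat: float) -> float:
    """Premature human deaths attributable to the emissions behind 1 kg of this meat."""
    return co2e_kg_per_kg_meat / CO2_KG_PER_DEATH


def neuron_weighted_animals_per_kg(animals_slaughtered_per_kg: float,
                                   neurons_per_animal: float,
                                   neurons_per_human: float = 86e9) -> float:
    """Animals killed per kg of meat, weighted by neuron count relative to a human
    (one possible weighting scheme, assumed here for illustration)."""
    return animals_slaughtered_per_kg * (neurons_per_animal / neurons_per_human)


# Hypothetical inputs (kg CO2e per kg of meat, animals slaughtered per kg, neurons per animal):
species = {
    "beef":    {"co2e": 60.0, "animals_per_kg": 0.004, "neurons": 3e9},
    "chicken": {"co2e": 6.0,  "animals_per_kg": 0.7,   "neurons": 2e8},
}

for name, d in species.items():
    print(f"{name}: "
          f"climate deaths/kg = {climate_deaths_per_kg(d['co2e']):.1e}, "
          f"neuron-weighted animals/kg = "
          f"{neuron_weighted_animals_per_kg(d['animals_per_kg'], d['neurons']):.1e}")
```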
Giving Recommendations and Year-End Posts
2022 ALLFED highlights
by Ross_Tieman, Sonia_Cassidy, Denkenberger, JuanGarcia
Highlights for ALLFED in 2022 include:
ALLFED is funding-constrained and appreciates any donations. The heightened geopolitical tensions from the Russo-Ukrainian conflict create a time-limited policy window for bringing their research on food system preparedness to the forefront of decision makers’ minds.
Update on Harvard AI Safety Team and MIT AI Alignment
by Xander Davies, Sam Marks, kaivu, TJL, eleni, maxnadeau, Naomi Bashkansky, Oam Patel
Reflections from organizers of the student organisations Harvard AI Safety Team (HAIST) and MIT AI Alignment (MAIA).
Top things that worked:
Top things that didn’t work:
In future, they plan to set up an office space for MAIA, share infrastructure and resources with other university alignment groups, and improve programming for already engaged students (including opportunities over winter and summer break).
They’re looking for mentors for junior researchers / students, researchers to visit during retreats or host Q&As, feedback, and applicants to their January ML bootcamp or to roles in the Cambridge Boston Alignment Initiative.
Why Giving What We Can recommends using expert-led charitable funds
by Michael Townsend, SjirH
Funds allow donors to give as a community, with expert grantmakers and evaluators directing money as cost-effectively as possible. Advantages include that a fund can learn how much funding an organization needs, provide it when it's needed, monitor how it’s used, and incentivize the organization to be even more impactful. It also provides a reliable source of funding and support for those organisations.
GWWC recommends most donors give to funds, with the exception of those who have unique donation opportunities that funds can’t access, or who believe they can identify more cost-effective opportunities themselves (eg. due to substantial expertise, or values that differ from those of existing funds). You can find their recommended funds here.
Why I gave AUD$12,573 to Innovations For Poverty Action
by Henry Howard
The author gave 50% of their salary to charity last year, with the largest portion going to Innovations for Poverty Action (IPA). They prioritized this charity because they believe the slow rate of discovery of new effective charities is a bottleneck for effective altruism, because IPA, as an established global development research organization, probably knows more than we do, because it has a track record of actionable research (eg. its research kick-started Evidence Action’s Dispensers for Safe Water program), and because donations are tax-deductible in Australia.
The Founders Pledge Climate Fund at 2 years
by jackva, violet, Luisa_S
The Founders Pledge Climate Fund has run for 2 years and distributed over $10M USD.
Because the climate space has ~$1T per year committed globally, the team believes the best use of marginal donations is to correct existing biases of overall climate philanthropy, fill blind spots, and leverage existing attention on climate. The Fund can achieve this more effectively than individual donations because it can make large grants that allow grantees to start new programs, quickly respond to time-sensitive opportunities, and make catalytic grants to early-stage organizations that don’t yet have track records.
Examples include a substantial increase in the growth of grantee Clean Air Task Force, and significant investments in emerging economies that receive less attention from other funders.
Future work will look at where best to focus policy efforts, and the impact of the Russo-Ukrainian war on possible policy windows.
Opportunities
Apply to join Rethink Priorities’ board of directors.
by abrahamrowe, kierangreig
Apply by January 13th to join the Rethink Priorities (RP) board of directors in an unpaid (3-10 hours per month) or paid (5-10 hours per week) capacity. These roles are key to helping RP secure its foundations and scale toward ambitious plans.
Announcing the Cambridge Boston Alignment Initiative [Hiring!]
by kuhanj, levin, Xander Davies, Alexandra Bates
Author’s TLDR: “The Cambridge Boston Alignment Initiative (CBAI) is a new organization aimed at supporting and accelerating Cambridge and Boston students interested in pursuing careers in AI safety. We’re excited about our ongoing work, including running a winter ML bootcamp, and are hiring for Cambridge-based roles (rolling applications, priority deadline Dec. 14 to work with us next year).”
Come get malaria with me?
by jeberts
There is a paid opportunity to take part in a malaria vaccine trial in Baltimore from January to early March. The vaccine has a solid chance of being deployed for pregnant women if it passes this challenge trial. It’s a ~55-hour time commitment if you're in Baltimore, or more if you need to travel, and the risk of serious complications is very low. The author signed up and knows 6 others who have expressed serious interest. Get in touch with questions or to join an Airbnb the author is setting up for it.
Good Futures Initiative: Winter Project Internship
by Aris Richardson
Author’s TLDR: I'm launching Good Futures Initiative, a winter project internship to sponsor students to take on projects to upskill, test their fit for career aptitudes, or do impactful work over winter break. You can read more on our website and apply here by December 11th if interested!
Create a fundraiser with GWWC!
by Giving What We Can, GraceAdams
GWWC have launched brand new fundraising pages. If you’d like to create a fundraiser for effective charities, you can fill out this form, choosing up to 3 charities or funds from their Donate page.
Winter ML upskilling camp
by Nathan_Barnard, Joe Hardie, qurat, Catherine Brewer, hannah
Author’s TL;DR: “We are running a UK-based ML upskilling camp from 2-10 January in Cambridge for people with no prior experience in ML who want to work on technical AI safety. Apply here by 11 December.”
Community & Media
FYI: CC-BY license for all new Forum content from today (Dec 1)
by Will Bradshaw
From December 1st, all new Forum content (including comments and short form posts) will be published under a CC BY 4.0 license. This lets others share and adapt the material for any purpose, including commercial, and is irrevocable.
Effective giving subforum and other updates (bonus Forum update November 2022)
by Lizka
Updates from the forum team:
Some notes on common challenges building EA orgs
by Ben_Kuhn
The author’s observations from talking to / offering advice to several EA orgs:
They suggest solutions of:
An EA storybook for kids
by Simon Newstead
A free storybook for kids, designed to inspire kindness and thoughtfulness. The author is looking for beta readers from the community (https://eaforkids.org) and is aiming for an early 2023 release.
I’m a 22-year-old woman involved in Effective Altruism. I’m sad, disappointed, and scared.
by Maya D
The author discovered EA 14 months ago and has been highly involved. However, they’ve seen behavior that makes them skeptical and sad about the community.
They detail cases of unwelcoming acts towards women, including a Bay Area list ranking them, someone booking an EAG 1-1 on the basis of attractiveness, a recent blog post on the topic with nasty comments, and the suicide of Kathy Forth in 2018 (who attributed large portions of her suffering to her experiences in the EA and rationalist communities).
They suggest that EAs need to let people know if they see them doing something unwelcoming, never shield someone from accountability because of status, and listen and give compassion to people sharing their personal experiences or emotions even when strong evidence isn’t available.
CEA has responded in the comments to follow up. They also provide details on specific actions they’ve taken in response to some of the above and similar situations, and why these actions aren’t always outwardly visible (eg. because the person reporting wants to maintain confidentiality, which more visible responses would break).
Beware frictions from altruistic value differences
by Magnus Vinding
Differing values create risks of uncooperative behavior within the EA community, such as failing to update on good arguments because they come from the “other side”, failing to achieve common moral aims (eg. avoiding worst-case outcomes), failing to compromise, or committing harmful acts out of spite / tribalism.
The author suggests mitigating these risks by assuming good intent, looking for positive-sum compromises, actively noticing and reducing our tendency to promote / like our ingroup, and validating that the situation is challenging and it’s normal to feel some tension.
Is Headhunting within EA Appropriate?
by Dan Stein
The author works at IDinsight, which has had a number of staff directly approached by headhunters for EA orgs. They question whether headhunting is a good practice within EA, or whether it leads to net-negative outcomes by hindering the high-impact organisations whose staff are headhunted.
Top comments argue that headhunting is primarily sharing information about opportunities, and that this is positive as it allows individuals to make the best decisions.
Didn’t Summarize
James Lovelock (1919 – 2022) by Gavin
Effective Giving Day is only 1 day away! by Giving What We Can, GraceAdams (effective giving day was 28th November - this post listed events for it)
How have your views on where to give updated over the past year? by JulianHazell
LW Forum
AI Related
Jailbreaking ChatGPT on Release Day
by Zvi
ChatGPT attempts to be safe by refusing to answer questions that call upon it to do or help you do something illegal or otherwise outside its bounds. Prompt engineering methods to break these safeguards were discovered within a day, and meant users could have it eg. describe how to enrich uranium, write hate speech, or describe steps a rogue AI could take to destroy humanity. Loopholes included (some now patched):
Did ChatGPT just gaslight me?
by ThomasW
In chatting with ChatGPT, the author found it contradicted itself and its previous answers. For instance, it said that orange juice would be a good non-alcoholic substitute for tequila because both were sweet, but when asked if tequila was sweet, it said it was not. When further quizzed, it apologized for being unclear and said “When I said that tequila has a 'relatively high sugar content,' I was not suggesting that tequila contains sugar.”
This behavior is worrying because the system has the capacity to produce convincing, difficult-to-verify, completely false information. Even if this exact pattern is patched, others will likely emerge. The author guesses it produced the false information because it was trained to give outputs the user would like - in this case a non-alcoholic substitute for tequila in a drink, with a nice-sounding reason.
The Plan - 2022 Update
by johnswentworth
Last year, the author wrote up a plan they gave a “better than 50/50 chance” would work before AGI kills us all. This predicted that in 4-5 years, the alignment field would progress from preparadigmatic (unsure of the right questions or tools) to having a general roadmap and toolset.
They believe this is on track and give 40% confidence that over the next 1-2 years the field of alignment will converge toward primarily working on decoding the internal language of neural nets - with interpretability on the experimental side, in addition to theoretical work. This could lead to identifying which potential alignment targets (like human values, corrigibility, Do What I Mean, etc) are likely to be naturally expressible in the internal language of neural nets, and how to express them. They think we should then focus on those.
In their personal work, they’ve found theory work faster than expected, and crossing the theory-practice gap mildly slower. In 2022 most of their time went into theory work like the Basic Foundations sequence, workshops and conferences, training others, and writing up intro-level arguments on alignment strategies.
A challenge for AGI organizations, and a challenge for readers
by Rob Bensinger, Eliezer Yudkowsky
OpenAI recently released their approach to alignment research. The post authors challenge other major AGI orgs like Anthropic and DeepMind to also publish their alignment plan.
They also challenge readers to write up their unanchored thoughts on OpenAI’s plan, ahead of Eliezer and Nate publishing their own thoughts.
On the Diplomacy AI
by Zvi
The Diplomacy AI got a handle on the basics of the game, but didn’t ‘solve it’. It mainly does well due to avoiding common mistakes like failing to communicate with victims (thus signaling intention), or forgetting the game ends after the year 1908. It also benefits from anonymity, one-shot games, short round limits etc.
Some things were easier than expected, eg. defining the problem space, communications being generic, simple, and quick enough to easily imitate and even surpass humans, having no reputational or decision-theoretic considerations, and being able to respond to the existing metagame without it responding to you. Others were harder, eg. the tactical and strategic engines being lousy (relative to what the author would have expected).
Overall the author did not on net update much on the Diplomacy AI news, in any direction, as nothing was too shocking and the surprises often canceled out.
Finding gliders in the game of life
by paulfchristiano
Author’s summary (lightly edited): “ARC’s current approach to Eliciting Latent Knowledge is to point to latent structure within a model by searching for the “reason” for particular correlations in the model’s output. In this post we’ll walk through a very simple example of using this approach to identify gliders (particular patterns of cells) in the game of life.”
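For readers unfamiliar with the setting, here is a minimal, self-contained sketch of what "gliders" are at the object level: stepping a Game of Life board and template-matching one phase of the standard glider. This is only background on the toy domain; it is not ARC's approach, which searches for the "reason" behind correlations rather than matching patterns directly.

```python
import numpy as np

def step(board: np.ndarray) -> np.ndarray:
    """One step of Conway's Game of Life on a toroidal (wrap-around) board."""
    neighbours = sum(
        np.roll(np.roll(board, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    born = (board == 0) & (neighbours == 3)
    survives = (board == 1) & ((neighbours == 2) | (neighbours == 3))
    return (born | survives).astype(np.uint8)

# The standard glider in one of its four phases; it shifts one cell diagonally every 4 steps.
GLIDER = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [1, 1, 1]], dtype=np.uint8)

def contains_glider(board: np.ndarray, pattern: np.ndarray = GLIDER) -> bool:
    """Naive template match: does this exact glider phase occur anywhere on the board?"""
    h, w = pattern.shape
    H, W = board.shape
    return any(
        np.array_equal(board[y:y + h, x:x + w], pattern)
        for y in range(H - h + 1) for x in range(W - w + 1)
    )

board = np.zeros((12, 12), dtype=np.uint8)
board[1:4, 1:4] = GLIDER
for _ in range(4):
    board = step(board)
print(contains_glider(board))  # True: the same phase reappears one cell down and to the right
```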
Inner and outer alignment decompose one hard problem into two extremely hard problems
by TurnTrout
Author’s tl;dr (lightly edited): “One alignment strategy is to 1) capture “what we want” in a loss function to a very high degree (“robust grading”), 2) use that loss function to train the AI, and 3) get the AI to exclusively care about optimizing that objective.
I think that each step contains either a serious and unnecessary difficulty, or an unnecessary assumption. I think that:
Distinguishing test from training
by So8res
Various people have proposed variants of “align AGI by making it sufficiently uncertain about whether it’s in the real world versus still in training”. This seems unpromising: even if the AGI were convinced, bad outcomes could still follow, and convincing it would be difficult in the first place.
Non-exhaustive list of how it could tell it’s in reality:
If you can understand the contents of the AI’s mind well enough to falsify every possible check it could do to determine the difference between simulation and reality, then you could use that knowledge to build a friendly AI that doesn’t need to be fooled in the first place.
Why Would AI "Aim" To Defeat Humanity?
by HoldenKarnofsky
The author argues that if today’s AI development methods lead directly to powerful enough AI systems, disaster is likely by default (in the absence of specific countermeasures).
This is because there is good economic reason to have AIs ‘aim’ at certain outcomes - eg. we might want an AI that can accomplish goals such as ‘get me a TV for a great price’. Current methods train AIs to do this via trial and error, but because we ourselves are often misinformed, we can sometimes negatively reinforce truthful behavior and positively reinforce deception that makes it look like things are going well. This can mean AIs learn an unintended aim, which, if ambitious enough, is very dangerous. There are also intermediate goals like ‘don’t get turned off’ and ‘control the world’ that are useful for almost any ambitious aim.
Warning signs for this scenario are hard to observe, because of the deception involved. There will likely still be some warning signs, but in a situation with incentives to roll out powerful AI as fast as possible, responses are likely to be inadequate.
Rationality Related
Be less scared of overconfidence
by benkuhn
The author gives examples where their internal mental model suggested one conclusion, but a low-information heuristic like expert or market consensus differed, so they deferred. Other examples include assuming something won’t work in a particular case because the stats for the general case are bad (eg. ‘90% of startups fail - why would this one succeed?’), or assuming something will happen similarly to past situations.
Because the largest impact comes from outlier situations, outperforming these heuristics is important. The author suggests that for important decisions people should build a gears-level model of the decision, put substantial time into building an inside view, and use heuristics to stress test those views. They also suggest being ambitious, particularly when it’s high upside and low downside.
Always know where your abstractions break
by lsusr
Every idea has a domain it can be applied to, beyond which the idea will malfunction. If you don't understand an idea's limitations, then you don't understand that idea. Eg. modern political ideologies were invented in the context of an industrial civilization; when applied to other contexts like medieval Japan, they cloud your ability to understand what’s going on.
MrBeast's Squid Game Tricked Me
by lsusr
Most of our beliefs never go through a thorough examination. It's infeasible to question whether every rock and tree you look at is real. For instance, the author didn’t expect CGI in MrBeast’s videos (which are usually real), so never noticed it when it happened.
The problem of identifying what to question is a more fundamental challenge than reconsidering cached thoughts and thinking outside your comfort zone. Cached thoughts have been examined at least once (or at least processed into words). Identifying the small bits of information that are even worth thinking about is the first step toward being less wrong.
Other
The LessWrong 2021 Review (Intellectual Circle Expansion)
by Ruby, Raemon
The annual LessWrong Review is starting on 1st December. It involves nominating and voting on the best posts of 2021, and for the first time off-site content can also be nominated. If you have an account registered before January 2021, you can participate by casting preliminary votes until 14th December - a UI for this will show up at the top of all 2021 posts.
Physical books with the winners of the 2020 review will likely be ready by January.
Didn’t Summarize
The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable by beren, Sid Black
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] by LawrenceC, Adrià Garriga-alonso, Nicholas Goldowsky-Dill, ryan_greenblatt, jenny, Ansh Radhakrishnan, Buck, Nate Thomas
Geometric Rationality is Not VNM Rational by Scott Garrabrant
Re-Examining LayerNorm by Eric Winsor
My take on Jacob Cannell’s take on AGI safety by Steven Byrnes
FTX-Related Posts
Announcing FTX Community Response Survey
by Conor McGurk, WillemSleegers, David_Moss
Author’s TL;DR: Let us know how you are feeling about EA post-FTX crisis by filling out the EA survey. If you’ve already responded to the EA survey, you can take the extra questionnaire here.
Sense-making around the FTX catastrophe: a deep dive podcast episode we just released
by spencerg
Clearer Thinking podcast episode discussing “how more than ten billion dollars of apparent value was lost in a matter of days, the timeline of what happened, deception and confusion related to the event, why this catastrophe took place in the first place, and what this means for communities connected to it.”
"Insider EA content" in Gideon Lewis-Kraus's recent New Yorker article
by To be stuck inside of Mobile
Linkpost and key excerpts from a New Yorker article overviewing how EA has reacted to SBF and the FTX collapse. The article claims there was an internal Slack channel of EA leaders where a warning that SBF “has a reputation [in some circles] as someone who regularly breaks laws to make money” was shared before the collapse.
List of past fraudsters similar to SBF
by NunoSempere
Spreadsheet and descriptions of 22 cases qualitatively similar to FTX, selected from the Wikipedia list of fraudsters. The author’s main takeaway was that many salient aspects of FTX have precedents: the incestuous relationship between an exchange and a trading house (Bernie Madoff, Richard Whitney), a philosophical or philanthropic component (Enric Duran, Tom Petters, etc.), embroiling friends and families in the scheme (Charles Ponzi), or multi-billion-dollar fraud going undetected for years (Elizabeth Holmes, many others).
How VCs can avoid being tricked by obvious frauds: Rohit Krishnan on Noahpinion (linkpost)
by HaydnBelfield
Linkpost to an article by Rohit Krishnan, a former hedge fund manager. Haydn highlights key excerpts, including one claiming that “This isn’t Enron, where you had extremely smart folk hide beautifully constructed fictions in their publicly released financial statements. This is Dumb Enron, where someone “trust me bro”-ed their way to a $32 Billion valuation.”
They mention that “the list of investors in FTX [was] a who’s who of the investing world” and that while “VCs don’t really do forensic accounting”, there were still plenty of red flags they should have checked - eg. basics like whether FTX had an accountant, a management team, a back office, or a board, whether it lent money to the CEO, and how intertwined FTX and Alameda were. The author has made investments 1/10th the size of some major investors’ stakes in FTX and still required a company audit, with most of these questions taking “half an hour max”.
SBF interview with Tiffany Fong
by Timothy Chan
Linkpost to the interview, which the author suggests listening to from 11:31. They also highlight key quotes and suggest that, based on this and other publicly available information, it seems likely to them that SBF acted as a naive consequentialist.