Supported by Rethink Priorities
Also posted on the EA forum.
This is part of a weekly series - you can see the full collection here. The first post includes some details on purpose and methodology.
If you'd like to receive these summaries via email, you can subscribe here.
Podcast version: prefer your summaries in podcast form? A big thanks to Coleman Snell who is now producing these! The first episode is now up, and this week's will be up soon. Subscribe on your favorite podcast app by searching for 'Effective Altruism Forum Podcast'.
Top Readings / Curated
Designed for those without the time to read all the summaries. Everything here is also within the relevant sections later on so feel free to skip if you’re planning to read it all.
Announcing the Future Fund's AI Worldview Prize
by Nick_Beckstead, leopold, William_MacAskill, ketanrama, ab
The Future Fund believes they may be significantly under- or overestimating catastrophic risk from AI, or focusing on the wrong aspects of it. Since this affects the distribution of hundreds of millions of dollars, they are offering prizes of up to $1.5M for arguments that significantly shift their credences on when and how AGI will be developed and its chance of causing a catastrophe. A secondary goal is to test the efficacy of prizes at motivating important new insights. Smaller prizes are also available. Entries are due Dec 23rd.
Announcing the Rethink Priorities Special Projects Program
by Rachel Norman
Rethink Priorities is launching a program to help start promising EA initiatives. Some will be incubated internally and later spun off into independent organizations; others, run externally, will be fiscally sponsored and given operations support. The team is currently incubating / sponsoring eight projects ranging from insect welfare to community building in Brazil to AI alignment research, and is looking for expressions of interest from those interested in fiscal sponsorship or project incubation.
Nearcast-based "deployment problem" analysis
by HoldenKarnofsky
The AI deployment problem is the question of how and when to (attempt to) build and deploy powerful AI when unsure about safety and about how close other actors are to deploying. This post assumes a major AI company (‘Magma’) believes it is 6 months to 2 years away from transformative AI (TAI) via human feedback on diverse tasks (HFDT), and asks what both Magma and an organization dedicated to tracking and censoring dangerous AI (‘IAIA’) would ideally do in this situation.
Holden splits this into three phases, with suggestions in a longer summary under the LessWrong -> AI section below. This forecast implies actions we should take today, such as creating an IAIA equivalent, sharing information selectively, and labs doing outreach and advocacy (a non-comprehensive list of Holden’s suggestions).
EA Forum
Philosophy and Methodologies
Just Look At The Thing! – How The Science of Consciousness Informs Ethics
by algekalipso
Ethical theories often contain background assumptions about consciousness, personal identity, and valence. Instead of arguing over these assumptions or theorizing around them, we can test them against experience. For instance, mixed valence states (pain and pleasure together) are relevant to negative utilitarianism. From real life, we can observe that a pleasant experience (eg. music) occurring during a painful experience (eg. a stomach ache) can mute the pain. This helps us refine the negative utilitarian view, and opens up new questions, such as whether this still holds for extensive pains, or whether they are always net negative - which we can also test empirically.
Other examples are given, including high-dose DMT as an experience that isn’t successfully captured by many philosophical frameworks. The author argues this approach becomes more important as we open up futures where more complex valences and states of consciousness are available.
Object Level Interventions & New Projects
Quantified Intuitions: An epistemics training website including a new EA-themed calibration app
by Sage
Quantified Intuitions helps users practice assigning credences to outcomes with a quick feedback loop. Currently it includes a calibration game, and pastcasting (forecasting on resolved questions you don’t know about already).
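For readers unfamiliar with calibration scoring, below is a minimal sketch of how forecasts with stated probabilities might be scored using the Brier score. It is purely illustrative - the forecasts are made up and this is not Quantified Intuitions' actual code or scoring rule.

```python
# Minimal sketch of calibration scoring (illustrative only - not Quantified Intuitions' code).
# Each entry: (stated probability that the answer is true, whether it was actually true).
forecasts = [
    (0.9, True), (0.7, True), (0.7, False), (0.6, True),
    (0.8, True), (0.3, False), (0.3, True), (0.95, True),
]

# Brier score: mean squared error between stated probability and outcome (0 or 1).
brier = sum((p - (1.0 if outcome else 0.0)) ** 2 for p, outcome in forecasts) / len(forecasts)
print(f"Brier score: {brier:.3f}  (0 = perfect, 0.25 = always saying 50%)")

# Crude calibration check: how often do ~60-80% claims come true?
bucket = [outcome for p, outcome in forecasts if 0.6 <= p <= 0.8]
print(f"'60-80%' answers correct: {sum(bucket)}/{len(bucket)}")
```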
Announcing the NYU Mind, Ethics, and Policy Program
by Sofia_Fogel
The NYU Mind, Ethics, and Policy Program (MEP) will conduct and support foundational research about the nature and intrinsic value of nonhuman minds (biological and artificial). Several projects are currently underway, including a free public talk on whether large language models are sentient (signup here), and an award and workshop seeking published papers on animal and AI consciousness (details here).
The Next EA Global Should Have Safe Air
By joshcmorrison
Indoor air quality is a promising biosafety intervention, but experimental evidence on which methods to use is sparse. EA being an early adopter of interventions like air filters or UV lights, and recording outcomes, would help assess what works - and help the community reduce infections at the same time. EAG is a natural candidate for this due to its size. Other possibilities include EA office / coworking spaces or group houses.
Announcing the Rethink Priorities Special Projects Program
by Rachel Norman
Rethink Priorities is launching a program to help start promising EA initiatives. Some will be incubated internally and later spun off into independent organizations; others, run externally, will be fiscally sponsored and given operations support. The team is currently incubating / sponsoring eight projects ranging from insect welfare to community building in Brazil to AI alignment research, and is looking for expressions of interest from those interested in fiscal sponsorship or project incubation.
Announcing EA Pulse, large monthly US surveys on EA
by David_Moss, Jamie Elsey
Rethink Priorities is running a monthly survey of the US public aimed at understanding perceptions of Effective Altruism and related cause areas, funded by the FTX Future Fund.
This includes tracking general attitudes over time, as well as ad-hoc questions such as testing support for particular policies or EA messaging. Requests for both sections are welcome - ideally by October 20th.
The Hundred Billion Dollar Opportunity that EAs mostly ignore
by JeremiahJohnson
Individual charitable donations are in the hundreds of billions: last year, individuals, bequests, foundations, and corporations gave an estimated $484.85 billion to charities. The largest shares go to religious donations and to educational donations to existing 3-4 year colleges. Only 6% is donated internationally. More public outreach, approachable arguments and asks, specialized charity evaluators, and public attacks on practices like Ivy League university endowments could help direct this money more effectively.
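For scale, a quick back-of-the-envelope on the figures quoted above (the $484.85 billion total and 6% international share come from the summary; the code just does the multiplication):

```python
# Rough arithmetic on the giving figures quoted above.
total_giving = 484.85e9   # estimated charitable giving last year, per the post
international_share = 0.06

international = total_giving * international_share
domestic = total_giving - international
print(f"International: ~${international / 1e9:.1f}B")
print(f"Everything else: ~${domestic / 1e9:.1f}B")
```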
EA’s brain-over-body bias, and the embodied value problem in AI alignment
by Geoffrey Miller
AI alignment might benefit from thinking about our bodies’ values, not just our brains’. Human behavior is often an interplay between the two: hunger and eating, for example, involve some cognition, but also the stomach, gut, hunger-related hormones, etc., which in some sense have ‘goals’, such as keeping blood glucose within certain bounds.
Aligning AI with only our cognitive values has failure modes - eg. many people like donuts, but our bodies usually prefer leafy greens. If an AI is trained to care for human bodies the way the body ‘wants’, and to keep its systems running smoothly, this makes us safer from these failure modes. It also allows us to include the preferences of infants, fetuses, those in comas, and others unable to communicate cognitively. We can learn these body values and goals by pushing forward evolutionary biology.
Shahar Avin on How to Strategically Regulate Advanced AI Systems
by Michaël Trazzi
Excerpts and links to a discussion with Shahar Avin, a senior researcher at the Center for the Study of Existential Risk. Key points from the excerpts include:
A lot of cutting-edge AI research is probably private.
Security failures are unavoidable with big enough systems.
Companies should be paying the red-tape cost of proving their systems are safe, secure, and aligned. Big companies are used to paying to meet regulations.
We should regulate now, but make it ‘future ready’ and updateable.
Two reasons we might be closer to solving alignment than it seems
by Kat Woods, Amber Dawn
There’s a lot of pessimism in the AI safety community. However, keep in mind:
All of the arguments saying that it’s hard to be confident that transformative AI isn’t just around the corner also apply to safety research progress.
It’s still early days, and we’ve had about as much progress as you’d predict given that until recently only double-digit numbers of people were working on the problem.
Opportunities
Announcing the Future Fund's AI Worldview Prize
by Nick_Beckstead, leopold, William_MacAskill, ketanrama, ab
The Future Fund believes they may be significantly under- or overestimating catastrophic risk from AI, or focusing on the wrong aspects of it. Since this affects the distribution of hundreds of millions of dollars, they are offering prizes of up to $1.5M for arguments that significantly shift their credences on when and how AGI will be developed and its chance of causing a catastrophe. A secondary goal is to test the efficacy of prizes at motivating important new insights. Smaller prizes are also available. Entries are due Dec 23rd.
CEA's Events Team is Hiring!
by Amy Labenz
Hiring for an EA Global Events Associate, Retreats Associate, and Community Events Associate. The team has grown rapidly and is on track to facilitate 4x as many connections in the EA community this year as in 2021. Apply by October 11th.
$13,000 of prizes for changing our minds about who to fund (Clearer Thinking Regrants Forecasting Tournament)
by spencerg
$13K of prizes are up for grabs. Win some by either changing Clearer Thinking’s mind about which of 28 finalists to fund (and by how much), or by being a top-20 forecaster of which projects they end up funding.
[Open position] S-Risk Community Manager at CLR
by stefan.torges
This role will be across areas like event & project management, 1:1 outreach & advising calls, setting up & improving IT infrastructure, writing, giving talks, and attending in-person networking events. Previous community building experience is helpful but not required. Deadline Oct 16th.
$5k challenge to quantify the impact of 80,000 hours' top career paths
by NunoSempere
$5k prize pool for quantitative estimates of the value of some or all of 80,000 hours' top 10 career paths. Flexibility on units (eg. QALYs, % x-risk reduction) and methodology. Deadline Nov 1st.
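To give a flavor of what such an estimate might look like, here is a minimal Monte Carlo sketch with invented toy numbers - not a figure from the challenge or from 80,000 Hours - showing how uncertainty might be propagated through a simple 'probability of success times impact' model:

```python
import random

# Toy Monte Carlo estimate of the value of one career path (illustrative numbers only).
random.seed(0)

def sample_career_value():
    p_success = random.uniform(0.05, 0.3)            # chance the career has its intended effect
    impact_if_success = random.lognormvariate(8, 1)  # QALYs produced if it does (toy distribution)
    return p_success * impact_if_success

samples = sorted(sample_career_value() for _ in range(100_000))
mean = sum(samples) / len(samples)
print(f"mean ~{mean:,.0f} QALYs, median ~{samples[len(samples) // 2]:,.0f}, "
      f"90th pct ~{samples[int(0.9 * len(samples))]:,.0f}")
```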
The $100,000 Truman Prize: Rewarding Anonymous EA Work
by Drew Spartz
The Truman Prize, now live on EA prize platform Superlinear, recognizes Effective Altruists with $5,000-$10,000 prizes for declining credit in order to increase their impact, in ways that can't be publicized directly. Submissions are now open.
Community & Media
Let's advertise infrastructure projects
by Arepo
There are many projects providing free or cheap support to EAs globally, but they aren’t well known. This post and its comment section aim to collect them. Multiple examples are linked in each of the following categories: coworking and socializing spaces, professional services, coaching, and financial support.
Levels of donation
by vipulnaik
Personas of donors by donation amount - eg. a retail donor (<$1K) is less likely to care about transaction costs or to investigate charities. Each level is separated by a factor of 10. The author also discusses how a person might move up levels (donate substantially more), eg. via increasing income or pooling donations with others.
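Since levels are separated by factors of 10, a donor's level is essentially the order of magnitude of their donations. A minimal sketch (the cutoff labels are ours and only loosely follow the post's personas):

```python
import math

# Map an annual donation amount to an order-of-magnitude "level".
# Labels are illustrative; the post defines its own personas.
LABELS = {2: "retail (<$1K)", 3: "$1K-$10K", 4: "$10K-$100K", 5: "$100K-$1M", 6: "$1M+"}

def donation_level(amount_usd: float) -> str:
    magnitude = int(math.log10(max(amount_usd, 1)))
    return LABELS.get(min(magnitude, 6), LABELS[2])

for amount in [300, 5_000, 75_000, 2_000_000]:
    print(f"${amount:>10,}: {donation_level(amount)}")
```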
EA for people with non-technical skillsets
by Ronja
EA is big, and we need a wide range of skills. This isn’t always obvious due to the visibility of AI safety and biosecurity discussions. Examples include operations management, design, communications, policy, and historical or behavioral research. There are also EAs using their existing niches, eg. Kikiope Oluwarore, a veterinarian who co-founded Healthier Hens, or Liv Boeree, a poker player who uses her network to encourage other poker players to donate.
The author suggests increasing the visibility of EAs doing this wide range of work, eg. via a ‘humans of EA’ monthly feature.
Author Rutger Bregman about effective altruism and philanthropy
by Sebastian Schwiecker
Linkpost for an interview with Rutger Bregman on his personal view of philanthropy. He’s a historian and author who is good at reaching people outside the EA community - for example, he was mentioned more than any other individual in Effektiv Spenden’s post-donation survey.
The Mistakes of Focusing on Student Outreach in EA
by DavidNash
EA started in universities, and community building efforts have heavily focused there. Even if students are on average more receptive and flexible, continuing this trend can be a mistake: it risks casting EA as a ‘youth movement’, loses out on the skills and networks of experienced professionals, and creates a lack of mentorship.
The author suggests redirecting resources at the margin, more strongly encouraging students toward general community building, skill building, or object-level work (rather than becoming university group organizers).
Criticism of the 80k job board listing strategy
by Yonatan Cale
The 80K job board doesn’t list only highly impactful jobs - it also lists jobs that are good for career capital. In an informal Twitter poll, 55% of respondents weren’t aware of this and considered it important. The author suggests only including high-impact jobs, allowing community discussion of impact level, and better communicating the current state.
Kush_kan from 80K comments that they plan to visually distinguish top-impact roles, update the job board tagline and FAQ, link organizations’ EA Forum pages, and add a new feedback form to help address this. They also note most roles are there for a mix of impact and career capital reasons.
What Do AI Safety Pitches Not Get About Your Field?
by Aris Richardson
Arguments that misunderstand a field can reduce credibility and put experts from those fields off. The question author wants to collect examples to minimize this happening with AI safety. Responses include:
Psychology (there is no clear definition of ‘intelligence’ that encompasses eg. cultural intelligence)
Economics (forecasting double-digit GDP growth based on AI feels dubious)
Anthropology (the ‘AI is to us what we are to chimps’ analogy misunderstands how humans acquired power and knowledge, ie. via cumulative cultural changes over time)
Summarizing the comments on William MacAskill's NYT opinion piece on longtermism
by West
Numerical summary of 300 comments on MacAskill’s NYT piece on longtermism: 60 were skeptical and 42 were positive. The most common skepticism themes were ‘Our broken culture prevents us from focusing on the long-term’ (20) and ‘We're completely doomed, there's no point’ (16). Few commenters engaged on biorisk or AI, instead associating concern for the long-term future with environmental concern. Many also assumed ‘long-term’ referred to 2-7 generations.
Optimizing seed:pollen ratio to spread ideas
by Holly_Elmore
EA community building often talks about the ‘funnel’, and more recently the focus has been on creating core EAs within that funnel. Another model: outreach can be seed-like (putting lots of resources into a few promising proto-EAs) or pollen-like (low resource per person, but spread widely). Like plants, we need to get the ratio right. The author suggests we’re currently too weighted towards seeds and should be doing more pollen-like efforts - spreading ideas to a wide audience. Sometimes it will stick and someone will become a core EA, despite not having had much dedicated support.
Announcing “Effective Dropouts”
by Yonatan Cale, Gavin, Vardev, Jonny Spicer
Author’s tl;dr: “Effective Dropouts” is meant to be a casual fun excuse for reminding people that dropping out of a degree can be a good decision, as a counter to all the existing pressure for degrees being the default/only/obvious way. The rest of the post is mostly a joke.
Establishing Oxford’s AI Safety Student Group: Lessons Learnt and Our Model
by CharlieGriffin, juliakarbing
The group aimed to increase the number and quality of technical people pursuing a career in AI safety research. Key lessons included:
Targeting an AI audience instead of an EA one greatly increased the technical audience. AI safety was still an easy sell without EA / longtermist philosophy.
Socials after talks were of high value.
Expert supervisors providing research questions and limited (~1hr per week) support to groups of students was an effective development opportunity.
ETGP 2022: materials, feedback, and lessons for 2023
by trammell
Lecture slides and exercises from the course ‘Topics in Economic Theory and Global Prioritization’, designed primarily for economics graduate students considering careers in global priorities research. The program included lunches, social opportunities, and shared accommodations, and will run again in 2023.
All participants were very satisfied or satisfied with the course, and over half (18/34) wrote that it may have or definitely changed their plans. Social aspects were the favorite portion.
My Personal Story of Rejection from EA Global
by Constance Li
The author has been involved with EA for over a decade, making significant life-path changes such as earning to give (in medicine), launching a $60K+ per month business and donating the profits, and running successful cage-free egg campaigns. They’ve felt alternately welcomed (EAGx Virtual, local groups, and a supportive CEA member) and disillusioned (multiple EAG rejections with little context and unsatisfactory replies) by the EA community.
The author suggests improvements to the EAG admissions process:
Consider how rejection can feel like a judgment of applicants’ worth. Send better responses, which could include:
Links to sensitive explanations of common reasons for rejection.
A semi-automated system for feedback, eg. flagging applications by primary reason for rejection and sending templated emails based on that.
Training CEA staff on authentic relating / non-violent communication methods.
Analyze the cost of rejection, experiment with interventions to reduce it, and publish the results.
Address the power of a small group (CEA) in admissions via transparency, blinding admissions, and potentially renaming EAG (eg. to CEA Global).
Why Wasting EA Money is Bad
by Jordan Arel
EA spends on luxuries to save time, increase productivity, or build community (eg. flights, Uber Eats, fancy retreats). This is usually justified with the argument that some EA work can be incredibly impactful - to the point where 30 minutes of time might be worth more than a year of someone’s life (which costs ~$200 to add via donations). However, the author argues frugality is important to EA’s internal and external reputation and to a healthy community, and can be more motivating than luxury.
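To make the trade-off concrete, here is the arithmetic implied by the '30 minutes vs. a year of life' comparison above. The ~$200 per life-year figure is the summary's; the 2,000-hour work year is our assumption:

```python
# Back-of-the-envelope on the time-vs-donations trade-off described above.
cost_per_life_year = 200          # ~$ to buy a year of life via donations, per the summary
minutes_claimed_equivalent = 30   # the claim: 30 minutes of some EA work is worth more than that
work_hours_per_year = 2_000       # assumption: a full-time work year

implied_value_per_hour = cost_per_life_year / (minutes_claimed_equivalent / 60)
print(f"Implied value of that person's time: >${implied_value_per_hour:,.0f}/hour")
print(f"Over a work year: >${implied_value_per_hour * work_hours_per_year / 1e6:.1f}M of donation-equivalent value")
```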
CERI Research Symposium Presentations (incl. Youtube links)
by Will Aldred
Links to talks by 22 fellows from CERI (Cambridge Existential Risk Initiative)’s research symposiums over the past 2 years, split by subject areas. These were given by summer research fellows, and cover AI risk (technical & governance), biorisk, nuclear risk, extreme climate change, and meta x-risk.
Learning from Matching Co-Founders for an AI-Alignment Startup
by Patrick Gruban, LRudL, Maris Sala, simeon_c, Marc Carauleanu
Beth Barnes shared an idea and proposal for human data for AI alignment with multiple people at EAG London. Six interested people then self-organized cofounder selection.
Their steps were to ask CE for advice, answer ‘50 questions to explore with cofounders’, work weekly in pairs on test tasks for five weeks, and meet for an in-person workshop to finalize preferences and choose a co-founding pair.
Participants rated the pair work highest, and suggested focusing it on customer interviews to better define the intervention (in this case, the final cofounder pair dropped the project after doing this stage, post cofounder matching). The reveal of preferences on who to co-found with was also highly rated and successfully selected one pair. Other stages could have been cut or done informally to speed up the process.
LW Forum
AI Related
Quintin's alignment papers roundup - week 2
by Quintin Pope
Weekly themed round-ups of papers, published each Monday. They include links, abstracts and Quintin’s thoughts. This week’s theme is the structure/redundancy of trained models, as well as linear interpolations through parameter space.
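For readers unfamiliar with the 'linear interpolations through parameter space' idea, the sketch below shows the basic operation on two toy weight vectors - evaluating a loss at points along the straight line between them. It is a generic illustration with a made-up loss, not code from any paper in the roundup:

```python
import numpy as np

# Toy illustration of linear interpolation between two sets of model parameters.
rng = np.random.default_rng(0)
theta_a = rng.normal(size=10)   # stand-ins for two trained models' flattened weights
theta_b = rng.normal(size=10)

def loss(theta):
    # Stand-in loss surface; real work would evaluate the network on data.
    return float(np.sum((theta - 1.0) ** 2))

for alpha in np.linspace(0.0, 1.0, 5):
    theta = (1 - alpha) * theta_a + alpha * theta_b   # point on the line between the two models
    print(f"alpha={alpha:.2f}  loss={loss(theta):.2f}")
```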
Summaries: Alignment Fundamentals Curriculum
by Leon Lang
Linkpost for the author’s summaries of most core readings, and many further readings, from the alignment fundamentals curriculum composed by Richard Ngo.
Alignment Org Cheat Sheet
by Akash, Thomas Larsen
Describes the work of AI Safety researchers, research organizations, and training programs in one sentence each. Not intended to be comprehensive, but includes 15 researchers / research orgs, and 8 training / mentoring programs.
Understanding Infra-Bayesianism: A Beginner-Friendly Video Series
by Jack Parker, Connall Garrod
New video series covering what infra-Bayesianism (IB) is at a high level and how it's supposed to help with alignment, assuming no prior knowledge. It also targets those who want to gain mastery of the technical details behind IB so that they can apply it to their own alignment research, and is good preparation for more technical sources like the original IB sequences.
Nearcast-based "deployment problem" analysis
by HoldenKarnofsky
The AI deployment problem is the question of how and when to (attempt to) build and deploy powerful AI when unsure about safety and about how close other actors are to deploying. This post assumes a major AI company (‘Magma’) believes it is 6 months to 2 years away from transformative AI (TAI) via human feedback on diverse tasks (HFDT), and asks what both Magma and an organization dedicated to tracking and censoring dangerous AI (‘IAIA’) would ideally do in this situation.
Holden splits this into three phases, with suggestions below. This forecast implies actions we should take today, such as creating an IAIA equivalent, sharing information selectively, and labs doing outreach and advocacy (a non-comprehensive list of Holden’s suggestions).
Phase 1: Before Magma develops aligned TAI
Magma should:
Focus on developing aligned TAI asap, before other actors.
Reduce risk from other actors - prioritize internal security, reduce ‘race’ pressure by making deals with other AI companies, and educate other players on the risks of misaligned AI.
IAIA should:
Monitor all major AI development projects, looking for signs of dangerous AI, ensuring sufficient safety measures, information security, and selective info sharing, and taking action where they find issues.
Serve as a hub for public goods such as education on AI alignment or coordinating deals between different orgs.
Selective info sharing is important for both parties. ‘Dual-use’ information (helpful both for avoiding misaligned AI and for making powerful AI) should be shared more readily with cautious parties.
Phase 2: Magma has developed aligned TAI, but other actors may develop misaligned TAI
Magma & IAIA should:
Focus on deploying AI systems that can reduce the risk of other actors causing a catastrophe (eg. systems that can align more powerful systems, patch cybersecurity holes, cover more uses to reduce the space available for other AIs, detect and obstruct misaligned AIs, enforce safety procedures, or offer general guidance).
Reduce misuse risk of the aligned TAI, eg. deploy with appropriate oversight, ensure users have good intentions, and bake in some resistance to misuse.
If, despite this, dangerous AI is catching up, Magma & IAIA should consider drastic action:
Recommend governments suppress AI development by any means necessary.
Develop AI capabilities to persuade or threaten actors to achieve the above - even if that holds some misalignment risk in itself.
Phase 3: No actors are likely to be able to develop misaligned TAI
Magma & IAIA should:
Focus on avoiding lock-in of bad worlds where one player has seized global power.
Broker peaceful compromises or coalitions.
Design AIs that may help humans with moral progress.
Towards deconfusing wireheading and reward maximization
by leogao
A response to “Reward is not the optimization target”. The author argues that it’s not possible for reinforcement learning policies to care about “reward” in an embedded setting, but wireheading in RL agents is still possible because wireheading doesn’t mean “the policy has reward as its objective”.
Public-facing Censorship Is Safety Theater, Causing Reputational Damage
by Yitz
Censorship in AI (for instance, preventing the user from viewing model results that contain swear words or real faces) is an issue because:
a) Like other forms of social censorship, it can have negative social effects - eg. see the controversy over social media censorship.
b) It confuses the term ‘AI safety’, such that people associate it with censorship under a false banner of ‘safety’ and take a negative view of the field overall.
The author suggests reducing censorship in public-facing models, using differentiated terminology for different types of AI safety, and more media interaction to shape opinion. A top comment notes that ‘safety’ in the online context is already strongly associated with the prevention of things like pornography and racism, and that ‘AI security’ may be a better term to use.
Toy Models of Superposition
by evhub
Linkpost for a new Anthropic paper exploring superposition - where one neuron in a network is used to capture several unrelated concepts, which is a problem for interpretability.
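As a rough intuition pump (not the paper's actual setup), the sketch below squeezes more features than dimensions into a small vector space; reading any one feature back out then picks up interference from the others, which is the basic superposition phenomenon:

```python
import numpy as np

# Toy illustration of superposition: 8 features squeezed into 4 dimensions.
rng = np.random.default_rng(0)
n_features, n_dims = 8, 4

# Each feature gets a unit direction in the low-dimensional space.
directions = rng.normal(size=(n_features, n_dims))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Represent an input where only feature 0 is active.
feature_values = np.zeros(n_features)
feature_values[0] = 1.0
hidden = feature_values @ directions          # shape (n_dims,)

# Reading features back out via dot products shows interference between features.
recovered = directions @ hidden
print(np.round(recovered, 2))  # element 0 is ~1, the others are nonzero "interference"
```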
Research & Productivity Advice
How my team at Lightcone sometimes gets stuff done
by jacobjacob
Lightcone’s founder on practices that seem to have helped with high output. These primarily revolve around removing blockers, and include:
Have a single decision-maker to resolve ties.
Have clear top priorities, and set aside time where everyone is focused on just them (and time specifically for other stuff, eg. a day for less important meetings).
Work together (team in the same room, work in pairs or trios sometimes, no-one remote, everyone full time; if you’re blocked ask for help immediately, and if you’re tagged respond immediately).
Keep context high (lots of 1-1s; never DM in Slack - use a public channel named for your 1-1; make time for chit-chat).
Losing the root for the tree
by Adam Zerner
Story-like examples of getting lost in the details, and why it’s worthwhile to step back and ask ‘what was this aiming at? Is it the most effective way to get it?’ Hierarchical tree representations with lines weighted on impact can be a clear way to visualize this.
You Are Not Measuring What You Think You Are Measuring
by johnswentworth
When you measure a metric, you usually don’t learn what you think you do. For instance, a company measures click-through rate (CTR) on two sign-up flows to determine which info is more important to put upfront, but really the CTR changed due to latency differences between the two options. Solution: measure lots of things, and you might figure out what’s really going on. Some tools are better at this (eg. a microscope gives heaps of info; an A/B test gives just one data point). In addition to your own experimental design, keep this in mind when reading others’ research - don’t just take in the abstract and the p-values; look at the data and cross-reference other papers to build a picture of what’s going on.
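A minimal simulation of the sign-up-flow example (all numbers invented) shows how a latency difference alone can masquerade as a difference in how compelling the content is:

```python
import random

# Simulate the A/B test described above: the two flows differ in latency, not persuasiveness.
random.seed(42)

def simulate_flow(n_visitors, base_ctr, extra_latency_s):
    # Assumption: each extra second of load time loses ~5% of would-be clickers.
    effective_ctr = base_ctr * (1 - 0.05 * extra_latency_s)
    clicks = sum(random.random() < effective_ctr for _ in range(n_visitors))
    return clicks / n_visitors

ctr_a = simulate_flow(20_000, base_ctr=0.10, extra_latency_s=0.0)  # flow A: fast
ctr_b = simulate_flow(20_000, base_ctr=0.10, extra_latency_s=2.0)  # flow B: same content value, slower

print(f"Flow A CTR: {ctr_a:.3%}   Flow B CTR: {ctr_b:.3%}")
print("Same underlying preference for the content - the gap is entirely latency.")
```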
Some notes on solving hard problems
by Joe Rocca
Tips for solving hard problems:
Start simple
Question assumptions
Make it easier to hold in your head
Twitter Polls: Evidence is Evidence
by Zvi
Breakdown of an argument on Twitter about Twitter polls. One user argues they aren’t representative and contain response bias, and therefore offer actively ‘bad’ (ie. misleading) evidence and shouldn’t be used. The post author argues that any evidence is good as long as you update on it properly, considering the context (such as the sample and any likely bias), and therefore we should do more Twitter polls since they’re cheap and neat. There are methods to help with bias, such as comparing results between polls by the same user.
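A small sketch of the 'update on it properly' point: treat a biased poll as a noisy signal and fold it into a Bayesian update rather than discarding it. The likelihoods below are invented purely for illustration:

```python
# Toy Bayesian update on a biased Twitter poll about whether most people prefer X.
# Hypotheses: H = "a majority really prefers X" vs not-H.
prior_h = 0.5

# Invented likelihoods: the poll says "X wins". Because the poll audience is skewed toward X fans,
# a pro-X result is fairly likely even if H is false - but still more likely if H is true.
p_poll_says_x_given_h = 0.9
p_poll_says_x_given_not_h = 0.6

posterior_h = (p_poll_says_x_given_h * prior_h) / (
    p_poll_says_x_given_h * prior_h + p_poll_says_x_given_not_h * (1 - prior_h)
)
print(f"Posterior P(H) after the biased poll: {posterior_h:.2f}")  # 0.60 - a modest, not zero, update
```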
A game of mattering
by KatjaGrace
A productivity hack for when the number of to-do tasks feels overwhelming. The author organizes the tasks into time boxes during the day and tries to ‘collect’ as many as possible by completing them before the end of their box. If they complete one early, they can go back and try to retrieve an earlier one. This helps by focusing them on one task at a time and making every minute feel meaningful.
Methodological Therapy: An Agenda For Tackling Research Bottlenecks
by adamShimi, Lucas Teixeira, remember
Interviews with AI alignment researchers found they reported the following as difficult and costly: running experiments, formalizing intuitions, unifying disparate insights into a coherent frame, and proving theorems. Conjecture’s epistemology team questions whether these four things are the right approaches, or just ‘cached patterns’ of how to do research that could be adjusted to better suit the context. They plan to consult with more alignment researchers; to question, refine, and replace cached patterns that aren’t serving their purpose; and thereby to improve alignment research. They call this ‘methodological therapy’, and have developed some framing questions to kick off the work (eg. ‘what are the researchers trying to accomplish?’).
Scraping training data for your mind
by Henrik Karlsson
“To get good at something—parenting, writing code, doing research—you need to internalize examples of prime achievements in that field.” Look for awards, review articles of scientific fields, who is cited etc. to generate a list of those at the top of your domain. Then study those people’s processes - your brain will learn from the training data.
Or in short: Identify your aim. Locate the prime achievements in this field, and observe them in a messy living context. Reflect on how your understanding of the world has changed. Is this a fitting method for you? If not, course correct.
Fake qualities of mind
by Kaj_Sotala
If you’re not feeling motivated, you might push yourself through something instead. But if that becomes the norm, you can forget that ‘pushing yourself’ was a substitute for true motivation to begin with. This pattern can happen in many areas, eg. ‘doing’ empathy vs. ‘having’ empathy. A common trigger is that stress blocks the useful / happy mind state, so you substitute a forced one, stay stressed, and can’t get the original back. The first step to fixing this is to notice it.
Other
Gene drives: why the wait?
by Metacelsus
Gene drives work by transmitting an allele to 100% of offspring, so it soon covers the whole population. In 2018, CRISPR allowed the Crisanti lab to create a gene drive in a lab environment that suppressed all reproduction in a major malaria-transmitting mosquito species. However, it has still not been released in the wild, for two reasons:
Possibility of the mosquitoes generating resistant alleles - so the population is not eliminated, and future gene drives become harder. (Being addressed via larger tests to ensure this doesn’t occur.)
If done without good local and government buy-in, it could cause a backlash that restricts the development of other gene drives. (Being addressed via building community consensus.)
In the meantime, ~1.6K people die from malaria every day. Is the wait worth it?
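To illustrate the '100% transmission' mechanism described above, here is a toy simulation comparing allele spread under ordinary Mendelian inheritance (a carrier parent passes the allele on half the time) with an idealized gene drive (passed on every time). It deliberately ignores fitness costs, resistance alleles, and population structure - the very complications the post discusses:

```python
import random

# Toy comparison: fraction of the population carrying an allele over generations.
random.seed(1)

def spread(transmission_prob, generations=12, pop_size=10_000, initial_carriers=100):
    carrier_freq = initial_carriers / pop_size
    history = [carrier_freq]
    for _ in range(generations):
        new_carriers = 0
        for _ in range(pop_size):
            parent_a_carrier = random.random() < carrier_freq
            parent_b_carrier = random.random() < carrier_freq
            if parent_a_carrier and parent_b_carrier:
                inherits = True
            elif parent_a_carrier or parent_b_carrier:
                inherits = random.random() < transmission_prob
            else:
                inherits = False
            new_carriers += inherits
        carrier_freq = new_carriers / pop_size
        history.append(carrier_freq)
    return history

mendelian = spread(transmission_prob=0.5)   # stays near its starting frequency
gene_drive = spread(transmission_prob=1.0)  # roughly doubles each generation until it saturates
print("Mendelian carrier frequency: ", [f"{x:.2f}" for x in mendelian])
print("Gene-drive carrier frequency:", [f"{x:.2f}" for x in gene_drive])
```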
LW Petrov Day 2022 (Monday, 9/26)
by Ruby
Petrov Day commemorates Stanislav Petrov avoiding a possible nuclear war by reporting sensor warnings of a nuclear attack as a false alarm. LW will celebrate with a button that can bring down the site’s frontpage for everyone, available anonymously to all existing users with non-negative karma.
Announcing $5,000 bounty for ending malaria
by lc
The author will give $5K to anyone who reduces malaria by 95%+ without causing negative effects that outweigh that, because it seems like such an achievement should have a reward.
Orexin and the quest for more waking hours
by ChristianKl
People with the DEC2-P384R mutation produce more prepro-orexin and have a reduced need for sleep. Plausible reasons it wasn’t selected for evolutionarily include a higher food need and metabolism, and less anxiety / more risk-taking. Because orexin is a natural human protein it can’t be patented, so it hasn’t been studied in detail. Trials are underway for orexin agonists (which activate its receptors) for use in treating narcolepsy. The author suggests funding direct orexin supplementation studies - initially for narcolepsy to get FDA approval, and then for reducing general sleep needs.
Didn’t Summarize
Do bamboos set themselves on fire? by Malmesbury
The Redaction Machine by Ben (original fiction story about a world where you can wind someone back to a prior state)
Dath Ilan's Views on Stopgap Corrigibility by David Udell
Interpreting Neural Networks through the Polytope Lens by Sid Black, Lee Sharkey, Connor Leahy, beren, CRG, merizian, EricWinsor, Dan Braun
Funding is All You Need: Getting into Grad School by Hacking the NSF GRFP Fellowship
by hapanin
This Week on Twitter
AI
OpenAI trained an AI that “approaches human-level robustness and accuracy on English speech recognition”. (tweet) Other tweets note it works well even with fast speech and back-tracking. (tweet)
DeepMind released a new chatbot, Sparrow, trained with human feedback to follow rules like not impersonating humans, and to search the internet for helpful info. (tweet)
CSET shares that China is building “cyber ranges” to allow cybersecurity teams to test new tools and practice attack and defense. (article)
National Security
Vladimir Putin has threatened the use of nuclear weapons, saying Russia had ‘lots of weapons to reply’ to threats and that he was ‘not bluffing’. (article) The US and its allies have threatened catastrophic consequences if nuclear weapons are used. (article)
Current Metaculus forecasts put the chance of Russia launching a nuclear weapon before 2023 at 3%, after a brief period directly after the announcement when it rose to 6%. (metaculus forecast) The Metaculus forecast for the chance of a non-test nuclear detonation before 2035 (anywhere globally) has risen to 27%.
Russia has initiated partial conscription, mobilizing ~300K of its reserve forces. Thousands have fled Russia in response. (metaculus forecast) (article)
Science
NASA successfully crashed a spacecraft into an asteroid, to test the ability to deflect a problematic one. (tweet)
Scientists engineered mosquitoes that slow the growth of malaria-causing parasites in their guts, preventing transmission to humans - but this reduces mosquito lifespan, so it is likely not viable in practice. (tweet)
The White House announced $2B funding to launch a National Biotechnology and Biomanufacturing Initiative which aims to foster innovation, strengthen supply chains, mitigate biological risks and improve health outcomes. (tweet)