What AI Safety Materials Do ML Researchers Find Compelling?
I (Vael Gates) recently ran a small pilot study with Collin Burns in which we showed ML researchers (randomly selected NeurIPS / ICML / ICLR 2021 authors) a number of introductory AI safety materials, asking them to answer questions and rate those materials.

Summary

We selected materials that were relatively short and disproportionately aimed at ML researchers, but we also experimented with other types of readings.[1] Within the selected readings, we found that researchers (n=28) preferred materials that were aimed at an ML audience, which tended to be written by ML researchers, and which tended to be more technical and less philosophical.

In particular, for each reading we asked ML researchers (1) how much they liked that reading, (2) how much they agreed with that reading, and (3) how informative that reading was. Aggregating these three metrics, we found that researchers tended to prefer Steinhardt > [Gates, Bowman] > [Schulman, Russell], and liked Cotra and Carlsmith least (Cotra > Carlsmith). (One illustrative way such an aggregation could work is sketched after the Commentary below.)

In order of preference (from most preferred to least preferred), the materials were:

1. “More is Different for AI” by Jacob Steinhardt (2022) (intro and first three posts only)
2. “Researcher Perceptions of Current and Future AI” by Vael Gates (2022) (first 48m; skip the Q&A) (Transcript)
3. “Why I Think More NLP Researchers Should Engage with AI Safety Concerns” by Sam Bowman (2022)
4. “Frequent arguments about alignment” by John Schulman (2021)
5. “Of Myths and Moonshine” by Stuart Russell (2014)
6. “Current work in AI Alignment” by Paul Christiano (2019) (Transcript)
7. “Why alignment could be hard with modern deep learning” by Ajeya Cotra (2021) (feel free to skip the section “How deep learning works at a high level”)
8. “Existential Risk from Power-Seeking AI” by Joe Carlsmith (2021) (only the first 37m; skip the Q&A) (Transcript)

(Not rated)

* “AI timelines/risk projections as of Sept 2022” (first 3 pages only)

Commentary

Christiano (2019), Cotra (2021), and Carlsmith (2021) …
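The post doesn't specify exactly how the three ratings were combined into a single preference ordering. As a purely illustrative sketch (the numbers are made-up placeholders and the method is an assumption on my part, not the study's actual data or analysis), one common approach for Likert-style metrics is to z-score each metric across readings and then average the standardized scores:

```python
import numpy as np

# Hypothetical ratings: rows = readings, columns = the three metrics
# (liked, agreed, informative), each on an assumed 1-7 scale.
# These values are illustrative placeholders, not the study's data.
readings = ["Steinhardt", "Gates", "Bowman", "Schulman",
            "Russell", "Christiano", "Cotra", "Carlsmith"]
ratings = np.array([
    [6.1, 5.8, 6.0],
    [5.6, 5.5, 5.4],
    [5.5, 5.3, 5.6],
    [5.0, 4.9, 5.1],
    [4.9, 4.8, 5.0],
    [4.6, 4.5, 4.7],
    [4.2, 4.0, 4.3],
    [3.9, 3.7, 4.0],
])

# Standardize each metric across readings so that a metric with larger
# spread doesn't dominate, then average the three z-scores per reading.
z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
aggregate = z.mean(axis=1)

# Print readings from most to least preferred under this aggregate.
for name, score in sorted(zip(readings, aggregate), key=lambda t: -t[1]):
    print(f"{name}: {score:+.2f}")
```

A simple unweighted mean of the raw ratings would also work if all three metrics share the same scale; standardizing first just guards against one metric's variance dominating the aggregate.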
FAQ
This is cool! Why haven't I heard of this?
Arkose has been in soft launch for a while, and we've focused more on email outreach than on public comms. But we're increasingly public, and we're in communication with other AI safety fieldbuilding organizations!
How big is the team?
Three people: Zach Thomas and Audra Zook are doing an excellent job in operations, and I'm the founder.
How do you pronounce "Arkose"? Where did the name come from?
I think any pronunciation is fine; it's the name of a rock. We have an SEO goal for arkose.org to surpass the rock's Wikipedia page.
Where does your funding come from?
The Survival and Flourishing Fund.
Are you kind of like the 80,000...