
TLDR: “Key Phenomena in AI Risk” is a 7-week, facilitated reading group. It is aimed at people interested in conceptual AI alignment research, in particular from fields such as philosophy, systems research, biology, cognitive and social sciences.

The program will run between July and August 2023. Sign up here by May 28th.

What?

The “Key Phenomena in AI Risk” reading curriculum provides an extended introduction to some key ideas in AI risk, in particular risks from misdirected optimization or 'consequentialist cognition'. It aims to remain largely agnostic about solution paradigms.

See here for a short overview of the curriculum; here for a more extensive summary; and here for the full curriculum. 

This is a 7-week program. Each week consists of a 90-minute facilitated call to discuss the week’s key phenomena and readings, plus individual reading time (at least 2 hours, more if you would like to explore the optional readings).

The reading groups are virtual and free of charge.

For whom?

The curriculum is primarily aimed at people interested in conceptual research in AI risk and alignment. 

It is designed to be accessible to audiences in, among others, philosophy (of agency, knowledge, power, etc.) and systems research (e.g. biological, cognitive, information-theoretic, social systems, etc.).

When?

The reading groups will be taking place in July and August 2023.

We expect to run 2-6 groups of 4-8 participants each (including 1 facilitator). Each group will be led by a facilitator with substantive knowledge of AI risk.

Overview of the curriculum

  • Week 0 is dedicated to getting to know each other and clarifying how the program will work. 
  • Week 1 focuses on why important features of generally intelligent 'consequentialist cognition' might be algorithmically realizable, and on the potential implications of this.
  • Week 2 focuses on why it can be hard to direct such intelligence in a safe and beneficial direction. 
  • Week 3 discusses instrumental convergence in goal-oriented systems.
  • Week 4 discusses risks from systems that seek predictive omniscience. 
  • Week 5 discusses why surveillance (or oversight) of these artificial systems may be fraught, and in particular why it may give a differential advantage to deceptive tendencies.
  • Week 6 discusses why even an incoherent aggregation of optimizing systems could still impose a (misaligned) optimizing pressure in the world.

Here is a longer summary. Here is the full curriculum. 

The curriculum has been developed by TJ (Research Scholar, FHI) with input from Nora Ammann, Sahil Kulshrestha, and Tsvi Benson-Tilsen. The program is operationally supported by PIBBSS.

The curriculum was initially developed as part of the PIBBSS summer research fellowship, but we realized that it might also be of interest and use to people outside the fellowship program.

We are treating this round of reading groups as a way to test whether it is worth running them on a more regular basis, as well as to help us improve the program.

Sign up

Sign up here by May 28th.

About the application

The application consists of a single stage, in which we ask you to fill in a form with:

  • Your CV
  • Your motivation for participating in the program
  • Your exposure to AI risk/alignment to date

We select people based on our best understanding of their motivation to contribute to AI alignment and how much they would counterfactually benefit from participating in the program. 


If you have any questions, feel free to leave a comment below or contact us at contact@pibbss.ai.

If you are keen to facilitate one of the reading groups, please reach out as well.

4 comments

After filling out the form, I could click on "see previous responses", which allowed me to see the responses of all other people who have filled out the form so far.

That is probably not intended?

Indeed that wasn't intended. Thanks a lot for spotting & sharing it! It's fixed now.

Sounds really cool. Would be useful to have some idea of the kind of time you're planning to pick so that people in other timezones can make a call about whether or not to apply.

Good point! We are planning to gauge time preferences among the participants and fix slots then. Perhaps most relevantly, we intend to accommodate all time zones. (We have been doing this with PIBBSS fellows as well, so I am pretty confident we will be able to find time slots that work pretty well across the globe.)