Summaries: Alignment Fundamentals Curriculum

Leon Lang

This is a linkpost for https://docs.google.com/document/d/1mVwQgyrEgWJ9xsO6wIizDp2T3578OqNe4K5goKiOLgE/edit?usp=sharing

The linked document provides my summaries for most core readings and many further readings of the alignment fundamentals curriculum composed by Richard Ngo, as accessed from July to early September 2022. Additionally, it often contains my preliminary opinions on the texts. Note that I’m not an expert on the topic.

I have read all texts while simultaneously doing full-time work unrelated to AI alignment, and thus, due to time constraints, many summaries probably contain mistakes, and my opinions would change upon further reflection. Additionally:

I only streamlined the process after a few weeks
the summaries of the first weeks are of lower quality, and more of them or my opinions are missing
Some summaries are also missing since I had a minor repetitive strain issue along the way, and since the curriculum changed while reading through it
Sometimes, the formatting is not ideal since I originally wrote the summaries on a slack channel and then copy-pasted them to google docs

Nevertheless, I was told that these summaries are useful, and therefore I’m sharing them with the wider community of people interested in alignment.

If anyone wants to contribute their own summary, please put a suggestion into the google doc, and I will accept it with an attribution to the (optionally anonymous) author.

Acknowledgments: I want to thank Albert Garde, Benjamin Kolb, Fritz Dorn, Jens Brandt, and Tom Lieberum for discussions on the curriculum.

If you do AI policy, this is a great way to quickly skill up at explaining alignment, and also quickly skilling up on AI itself.

So far, the summaries are only "tested" by people who have worked through the whole curriculum themselves. They used the summaries to check their understanding of the articles and contrast their view with mine.

So I'm not yet confident that someone could just read my summaries without at the same time going through the full articles, but it seems worth a try.

Thank you for writing this. I plan on working through it over the next couple weeks to fill in gaps in my previous alignment-related knowledge.

If you do AI policy, this is a great way to quickly skill up at explaining alignment, and also quickly skilling up on AI itself.

So I'm not yet confident that someone could just read my summaries without at the same time going through the full articles, but it seems worth a try.

Thank you for writing this. I plan on working through it over the next couple weeks to fill in gaps in my previous alignment-related knowledge.

LESSWRONG
LW

LESSWRONG
LW

44

Summaries: Alignment Fundamentals Curriculum

44

Ω 10

44

Ω 10

44

Ω 10