🥞 Apply now! (First step takes < 15 min if you have a résumé ready.) What is Ashgro? https://www.ashgro.org/ Ashgro helps AI safety projects focus on AI safety. We offer fiscal sponsorship to AI safety projects, saving them time and allowing them to access more funding. We save them time...
This is the process I used for maintaining the Predicted AI alignment event/meeting calendar. I shared it with Linda Linsefors, since she wanted to create an AI Safety Google Calendar. And I think it might be useful for others, too, so I'm sharing it here. I did the following every...
If you strap this device to your forearm, it vibrates when you raise your arm to touch your face. Useful? Not sure, because one gets many false positives. Cheap? Yes. Applicable everywhere? Yes. Update 2020-03-19: I've been using it all day while working at my desk and it has warned...
This diagram summarizes the requirements for independent AI alignment research and how they are connected. In this post I'll outline my four-year-long attempt at becoming an AI alignment researcher. It's an ‘I did X [including what I did wrong], and here's how it went’ post (see also jefftk's More writeups!)....
Paul Christiano published his and Buck Shlegeris' implementation at https://github.com/paulfchristiano/amplification. It's the code behind the article Supervising strong learners by amplifying weak experts. With William Saunders' permission, I published a version modified by him and later by me: https://github.com/rmoehn/amplification This one has changes and more documentation that allow you to run...
Link: https://github.com/rmoehn/farlamp/blob/master/build/tiny-supfail.pdf This is the first experiment report of my AI alignment research project. A reminder of what it is about: I'm studying the impact of overseer failure on RL-based IDA, because I want to know under what conditions amplification and distillation increase or decrease the failure rate, in order to...
I came up with these research project definitions when I read the iterated amplification sequence. Last year I put five of them up for voting (see Which of these five AI alignment research project ideas are no good?) and chose no. 23 to work on (see IDA with RL and...