Open Philanthropy is planning a request for proposals (RFP) for AI alignment projects working with deep learning systems, and we're looking for feedback both on the RFP itself and on the research directions for which we're seeking proposals. We'd be really interested in feedback from people on the Alignment Forum on the current (incomplete) draft of the RFP.
The main RFP text can be viewed here. It links to several documents describing two of the research directions we’re interested in:
- Measuring and forecasting risks
- Techniques for enhancing human feedback [Edit: this previously linked to an older, incorrect version]
Please feel free to comment either directly on the documents, or in the comments section below.
We are unlikely to add or remove research directions at this stage, but we are open to making any other changes, including to the structure of the RFP. We’d be especially interested in getting the Alignment Forum’s feedback on the research directions we present, and on the presentation of our broader views on AI alignment. It’s important to us that our writing about AI alignment is accurate and easy to understand, and that it’s clear how the research we’re proposing relates to our goals of reducing risks from power-seeking systems.
Great initiative! I'll try to leave some comments sometime next week.
Is there a deadline? (I've seen the 15th of September floating around, but I guess feedback would be valuable before that so you can take it into account?)
Also, is this the proposal mentioned by Rohin in his last newsletter, or a parallel effort?
Getting feedback in the next week would be ideal; September 15th will probably be too late.
Different request for proposals!