There are two very similar pages: this one and https://www.lesswrong.com/tag/scoring-rules/
By "refining pure human feedback", do you mean refining RLHF ML techniques?
I assume you still view enhancing human feedback as valuable? And also more straightforwardly just increasing the quality of the best human feedback?
Amazing! Thanks so much for making this happen so quickly.
To anyone who's trying to figure out how to get it to work on Google Podcasts, here's what worked for me (searching the name didn't, maybe this will change?):
Go to the Libsyn link. Click the RSS symbol. Copy the link. Go to Google Podcasts. Click the Library tab (bottom right). Go to Subscriptions. Click the symbol in the upper right that looks like adding a link. Paste the link and confirm.
Hey Paul, thanks for taking the time to write that up, that's very helpful!
Hey Rohin, thanks a lot, that's genuinely super helpful. Drawing analogies to "normal science" seems both reasonable and like it clears the picture up a lot.
I would be interested to hear opinions about what fraction of people could possibly produce useful alignment work?
Ignoring the hurdle of "knowing about AI safety at all", i.e. assuming they took some time to engage with it (e.g. they took the AGI Safety Fundamentals course). Also assume they got some good mentorship (e.g. from one of you) and then decided to commit full-time (and got funding for that). The thing I'm trying to get at is more about having the mental horsepower + epistemics + creativity + whatever other qualities are useful, or likely being able to get there after some years of training.
Also note that I mean direct useful work, not indirect meta things like outreach or being a PA to a good alignment researcher etc. (these can be super important, but I think it's productive to think of them as a distinct class). E.g. I would include being a software engineer at Anthropic, but exclude doing grocery-shopping for your favorite alignment researcher.
An answer could look like "X% of the general population" or "half the people who could get a STEM degree at Ivy League schools if they tried" or "a tenth of the people who win the Fields medal".
I think it's useful to have a sense of this for many purposes, incl. questions about community growth and the value of outreach in different contexts, as well as priors about one's own ability to contribute. Hence, I think it's worth discussing honestly, even though it can obviously be controversial (with some possible answers implying that most current AI safety people are not being useful).
That's a very detailed answer, thanks! I'll have a look at some of those tools. Currently I'm limiting my use to a particular 10-minute window per day with freedom.to + the app BlockSite. It often costs me way more than 10 minutes of focus though (checking links after, procrastinating before...), so I might try to find an alternative.
Sorry for the tangent, but how do you recommend engaging with Twitter, without it being net bad?
Thanks! Great to hear that it's going well!
I work at Open Philanthropy, and I recently let Gavin know that Open Phil is planning to recommend a grant of $5k to Arb for the second project on your list: Overview of AI Safety in 2024 (they had already raised ~$10k by the time we came across it). Thanks for writing this post, Austin; it brought the funding opportunity to our attention.
Like other commenters on Manifund, I believe this kind of overview is a valuable reference for the field, especially for newcomers.
I wanted to flag that this project would have been eligible for our RFP for work that builds capacity to address risks from transformative AI. I worry that not all potential applicants are aware of the RFP or its scope, so I’ll take this opportunity to mention that this RFP’s scope is quite broad, including funding for:
More details at the link above. People might also find this page helpful, which lists all currently open application programs at Open Phil.