Regarding our career development and transition funding (CDTF) program:
Yeah, I was thinking of PhD programs as one of the most common longer-term grants.
Agree that it's reasonable for a lot of this funding to be shorter, but I also think that, given the shifting funding landscape in which most research that is good by my lights can no longer get funding, I would be quite hesitant for people to substantially sacrifice career capital in the hopes of getting funding later (or more concretely, I think it's the right choice for people to choose a path where they end up with a lot of slack to think about what directions to pursue, instead ...
Just wanted to flag quickly that Open Philanthropy's GCR Capacity Building team (where I work) has a career development and transition funding program.
The program aims to provide support—in the form of funding for graduate study, unpaid internships, self-study, career transition and exploration periods, and other activities relevant to building career capital—for individuals at any career stage who want to pursue careers that could help reduce global catastrophic risks (esp. AI risks). It’s open globally and operates on a rolling basis.
I realize that this ...
I would mostly advise people against making large career transitions on the basis of Open Phil funding, or if you do, I would be very conservative with it. Like, don't quit your job because of a promise of 1 year of funding, because it is quite possible your second year will only be given conditional on you aligning with the political priorities of OP funders or OP reputational management, and career transitions usually take longer than a year. To be clear, I think it often makes sense to accept funding from almost anyone, but in the case of OP it is fundi...
Thanks for the feedback! I’ll forward it to our team.
I think I basically agree with you that from reading the RFP page, this project doesn’t seem like a central example of the projects we’re describing (and indeed, many of the projects we do fund through this RFP are more like the examples given on the RFP page).
Some quick reactions:
I work at Open Philanthropy, and I recently let Gavin know that Open Phil is planning to recommend a grant of $5k to Arb for the second project on your list: Overview of AI Safety in 2024 (they had already raised ~$10k by the time we came across it). Thanks for writing this post Austin — it brought the funding opportunity to our attention.
Like other commenters on Manifund, I believe this kind of overview is a valuable reference for the field, especially for newcomers.
I wanted to flag that this project would have been eligible for our RFP for work tha...
By "refining pure human feedback", do you mean refining RLHF ML techniques?
I assume you still view enhancing human feedback as valuable? And also more straightforwardly just increasing the quality of the best human feedback?
Amazing! Thanks so much for making this happen so quickly.
To anyone who's trying to figure out how to get it to work on Google Podcasts, here's what worked for me (searching for the podcast by name didn't work; maybe that will change?):
1. Go to the Libsyn link, click the RSS symbol, and copy the link.
2. Go to Google Podcasts and open the Library tab (bottom right).
3. Go to Subscriptions.
4. Click the symbol in the upper right that looks like adding a link.
5. Paste the link and confirm.
I would be interested to hear opinions: what fraction of people could possibly produce useful alignment work?
Ignoring the hurdle of "knowing about AI safety at all", i.e. assuming they took some time to engage with it (e.g. they took the AGI Safety Fundamentals course). Also assume they got some good mentorship (e.g. from one of you) and then decided to commit full-time (and got funding for that). The thing I'm trying to get at is more about having the mental horsepower + epistemics + creativity + whatever other qualities are useful, or likely being a...
"Possibly produce useful alignment work" is a really low bar, such that the answer is ~100%. Lots of things are possible. I'm going to instead answer "for what fraction of people would I think that the Long-Term Future Fund should fund them on the current margin".
If you imagine that the people are motivated to work on AI safety, get good mentorship, and are working full-time, then I think on my views most people who could get into an ML PhD in any university would qualify, and a similar number of other people as well (e.g. strong coders who are less good a...
(Off the cuff answer including some random guesses and estimates I won't stand behind, focused on the kind of theoretical alignment work I'm spending most of my days thinking about right now.)
Over the long run I would guess that alignment is broadly similar to other research areas, where a large/healthy field could support lots of work from lots of people, where some kinds of contributions are very heavy-tailed but there is a lot of complementarity and many researchers are having large overall marginal impacts.
Right now I think difficulties (at least for g...
That's a very detailed answer, thanks! I'll have a look at some of those tools. Currently I'm limiting my use to a particular 10-minute window per day with freedom.to plus the app BlockSite. It often costs me way more than 10 minutes of focus, though (checking links afterwards, procrastinating beforehand...), so I might try to find an alternative.
My advice: follow <50 people, maybe <15 people, and always have the setting on "Latest Tweets". That way, the algorithms don't have power over you, it's impossible to spend a lot of time (because there just aren't that many tweets), and since you filtered so hard, hopefully the mean level of interesting-ness is high.
Maybe I'm being stupid here. On page 42 of the write-up, it says:
In order to ensure we learned the human simulator, we would need to change the training strategy to ensure that it contains sufficiently challenging inference problems, and that doing direct translation was a cost-effective way to improve speed (i.e. that there aren’t other changes to the human simulator that would save even more time). [emphasis mine]
Shouldn't that instead be:
In order to ensure we learned the direct translator, ...
We’re planning to evaluate submissions as we receive them, between now and the end of January; we may end the contest earlier or later if we receive more or fewer submissions than we expect.
Just wanted to note that the "we may end the contest earlier" part here makes me significantly more hesitant about trying this. I will probably still at least have a look at it, but part of me is afraid that I'll invest a bunch of time and then the contest will be announced to be over before I get around to submitting. And I suspect Holden's endorsement may make t...
Sorry for my very late reply!
Thanks for taking the time to answer; I no longer endorse most of what I wrote.
I think that if the AGI has a perfect motivation system then we win, there's no safety problem left to solve. (Well, assuming it also remains perfect over time, as it learns new things and thinks new thoughts.) (See here for the difference between motivation and reward.)
and from the post:
And if we get to a point where we can design reward signals that sculpt an AGI's motivation with surgical precision, that's fine!
This is mostly where I...
Interesting post!
Disclaimer: I have only read the abstract of the "Reward is enough" paper. Also, I don't have much experience in AI safety, but I'm considering changing that.
Here are a couple of my thoughts.
Your examples haven't entirely convinced me that reward isn't enough. Take the bird. As I see it, something like the following is going on:
Evolution chose to take a shortcut: maybe a bird with a very large brain and a lot of time would eventually figure out that singing is a smart thing to do if it received reward for singing well. But evolution being a rut...
Right at the start under "How to use this book", there is this paragraph:
If you have never been to a CFAR workshop, and don’t have any near-term plans to come to one, then you may be the kind of person who would love to set their eyes on a guide to improving one’s rationality, full of straightforward instructions and exercises on how to think more clearly, act more effectively, learn more from your experiences, make better decisions, and do more with your life. This book is not that guide (nor is the workshop itself, for that matter). I
I'm curious: what were the biggest factors that made you update?