Rohin Shah

Research Scientist at Google DeepMind. Creator of the Alignment Newsletter. http://rohinshah.com/

Sequences

Value Learning
Alignment Newsletter


Comments


We've got a lot of interest, so it's taking some time to go through applications. If you haven't heard back by the end of March, please ping me; hopefully it will be sooner than that.

The answer to that question will determine which team will do the first review of your application. (We get enough applications that the first review costs quite a bit of time, so we don't want both teams to review all applications separately.)

You can still express interest in both teams (e.g. in the "Any other info" question), and the reviewer will take that into account and consider whether to move your application to the other team. But Gemini Safety reviewers aren't going to be as good at evaluating ASAT candidates, and vice versa, so you should choose the team that you think is a better fit for you.

There are different interview processes. ASAT is more research-driven while Gemini Safety is more focused on execution and implementation. If you really don't know which of the two teams would be a better fit, you can submit a separate application for each.

Our hiring this round is a small fraction of our overall team size, so this is really just correcting a minor imbalance, and shouldn't be taken as reflective of some big strategy. I'm guessing we'll go back to hiring a mix of the two around mid-2025.

You can check out my career FAQ, as well as various other resources linked from there.

Still pretty optimistic by the standards of the AGI safety field; somewhat shorter timelines than I reported in that post.

Neither of these really affects the work we do very much. I suppose if I were extremely pessimistic I would be doing something else, but even at a p(doom) of 50% I'd do basically the same things I'm doing now.

(And similarly individual team members have a wide variety of beliefs on both optimism and timelines. I actually don't know their beliefs on those topics very well because these beliefs are usually not that action-relevant for us.)

More capability research than AGI safety research, but idk what the ratio is, and it's not something I can easily find out.

Since we have multiple roles, the interview process varies across candidates, but usually it would have around 3 stages that in total correspond to 4-8 hours of interviews.

We'll leave it up until the later of those two (and probably somewhat beyond that, but that isn't guaranteed). I've edited the post.


> Is that right?

Yes, that's broadly accurate, though one clarification:

> This is not obvious because trying it out and measuring the effectiveness of MONA is somewhat costly

That's a reason (and is probably sufficient by itself), but I think a more important reason is that if your first attempt at using MONA comes at the point where problems arise, MONA will in fact go badly, whereas if you have iterated on it a bunch previously (and in particular you know how to provide appropriate nonmyopic approvals), your attempt at using MONA will go much better.
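(For readers who haven't seen the paper: here's a rough sketch of the training signal to make "nonmyopic approvals" concrete. The `env`, `policy`, and `overseer_approval` interfaces are hypothetical placeholders for illustration, not our actual code.)

```python
# Rough sketch of the MONA training signal (hypothetical interfaces):
# the agent is optimized myopically, i.e. with no credit assignment from
# future steps, but each step's reward includes a nonmyopic approval from
# an overseer judging the action's long-term promise.

def mona_step_reward(env_reward: float, approval: float,
                     approval_weight: float = 1.0) -> float:
    """Per-step training target: immediate reward plus overseer approval.

    The agent never sees discounted future rewards, so any long-horizon
    signal has to flow through the approval term.
    """
    return env_reward + approval_weight * approval


def train_episode(env, policy, overseer_approval):
    """One episode of myopic optimization with nonmyopic approval."""
    obs = env.reset()
    done = False
    while not done:
        action = policy.sample(obs)
        next_obs, env_reward, done = env.step(action)
        # The overseer scores how promising this action looks long-term;
        # the agent itself never optimizes a multi-step return.
        approval = overseer_approval(obs, action)
        # Myopic update: effectively gamma = 0, no bootstrapping.
        policy.update(obs, action,
                      target=mona_step_reward(env_reward, approval))
        obs = next_obs
```

The point of the sketch is that the agent's optimization never looks past the current step; all long-horizon judgment comes through the overseer's approval, which is exactly the part you need practice providing.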

> I think this will become much more likely once we actually start observing long-term optimization failures in prod.

Agreed, we're not advocating for using MONA now (and say so in the paper).

> Maybe an intervention I am excited about is enough training technique transparency that it is possible for people outside of labs to notice if issues plausibly stem from long-term optimization?

Idk, to be effective I think this would need to be a pretty drastic increase in transparency, which seems incompatible with many security or non-proliferation intuitions, as well as with business competitiveness concerns. (Unless you are thinking of lots of transparency to a very small set of people.)
