All of Ryan Kidd's Comments + Replies

LISA's current leadership team consists of an Operations Director (Mike Brozowski) and a Research Director (James Fox). LISA is hiring for a new CEO role; there has never been a LISA CEO.

Ryan Kidd*5-12

How fast should the field of AI safety grow? An attempt at grounding this question in some predictions.

  • Ryan Greenblatt seems to think we can get a 30x speed-up in AI R&D using near-term, plausibly safe AI systems; assume every AIS researcher can be 30x’d by Alignment MVPs
  • Tom Davidson thinks we have <3 years from 20%-AI to 100%-AI; assume we have ~3 years to align AGI with the aid of Alignment MVPs
  • Assume the hardness of aligning TAI is equivalent to the Apollo Program (90k engineer/scientist FTEs x 9 years = 810k FTE-years); therefore, we need ~
... (read more)
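For concreteness, here is a minimal sketch of how those figures combine. The 30x speed-up, ~3-year window, and 810k FTE-year difficulty estimate come straight from the bullets above; the resulting headcount is an illustrative completion of the truncated arithmetic, not a figure quoted from the original comment.

```python
# Illustrative completion of the back-of-envelope calculation above.
# Inputs are the figures stated in the bullets; the output is not from the original comment.

alignment_difficulty_fte_years = 90_000 * 9   # Apollo Program analogy: 810k FTE-years
speedup_from_alignment_mvps = 30              # assumed AI R&D multiplier per researcher
years_available = 3                           # assumed window from 20%-AI to 100%-AI

effective_fte_years_per_researcher = speedup_from_alignment_mvps * years_available
researchers_needed = alignment_difficulty_fte_years / effective_fte_years_per_researcher
print(f"~{researchers_needed:,.0f} researchers")  # ~9,000 under these assumptions
```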
Buck128

I appreciate the spirit of this type of calculation, but think that it's a bit too wacky to be that informative. I think that it's a bit of a stretch to string these numbers together. E.g. I think Ryan and Tom's predictions are inconsistent, and I think that it's weird to identify 100%-AI as the point where we need to have "solved the alignment problem", and I think that it's weird to use the Apollo/Manhattan program as an estimate of work required. (I also don't know what your Manhattan project numbers mean: I thought there were more like 2.5k scientists/engineers at Los Alamos, and most of the people elsewhere were purifying nuclear material)

5Garrett Baker
There's the standard software engineer response of "You cannot make a baby in 1 month with 9 pregnant women". If you don't have a term in this calculation for the number of research hours that must be done serially vs the number that can be done in parallel, then it will always seem like we have too few people, and should invest vastly more in growth growth growth! If you find that actually your constraint is serial research output, then you may still conclude you need a lot of people, but you will sacrifice a reasonable amount of growth speed for attracting better serial researchers. (Possibly this shakes out to mathematicians and physicists, but I don't want to bring that conversation in here)
Ryan Kidd*150

Crucial questions for AI safety field-builders:

  • What is the most important problem in your field? If you aren't working on it, why?
  • Where is everyone else dropping the ball and why?
  • Are you addressing a gap in the talent pipeline?
  • What resources are abundant? What resources are scarce? How can you turn abundant resources into scarce resources?
  • How will you know you are succeeding? How will you know you are failing?
  • What is the "user experience" of your program?
  • Who would you copy if you could copy anyone? How could you do this?
  • Are you better than the counterfactual?
  • Who are your clients? What do they want?
1Archimedes
Did you mean to link to my specific comment for the first link?
Ryan Kidd*20

And 115 prospective mentors applied for Summer 2025!

When onboarding advisors, we made it clear that we would not reveal their identities without their consent. I certainly don't want to require that our advisors make their identities public, as I believe this might compromise the intent of anonymous peer review: to obtain genuine assessment, without fear of bias or reprisals. As with most academic journals, the integrity of the process is dependent on the editors; in this case, the MATS team and our primary funders.

It's possible that a mere list of advisor names (without associated ratings) would be sufficient to ensure public trust without compromising the peer review process. We plan to explore this option with our advisors in the future.

habryka153

Yeah, it's definitely a kind of messy tradeoff. My sense is just that the aggregate statistics you provided didn't have that many bits of evidence that would allow me to independently audit a trust chain.

A thing that I do think might be more feasible is to make it opt-in for advisors to be public. E.g. SFF only had a minority of recommenders be public about their identity, but I do still think it helps a good amount to have some names.

(Also, just for historical consistency: Most peer review in the history of science was not anonymous. Anonymous peer review... (read more)

Not currently. We thought that we would elicit more honest ratings of prospective mentors from advisors, without fear of public pressure or backlash, if we kept the list of advisors internal to our team, similar to anonymous peer review.

4Austin Chen
Makes sense, thanks. FWIW, I really appreciated that y'all posted this writeup about mentor selection -- choosing folks for impactful, visible, prestigious positions is a whole can of worms, and I'm glad to have more public posts explaining your process & reasoning.
Ryan Kidd279

I'm tempted to set this up with Manifund money. Could be a weekend project.

2Marius Hobbhahn
Go for it. I have some names in mind for potential experts. DM if you're interested. 
Ryan KiddΩ6120

How would you operationalize a contest for short-timeline plans?

Marius HobbhahnΩ92111

Something like the OpenPhil AI worldview contest: https://www.openphilanthropy.org/research/announcing-the-winners-of-the-2023-open-philanthropy-ai-worldviews-contest/
Or the ARC ELK prize: https://www.alignment.org/blog/prizes-for-elk-proposals/

In general, I wouldn't make it too complicated and accept some arbitrariness. There is a predetermined panel of e.g. 5 experts and e.g. 3 categories (feasibility, effectiveness, everything else). All submissions first get scored by 2 experts with a shallow judgment (e.g., 5-10 minutes). Maybe there is some "saving" ... (read more)

But who is "MIRI"? Most of the old guard have left. Do you mean Eliezer and Nate? Or a consensus vote of the entire staff (now mostly tech gov researchers and comms staff)?

Ryan Kidd364

On my understanding, EA student clubs at colleges/universities have been the main “top of funnel” for pulling people into alignment work during the past few years. The mix of people going into those clubs is disproportionately STEM-focused undergrads, and looks pretty typical for STEM-focused undergrads. We’re talking about pretty standard STEM majors from pretty standard schools, neither the very high end nor the very low end of the skill spectrum.

At least from the MATS perspective, this seems quite wrong. Only ~20% of MATS scholars in the last ~4 program... (read more)

Ryan Kidd159

You could consider doing MATS as "I don't know what to do, so I'll try my hand at something a decent number of apparent experts consider worthwhile and meanwhile bootstrap a deep understanding of this subfield and a shallow understanding of a dozen other subfields pursued by my peers." This seems like a common MATS experience and I think this is a good thing.

Ryan Kidd110

Some caveats:

  • A crucial part of the "hodge-podge alignment feedback loop" is "propose new candidate solutions, often grounded in theoretical models." I don't want to entirely focus on empirically fleshing out existing research directions to the exclusion of proposing new candidate directions. However, it seems that, often, new on-paradigm research directions emerge in the process of iterating on old ones!
  • "Playing theoretical builder-breaker" is an important skill and I think this should be taught more widely. "Iterators," as I conceive of them, are capable
... (read more)
Ryan Kidd*4411

Alice is excited about the eliciting latent knowledge (ELK) doc, and spends a few months working on it. Bob is excited about debate, and spends a few months working on it. At the end of those few months, Alice has a much better understanding of how and why ELK is hard, has correctly realized that she has no traction on it at all, and pivots to working on technical governance. Bob, meanwhile, has some toy but tangible outputs, and feels like he's making progress.

 

I don't want to respond to the examples rather than the underlying argument, but it seems ... (read more)

4Noosphere89
Yeah, the worst-case ELK problem could well have no solution, but in practice alignment is solvable either by other methods or by an ELK solution that works on a large class of AIs like neural nets, so Alice is plausibly making a big mistake. A crux here is that I don't believe we will ever get no-go theorems, or even arguments to the standard level of rigor in physics, because I believe alignment has pretty lax constraints, so a lot of solutions can appear. The relevant sentence below:
Ryan Kidd246

Obviously I disagree with Tsvi regarding the value of MATS to the proto-alignment researcher; I think being exposed to high quality mentorship and peer-sourced red-teaming of your research ideas is incredibly valuable for emerging researchers. However, he makes a good point: ideally, scholars shouldn't feel pushed to write highly competitive LTFF grant applications so soon into their research careers; there should be longer-term unconditional funding opportunities. I would love to unlock this so that a subset of scholars can explore diverse research directions for 1-2 years without 6-month grant timelines looming over them. Currently cooking something in this space.

@yanni kyriacos when will you post about TARA and Sydney AI Safety Hub on LW? ;)

1yanni kyriacos
SASH isn't official (we're waiting on funding). Here is TARA :) https://www.lesswrong.com/posts/tyGxgvvBbrvcrHPJH/apply-to-be-a-ta-for-tara
1Towards_Keeperhood
(haha cool. perhaps you could even PM Abram if he doesn't PM you. I think it would be pretty useful to speed up his agenda through this.)

Can you make a Manifund.org grant application if you need funding?

We don't collect GRE/SAT scores, but we do have CodeSignal scores and (for the first time) a general aptitude test developed in collaboration with SparkWave. Many MATS applicants have maxed out scores for the CodeSignal and general aptitude tests. We might share these stats later.

1Daniel Tan
FWIW from what I remember, I would be surprised if most people doing MATS 7.0 did not max out the aptitude test. Also, the aptitude test seems more like an SAT than anything measuring important procedural knowledge for AI safety. 

I don't agree with the following claims (which might misrepresent you):

  • "Skill levels" are domain agnostic.
  • Frontier oversight, control, evals, and non-"science of DL" interp research is strictly easier in practice than frontier agent foundations and "science of DL" interp research.
  • The main reason there is more funding/interest in the former category than the latter is skill issues, rather than worldview differences and clarity of scope.
  • MATS has mid researchers relative to other programs.
9johnswentworth
Y'know, you probably have the data to do a quick-and-dirty check here. Take a look at the GRE/SAT scores on the applications (both for applicant pool and for accepted scholars). If most scholars have much-less-than-perfect scores, then you're probably not hiring the top tier (standardized tests have a notoriously low ceiling). And assuming most scholars aren't hitting the test ceiling, you can also test the hypothesis about different domains by looking at the test score distributions for scholars in the different areas.
Ryan Kidd2-1

I don't think it makes sense to compare Google intern salary with AIS program stipends this way, as AIS programs are nonprofits (with associated salary cut) and generally trying to select against people motivated principally by money. It seems like good mechanism design to pay less than tech internships, even if the technical bar is higher, given that value alignment is best identified by looking for "costly signals" like salary sacrifice.

I don't think the correlation for competence among AIS programs is as you describe.

I think there are some confounders here:

  • PIBBSS had 12 fellows last cohort and MATS had 90 scholars. The mean/median age of MATS Summer 2024 scholars was 27; I'm not sure what this was for PIBBSS. The median age of the 12 oldest MATS scholars was 35 (mean 36). If we were selecting for age (which is silly/illegal, of course) and had a smaller program, I would bet that MATS would be older than PIBBSS on average. MATS also had 12 scholars with completed PhDs and 11 in-progress.
  • Several PIBBSS fellows/affiliates have done MATS (e.g., Ann-Kathrin Dombrowski, Magdalena Wache,
... (read more)
3johnswentworth
I think this is less a matter of my particular taste, and more a matter of selection pressures producing genuinely different skill levels between different research areas. People notoriously focus on oversight/control/evals/specific interp over foundations/generalizable interp because the former are easier. So when one talks to people in those different areas, there's a very noticeable tendency for the foundations/generalizable interp people to be noticeably smarter, more experienced, and/or more competent. And in the other direction, stronger people tend to be more often drawn to the more challenging problems of foundations or generalizable interp. So possibly a MATS apologist reply would be: yeah, the MATS portfolio is more loaded on the sort of work that's accessible to relatively-mid researchers, so naturally MATS ends up with more relatively-mid researchers. Which is not necessarily a bad thing.

Are these PIBBSS fellows (MATS scholar analog) or PIBBSS affiliates (MATS mentor analog)?

2johnswentworth
Fellows.

Updated the figure to include LASR Labs and the Pivotal Research Fellowship, at the current exchange rate of 1 GBP = 1.292 USD.

That seems like a reasonable stipend for LASR. I don't think they cover housing, however.

That said, maybe you are conceiving of an "efficient market" that principally values impact, in which case I would expect the governance/policy programs to have higher stipends. However, I'll note that 87% of MATS alumni are interested in working at an AISI and several are currently working at UK AISI, so it seems that MATS is doing a good job of recruiting technical governance talent that is happy to work for government wages.

1johnswentworth
No, I meant that the correlation between pay and how-competent-the-typical-participant-seems-to-me is, if anything, negative. Like, the hiring bar for Google interns is lower than any of the technical programs, and PIBBSS seems-to-me to have the most competent participants overall (though I'm not familiar with some of the programs).

Note that governance/policy jobs pay less than ML research/engineering jobs, so I expect GovAI, IAPS, and ERA, which are more governance focused, to have a lower stipend. Also, MATS is deliberately trying to attract top CS PhD students, so our stipend should be higher than theirs, although lower than Google internships to select for value alignment. I suspect that PIBBSS' stipend is an outlier and artificially low due to low funding. Given that PIBBSS has a mixture of ML and policy projects, and IMO is generally pursuing higher variance research than MATS, I suspect their optimal stipend would be lower than MATS', but higher than a Stanford PhD's; perhaps around IAPS' rate.

That's interesting! What evidence do you have of this? What metrics are you using?

6johnswentworth
My main metric is "How smart do these people seem when I talk to them or watch their presentations?". I think they also tend to be older and have more research experience.
Ryan Kidd*60

MATS lowered the stipend from $50/h to $40/h ahead of the Summer 2023 Program to support more scholars. We then lowered it again to $30/h ahead of the Winter 2023-24 Program after surveying alumni and determining that 85% would accept $30/h.

CHAI pays interns $5k/month in person and $3.5k/month remote; I used the in-person figure. https://humancompatible.ai/jobs
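For comparability with the hourly figures above, here is a rough conversion of the CHAI monthly rate; the 40h/week workload is my assumption rather than something CHAI states.

```python
# Rough conversion of CHAI's monthly intern pay to an hourly rate.
# Assumes 40 h/week and 52/12 weeks per month; these assumptions are mine, not CHAI's.
monthly_pay_usd = 5_000                 # in-person figure used in the comparison
hours_per_month = 40 * 52 / 12          # ~173.3 hours
print(f"~${monthly_pay_usd / hours_per_month:.0f}/h")  # ~$29/h
```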

5Leon Lang
Then the MATS stipend today is probably much lower than it used to be? (Which would make sense since IIRC the stipend during MATS 3.0 was settled before the FTX crash, so presumably when the funding situation was different?)

Yes, this doesn't include those costs, and programs differ in this respect.

Ryan Kidd*470

Hourly stipends for AI safety fellowship programs, plus some referents. The average AI safety program stipend is $26/h.

Edit: updated figure to include more programs.

2jacquesthibs
I’d be curious to know if there’s variability in the “hours worked per week” given that people might work more hours during a short program vs a longer-term job (to keep things sustainable).
1metawrong
LASR (https://www.lasrlabs.org/) is giving an £11,000 stipend for a 13-week program; assuming 40h/week, it works out to ~$27/h.
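A quick sanity check of that figure, as a minimal sketch using the 1 GBP = 1.292 USD rate quoted earlier in the thread and the commenter's 40h/week assumption:

```python
# LASR stipend: GBP total -> approximate USD per hour.
stipend_gbp = 11_000
usd_per_gbp = 1.292              # exchange rate quoted earlier in the thread
weeks, hours_per_week = 13, 40   # program length and assumed workload

hourly_usd = stipend_gbp * usd_per_gbp / (weeks * hours_per_week)
print(f"~${hourly_usd:.0f}/h")   # ~$27/h
```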

... WOW that is not an efficient market. 

4Leon Lang
Does “CHAI” refer to being a CHAI intern, a PhD student, or something else? My MATS 3.0 stipend was clearly higher than my CHAI internship stipend.

Interesting, thanks! My guess is this doesn't include benefits like housing and travel costs? Some of these programs pay for those while others don't, which I think is a non-trivial difference (especially for the Bay Area).

1% are "Working/interning on AI capabilities."

Erratum: previously, this statistic was "7%", which erroneously included two alumni who did not complete the program before Winter 2023-24, which is outside the scope of this report. Additionally, two of the three alumni from before Winter 2023-24 who selected "working/interning on AI capabilities" first completed our survey in Sep 2024 and were therefore not included in the data used for plots and statistics. If we include those two alumni, this statistic would be 3/74 = 4.1%, but this would be misrepresentative as several other alumni who completed the program before Winter 2023-24 filled in the survey during or after Sep 2024.

Scholars working on safety teams at scaling labs generally selected "working/interning on AI alignment/control"; some of these also selected "working/interning on AI capabilities", as noted. We are independently researching where each alumnus ended up working, as the data is incomplete from this survey (but usually publicly available), and will share separately.

Great suggestion! We'll publish this in our next alumni impact evaluation, given that we will have longer-term data (with more scholars) soon.

Cheers!

I think you might have implicitly assumed that my main crux here is whether or not take-off will be fast. I actually feel this is less decision-relevant for me than the other cruxes I listed, such as time-to-AGI or "sharp left turns." If take-off is fast, AI alignment/control does seem much harder and I'm honestly not sure what research is most effective; maybe attempts at reflectively stable or provable single-shot alignment seem crucial, or maybe we should just do the same stuff faster? I'm curious: what current AI safety research do you consider ... (read more)

5james.lucassen
Ah, didn't mean to attribute the takeoff speed crux to you, that's my own opinion. I'm not sure what's best in fast takeoff worlds. My message is mainly just that getting weak AGI to solve alignment for you doesn't work in a fast takeoff. "AGI winter" and "overseeing alignment work done by AI" do both strike me as scenarios where agent foundations work is more useful than in the scenario I thought you were picturing. I think #1 still has a problem, but #2 is probably the argument for agent foundations work I currently find most persuasive.

In the moratorium case we suddenly get much more time than we thought we had, which enables longer payback time plans. Seems like we should hold off on working on the longer payback time plans until we know we have that time, not while it still seems likely that the decisive period is soon.

Having more human agent foundations expertise to better oversee agent foundations work done by AI seems good. How good it is depends on a few things. How much of the work that needs to be done is conceptual breakthroughs (tall) vs schlep with existing concepts (wide)? How quickly does our ability to oversee fall off for concepts more advanced than what we've developed so far? These seem to me like the main ones, and like very hard questions to get certainty on - I think that uncertainty makes me hesitant to bet on this value prop, but again, it's the one I think is best.
Ryan Kidd282

I just left a comment on PIBBSS' Manifund grant proposal (which I funded $25k) that people might find interesting.

Main points in favor of this grant

  1. My inside view is that PIBBSS mainly supports “blue sky” or “basic” research, some of which has a low chance of paying off, but might be critical in “worst case” alignment scenarios (e.g., where “alignment MVPs” don’t work, or “sharp left turns” and “intelligence explosions” are more likely than I expect). In contrast, of the technical research MATS supports, about half is basic research (e.g., interpretability
... (read more)

I don't think I'd change it, but my priorities have shifted. Also, many of the projects I suggested now exist, as indicated in my comments!

More contests like ELK with well-operationalized research problems (i.e., clearly explain what builder/breaker steps look like), clear metrics of success, and a well-considered target audience (who is being incentivized to apply and why?) and user journey (where do prize winners go next?).

We've seen a profusion of empirical ML hackathons and contests recently.

A New York-based alignment hub that aims to provide talent search and logistical support for NYU Professor Sam Bowman’s planned AI safety research group.

Based on Bowman's comment, I no longer think this is worthwhile.

Hackathons in which people with strong ML knowledge (not ML novices) write good-faith critiques of AI alignment papers and worldviews

Apart Research runs hackathons, but these are largely empirical in nature (and still valuable).

A talent recruitment and onboarding organization targeting cyber security researchers

Palisade Research now exists and is running the AI Security Forum. However, I don't think Palisade is quite what I envisaged for this hiring pipeline.

SPAR exists! Though I don't think it can offer visas.

1Lucie Philippon
This list is aimed at people visiting the Bay Area and looking for ways to get in contact with the local community. Currently, the Lighthaven website does not list events happening there, so I don't think it's relevant for someone who is not searching for a venue. Possibly a larger index of rationalist resources in the Bay would be useful, including potential venues.
2kave
It's a bit of a non-central example. It's an event space, not a coworking space nor a group of people. On the other hand, it's pretty relevant for people who might want to run events or are attending an event there.
Ryan Kidd*40

I interpret your comment as assuming that new researchers with good ideas produce more impact on their own than in teams working towards a shared goal; this seems false to me. I think that independent research is usually a bad bet in general and that most new AI safety researchers should be working on relatively few impactful research directions, most of which are best pursued within a team due to the nature of the research (though some investment in other directions seems good for the portfolio).

I've addressed this a bit in thread, but here are some more ... (read more)

2Elizabeth
I don't believe that, although I see how my summary could be interpreted that way. I agree with basically all the reasons in your recent comment and most in the original comment. I could add a few reasons of my own why doing independent grant-funded work sucks. But I think it's really important to track how founding projects translates into increased potential safety instead of intermediates, and to push hard against potential tail-wagging-the-dog scenarios. I was trying to figure out why this was important to me, given how many of your points I agree with. I think it's a few things:
  • Alignment work seems to be prone to wagging the dog, and is harder to correct, due to poor feedback loops.
  • The consequences of this can be dire:
    • making it harder to identify and support the best projects;
    • making it harder to identify and stop harmful projects;
    • making it harder to identify when a decent idea isn't panning out, leading to people and money getting stuck in the mediocre project instead of moving on.
  • One of the general concerns about MATS is that it spins up potential capabilities researchers. If the market can't absorb the talent, that suggests maybe MATS should shrink.
  • OTOH, if you told me that for every 10 entrants MATS spins up 1 amazing safety researcher and 9 people who need makework to prevent them going into capabilities, I'd be open to arguments that that was a good trade.

Also note that historically many individuals entering AI safety seem to have been pursuing the "Connector" path, whereas most jobs now (and probably in the future) are "Iterator"-shaped, and larger AI safety projects are also principally bottlenecked by "Amplifiers". The historical focus on recruiting and training Connectors to the detriment of Iterators and Amplifiers has likely contributed to this relative talent shortage. A caveat: Connectors are also critical for founding new research agendas and organizations, though many self-styled Connectors would lik... (read more)
