MIRI updates

  • MIRI Communications Manager Gretta Duleba explains MIRI’s current communications strategy. We hope to clearly communicate to policymakers and the general public why there’s an urgent need to shut down frontier AI development, and make the case for installing an “off-switch”. This will not be easy, and there is a lot of work to be done. Some projects we’re currently exploring include a new website, a book, and an online reference resource.
     
  • Rob Bensinger argues, contra Leopold Aschenbrenner, that the US government should not race to develop artificial superintelligence. “If anyone builds it, everyone dies.” Instead, Rob outlines a proposal for the US to spearhead an international alliance to halt progress toward the technology.
     
  • At the end of June, the Agent Foundations team, including Scott Garrabrant and others, will be parting ways with MIRI to continue their work as independent researchers. The team was originally set up and “sponsored” by Nate Soares and Eliezer Yudkowsky. However, as AI capabilities have progressed rapidly in recent years, Nate and Eliezer have become increasingly pessimistic about this type of work yielding significant results within the relevant timeframes. Consequently, they have shifted their focus to other priorities.

    Senior MIRI leadership explored various alternatives, including reorienting the Agent Foundations team’s focus and transitioning them to an independent group under MIRI fiscal sponsorship with restricted funding, similar to AI Impacts. Ultimately, however, we decided that parting ways made the most sense.

    The Agent Foundations team has produced some stellar work over the years, and made a true attempt to tackle one of the most crucial challenges humanity faces today. We are deeply grateful for their many years of service and collaboration at MIRI, and we wish them the very best in their future endeavors.
     
  • The Technical Governance Team responded to NIST’s request for comments on draft documents related to the AI Risk Management Framework. The team also sent comments in response to the “Framework for Mitigating AI Risks” put forward by U.S. Senators Mitt Romney (R-UT), Jack Reed (D-RI), Jerry Moran (R-KS), and Angus King (I-ME).
     
  • Brittany Ferrero has joined MIRI’s operations team. Previously, she worked on projects such as the Embassy Network and Open Lunar Foundation. We’re excited to have her help execute our mission.
  • AI alignment researcher Paul Christiano was appointed as head of AI safety at the US AI Safety Institute. Last fall, Christiano published some of his thoughts about AI regulation as well as responsible scaling policies.
  • The Superalignment team at OpenAI has been disbanded following the departure of its co-leaders Ilya Sutskever and Jan Leike. The team was launched last year to try to solve the AI alignment problem in four years. However, Leike says that the team struggled to get the compute it needed and that “safety culture and processes have taken a backseat to shiny products” at OpenAI. This seems extremely concerning from the perspective of evaluating OpenAI’s seriousness when it comes to safety and robustness work, particularly given that a similar OpenAI exodus occurred in 2020 in the wake of concerns about OpenAI’s commitment to solving the alignment problem.
  • Vox’s Kelsey Piper reports that employees who left OpenAI were subject to an extremely restrictive NDA indefinitely preventing them from criticizing the company (or admitting that they were under an NDA), under threat of losing their vested equity in the company. OpenAI executives have since contacted former employees to say that they will not enforce the NDAs. Rob Bensinger comments on these developments here, strongly criticizing OpenAI for this policy.
  • South Korea and the UK co-hosted the AI Seoul Summit, a virtual mini-summit following up on the first AI Safety Summit (which took place in the UK last November). At the Seoul summit, 16 AI companies committed to creating and publishing safety frameworks, including “thresholds at which severe risks posed by a model or system, unless adequately mitigated, would be deemed intolerable.”
  • California State Senator Scott Wiener’s SB 1047 passed in the California State Senate and is now being considered in the California State Assembly. The bill requires pre-deployment testing and post-deployment monitoring for models trained using over 10^26 FLOP of compute and over $100 million in training costs.

You can subscribe to the MIRI Newsletter here.

20 comments
[-]plex

I initially thought MIRI dropping the AF team was a really bad move, and wrote (but didn't publish) an open letter aiming to discourage this (tl;dr thesis: this research might be critical, and we want this kind of research to be ready to take advantage of a possible AI-assisted research window).

After talking with the team more, I concluded that actually having an institutional home for this kind of work which is focused on AF would be healthier, as they'd be able to fundraise independently, self-manage, set their own agendas entirely freely, have budget sovereignty, etc, rather than being crammed into an org which was not hopeful about their work.

I've been talking in the background and trying to set them up with fiscal sponsorship and advising on forming an org for a few weeks now. It looks like this will probably work for most of the individuals, but the team has not cohered around a leadership structure or agenda yet. I'm hopeful that this will come together, as I think this kind of theoretical research is one of the classes of progress most likely to be needed to navigate the transition to superintelligence. Most likely an umbrella org which hosts individual researchers is the short-term solution, hopefully coalescing into a more organized team at some point.

Broadly I agree.

I'm not sure about:

but the team has not cohered around a leadership structure or agenda yet. I'm hopeful that this will come together

I don't expect the most effective strategy at present to be [(try hard to) cohere around an agenda]. An umbrella org hosting individual researchers seems the right starting point. Beyond that, I'd expect [structures and support to facilitate collaboration and self-organization] to be ideal.
If things naturally coalesce that's probably a good sign - but I'd prefer that to be a downstream consequence of exploration, not something to aim for in itself.

To be clear, this is all on the research side - on the operations side organization is clearly good.

[-]plex

Yeah, I mostly agree with the claim that individuals pursuing their own agendas is likely better than trying to push for people to work more closely. Finding directions which people feel like converging on could be great, but not at the cost of being able to pursue what seems most promising in a self-directed way.

I think I meant I was hopeful about the whole thing coming together, rather than specifically the coherent agenda part.

Strongly agree that there needs to be an institutional home. My biggest problem is that there is still no such new home!

[-]plex

AFFINE (Agent Foundations FIeld NEtwork) was set up and applied for SFF funding on behalf of several ex-MIRI members, but only got relatively small amounts of funding. We're thinking about the best currently possible model, but it's still looking like individuals applying for funding separately. I would be keen for a more structured org to pop up and fill the place, or for someone to join AFFINE and figure out how to make it a better home for AF.

It would be really interesting to read a postmortem on Agent Foundations work in MIRI.

Also a strategy postmortem on the decision to pivot to technical research in 2013: https://intelligence.org/2013/04/13/miris-strategy-for-2013/

I do wonder about the counterfactual where MIRI never sold the Singularity Summit, and it was blowing up as an annual event, same way Less Wrong blew up as a place to discuss AI. Seems like owning the Summit could create a lot of leverage for advocacy.

One thing I find fascinating is the number of times MIRI has reinvented themselves as an organization over the decades. People often forget that they were originally founded to bring about the Singularity with no concern for friendliness. (I suspect their advocacy would be more credible if they emphasized that.)

Really wishing the new Agent Foundations team the best. (MIRI too, but its position seems more secure.)

Naively, I feel pretty good about this potential split. If MIRI is doing much more advocacy work, that work just seems very different from Agent Foundations research.

This could allow MIRI to be more controversial and risk-taking without tying things to the Agent Foundations research, and that research could hypothetically more easily get funding from groups that otherwise disagree with MIRI's political views.

I hope that team finds good operations support or a different nonprofit sponsor of some kind. 

[-]niplav

I think MIRI should update its team page if there are drastic changes to its team.

Senior MIRI leadership explored various alternatives, including reorienting the Agent Foundations team’s focus and transitioning them to an independent group under MIRI fiscal sponsorship with restricted funding, similar to AI Impacts. Ultimately, however, we decided that parting ways made the most sense.

I'm surprised! If MIRI is mostly a Pause advocacy org now, I can see why agent foundations research doesn't fit the new focus and should be restructured. But the benefit of a Pause is that you use the extra time to do something in particular. Why wouldn't you want to fiscally sponsor research on problems that you think need to be solved for the future of Earth-originating intelligent life to go well? (Even if the happy-path plan is Pause and superbabies, presumably you want to hand the superbabies as much relevant prior work as possible.) Do we know how Garrabrant, Demski, et al. are going to eat??

Relatedly, is it time for another name change? Going from "Singularity Institute for Artificial Intelligence" to "Machine Intelligence Research Institute" must have seemed safe in 2013. (You weren't unambiguously for artificial intelligence anymore, but you were definitely researching it.) But if the new–new plan is to call for an indefinite global ban on research into machine intelligence, then the new name doesn't seem appropriate, either?

But the benefit of a Pause is that you use the extra time to do something in particular. Why wouldn't you want to fiscally sponsor research on problems that you think need to be solved for the future of Earth-originating intelligent life to go well? 

MIRI still sponsors some alignment research, and I expect we'll sponsor more alignment research directions in the future. I'd say MIRI leadership didn't have enough aggregate hope in Agent Foundations in particular to want to keep supporting it ourselves (though I consider its existence net-positive).

My model of MIRI is that our main focus these days is "find ways to make it likelier that a halt occurs" and "improve the world's general understanding of the situation in case this helps someone come up with a better idea", but that we're also pretty open to taking on projects in all four of these quadrants, if we find something that's promising and that seems like a good fit at MIRI (or something promising that seems unlikely to occur if it's not housed at MIRI):

                          AI alignment work    Non-alignment work
High-EV absent a pause
High-EV given a pause

In terms of “improve the world’s general understanding of the situation”, I encourage MIRI to engage more with informed skeptics. Our best hope is if there is a flaw in MIRI’s argument for doom somewhere. I would guess that e.g. Matthew Barnett has spent something like 100x as much effort engaging with MIRI as MIRI has spent engaging with him, at least publicly. He seems unusually persistent -- I suspect many people are giving up, or gave up long ago. I certainly feel quite cynical about whether I should even bother writing a comment like this one.

Offering a quick two cents: I think MIRI‘s priority should be to engage with “curious and important newcomers” (e.g., policymakers and national security people who do not yet have strong cached views on AI/AIS). If there’s extra capacity and interest, I think engaging with informed skeptics is also useful (EG big fan of the MIRI dialogues), but on the margin I don’t suspect it will be as useful as the discussions with “curious and important newcomers.”

So what's the path by which our "general understanding of the situation" is supposed to improve? There's little point in delaying timelines by a year, if no useful alignment research is done in that year. The overall goal should be to maximize the product of timeline delay and rate of alignment insights.

Also, I think you may be underestimating the ability of newcomers to notice that MIRI tends to ignore its strongest critics. See also previously linked comment.

[-]Akash

I think if MIRI engages with “curious newcomers” those newcomers will have their own questions/confusions/objections and engaging with those will improve general understanding.

Based on my experience so far, I don’t expect their questions/confusions/objections to overlap a lot with the questions/confusions/objections that tech-oriented active LW users have.

I also think it’s not accurate to say that MIRI tends to ignore its strongest critics; there’s perhaps more public writing/dialogues between MIRI and its critics than for pretty much any other organization in the space.

My claim is not that MIRI should ignore its critics but rather that it should focus on replying to criticisms or confusions from “curious and important newcomers”. My fear is that MIRI might engage too much with criticisms from LW users and other ingroup members and not focus enough on engaging with policy folks, whose cruxes and opinions often differ substantially from those of EG the median LW commentator.

I think if MIRI engages with “curious newcomers” those newcomers will have their own questions/confusions/objections and engaging with those will improve general understanding.

You think policymakers will ask the sort of questions that lead to a solution for alignment?

In my mind, the most plausible way "improve general understanding" can advance the research frontier for alignment is if you're improving the general understanding of people fairly near that frontier.

Based on my experience so far, I don’t expect their questions/confusions/objections to overlap a lot with the questions/confusions/objections that tech-oriented active LW users have.

I expect MIRI is not the only tech-oriented group policymakers are talking to. So in the long run, it's valuable for MIRI to either (a) convince other tech-oriented groups of its views, or (b) provide arguments that will stand up against those from other tech-oriented groups.

there’s perhaps more public writing/dialogues between MIRI and its critics than for pretty much any other organization in the space.

I believe they are also the only organization in the space that says its main focus is on communications. I'm puzzled that multiple full-time paid staff are getting out-argued by folks like Alex Turner who are posting for free in their spare time.

If MIRI wants us to make use of any added timeline in a way that's useful, or make arguments that outsiders will consider robust, I think they should consider a technical communications strategy in addition to a general-public communications strategy. The wave-rock model could help for technical communications as well. Right now their wave game for technical communications seems somewhat nonexistent. E.g. compare Eliezer's posting frequency on LW vs X.

You depict a tradeoff between focusing on "ingroup members" vs "policy folks", but I suspect there are other factors which are causing their overall output to be low, given their budget and staffing levels. E.g. perhaps it's an excessive concern with org reputation that leads them to be overly guarded in their public statements. In which case they could hire an intern to argue online for 40 hours a week, and if the intern says something dumb, MIRI can say "they were just an intern -- and now we fired them." (Just spitballing here.)

It's puzzling to me that MIRI originally created LW for the purpose of improving humanity's thinking about AI, and now Rob says that's their "main focus", yet they don't seem to use LW that much? Nate hasn't said anything about alignment here in the past ~6 months. I don't exactly see them arguing with the ingroup too much.

Don’t have time to respond in detail but a few quick clarifications/responses:

— I expect policymakers to have the most relevant/important questions about policy and to be the target audience most relevant for enacting policies. Not solving technical alignment. (Though I do suspect that by MIRI’s lights, getting policymakers to understand alignment issues would be more likely to result in alignment progress than having more conversations with people in the technical alignment space.)

— There are lots of groups focused on comms/governance. MIRI is unique only insofar as it started off as a “technical research org” and has recently pivoted more toward comms/governance.

— I do agree that MIRI has had relatively low output for a group of its size/resources/intellectual caliber. I would love to see more output from MIRI in general. Insofar as it is constrained, I think they should be prioritizing “curious policy newcomers” over people like Matthew and Alex.

— Minor, but I don’t think MIRI is getting “outargued” by those individuals, and I think that frame is a bit too zero-sum.

— Controlling for overall level of output, I suspect I’m more excited than you about MIRI spending less time on LW and more time on comms/policy work with policy communities (EG Malo contributing to the Schumer insight forums, MIRI responding to government RFCs).

— My guess is we both agree that MIRI could be doing more on both fronts and just generally having higher output. My impression is they are working on this and have been focusing on hiring; I think if their output stayed relatively the same 3-6 months from now I would be fairly disappointed.

Don’t have time to respond in detail but a few quick clarifications/responses:

Sure, don't feel obligated to respond, and I invite the people disagree-voting my comments to hop in as well.

— There are lots of groups focused on comms/governance. MIRI is unique only insofar as it started off as a “technical research org” and has recently pivoted more toward comms/governance.

That's fair, when you said "pretty much any other organization in the space" I was thinking of technical orgs.

MIRI's uniqueness does seem to suggest it has a comparative advantage for technical comms. Are there any organizations focused on that?

by MIRI’s lights, getting policymakers to understand alignment issues would be more likely to result in alignment progress than having more conversations with people in the technical alignment space

By 'alignment progress' do you mean an increased rate of insights per year? Due to increased alignment funding?

Anyway, I don't think you're going to get "shut it all down" without either a warning shot or a congressional hearing.

If you just extrapolate trends, it wouldn't particularly surprise me to see Alex Turner at a congressional hearing arguing against "shut it all down". Big AI has an incentive to find the best witnesses it can, and Alex Turner seems to be getting steadily more annoyed. (As am I, fwiw.)

Again, extrapolating trends, I expect MIRI's critics like Nora Belrose will increasingly shift from the "inside game" of trying to engage w/ MIRI directly to a more "outside game" strategy of explaining to outsiders why they don't think MIRI is credible. After the US "shuts it down", countries like the UAE (accused of sponsoring genocide in Sudan) will likely try to quietly scoop up US AI talent. If MIRI is considered discredited in the technical community, I expect many AI researchers to accept that offer instead of retooling their career. Remember, a key mistake the board made in the OpenAI drama was underestimating the amount of leverage that individual AI researchers have, and not trying to gain mindshare with them.

Pause maximalism (by which I mean focusing 100% on getting a pause and not trying to speed alignment progress) only makes sense to me if we're getting a ~complete ~indefinite pause. I'm not seeing a clear story for how that actually happens, absent a much broader doomer consensus. And if you're not able to persuade your friends, you shouldn't expect to persuade your enemies.

Right now I think MIRI only gets their stated objective in a world where we get a warning shot which creates a broader doom consensus. In that world it's not clear advocacy makes a difference on the margin.

I realize if you had a good answer here the org would be doing different stuff, but do you (or other MIRI folk) have any rough sense of the sort of alignment work that'd plausibly be in the left two quadrants there?

(also, when you say "high EV", are you setting the "high" bar at a level that means "good enough that anyone should be prioritizing?" or "MIRI is setting a particularly high bar for alignment research right now because it doesn't seem like the most important thing to be focusing on?")

superbabies

I'm concerned there may be an alignment problem for superbabies.

Humans often have contempt for people and animals with less intelligence than them. "You're dumb" is practically an all-purpose putdown. We seem to assign moral value to various species on the basis of intelligence rather than their capacity for joy/suffering. We put chimpanzees in zoos and chickens in factory farms.

Additionally, jealousy/"xenophobia" towards superbabies from vanilla humans could lead them to become misanthropes. Everyone knows genetic enhancement is a radioactive topic. At what age will a child learn they were modified? It could easily be just as big of a shock as learning that you were adopted or conceived by a donor. Then stack more baggage on top: Will they be bullied for it? Will they experience discrimination?

I feel like we're charging headlong into these sociopolitical implications, hollering "more intelligence is good!", the same way we charged headlong into the sociopolitical implications of the internet/social media in the 1990s and 2000s while hollering "more democracy is good!" There's a similar lack of effort to forecast the actual implications of the technology.

I hope researchers are seeking genes for altruism and psychological resilience in addition to genes for intelligence.