All of Gretta Duleba's Comments + Replies

I am not sure which way you intended that sentence. Did you mean:

A. We want to shut down all AGI research everywhere by everyone, or
B. We want to shut down AGI research and we also want to shut down governments and militaries and spies

I assume you meant the first thing, but want to be sure!

We support A. Eliezer has been very clear about that in his tweets. In broader MIRI communications, it depends on how many words we have to express our ideas, but when we have room we spell out that idea.

I agree that current / proposed regulation is mostly not aimed at A.

8wassname
Definitely A, and while it's clear MIRI means well, I'm suggesting a focus on preventing military and spy arms races in AI, because that seems like a likely failure mode which no one is focusing on. It seems like a place where a bunch of blunt people can expand the Overton window to everyone's advantage. MIRI has used nuclear non-proliferation as an example (getting lots of pushback). But non-proliferation did not stop new countries from getting the bomb, and it certainly did not stop existing countries from scaling up their nuclear arsenals. Global de-escalation after the end of the Cold War is what caused that. For example, look at this graph: it doesn't go down after the 1968 treaty, it goes down after the Cold War (>1985). We would not want to see a similar situation with AI, where existing countries race to scale up their efforts and research. This is in no way a criticism; MIRI is probably already doing the most here, and facing criticism for it. I'm just suggesting the idea.

As I mentioned in the post we are looking to hire or partner with a new spokesperson if we can find someone suitable. We don't think it will be easy to find someone great; it's a pretty hard job.

Gosh, I haven't really conducted a survey here or thought deeply about it, so this answer will be very off the cuff and not very current as of 2024. Some of the examples that come to mind are the major media empires of, e.g., Brene Brown or Gretchen Rubin.

Rob Bensinger has tweeted about it some.

Overall we continue to be pretty weak on the "wave" side, having people comment publicly on current events / take part in discourse, and the people we hired recently are less interested in that and more interested in producing the durable content. We'll need to work on it.

7Rob Bensinger
The stuff I've been tweeting doesn't constitute an official MIRI statement — e.g., I don't usually run these tweets by other MIRI folks, and I'm not assuming everyone at MIRI agrees with me or would phrase things the same way. That said, some recent comments and questions from me and Eliezer:
* May 17: Early thoughts on the news about OpenAI's crazy NDAs.
* May 24: Eliezer flags that GPT-4o can now pass one of Eliezer's personal ways of testing whether models are still bad at math.
* May 29: My initial reaction to hearing Helen's comments on the TED AI podcast. Includes some follow-on discussion of the ChatGPT example, etc.
* May 30: A conversation between me and Emmett Shear about the version of events he'd tweeted in November. (Plus a comment from Eliezer.)
* May 30: Eliezer signal-boosting a correction from Paul Graham.
* June 4: Eliezer objects to Aschenbrenner's characterization of his timelines argument as open-and-shut "believing in straight lines on a graph".

Oh, yeah, to be clear I completely made up the "rock / wave" metaphor. But the general model itself is pretty common I think; I'm not claiming to be inventing totally new ways of spreading a message, quite the opposite.

4Gunnar_Zarncke
I like it. It's quite evocative. 

Your curiosity and questions are valid but I'd prefer not to give you more than I already have, sorry.

8[anonymous]
Valid!

What are the artifacts you're most excited about, and what's your rough prediction about when they will be ready?

 

Due to bugs in human psychology, we are more likely to succeed in our big projects if we don't yet state publicly what we're going to do by when. Sorry. I did provide some hints in the main post (website, book, online reference).

how do you plan to assess the success/failure of your projects? Are there any concrete metrics you're hoping to achieve? What does a "really good outcome" for MIRI's comms team look like by the end of the year,

The ... (read more)

8[anonymous]
Thanks! Despite the lack of SMART goals, I still feel like this reply gave me a better sense of what your priorities are & how you'll be assessing success/failure.

One failure mode– which I'm sure is already on your radar– is something like: "MIRI ends up producing lots of high-quality stuff but no one really pays attention. Policymakers and national security people are very busy and often only read things that (a) directly relate to their work or (b) are sent to them by someone who they respect."

Another is something like: "MIRI ends up focusing too much on making arguments/points that are convincing to general audiences but fails to understand the cruxes/views of the People Who Matter." (A strawman version of this is something like "MIRI ends up spending a lot of time in the Bay and there's lots of pressure to engage a bunch with the cruxes/views of rationalists, libertarians, e/accs, and AGI company employees. Meanwhile, the kinds of conversations happening among natsec folks & policymakers look very different, and MIRI's materials end up being less relevant/useful to this target audience.")

I'm extremely confident that these are already on your radar, but I figure it might be worth noting that these are two of the failure modes I'm most worried about. (I guess besides the general boring failure mode along the lines of "hiring is hard and doing anything is hard and maybe things just stay slow and when someone asks what good materials you guys have produced the answer is still 'we're working on it'.")

(Final note: A lot of my questions and thoughts have been critical, but I should note that I appreciate what you're doing & I'm looking forward to following MIRI's work in the space! :D)

In this reply I am speaking just about the comms team and not about other parts of MIRI or other organizations.

We want to produce materials that are suitable and persuasive for the audiences I named. (And by persuasive, I don't mean anything manipulative or dirty; I just mean using valid arguments that address the points that are most interesting / concerning to our audience in a compelling fashion.)

So there are two parts here: creating high quality materials, and delivering them to that audience.

First, creating high quality materials. Some of this is down... (read more)

4[anonymous]
Thank you! I still find myself most curious about the "how will MIRI make sure it understands its audience" and "how will MIRI make sure its materials are read by policymakers + natsec people" parts of the puzzle. Feel free to ignore this if we're getting too in the weeds, but I wonder if you can share more details about either of these parts.

We think that most people who see political speech know it to be political speech and automatically discount it. We hope that speaking in a different way will cut through these filters.

7Erich_Grunewald
That's one reason why an outspoken method could be better. But it seems like you'd want some weighing of the pros and cons here? (Possible drawbacks of such messaging could include it being more likely to be ignored, or cause a backlash, or cause the issue to become polarized, etc.) Like, presumably the experts who recommend being careful what you say also know that some people discount obviously political speech, but still recommend/practice being careful what you say. If so, that would suggest this one reason is not on its own enough to override the experts' opinion and practice.

At the start of 2024, the comms team was only me and Rob. We hired Harlan in Q1, and Joe and Mitch are only full time as of this week. Hiring was extremely labor-intensive and time-consuming. As such, we haven't kicked into gear yet.

The main publicly-visible artifact we've produced so far is the MIRI newsletter; that comes out monthly.

Most of the rest of the object-level work is not public yet; the artifacts we're producing are very big and we want to get them right.

5[anonymous]
To the extent that this can be shared– What are the artifacts you're most excited about, and what's your rough prediction about when they will be ready? Moreover, how do you plan to assess the success/failure of your projects? Are there any concrete metrics you're hoping to achieve? What does a "really good outcome" for MIRI's comms team look like by the end of the year, and what does a "we have failed and need to substantially rethink our approach, speed, or personnel" outcome look like? (I ask partially because one of my main uncertainties right now is how well MIRI will get its materials in front of the policymakers and national security officials you're trying to influence. In the absence of concrete goals/benchmarks/timelines, I could imagine a world where MIRI moves at a relatively slow pace, produces high-quality materials with truthful arguments, but this content isn't getting to the target audience, and the work isn't being informed by the concerns/views of the target audience.)

All of your questions fall under Lisa's team and I will defer to her.

3[anonymous]
Got it– thank you! Am I right in thinking that your team intends to influence policymakers and national security officials, though? If so, I'd be curious to learn more about how you plan to get your materials in front of them or ensure that your materials address their core points of concern/doubt. Put a bit differently– I feel like it would be important for your team to address these questions insofar as your team has the following goals:

A reasonable point, thank you. We said it pretty clearly in the MIRI strategy post in January, and I linked to that post here, but perhaps I should have reiterated it.

For clarity: we mostly just expect to die. But while we can see viable paths forward at all, we'll keep trying not to.

That phrase sounds like the Terminator movies to me; it sounds like plucky humans could still band together to overthrow their robot overlords. I want to convey a total loss of control.

In documents where we have more room to unpack concepts I can imagine getting into some of the more exotic scenarios like aliens buying brain scans, but mostly I don't expect our audiences to find that scenario reassuring in any way, and going into any detail about it doesn't feel like a useful way to spend weirdness points.

Some of the other things you suggest, like future s... (read more)

Some of the other things you suggest, like future systems keeping humans physically alive, do not seem plausible to me.

I agree with Gretta here, and I think this is a crux. If MIRI folks thought it were likely that AI will leave a few humans biologically alive (as opposed to information-theoretically revivable), I don't think we'd be comfortable saying "AI is going to kill everyone". (I encourage other MIRI folks to chime in if they disagree with me about the counterfactual.)

I also personally have maybe half my probability mass on "the AI just doesn't stor... (read more)

going into any detail about it doesn't feel like a useful way to spend weirdness points.

That may be a reasonable consequentialist decision given your goals, but it's in tension with your claim in the post to be disregarding the advice of people telling you to "hoard status and credibility points, and [not] spend any on being weird."

Whatever they're trying to do, there's almost certainly a better way to do it than by keeping Matrix-like human body farms running.

You've completely ignored the arguments from Paul Christiano that Ryan linked to at the to... (read more)

2davekasten
I would like to +1 the "I don't expect our audiences to find that scenario reassuring in any way" -- I would also add that the average policymaker I've ever met wouldn't find leaving out the exotic scenarios to be in any way inaccurate or deceitful, unless you were way in the weeds for a multi-hour convo and/or they asked you in detail for "well, are there any weird edge cases where we make it through".
4ryan_greenblatt
Yeah, seems like a reasonable concern. FWIW, I also do think that it is reasonably likely that we'll see conflict between human factions and AI factions (likely with human allies) in which the human factions could very plausibly win. So, personally, I don't think that "immediate total loss of control" is what people should typically be imagining.
4ryan_greenblatt
Insofar as AIs are doing things because they are what existing humans want (within some tiny cost budget), I expect that you should imagine that what actually happens is what humans want (rather than e.g. what the AI thinks they "should want"), insofar as what humans want is cheap. See also here, which makes a similar argument in response to a similar point.

So, if humans don't end up physically alive but do end up as uploads/body farms/etc., one of a few things must be true:
* Humans didn't actually want to be physically alive and instead wanted to be uploads. In this case, it is very misleading to say "the AI will kill everyone (and sure there might be uploads, but you don't want to be an upload right?)" because we're conditioning on people deciding to become uploads!
* It was too expensive to keep people physically alive rather than uploads. I think this is possible but somewhat implausible: the main reasons for cost here apply to uploads as much as to keeping humans physically alive. In particular, death due to conflict or mass slaughter in cases where conflict was the AI's best option to increase the probability of long-run control.

I don't speak for Nate or Eliezer in this reply; where I speak about Eliezer I am of course describing my model of him, which may be flawed.

Three somewhat disjoint answers:

  1. From my perspective, your point about algorithmic improvement only underlines the importance of having powerful people actually get what the problem is and have accurate working models. If this becomes true, then the specific policy measures have some chance of adapting to current conditions, or of being written in an adaptive manner in the first place.
  2. Eliezer said a few years ago that "
... (read more)
2aogara
Has MIRI considered supporting work on human cognitive enhancement? e.g. Foresight's work on WBE. 
8Seth Herd
It seems like including this in the strategy statement is crucial to communicating that strategy clearly (at least to people who understand enough of the background). A long-shot strategy looks very different from one where you expect to achieve at least useful parts of your goals.

Indeed! However, I'd been having stress dreams for months about getting drowned in the churning tidal wave of the constant news cycle, and I needed something that fit thematically with 'wave.' :-)

Writers at MIRI will primarily be focusing on explaining why it's a terrible idea to build something smarter than humans that does not want what we want. They will also answer the subsequent questions that we get over and over about that. 

We want a great deal of overlap with Pacific time hours, yes. A nine-hour time zone difference would probably be pretty rough unless you're able to shift your own schedule by quite a bit.

1Heighn
Alright, thanks for your answer!

Of course. But if it's you, I can't guess which application was yours from your LW username. Feel free to DM me details.

No explicit deadline, I currently expect that we'll keep the position open until it is filled. That said, I would really like to make a hire and will be fairly aggressively pursuing good applications.

I don't think there is a material difference between applying today or later this week, but I suspect/hope there could be a difference between applying this week and next week.

"Wearing your [feelings] on your sleeve" is an English idiom meaning openly showing your emotions.

It is quite distinct from the idea of belief as attire from Eliezer's sequence post, in which he was suggesting that some people "wear" their (improper) beliefs to signal what team they are on.

Nate and Eliezer openly show their despair about humanity's odds in the face of AI x-risk, not as a way of signaling what team they're on, but because despair reflects their true beliefs.


2. Why do you see communications as being as decoupled (rather, either that it is inherently or that it should be) from research as you currently do? 

The things we need to communicate about right now are nowhere near the research frontier.

One common question we get from reporters, for example, is "why can't we just unplug a dangerous AI?" The answer to this is not particularly deep and does not require a researcher or even a research background to engage on.

We've developed a list of the couple-dozen most common questions we are asked by the press and ... (read more)

6Malo
Quickly chiming in to add that I can imagine there might be some research we could do that could be more instrumentally useful to comms/policy objectives. Unclear whether it makes sense for us to do anything like that, but it's something I'm tracking.

Re: the wording about airstrikes in TIME: yeah, we did not anticipate how that was going to be received and it's likely we would have wordsmithed it a bit more to make the meaning more clear had we realized. I'm comfortable calling that a mistake. (I was not yet employed at MIRI at the time but I was involved in editing the draft of the op-ed so it's at least as much on me as anybody else who was involved.)

Re: policy division: we are limited by our 501(c)3 status as to how much of our budget we can spend on policy work, and here 'budget' includes the time ... (read more)

8[anonymous]
Thanks for this; seems reasonable to me. One quick note is that my impression is that it's fairly easy to set up a 501(c)4. So even if [the formal institution known as MIRI] has limits, I think MIRI would be able to start a "sister org" that de facto serves as the policy arm. (I believe this is accepted practice & lots of orgs have sister policy orgs.) (This doesn't matter right now, insofar as you don't think it would be an efficient allocation of resources to spin up a whole policy division. Just pointing it out in case your belief changes and the 501(c)3 thing feels like the limiting factor.)

I think that's pretty close, though when I hear the word "activist" I tend to think of people marching in protests and waving signs, and that is not the only way to contribute to the effort to slow AI development. I think more broadly about communications and policy efforts, of which activism is a subset.

It's also probably a mistake to put capabilities researchers and alignment researchers in two entirely separate buckets. Their motivations may distinguish them, but my understanding is that the actual work they do unfortunately overlaps quite a bit.

That's pretty surprising to me; for a while I assumed that the scenario where 10% of the population knew about superintelligence as the final engineering problem was a nightmare scenario e.g. because it would cause acceleration.

 

"Don't talk too much about how powerful AI could get because it will just make other people get excited and go faster" was a prevailing view at MIRI for a long time, I'm told. (That attitude pre-dates me.) At this point many folks at MIRI believe that the calculus has changed, that AI development has captured so much energy and attention that it is too late for keeping silent to be helpful, and now it's better to speak openly about the risks.

6trevor
Awesome, that's great to hear, and these recommendations/guidance are helpful and more than I expected, thank you. I can't wait to see what you've cooked up for the upcoming posts; MIRI's outputs (including decisions) are generally pretty impressive and outstanding, and I have a feeling that I'll appreciate and benefit from them even more now that people are starting to pull out all the stops.
  • What do you see as the most important messages to spread to (a) the public and (b) policymakers?

 

That's a great question that I'd prefer to address more comprehensively in a separate post, and I should admit up front that the post might not be imminent as we are currently hard at work on getting the messaging right and it's not a super quick process.

  • What mistakes do you think MIRI has made in the last 6 months?

Huh, I do not have a list prepared and I am not entirely sure where to draw the line around what's interesting to discuss and what's not; furth... (read more)

8[anonymous]
Thanks for this response! I didn't have any candidate mistakes in mind. After thinking about it for 2 minutes, here are some possible candidates (though it's not clear to me that any of them were actually mistakes):
* Eliezer's TIME article explicitly mentions that states should "be willing to destroy a rogue datacenter by airstrike." On one hand, it seems important to clearly/directly communicate the kind of actions that Eliezer believes the world will need to be willing to take in order to keep us safe. On the other hand, I think this particular phrasing might have made the point easier to critique/meme. On net, it's unclear to me if this was a mistake, but it seems plausible that there could've been a way to rephrase this particular sentence that maintains clarity while reducing the potential for misinterpretations or deliberate attacks.
* I sometimes wonder if MIRI should have a policy division that is explicitly trying to influence policymakers. It seems like we have a particularly unique window over the next (few? several?) months, where Congress is unusually motivated to educate themselves about AI. Until recently, my understanding is that virtually no one was advocating from a MIRI perspective (e.g., alignment is difficult, we can't just rely on dangerous capability evaluations, we need a global moratorium and the infrastructure required for it). I think the Center for AI Policy is now trying to fill the gap, and it seems plausible to me that MIRI should either start their own policy team (or assume more of a leadership role at the Center for AI Policy -- e.g., being directly involved in hiring, having people in-person in DC, being involved in strategy/tactics, etc. This is of course conditional on the current CAIP CEO being open to this, and I suspect he would be).
* Perhaps MIRI should be critiquing some of the existing policy proposals. I think MIRI has played an important role in "breaking" alignment proposals (i.e., raising awareness about the limit
7Malo
To expand on this a bit, I and a couple others at MIRI have been spending some time syncing up and strategizing with other people and orgs who are more directly focused on policy work themselves. We've also spent some time chatting with folks in government that we already know and have good relationships with. I expect we'll continue to do a decent amount of this going forward.  It's much less clear to me that it makes sense for us to end up directly engaging in policy discussions with policymakers as an important focus of ours (compared to focusing on broad public comms), given that this is pretty far outside of our area of expertise. It's definitely something I'm interested in exploring though, and chatting about with folks who have expertise in the space.
Malo131
  • Does MIRI need any help? (Or perhaps more precisely "Does MIRI need any help from the right kind of person with the right kind of skills, and if so, what would that person or those skills look like?")

Yes, I expect to be hiring in the comms department relatively soon but have not actually posted any job listings yet. I will post to LessWrong about it when I do.

That said, I'd be excited for folks who think they might have useful background or skills to contribute and would be excited to work at MIRI, to reach out and let us know they exist, or pitch us on why they might be a good addition to the team.

I do not (yet) know that Nye resource so I don't know if I endorse it. I do endorse the more general idea that many folks who understand the basics of AI x-risk could start talking more to their not-yet-clued-in friends and family about it.

I think in the past, many of us didn't bring this up with people outside the bubble for a variety of reasons: we expected to be dismissed or misunderstood, it just seemed fruitless, or we didn't want to freak them out.

I think it's time to freak them out.

And what we've learned from the last seven months of media appearanc... (read more)

3trevor
That makes sense, I would never ask for such an endorsement; I don't think it would help MIRI directly, but Soft Power is one of the most influential concepts among modern international relations experts and China experts, and it's critical for understanding the environment that AI safety public communication takes place in (e.g. if the world is already oversaturated with highly professionalized information warfare then that has big implications, e.g. MIRI could be fed false data to mislead them into believing they are succeeding at describing the problem when in reality the needle isn't moving).

That's pretty surprising to me; for a while I assumed that the scenario where 10% of the population knew about superintelligence as the final engineering problem was a nightmare scenario e.g. because it would cause acceleration. I even abstained from helping Darren McKee with his book attempting to describe the AI safety problem to the public, even though I wanted to, because I was worried about contributing to making people more capable of spreading AI safety ideas.

If MIRI has changed their calculus on this, then of course I will defer to that since I have a sense of how far outside my area of expertise it is. But it's still a really big shift for me.

I'm not sure what to make of this; AI advancement is pretty valuable for national security (e.g. allowing hypersonic nuclear missiles to continue flying under the radar if military GPS systems are destroyed or jammed/spoofed) and the balance of power between the US and China in other ways, similar to nuclear weapons; and if public opinion turned against nuclear weapons during the 1940s and 1950s, back when democracy was stronger, I'm not sure if it would have had much of an effect (perhaps it would have pushed it underground). I'm still deferring to MIRI on this pivot and will help other people take the advice in this comment, and I found this comment really helpful and I'm glad that big pivots are being committed to, but

Thanks, much appreciated! Your work is on my (long) list to check out. Is there a specific video you're especially proud of that would be a great starting point?

Feel free to send me a discord server invitation at gretta@intelligence.org.

5Writer
If you just had to pick one, go for The Goddess of Everything Else.

Here's a short list of my favorites. In terms of animation:
- The Goddess of Everything Else
- The Hidden Complexity of Wishes
- The Power of Intelligence

In terms of explainer:
- Humanity was born way ahead of its time. The reason is grabby aliens. [written by me]
- Everything might change forever this century (or we’ll go extinct). [mostly written by Matthew Barnett]

Also, I've sent the Discord invite.

I think your thesis is not super crisp, because this was an off the cuff post! And your examples are accordingly not super clear either, same reason. But there's definitely still a nugget of an idea in here.

It's something like: with the decentralization of both taking a position in the first place and commenting on other people's positions, the lizardmen have more access to the people taking positions than they did in a world without social media. And lizardmen can, and do, do serious damage to individuals in a seemingly random fashion.

Yup, seems legit. Our sp... (read more)

Do what you need to do to take care of yourself! It sounds like you don't choose to open up to your wife about your distress, for fear of causing her distress. I follow your logic there, but I also hope you do have someone you can talk to about it whom you don't fear harming, because they already know and are perhaps further along on the grief / acceptance path than you are.

Good luck. I wish you well.

Yes, that's correct, I was referring to the fable. I should probably have included a broader hint about that.