TL;DR: Even good-hearted EAs lack mechanisms to avoid putting out information that can mislead people. Holly Elmore organized a protest whose messaging centered on OpenAI changing its documents to "work with the Pentagon," while OpenAI only collaborates with DARPA on open-source cybersecurity tools and is in talks with the Pentagon about veteran suicide prevention. Many protest participants weren't aware of this; the protest announcement and the press release did not mention it. People were misled into thinking OpenAI is working on military applications of AI. OpenAI still prohibits the use of their services to "harm people, develop weapons, for communications surveillance, or to injure others or destroy property". If OpenAI wanted a contract with the Pentagon to work on something bad, they wouldn't have needed to change the usage policies of their publicly available services; they could simply have provided services through separate agreements. Holly was warned in advance that the messaging could be misleading in this way, but she didn't change it. I think this was deceptive, and some protest participants agreed with me. The community should notice this failure mode and implement something that would prevent unilateral decisions with bad consequences or noticeable violations of deontology.

See a comment from Holly in a footnote[1] and a post she published before I published this post. In both, she continued to ignore the core of what was deceptive: not the "charter" mistake, but the claim that OpenAI was starting to work with the Pentagon, without providing important context about the nature of the work. She further made false statements in the comments, which is easily demonstrable with the messages she sent me; I'm happy to share those on request[2], DM or email me.

The original version of this post didn't mention Holly's name and tried to point at a failure mode I wanted the community to fix. But since Holly commented on the post, it's clear who it is about, and the situation hasn't improved, the redaction was only making the post needlessly confusing, so I edited the post and removed the earlier redactions of the name.

Correction: the post previously asserted[3] that a co-organiser of the protest is a high-status member of the EA community. People told me this might be misleading, as many in the community disagree with their strategies, so I removed that wording.

Thanks to everyone who provided helpful comments and feedback.

Recently, a group of EA/EA-adjacent people announced a protest against OpenAI changing their charter to work with the Pentagon.

But OpenAI didn't change its charter. They changed only the usage policy: a document that describes how their publicly available service can be used. The previous version of the policy would not have prevented OpenAI in any way from, say, developing weapons for the Pentagon under a separate agreement.

The nature of the announced work was also, in my opinion, beneficial: they announced a collaboration with DARPA on open-source cybersecurity, and said they’re in talks with the Pentagon about helping prevent suicides among veterans.

This differs from what people understood from both the initial announcement (to protest OpenAI changing its charter to work with the Pentagon) and the subsequently updated version, which corrected (and acknowledged) the "charter" mistake:

> Join us and tell OpenAI "Stop working with the Pentagon!"
>
> On January 10th, without any announcement, OpenAI deleted the language in its usage policy* that had stated that OpenAI doesn’t allow its models to be used for “activities that have a high chance of causing harm” such as “military and warfare”. Then, on January 17th, TIME reported that OpenAI would be taking the Pentagon as a client. On 2/12, we will demand that OpenAI end its relationship with the Pentagon and not take any military clients. If their ethical and safety boundaries can be revised out of convenience, they cannot be trusted.
>
> AI is rapidly becoming more powerful, far faster than virtually any AI scientist has predicted. Billions are being poured into AI capabilities, and the results are staggering. New models are outperforming humans in many domains. As capabilities increase, so do the risks. Scientists are even warning that AI might end up destroying humanity.
>
> According to their charter, “OpenAI’s mission is to ensure that artificial general intelligence (AGI)—by which we mean highly autonomous systems that outperform humans at all economically valuable work—benefits all of humanity.” But many humans value their work and find meaning in it, and hence do not want their jobs to be done by an AGI instead. What protest co-organizer [name] of [org] calls “the Psychological Threat” applies even if AGI doesn't kill us.
>
> *an earlier version of this description incorrectly referred to the usage policy as a “charter”

The usage policies still prohibit the use of their services to "harm people, develop weapons, for communications surveillance, or to injure others or destroy property." It is technically true that OpenAI wants to have the Pentagon as a client: they collaborate with DARPA on open-source cybersecurity and are talking to the Pentagon about veteran suicide prevention. But I think that, even with "charter" changed to "the usage policy", the resulting phrasing is deceptive: a reader gets an impression that diverges from reality in ways that make them disagree with OpenAI's actions and more likely to come to the protest.

People understand it to mean that previously OpenAI couldn't work with the Pentagon because of the policy, but now can, which is, as far as I'm aware, false. Previously, it wasn't clear whether the Pentagon could sign up on the OpenAI website, just like everyone else, and use the publicly available service; but nothing would've prevented OpenAI from making an agreement with the Pentagon outside their public terms of service. (Also, the usage policies were mostly changed to increase readability.)

A housemate realized all of that after a short conversation and said they no longer wanted to attend the protest, as the messaging was misleading and they didn't want to contribute to spreading it. (I told them it's fine to go and protest[4] reckless OpenAI actions without necessarily supporting the central messaging of the protest organizers.) Three or four pro-Palestine activists attended the protest because they saw the announcement and decided to speak out against OpenAI working with the Pentagon on military uses of AI, which might be helping Israel. They possibly wouldn't have come to the protest had they known OpenAI wasn't actually helping the Pentagon with weapons.

This is not normal. If you're organizing a protest (or doing any public comms), you want people to be more likely to attend the protest (or support your message) as they become more aware of the details of the situation (such as OpenAI only working with DARPA on open-source cybersecurity tools, being in talks with the Pentagon about veteran suicide prevention, and still having "no use for weapons or causing harm to humans" in their public policies), not less likely.

> If their ethical and safety boundaries can be revised out of convenience, they cannot be trusted.

These were not OpenAI's ethical and safety boundaries; they were part of the policy for their publicly available service. Whether or not OpenAI can, e.g., develop weapons is not affected by this change.

I think that when the organisers become aware of details like these, they should change the protest's message (or halt or postpone the protest: generally, it's important to make sure people don't decide to come for the wrong reasons, though in this case simply fixing the messaging to not be misleading would've been fine).[5]

Instead, the organisers edited the press release at the last moment before sending it out, replacing "charter", and went ahead with the protest and what I consider deceptive messaging.

Even the best-intentioned people are not perfect, often have some inertia, and aren't always able to make sure the messaging they put out isn't misleading or to fully propagate updates.

Not being able to look at previously made plans and revise them in light of new information seems bad. Spreading misleading messages to the media and the people following you seems very bad.

I ask the community to think about designing mechanisms to avoid these failure modes in the future.

I feel that stronger norms around unilateral actions could've helped (for example, if people in the community had looked into this and suggested not doing this bad thing).

  1. ^

    A comment from a protest organiser

    (Holly wrote it in third person.)

    > This differs from what people understood from both the initial announcement to protest OpenAI changing its charter to work with the Pentagon and the subsequently changed version that corrected (and acknowledged) the "charter" mistake

    Holly Elmore explains that she simply made a mistake when writing the press release weeks before the event. She quoted the charter early on when drafting it, and then, in a kind of word mistake that is unfortunately common for her, started using the word “charter” for both the actual charter and the usage policy document. It was unfortunately a semantic mistake, so proofreaders didn’t catch it. She also did this verbally in several places. She even kind of convinced herself from hearing her own mistaken language that OpenAI had violated a much more serious boundary– their actual guiding document– than they had. She was horrified when she discovered the mistake because it conveyed a significantly different meaning than the true story, and could have slandered OpenAI. She spent hours trying to track down every place she had said it and people who may have repeated it so it could be corrected. She told the protesters right away about her mistake and explained changing the usage policy is a lot less bad than changing the charter, but the protest was still on, as it had been before the military story arose as the “small ask” focus to the “big ask” of pausing AI.

    My reply

    I felt confused, first by her private messages and then by her comments, until I realised she simply didn't understand the problem I'm talking about in this post. I'm concerned not about the "charter", which was indeed an honest and corrected mistake, but about the final messaging. The message remained "OpenAI changed some important documents to enable them to work with the Pentagon", which creates an impression different from reality: people think OpenAI did something to be able to work on military applications of AI.

    The day after the protest, I talked to three participants who were surprised to hear that OpenAI is only working on open-source cybersecurity tools and is in talks about veteran suicide prevention, and that the change of the usage policies didn't impact OpenAI's ability to separately work with the military. They agreed the messaging of the protest was misleading. (All three would've been happy to come to a more general protest about OpenAI's recklessness in racing to develop AGI.)

    Some people told me that, initially, a more general protest against OpenAI was actually planned, and the more specific message emerged after the news of the change of policies. I would have no problem with the protest whatsoever, and wouldn't be writing this post, if the protest had fallen back to the original, more general messaging that wasn't misleading people about OpenAI's relationship with the Pentagon. I encouraged people to attend despite the misleading messaging, because it's possible to attend and communicate something that creates a truthful impression despite what the announcement and the press release say.

    The point that I want to make is that the community needs to design mechanisms to avoid unilateral actions leading to something deontologically bad, such as spreading messages the community knows are not truthful. Making decisions under pressure, avoiding motivated cognition, etc., are hard; without accepted mechanisms for coordination, fact-checks, strong norms around preventing deception, etc., we might do harm.

  2. ^

    My policy is usually not to share private messages, but these contain threats; if I get threats from someone, my policy is instead to publish or freely share that person's messages.

  3. ^

    One of the protest organisers is a high-status member of the EA community with a lot of connections within it, who spoke about organising protests at the recent EAG conference. I expect many members of the EA community might feel like they shouldn't criticise actions discussed in an EAG talk, even if they notice these actions are bad. (I no longer strongly expect that people would be too intimidated to criticise the actions in this case.)

  4. ^

    Participating in a protest that's happening anyway can be positive, although I wasn't entirely sure; it also didn't feel right to reduce the number of people going to a protest that someone with similar goals had organised.

    To be clear, in general, protests can be great, and I wouldn't be writing this post and trying to get the community to pay attention to this protest if not for the misleading messaging.

  5. ^

    I think the protest went mostly well, except for the message in the announcement and the press release. Most people participating were doing a great job, and most signs weren't clearly wrong. If she had fallen back to a more general message instead of keeping the misleading one, I wouldn't be writing this post.

A comment from Viliam:

Commenting here, because I don't have an account at the EA forum.

Seems like we have two debates in parallel:

  • whether the protest information was misleading (in a very important way)
  • whether the misleading information was intentional.

From my perspective, the answer to the former question is "definitely yes". If I participated in a protest against OpenAI cooperating with the Pentagon, I would feel really ashamed if I later learned that the cooperation was about veteran suicide prevention. That would go against... well, the things that make me want to be a rationalist.

(An argument could be made that the veteran suicide prevention is a "foot in the door". Today it is preventing veteran suicide; tomorrow it could be increasing the troops' morale, effective propaganda, psychological terror, who knows what else. But even in that case, I would like to make it clear that I am protesting against crossing a line that potentially leads to bad things, rather than the bad things already happening today.)

The second question is tricky -- Mikhail feels justified in using stronger language, because he communicated his concerns to the organizers and was ignored. But maybe it was an honest mistake in the communication. So, even if the accusation is justified, it would have been better to keep it more separate from the main point.

I find it baffling that the most upvoted comment as of now calls it "a massive storm in a teacup". I am just an outsider here, but this is exactly the kind of thing that would make someone lose a lot of credibility in my eyes. If you wanted me to update towards "whatever EA people tell me is probably misinformation optimized for maximum outrage", this would be a good way to do it. And the priors for "activists exaggerate" are already high.