There’s a theory (Twitter citing Reddit) that at least one of these people filed GDPR right-to-be-forgotten requests. So one hypothesis would be: all of those people filed such GDPR requests.
But the Reddit post (as of right now) guesses that it might not be specifically about GDPR requests per se, but rather more generally “It's a last resort fallback for preventing misinformation in situations where a significant threat of legal action is present”.
OA has indirectly confirmed it is a right-to-be-forgotten thing in https://www.theguardian.com/technology/2024/dec/03/chatgpts-refusal-to-acknowledge-david-mayer-down-to-glitch-says-openai
ChatGPT’s developer, OpenAI, has provided some clarity on the situation by stating that the Mayer issue was due to a system glitch. “One of our tools mistakenly flagged this name and prevented it from appearing in responses, which it shouldn’t have. We’re working on a fix,” said an OpenAI spokesperson
...OpenAI’s Europe privacy policy makes clear that users can delete their personal data from its products, in a process also known as the “right to be forgotten”, where someone removes personal information from the internet.
OpenAI declined to comment on whether the “Mayer” glitch was related to a right to be forgotten procedure.
Good example of the redactor's dilemma and the need for Glomarizing: by confirming that they have a tool to flag names and hide them, and then by neither confirming nor denying that this was related to a right-to-be-forgotten order (a meta-gag), they confirm that it's a right-to-be-forgotten bug.
Similar to when OA people were refusing to confirm or deny signing OA NDAs which forbade them from discussing whether they had signed an OA NDA... That was all the evidence you needed to know that there was a meta-gag order (as was eventually confirmed more directly).
I don't think it's necessarily GDPR-related, but the names Brian Hood and Jonathan Turley make sense from a legal liability perspective. According to Ars Technica:
Why these names?
We first discovered that ChatGPT choked on the name "Brian Hood" in mid-2023 while writing about his defamation lawsuit. In that lawsuit, the Australian mayor threatened to sue OpenAI after discovering ChatGPT falsely claimed he had been imprisoned for bribery when, in fact, he was a whistleblower who had exposed corporate misconduct.
...The case was ultimately resolved
This looks like it's related to the phenomenon of glitch tokens:
https://www.lesswrong.com/posts/8viQEp8KBg2QSW4Yc/solidgoldmagikarp-iii-glitch-token-archaeology
https://www.lesswrong.com/posts/f4vmcJo226LP7ggmr/glitch-token-catalog-almost-a-full-clear
ChatGPT no longer uses the same tokenizer that it used when the SolidGoldMagikarp phenomenon was discovered, but its new tokenizer could be exhibiting similar behavior.
It's not a classic glitch token. Those did not cause the current "I'm unable to produce a response" error that "David Mayer" does.
I don't think this explanation makes sense. I asked ChatGPT "Can you tell me things about Akhmed Chatayev", and it had no problem using his actual name over and over. I asked about his aliases and it said
Akhmed Chatayev, a Chechen Islamist and leader within the Islamic State (IS), was known to use several aliases throughout his militant activities. One of his primary aliases was "Akhmed Shishani," with "Shishani" translating to "Chechen," indicating his ethnic origin.
Additionally, Chatayev adopted the alias "David
Then threw an error messag...
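The behavior above (text streams normally, then the whole response errors out the moment the name would be completed) is what you would expect from a check applied to the output as it is generated, rather than from anything inside the model. Here is a minimal sketch of that idea. Everything in it is an assumption for illustration: `BLOCKED_NAMES`, `BlockedOutputError`, and `filtered_stream` are hypothetical names, and OpenAI has not published how its tool actually works.

```python
from typing import Iterable, Iterator, Sequence

# Hypothetical blocklist; the real list and matching rules are not public.
BLOCKED_NAMES = ["David Mayer"]


class BlockedOutputError(Exception):
    """Raised when the text generated so far contains a blocked name."""


def filtered_stream(
    chunks: Iterable[str], blocked: Sequence[str] = BLOCKED_NAMES
) -> Iterator[str]:
    """Pass chunks through unchanged until the accumulated text contains a blocked name."""
    seen = ""
    for chunk in chunks:
        seen += chunk
        lowered = seen.lower()
        # Check the whole text so far, so a name split across chunks is still caught.
        if any(name.lower() in lowered for name in blocked):
            raise BlockedOutputError("I'm unable to produce a response.")
        yield chunk
```

Under these assumptions, a stream like `['the alias "David', ' Mayer', '" ...']` would show the first chunk and then fail on the second, which matches the output being cut off right after "David".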
This oddity is making the rounds on Reddit, Twitter, Hacker News, etc.
Is OpenAI censoring references to one of these people? If so, why?
https://en.m.wikipedia.org/wiki/David_Mayer_de_Rothschild
https://en.wikipedia.org/wiki/David_Mayer_(historian)
Edit: More names have been found that behave similarly:
Source: https://www.reddit.com/r/ChatGPT/comments/1h420u5/unfolding_chatgpts_mysterious_censorship_and/
Update: "David Mayer" no longer breaks ChatGPT but the other names are still problematic.