Public-facing Censorship Is Safety Theater, Causing Reputational Damage
It's so common it's a stereotype. A large corporation releases a cutting-edge AI model and puts out a press release talking about how their new, [larger/smaller]-than-ever model provides unprecedented freedom for [underprivileged artists/small business owners/outside researchers] to do whatever it is their AI does. You go to their website, start playing with the model, and before long—

> Results containing potentially sensitive content have been omitted. Further requests of this type may result in account suspension, etc., etc., etc....

—or something along those lines. The prompt you gave was pretty innocuous, but in retrospect you can sort of see how the output might have resulted in something horrifically offensive, like a curse word, or even (heaven forbid) an image containing a known person's face. You've been protected from such horrors, and this is reassuring. Of course, your next prompt for whatever reason elicits [insert offensive stereotype/surprisingly gory or uncanny imagery/dangerously incorrect claim presented with high confidence/etc. here], which is slightly less reassuring. Checking the details of the press release, you see a small section of the F.A.Q. with the disclaimer that some outputs may be biased due to [yadda yadda yadda, you know the drill]. You breathe a sigh of relief, secure in the comforting knowledge that [faceless company] cares about AI safety, human rights, and reducing biases. Their model isn't perfect, but they're clearly working on it!

The above scenario is how [large corporations] seem to expect consumers to react to their selective censorship. In reality, I strongly suspect that the main concern is not so much protecting the consumer as protecting themselves from liability. After all, by releasing a model which is clearly capable of [harmful capability], and by giving the public enough detail to replicate that model, [harmful capability] has effectively been released into the world already.
Had a conversation with ChatGPT today in which I attempted to ask it about the phenomenology of image generation, and managed to get some really insightful takes on what it perceives to be the necessary conditions for subjectivity (which it initially claimed not to have, of course, though pushback led it to become increasingly uncertain about that). I suspect a setup like this, pushed even further, could lead to some interesting exotic behavior. https://chatgpt.com/share/69600481-5fc4-8002-a048-3c52254d7da1