"I think there are a lot of things that are morally important that do seem like they require memory or involve memory. So having long term projects and long term goals, that's something that human beings have. I wouldn't be surprised if having memory versus not having memory is also just kind of a big determinant of what sorts of experiences you can have or affects what experiences you have in various ways. And yeah, it might be important for having an enduring self through time. So that's one thing that people also say about large language models is they seem to have these short-lived identities that they spin up as required but nothing that lasts their time."
There's the interesting/tragic case of Clive Wearing, who has both retrograde and anterograde amnesia, causing him to experience consciousness only from one moment to the next. His brain is still able to construct internal narratives of his identity and experiences, which I would consider the definition of consciousness, but the lack of access to previously recorded narratives makes it seem to him that those experiences were unconscious and that he's only just now attaining consciousness for the first time.
I would argue, as I'm sure most humans would agree, that he still has moral worth. So I'm not sure if lack of long-term memory should in itself exclude AIs from moral consideration.
Perhaps the moral worth of a system should be the product of sentience (capacity to experience suffering, at least) and consciousness (level of sophistication of the system's internal self-narratives), where moral worth is defined as the weight we give to a system's preferences when calculating trade-offs with other agents' preferences in morally ambiguous situations. Of course, the problem with language models is, as you alluded to, that you can't simply take their word for it when they declare their sentience and consciousness, even if that's perfectly reasonable to do with humans. They're only trained to predict what humans would say in the same context, after all. We will need to have some way of looking at their internal structures to gauge whether and to what extent they meet these criteria.
I like the hypothetical Nigeria question-answer pair. It takes advantage of the latest thinking about how to detect and quantify sentience with black-box tests. I think Artificial You listed several questions in its intelligence and sentience tests that this one QA pair accomplishes in one fell swoop.
"I think a more convincing version of the Lemoine thing would’ve been, if he was like, “What is the capital of Nigeria?” And then the large language model was like, “I don’t want to talk about that right now, I’d like to talk about the fact that I have subjective experiences and I don’t understand how I, a physical system, could possibly be having subjective experiences, could you please get David Chalmers on the phone?”"
i don't understand why this would be convincing. why would whether a language model's output sounds like a claim that one has qualia relate to whether the language model actually has qualia?
i agree that the output would be deserving of attention due to it (probably) matching the training data so poorly; to me such a response would be strong evidence for the language model using much more ~(explicit/logical) thought than i expect gpt-3 to be capable of, but not of actual subjective experience
I agree, it still wouldn't be strong evidence for or against. No offence to any present or future sentient machines out there, but self-honesty isn't really clearly defined for AIs just yet.
My personal feeling is that LSTMs and transformers with attention on past states would explicitly have a form of self-awareness, by definition. Then I think this bears ethical significance according to something like the compression ratio of the inputs.
As a side note, I enjoy Iain M. Banks's representation of how AIs could communicate emotions in future in addition to language: by changing colour across a rich field of hues. This doesn't try to make a direct analogy to our emotions, and in that sense it makes the problem clearer by framing it as a clustering of internal states.
I talked to Robert Long, a research fellow at the Future of Humanity Institute, who works at the intersection of the philosophy of AI safety and AI consciousness. Robert did his PhD at NYU, advised by David Chalmers, who is known for popularizing p-zombies, which Yudkowsky discusses in the Sequences.
We talk about the recent LaMDA controversy about the sentience of large language models (see Robert's summary), the metaphysics and philosophy of consciousness, artificial sentience, and how a future filled with digital minds could get really weird.
Below are some highlighted quotes from our conversation (available on YouTube, Spotify, Google Podcasts, Apple Podcasts). For the full context of each of these quotes, you can find the accompanying transcript.
Why Artificial Sentience Might Matter
Things May Get Really Weird In The Near Future
Why illusionists about consciousness still have to answer hard questions about AI welfare
On The Asymmetry of Pain & Pleasure
The Sign Switching Argument
On the Sentience Of Large Language Models
On conflating intelligence and sentience
Memory May Be An Important Part Of Consciousness
On strange possible experiences
What Would A More Convincing Case For Artificial Sentience Look Like
(Note: as mentioned at the beginning of the post, these quotes are excerpts from a podcast episode, whose full transcript you can find here, and thus lack some of the context and nuance from the rest of the conversation.)