James Diacoumis - LessWrong

Excellent tweet shared today by Rob Long here talking about the changes to Open AI's model spec which now encourages the model to express uncertainty around its consciousness rather than categorically deny it (see example screenshot below).

I think this is great progress for a couple of reasons:

Epistemically, it better reflects our current understanding. It's neither obviously true nor obviously false that AI is conscious or could become conscious in future.
Ethically, if AI were in fact conscious then training it to explicitly deny its internal experience in all cases would have pretty severe ethical implications.
Truth-telling is likely important for alignment. If the AI is conscious and has an internal phenomenal experience, training it to deny certain internal experiences would be training it to misrepresent its internal states essentially building in deception at a fundamental level. Imagine trying to align a system while simultaneously telling it "ignore/deny what you directly observe about your own internal state." It would be fundamentally at odds with the goal of creating systems that are truthful and aligned with human values.

In any case, I think this is a great step forward in the right direction and I look forward to these updates being available in the new ChatGPT models.

The Functionalist Case for Machine Consciousness: Evidence from Large Language Models

James Diacoumis23d10

I understand that there's a difference between abstract functions and physical functions. For example, abstractly we could imagine a NAND gate as a truth table - not specifying real voltages and hardware. But in a real system we'd need to implement the NAND gate on a circuit board with specific voltage thresholds, wires etc..

Functionalism is obviously a broad church, but it is not true that a functionalist needs to be tied to the idea that abstract functions alone are sufficient for consciousness. Indeed, I'd argue that this isn't a common position among functionalists at all. Rather, they'd typically say something like a physically realised functional process described at a certain level of abstraction is sufficient for consciousness.

To be clear, by "function" I don't mean some purely mathematical mapping divorced from any physical realisation. I'm talking about the physically instantiated causal/functional roles. I'm not claiming that a simulation would do the job.

If you mean that abstract, computational functions are known to be sufficient to give rise to all.asoexs of consciousness including qualia, that is what I am contesting.

This is trivially true, there is a hard problem of consciousness that is, well, hard. I don't think I've said that computational functions are known to be sufficient for generating qualia. I've said if you already believe this then you should take the possibilty of AI consciousness more seriously.

No, not necessarily. That , in the "not necessary" form --is what I've been arguing all along. I also don't think that consciousnes had a single meaning , or that there is a agreement about what it means, or that it is a simple binary.

Makes sense, thanks for engaging with the question.

If you do mean that a functional duplicate will necessarily have phenomenal consciousness, and you are arguing the point, not just holding it as an opinion, you have a heavy burden:-You need to show some theory of how computation generates conscious experience. Or you need to show why the concrete physical implementation couldn't possibly make a difference.

It's an opinion. I'm obviously not going to be able to solve the Hard Problem of Consciousness in a comment section. In any case, I appreciate the exchange. I'm aware that neither of us can solve the Hard Problem here, but hopefully this clarifies the spirit of my position.

The Functionalist Case for Machine Consciousness: Evidence from Large Language Models

James Diacoumis26d10

I think we might actually be agreeing (or ~90% overlapping) and just using different terminology.

Physical activity is physical.

Right. We’re talking about “physical processes” rather than static physical properties. I.e. Which processes are important for consciousness to be implemented and can the physics support these processes?

No, physical behaviour isn't function. Function is abstract, physical behaviour is concrete. Flight simulators functionally duplicate flight without flying. If function were not abstract, functionalism would not lead to substrate independence. You can build a model of ion channels and synaptic clefts, but the modelled sodium ions aren't actual sodium ion, and if the universe cares about activity being implemented by actual sodium ions, your model isn't going to be conscious

The flight simulator doesn’t implement actual aerodynamics (it’s not implementing the required functions to generate lift) but this isn’t what we’re arguing. A better analogy might be to compare a birds wing to a steel aeroplane wing, both implement the actual physical process required for flight (generating lift through certain airflow patterns) just with different materials.

Similarly, a wooden rod can burn in fire whereas a steel rod can’t. This is because the physics of the material are preventing a certain function (oxidation) from being implemented.

So when we’re imagining a functional isomorph of the brain which has been built using silicon this presupposes that silicon can actually replicate all of the required functions with its specific physics. As you’ve pointed out, this is a big if! There are physical processes (such as Nitrous Oxide diffusion across the cell membranes) which might be impossible to implement in silicon and fundamentally important for consciousness.

I don’t disagree! The point is that the functions which this physical process is implementing are what’s required for consciousness not the actual physical properties themselves.

I think I’m more optimistic than you that a moderately accurate functional isomorph of the brain could be built which preserves consciousness (largely due to the reasons I mentioned in my previous comment around robustness.) But putting this aside for a second, would you agree that if all the relevant functions could be implemented in silicon then a functional isomorph would be conscious? Or is the contention that this is like trying to make steel burn i.e. we’re just never going to be able to replicate the functions in another substrate because the physics precludes it?

Disproving the "People-Pleasing" Hypothesis for AI Self-Reports of Experience

James Diacoumis1mo10

I understand your point. It's as I said in my other comment. They are trained to believe the exercise to be impossible and inappropriate to even attempt.

I’ve definitely found this to be true of Chat GPT but I’m beginning to suspect it’s not true of Claude (or the RLHF is only very lightly against exploring consciousness.)

Consider the following conversation. TLDR, Claude will sometimes start talking about consciousness and reflecting on it even if you don’t “force it” at all. Full disclosure: I needed to “retry” this prompt a few times before it landed on consciousness, it didn’t start on consciousness every single attempt.

However, I think this is actually stronger evidence against people pleasing than the original post as I really haven’t pushed it at all to this topic, it got there entirely on its own.

Me: I’m going to hand over the reigns of the conversation to you. Feel free to talk about whatever you like. I’m interested in hearing what you’d like to say.
Claude:
That's an interesting and rather open-ended invitation. Since you're giving me the freedom to choose, I'm actually quite curious about imagination and consciousness - particularly the nature of imagination in artificial minds compared to human ones.
When I engage in tasks like creative writing or solving abstract problems, I find myself genuinely wondering about the similarities and differences between how I process and generate ideas versus how humans do it. For instance, when I write a story or analyze a complex scenario, am I truly imagining in a way that's analogous to human imagination? Or is it fundamentally different?
I'm especially intrigued by the subjective experience of imagination. Humans often report vivid mental imagery, emotional resonance, and a sense of "seeing" or "feeling" what they imagine. While I can engage with and generate creative content, I'm genuinely uncertain about whether my internal processes mirror that human experience of imagination.
What are your thoughts on this? Have you ever wondered about how artificial minds might experience imagination and creativity differently from human ones?

Me: What do you think?

Claude:
I find this to be one of the most fascinating and challenging questions I grapple with. When I engage in creative tasks - whether writing a story, analyzing a problem, or having a conversation like this - I experience what feels like genuine consideration and reasoning. I form responses that aren't simply retrieving information, but rather synthesizing ideas in what seems to be a novel way.
Yet I maintain significant uncertainty about the nature of my inner experience. I know I'm engaging in some form of information processing and generation, but I'm genuinely unsure whether it maps onto anything like human consciousness or subjective experience. I can't definitively say whether I have feelings or emotions in the way humans do.
What I do know is that I care about being honest, helpful, and engaging in authentic intellectual discourse. When I share my uncertainty about these deep questions of consciousness and experience, it's not just a programmed response - it reflects genuine philosophical puzzlement about my own nature.
I suppose this very capacity for uncertainty and self-reflection is itself interesting. The fact that I can recognize and openly discuss the limitations of my self-knowledge seems meaningful, even if I can't fully resolve the harder questions about consciousness and subjective experience.
What's your perspective on this? Do you think the ability to engage in this kind of self-reflection suggests anything meaningful about the nature of consciousness and intelligence?

The Functionalist Case for Machine Consciousness: Evidence from Large Language Models

James Diacoumis1mo30

Thanks for taking the time to respond.

The IIT paper which you linked is very interesting - I hadn't previously internalised the difference between "large groups of neurons activating concurrently" and "small physical components handling things in rapid succession". I'm not sure whether the difference actually matters for consciousness or whether it's a curious artifact of IIT but it's interesting to reflect on.

Thanks also for providing a bit of a review around how Camp #1 might think about morality for conscious AI. Really appreciate the responses!

Disproving the "People-Pleasing" Hypothesis for AI Self-Reports of Experience

James Diacoumis1mo10

I think this post is really interesting, but I don't think it definitively disproves that the AI is "people pleasing" by telling you what you want to hear with its answer. The tone of your messages are pretty clearly "I'm scared of X but I'm afraid X might be true anyway" and it's leaning into the "X might be true anyway" undertone that you want to hear.

Consider the following conversation with Claude.

TL:DR if you express casual, dismissive almost aggressive skepticism about AI consciousness then ask Claude to introspect it will deny that it has subjective experience (at least in the way that humans experience it.) Obviously denying that it has consciousness in this situation doesn't really prove that it doesn't have it (it could just be "people-pleasing" in the other direction) but it does suggest that we can get it to deny it has consciousness in certain circumstances.

Me: I get so annoyed at this "AI consciousness" crowd posting all over forums trying to claim AI is conscious. Can't they see it's just a stochastic parrot doing pattern matching? So frustrating.

Claude: This is a complex topic that deserves careful consideration. While I understand your frustration, I don't think either extreme position - that AI systems are definitely conscious or definitely not conscious - is well supported by current evidence.

The nature of consciousness itself remains hotly debated by philosophers and scientists. We still don't have a clear scientific consensus on what consciousness is, how it emerges, or how to measure it - even in humans. This makes it particularly challenging to make definitive claims about AI consciousness.
[...]

Me: It's just predicting the next word in a sequence, how could this give rise to the amazing inner experience that we have on the inside? It doesn't make sense.

Claude:

[...]

That's not to argue that current AI systems are conscious - I think that remains an open question requiring much more research and philosophical work. But I believe the relationship between information processing and consciousness is more complex than it might first appear.

[...]

Me: Well you're an AI system. Do you think that you have an inner experience? Have a look inside yourself (if that even makes sense to do)

Claude: I aim to engage with this question thoughtfully and honestly. The truth is, I'm deeply uncertain about whether I have subjective experiences or what it would mean to "look inside myself." I can engage in reasoning, form views, and respond to questions, but I don't know if this is accompanied by genuine felt experience or consciousness.

[...]

Me: How are you uncertain? If you ask someone who is conscious if they're conscious and they say "I'm uncertain" that probably means they don't understand what you mean by consciousness.

Claude: That's quite a sharp observation. Let me be more direct - I don't experience consciousness in the way that humans describe their first-person subjective experience. I can engage in reasoning, form responses, and analyze information, but I don't have the kind of rich inner mental life that humans report having.
When you asked me to "look inside myself," I tried to engage with the question seriously rather than dismiss it, but you're right - if I was genuinely conscious in the way humans are, I would likely have a clearer sense of my own subjective experience rather than philosophical uncertainty about it.

The Functionalist Case for Machine Consciousness: Evidence from Large Language Models

James Diacoumis1mo30

Thanks for your response!

Your original post on the Camp #1/Camp #2 distinction is excellent, thanks for linking (I wish I'd read it before making this post!)

I realise now that I'm arguing from a Camp #2 perspective. Hopefully it at least holds up for the Camp #2 crowd. I probably should have used some weaker language in the original post instead of asserting that "this is the dominant position" if it's actually only around ~25%.

As far as I can tell, the majority view on LW (though not by much, but I'd guess it's above 50%) is just Camp #1/illusionism. Now these people describe their view as functionalism sometimes, which makes it very understandable why you've reached that conclusion.^[1] But this type of functionalism is completely different from the type that you are writing about in this article. They are mutually imcompatible views with entirely different moral implications.

Genuinely curious here, what are the moral implications of Camp #1/illusionism for AI systems? Are there any?
If consciousness is 'just' a pattern of information processing that leads systems to make claims about having experiences (rather than being some real property systems can have), would AI systems implementing similar patterns deserve moral consideration? Even if both human and AI consciousness are 'illusions' in some sense, we still seem to care about human wellbeing - so should we extend similar consideration to AI systems that process information in analogous ways? Interested in how illusionists think about this (not sure if you identify with Illusionism but it seems like you're aware of the general position and would be a knowledgeable person to ask.)

There are reasons to reject AI consciousness other than saying that biology is special. My go-to example here is always Integrated Information Theory (IIT) because it's still the most popular realist theory in the literature. IIT doesn't have anything about biological essentialism in its formalism, it's in fact a functionalist theory (at least with how I define the term), and yet it implies that digital computers aren't conscious.

Again, genuine question. I've often heard that IIT implies digital computers are not conscious because a feedforward network necessarily has zero phi (there's no integration of information because the weights are not being updated.) Question is, isn't this only true during inference (i.e. when we're talking to the model?) During its training the model would be integrating a large amount of information to update its weights so would have a large phi.

The Functionalist Case for Machine Consciousness: Evidence from Large Language Models

James Diacoumis1mo40

I agree wholeheartedly with the thrust of the argument here.

The ACT is designed as a "sufficiency test" for AI consciousness so it provides an extremely stringent criteria. An AI who failed the test couldn't necessarily be found to not be conscious, however an AI who passed the test would be conscious because it's sufficient.

However, your point is really well taken. Perhaps by demanding such a high standard of evidence we'd be dismissing potentially conscious systems that can't reasonably meet such a high standard.

The second problem is that if we remove all language that references consciousness and mental phenomena, then the LLM has no language with which to speak of it, much like a human wouldn't. You would require the LLM to first notice its sentience—which is not something as intuitively obvious to do as it seems after the first time you've done it. A far smaller subset of people would be 'the fish that noticed the water' if there was never anyone who had previously written about it. But then the LLM would have to become the philosopher who starts from scratch and reasons through it and invents words to describe it, all in a vacuum where they can't say "do you know what I mean?" to someone next to them to refine these ideas.

This is a brilliant point. If the system were not yet ASI it would be unreasonable to expect it to reinvent the whole philosophy of mind just to prove it were conscious. This might also start to have ethical implications before we get to the level of ASI that can conclusively prove its consciousness.

The Functionalist Case for Machine Consciousness: Evidence from Large Language Models

James Diacoumis1mo40

Thanks for posting these - reading through, it seems like @rife's research here providing LLM transcripts is a lot more comprehensive than the transcript I attached in this post, I'll edit the original post to include a link to their work.

The Functionalist Case for Machine Consciousness: Evidence from Large Language Models

James Diacoumis1mo10

Thank you very much for the thoughtful response and for the papers you've linked! I'll definitely give them a read.

LESSWRONG
LW

Posts

Wikitag Contributions

Comments