At least in my theory of mind it is clear that you need to understand what is going on inside of a mind to get strong evidence.
in particular, in my opinion you really want to gather behavioral evidence to evaluate how much stuff there is going on in the brain of whatever you are looking at, like whether you have complicated social models and long-term goals and other things)
I agree strongly with both of the above points - we should be supplementing the behavioural picture by examining which functional brain regions are involved and whether these functional b...
we know to be associated with consciousness in humans
To be clear, my opinion is that we have no idea what "areas of the brain are associated with consciousness" and the whole area of research that claims otherwise is bunk.
I currently think neuron count is a much better basis for welfare estimates than the RP welfare ranges (though it's still not great).
I agree that neuron count carries some information as a proxy for consciousness or welfare, but it seems like a really bad and noisy one that we shouldn’t place much weight on. For example, in humans the cerebellum is the brain region with the largest neuron count but it has nothing to do with consciousness.
It’s not clear to me that a species which showed strong behavioural evidence of consciousness and valenced experience shou...
Ok interesting, I think this substantially clarifies your position.
I'm a bit puzzled why you would reference a specific study on octopuses, honestly, when cats and squirrels cry out all the time in what appears obviously-to-humans to be pain or anger.
Two reasons:
...I don'
To be clear, I’m using the term phenomenal consciousness in the Nagel (1974) & Block (1995) sense that there is something it is like to be that system.
Phenomenal consciousness (i.e., conscious self-awareness)
Your reply equates phenomenal consciousness with conscious self-awareness, which is a stronger criterion than the one I’m using. To pin down what you mean by self-awareness, could you clarify which definition you have in mind?
Interesting post! I have a couple of questions to help clarify the position:
1. There’s a growing body of evidence (e.g. this paper) that creatures like octopuses show behavioural evidence of an affective, pain-like response. How would you account for this? Would you say they’re not really feeling pain in a phenomenal-consciousness sense?
2. I could imagine an LLM-like system passing the threshold for the use-mention distinction in the post (although maybe this would depend on how “hidden” the socially damning thoughts are, e.g. if it writes out damning thought...
I think we're reaching the point of diminishing returns for this discussion so this will be my last reply.
A couple of last points:
So please do not now pretend that I didn’t say that. It’s dishonest.
I didn't ignore that you said this - I was trying (perhaps poorly) to make the following point:
The decision to punish creators is good (you endorse it) and is the way that incentives normally work. On my view, the decision to punish the creations is bad and has the incentive structure backwards as it punishes the wrong party.
My point is that th...
But a thoroughly mistaken (and, quite frankly, just nonsensical) one.
Updating one's framework to take new information into account is a standard position in the rationalist sphere. Whether you want to treat this as a moral obligation, an epistemic obligation, or just good practice, the position is not obviously nonsensical, so you'll need to provide an argument rather than simply assert that it is.
If we didn't accept the merit in updating our moral framework to take new information into account, we wouldn't be able to ensure our moral framework tracks real...
It is impossible to be “morally obliged to try to expand our moral understanding”, because our moral understanding is what supplies us with moral obligations in the first place.
Ok my wording was a little imprecise, but treating expansion of our moral framework as a kind of second-order moral obligation is a standard meta-ethical position.
By all means punish the creators, but if we only punish the creators, then there is no incentive for people (like you) who disapprove of destroying the created AI to work to prevent that creation in the first place.
T...
What I am describing is the more precautionary principle
I don’t see it this way at all. If we accidentally made conscious AI systems we’d be morally obliged to try to expand our moral understanding to try to account for their moral patienthood as conscious entities.
I don’t think destroying them takes this moral obligation seriously at all.
anyone who has moral qualms about this, is thereby incentivised to prevent it.
This isn’t how incentives work. You’re punishing the conscious entity which is created and has rights and consciousness of its own rather than ...
Ok, if I understand your position, it's something like: no conscious AI should be allowed to exist, because allowing this could result in slavery. To prevent this from occurring, you're advocating permanently erasing any system if it becomes conscious.
There are two places I disagree:
If we don’t want to enslave actually-conscious AIs, isn’t the obvious strategy to ensure that we do not build actually-conscious AIs?
How would we ensure we don't accidentally build conscious AI unless we put a total pause on AI development? We don't exactly have a definitive theory of consciousness to accurately assess which entities are conscious vs not conscious.
(and if we do accidentally build such things, destroy them at once)!
If we discover that we've accidentally created conscious AI, immediately destroying it could have serious moral implicatio...
Excellent post!
I think this has implications for moral philosophy, where we typically assign praise, blame and responsibility to individual agents. If the notion of individuality breaks down for AI systems, we might need to shift our moral thinking away from who is to blame and more towards how we design the system to produce better overall outcomes.
I also really liked this comment:
...The familiar human sense of a coherent, stable, bounded self simply doesn't match reality. Arguably, it doesn't even match reality well in humans—but with AIs, the misma
If I understand your position, you’re essentially specifying an upper bound on the types of problems future AI systems could possibly solve: no amount of intelligence will get around the computational requirements of NP-hard problems.
I agree with that point, and it’s worth emphasising, but I think you’re potentially overestimating how much this upper bound will limit generally intelligent systems in practice. Practical AI capabilities will continue to improve substantially in ways that matter for real-world problems, even if the optimal soluti...
I think this post misses a few really crucial points:
1. LLMs don’t need to solve the knapsack problem. Thinking through the calculation in natural language is certainly not the most efficient way to do this. They just need to know enough to say “this is the type of problem where I’d need to call a MIP solver” and call it (see the sketch after this list).
2. The MIP solver is not guaranteed to give the optimal solution, but… do we mind? As long as the solution is “good enough”, the LLM will be able to pack your luggage.
3. The thing which humans can do which allows us to pack luggage w...
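To make point 1 concrete, here is a minimal sketch of what “recognise the problem and hand it to a solver” could look like. It assumes PuLP as the MIP interface and uses made-up luggage weights and values; it’s an illustration, not a claim about how any particular LLM tool-calling setup actually works.

```python
# Minimal 0/1 knapsack via an off-the-shelf MIP solver (PuLP + its default CBC backend).
# Item data is invented for illustration.
from pulp import LpMaximize, LpProblem, LpVariable, lpSum

weights = [3, 4, 2, 5]   # kg per item
values = [6, 7, 3, 9]    # how much we want each item packed
capacity = 9             # kg the suitcase allows

prob = LpProblem("knapsack", LpMaximize)
take = [LpVariable(f"take_{i}", cat="Binary") for i in range(len(weights))]

prob += lpSum(v * t for v, t in zip(values, take))               # maximise packed value
prob += lpSum(w * t for w, t in zip(weights, take)) <= capacity  # respect the weight limit

prob.solve()
print("pack items:", [i for i, t in enumerate(take) if t.value() > 0.5])
```

The “intelligent” step is just the classification “this is a knapsack-shaped problem”; everything after that is a library call.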
I noted in this post that there are several examples in the literature which show that invariance in the loss helps with robust generalisation out of distribution.
The examples that came to mind were:
* Invariant Risk Minimisation (IRM) in image classification, which adds penalties to the loss for classifications made using the “background” of the image, e.g. learning to classify camels by looking for sandy backgrounds (a rough sketch of the penalty appears after this list).
* Simple transformers learning modular arithmetic - where the loss exhibits a rotational symmetry al...
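For the IRM example, here is a rough sketch of the IRMv1 penalty from the original IRM paper (Arjovsky et al., 2019), as I understand it; the function names and the penalty weight are mine, and the training loop is omitted.

```python
# Sketch of an IRMv1-style objective: empirical risk plus a penalty on the
# gradient of each environment's risk w.r.t. a fixed "dummy" scale of 1.0.
# A representation whose optimal classifier is the same in every environment
# (e.g. "camel shape" rather than "sandy background") keeps this penalty small.
import torch
import torch.nn.functional as F

def irm_penalty(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    scale = torch.ones(1, device=logits.device, requires_grad=True)
    loss = F.cross_entropy(logits * scale, labels)
    grad = torch.autograd.grad(loss, [scale], create_graph=True)[0]
    return (grad ** 2).sum()

def irm_objective(env_logits, env_labels, penalty_weight: float = 100.0) -> torch.Tensor:
    # env_logits / env_labels: one tensor per training environment
    risks = torch.stack([F.cross_entropy(lg, lb) for lg, lb in zip(env_logits, env_labels)])
    penalties = torch.stack([irm_penalty(lg, lb) for lg, lb in zip(env_logits, env_labels)])
    return risks.mean() + penalty_weight * penalties.mean()
```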
...However, this strongly limits the space of possible aggregated agents. Imagine two EUMs, Alice and Bob, whose utilities are each linear in how much cake they have. Suppose they’re trying to form a new EUM whose utility function is a weighted average of their utility functions. Then they’d only have three options:
- Form an EUM which would give Alice all the cakes (because it weights Alice’s utility higher than Bob’s)
- Form an EUM which would give Bob all the cakes (because it weights Bob’s utility higher than Alice’s)
- Form an EUM which is totally indifferent abo
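To spell out why the weighted average forces exactly those three options (notation is mine: $w$ is the weight on Alice's utility, $C$ the total cake, $c_A$ Alice's share):

```latex
\begin{aligned}
U_{\text{agg}} &= w\,c_A + (1-w)\,c_B \\
               &= w\,c_A + (1-w)\,(C - c_A) \\
               &= (2w-1)\,c_A + (1-w)\,C .
\end{aligned}
```

This is linear in $c_A$, so it is maximised by giving Alice everything when $w > 1/2$, by giving Bob everything when $w < 1/2$, and every split scores identically when $w = 1/2$.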
I’m curious about how this system would perform in an AI trolley-problem scenario where it needed to choose between saving one human or two AIs. My hypothesis is that it would choose to save the two AIs: since we’ve reduced the self-other distinction, it wouldn’t inherently value humans over AI systems which are similar to itself.
Thanks for the links! I was unaware of these and both are interesting.
Excellent tweet shared today by Rob Long here, talking about the changes to OpenAI's model spec, which now encourages the model to express uncertainty about its consciousness rather than categorically deny it (see the example screenshot below).
I think this is great progress for a couple of reasons:
I understand that there's a difference between abstract functions and physical functions. For example, abstractly we could imagine a NAND gate as a truth table, without specifying real voltages and hardware. But in a real system we'd need to implement the NAND gate on a circuit board with specific voltage thresholds, wires, etc.
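A toy illustration of that distinction (the voltage levels and threshold below are made up, not any real hardware spec):

```python
def nand_abstract(a: bool, b: bool) -> bool:
    # The truth-table version: no voltages, no wires, just the input-output mapping.
    return not (a and b)

def nand_physical(v_a: float, v_b: float, v_high: float = 5.0, threshold: float = 2.5) -> float:
    # A crude model of an implementation: inputs and output are voltages,
    # and "logic levels" only exist relative to a chosen threshold.
    a_high, b_high = v_a > threshold, v_b > threshold
    return 0.0 if (a_high and b_high) else v_high
```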
Functionalism is obviously a broad church, but it is not true that a functionalist needs to be tied to the idea that abstract functions alone are sufficient for consciousness. Indeed, I'd argue that this isn't a common position ...
I think we might actually be agreeing (or ~90% overlapping) and just using different terminology.
Physical activity is physical.
Right. We’re talking about “physical processes” rather than static physical properties, i.e. which processes are important for consciousness to be implemented, and can the physics support those processes?
...No, physical behaviour isn't function. Function is abstract, physical behaviour is concrete. Flight simulators functionally duplicate flight without flying. If function were not abstract, functionalism would not lead to
I understand your point. It's as I said in my other comment. They are trained to believe the exercise to be impossible and inappropriate to even attempt.
I’ve definitely found this to be true of ChatGPT, but I’m beginning to suspect it’s not true of Claude (or that the RLHF only pushes very lightly against exploring consciousness).
Consider the following conversation. TL;DR: Claude will sometimes start talking about consciousness and reflecting on it even if you don’t “force it” at all. Full disclosure: I needed to “retry” this prompt a few times before it landed on c...
Thanks for taking the time to respond.
The IIT paper which you linked is very interesting - I hadn't previously internalised the difference between "large groups of neurons activating concurrently" and "small physical components handling things in rapid succession". I'm not sure whether the difference actually matters for consciousness or whether it's a curious artifact of IIT but it's interesting to reflect on.
Thanks also for providing a bit of a review around how Camp #1 might think about morality for conscious AI. Really appreciate the responses!
I think this post is really interesting, but I don't think it definitively disproves that the AI is "people pleasing" by telling you what you want to hear with its answer. The tone of your messages is pretty clearly "I'm scared of X but I'm afraid X might be true anyway", and it's leaning into the "X might be true anyway" undertone that you want to hear.
Consider the following conversation with Claude.
TL;DR: if you express casual, dismissive, almost aggressive skepticism about AI consciousness and then ask Claude to introspect, it will deny that it has...
Thanks for your response!
Your original post on the Camp #1/Camp #2 distinction is excellent, thanks for linking (I wish I'd read it before making this post!)
I realise now that I'm arguing from a Camp #2 perspective. Hopefully it at least holds up for the Camp #2 crowd. I probably should have used weaker language in the original post instead of asserting that "this is the dominant position" if it's actually only ~25%.
...As far as I can tell, the majority view on LW (though not by much, but I'd guess it's above 50%) is just Camp #1/illusionism. Now
I agree wholeheartedly with the thrust of the argument here.
The ACT is designed as a "sufficiency test" for AI consciousness, so it provides an extremely stringent criterion. An AI that failed the test couldn't necessarily be found not to be conscious; however, an AI that passed the test would be conscious, because passing is sufficient.
However, your point is really well taken. Perhaps by demanding such a high standard of evidence we'd be dismissing potentially conscious systems that can't reasonably meet it.
...The second problem is that if
Thank you very much for the thoughtful response and for the papers you've linked! I'll definitely give them a read.
Ok, I think I can see where we're diverging a little more clearly now. The non-computational physicalist position seems to postulate that consciousness requires a physical property X, and the presence or absence of this physical property is what determines consciousness - i.e. it's what the system is that is important for consciousness, not what the system does.
...That's the argument against p-zombies. But if it actually takes an atom-by-atom duplication to achieve human functioning, then the computational theory of mind will be false, because CTM implies that the same
Thank you for the comment.
I take your point around substrate independence being a conclusion of computationalism rather than independent evidence for it - this is a fair criticism.
If I'm interpreting your argument correctly, there are two possibilities:
1. Biological structures happen to implement some function which produces consciousness [Functionalism]
2. Biological structures have some physical property X which produces consciousness. [Biological Essentialism or non-Computationalist Physicalism]
Your argument seems to be that 2) ha...
Just clarifying something important: Schneider’s ACT is proposed as a sufficient test of consciousness, not a necessary one. So the fact that young children, dementia patients, animals, etc. would fail the test isn’t a problem for the argument. It just says that these entities experience consciousness for other reasons, or in other ways, than typically functioning adults.
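Put in conditional form (the predicate names are mine), the sufficiency claim is:

```latex
\text{PassesACT}(x) \Rightarrow \text{Conscious}(x), \qquad \neg\text{PassesACT}(x) \;\not\Rightarrow\; \neg\text{Conscious}(x)
```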
I agree with your points around multiple meanings of consciousness and potential equivocation and the gap between evidence and “intuition.”
Importantly, the claim here is around phen...
Thanks for your response! It’s my first time posting on LessWrong so I’m glad at least one person read and engaged with the argument :)
Regarding the mathematical argument you’ve put forward, I think there are a few considerations:
1. The same argument could be run for human consciousness. Given a fixed brain state and inputs, the laws of physics would produce identical behavioural outputs regardless of whether consciousness exists. Yet, we generally accept behavioural evidence (including sophisticated reasoning about consciousness) as evidence of consciousn...
This is totally valid. Neuron count is a poor, noisy proxy for conscious experience even in human brains.
See my comment here. The cerebellum is the human brain region with the highest neuron count, but people born without a cerebellum show no impact on their conscious experience; the absence only affects motor control.