A chimp can think about how to get a banana when it sees the banana, or is hungry and remembers where bananas are, or some such. Whereas I can lie there and ponder something as remote as Mars colonization, like all day long! Maybe that’s because, when a chimpanzee’s quite advanced mind sees the banana, it tries to predict the world where it gets the banana and uses all its world-modeling power to come up with a coherent prediction of it. But that prediction is stable only for as long as there’s some banana-related stimulus; once there’s no such stimulus the loop becomes unstable - the upper layers of the model keep trying to predict the world where the chimp gets the banana, but the lower levels report back that, well, there’s no banana around, so after a while it dies out.
What's the evidence for this claim? As best I can tell we don't have any strong reason to believe chimps don't sit around daydreaming about bananas when they are not otherwise occupied.
Yeah that's a great question. In my reasoning absence of evidence is the evidence of absence, but admittedly a weak one. Theoretically I imagine you can run an experiment to test this, something like - present a chimpanzee with a very difficult but solvable task, let it have maybe one go to figure out how things work, then take the task away. Then let it have another try after some time, repeat (eta: with new chimpanzees or tasks), measure success as a function of time. I'm not saying it's a great experiment design, just that "spends resting hours on solving problems" is an externally observable quality. If there's any real evidence they do, I'll be the first to admit that part is not correct, and it was just a guess to begin with.
But this idea - self-consciousness is a model trained to predict other such models and generalizing to itself - seems both extremely obvious (in retrospect) and, as mentioned before, with one small exception I can’t remember ever hearing or reading about it.
The idea feels familiar enough that I didn't feel surprised to see you suggest it, but I'm not sure where exactly I might have first encountered it. Learning to be conscious seems like a somewhat similar model, at least:
Consciousness remains a formidable challenge. Different theories of consciousness have proposed vastly different mechanisms to account for phenomenal experience. Here, appealing to aspects of global workspace theory, higher-order theories, social theories, and predictive processing, we introduce a novel framework: the self-organizing metarepresentational account (SOMA), in which consciousness is viewed as something that the brain learns to do. By this account, the brain continuously and unconsciously learns to redescribe its own activity to itself, so developing systems of metarepresentations that qualify target first-order representations. Thus, experiences only occur in experiencers that have learned to know they possess certain first-order states and that have learned to care more about certain states than about others. In this sense, consciousness is the brain’s (unconscious, embodied, enactive, nonconceptual) theory about itself.
As does maybe this paper [edit: apparently it's written by the person who wrote the "Rethinking Consciousness" book]:
One possible explanation of consciousness, proposed here, is that it is a construct of the social perceptual machinery. Humans have specialized neuronal machinery that allows us to be socially intelligent. The primary role for this machinery is to construct models of other people’s minds thereby gaining some ability to predict the behavior of other individuals. In the present hypothesis, awareness is a perceptual reconstruction of attentional state; and the machinery that computes information about other people’s awareness is the same machinery that computes information about our own awareness. The present article brings together a variety of lines of evidence including experiments on the neural basis of social perception, on hemispatial neglect, on the out-of-body experience, on mirror neurons, and on the mechanisms of decision-making, to explore the possibility that awareness is a construct of the social machinery in the brain.
I'm also somewhat reminded of Thomas Metzinger's stuff about consciousness being a "self-model" (though it tends to be a bit of a pain to figure out what the heck exactly he's saying; I didn't even try doing more than skimming that page, and wouldn't recommend that to others, either), Dennett's notion of the self as a narrative center of gravity, and this LW comment.
Thanks a lot for the links! I didn't look into them yet, but the second quote sounds pretty much exactly like what I was trying to say, only expressed more intelligibly. Guess the broad concept is "in the air" enough that even a layman can grope their way to it.
I really like Graziano's Attention Schema Theory. Even more because it's essentially an illusionist theory.
Yeah, although seems only in the sense where "everything [we perceive] is illusion"? Which is not functionally different from "nothing is illusion". Unless I'm missing something?
Illusionist theories of consciousness say that people do not actually ever experience phenomenal qualities; we just have the very deep delusion that phenomenal qualities are a thing.
Whether it's a hallucination or not doesn't matter. Either way, our delusion tells us that we're perceiving things as "qualities", as "feels", even though all we are really perceiving is data. If I'm looking at something, then I am acquiring data that tells me, for instance, that there's a certain dark red shape at a certain location in my visual field, and a different reddish-orange shape near it, and whatever else. I'm acquiring the knowledge, the concept, of certain colors and brightnesses being in certain locations in my visual field. But I mistakenly think I am experiencing a dark red quality, and a reddish-orange quality, and the qualities of bright and dark and in-between, etc. I think I'm getting my visual input in the form of a subjective experience.
This paper that I'm linking clears up several important points about the Attention Schema Theory. Graziano admits that "Illusionist theories emphasize how subjective awareness does not really exist – the brain tricks itself into thinking it has it. Obviously, AST aligns with that perspective." But he says he doesn't think the word "illusion" is a helpful word for expressing this, and illusionism should have been called something else, and I think he's probably right. (It seems like it's too late to change it now, though.)
we're perceiving things as "qualities", as "feels", even though all we are really perceiving is data
I consider it my success as a reductionist that this phrase genuinely does not make any sense to me.
But he says he doesn't think the word "illusion" is a helpful word for expressing this, and illusionism should have been called something else, and I think he's probably right.
Yep, can't agree more, basically that's why I was asking - "illusion" doesn't sound like the right concept here.
A friend of mine has not-quite-complete-aphantasia, meaning he cannot "see" visual images in his "mind's eye", except for some rare occasions. When he remembers things he saw, or imagines what hypothetical things might look like, he almost never has mental imagery. But he can still recall information about what he saw, including stuff like "how many items were there", even though he did not consciously count the items when he was looking at them, and is only counting them by examining the memory when asked. I asked him how he was doing it, and he said it was like examining concepts about what he saw, and somehow using those concepts to figure out how many he remembers seeing.
For me to do the same thing, I would have to bring a mental image from my memory into my "mind's eye", and count how many items I "see".
But maybe this mental imagery of mine is just something I'm convinced I experience, rather than something I actually experience? Maybe having aphantasia is nothing more than lacking a certain delusion when it comes to remembered or imagined sights? After all, I haven't found anything that my mental images can figure out, which his concepts cannot. Or vice versa.
In any case, aphantasia only applies to mental images. When my friend is actually looking at something with his eyes (not remembering or imagining), then just like me, he's convinced that what he's experiencing is something more than just concepts.
An illusion is a misleading appearance, whereas a delusion is a false belief. Illusionists don't believe there are any appearances. They think that everything is a belief, but some beliefs are accompanied by a false meta-belief that they are not beliefs. So illusionism would be more literally described as delusionism.
Illusionism thinks the illusion-of-phenomenal-consciousness is 'perception-like' — it's more like seeing an optical illusion, and less like just having a stubborn hunch that won't go away even though there's no apparent perceptual basis for it.
The view you're describing is different from illusionism, and is more like the one articulated by Dennett for most of his career. E.g., Dennett's 1979 “On the absence of phenomenology”:
[...] Since I hold that we have privileged access only to judgments, and since I cannot make sense of any claim to the effect that something to which I do not have privileged access is an element of my immediate conscious experience, I am left defending the view that such judgments exhaust our immediate consciousness, that our individual streams of consciousness consist of nothing but such propositional episodes, or better: that such streams of consciousness, composed exclusively of such propositional episodes, are the reality that inspires the variety of misdescriptions that pass for theories of consciousness, both homegrown and academic.
[...] You may be wondering if you even have judgments. Typically these episodes are the momentary, wordless thinkings or convictions (sometimes misleadingly called conscious or episodic beliefs) that are often supposed to be the executive bridges leading to our public, worded introspective reports from our perusal or enjoyment of the phenomenological manifold our reports are about. My view, put bluntly, is that there is no phenomenological manifold in any such relation to our reports. There are the public reports we issue, and then there are episodes of our propositional awareness, our judgments, and then there is—so far as introspection is concerned—darkness. What lies beyond or on the interior of our judgments of the moment, what grounds or causes or controls them, is wholly a matter for science or speculation—in any event it is not a matter to which we have any privileged access at all.
Or his 1991 Consciousness Explained:
[...] You seem to think there’s a difference between thinking (judging, deciding, being of the heartfelt opinion that) something seems pink to you and something really seeming pink to you. But there is no difference.
There is no such phenomenon as really seeming – over and above the phenomenon of judging in one way or another that something is the case.
Indeed, Dennett describes non-physicalism as being based on a "hunch", as though it were just a nagging hard-to-pin-down belief and not something that feels palpably present in all experience. This seems very weird to me.
These days I believe Dennett endorses illusionism instead, though I'm not sure what changed his mind if so? And I have to wonder whether he has some aphantasia-like condition that made a view as weird as delusionism appealing.
Illusionism thinks the illusion-of-phenomenal-consciousness is ‘perception-like’ — it’s more like seeing an optical illusion, and less like just having a stubborn hunch that won’t go away even though there’s no apparent perceptual basis for it.
If you say so, but it doesn't make it any easier to believe!
Indeed, Dennett describes non-physicalism as being based on a “hunch”, as though it were just a nagging hard-to-pin-down belief and not something that feels palpably present in all experience
It's not clear that it's a hunch, and it's not clear that it's a palpable presence. Physics is a complicated subject that most people do not understand, so why would anyone have reliable hunches or introspections about non-physicality? And the phenomenology varies anyway... some people, but nobody here, have the intuition that thought is non-physical.
And I have to wonder whether he has some aphantasia-like condition that made a view as weird as delusionism appealing.
Dennett's phenomenology is a mystery. Maybe it's like one of those early computer games that prints out "you see a fire breathing dragon".
These and other definitions all seem to mostly shuffle the mysteriousness of consciousness between the words "subjective", "experience" and "qualia", and if you try to find a philosophical definition of those you'll end up going in a sepulki loop pretty quickly.
Qualia are sufficiently well defined to enable us to tell that you have not solved the hard problem. Qualia are sensory qualities. "Examples of qualia include the perceived sensation of pain of a headache, the taste of wine, as well as the redness of an evening sky."
Qualia are not self-reflexive awareness, or awareness-of-awareness, the thing that you have offered to explain in order to solve the hard problem.
(Edited for typos)
I think we can have it both ways on qualia. Compare these two posts of mine:
Similar problem I'm afraid ... I just don't see how the things you are talking about relate to qualia.
I guess I don't see how it doesn't, since it's a reification of experience, which is what qualia are.
I think this is basically what you're saying, though: you're talking about qualities of experience/sensations, but to speak of qualities is to categorize things rather than just experience them, which means some part of experience is being made into a thing.
I don't think you can explain qualities just by explaining categorisation. If that's what you are saying.
Whether or not they are explained, the important thing I'm pointing at is that they are of the same type, hence whether or not you find an explanation satisfying, it is aiming to explain the same kind of thing.
I can categorise stones, and I can categorise flowers, but that doesn't mean stones are flowers. In general, the ability to categorise things of type X doesn't exhaustively describe them, because they also have intrinsic properties. There is a theory of qualia according to which they don't have any interesting properties other than being different from each other, the GENSYM theory, but I don't know whether you're endorsing it.
For one, the part about qualia talks specifically about sensory inputs. So I'm not sure how what you're saying is an objection, can you clarify? Also, this part leans really heavily on "Surfing Uncertainty", not sure if you've read it but if you haven't that may be another issue.
And if you say you have a good definition of quale, even if you can't quite put it into words - can you explain to me when a sensation perceived by a human (in the biological sense of perceiving) stops being a quale? Like with your example of the "redness of the evening sky". Let's say I sit and stare at the sky marveling at how red it is - in my understanding I definitely experience a quale. What if I'm just walking down the street and am only vaguely aware that the sky is red? What if I'm running for my life from a dog, or blind drunk, or somehow else have no clue what color the sky is even though it's technically in my field of vision?
For one, the part about qualia talks specifically about sensory inputs
But not about sensory qualities.
Also, this part leans really heavily on “Surfing Uncertainty”, not sure if you’ve read it but if you haven’t that may be another issue.
I read Scott's review. And it appears Clark is not saying anything about qualia. Or are you saying Clark has solved the HP?
And if you say you have a good definition of quale, even if you can’t quite put it into words—
I quoted Wikipedia. They managed to put it into words.
can you explain to me when a sensation perceived by a human (in the biological sense of perceiving) stops being a quale?
Why does that matter? Do you have something to say about qualia that depends on knowing the answer?
The part that you quoted doesn't define anything, it's just 3 examples, which together may be just as well defined simply as "sensations". And the Wikipedia article itself lists a number of different, non-equivalent definitions, none of which is anything I'd call rigorous, plus a number of references to qualia proponents who claim that this or that part of some definition is wrong (e.g. Ramachandran and Hirstein say that qualia could be communicated), plus a list of qualia opponents who have significant issues with the whole concept. That is exactly what I'm referring to as "ill-defined".
Now you say that you think qualia is well-defined, so I'm asking you to help me understand the definition you have in mind, so we can talk about it meaningfully. That's why the questions matter - I can't answer you whether I think I or Clark or Graziano or whoever else solved the hard problem if I don't understand what you mean by the hard problem (for which not all definitions even include the term "qualia").
Do you have something to say about qualia that depends on knowing the answer?
Well of course, everything I have to say depends on knowing the answer, because the answer would help me understand what it is that you mean by qualia. So do you feel like your definition allows you to answer this question? And, while we're at it, my follow-up question of whether you assume animals have qualia and, if yes, which of them? That would be very helpful for my understanding.
The part that you quoted doesn’t define anything, it’s just 3 examples, which together may be just as well defined simply as “sensations”.
No, they are about the quality of sensations. You keep trying to pull the subject towards "explaining sensation" because you actually can explain sensation, absent the qualities of sensation. But if the HP were really about explaining sensation in that way it wouldn't be hard. You should be using the famous hardness of the HP as a guide to understanding it... If it seems easy, you've got it wrong.
And the Wikipedia article itself lists a number of different, non-equivalent definitions, none of which is anything I'd call rigorous
But that might be an isolated demand for rigour.
FYI, there is no precise and universally accepted definition of "matter".
(e.g. Ramachandran and Hirstein say that qualia could be communicated
Note that not everything that is true of qualia (or anything else) needs to be in the definition.
Now you say that you think qualia is well-defined
I didn't say that.
I can’t answer you whether I think I or Clark or Graziano or whoever else solved the hard problem if I don’t understand what you mean by the hard problem
I'm not using an idiosyncratic definition.
So do you feel like your definition allows you to answer this question?
I would not expect a definition alone to answer every possible question. I once read a paper arguing that unseen qualia are a coherent idea, but I forget the details.
I'm not trying to pull the subject towards anything, I'm just genuinely trying to understand your position, and I'd appreciate a little bit of cooperation on your part in this - such as answering any of the questions I asked. And "I don't know" is a perfectly valid answer; I have no intention of "gotcha"-ing you or anything like that, and by your own admission the problem is hard. So I'd ask you not to interpret any of my words above or below as an attack; quite the opposite, I'm doing my best to see your point.
You should be using the famous hardness of the HP as a guide to understanding it... If it seems easy, you've got it wrong.
With all due respect, that sounds to me like you're insisting that the answer to a mysterious question should be itself mysterious, which it shouldn't. Sorry if I'm misinterpreting your words; in that case, again, I'd appreciate you being a bit more clear about what you're trying to say.
FYI, there is no precise and universally accepted definition of "matter".
Exactly, and that is why using a Wikipedia article for definitions in such debates is not a good idea. Ideally, I'd ask you (or try myself in an identical situation) to taboo the words "qualia" and "hard problem" and try to explain exactly which question(s) you think remain unanswered by the theory. But failing that, we can at least agree on the definition of qualia.
And even if we insist on using Wiki as the source of truth, here's the direct quote: "Much of the debate over their importance hinges on the definition of the term, and various philosophers emphasize or deny the existence of certain features of qualia. Consequently, the nature and existence of various definitions of qualia remain controversial because they are not verifiable." To me it sounds at odds with, again direct quote: "Qualia are sufficiently well defined to enable us to tell that you have not solved the hard problem". If the nature and even the existence of something depend on the definition, it's not sufficiently well defined to tell whether theory X explains it. (Which is all not to say that you're wrong and Wikipedia is right; I don't think it's the highest authority on such matters. Just that you seem to have some different, narrower definition in mind, so we can't use a reference to the wiki as the source of truth.)
Note that not everything that is true of qualia (or anything else) needs to be in the definition.
Yeah, I kinda hoped that I wouldn't need to spell it out, but okay, here we go. You're correct, not everything that's true of qualia needs to be in the definition. However, I would insist that a reasonable definition doesn't directly contradict any important true facts. Whereas one of the definitions in that wiki article (by Dennett) says that qualia are "private; that is, all interpersonal comparisons of qualia are systematically impossible."
I would not expect a definition alone to answer every possible question.
Again, totally agree, that's why I started with specific questions rather than definitions. So, considering that "I don't know" is a perfectly reasonable answer, could you maybe try answering them? Or, if that seems like a better option to you, give an example of a question which you think proves Graziano/my theory isn't sufficient to solve the hard problem?
With all due respect, that sounds to me like you’re insisting that the answer to a mysterious question should be itself mysterious,
I'm not saying that. But answers to questions should be relevant.
Exactly, and that is why using a Wikipedia article for definitions in such debates is not a good idea. Ideally, I’d ask you (or try myself in an identical situation) to taboo the words “qualia” and “hard problem
I've already done that. I can replace "qualia" with "sensory qualities", and point out that you are not solving the hard problem because you are not explaining sensory qualities.
And even if we insist on using Wiki as the source of truth, here’s the direct quote: “Much of the debate over their importance hinges on the definition of the term, and various philosophers emphasize or deny the existence of certain features of qualia. Consequently, the nature and existence of various definitions of qualia remain controversial because they are not verifiable.” To me it sounds at odds with, again direct quote: “Qualia are sufficiently well defined to enable us to tell that you have not solved the hard problem”.
There's no real contradiction. Even though there is disagreement about some features of qualia, there can still be agreement that they are in some sense about sensory qualities. I used a simple, almost naive definition, consisting of a few examples, for a reason.
Or, if that seems like a better option to you, give an example of a question which you think proves Graziano/my theory isn’t sufficient to solve the hard problem?
I've said so already, haven't I? A solution to the HP would allow you to predict sensory qualities from detailed brain scans, in the way that Mary can't.
Replacing it with another word which you then use identically isn't the same as tabooing; that kind of defeats the purpose.
there can still be agreement that they are in some sense about sensory qualities.
There may be, but then it seems there's no agreement about what sensory qualities are.
I've said so already, haven't I? A solution to the HP would allow you to predict sensory qualities from detailed brain scans, in the way that Mary can't.
No, you have not; in fact, in all your comments you haven't mentioned "predict" or "Mary" or "brain" even once. But now we're getting somewhere! How do you tell that a certain solution can or can't predict "sensory qualities"? Or better, when you say "predict qualities from the brain scans" do you mean "feel/imagine them yourself as if you've experienced those sensory inputs firsthand", or do you mean something else?
there’s no agreement about what sensory qualities are.
They're things like the perceived sensation of pain of a headache, the taste of wine, as well as the redness of an evening sky.
I don't believe that's difficult to understand.
How do you tell that a certain solution can or can’t predict “sensory qualities”?
How do you tell that a putative explanation can predict something? You make a theoretical prediction, and you perform an experiment to confirm it.
Otherwise, non-predictiveness is the default.
So, a solution to the HP needs to be able to make a theoretical prediction: there needs to be some gizmo where you input a brain state and get a predicted quale as output.
Sure, I wasn't claiming at any point to provide a precise mathematical model, let alone an implementation, if that's what you're talking about. What I was saying is that I have guesses as to what that mathematical model should be computing. In order to tell whether the person experiences a quale of X (in the sense of them perceiving this sensation), you'd want to see whether the sensory input from the eyes corresponding to the red sky is propagated all the way up to the top level of the predictive cascade - the level capable of modeling itself to a degree - and whether this top level's state is altered in a way that reflects itself observing the red sky.
And admittedly what I'm saying is super high level, but I've just finished reading a much more detailed and, I think, fully compatible account of this in this article that Kaj linked. In their sense, I think the answer to your question is that a quale (perceived sensation) arises when both attention and awareness are focused on the input - see the article for specific definitions.
The situation where the input reaches the top level and affects it, but is not registered subjectively, corresponds to attention without awareness in their terms (or to the information having propagated to the top level, but the corresponding change in the top-level state not being reflected in itself). It's observed in people with blindsight, and has also been recreated experimentally.
In order to tell whether the person experiences a quale of X (in the sense of them perceiving this sensation
Only, you define "quale" in terms of experiencing versus not experiencing.
Looking at your debate both with me and with Gordon below, it seems like your side of the argument mostly consists of telling the opponent "no you're wrong" without providing any evidence for that claim. I honestly did my best to raise the sanity waterline a little, but to no avail, so I don't see much sense in continuing.
“no you’re wrong” without providing any evidence for that claim.
We're mostly arguing about the definition of qualia. I've quoted Wikipedia, you haven't quoted anybody.
when a sensation perceived by a human (in the biological sense of perceiving) stops being a quale?
When it stops feeling like your "self-awareness" and starts feeling like "there was nobody “in there”". And then it raises questions like "why not having ability to do recursion stops you from feeling pain".
Yeah that sounds reasonable and in line with my intuitions. Where by "somebody" I would mean consciousness - the mind modeling itself. The difference between "qualia" and "no qualia" would be the difference between the signal of e.g. pain propagating all the way to the topmost, conscious level, which would predict not just receiving the signal (as all layers below also do), but also predict its own state altered by receiving the signal. In the latter case, the reason why the mind knows there's "somebody" experiencing it, is because it observes (=predicts) this "somebody" experiencing it. And of course that "somebody" is the mind itself.
And then it raises questions like "why not having ability to do recursion stops you from feeling pain".
Well my - and many other people's - answer to that would be that of course it doesn't, for any reasonable definition of pain. Do you believe it does?
I believe it depends on one's preferences. Wait, you think it doesn't? By "ability to do recursion" I meant "ability to predict its own state altered by receiving the signal", or whatever the difference of the top level is supposed to be. I assumed that in your model, whoever doesn't implement it doesn't have qualia and therefore doesn't feel pain, because there is no one to feel it. And for those interested in the Hard Problem, the question would be "why does this specific physical arrangement, interpreted as recursive modeling, feel so different from when the pain didn't propagate to the top level".
I don't think qualia - to the degree it is at all a useful term - has much to do with the ability to feel pain, or anything. In my understanding all definitions of qualia assume it is a different thing from purely neurological perceptions (which is what I'd understand by "feelings"), more specifically that the perceptions can generate qualia sometimes in some creatures but they don't automatically do.
Otherwise you'd have to argue one of the two:
Both sound rather indefensible to me, so it follows that an animal can feel pain without experiencing a quale of it, just like a scallop can see the light without experiencing a quale of it. But two caveats on this. First, I don't have a really good grasp on what a quale is, and as Wikipedia attests, neither do the experts. I feel there's some core of truth that people are trying to get at with this concept (something along the lines of what you said in your first comment), but it's also very often used as a rug for people to hide their confusion under, so I'm always skeptical about using this term. Second, whether or not one should ascribe any moral worth to agents without consciousness/qualia is decidedly not a part of what I'm saying here. I personally do, but as you say it depends on one's preferences, and so is largely orthogonal to the question of how consciousness works.
In my understanding all definitions of qualia assume it is a different thing from purely neurological perceptions (which is what I’d understand by “feelings”), more specifically that the perceptions can generate qualia sometimes in some creatures but they don’t automatically do.
Of course, the minimal definition of "qualia" I have been using doesn't have that implication.
Ok, by these definitions what I was saying is "why not having ability to do recursion stops you from having pain-qualia?". Just feeling like there is a core of truth to qualia ("conceivability" in zombie language) is enough to ask your world-model to provide a reason why not everything, including recursively self-modeling systems, feels like qualia-less feelings - why is recursive self-modeling not just another kind of reaction and perception?
Ah, I see. My take on this question would be that we should focus on the word "you" rather than "qualia". If you have a conscious mind subjectively perceiving anything about the outside world (or its own internal workings), it has to feel like something, almost by definition. Like, if you went to get your covid shot and it hurt, you'd say "it felt like something". If and only if somehow you didn't even feel the needle piercing your skin would you say "I didn't feel anything". There were experiments showing that people can react to a stimulus they are not subjectively aware of (mostly for visual stimuli), but I'm pretty sure in all those cases they'd say they didn't see anything - basically that's how we know they were not subjectively aware of it. What would it even mean for a conscious mind to be aware of a stimulus but it not "feeling like something"? It must have some representation in the consciousness; that's basically what we mean by "being aware of X" or "consciously experiencing X".
So I'd say that given a consciousness experiencing stuff, you necessarily have conscious experiences (aka qualia); that's basically a tautology. So the question becomes why some things have consciousness, or to narrow it down to your question - why are (certain) recursively self-modeling systems conscious? And that's kind of what I was trying to explain in part 4 of the post, and approximately the same idea, just from another perspective, is much better covered in this book review and this article.
But if I tried to put it in one paragraph, I'd start with - how do I know that I'm conscious and why do I think I know it? And the answer would be a ramble along the lines of: well, when I look into my mind I can see me, i.e. some guy who thinks and makes decisions and is aware of things, and has emotions and memories and so on and so forth. And at the same time as I see this guy, I also am this guy! I can have different thoughts whenever I choose to (to a degree), I can do different things whenever I choose to (to a still more limited degree), and at the same time I can reflect on the choice process. So my theory is that I can perceive myself as a human mind mostly because the self-reflecting model - which is me - has trained to perceive other human minds so well that it learned to generalize to itself (see the whole post for the details). Although Graziano, in the article and book I linked, provides a very convincing explanation as to why this self-modeling would also be very helpful for general reasoning ability - something I was unsuccessfully trying to figure out in part 5.
So my theory is that I can perceive myself as a human mind mostly because the self-reflecting model—which is me—has trained to perceive other human minds so well that it learned to generalize to itself.
What's your theory for why consciousness is actually your ability to perceive yourself as a human mind? From your explanation it seems to be
You are basically saying that the consciousness detector in the brain is an "algorithm of awareness" detector (and an algorithm of awareness can work as an "algorithm of awareness" detector). But what are the actual reasons to believe it? Only that if it is awareness, then it explains why you can detect it? It certainly is not a perfect detector, because some people will explicitly say "no, my definition of consciousness is not about awareness". And because it doesn't automatically fit into "If you have a conscious mind subjectively perceiving anything about the outside world, it has to feel like something" if you just replace "conscious" by "able to perceive itself".
Those are all great points. Regarding your first question: no, that's not the reasoning I have. I think consciousness is the ability to reflect on myself firstly because it feels like the ability to reflect on myself. Kind of like how the reason I believe I can see is that when I open my eyes I start seeing things, and if I interact with those things they really are mostly where I see them; nothing more sophisticated than that. There's a bunch of longer, more theoretical arguments I can bring for this point, but I never thought I should, because I was kind of taking it as a given. It may well be me falling into the typical mind fallacy, if you say some people say otherwise. So if you have different intuitions about consciousness, can you tell:
And because it doesn't automatically fit into "If you have a conscious mind subjectively perceiving anything about the outside world, it has to feel like something" if you just replace "conscious" by "able to perceive itself".
Well, no, it doesn't fit quite that simply, but overall I think it works out. If you have an agent able to reflect on itself and model itself perceiving something, it's going to reflect on the fact that it perceives something. I.e. it's going to have some mental representation for both the perception and for itself perceiving it. It will be able to reason about itself perceiving things, and if it can communicate it will probably also talk about it. Different perceptions will be in relation to each other (e.g. the sky is not the same color as grass, and grass color is associated with summer and warmth and so on). And, perhaps most importantly, it will have models of other such agents perceiving things, and it will assume at a high abstract level that they have the same perceptions in them. But it will only have access to the lower-level data for such perceptions from its own sensory inputs, not others', so it won't be able to tell for sure what it "feels like" to them, because it won't be getting their stream of low-level sensory inputs.
In short, I think - and please do correct me if you have a counterexample - that we have reasons to expect such an agent to make any claim humans make (given similar circumstances and training examples), and we can make any testable claim about such an agent that we can make about a human.
To me it looks like the defining feature of consciousness intuition is one's certainty in having it, so I define consciousness as the only thing one can be certain about and then I know I am conscious by executing "cogito ergo sum".
I can imagine disabling specific features associated with awareness starting with memory: seeing something without remembering feels like seeing something and then forgetting about it. Usually when you don't remember seeing something recent it means your perception wasn't conscious, but you certainly forgot some conscious moments in the past.
Then I can imagine not having any thoughts. It is harder for long periods of time, but I can create short durations of just seeing that, as far as I remember, are not associated with any thoughts.
At that point it becomes harder to describe this process as self-awareness. You could argue that if there is a representation of the lower level somewhere in the high level, then it is still modeling. But there is no more reason to consider these levels parts of the same system than to consider any sender-receiver pair a self-modeling system.
I don't know. It's all ethics, so I'll probably just check for some arbitrary similarity-to-human-mind metric.
we have reasons to expect such an agent to make any claim humans make
Depending on detailed definitions of "reflect on itself" and "model itself perceiving", I think you can make an agent that wouldn't claim to be perfectly certain of its own consciousness. For example, I don't see a reason why some simple Cartesian agent with direct read-only access to its own code would think in terms of consciousness.
But it will only have access to the lower-level data for such perceptions from its own sensory inputs, not others’, so it won’t be able to tell for sure what it “feels like” to them, because it won’t be getting their stream of low-level sensory inputs.
That's nothing new, it's the intuition that the Mary thought experiment is designed to address.
Regarding Mary, I think it is reasonable to take Mary as being informed of all of the literature. But if one takes it that "Mary knows all about color", then that could include meditation and lucid dreaming skills which could make it possible for the abstract parts of the brain to simulate and drive the perceptual parts of the brain. It might literally be possible to hallucinate colors, and it might literally be possible to hallucinate colors one hasn't already seen. The distinction between these kinds of knowing is still valuable, but I am a stickler for possibility claims.
An unguided hallucination trip would probably be a random walk and be helpless as an epistemic tool. Intricate knowledge of neurobiology and connectomes and such might be able to function as a kind of guide for constructing very specific kinds of hallucinations/experiences. Like how very intricate knowledge of the quirks of a simulation of an MMO like World of Warcraft might give a clue how to blow a real-world physical fuse to alter the simulation. From inside the game you wouldn't guess that it is running on computers, but if you know it is, and that security flaws are a thing, then seeing through the spoon can give power.
Thus what is and what is not possible in letting a mind go wild on itself is difficult to say.
I pretty much agree with you. Human intelligence may be high because it is used to predict/interpret the behaviour of others. Consciousness may be that same intelligence turned inward. But:
3. Given enough computational power and a compatible architecture, the agent will develop consciousness if and only if it needs to interact with other agents of the same kind, or at least of similar level of intelligence.
This does not automatically follow I think. There may be other ways that can lead to the same result.
An existing example would be cephalopods (octopus, squid & co.) From what I understand, they are highly intelligent, yet live very short lives, are not social (don't live in large groups, like humans), and have no "culture" (tricks that are taught from generation to generation)[1].
Instead, their intelligence seems to be related to their complex bodies, which requires lots of processing power.
Which is why I think that interaction with other similar entities is not needed for consciousness to emerge. I think the interaction just has to be complex (which is more general than your requirement of interaction with complex beings). For example, a sufficient number of "simple" input/output channels (lots of suckers) can be just as complex as, for example, human language. Because it is efficient to model/simplify this complexity, intelligence and then consciousness may emerge.
I am therefore of the opinion that either octopi are already conscious, or that if you were to increase the number of their arms n, for n -> infty they sooner or later should be.
In any case, they may dream
[1] This may not be completely correct. There seems to be some kind of hunting tactic that involves 1 octopus and 1 grouper (fish), where they each drive prey towards the other in turn. The grouper, being longer lived, may teach this to others?
Epistemic Status: Speculations on top of well-established theories.
Introduction
As the conventional memetic wisdom goes, “after decades and decades of research, science still has no idea how the human mind works”. As anyone passingly familiar with the relevant fields of science knows, this is of course a load of bull.
However, even in works written specifically about the human mind as a whole, it’s common to treat consciousness itself with no small degree of reverence. “Yeah sure, my work explains how people perceive the world, or how we deceive ourselves, or what drove people to evolve to be so smart, or whatever else, but that’s just the small stuff. The big questions of the nature of consciousness are not for my humble self to answer, and generally it’s a mysterious miracle shrouded in the veil of enigma, probably best left to philosophers”. The reasons for this are not hard to see - explaining the nature of consciousness is a status claim nobody can possibly make with impunity. Also, everyone has consciousness, so everyone thinks they know how it works and will argue with you, and the philosophers would be unhappy, to say the least, if the hard sciences took yet another loaf of bread from them.
[Here and throughout the post I’m using consciousness to mean “self-awareness in the sense of being aware of one’s own awareness”, the ability to introspect one’s own mind, which separates humans from all or almost all other animals. If this still sounds ambiguous and ill-defined, hopefully the specific meaning will become clearer upon reading parts 3 and 4; for now bear with me.]
But with the amount of progress made in neuroscience, psychology, AI and other relevant fields of study, one would expect the hard sciences to have at least something relevant to say on the nature of consciousness. And in fact I will argue in this post that certain theories fit together very nicely and provide a quite satisfying explanation of what exactly the hell self-awareness is and why it exists.
The theories in question are the manifold hypothesis (see part 1), predictive processing (part 2) and Hanson’s “Elephant in the Brain” theory (part 3), and I've never seen them brought together before in this fashion (or in any other, for that matter). Which is of course not to say nobody’s come up with this idea before, just that I haven’t heard of it. I have seen a somewhat similar explanation for consciousness, not involving any of these theories, mentioned once briefly in a rationalist Slack channel, but that's pretty much it.
So after spending some time myself poking holes in this idea and discussing it with a couple of smart people, I've decided to make it into a post on LW [my first post here by the way] to get broader help in figuring out whether any of this makes sense and/or if somebody has thought of it before.
Now, let's start with something simple and widely acknowledged: people have thoughts. And those thoughts generally model the real world in some way. But how? If you go from "thought" down the abstraction levels you'll eventually get to "pattern of neurons firing", but what lies in between?
1. Manifolds
The manifold hypothesis, which has been tested a number of times and is generally considered to be true, states that objects in the extremely high-dimensional data of the real world are contained within [relatively] lower-dimensional manifolds. I assume that most people here are familiar with it to some degree; if not, here’s a short explanation. And it seems to me that this idea maps perfectly onto our common-sense concept of an “object in the abstract”, of the “general notion” of an object, as opposed to any specific instance.
Let’s take the classic ML example of a cat. You can think of a cat in general, without it being any specific cat. If I show you a closed box and tell you there’s a cat in it, it vastly narrows down your choice of all the possible collections of atoms that could fit in that box. If I add just a few more variables (age, weight, fur coloration and type, eye color) you’ll be able to picture it quite accurately - because just knowing that it’s a cat has already reduced your search space to a small subspace of much lower dimensionality, compared with the space of all possible things which could fit in the box. And conversely, for the vast majority of real-world objects, given a reasonable amount of data you’ll be able to tell confidently whether it is, or isn’t, a cat.
Of course, these manifolds are incredibly fuzzy at the boundaries - just as the tests of the manifold hypothesis have shown. There are plenty of things out there which you can argue are or aren’t a cat - like those Savannah cats, or a dead cat, or a cat’s embryo. And that’s in addition to the fact that the verbal handle “cat” we’re using to refer to the manifold is ambiguous - do you mean to include, e.g., an image or a figure of a cat? But generally the world does come in chunks it can be carved into.
So what presumably happens in the brain (and in a sufficiently powerful NN) is that the high-dimensional input is reduced, turned, twisted and scaled to map it into a lower-dimensional space where the manifolds are easy to delineate. At which point you stop seeing an amorphous blob of paint and think - aha, that’s a cat! How does the brain learn to do this? Where does it get labels for cats and dogs and everything else? Well, it seems this question has also been answered.
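To make the "data lives on a low-dimensional manifold" picture concrete, here is a minimal toy sketch of my own (not anything from the post; the swiss-roll dataset and the Isomap algorithm are just one convenient illustration among many): points that look 3-dimensional in fact lie on a 2-dimensional surface, and a standard manifold-learning method recovers coordinates on that surface.

```python
# Toy illustration of the manifold idea (my sketch, not the post's method):
# data that is nominally high-dimensional actually lies on a lower-dimensional
# surface, and manifold learning can recover coordinates on that surface.
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# 3-D points that in fact lie on a 2-D "swiss roll" surface embedded in 3-D space.
X, t = make_swiss_roll(n_samples=2000, noise=0.05, random_state=0)

# Isomap tries to "unroll" the surface into its intrinsic 2-D coordinates.
embedding = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

print(X.shape)          # (2000, 3)  -- ambient dimensionality
print(embedding.shape)  # (2000, 2)  -- intrinsic dimensionality of the manifold
```

The cat example is the same story at a vastly larger scale: the space of pixel arrays is enormous, but the "cat" images occupy a comparatively thin, low-dimensional sheet within it.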
2. Predictive Processing
For anyone curious how the brain works - in any meaning of this sentence - I truly can’t recommend “Surfing Uncertainty” by Andy Clark highly enough. At the very least, you should read Scott Alexander’s brilliant review of the book, and then decide for yourself. The book describes the predictive processing theory, which is, to quote Scott, “a theory of how the brain works – a real unifying framework theory like Darwin’s or Einstein’s – and it’s beautiful and it makes complete sense.”
Since I totally agree with this judgement, everything below is written under the assumption that the theory is broadly true, at least as far as information processing is concerned. (When it comes to motivation, the predictive processing account is much less convincing, but I think that’s fine, and by the author’s own acknowledgement that part is not necessary for the theory to work.) The rest of the post is not guaranteed to make sense if you haven’t read at least the review on SSC. But for those who haven’t read it and don’t want to go through the whole thing now, here are a few paragraphs to give some idea - I’m quoting Scott, so you can take a break and read the writing of someone who’s actually good at it:
You can see how this allows the models in human (and other animal) brains to be trained on real-world data, with the input at T+1 serving as the labels for T. The successes of GPT-2 and GPT-3 - which happened after the book was written - are a good argument for predictive processing being true. You can see how this way, given sufficiently powerful hardware, the model can learn pretty high-level abstract concepts, like “trip” or “repetition” or “tool”. And by learn I mean not just being able to perform or perceive these things, but actually having notions of them similar to those in a human mind, being able to tell whether a certain thing “is a trip” or “isn’t a trip”, and so on.
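As a toy illustration of "the input at T+1 serves as the label for T" (my own sketch, not anything from the book or the review): a tiny online predictor guesses the next sample of a signal from the last few samples, and the prediction error alone drives the learning, with no external labels needed.

```python
# Toy sketch: learning purely from next-step prediction error (my illustration,
# not the book's model). A linear model predicts the next value of a noisy sine
# wave from a short window of past values; the error updates the weights.
import numpy as np

rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0, 20, 500)) + 0.05 * rng.standard_normal(500)

window = 5            # predict the next sample from the last `window` samples
w = np.zeros(window)  # the "model of the world", as small as it gets
lr = 0.05

for t in range(window, len(signal)):
    context = signal[t - window:t]
    prediction = w @ context            # top-down guess about the next input
    error = signal[t] - prediction      # bottom-up surprise (prediction error)
    w += lr * error * context           # adjust the model to reduce future error

print("learned weights:", np.round(w, 3))
```

The brain's version is hierarchical and vastly richer, but the training signal is the same kind of thing: whatever arrives next grades the prediction that was just made.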
But what happens on the topmost levels of the model? Why am I able to see myself thinking, and see myself thinking about that, and so on? What the hell am I predicting at that moment, and why? It makes sense and is reasonably clear how and why all these external concepts are learned. But why learn representations of things internal to the model, like thoughts and emotions? Why would a human mind need to learn to represent and predict a human mind… ah. Right.
3. Social Mind
So, the required reading for this part is Robin Hanson’s “Elephant in the Brain”. Which is quite fortunate, since literally every single person on LW is at least somewhat familiar with its main ideas, so I’ll just move on without further ado.
Of course a human mind needs to predict behavior of a human mind - that’s what it’s freaking made for! The only theory convincingly explaining how humans got so damn smart says they did so competing with other humans. The reproductive success in the ancestral evolutionary environment depended strongly on the ability to successfully interact with other humans - deceive them, seduce, negotiate, convince, impress, coordinate and so on. For that purpose, wouldn’t it be pretty handy to have a model of a human mind? Just like to better hunt antelopes, it helps to have a model of an antelope in your head, so you expect it to run away from you at such and such speed, but don’t expect it to fly or shoot lightnings out of its horns.
Obviously you can’t model another mind exactly, given that you have about as much computational power as that other mind, but you can have at least some model, pretty abstract and high-level. And what are models made of, on the top level? Well, the top level is where we parsed all the noisy input of reality into neat low-dimensional manifolds. When modeling another mind you probably don’t have the computing power to care about exactly how they are parsed or represented, or what goes on at the lower levels, nor do you have a strong need to. What matters is that these manifolds are there and your fellow tribesman’s mind operates on them, and to predict him more accurately it helps to incorporate this fact in your model. So, these low-dimensional representations of real-world objects that people have in their heads - does it sound like we’d call it “thought”? And the process of operating on them “thinking”?

[You don’t have to buy into the manifold thing though, just the general idea that people have some representation of real-world objects in their heads, and that a sufficiently powerful model trying to predict people will quickly come up with a representation for those representations - i.e. a notion of thought.]
And one quality of good powerful models is that they generalize. And the one human mind that any human mind always has access to is, of course, itself. At this point it’s hard to imagine how, once having come up with a notion of “thought” and “thinking”, it can fail to notice these same “thoughts” within itself. Add a bunch of other concepts such as “intentions”, “emotions” etc, and you basically have it - all the notions necessary to describe a mind. You also need a notion of self, of course, but a model operating in a world would likely develop this notion relatively early on, and the mirror test experiments hint that even many animals seem to have it.
Now, as per “Surfing Uncertainty”, humans and other animals predict their own neural states all the time; that’s how brains work. The crucial difference in this case is that the predicted part is the same one that does the prediction. The topmost level of the brain predicts the level immediately below, as per regular predictive processing, but it can also predict its own states - because it was trained to predict the states of other, very similar human minds. When you’re looking into your mind, you see a human, but that human is the same one that looks. Which, I don’t know about you, but for me is pretty much exactly how it feels to be self-aware! I obviously can't spell out the underlying neurobiology and math in any detail, but I’m reasonably sure there’s nothing prohibiting this outright - self-referential systems and autocorrelated functions are all around, and in fact it’s pretty much inevitable that a human brain should contain something of this sort, since as a matter of trivial fact I can use my mind to think about my mind.
And since a thought about a thought is also a thought, the recursion can go as deep as you want - or rather, as deep as your working memory allows. You can think about yourself seeing yourself seeing yourself et cetera (and get bored after about 3 layers, per Yudkowsky’s Law of Ultrafinite Recursion).

[If it is not clear why I’m using “see” and “predict” interchangeably, see “Surfing Uncertainty”. In two words, if you see and recognize something it means your bottom-up sensory data is matched by the top-down predictions. Sensory data without matching predictions feels more like looking at this image - i.e. like WTF.]
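Purely as a structural analogy (a toy of my own, not the post's model and certainly not a claim about how brains implement this): the point is that the same "describe another mind" routine can be pointed at a summary of the model's own state, and since a description of a description is itself a description, the recursion only stops at an arbitrary depth limit. All names here (AgentModel, predict_mind, introspect) are invented for the sketch.

```python
# Toy analogy only: one routine for modeling minds, applied both to another
# agent's state and to a summary of this model's own state, with a bounded
# "thought about a thought" recursion.
from dataclasses import dataclass, field

@dataclass
class AgentModel:
    beliefs: dict = field(default_factory=dict)
    depth_limit: int = 3  # stand-in for limited working memory

    def predict_mind(self, observed_state: dict) -> dict:
        # The "other minds" machinery: redescribe a raw state in mental vocabulary.
        return {"thinks_about": list(observed_state.keys()),
                "mood": observed_state.get("mood", "unknown")}

    def introspect(self, depth: int = 0) -> dict:
        # Reuse the very same machinery on a summary of *this* model's own state.
        summary = {"beliefs": self.beliefs, "mood": "curious"}
        description = self.predict_mind(summary)
        if depth + 1 < self.depth_limit:
            # "A thought about a thought is also a thought": bounded recursion.
            description["sees_itself"] = self.introspect(depth + 1)
        return description

me = AgentModel(beliefs={"sky": "red"})
print(me.predict_mind({"banana": "nearby", "mood": "hungry"}))  # modeling another mind
print(me.introspect())  # the same routine pointed at the model's own state
```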
It also helps - and provides an incentive to generalize to your own mind - that you and the other humans in your tribe live in approximately the same world. So as soon as you’ve figured out that both you and they have “concepts” of real-world objects in your minds, you can reuse the specific concepts from your mind to model their minds. If you’re smart and introspective enough, you can even reuse your knowledge of the internal workings of your own mind to model them. And it’s widely known that this is exactly what people do, and it works great - up until the point where it doesn’t, anyway.
Of course, modeling your own mind feels somewhat different from modeling others. For one thing, you now have access to a much more detailed and lower-level picture than you get from just observing others externally. For another, now there’s no boundary between the predictor and the predicted; they are the same thing (more on this in the next section).
Note however that it’s not completely different: you’re still using the same vocabulary and the same mental concepts - we recognize thoughts, desires, goals, emotions, mental states etc. in both ourselves and others as fundamentally the same. You can see into yourself, roughly, about one level deeper than into someone you know very well, if that - the better understanding of yourself amounts to having answers to all the questions you’d ask about others (why did she say that? what does he actually think about this?), but the language you’re using is the same. Compare this with, e.g., the difference between how an average driver thinks about a car and how a mechanic, or the engineer who designed it, does. The latter would be thinking on entirely different levels, using entirely different models and concepts to understand what goes on with the vehicle. In this sense, human introspection looks much more like a model which was fundamentally about the outside view of a human and just happened to get a peek at the inside, rather than a model which was about the inside view to begin with.
All of this also squares neatly with Robin Hanson’s model of consciousness as a “PR office”. Once you do have this ability to see yourself as another human being, one of the most useful things that a social, status-concerned creature can do with it is to make sure their actions and words present them in the best possible light among their fellow tribesmen.
But when it comes to predictions proving false, Hanson's model of consciousness as purely a PR office (as I understand it) contradicts both predictive processing and, frankly, common sense, and I tend to side with the latter two. In the predictive processing framework, if a prediction doesn’t match bottom-up sensory data (“I’m the bravest man in the tribe” vs “I’m scared shitless at the sight of a lion”), the error is resolved either through action (“I don’t run despite the fear”) or through an update to the model (“You know, actually maybe I’m not the bravest man in the tribe!”). The model is heavily regularized to err on the side of self-flattery, but not to the point where it becomes completely detached from reality, except maybe in the most pathological cases. (More on this in section 6.)

One testable prediction of this theory is that a human child needs other humans around them to grow up fully conscious and self-aware, because otherwise they won’t have enough data to train on to develop a proper model of a human mind. The limited data we have about feral children seems to tentatively support this suggestion.
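Going back to the two resolution paths a paragraph above, here is a rough, invented sketch of them in code. The trait name, the learning rate and the “flattery” factor are all made up; the flattery factor is just my toy stand-in for the heavy self-serving regularization, not any claim about how brains actually weight errors.

```python
self_model = {"bravery": 0.9}  # top-down prediction: "I'm the bravest man in the tribe"
observed   = {"bravery": 0.2}  # bottom-up evidence: scared shitless at the sight of a lion

LEARNING_RATE = 0.5
FLATTERY_FACTOR = 0.3          # unflattering errors are absorbed only reluctantly

def resolve(trait, can_act):
    error = observed[trait] - self_model[trait]
    if can_act:
        # Resolve through action: "I don't run despite the fear" - behave as
        # predicted, which changes the evidence instead of the model.
        observed[trait] = self_model[trait]
    else:
        # Resolve through an update to the model, shrunk if it's unflattering.
        step = LEARNING_RATE * error
        if error < 0:
            step *= FLATTERY_FACTOR
        self_model[trait] += step

resolve("bravery", can_act=False)
print(self_model)  # bravery drops, but by much less than the raw error would imply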
This section is essentially the main reason I’m writing this. Everything above is mostly me restating things I’ve read somewhere else. Everything below is largely speculation which potentially makes no sense. But this idea - self-consciousness is a model trained to predict other such models and generalizing to itself - seems both extremely obvious (in retrospect) and, as mentioned before, with one small exception I can’t remember ever hearing or reading about it.
Ok, now to the speculations.
4. The Least Interesting Problem
The Least Interesting Problem Of Consciousness aka “the hard problem of consciousness” goes as follows, per David Chalmers, the author of the term:
or, per Wikipedia:
Now, when a non-trivial number of specialists question the very existence of the problem, you should expect it’s not very well defined. These and other definitions all seem to mostly shuffle the mysteriousness of consciousness between the words “subjective”, “experience” and “qualia”, and if you try to find a philosophical definition of those you’ll end up in a sepulki loop pretty quickly.
So, let’s start from the basics and ask “What is it that I know and why do I think I know it?”. Well, I know that I’m conscious, and philosophical zombies aside I also know other people are. Why do I think I know it?
Let’s consider the alternative. Not some theoretical alternative, but a very practical one. Was there ever a period in your life when you, a human being, behaved, perceived things and reacted to them, but were not conscious of any of this? Yes, for sure - it’s called early childhood! I don’t have a whole lot of experience with 1-year-olds, but from what I gather they tend to do things and react to things going on around them. And I’m pretty sure anyone reading this has been one at some point. Also, most people can’t recall any memories prior to the age of two (the rare cases when people claim they do are generally considered to be false memories), and have limited recollection prior to the age of ten. So, can you imagine being a 1-year-old? Not really, I’d guess - at most you’d imagine being an adult (or maybe a kid, but a much older kid, one you can remember being) in a toddler’s body. Because when you imagine yourself being a toddler, you imagine yourself being a toddler - in your imagination you can reflect on your mind being in a toddler’s body, but the “reflect on your mind” part isn’t something that a real toddler can do! An actual toddler-you could have been cheerful or angry, but there was nobody “in there” sitting and reflecting “aha, now we’re cheering”. And if you go from the earliest memories you do have to the later ones, you can almost track that someone “appearing”, and you “knowing” things and “realizing” what you feel more and more.
So the reason I know I’m conscious is that there is "someone" in my head who can see inside myself, see all those mental processes going on, and think about them. And the fact that this someone also happens to be me - and that the mental process of looking at your own mental processes is itself susceptible to reflection - seems integral to the definition of consciousness. To begin to wonder "why am I self-aware" you need to be aware that you're self-aware, and so on.
Assuming the model from the previous section, however, the question stops feeling mysterious and the answer (at a very high level of abstraction) becomes fairly straightforward. The predictive processing model explains how perception is mediated by prediction. The social mind theory says that the most important and most complex thing our mind will be trying to predict is other minds. In the absence of any laws prohibiting generalization to itself, it is only to be expected that the mind generalizes to itself and becomes able to recognize some top-level parts of itself (like thoughts). And the fact that a thought about a thought is also a thought, and therefore can be recognized as such, closes the reflection loop. I say to myself “Hey, I’m hungry!” but both the speaker and the listener are the same mind, which is myself. And the one thinking about this perplexing paradox and trying to figure out which of the two is “really truly” me is also myself, and so on and so on.
Because in fact there's only one model, in which the top level - the consciousness - tries to predict its own state based on its previous states, but also on input from the lower levels and the outside world. The top level is accessible to direct introspection - you know what you're thinking about - but communication with any of the levels below needs to be done in their own terms; e.g. to predict how you'd feel in a certain situation it's not enough to just describe it verbally, you need to actually imagine yourself being there for the emotional processing to kick in.
Before I get too carried away wallowing in confirmation bias and running down the happy death spiral, let’s try to address the elephant in the room. So far, we’ve been talking purely about reasoning about other minds, and how this process both helps and is helped by developing self-awareness. If this theory were true - if consciousness is indeed just a model trained to predict other similar models which generalized to itself - why would this make us so darn good at everything else?
5. Why So Smart?
Why, instead of being merely exceptionally good at deceiving our conspecifics, are humans by far the smartest species on the planet in general?
Frankly, I don’t have a good answer to this one. Here are a few guesses; I won’t be surprised if none of them is true, and more than one of them could be true.
It’s just raw brainpower
One possible answer is to suggest that consciousness is merely a side effect of a sufficiently powerful intelligence that needs to interact with other nearly-identical agents. It is useful for this interaction, but not for much else. I kind of like this answer because it bites the bullet in the true spirit of rationality: without prior knowledge, from the model described so far I wouldn’t have inferred that self-awareness makes your intelligence (defined as the ability to change the outside world according to your goals) skyrocket. Therefore, I shouldn’t invent post-factum explanations for why it actually does, and instead just accept that it doesn’t.
And you can make a reasonably strong argument that this is at least a possibility. For one, I’m sure everyone here is aware how weak single-task AI models with no chance of consciousness have been able to beat humans in quite a few areas which people 50 years ago would have confidently said require consciousness - strong evidence that consciousness is less crucial for being smart than we may have naively assumed. On the other hand, humans have always been social. If self-awareness is only a byproduct of high intelligence in social settings, we wouldn’t be able to tell the difference from it being a prerequisite for high intelligence, since for our single example of self-awareness the social setting is a given.
On the con side, if this explanation is true it means Peter Watts was right about everything, vampires are real and we’re all very doomed.
Abstract thinking
The other obvious candidate here is the suggestion that the ability to reflect on your own thought process changes how it works.
How would this help? My guess is that it has something to do with being able to think about something in the absence of any relevant external stimulus. A chimp can think about how to get a banana when it sees the banana, or is hungry and remembers where bananas are, or some such. Whereas I can lie there and ponder something as remote as Mars colonization, all day long! Maybe that’s because, when a chimpanzee’s quite advanced mind sees the banana, it tries to predict the world where it gets the banana and uses all its world-modeling power to come up with a coherent prediction of it. But that prediction is stable only for as long as there’s some stimulus related to the banana; once there’s no such stimulus the loop becomes unstable - the upper layers of the model keep trying to predict the world where the chimp gets the banana, but the lower levels report back that, well, there’s no banana around, so after a while it dies out.
Whereas with self-reflection, the model can “screen off” the lower levels’ error messages: well, duh, there’s no banana and there won’t be, because we’re merely thinking about one! Without the concept of “thinking”, the error between the prediction (“banana”) and observation (“no banana”) persists and must be resolved in some way, and in the absence of an actual banana the only way to resolve it would be to amend the top-down prediction - to drop the idea. But if you do have the concept of “thinking”, the top level can be coherently predicting just having a thought of a banana in your head, and observing exactly that. Which makes this state of mind stable, or at least much more stable, and allows one to spend their resting time thinking about ways to get the stuff they want, i.e. planning and plotting, or even just contemplating the world and finding patterns which can potentially be helpful in the future.
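Here is the same “screening off” move restated as another invented toy - the frames, strings and function names are mine, and the point is only that the same banana-flavored content is stable or unstable depending on whether it is framed as a prediction about the world or about one’s own thinking.

```python
def lower_level_report(world_has_banana, frame):
    if frame == "world":
        return "banana" if world_has_banana else "no banana"
    if frame == "thought":
        # The lower levels can confirm that a banana-thought is active,
        # whether or not a banana is actually around.
        return "thought of banana"
    raise ValueError(frame)

def is_stable(prediction, frame, world_has_banana):
    # A top-down prediction persists only if the bottom-up report matches it.
    return prediction == lower_level_report(world_has_banana, frame)

# Without the concept of "thinking": a persistent error once the banana is gone,
# so the prediction eventually has to be dropped.
print(is_stable("banana", "world", world_has_banana=False))              # False

# With it: the top level predicts "a thought of banana" and observes exactly that.
print(is_stable("thought of banana", "thought", world_has_banana=False))  # True
```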
Self-modification
Another tempting candidate which follows from self-reflection is the ability to modify your own thought process. At the end of the day, one of the main reasons we’re worried about AGI is that it will likely have this ability in a completely unbounded form.
However I’m skeptical that this is the main or even particularly important skill in what gives humans the intellectual edge over the other animals. Of course it depends on the definition of self-modification, but my impression is that this skill is far from ubiquitous even among humans, and to learn and apply it one already needs to be hella smarter than any other animal on the planet.
Language and culture
It’s not difficult to see how you can go from mind-modeling to language. Given that you and your tribesmen already have these “concepts” or “manifolds” in your heads, and you’re all generally aware of this fact, it only makes sense to come up with some handles to invoke a notion you need in the head of another. Guessing wildly here, I’d imagine that these processes more likely went on in parallel - language developed alongside the theory of mind, rather than coming strictly after it.
And of course language, and the culture it enables, are mostly responsible for why humans dominate the planet. For all our intellect, if each generation were figuring everything out anew, we wouldn’t even be particularly good hunter-gatherers.
Also, it’s possible that once you do have these handles to notions, it becomes much easier to invoke them in your own head too. Something something context switching something. “Surfing Uncertainty” talks about this in chapter 9, “Being Human”. That may be another explanation of how consciousness helped humans become so good at reasoning.
6. Implications
If we assume that the general model above is remotely correct, what would that entail? Here are my speculations on some areas.
AI
What good is any philosophy of consciousness if it doesn’t tell us anything about safe AIs, right?
One potential ramification of this model is that it seems at least plausible to build a powerful super-human oracle AI, in relatively narrow domains, without it becoming self-aware. If self-awareness is the result of trying to predict other models like yourself, and we train our [decidedly non-general] AI to do high-quality cancer research, it will have neither the stimuli nor the training data to develop consciousness. (Note that the question of whether it is possible to create a narrow but super-human AI in a field like this remains untouched.)
Second, a sufficiently powerful AI operating in a world with other similar AIs and trained to work with them and against them will very likely become conscious.
Third, an AI trained to work with and against humans may or may not become conscious through this process, depending on the architecture and computational power and likely some other factors.
Ideal Self
One of the topics constantly coming up together with self-reflection is some idealized image of self which people fail to live up to - the eternal conflict between what you believe you should be and what you are. This makes perfect sense if we remember that 1) our model is a predictive one, and it produces actions rather than just explaining them, and 2) the model is nevertheless heavily regularized to err on the side of modeling oneself as “good”, in whichever way your local tribe happens to define goodness.
So, on one hand you train your model (well, more appropriate wording may be “you-model”) of the human mind on observations of tribesmen doing their best to cast themselves in the most positive light possible. On the other hand, you have a better ability to see through your own bullshit than through theirs (yes, yes, far from perfect, but somewhat better generally). It means that your real behavior will systematically fall somewhat short of your predictions of what you, as a proper human, will do and think in each situation - but instead of just lowering the bar of what counts as a proper human, your model will keep on predicting that you’ll do better next time. Which, I’d dare to say, corresponds very closely to the observations.
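A toy illustration of that gap, with every number invented (the 0.2 “presentation bonus” in particular is just a placeholder): a you-model fit to what tribesmen present runs systematically rosier than what anyone, including you, actually does, so the predicted “proper human” you stays above the observed you.

```python
import random
random.seed(0)

def actual_behavior():
    # How "good" behavior actually is, on some arbitrary scale.
    return random.gauss(0.5, 0.1)

def presented_behavior():
    # Everyone casts themselves in a more positive light than reality.
    return actual_behavior() + 0.2

# The you-model is trained on presentations, not on ground truth.
training_data = [presented_behavior() for _ in range(1000)]
model_expectation = sum(training_data) / len(training_data)

my_actual = sum(actual_behavior() for _ in range(1000)) / 1000
print(f"you-model expects ~{model_expectation:.2f}, actual behavior averages ~{my_actual:.2f}")
# The gap doesn't close, because the model keeps predicting the presented level -
# "you'll do better next time" - rather than lowering the bar.
```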
S1 vs S2
[Disclaimer: I did not read the book and am familiar with the concept only through cultural osmosis.]
Essentially, to cast the System 1 and System 2 model described in “Thinking, Fast and Slow” in the terms of the predictive processing framework, you don’t even need the whole theory outlined in this post. You can just say that System 2 is the topmost layer, and System 1 is the second from the top. In our framework that would correspond to System 2 being the layer capable of reflecting on itself, and System 1 being the layer immediately below it - the one System 2 has direct access to, but which is not itself reflective.
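A very schematic rendering of that mapping, purely my own and with all the strings invented: two adjacent layers, where only the top one can take its own previous state as input, while the one below just maps percepts to fast reactions and never sees its own state.

```python
from typing import Optional

def system1(percept: str) -> str:
    # Fast and non-reflective: input in, reaction out.
    return f"gut reaction to {percept}"

def system2(state_below: str, own_previous_state: Optional[str] = None) -> str:
    # Slower, and reflective: it has direct access to System 1's output *and*
    # to its own previous state, which is what makes reflection possible at
    # this level and only at this level.
    if own_previous_state is None:
        return f"considering ({state_below})"
    return f"noticing that I was ({own_previous_state}) while ({state_below})"

s1 = system1("a snake-shaped stick")
first_pass = system2(s1)
second_pass = system2(s1, own_previous_state=first_pass)
print(s1)
print(first_pass)
print(second_pass)
```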
Free Will and the Meaning of Life
Well, the first one has already been solved, and the latter (why is it common for people to agonize about the meaning of life, and what is existential dread?) isn’t particularly difficult either. I’ll just point out the by-now-obvious framing of these questions given this theory.
Trying to find the purpose of every action is a wonderful heuristic for dealing with other intelligent agents, so humans have it. One well-known area where it fails is dealing with anything not driven by at least some form of intelligence. Another case where it fails, though, is when the agent can examine itself and its own goals and ask the meta-question: what's the purpose of me having these goals? There's no good way to answer this question, because it's based on the flawed assumption that everything must have a goal, which spawns an infinite recursion - what's the purpose of me having this purpose of having this purpose... Hence, agonizing about the meaning of life.
As for free will, it's nothing new: it's how the algorithm feels from the inside. We can just be slightly more specific and say this is a predictive processing algorithm. You observing yourself making a choice (and observing yourself observing yourself...) is an integral part of the choice-making process. But since self-awareness is originally about the outside view of the human mind, it feels like the choice is somewhat external to, and also controlled by, the observer. Whereas in fact it's a two-way information flow. The upward stream carries whatever data is relevant to the decision - what you see/hear/remember, your feelings about the consequences of each option, your current emotional and physiological state - all those things which, who would've thought, affect your free-will choice. And the downward stream is your expectations of what you, as a proper human being, would do in this situation - the predictions of your model trained on humans and biased toward you being a very good one. At some point one meets the other, errors are resolved (the details of this process are described in "Surfing Uncertainty", seriously, you should read it) and you make a choice, and everything adds up to normality.
Neither of these questions strictly requires anything about brain architecture to figure out (free will has been considered solved for many years now); I just want to point out how naturally the answers follow from this theory.
Mary's Room
As I mentioned before, the concept of qualia appears to be extremely ill-defined, so I avoided it throughout. But this specific question of “Whether Mary will gain new knowledge when she goes outside the room and experiences seeing in color?” has a pretty obvious answer.
The confusion stems from the two possible meanings of the word "knowledge": first, it can mean a verbal statement one can recite from memory, and second, information stored in neural connections in one's brain. You can have the latter kind of knowledge without the former. For example, you most likely can't verbally enumerate which individual muscles are contracted, in which order and to what extent, as you type the word "knowledge" - but that information is clearly stored in your brain somewhere, so you know it in the second sense.
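To make the contrast concrete, here is a small invented example (the data points and the "line" are arbitrary): a sentence you can recite, versus parameters wired into a mechanism that actually does the work. Having one does not automatically give you the other.

```python
# Sense 1: a verbal statement that can be recited from memory.
verbal_description = "the mapping is roughly a line with slope 2 and intercept 1"

# Sense 2: the same mapping stored as fitted parameters inside a mechanism.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.2]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def f(x):
    # This function "knows" the mapping in the second sense only: nothing in it
    # can recite the sentence above, yet the knowledge is fully operational.
    return slope * x + intercept

print(verbal_description)  # sense 1, with no working mechanism attached
print(f(10.0))             # sense 2 at work; Mary has sense 1 but lacks this
```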
Perception, per the predictive processing framework, works in exactly the same way as that second kind of knowledge, only in reverse. Crudely speaking, neurons in the first layers of the visual cortex know how to parse variations in color and brightness into lines and curves; the layers after that know how to construct 3D shapes out of those lines, infer the location of those shapes relative to each other and to you in 3D space, attach labels to those shapes and connect them to sounds and other sensations; and at the very top sits the consciousness, which perceives e.g. a red car speeding by.
Mary, being the super-scientist, knows all that and much more, and can recite it to you. But there's no mechanism in the human brain to propagate this knowledge from the uppermost levels down to the visual cortex neurons which will be actually inferring things from colors. Mary's neurons so far have only worked with variations in brightness. And this also means that the neurons instantiating her consciousness don't know in the second sense what it is like to receive signals about colors from the lower levels - i.e. the conscious part of Mary also has no idea what it means to perceive e.g. the color red, even though Mary does know it in the first sense and can recite it to you verbally.
In other words, when Mary tells you how the human visual cortex works, the upper levels of her brain predict their own state (and, through the regular predictive processing mechanisms, the state of all the parts of the brain down below) corresponding to Mary talking about colors, or thinking words about colors. But in order to actually visualize seeing a color, Mary's brain needs to take a verbal notion of color - which Mary does have - and propagate it down through all the levels to end up in almost the same state as if she were actually seeing that color - an operation which Mary's brain can't execute. The neural connections required for this operation are simply not there. Sure, Mary knows in the first sense - i.e. she can tell you verbally - what those connections should be, but that doesn't help to actually create them, any more than knowing what BMI 20 is helps one get thin.
Conclusion
Thank you for reading all of this!
To close off, here are some testable predictions of this theory, sorted roughly from relatively easier to test to more hypothetical.
And here's one prediction that at least seems to have failed:
Maybe that's because you have some limited capacity to model human brains, and if you direct lots of it inwards you have less available to direct outwards. Maybe it's about a precision vs speed trade-off - people who can model others and themselves in the greatest detail struggle to do it at speed and with multiple people, which is necessary in social settings. Both of these explanations sound unsatisfactory and ad-hoc-ish to me, though.
How likely do I think it is that this is all wrong and doesn't make any sense? From the outside view, quite likely. From the inside view, by now it feels like a natural and straightforward way to think about consciousness to me. So I don't know. But more generally, I do think that we're at the point, or at least very close to it, where someone much smarter and better educated can come up with a detailed, fully satisfying account of what consciousness is and how it works, one which leaves no mysteriousness hidden under the rug.
ETA: right after publishing this I stumbled upon this post reviewing a book which talks about very similar ideas in more detail and from another perspective. From the post it sounds very much compatible with what I'm saying here, and potentially answers the question in part 5. I'm definitely going to read the book to get a better understanding.