Hedonium's semantic problem

Stuart_Armstrong

If this argument is a re-tread of something already existing in the philosophical literature, please let me know.

I don't like Searle's Chinese Room Argument. Not really because it's wrong. But mainly because it takes an interesting and valid philosophical insight/intuition and then twists it in the wrong direction.

The valid insight I see is:

One cannot get a semantic process (ie one with meaning and understanding) purely from a syntactic process (one involving purely syntactic/algorithmic processes).

I'll illustrate both the insight and the problem with Searle's formulation via an example. And then look at what this means for hedonium and mind crimes.

Napoleonic exemplar

Consider the following four processes:

Napoleon, at Waterloo, thinking and directing his troops.
A robot, having taken the place of Napoleon at Waterloo, thinking in the same way and directing his troops in the same way.
A virtual Napoleon in a simulation of Waterloo, similarly thinking and directing his virtual troops.
A random Boltzmann brain springing into existence from the thermal radiation of a black hole. This Boltzmann brain is long-lasting (24 hours), and, by sheer coincidence, happens to mimic exactly the thought processes of Napoleon at Waterloo.

All four mental processes have the same syntactic properties. Searle would draw the semantic line between the first and the second process: the organic mind is somehow special. I would draw the semantic line between the third and the fourth process. The difference is that in all three of the first processes, the symbols in the brain correspond to objects in reality (or virtual reality). They can make reasonably accurate predictions about what might happen if they do something, and get feedback validating or infirming those predictions. Semantic understanding emerges from a correspondence with reality.

In contrast the fourth process is literally insane. It's mental process correspond to nothing in reality (or at least, nothing in its reality). It emerges by coincidence, its predictions are wrong or meaningless, and it will almost certainly be immediately destroyed by processes it has completely failed to model. The symbols exist only within its own head.

There are some interesting edge cases to consider here (I chose Napoleon because there are famously many people deluded into thinking they are Napoleon), but that's enough background. Essentially the symbol grounding problem is solved (maybe by evolution, maybe by deliberate design) simply by having the symbols and the mental model be close enough to reality. The symbols in the Boltzmann-Napoleon's brain could be anything, as far as we know - we just identify it with Napoleon because it's coincidentally similar. If Napoleon had never existed, we might have no clue as to what Boltzmann-Napoleon was "thinking".

Hedonium: syntax?

The idea behind hedonium is to take something corresponding to the happiest possible state, and copy that with maximal efficiency across the universe. This can involve defining hedons - the fundamental unit of happiness - and maximise them while minimising dolors (the fundamental units of pain/suffering/anti-happiness). Supposedly this would result in the cosmos being filled to the brim with the greatest possible amount of happiness and joy. This could maybe be pictured as taking the supreme moment of ecstatic, orgasmic happiness of the most joyful person ever to live, and filling the cosmos with that.

Let's start with the naivest of possible hedonium ideas, a simple algorithm with a happiness counter "My_happiness" which is either continually increasing or set at some (possibly infinite or trans-finite) maximum, while the algorithm continually repeats to itself "I have ultimate happiness!".

A very naive idea. And one that has an immediate and obvious flaw: what happens if English were to change so that "happiness" and "suffering" exchanged meanings? Then we would have transformed the maximally happy universes into a maximally painful one. All at the stroke of a linguistic pen.

The problem is that that naive hedonium ended up being a purely syntactic construction. Referring to nothing in the outside universe, its definition of "happiness" was entirely dependent on the linguistic label "happiness".

It seems that the more grounded and semantic the symbols are, the harder it is to get an isomorphism that transforms them into something else.

Hedonium: semantics

So how can we ensure that we have something that is inarguably hedonium, not just the algorithmic equivalent of drawing a happy face? I'd say there are three main ways that we can check that the symbols are grounded/the happiness is genuine:

Predictive ability
Simplest isomorphism to reality
Correct features

If the symbols are well grounded in reality, then the agent should have a decent predictive ability. Note that the bar is very low here. Someone who realises that battles are things that are fought by humans, that involve death, and that are won or lost or drawn, is already very much ahead than someone who thinks that battles are things that invite you home for tea and biscuits. So a decent prediction is "someone will die in this battle", a bad one is "this battle will wear a white frilly dress".

Of course, that prediction relies on the meaning of "die" and "white frilly dress". We can get round this problem by looking at predictive ability in general (does the agent win some bets/achieve a goal it seems to have?). Or we can look at the entire structure of the agent's symbolic setup, and the relationships between them. This is what the project CYC tried to do, by memorising databases of sentences like "Bill Clinton belongs to the collection of U.S. presidents" and "All trees are plants". The aim was to achieve and AI, which failed. However, if the database is large and complicated enough, it might be that there is only one sensible way of grounding the symbols in reality. "Sensible" can here be defined using a complexity prior.

But be warned! The sentences are very much intuition pumps. "Bill Clinton belongs to the collection of U.S. presidents" irresistibly makes us think of the real Bill Clinton. We need to able to take sentences like "Solar radiation waxes to the bowl of ambidexterous anger", and deduce after much analysis of the sentences' structures that "Solar radiation -> Bill Clinton", "waxes -> belongs", etc...

Notice there is a connection with the symbolic approach of GOFAI ("Good Old-Fashioned AI). Basically GOFAI failed because the symbols did not encode true understanding. The more hedonium resembles GOFAI, the more likely it is to devoid of actual happiness (equivalently, the more likely it is to be isomorphic to some other, non-happiness situation).

Finally, we can assess some of the symbols (the more abstract ones) by looking at their features (it helps if we have grounded many of the other symbols). For instance, we think one concept might be "nostalgia for the memory of childhood". This is something that we expect to be triggered when childhood is brought up, or when the agent sees a house that resembles its childhood home, and it is likely to result in certain topics of conversation, and maybe some predictable priming on certain tests.

Of course, it is trivially easy to setup an algorithm with a "nostalgia for the memory of childhood" node, a "childhood conversation" node, etc..., with the right relations between them. So, as in this generalised Turing test, it's more indicative if the "correct features" are things the programmers did not explicitly design in.

Hedonium: examples and counterexamples

So, what should we expect from a properly grounded hedonium algorithm? There are many reasons to expect that they will be larger than we might have intuitively thought. Reading the word "anger" or seeing a picture of an angry person both communicate "anger" to us, but a full description of "anger" is much larger and more complex than can be communicated by the word or the picture. Those suggest anger by simply reminding us of our own complex intuitive understanding of the term, rather than by grounding it.

Let's start by assuming that, for example, the hedonium experience involves someone "building on the memory of their previously happiest experience", for instance. Let's ground that particular memory. First of all, we have to ground the concept of (human) "memory". This will require a lot algorithmic infrastructure. Remember we have to structure the algorithm so that even if we label "memory" as "spatula", an outside analyst if forced to conclude that "spatula" can only mean memory. This will, at the minimum, involve the process of laying down many examples of memories, of retrieving them and making use of them.

This is something that the algorithm itself must do. If the algorithm doesn't do that each time the hedonium algorithm is run, then the whole concept of memory is simply a token in the algorithm saying "memory is defined in location X", which is trivially easy to change to something completely different. Remember, the reason the algorithm needs to ground these concepts itself is to prevent it being isomorphic to something else, something very bad. Nor can we get away with a simplistic overview of a few key memories being laid down - we'd be falling back into the GOFAI trap of expecting a few key relationships to establish the whole concept. It seems that for an algorithm to talk about memory in a way that makes sense, we require the algorithm to demonstrate a whole lot of things about the concept.

It won't be enough, either, to have a "memory submodule" that the main algorithm doesn't run. That's exactly the same as having an algorithm with token saying "memory is defined over there"; if you change the content of "over there", you change the algorithm's semantics without changing its syntax.

Then, once we have the concept of memory down, we have to establish the contents and emotions of that particular memory, both things that will require the algorithm to actively perform a lot of tasks.

Let's look at a second example. Assume now that the algorithm thinks "I expect happiness to increase" or something similar. I'll spare you the "I" for the moment, and just focus on "expect". "Expectation" is something specific, probably best defined by the "correct features" approach. It says something about future observations. It allows for the possibility of being surprised. It allows for the possibility of being updated. All these must be demonstrable features of the "expect" module, to ground it properly. So the algorithm must demonstrate a whole range of changing expectations, to be sure that "expects" is more that just a label.

Also, "expectation" is certainly not something that will be wrong every single time. It's likely not something that will be right every single time. This poses great problems for running the hedonium algorithm identically multiple times: the expectations are either always wrong or always right. The meaning of "expectation" has been lost, because it no longer has the features that it should.

There are similar problems with running the same algorithm in multiple locations (or all across the universe, in the extreme case). The first problem is that this might be seen as isomorphic to simply running the algorithm once, recording it, and running the recording everywhere else. Even if this is different, we might have the problem that an isomorphism making the hedonium into dolorum might be very large compared with the size of the hedonium algorithm - but tiny compared with the size of the multiple copies of the algorithm running everywhere.

But those are minor quibbles: the main problem is whether the sense of identity of the agent can be grounded sufficiently well, while remaining accurate if the agent is run trillions upon trillions of times. Are these genuine life experience? What if the agent learns something new during that period - this seems to stretch the meaning of "learning something new", possibly breaking it.

Other issues crop up - suppose a lot of my identity is tied up with the idea I could explore space around me? In a hedonium world, this would be impossible, as the space (physical and virtual) is taken up by other copies being run in limited virtual environments. Remember it's not enough to say "the agent could explore space"; if there is no possibility for the agent to do so "could explore" can be syntactically replaced with "couldn't explore" without affecting the algorithm, just its meaning.

These are just the first issues that come to mind; if you replace actual living and changing agents with hedoniumic copies of themselves, you have to make those copies have sufficiently rich interactions that all the important features of living and changing agents are preserved and grounded uniquely.

Beyond Hedonium

Well, where does that leave us? Instead of my initial caricature of hedonium, what if we had instead a vast amount of more complex algorithms, possibly stochastic and varying, with more choices, more interactions, more exploration, etc... all that is needed to ground them as agents with emotions? What it we took those, and then made them as happy as possible? Would I argue against that hedonium, still?

Probably not. But I'm not sure "hedonium" is the best description of that setup. It seems to be agents, that have various features, among which happens to be extremely high happiness, rather than pure happiness algorithms. And that might be a better way of conceiving of them.

Addendum: mind crimes

Nick Bostrom and others have brought up the possibility of AI "mind crimes", where the AI, simply by virtue of simulating humans in potentially bad situations, causes these humans to exist and, possibly, suffer (and then most likely die as the simulation ends).

This situation seems exactly converse to the above. For hedonium, we want a rich enough interaction to ground all the symbols and leave no ambiguity as to what is going on. To avoid mind crimes, we want the opposite. We'd be fine if the AI's prediction modules returned something like this, as text:

Stuart was suffering intensely, as he recalled agonising memories and tried to repair his mangled arms.

Then as long as we get to safely redefine the syntactic tokens "suffering", "agonising", etc..., we should be fine. Note that the AI itself must have a good grounding of "suffering" and so on, so that it knows what to avoid. But as long as the prediction module (the part that runs repeatedly) has a simple syntactic definition, there should be no mind crimes.

(comment written after only reading the introduction and the "Napoleonic exemplar" section)

In contrast the fourth process is literally insane. It's mental process correspond to nothing in reality (or at least, nothing in its reality). It emerges by coincidence, its predictions are wrong or meaningless, and it will almost certainly be immediately destroyed by processes it has completely failed to model. The symbols exist only within its own head.

I'm not so sure about this. You say that it's "literally insane" and that its thought processes happen to exactly mimic Napoleon's "by sheer coincidence". But I don't see a way for its thought processes to exactly mimic Napoleon's unless it started in the same state as Napoleon's brain and then proceeded to carry out the same calculations, while (by sheer coincidence) happening to receive the exact same sensory data throughout the 24-hour period as Napoleon did.

Yes, the Boltmann brain's reasoning doesn't actually model anything in its immediate (objective) surroundings, but it does process the data it's given correctly, and make the correct predictions from it - at least to the same extent that we presume Napoleon to have been processing his sense data correctly and making correct predictions from it.

It's true that it will soon be destroyed by processes it has completely failed to model. But anybody might be destroyed by processes they have completely failed to model - that just means they haven't been subjected to the right information sources.

I think an analogy to math might be useful here. In one sense, you could say that mathematicians are reasoning about things that are totally divorced from the world - there's no such thing as a perfect circle or an infinitely thin line in real life. Yet once you have assumed those things as axioms, you can still do completely sane and lawful reasoning on what would follow from those axioms. Similarly, the Boltzmann brain accepts the sensory data it gets as axiomatic (as do most of us), and then proceeds to carry out lawful reasoning based on that.

I'm not sure if you can argue that the Boltzmann brain is insane without also arguing that mathematicians are insane. Though, to be fair, I have noticed that the professors in the math department tend to be more colorful than the ones in other departments... :-)

Even if the Boltzmann brain is completely chaotic, internally it contains the same structures/processes as whatever we find meaningful about Napoleon's brain. It is only by external context that we can claim that those things are now meaningful.

For us, that may be a valid distinction -- how can we talk to or interact with the brain? It's essentially in it's own world.

For the Boltzmann!Napoleon, the distinction isn't remotely meaningful. It's in it's own world, and it can't talk to us, interact with us, or know we are here.

Even if the internal processes of the brain are nothing more than randomised chance, it maps to 'real', causal processes in brains in 'valid' ontological contexts.

The question is -- do those contexts/brains exists, and is there any real distinction between the minds produced by Boltmann!Napoleon, Virtual!Napoleon, etc.? I would say yes, and no. Those contexts exist, and we are really discussing one mind that corresponds to all those processes .

As to why I would say that, it's essentially Greg Egan's Dust hypothesis/Max Tegmark's Mathematical Universe thing.

I think an analogy to math might be useful here. In one sense, you could say that mathematicians are reasoning about things that are totally divorced from the world - there's no such thing as a perfect circle or an infinitely thin line in real life. Yet once you have assumed those things as axioms, you can still do completely sane and lawful reasoning on what would follow from those axioms. Similarly, the Boltzmann brain accepts the sensory data it gets as axiomatic (as do most of us), and then proceeds to carry out lawful reasoning based on that.

I can't say I like the analogy. The point of modeling an infinitely thin line is to generalize over lines of any actual thickness. The point of modeling a perfect circle is to generalize over all the slightly ellipsoid "circles" that we want to be perfectly round. We pick out mathematical constructions and axioms based on their usefulness in some piece of reasoning we want to carry out, check them for consistency, and then proceed to use them to talk about (mostly) the real world or (for fun) fake "worlds" (which occasionally turn out to be real anyway, as with non-Euclidean geometry and general relativity).

I think this post jumps the gun. We don't have a really meaningful concept of happened outside of experience or consciousness. At present I think we have very little clue about how those work, how they arise, and what they even precisely are. The later question of happiness doesn't really make sense until we have the first one.

I think this post jumps the gun.

Yes, we are not certain that what I'm saying is accurate, or that we need that much mental infrastructure. But we're not certain that we don't either. The post is an argument, not a conclusion.

Knowing how something works is unnecessary to specify it if you can just point. And we can point. The things we are pointing at can turn out to be important (on reflection). By reproducing them without missing plausibly relevant details we can establish a reliable analogy between what we care about (even if we don't understand what it is and why we care about it) and the reproductions. By getting rid of the details we risk missing something relevant, even if we can't formulate what it is more clearly than by pointing.

Now replying to actual meat of the post:

This post seems to be mostly talking about the questions of "what is intelligence" and "what is meaning", while implying that answering that question would also help figure out the answer to "what's the minimum requirement for the subjective experience of happiness".

But it doesn't seem at all obvious to me that these are the same question!

Research on the requirements for subjective experience doesn't, as far as I know, say anything about whether something is intelligent or having meaning. E.g. Thomas Metzinger has argued that a neural representation becomes a phenomenally conscious representation if it's globally available for the system (for deliberately guided attention, cognitive reference, and control of action), activated within a window of presence (subjectively perceived as being experienced now), bound into a global situational context (experienced as being part of a world), etc. Some researchers have focused on specific parts of the criteria, like the global availability.

Now granted, if your thesis is that a hedonium or mind crime algorithm seems to require some minimum amount of complexity which might be greater than some naive expectations, then the work I've mentioned would also support that. But that doesn't seem to me like it would prevent hedonium scenarios - it would just put some upper bound on how dense with pleasure we can make the universe. And I don't know of any obvious reasons for why the required level of complexity for experiencing subjective pleasure would necessarily be even at the human level: probably an animal-level intelligence could be just as happy.

Later on in the post you say:

But those are minor quibbles: the main problem is whether the sense of identity of the agent can be grounded sufficiently well, while remaining accurate if the agent is run trillions upon trillions of times. Are these genuine life experience? What if the agent learns something new during that period - this seems to stretch the meaning of "learning something new", possibly breaking it.

Other issues crop up - suppose a lot of my identity is tied up with the idea I could explore space around me? In a hedonium world, this would be impossible, as the space (physical and virtual) is taken up by other copies being run in limited virtual environments. Remember it's not enough to say "the agent could explore space"; if there is no possibility for the agent to do so "could explore" can be syntactically replaced with "couldn't explore" without affecting the algorithm, just its meaning.

But now you seem to be talking about something else than in the beginning of the post. At first you only mentioned the hedonium scenario as one where we took a single maximally happy state and copied it across the universe to obtain the maximum density of happiness; now you seem to be talking about something like "would it be possible to take all currently living humans and make them maximally happy while preserving their identity". This is a very different scenario from just the plain hedonium scenario.

probably an animal-level intelligence could be just as happy.

In that case it's not a human-comparable intelligent agent experiencing happiness. So I'd argue that either a) hedonium needs to be more complex that expected, or b) the definition of happiness does not require high level agents experiencing it.

And I'm arguing that the minimum complexity should be higher than the human level, as you need not only a mind, but an interaction with the environment of sufficient complexity to ground it as a mind.

At first you only mentioned the hedonium scenario as one where we took a single maximally happy state and copied it across the universe to obtain the maximum density of happiness; now you seem to be talking about something like "would it be possible to take all currently living humans and make them maximally happy while preserving their identity". This is a very different scenario from just the plain hedonium scenario.

That's the point. I don't think that the first setup would count as a happy state, if copied in the way described.

In contrast the fourth process is literally insane. [Its mental processes] correspond to nothing in reality (or at least, nothing in its reality).

Rather, it's not thinking about your reality. It does think about its own reality, for which you've even proposed a model (the actual battle). If you argue with it, using a Boltzmann brain that spontaneously appeared at Waterloo as your avatar, it'll say that it's you, Boltzmann-Stuart, who is insane, who's saying that it, the actual Napoleon, sprang into existence from the thermal radiation of a black hole.

Its reality is not morally relevant to you, so its thought about its reality is misguided for the goal of thinking about things that are morally relevant to you. But your reality is morally irrelevant to it, so thinking about your reality is similarly misguided from its point of view. The symmetry arises from different preferences (that are not trivially relocatable for the same reasons as hedonium isn't), with moral relevance of various worlds an aspect of preference, different near-copies of the same agent valuing different worlds to different extents, because these copies are related to these worlds in different ways.

I get the idea in principle, but not sure I agree with the symmetry in practice.

If we take me and Boltzmann Stuarts, we would have the same values. And, assuming we could somehow communicate, Boltzmann Stuart would defer to current me's decisions (because our preferences would be over the current me's world).

I'm guessing you are talking about a different Boltzmann-Stuart than the one from my comment (the one I was talking about has the same info as you, or maybe it is just you, it only acts as an avatar at Waterloo, so there is nothing for you two to argue about).

I think you mean a Stuart Armstrong from another world who you observe as a Boltzmann brain, closer to Boltzmann-Napoleon from the post, except that it's Stuart and not Napoleon (correct me if I'm wrong). In that case I don't see why you'd expect it to defer to your decisions more than you'd expect yourself to defer to its decisions, since at least the observations will be analogous. You see a Boltzmann brain that thinks it's a Stuart from another world (its world), and similarly Boltzmann-Stuart sees a Boltzmann brain that thinks (as you do) that it's a Stuart from another world (your world). There seems to be no reason to expect one of them to be more persuasive than the other. I'd guess they could cooperate on an equal footing instead.

because our preferences would be over the current me's world

This is probably the root of the intended disagreement. As with hedonium (simpler versions with major flaws) in your post, preferences may refer to environment, point to it without specifying what it is. As a result, placing the same person in a different world changes their preferences, because the person doesn't know what they prefer, they need to observe the world to learn which world they care about. So if Boltzmann-Stuart remembers always living in a different world, that's the world they care about.

There are multiple reasons for not caring about a world that are similar in practice. One is the world having no morally relevant features in it. Another is not being able to affect it, which is usually the case when it's counterfactual (when you've observed events absent in that world, evidence that you are now acting in a different world and so probably don't affect the counterfactual world). Not discarding hard-to-affect worlds as morally irrelevant is one thing that gives UDT an advantage, as it can look for unusual means of control such as prediction of agent's decisions made in a world that doesn't contain the agent in the usual form. Yet another reason to disregard a world is not being able to predict (or compare) consequences of decisions, so that there is no point in caring about which decisions are enacted in that world.

Above these reasons seems to be a global measure over the worlds, which should be part of preference, but it's not clear if the world where an agent originated shouldn't leave large bias on which worlds are preferred in this sense. For this thought experiment, this seems to be the only relevant distinction, assuming Boltzmann avatars can coordinate decisions across worlds, so that the factors of unpredictability and inability to control don't apply (and we know that the worlds have similar features of moral relevance, the same human civilization).

So one way to break the symmetry is to formulate your preference while in the original world, then wipe your memories and replace them with those of another world, while somehow keeping the preference (which threatens to become epiphenomenal in this setup, so perhaps that doesn't make sense). Otherwise, if you are only human, your preference is not yet formulated sufficiently to become relocatable between worlds without changes in measure of how much various worlds are valued, and so having memories of always living in another world also changes the preference that you'd formulate on reflection.

One is the world having no morally relevant features in it. Another is not being able to affect it

Both are relevant for why Boltzmann stuart would defer to real world stuart.

Why? As I understand it, Boltzmann-Stuart affects its own world, in the same sense as Harry Potter affects Hogwarts but doesn't affect Mordor, it's control in a hypothetical situation. We don't require that any part of the setup is real, instead we see whether the agent controls the environment inside the hypothetical. And its own world has the same kind of valuable people as your world, in the same way as Albus Dumbledore is a morally relevant feature in his hypothetical world, while a non-magical chair in Hogwarts isn't. The problem with Boltzmann-Stuart's world is not that it's not affected by Boltzmann-Stuart, or that it doesn't have anything of relative value in it (compared to the value of the world itself).

The problem, from your perspective, is that the whole world is in some sense "not real". For the purposes of decision making, being "not real" seems to be the same thing as not being morally relevant, hence the enumeration of possible reasons for not being considered relevant in my comment. The reason that seems to apply in this case seems to be the measure of caring over possible worlds, which I guess depends on which world it's originally formulated from.

Another hypothesis is that you don't identify Botzmann-Stuart and the hypothetical Stuart that lives in the hypothetical world that Boltzmann-Stuart merely believes he's inhabiting. In that case my argument can be seen as being about the hypothetical Stuart, rather than about Boltzmann-Stuart, with the additional assumption that the hypothetical Stuart should be identified with Boltzmann-Stuart (which I didn't argue for). I identify them is a way similar to how it's done in discussions of UDT and counterfactual trade. For example, we may consider hypothetical Stuart a predictive device that manages to enact the decisions of Boltzmann-Stuart, so that any decisions made by Boltzmann-Stuart affect the hypothetical Stuart's world.

In the example I was thinking, real-world Stuart exists in a world that contains other agents and endures, while Botlzmann Stuart exists in a world that only contains his imagination of other agents, and ends swiftly.

I would draw the semantic line between the third and the fourth process.

Scott Aaronson draws the same line, but for a different reason: Boltzmann brains are reversible processes, therefore the usual ideas do not apply. He argues that it is plausible that irreversibility is a requirement for consciousness.

I can imagine how decoherence explains why we only experience descent along a single path through the multiverse-tree instead of experiencing superposition, but I don't think that's sufficient to claim that all consciousness requires decoherence.

An interesting implication of Scott's idea is that consciousness is timeless, despite our experience of time passing. For example, put a clock and a conscious being inside Schrödinger’s box and then either leave it in a superposition forever or open it at some point in the future. If we don't open the box, in theory nothing is conscious of watching the clock as time passes. If we open the box, there's a conscious being who can describe all the time inside the box watching the clock. When, to the outside observer, does that conscious experience happen? Either all the conscious experience happens the instant the box is measured, contrary to our notions of the experience of time passing and our understanding of how the physical state of the clock changes (e.g. the conscious experience of seeing the clock read 3:52 PM on Thursday should have happened at 3:52 PM on Thursday when the clock inside the box was in a superposition of physically displaying that time with very high probability), or else there would have been conscious experience the entire time even if the box was never opened, in order that the experience could happen at the corresponding time.

Which means we're all p-zombies until a specific point in the future when we decohere sufficiently to have consciousness up to that point.

Note that decoherence is not an absolute. It's the degree of interaction/entanglement of one system with another, usually much larger system.

When, to the outside observer, does that conscious experience happen?

Until you interact with a system, you don't know anything about it.

Which means we're all p-zombies until a specific point in the future when we decohere sufficiently to have consciousness up to that point.

Not necessarily, "simply" emitting photons in an irrecoverable way would be sufficient for internal conscious experience. Of course the term "emitting photons" is only defined with respect to a system that can interact with these photons.

Maybe it's a gradual process where the degree of consciousness rises with the odds of emission, even if there is no one to measure it. Or something.

Very interesting post, but your conclusion seems too strong. Presumably, if instead of messing around with artificial experiencers, we just fill the universe with humans being wireheaded, we should be able to get large quantities of real pleasure with fairly little actually worthwhile experiences; we might even be able to get away with just disembodied human brains. Given this, it seems highly implausible that if we try to transfer this process to a computer, we are forced to create agent so rich and sophisticated that their lives are actually worth living.

we just fill the universe with humans being wireheaded, we should be able to get large quantities of real pleasure with fairly little actually worthwhile experiences

By this argument, we might not. If the wireheaded human beings never have experiences and never access their memories, in what way do they remain human beings? ie if we could lobotomise them without changing anything, are they not already lobotomised?

if we could lobotomise them without changing anything, are they not already lobotomised?

Very unseriously: Of course not, because if they were already lobotomized we wouldn't be able to lobotomize them. :P

Sorry, I have to go eat dinner, but speaking of dinner, the term we're looking for here is recipe (or really: naturalized generative model). A symbol is grounded by some sort of causal model; a causal model consists in both a classifier for perceptual features and a recipe for generating an/the object referred to by the causal model. For FAI purposes, we could say that when the agent possesses a non-naturalized and/or uncertain understanding of some symbol (ie: "happiness"), it should exercise a strong degree of normative uncertainty in how it acts towards real-world objects relating to that symbol.

But this is really just a quick note before dinner, sorry.

Interesting. ....what is the generative recipe needed for?

Primarily for predicting how the "object" (ie: component of the universe) in question is going to act. Classifying (in the machine learning sense) what you see as a cat doesn't tell you whether it will swim or slink (that requires causal modeling). Also, causal knowledge confirmed by time-sequence observation seems to actually make classification a much easier problem: the causal structure of the world, once identifable, is much sparser than the feature-structure of the world. Every cause "radiates" information about many, many effects, so modeling the cause (once you can: causal inference is near the frontier of current statistics) is a much more efficient way to compress the data on effects and thus generalize successfully.

Interesting, thanks.

Not really because it's wrong.

The linked article doesn't reality demonstrate that. In particular, if you are going to appeal to robot bodies as giving a level of causal connection sufficient to ground symbols, then Searle still has a point about the limitations of abstract, unembodied, software.

difference is that in all three of the first processes, the symbols in the brain correspond to objects in reality (or virtual reality).

How can a symbol correspond to what does not actually exist?

Your criterion for correspondence seems to just be the ability to make accurate productions. But virtual reality is just the sort of scenario that shows the lack of neceesary connection between predictive accuracy and ontological accuracy.

Essentially the symbol grounding problem is solved (maybe by evolution, maybe by deliberate design) simply by having the symbols and the mental model be close enough to reality.

What does closeness mean here? Does a conscious entity in a VR have it or not?

If the symbols are well grounded in reality, then the agent should have a decent predictive ability

But predictive ability doesn't imply correct grounding. Consider spectrum inversion,

In particular, if you are going to appeal to robot bodies as giving a level of causal connection sufficient to ground symbols, then Searle still has a point about the limitations of abstract, unembodied, software.

Except that isn't Searle's stated point. He really flubs the "problem of other minds" objection as badly as I parodied it.

How can a symbol correspond to what does not actually exist?

If a human plays starcraft 2 and has a symbol for Protoss Carrier, does that mean the human's symbol is suddenly ungrounded?

Except that isn't Searle's stated point.

I agree, but it is possible to rescue valid intuitions from M,B&P.

If a human plays starcraft 2 and has a symbol for Protoss Carrier, does that mean the human's symbol is suddenly ungrounded?

If fictions can ground symbols, then what is wrong with having santa , the tooth fairy, and unicorns in your ontology?

but it is possible to rescue valid intuitions from M,B&P.

Indeed, that was my argument (and why I'm annoyed that Searle misdirected a correct intuition).

If fictions can ground symbols, then what is wrong with having santa , the tooth fairy, and unicorns in your ontology?

You should have them - as stories people talk about, at the very least. Enough to be able to say "no, Santa's colour is red, not orange", for instance.

Genuine human beings are also fiction from the point of view of quantum mechanics; they exist more strongly as models (that's what allows you to say that people stay the same even when they eat and excrete food). Or even as algorithms, which are also fictions from the point of view of physical reality.

PS: I don't know why you keep on getting downvoted.

Genuine human beings are also fiction from the point of view of quantum mechanics

Are you saying that you and Harry Potter are equally fictional? Or rainbows and kobolds? If not, what are you saying, when you say that they are all fictional? What observations were made, whereby quantum mechanics discovered this fictional nature, and what counterfactual observations would have implied the opposite?

Are you saying that you and Harry Potter are equally fictional?

Certainly not. I'm saying it's useful for people to have symbols labelled "Stuart Armstrong" and "Harry Potter" (with very different properties - fictional being one), without needing either symbol defined in terms of quantum mechanics.

Something defined in terms of quantum mechanics can still fail to correspond....you're still on the mail and therefore talking about issues orthogonal to grounding.

You should have them - as stories people talk about, at the very least. Enough to be able to say "no, Santa's colour is red, not orange", for instance..

Ontology isn't a vague synonym for vocabulary. An ontological catalogue is the stuff whose existence you are seriously committed to ... .so if you have tags against certain symbols in your vocabulary saying "fictional", those definitely aren't the items you want to copy across to your ontological catalogue .

Enough to be able to say "no, Santa's colour is red, not orange", for instance.

Fictional narratives allow one to answer that kind of question by relating one symbol to another ... but the whole point of symbol grounding is to get out of such closed, mutually referential systems.

This gets back to the themes of the chinese room. The worry is that if you naively dump a dictionary or encyclopedia into an AI, it won't have real semantics, because of lack of grounding, even though it can correctly answer questions, in the way you and I can about Santa.

But if you want grounding to solve that problem, you need a robust enough version of grounding ... it wont do to water down the notion of grounding to include fictions.

Genuine human beings are also fiction from the point of view of quantum mechanics; they exist more strongly as models (that's what allows you to say that people stay the same even when they eat and excrete food). Or even as algorithms, which are also fictions from the point of view of physical reality.

Fiction isnt a synonym for lossy high-level abstraction, either. Going down that route means that "horse" and "unicorn" are both fictions. Almost all of our terms are high level abstractions.

What you've written here tells people what you think fiction is not. Could you define fiction positively instead of negatively?

For the purposes of the current discussion , iit s a symbol which is not intended to correspond to reality.

Really? In that case, Santa is not fiction, because the term "Santa" refers to a cultural and social concept in the public consciousness--which, as I'm sure you'll agree, is part of reality.

I don't have to concede that the intentional content of culture is part of reality, even if I have to concede that its implementations and media are. Ink and paper are real, but as as soon as you stop treating books as marks on paper, and start reifying the content, the narrative, you cross from the territory to the map.

Sure, but my point still stands: as long as "Santa" refers to something in reality, it isn't fiction; it doesn't have to mean a jolly old man who goes around giving people presents.

My point would be that a terms referent has to be picked out by its sense. No existing entity is fat AND jolly AND lives at the north pole AND delivers presents., so no existing referent fulfils the sense.

No existing entity is fat AND jolly AND lives at the north pole AND delivers presents., so no existing referent fulfils the sense.

This simply means that "an entity that is fat AND jolly AND lives at the North Pole AND delivers presents" shouldn't be chosen as a referent for "Santa". However, there is a particular neural pattern (most likely a set of similar neural patterns, actually) that corresponds to a mental image of "an entity that is fat AND jolly AND lives at the North Pole AND delivers presents"; moreover, this neural pattern (or set of neural patterns) exists across a large fraction of the human population. I'm perfectly fine with letting the word "Santa" refer to this pattern (or set of patterns). Is there a problem with that?

My $0.02...

OK, so let's consider the set of neural patterns (and corresponding artificial signals/symbols) you refer to here... the patterns that the label "Santa" can be used to refer to. For convenience, I'm going to label that set of neural patterns N.

I mean here to distinguish N from the set of flesh-and-blood-living-at-the-North-Pole patterns that the label "Santa" can refer to. For convenience, I'm going to label that set of patterns S.

So, I agree that N exists, and I assume you agree that S does not exist.

You further say:

"I'm perfectly fine with letting the word "Santa" refer to this pattern (or set of patterns)."

...in other words, you're fine with letting "Santa" refer to N, and not to S. Yes?

Is there a problem with that?

Well, yes, in that I don't think it's possible.

I mean, I think it's possible to force "Santa" to refer to N, and not to S, and you're making a reasonable effort at doing so here. And once you've done that, you can say "Santa exists" and communicate exists(N) but not communicate exists(S).

But I also think that without that effort being made what "Santa exists" will communicate is exists(S).

And I also think that one of the most reliable natural ways of expressing exists(N) but not communicate exists(S) is by saying "Santa doesn't exist."

Put another way: it's as though you said to me that you're perfectly fine with letting the word "fish" refer to cows. There's no problem with that, particularly; if "fish" ends up referring to cows when allowed to, I'm OK with that. But my sense of English is that, in fact, "fish" does not end up referring to cows when allowed to, and when you say "letting" you really mean forcing.

That seems fair. What I was mostly trying to get at was a way to describe Santa without admitting his existence; for instance, I could say, "Santa wears a green coat!" and you'd be able to say, "That's wrong!" without either of us ever claiming that Santa actually exists. In other words, we would be communicating information about N, but not S.

More generally speaking, this problem usually arises whenever a word has more than one meaning, and information about which meaning is being used when is conveyed through context. As usual, discussion of the meaning of words leaves out a lot of details about how humans actually communicate (for instance, an absolutely enormous amount of communication occurs through nonverbal channels). Overloaded words occur all the time in human communication, and Santa just happens to be one of these overloaded words; it occasionally refers to S, occasionally to N. Most of the time, you can tell which meaning is being used, but in a discussion of language, I agree I was being imprecise. The concept of overloading a word just didn't occur to me at the time I was typing my original comment, for whatever reason.

(nods) Yes, agreed with all of this.

And it is admittedly kind of funny that I can say "Superman is from Krypton, not from Vulcan!" and be understood as talking about a fictional character in a body of myth, but if I say "Superman really exists" nobody understands me the same way (though in the Superman mythos, Superman both really exists and is from Krypton). A parsing model that got that quirk right without special-case handling would really be on to something.

The Sense/Reference distinction handles this all out of the box, without the assumption that only certain words have double meanings.

Eg the, the correct sense of Superman is being from Krypton. But Superman has no referent...is fictional , does not exist.

The Sense/Reference distinction handles this all out of the box, without the assumption that only certain words have double meanings.

It also forces you to reject objects in virtual reality as "real".

A way to communicate Exists(N) and not Exists(S) in a way that doesn't depend on the context of the current conversation might be ""Santa" exists but Santa does not." Of course, the existence of "Santa" is granted when "Santa does not exist" is understood by the other person, so this is really just a slightly less ambiguous way of saying "Santa does not exist"

But I also think that without that effort being made what "Santa exists" will communicate is exists(S).

Yes, The not-exists(S) is explicit, in "there is no Santa ", the exists(N) is implicit in the fact that listener and speaker understood each other.

This simply means that "an entity that is fat AND jolly AND lives at the North Pole AND delivers presents" shouldn't be chosen as a referent for "Santa".

That is the exact opposite if what I was saying. An entity that is fat and jolly, etc, should, normatively be chosen as the referent of "Santa", and in the absence of any such, Santa has no referent. AFAICT you are tacitly assuming that every term must have a referent, however unrelated to its sense. I am not. Under the Fregean scheme, I can cash out fictional terms as terms with no referents.

However, there is a particular neural pattern (most likely a set of similar neural patterns, actually) that corresponds to a mental image of "an entity that is fat AND jolly AND lives at the North Pole AND delivers presents";

I'm not disputing that. What I am saying is that such neural patterns are the referent of "neural other representing fat jolly man....", not referents of "Santa".

moreover, this neural pattern (or set of neural patterns) exists across a large fraction of the human population. I'm perfectly fine with letting the word "Santa" refer to this pattern (or set of patterns). Is there a problem with that?

Several.

Breaks the rule that referents are picked out by senses.
Entails map/territory confusions.
Blurs fiction/fact boundary.
Inconsistent...sometimes "X" has referent X, sometimes it has referent "representation of X"

Look, I think you've maybe forgotten that this conversation started when you took issue with this part of the article:

difference is that in all three of the first processes, the symbols in the brain correspond to objects in reality (or virtual reality).

To which Stuart replied:

If a human plays starcraft 2 and has a symbol for Protoss Carrier, does that mean the human's symbol is suddenly ungrounded?

And then you said:

If fictions can ground symbols, then what is wrong with having santa , the tooth fairy, and unicorns in your ontology?

And from here the conversation branched off. Several comments in, and you have now managed to divert this conversation into a discussion on philosophy of language, all the while entirely ignoring the fact that your stated concerns are irrelevant to your original contention. Let's take a look at each of your complaints:

An entity that is fat and jolly, etc, should, normatively be chosen as the referent of "Santa", and in the absence of any such, Santa has no referent.

You have now utterly divorced this conversation from the issue which first prompted it. The confusion here stems from the fact that the traditional tale of "Santa" tells of a physical man who physically exists at the physical North Pole. None of that applies to virtual reality, which was the part of the article you originally took umbrage at. Nor is it the case for Stuart's example of the Protoss Carrier in Starcraft 2. In these examples, objects in virtual reality/the computer model of the Protoss Carrier should "normatively be chosen as the referents" (as you phrased it) of the phrases "objects in virtual reality"/"the Protoss Carrier".

What I am saying is that such neural patterns are the referent of "neural other representing fat jolly man....", not referents of "Santa".

What is the referent of "Protoss Carrier", if not "computer-generated video game model of the Protoss Carrier"?

Breaks the rule that referents are picked out by senses.

Again, irrelevant to the original example.

Entails map/territory confusions.

Still irrelevant.

Blurs fiction/fact boundary.

Still irrelevant.

Inconsistent...sometimes "X" has referent X, sometimes it has referent "representation of X"

Still irrelevant, and you can easily tell from context besides.

Look, you've performed what is known as a conversational "bait-and-switch", wherein you present one idea for discussion, and when another engages you on that idea, you back out and start talking about something that seems maybe-a-little-bit-possibly-slightly-tangentially-related-if-you-don't-squint-at-it-too-hard. Stick to the topic at hand, please.

EDIT: And in fact, this entire confusion stems from your original use of the word "fiction". You've been implicitly been using the word with two meanings in mind, in an analogous fashion to how we've been using "Santa" that refer to different things:

Something which does not exist in physical reality except by being instantiated in some virtual representation or map, and makes no claim to exist in physical reality. This is the definition you used when first addressing Stuart.
Something which does not exist in physical reality except by being instantiated in some virtual representation or map, and is also claimed to exist in physical reality. This is the definition you began using when you first brought up Santa and unicorns, and it's the definition you've been using ever since.

In retrospect, I should have seen that and called you out on it immediately, but I didn't look too closely despite there being a nagging feeling that something strange was going on when I first read your comment. Let's keep words from being overloaded, neh? That's what happened with Santa, after all.

you have now managed to divert this conversation into a discussion on philosophy of language,

The discussion is about the Chinese Rooms, the .CR is about semantics, and philosophy of language is relevant to semantics, so I dont see the diversion.

None of that applies to virtual reality, which was the part of the article you originally took umbrage at.

I don't see why not. VR isnt a place where things exist, it rather series on representation and intentionally ad much as a picture or novel.

What I didn't say before was that VR is particularly lproblematicalin the case of the CR, because a CR that is grounding its symbols in VR could reconfigure itself with the VR run internally...you then have the paradox of a CR with grounded symbols supposedly, but no contact with the outside world.

What is the referent of "Protoss Carrier", if not "computer-generated video game model of the Protoss Carrier"?

As I stated before, I don't think there has to be one.

That gives me a way of cashing out the fact/fiction distinction. I don't know what your alternative is, because you haven't given one,

As before, you seem to be arguing from a tacit assumption that all terms must have referents.

Irrelevant....irrelevant.

I was arguing against the all-must-have-referents theory, and inasmuch as you are still using it, it is still relevant.

you can easily tell from context besides.

I was tacitly using the assumption, common in analytical philosophy, that theories .should be built on cleaned up versions of natural language...hence, "normatively". How do you build theories, dxu?

Stick to the topic at hand, please.

So what do you think of the CR, dxu?

Something which does not exist in physical reality except by being instantiated in some virtual representation or map, and is also claimed to exist in physical reality.

I didn't say that. Things dont, properly speaking, exist in virtual reality, they are represented in it. Which adds nothing to neural representations.

That being the case, I .am saying a fictional term...

....has no referent...

2,....and is not intended to.

But these are not different theories. 2 is just a ramifications of 1, a second iteration.

The discussion is about the Chinese Room

...What?

I was arguing against the all-must-have-referents theory, and inasmuch as you are still using it, it is still relevant.

And how is the "all-must-have-referents theory" relevant to Stuart Armstrong's original example which you first took issue with?

How do you build theories, dxu?

I build theories by observing how real humans communicate.

I didn't say that. Things dont, properly speaking, exist in virtual reality, they are represented in it.

Right, and terms for objects in virtual reality can refer to those representations. That's the same thing I was getting at with my distinction involving Santa and neural patterns, except in this case, there's no claim that the VR object exists in the physical world, and thus nothing to get confused about. Hence, your objection to Santa does not apply here.

The discussion is about the Chinese Room...

What?

From my first reply:

The linked article doesn't reality demonstrate that. In particular, if you are going to appeal to robot bodies as giving a level of causal connection sufficient to ground symbols, then Searle still has a point about the limitations of abstract, unembodied, software.

And how is the "all-must-have-referents theory" relevant to Stuart Armstrong's original example which you first took issue with?

I'm bringing it up because you are. Its like you're saying it's OK for you to appeal to unjustified premises, but if I bring it up, I'm at fault for changing the subject.

build theories by observing how real humans communicate.

If that means taking their statements at face value, without allowing for metaphor .or misleading phraseology....then I have to tell you that there is a thing called a cold that someone can catch, and a thing called a temper someone can lose.

Right, and terms for objects in virtual reality can refer to those representations

Is that a fact? In particular is it is it a fact that people are referring in my technical sense of "refering" , and not in some loose and popular sense , eg talking about,

Isn't that just the contention of "Yes, Virginia..."?

I'm not quite sure on what you mean by that. I looked up the phrase, and it returned an 1897 article in The New York Sun, but besides the obvious fact that both my comment and the article deal with the existence (or non-existence) of Santa Claus, I'm not seeing a huge connection here. Could you possibly expand?

I'm not entirely sure that we're still disagreeing. I'm not claiming that fiction is the same as non-fictional entities. I'm saying that something functioning in the human world has to have a category called "fiction", and to correctly see the contours of that category.

This gets back to the themes of the chinese room. The worry is that if you naively dump a dictionary or encyclopedia into an AI, it won't have real semantics, because of lack of grounding, even though it can correctly answer questions, in the way you and I can about Santa.

Yes, just like the point I made on the weakness of the Turing test. The problem is that it uses verbal skills as a test, which means it's only testing verbal skills.

However, if the chinese room walked around in the world, interacted with objects, and basically demonstrated human-level (or higher) lever of prediction, manipulation, and such, AND it operated by manipulating symbols and models, then I'd conclude that those actions demonstrate the symbols and models were grounded. Would you disagree?

I'd say they could .bd taken to be as grounded as ours. There is still a problem with referential semantics, that neither we nor the AI can tell it isnt in VR.

Which itself feeds through into problem with empiricism and physicalism.

Since semantics is inherently tricky, there aren't easy answers to the CR.

If you're in VR and can never leave it or see evidence of if (eg a perfect Descartes's demon), I see no reason to see this as different from being in reality. The symbols are still grounded in the baseline reality as far as you could ever tell. Any being you could encounter could check that your symbols are as grounded as you can make them.

Note that this is not the case for a "encyclopaedia Chinese Room". We could give it legs and make it walk around; and then when it fails and falls over every time while talking about how easy it is to walk, we'd realise its symbols are not grounded in our reality (which may be VR, but that's not relevant).

The Sense/Reference distinction handles this all out of the box, without the assumption that only certain words have double meanings.

Eg the, the correct sense of Superman is being from Krypton. But Superman has no referent...is fictional , does not exist.

1TheOtherDave11y

(nods) Yes, agreed with all of this. And it is admittedly kind of funny that I can say "Superman is from Krypton, not from Vulcan!" and be understood as talking about a fictional character in a body of myth, but if I say "Superman really exists" nobody understands me the same way (though in the Superman mythos, Superman both really exists and is from Krypton). A parsing model that got that quirk right without special-case handling would really be on to something.

LESSWRONG
LW

LESSWRONG
LW

22

Hedonium's semantic problem

22

Napoleonic exemplar

Hedonium: syntax?

Hedonium: semantics

Hedonium: examples and counterexamples

Beyond Hedonium

Addendum: mind crimes

22