> [4] AI safety relevant side note: The idea that translations of meaning need only be sufficiently reliable in order to be reliably useful might provide an interesting avenue for AI safety research. [...]
I also see this as an interesting (and pragmatic) research direction. However, I think its usefulness hinges on the ability to robustly quantify the required alignment reliability / precision for the various levels of optimization power involved. Only then may it be possible to engineer demonstrably safe scenarios, with alignment quality tracking the growth in optimization power (made easier by slower growth in optimization power).
I think this is a very hard task: because of unknown unknowns around strong optimizers, practically (we still need increasingly good alignment), and also fundamentally (if we could define this rigorously, perfect alignment would be the natural limit).
On the other hand, a cascade of practically sufficient alignment mechanisms is one of my favorite ways to interpret Paul's IDA (Iterated Distillation and Amplification): alignment being preserved sufficiently reliably at every distillation step.
Re language as an example: parties involved in communication using language have comparable intelligence (and even there I would say someone just a bit smarter can cheat their way around you using language).
> Re language as an example: parties involved in communication using language have comparable intelligence (and even there I would say someone just a bit smarter can cheat their way around you using language).
Mhh, yeah, I agree these are examples of ways in which language "fails". But I think they don't bother me too much?
I put them in the same category as "two agents acting in good faith sometimes miscommunicate - and still, language overall is pragmatically reliable", or "works well enough". In other words, even though there is potential for exploitation, that potential is in fact meaningfully constrained. More importantly, I would argue that the constraint comes (in large part) from the way language has been (co-)constructed.
> a cascade of practically sufficient alignment mechanisms is one of my favorite ways to interpret Paul's IDA (Iterated Distillation and Amplification)
Yeah, great point!
> However, I think its usefulness hinges on the ability to robustly quantify the required alignment reliability / precision for the various levels of optimization power involved.
I agree and think this is a good point! I think on top of quantifying the required alignment reliability "at various levels of optimization" it would also be relevant to take the underlying territory/domain into account. We can say that a territory/domain has a specific epistemic and normative structure (which e.g. defines the error margin that is acceptable, or tracks the co-evolutionary dynamics).
Congratulations on your first LessWrong post! :) (Well, almost first)
As a piece of feedback, I will note that I found the "Rosenberg's crux" section pretty hard to read, because it was quite dense.
I feel like if I had read the original letter exchange, I could then have turned to this post and gone "a-ha!" In other words, it felt like a useful summary, but it didn't give me the original generators/models, such that I could pass the intellectual Turing test of what Dennett and Rosenberg actually believe.
By comparison, I think the section on the "cryptographer's constraint" was clearer, since it was more focused on elaborating a particular principle and why it was important, along with considering some concrete examples in more depth.
Thanks :)
> I will note that I found the "Rosenberg's crux" section pretty hard to read, because it was quite dense.
Yeah, you're right - thanks for the concrete feedback!
I wasn't originally planning to make this a public post and later failed to take a step back and properly model what it would be like for a reader without the context of having read the letter exchange.
I'm considering adding a short intro paragraph to partially remedy this.
One simple answer to the question "what is purpose?" is "the reference input of a control system".
On this view, a thermostat acts purposively to maintain a room, a fridge, a shower, etc. at a constant temperature. It senses the actual temperature and adjusts it as necessary to keep it close to the reference temperature. Of course, it is the designer's purpose that the thermostat should do that, but once the thermostat has been made and installed, the purpose is physically present in the thermostat.
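A minimal sketch of such a control loop, to make the "reference input" idea concrete (illustrative only; the bang-bang rule, the parameter values, and the crude room dynamics are made up for the example):

```python
# Purpose as "the reference input of a control system": a toy bang-bang
# thermostat that senses the actual temperature and acts to pull it toward
# the reference temperature.

def thermostat_step(actual_temp: float, reference_temp: float,
                    heater_on: bool, hysteresis: float = 0.5) -> bool:
    """Decide whether the heater should be on for the next time step."""
    if actual_temp < reference_temp - hysteresis:
        return True            # too cold: turn heating on
    if actual_temp > reference_temp + hysteresis:
        return False           # too warm: turn heating off
    return heater_on           # within the dead band: keep the current state

# Toy simulation: the room slowly cools on its own, the heater warms it.
temp, heater = 15.0, False
for _ in range(20):
    heater = thermostat_step(temp, reference_temp=21.0, heater_on=heater)
    temp += 0.8 if heater else -0.3   # crude room dynamics
print(round(temp, 1))  # ends up hovering near the 21.0 degree reference
```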
This concept does not occur at all in the letter exchange.
Consider what a person does when they act on a purpose. Some part of the world is not as they want it to be, and they act to bring it to such a state. That is what purpose is: to intend a state of affairs, and act to achieve it.
There is a wider sense of the word "purpose", where it is considered as a property of things. Maybe Dennett and Rosenberg are talking about that sense? The sense in which we can ask, "what is the purpose of this rock?", and if it's just some random rock lying somewhere, the answer would be "none". And for the question, "what is the purpose of an animal's heart?", the answer would be "to pump blood around its body." The difference between the two once again involves control systems. The heart is part of a control system. It is the actuator that will be made to beat faster and stronger, or slower and weaker, to meet the demand of the rest of the body for oxygen. A random rock is not part of any control system.
I'm inclined to map your idea of "reference input of a control system" onto the concept of homeostasis, homeostatic set points and homeostatic loops. Does that capture what you're trying to point at?
(Assuming it does) I agree that homeostasis is an interesting puzzle piece here. My guess for why this didn't come up in the letter exchange is that D/R are trying to resolve a related but slightly different question: the nature and role of an organism's conscious, internal experience of "purpose".
Purpose and its pursuit have a special role in how humans make sense of the world and themselves, in a way non-human animals don't (though it's not a binary).
The suggested answer to this puzzle is that, basically, the conscious experience of purpose and intent (and the allocation of this conscious experience to other creatures) is useful and thus selected for.
Why? Because they are meaningful patterns in the world. An observer with limited resources who wants to make sense of the world (i.e. an agent that wants to reduce sample complexity) can abstract along the dimension of "purpose"/"intentionality" to reliably get good predictions about the world. (Except that "abstracting along the dimension of intentionality" isn't an active choice of the observer so much as a result of the fact that intentions are a meaningful pattern.) The "intentionality-based" prediction does well at ignoring variables that aren't very predictive and capturing the ones that are, in the context of a bounded agent.
I would map "homeostasis" onto "control system", but maybe that's just a terminological preference.
The internal experience of purpose is a special case of internal experience, explaining which is the Hard Problem of Consciousness, which no-one has a solution for. I don't see a reason to deny this sort of purpose to animals, except to the extent that one would deny all conscious experience to them. I am quite willing to believe that (for example) cats, dogs, and primates have a level of consciousness that includes purpose.
The evolutionary explanation does not make any predictions. It looks at what is, says "it was selected for", and confabulates a story about its usefulness. Why do we have five fingers? Because every other number was selected against. Why were they selected against? Because they were less useful. How were they less useful? They must have been, because they were selected against. Even if some content were put into that, it still would not explain the thing that was to be explained: what is purpose? It is like answering the question "how does a car work?" by expatiating upon how useful cars are.
(Cleaned up some formatting, like adding a blockquote and adding some proper dividers as opposed to three "*" symbols)
I'm not sure I understand the cryptographer's constraint very well, especially with regard to language: individual words seem to have different meanings ("awesome", "literally", "love"). It's generally possible to infer which decryption was intended from the wider context, but sometimes the context itself will have different and mutually exclusive decryptions, such as in cases of real or perceived dogwhistling.
One way I could see this specific issue being resolved is by looking at what the intent of the original communication was - this would make it so that there is now a fact that settles which is the “correct” solution -, but that seems to fail in a different way: agents don't seem to have full introspective access to what they are doing or what the likely outcome of their actions is, such as in some cases of infidelity or making of promises.
This, too, could be resolved by saying that an agent's intention is "the outcomes they're attempting to instantiate regardless of self-awareness", but by that point it seems to me that we've agreed with Rosenberg's claim that it's Darwinian all the way down.
What am I missing?
As far as I can tell, I agree with what you say - this seems like a good account of how the cryptographer's constraint cashes out in language.
To your confusion: I think Dennett would agree that it is Darwinian all the way down, and that their disagreement lies elsewhere. Dennett's account of how "reasons turn into causes" is made on Darwinian grounds, and it compels Dennett (but not Rosenberg) to conclude that purposes deserve to be treated as real, because (compressing the argument a lot) they have the capacity to affect the causal world.
Not sure this is useful?
In regards to "the meaning of life is what we give it", that's like saying "the price of an apple is what we give it". While true, it doesn't tell the whole story. There are actual market forces that dictate apple prices, just like there are actual Darwinian forces that dictate meaning and purpose.
Agree; the causes that we create ourselves aren't all that governs us - in fact, it's a small fraction of that, considering physical, chemical, biological, game-theoretic, etc. constraints. And yet, there appears to be an interesting difference between the causes that govern simply animals and those that govern human-animals. Which is what I wanted to point at in the paragraph you're quoting and the few above it.
I'm confused about the "purposes don't affect the world" part. If I think my purpose is to eat an apple, then there will not be an apple in the world that would have otherwise still been there if my purpose wasn't to eat the apple. My purpose has actual effects on the world, so my purpose actually exists.
So, yes, basically this is what Dennett reasons in favour of, and what Rosenberg is skeptical of.
I think the thing here that needs reconciliation - and what Dennett is trying to do - is to explain why, in your apple story, it's justified to use the term "purpose", as opposed to only invoking arguments of natural selection, i.e. saying (roughly) that you (want to) eat apples because this is an evolutionarily adaptive behaviour and has therefore been selected for.
According to this view, purposes are at most a higher-level description that might be convenient for communication but that can be entirely explained away in evolutionary terms. In terms of the epistemic virtues of explanations, you wouldn't want to add conceptual entities without them improving the predictive power of your theory. I.e. adding the concept of purposes to your explanation, when you could just as well explain the observation without that concept, makes your explanation more complicated without buying you predictive power. All else equal, we prefer simple/parsimonious explanations over more complicated ones (cf. Occam's razor).
So, while Rosenberg advocates the "Darwinism is the only game in town" view, Dennett is trying to make the case that, actually, purposes cannot be fully explained away by a simple evolutionary account, because the act of representing purposes (e.g. a parent telling their children to eat an apple every day because it's good for them, an ad campaign promoting the consumption of local fruit, ...) does itself affect people's actions, and thereby purposes become causes.
[cross-posted from my blog]
Introduction
Is the concept of purposes, and more generally teleological accounts of behaviour, to be banished from the field of biology?
For many years - essentially since the idea of Darwinian natural selection started to be properly understood and integrated into the intellectual fabric of the field -, the consensus answer to this question among biology scholars was "yes". Much more recently, however, interest in this question has been rekindled - notably driven by voices that contradict that former consensus.
This is the context in which this letter exchange between the philosophers Alex Rosenberg and Daniel Dennett is taking place. What is the nature of "purposes"? Are they real? But mostly, what would it even mean for them to be?
In the following, I will provide a summary and discussion of what I consider the key points and lines of disagreement between the two. Quotes, if not specified otherwise, are taken from the letter exchange.
Rosenberg’s crux
Rosenberg and Dennett agree on large parts of their respective worldviews. They both share a "disenchanted" naturalist's view - they believe that reality is (nothing but) causal and (in principle) explainable. They subscribe to the narrative of reductionism, which celebrates how scientific progress emancipated, first, the world of physics, and later the chemical and biological one, from metaphysical beliefs. Through Darwin, we have come to understand the fundamental drivers of life as we know it - variation and natural selection.
But despite their shared epistemic foundation, Rosenberg suspects a fundamental difference in their views concerning the nature of purpose. Rosenberg - contrary to Dennett - sees a necessity for science (and scientists) to rid themselves, entirely, of any anthropocentric talk of purpose and meaning. Anyone who considers the use of the "intentional stance" justified, Rosenberg argues, would have to reconcile the following:
What is the mechanism by which Darwinian natural selection turns reasons (tracked by the individual as purpose, meaning, beliefs and intentions) into causes (affecting the material world)?
Rosenberg, of course, doesn't deny that humans - what he refers to as Gregorian creatures shaped by biological as well as cultural evolution - experience higher-level properties like emotions, intentions and meaning. Wilfrid Sellars calls this the "manifest image": the framework in terms of which we ordinarily perceive and make sense of ourselves and the world. [1] But Rosenberg sees a tension between the scientific and the manifest image - one that is, to his eyes, irreconcilable.
"Darwinism is the only game in town", so Rosenberg. Everything can, and ought to be, explained in terms of it. These higher-level properties - sweetness, cuteness, sexiness, funniness, colour, solidity, weight (not mass!) - are radically illusionary. Darwin's account of natural selection doesn't explain purpose, it explains it away. Just like physics and biology, so do cognitive sciences and psychology now have to become disabused from the “intentional stance”.
In other words, it's the recalcitrance of meaning that bothers Rosenberg - the fact that we appear to need it in how we make sense of the world, while also being unable to properly integrate it in our scientific understanding.
As Quine put it: "One may accept the Brentano thesis [about the nature of intentionality] as either showing the indispensability of intentional idioms and the importance of an autonomous science of intention, or as showing the baselessness of intentional idioms and the emptiness of a science of intention." [2]
Rosenberg is compelled by the latter path. In his view, the recalcitrance of meaning is "the last bastion of resistance to the scientific world view. Science can do without them, in fact, it must do without them in its description of reality." He doesn't claim that notions of meaning have never been useful, but that they have "outlived their usefulness", replaced, today, with better tools of scientific inquiry.
As I understand it, Rosenberg argues that purposes aren't real because they aren’t tied up with reality, unable to affect the physical world. Acting as if they were real (by relying on the concept to explain observations) is contributing to confusion and convoluted thinking. We ought, instead, to resort to the classical Darwinian explanations, where all behaviour boils down to evolutionary advantages and procreation (in a way that explains purpose away).
Rosenberg’s crux (or rather, my interpretation thereof) is that, if you want to claim that purposes are real - if you want to maintain purpose as a scientifically justified concept, one that is reconcilable with science -, you need to be able to account for how reasons turn into causes.
Perfectly real illusions
While Dennett recognizes the challenges presented by Rosenberg, he refuses to be troubled by them. Dennett paints a possible "third path" through Quine's puzzle by suggesting that we understand the manifest image (i.e. mental properties, qualia) neither as "as real as physics" (thereby making it incomprehensible to science) nor as "radically illusionary" (thereby troubling our self-understanding as Gregorian creatures). Instead, Dennett suggests, we can understand it as a user-illusion: "ways of being informed about things that matter to us in the world (our affordances) because of the way we and the environment we live in (microphysically [3]) are."
I suggest that this is, in essence, a deeply pragmatic account. (What account other than pragmatism, really, could utter, with the same ethos, a sentence like: "These are perfectly real illusions!")
While he doesn't say so explicitly, we can interpret Dennett as invoking the bounded nature of human minds and their perceptual capacity. Mental representations, while not representing reality fully truthfully (e.g. there is no microphysical account of colours, just photons), aren't arbitrary either. They issue (in part) from reality, and through the compression inherent in the mind's cognitive processes, these representations get distorted so as to form false, yet in all likelihood useful, illusions.
These representations are useful because they have evolved to be such: after all, it is through the interaction with the causal world that the Darwinian fitness of an agent is determined; whether we live or die, procreate or fail to do so. Our ability to perceive has been shaped by evolution to track reality (i.e. to be truthful), but only exactly to the extent that this gives us a fitness advantage (i.e. is useful). Our perceptions are neither completely unrestrained nor completely constrained by reality, and therefore they are neither entirely arbitrary nor entirely accurate.
Let’s talk about the nature of patterns for a moment. Patterns are critical to how intelligent creatures make sense of and navigate the world. They allow (what would otherwise be far too much) data to be compressed, while still granting predictive power. But are patterns real? Patterns directly stem from reality - they are to be found in reality - and, in this very sense, they are real. But, if there wasn’t anyone or anything to perceive and make use of this structural property of the real world, it wouldn’t be meaningful to talk of patterns. Reality doesn’t care about patterns. Observers/agents do.
This same reasoning can be applied to intentions. Intentions are meaningful patterns in the world. An observer with limited resources who wants to make sense of the world (i.e. an agent that wants to reduce sample complexity) can abstract along the dimension of "intentionality" to reliably get good predictions about the world. (Except that "abstracting along the dimension of intentionality" isn't an active choice of the observer so much as something that emerges because intentions are a meaningful pattern.) The "intentionality-based" prediction does well at ignoring variables that aren't sufficiently predictive and capturing the ones that are, which is critical in the context of a bounded agent.
Another case in point: affordances. In the preface to his book Surfing Uncertainty, Andy Clark writes: "[...] different (but densely interanimated) neural populations learn to predict various organism-salient regularities pertaining at many spatial and temporal scales. [...] The world is thus revealed as a world tailored to human needs, tasks and actions. It is a world built of affordances - opportunities for action and intervention." Just as with patterns, the world isn't literally made up of affordances. And yet, they are real in the sense of what Dennett calls user-illusions.
The cryptographer’s constraint
Dennett goes on to endow these illusionary reasons with further “realness” by invoking the cryptographer's constraint:
Dennett uses a simple crossword puzzle to illustrate the idea: “Consider a crossword puzzle that has two different solutions, and in which there is no fact that settles which is the “correct” solution. The composer of the crossword went to great pains to devise a puzzle with two solutions. [...] If making a simple crossword puzzle with two solutions is difficult, imagine how difficult it would be to take the whole corpus of human utterances in all languages and come up with a pair of equally good versions of Google Translate that disagreed!” [slight edits to improve readability]
The practical consequence of the constraint is that, “if you can find one reasonable decryption of a cipher-text, you’ve found the decryption.” Furthermore, this constraint is a general property of all forms of encryption/decryption.
Let’s look at the sentence: “Give me a peppermint candy!”
Given the cryptographer’s constraint, there are, practically speaking, very (read: astronomically) few plausible interpretations of the words “peppermint”, “candy”, etc. This is at the heart of what makes meaning non-arbitrary and language reliable.
To add a bit of nuance: the fact that the concept "peppermint" reliably translates to the same meaning across minds requires iterated interactions. In other words, Dennett doesn't claim that, if I just now came up with an entirely new concept (say "klup"), its meaning would immediately be unambiguously clear. But its meaning (across minds) would become increasingly precise and robust after using it for some time, and - on evolutionary time horizons - we can be preeetty sure we mean (to all practical relevance) the same things by the words we use.
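To make the constraint a bit more tangible, here is a toy sketch (my own illustration, not Dennett's; the Caesar cipher and the mini word list standing in for "knowledge of English" are assumptions made for the example). Brute-forcing every possible key, typically only one decryption comes out as sensible text - finding a reasonable decryption means finding the decryption:

```python
# Toy illustration of the cryptographer's constraint: try all 26 Caesar-cipher
# keys and keep only those whose output consists entirely of recognisable
# words. Even for a short message, usually exactly one key survives.

import string

# Hypothetical mini-dictionary standing in for "knowledge of English".
ENGLISH_WORDS = {"give", "me", "a", "peppermint", "candy"}

def shift(text: str, k: int) -> str:
    """Shift each lowercase letter of `text` by `k` positions in the alphabet."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord('a') + k) % 26 + ord('a')))
        else:
            out.append(ch)
    return "".join(out)

def plausible_decryptions(ciphertext: str) -> list[tuple[int, str]]:
    """Return every key whose decryption is made up entirely of known words."""
    results = []
    for k in range(26):
        candidate = shift(ciphertext, -k)
        if all(w in ENGLISH_WORDS for w in candidate.split()):
            results.append((k, candidate))
    return results

plaintext = "give me a peppermint candy"
ciphertext = shift(plaintext, 3)          # encrypt with key 3
print(plausible_decryptions(ciphertext))  # -> [(3, 'give me a peppermint candy')]
```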
But what does all this have to do with the question of whether purpose is real? Here we go:
The cryptographer's constraint - which I will henceforth refer to as the principle of pragmatic reliability [4]- is an essential puzzle piece to understanding what allows representations of reasons (e.g. a sentence making a claim) to turn into causes (e.g. a human taking a certain action because of that claim).
We are thus starting to get closer to Rosenberg’s crux as stated above: a scientific account for how reasons become causes. There is one more leap to take.
***
Reasons-turning-causes
Having invoked the role of pragmatic reliability, let’s examine another pillar of Dennett's view - one that will eventually get us all the way to addressing Rosenberg’s crux.
Rosenberg says: "I see how we represent in public language, turning inscriptions and noises into symbols. I don’t see how, prior to us and our language, mother nature (a.k.a Darwinian natural selection) did it."
What Rosenberg conceives to be an insurmountable challenge to Dennett’s view, the latter prefers to walk around rather than over, figuratively speaking. As developed at length in his book From Bacteria to Bach and Back, Dennett suggests that "mother nature didn’t represent reasons at all", nor did it need to.
First, the mechanism of natural selection uncovers what Dennett calls "free-floating rationales" - reasons that existed billions of years before, and independently of, reasoners. Only when the tree of life grew a particular (and so far unique) branch - humans, together with their use of language - did these reasons start to get represented.
"We humans are the first reason representers on the planet and maybe in the universe. Free-floating rationales are not represented anywhere, not in the mind of God, or Mother Nature, and not in the organisms who benefit from all the arrangements of their bodies for which there are good reasons. Reasons don’t have to be representations; representations of reasons do."
This is to say: it isn't exactly the reasons, so much as their representations, that become causes.
Reasons-turning-causes, according to Dennett, is unique to humans because only humans represent reasons. I would add the nuance that the capacity to represent lives on a spectrum rather than being binary. Some other animals seem to be able to do something like representation, too. [5] That said, humans remain unchallenged in the degree to which they have developed the capacity to represent (among the forms of life we are currently aware of).
"Bears have a good reason to hibernate, but they don’t represent that reason, any more than trees represent their reason for growing tall: to get more sunlight than the competition." While there are rational explanations for the bear’s or the tree’s behaviour, they don’t understand, think about or represent these reasons. The rationale has been discovered by natural selection, but the bear/tree doesn’t know - nor does it need to - why it wants to stay in their dens during winter.
Language plays a critical role in this entire representation-jazz. Language is instrumental to our ability to represent; whether as necessary precursor, mediator or (ex-post) manifestation of that ability remains a controversial question among philosophers of language. Less controversial, however, is the role of language in allowing us to externalize representations of reasons, thereby “creating causes” not only for ourselves but also for people around us. Wilfrid Sellars suggested that language bore what he calls “the space of reasons” - the space of argumentation, explanation, query and persuasion. [6] In other words, language bore the space in which reasons can become causes.
We can even go a step further: while acknowledging the role of natural selection in shaping what we are - the fact that the purposes of our genes are determined by natural selection -, we are still free to make our own choices. To put it differently: "Humans create the purposes they are subject to; we are not subject to purposes created by something external to us.” [7]
In From Darwin to Derrida: Selfish Genes, Social Selves, and the Meanings of Life, David Haig argues for this point of view by suggesting that there need not be full concordance, nor congruity, between our psychological motivations (e.g. wanting to engage in sexual activity because it is pleasurable, wanting to eat a certain food because it is tasty) and the reasons why we have those motivations (e.g. in order to pass on our genetic material).
There is a piece of folk wisdom that goes: “the meaning of life is the meaning we give it”. Based on what has been discussed in this essay, we can see this saying in a different, more scientific light: as a testimony to the fact that we humans are creatures that represent meaning, and that by doing so we turn “free-floating rationales” into causes that govern our own lives.
Thanks to particlemania, Kyle Scott and Romeo Stevens for useful discussions and comments on earlier drafts of this post.
***
[1] Sellars, Wilfrid. "Philosophy and the scientific image of man." Science, perception and reality 2 (1963).
Also see: deVries, Willem. "Wilfrid Sellars", The Stanford Encyclopedia of Philosophy (Fall 2020 Edition). Retrieved from: https://plato.stanford.edu/archives/fall2020/entries/sellars/
[2] Quine, Willard Van Orman. "Word and Object. New Edition." MIT Press (1960).
[3] I.e. the physical world at the level of atoms
[4] AI safety relevant side note: The idea that translations of meaning need only be sufficiently reliable in order to be reliably useful might provide an interesting avenue for AI safety research.
Language works, as evidenced by the striking success of human civilisations, made possible through advanced coordination, which in turn requires advanced communication. (Sure, humans miscommunicate what feels like a whole lot, but in the bigger scheme of things, we still appear to be pretty damn good at this communication thing.)
Notably, language works without there being theoretically air-tight proofs that map meanings onto words.
Right there, we have an empirical case study of a symbolic system that functions on a (merely) pragmatically reliable regime. We can use it to inform our priors on how well this regime might work in other systems, such as AI, and how and why it tends to fail.
One might argue that a pragmatically reliable alignment isn’t enough - not given the sheer optimization power of the systems we are talking about. Maybe that is true; maybe we do need more certainty than pragmatism can provide. Nevertheless, I believe that there are sufficient reasons for why this is an avenue worth exploring further.
Personally, I am most interested in this line of thinking from an AI ecosystems/CAIS point of view, and as a way of addressing the problem of the transient and contextual nature of preferences (which I consider a major challenge).
[5] People wanting to think about this more might be interested in looking into vocal (production) learning - the ability to “modify acoustic and syntactic sounds, acquire new sounds via imitation, and produce vocalizations”. This conversation might be a good starting point.
[6] Sellars, Wilfrid. In the space of reasons: Selected essays of Wilfrid Sellars. Harvard University Press (2007).
[7] Quoted from: https://twitter.com/ironick/status/1324778875763773448