From Capuchins to AI's, Setting an Agenda for the Study of Cultural Cooperation (Part2)

diegocaleiro

Today's writings are shaded dark green, the rest was also in Part1.

This is a multi-purpose essay in the making. It is being written with the following goals: 1) Mandatory essay writing at the end of a semester studying "Cognitive Ethology: Culture in Human and Non-Human Animals"; 2) Drafting something that can later be published in a journal that deals with cultural evolution, hopefully inclining people in the area to glance at future-oriented research, i.e. FAI and global coordination; 3) Publishing it on Lesswrong; and 4) Ultimately Saving the World, as everything should. If it's worth doing, it's worth doing in the way most likely to save the World.

Since many of my writings are too long for Lesswrong, I'll publish this in a sequence-like form made of self-contained chunks. My deadline is Sunday, so I'll probably post daily, editing and creating the new sections based on previous commentary.

Abstract: The study of cultural evolution has drawn much of its momentum from academic areas far removed from human and animal psychology, especially regarding the evolution of cooperation. Game theoretic results and parental investment theory come from economics, kin selection models from biology, and an ever growing number of models describing the process of cultural evolution in general, and the evolution of altruism in particular, come from mathematics. Even Artificial Intelligence has taken an interest in how to create agents that can communicate, imitate and cooperate. In this article I begin to tackle the 'why?' question. By trying to retrospectively make sense of the convergence of all these fields, I contend that further refinements in these fields should be directed towards understanding how to create environmental incentives fostering cooperation.

We need systems that are wiser than we are. We need institutions and cultural norms that make us better than we tend to be. It seems to me that the greatest challenge we now face is to build them. - Sam Harris, 2013, The Power Of Bad Incentives

1) Introduction

2) Cultures evolve

Culture is perhaps the most remarkable outcome of the evolutionary algorithm (Dennett, 1996) so far. It is the cradle of most things we consider humane - that is, typically human and valuable - and it surrounds our lives to the point that we may be thought of as creatures made of culture even more than creatures of bone and flesh (Hofstadter, 2007; Dennett, 1992). The appearance of our cultural complexity has relied on many associated capacities, among them:

1) The ability to observe, be interested in, and approach an individual doing something interesting, an ability we share with Norway rats, crows, and even lemurs (Galef & Laland, 2005).

2) Ability to learn from and scrounge the food of whoever knows how to get food, shared by capuchin monkeys (Ottoni et al, 2005).

3) Ability to tolerate learners, to accept learners, and to socially learn, probably shared by animals as diverse as fish, finches and Finns (Galef & Laland, 2005).

4) Understanding and emulating other minds - Theory of Mind - empathizing, relating, perhaps re-framing an experience as one's own, shared by chimpanzees, dogs, and at least some cetaceans (Rendell & Whitehead, 2001).

5) Learning the program level description of the action of others, for which the evidence among other animals is controversial (but see Cantor & Whitehead, 2013). And finally...

6) Sharing intentions. Intricate understanding of how two minds can collaborate with complementary tasks to achieve a mutually agreed goal (Tomasello et al, 2005).

Irrespective of definitional disputes around the true meaning of the word "culture" (which doesn't exist, see e.g. Pinker, 2007 pg115; Yudkowsky 2008A), each of these is more cognitively complex than its predecessor, and even (1) is sufficient for intra-specific non-environmental, non-genetic behavioral variation, which I will call "culture" here, whoever it may harm.

By transitivity, (2-6) allow the development of culture. It is interesting to notice that tool use, frequently but falsely cited as the hallmark of culture, is spread roughly equiprobably across the animal kingdom. A graph showing, per biological family, which species show tool use gives us a power law distribution, whose similarity with the universal prior will help in understanding that being from a family where a species uses tools tells us very little about a species' own tool use (Michael Haslam, personal communication).

Once some of those abilities are available, and given some amount of environmental facilities, need, and randomness, cultures begin to form. Occasionally, so do more developed traditions. Be it by imitation, program level imitation, goal emulation or intention sharing, information is transmitted between agents, giving rise to elements sufficient to constitute a primeval Darwinian soup. That is, entities form such that they exhibit 1) Variation 2) Heredity or replication 3) Differential fitness (Dennett, 1996). In light of the article Five Misunderstandings About Cultural Evolution (Henrich, Boyd & Richerson, 2008) we can improve Dennett's conditions for the evolutionary algorithm as 1) Discrete or continuous variation 2) Heredity, replication, or less faithful replication plus content attractors 3) Differential fitness. Once this set of conditions is met, an evolutionary algorithm, or many, begins to carve its optimizing paws into whatever surpassed the threshold for long enough. Cultures, therefore, evolve.
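The three improved conditions can be made concrete with a toy simulation. What follows is only a minimal sketch under assumptions that are not in any of the cited sources: cultural variants are single numbers (continuous variation), copying is noisy but pulled part-way toward a hypothetical attractor at 1.0 (less faithful replication plus a content attractor), and variants nearer 1.0 get copied more often (differential fitness). The names ATTRACTOR, copy_with_attractor, fitness, evolve and spread are all invented for the illustration.

```python
import random

ATTRACTOR = 1.0  # hypothetical point in "conceptual space" (assumption)


def copy_with_attractor(value, rng, noise=0.3, pull=0.5):
    """Low-fidelity copy: add noise, then move part-way toward the attractor."""
    noisy = value + rng.gauss(0.0, noise)
    return noisy + pull * (ATTRACTOR - noisy)


def fitness(value):
    """Differential fitness (assumed form): closer to 1.0 means copied more often."""
    return 1.0 / (1.0 + abs(value - ATTRACTOR))


def evolve(population, generations, rng):
    """Each generation: pick models in proportion to fitness, then copy them."""
    for _ in range(generations):
        weights = [fitness(v) for v in population]
        models = rng.choices(population, weights=weights, k=len(population))
        population = [copy_with_attractor(m, rng) for m in models]
    return population


def mean(population):
    return sum(population) / len(population)


def spread(population):
    """Standard deviation of the variants in the population."""
    m = mean(population)
    return (sum((v - m) ** 2 for v in population) / len(population)) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
start = [rng.uniform(-5.0, 5.0) for _ in range(200)]  # continuous variation
end = evolve(start, 30, rng)
print(f"spread before: {spread(start):.2f}, after: {spread(end):.2f}")
print(f"mean after: {mean(end):.2f}")
```

Even though no variant is ever copied exactly, the population ends up clustered around the attractor, which is the sense in which noisy copying plus attractors can stand in for faithful replication.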

The intricacies of cultural evolution, and mathematical and computational models of how cultures evolve, have been the subject of much interdisciplinary research. For an extensive account of human culture see Not By Genes Alone (Richerson & Boyd, 2005). For computational models of social evolution, there is work by Mesoudi, Nowak, and others, e.g. (Hauert et al, 2007). For mathematical models, the aptly named Mathematical models of social evolution: A guide for the perplexed by McElreath and Boyd (2007) provides a textbook-style walk-through. For animal culture, see (Laland & Galef, 2009).

Cultural evolution satisfies David Deutsch's criterion for existence: it kicks back. It also satisfies the evolutionary equivalent of the condition posed by the Quine-Putnam indispensability argument in mathematics, i.e. it is a sine qua non for understanding how the World works nomologically. It is falsifiable in Popper's sense, and it inflates the World's ontology a little by inserting a new kind of "replicator", the meme.

Contrary to what happened on the internet, the name 'meme' has lost much of its appeal among cultural evolution theorists, and "memetics" is considered by some to refer only to the study of memes as monolithic, atomic, high-fidelity replicators, which would make the theory obsolete. This has created the following conundrum: the name 'meme' remains by far the best-known way to speak of "that which evolves culturally" within, and especially outside, the specialist arena. Further, the niche occupied by the word 'meme' is so conceptually necessary for communicating and explaining within the area that the word is frequently used under scare quotes, or with some other informal excuse. In fact, as argued by Tim Tyler - who frequently posts here - in the very sharp Memetics (2011), there are nearly no reasons to try to abandon the 'meme' meme, and nearly all reasons (practicality, QWERTY reasons, mnemonics) to keep it.

To avoid contradicting the evidence gathered since Dawkins first coined the term, I suggest we redefine Meme as an attractor in cultural evolution (dual-inheritance) whose development over time structurally mimics to a significant extent the discrete behavior of genes, frequently coinciding with the smallest unit of cultural replication. The definition is long, but the idea is simple: Memes are the best analogues of genes not because they are discrete units that replicate just like genes, but because they are continuous conceptual clusters being attracted to a point in conceptual space whose replication is just like that of genes. Even more simply, memes are the mathematically closest things to genes in cultural evolution. So the suggestion here is for researchers of dual-inheritance and cultural evolution to take the scare quotes off our memes and keep business as usual.

The evolutionary algorithm has created a new attractor-replicator, the meme, it didn't privilege with it any specific families in the biological trees and it ended up creating a process of cultural-genetic coevolution known as dual-inheritance. This process has been studied in ever more quantified ways by primatologists, behavioral ecologists, population biologists, anthropologists, ethologists, sociologists, neuroscientists and even philosophers. I've shown at least six distinct abilities which helped scaffold our astounding level of cultural intricacy, and some animals who share them with us. We will now take a look at the evolution of cooperation, collaboration, altruism, moral behavior, a sub-area of cultural evolution that saw an explosion of interest and research during the last decade, with publications (most from the last 4 years) such as The Origins of Morality, Supercooperators, Good and Real, The Better Angels of Our Nature, Non-Zero, The Moral Animal, Primates and Philosophers, The Age of Empathy, Origins of Altruism and Cooperation, The Altruism Equation, Altruism in Humans, Cooperation and Its Evolution, Moral Tribes, The Expanding Circle, The Moral Landscape.

3) Cooperation evolves

Despite the selfish nature of genes (Dawkins, 1999) and other units of Darwinian transmission (Jablonka & Lamb, 2007), altruism at the individual level (cost to self for benefit to other) can and does arise because of several intertwined factors.

1) Alleles (the molecular biologist's word for what less-specialized areas call genes) under normal conditions optimize for there being more copies of themselves in the future. This happens regardless of whether it is that particular physical instantiation - also known as a token - that is present in the future.

2) Copies of alleles are spread over space, individuals, groups, species and time, but they only care about the time dimension and the quantity dimension. In the long run alleles don't thrive if they are doing better than their neighbors; they thrive if they are doing better than the average allele. A token (instantiation) of an allele that codes for cancer, multiplying itself uncontrollably, could, had it a mind, think it is doing great, but if the mutation that gave rise to it only happened in somatic cells (which do not go through the germ line), it would be in for a surprise. This is one reason why biologists say natural selection is short-sighted.

3) The above reasoning applies exactly equally and for the same reasons to an allele that codes for individual-selfish behavior in a species in which more altruist groups tend to outlive more egotistic ones. The allele for individual-selfishness, and the selfish individual, may think they are doing great compared to their neighbors, when all of a sudden, with high probability, their group dies. Altruism wins in this case not because there is a new spooky unit of selection that reverses reductionism and applies downward causation which originates in groups. Altruism thrives because the average long term fitness of each allele that coded for it was higher than that of alleles that code for individual-selfish behavior. Group selection_c - as well as superorganism selection, somatic cell selection, species selection and individual selection - only happens when the selective forces operating on that level coincide with the allele's fitness increasing in relation to all the competing alleles. (Group selection_c is selection for altruist genes at the group level, the only definition under which the entire discussion was dealing with a controversy of substance instead of talking past each other, as brilliantly explained in this post by PhilGoetz, 2010; please read the case study section in that post to get a more precise understanding than the above short definition.) See also the excursus on what a fitness function is below.

4) Completely independently of the reasons in (3), alleles, epigenetics, and learning can program individuals to be cooperative if they "expect" (consciously or not) the interaction with another individual, say, Malou, to: (a) Begin a cycle of reciprocation with Malou in the future whose benefit exceeds the current cost being paid; (b) Counterfactually increase their reputation with sufficiently many individuals that those will award more benefit than the current cost; (c) Avoid being punished by third parties; (d) Conform to, or help enforce, by setting an example, social norms and rules upon which selection pressures act (Tomasello et al, 2005). A key notion in all these mechanisms based on this encoded "expectation" is that uncertainty must be present. In the absence of uncertainty, a state that doesn't exist in nature, an agent in a prisoner's-dilemma-like interaction would be required to defect instead of cooperating from round one, predicting the backwards-in-time cascade of defection from whichever was the last round of interaction, in which by definition cooperating is worse. The problems that people on Lesswrong are trying to solve using Timeless Decision Theory, Updateless Decision Theory, PrudentBot, and other IQ140+ gimmicks, evolution solved by inserting stupidity! More precisely, by embracing higher level uncertainty about how many future interactions there will be. Kissing, saying "I love you", becoming engaged, and getting married are all increasingly honest ways in which the computer program programmed by your alleles informs Malou that there will be more cooperation and less defection in the future.

5) Finally, altruism only poses paradoxes of the "Group Selection_c" kind when we are trying to explain why a replicator that codes for altruism emerged, and we are trying to explain it at that replicator's level. It is no mystery why a composition of the phenotypic effects of a gene (replicator) and two memes (attractor-replicators) in all individuals who possess the three of them makes them altruistic, if it does. Each gene and meme in that composition may be fending for itself, but as things turn out, they do make some really nice people (or bonobos) once their extended phenotypes are clustered within those people. If we trust Jablonka & Lamb (2007), there are four streams of heredity flowing concomitantly: Genetic, Epigenetic, Behavioral and Symbolic. Some of the flowing hereditary entities are not even attractor-replicators (niche construction for instance); they don't exhibit replicator dynamics, and any altruism that spreads through them requires no special explanation at all!
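The role of uncertainty in factor (4) can be illustrated with the standard repeated prisoner's dilemma arithmetic. This is a hedged sketch, not a model taken from the cited literature: it assumes the partner plays a "grim trigger" strategy (cooperate until the first defection, then defect forever), that after each round there is another round with probability w, and it uses the conventional illustrative payoffs T=5 (temptation), R=3 (mutual cooperation) and P=1 (mutual defection). All function names are invented for the example.

```python
def value_of_cooperating(R, w):
    """Expected total payoff of cooperating every round: R + w*R + w^2*R + ..."""
    return R / (1.0 - w)


def value_of_defecting_now(T, P, w):
    """Take the temptation payoff once, then mutual defection in every later round."""
    return T + w * P / (1.0 - w)


def cooperation_pays(T, R, P, w):
    """True when cooperating is at least as good as defecting immediately."""
    return value_of_cooperating(R, w) >= value_of_defecting_now(T, P, w)


def minimum_continuation_probability(T, R, P):
    """Smallest w at which cooperation pays, obtained by solving the inequality above."""
    return (T - R) / (T - P)


T, R, P = 5.0, 3.0, 1.0  # illustrative payoffs (assumption)
for w in (0.0, 0.2, 0.9):
    print(f"w={w}: cooperation pays? {cooperation_pays(T, R, P, w)}")
print("threshold w:", minimum_continuation_probability(T, R, P))
```

Setting w to 0 corresponds to knowing that the current round is the last one, where defection wins, and the known-horizon cascade described in (4) is that same last-round reasoning applied backwards. Cooperation only becomes worthwhile once the chance of meeting again is high enough.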

To the best of my knowledge, none of the 5 factors above, which all do play a role in the existence and maintenance of altruism, requires a revision of Neodarwinism of the Dawkins, Dennett, Trivers, Pinker sort. None of them challenges the validity of our models of replicator dynamics as replicator dynamics. None of them challenges the metaphysically fundamental notion of Darwinism as Universal Acid (Dennett, 1996). None of them compromises the claim that everything in the universe that has complex design of which we are aware can be traced back to Darwinian mindless processes operating, by and large, in replicator-like entities (Dennett, op. cit.). None of them poses an obstacle to physicalist reductionism - in this biology-laden context being the claim that all macrophysical facts, including biological facts, are materially determined by the microphysical facts.

Cooperation evolves, and altruism evolves. They evolve for natural, non-mysterious reasons. Before any more shaking of the edifice of Darwinism is done, and its constitutive reductionism or universal corrosive powers are contested, any counter-evidence must first survive the far less demanding possibility that it is explained by one of the factors above, or a combination of them, or that it simply results from one of the many confusions clarified in the excursus below. Despite many people's attempts to look for Skyhooks that would cast away the all-too-natural demons of Neodarwinism and reductionism, things remain as they were before: Cranes all the way up. I will be listening attentively for a case of altruism found in the biological world, or in mathematical simulations based on it, that can pierce through these many layers of explanatory ability, but I won't be holding my breath.

Excursus: What is a fitness function?

It is worth pointing out here not only that the altruism and group selection confusion happens, but also why it does. PhilGoetz did half of the explanatory job already. The other half is noticing that the fitness function is a many-place function (there is a newer and better post on Lesswrong explaining many-place functions/words, but I didn't find it in 12min, please point to it if you can). The complicated description of "what the fitness function is", in David Lewis's manner of speaking, would be that it is a function from things to functions from functions to functions. More understandably, with e.g. the specific "thing" being a token of an altruistic allele of kind "Aallele", call it "Aallele₃₃₄":

Aallele₃₃₄--¹-->((number of Aalleles--³-->total number of alleles)--²-->(amplitude configuration slice--⁴-->simplest ordering))

Here arrow 4 is the function we call time from a timeless physics, quantum physics perspective. Just substitute "time" for the whole parenthesis if you haven't read the Quantum Physics sequence. Arrow 3 is how well Aalleles are doing, i.e. how many of them there are in relation to the total number of competing alleles. Arrow 2 is how this relation between Aalleles and the total varies over time. The fitness function is arrow 1: once you are given a specific token of an allele, it is the function that describes how well copies of that token do over time in relation to all the competing alleles. Needless to say, not many biologists are aware of that complex computation.
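One possible reading of the arrows, written as code. This is a loose sketch rather than a faithful formalization: time is reduced to an index into a list of censuses (the "just substitute time" simplification), and the names Census, relative_frequency and fitness, as well as the numbers in the example history, are all made up for the illustration.

```python
from typing import Callable, Dict, List

Census = Dict[str, int]  # allele kind -> number of copies counted at one moment


def relative_frequency(kind: str, census: Census) -> float:
    """Arrow 3: copies of this kind relative to all competing alleles."""
    total = sum(census.values())
    return census.get(kind, 0) / total if total else 0.0


def fitness(kind: str, history: List[Census]) -> Callable[[int], float]:
    """Arrow 1: given an allele kind, return arrow 2, a function of time.

    The returned function maps a time index (standing in for arrow 4)
    to the relative frequency of the kind at that moment.
    """
    def share_at(t: int) -> float:
        return relative_frequency(kind, history[t])
    return share_at


# Hypothetical census data: which alleles are counted is itself a choice.
history = [
    {"Aallele": 10, "Sallele": 90},
    {"Aallele": 30, "Sallele": 70},
    {"Aallele": 0, "Sallele": 100},
]
a_fitness = fitness("Aallele", history)
print([a_fitness(t) for t in range(len(history))])
```

The point of writing it this way is that nothing comes out until two further inputs are supplied: the set of competitors being counted, and the moment at which the counting is done.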

The reason why the unexplained half of the controversies happens is that the point-in-time fitness of an allele will appear very different when you factor it against the competing alleles of other cells, of other individuals, of other groups, or of other species. Fitness is what philosophers call an externalist concept: if you increase the amount of contextually relevant surroundings, the output number changes significantly. It will also appear very different when you factor it for final time T₁ or T₂. The fitness of an allele coding for a species-specific characteristic such as T. rex's large body will be very high if the final time is 65 million years ago, but negative if it is 64 million years ago.

I remember Feynman saying, I believe in this interview, that it is amazing what the eye does. Placed in the 3D equivalent of an insect floating up and down on the 2D surface of a swimming pool, we manage to abstract away all the waves going through the space between us and a seen object, and still capture enough information to locate it, interact with it, and admire it. It is as if the insect could tell only from its vertical oscillations how many children were in the pool, where they were located, etc. The state of knowledge in many fields, adaptive fitness included, strikes me as similarly amazing. If this many-place function underlies what biologists should be talking about to avoid talking past each other, how can many of them be aware of only one or two of the many variables that should be input, and still be making good science? Or are they?

If you fail to see hidden variables, you can fall prey to anomalies like Simpson's paradox, which is exactly the mistake described in PhilGoetz's post on group/species selection.
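A toy example of how Simpson's paradox shows up here, with numbers invented purely for illustration (this is not the case study from PhilGoetz's post). It assumes two groups, that everyone in a group reproduces more the larger the group's share of altruists, and that altruists pay a small reproductive cost. The function names and the benefit and cost values are assumptions.

```python
def next_generation(altruists, selfish, benefit=2.0, cost=0.2):
    """One round of reproduction inside a single group (assumed rules)."""
    p = altruists / (altruists + selfish)   # share of altruists in the group
    shared = 1.0 + benefit * p              # everyone gains from altruists nearby
    return altruists * (shared - cost), selfish * shared


def share(altruists, selfish):
    """Fraction of altruists among the individuals counted."""
    return altruists / (altruists + selfish)


groups = [(90, 10), (10, 90)]               # (altruists, selfish) per group
after = [next_generation(a, s) for a, s in groups]

within_before = [share(a, s) for a, s in groups]
within_after = [share(a, s) for a, s in after]
overall_before = share(sum(a for a, _ in groups), sum(s for _, s in groups))
overall_after = share(sum(a for a, _ in after), sum(s for _, s in after))

print("within groups:", within_before, "->", within_after)
print("overall:", overall_before, "->", overall_after)
```

Inside each group the altruists lose ground to their selfish neighbors, yet their share of the whole population rises, because the group with more altruists grows faster. Which of the two trends one reports depends entirely on whether the group variable is kept in view.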

The function above also works for things other than alleles, like individuals with a characteristic, in which case it will be calculating the fitness of having that characteristic at the individual level.

4) The complexity of cultural items doesn't undermine the validity of mathematical models.

4.1) Cognitive attractors and biases substitute for memes discreteness

The math becomes equivalent.

4.2) Despite the Unilateralist's Curse and the Tragedy of the Commons, dyadic interaction models help us understand large scale cooperation

Once we know these two failure modes, dyadic iterated (or reputation-sensitive) interaction is close enough.

5) From Monkeys to Apes to Humans to Transhumans to AIs, the ranges of achievable altruistic skill.

Possible modes of being altruistic. Graph like Bostrom's. Second and third order punishment and cooperation. Newcomb-like signaling problems within AI.

6) Unfit for the Future: the need for greater altruism.

We fail and will remain failing in Tragedy of the Commons problems unless we change our nature.

7) From Science, through Philosophy, towards Engineering: the future of studies of altruism.

Philosophy: Existential Risk prevention through global coordination and cooperation prior to technical maturity. Engineering Humans: creating enhancements and changing incentives. Engineering AIs: making them better and realer.

8) A different kind of Moral Landscape

Like Sam Harris's one, except comparing not how much a society approaches The Good Life (Moral Landscape pg15), but how much it fosters altruistic behavior.

9) Conclusions

Not yet.

Bibliography (only of the parts already written, obviously):

Boyd, R., Gintis, H., Bowles, S., & Richerson, P. J. (2003). The evolution of altruistic punishment. Proceedings of the National Academy of Sciences, 100(6), 3531-3535.

Cantor, M., & Whitehead, H. (2013). The interplay between social networks and culture: theoretically and among whales and dolphins. Philosophical Transactions of the Royal Society B: Biological Sciences, 368(1618).

Dawkins, R. (1999). The extended phenotype: The long reach of the gene. Oxford University Press, USA.

Dennett, D. C. (1996). Darwin's dangerous idea: Evolution and the meanings of life (No. 39). Simon & Schuster.

Dennett, D. C. (1992). The self as a center of narrative gravity. Self and consciousness: Multiple perspectives.

Galef Jr, B. G., & Laland, K. N. (2005). Social learning in animals: empirical studies and theoretical models. BioScience, 55(6), 489-499.

Hauert, C., Traulsen, A., Brandt, H., Nowak, M. A., & Sigmund, K. (2007). Via freedom to coercion: the emergence of costly punishment. Science, 316(5833), 1905-1907.

Henrich, J., Boyd, R., & Richerson, P. J. (2008). Five misunderstandings about cultural evolution. Human Nature, 19(2), 119-137.

Hofstadter, D. R. (2007). I am a Strange Loop. Basic Books

Jablonka, E., & Lamb, M. J. (2007). Precis of evolution in four dimensions. Behavioral and Brain Sciences, 30(4), 353-364.

McElreath, R., & Boyd, R. (2007). Mathematical models of social evolution: A guide for the perplexed. University of Chicago Press.

Ottoni, E. B., de Resende, B. D., & Izar, P. (2005). Watching the best nutcrackers: what capuchin monkeys (Cebus apella) know about others' tool-using skills. Animal Cognition, 8(4), 215-219.

Persson, I., & Savulescu, J. (2012). Unfit for the Future: The Need for Moral Enhancement. Oxford University Press.

PhilGoetz. (2010), Group selection update. Available at http://lesswrong.com/lw/300/group_selection_update/

Pinker, S. (2007). The stuff of thought: Language as a window into human nature. Viking Adult.

Rendell, L., & Whitehead, H. (2001). Culture in whales and dolphins. Behavioral and Brain Sciences, 24, 309-382.

Richerson, P. J., & Boyd, R. (2005). Not by genes alone. University of Chicago Press.

Tyler, T. (2011). Memetics: Memes and the Science of Cultural Evolution. Tim Tyler.

Tomasello, M., Carpenter, M., Call, J., Behne, T., & Moll, H. (2005). Understanding and sharing intentions: The origins of cultural cognition. Behavioral and Brain Sciences, 28(5), 675-690.

Yudkowsky, E. (2008A). 37 ways words can be wrong. Available at http://lesswrong.com/lw/od/37_ways_that_words_can_be_wrong/

Viliam_Bur (11y)

it is being written aiming at the following goals 1) Mandatory essay writing at the end of a semester

In my experience (as a reader), most articles written with this goal are horrible. Please don't do this.

diegocaleiro (11y)

I agree that the end result is not good so far. I mean, it is more than enough for giving me approval. But it fails in its multiplicity, as made obvious by the downvotes, which I think were fairly deserved, given how much a writing of this kind is not the expected Lesswrong discussion post.

Qiaochu_Yuan (11y)

In this article I begin to tackle the 'why?' question.

What is the 'why?' question?

This is a somewhat long post and the introduction doesn't give me a good sense of why I should read it. Also, the dark green is really weird. It's impossible to distinguish from black on my screen unless I scroll, and then it's just distracting. I can't even begin to tell what's dark green and what isn't without scrolling up and down, and even then I still can't really tell.

Shmi (11y)

May I suggest downvoting without reading anything diegocaleiro writes that is longer than one screenful and only return to read it if it's highly upvoted later? Works well for me.
