The continued misuse of the Prisoner's Dilemma
Related to: The True Prisoner's Dilemma, Newcomb's Problem and Regret of Rationality
In The True Prisoner's Dilemma, Eliezer Yudkowsky pointed out a critical problem with the way the Prisoner's Dilemma is taught: the distinction between utility and avoided-jail-time is not made clear. The payoff matrix is supposed to represent the former, even as its numerical values happen to coincidentally match the latter. And worse, people don't naturally assign utility as per the standard payoff matrix: their compassion for the friend in the "accomplice" role means they wouldn't feel quite so good about a "successful" backstabbing, nor quite so bad about being backstabbed. ("Hey, at least I didn't rat out a friend.")
For that reason, you rarely encounter a true Prisoner's Dilemma, even an iterated one. The above complications prevent real-world payoff matrices from working out that way.
Which brings us to another unfortunate example of this misunderstanding being taught.
An Alternative Approach to AI Cooperation
[This post summarizes my side of a conversation between me and cousin_it, and continues it.]
Several people here have shown interest in an approach to modeling AI interactions that was suggested by Eliezer Yudkowsky: assume that AIs can gain common knowledge of each other's source code, and explore the decision/game theory that results from this assumption.
In this post, I'd like to describe an alternative approach*, based on the idea that two or more AIs may be able to securely merge themselves into a joint machine, and allow this joint machine to make and carry out subsequent decisions. I argue that this assumption is as plausible as that of common knowledge of source code, since it can be built upon the same technological foundation that has been proposed to implement common knowledge of source code. That proposal, by Tim Freeman, was this:
Entity A could prove to entity B that it has source code S by consenting to be replaced by a new entity A' that was constructed by a manufacturing process jointly monitored by A and B. During this process, both A and B observe that A' is constructed to run source code S. After A' is constructed, A shuts down and gives all of its resources to A'.
Notice that the same technology can be used for two AIs to merge into a single machine running source code S (which they both agreed upon). All that needs to be changed from the above process is for B to also shut down and give all of its resources to A' after A' is constructed. Not knowing if there is a standard name for this kind of technology, I've given it the moniker "secure joint construction."
Fair Division of Black-Hole Negentropy: an Introduction to Cooperative Game Theory
Non-cooperative game theory, as exemplified by the Prisoner’s Dilemma and commonly referred to by just "game theory", is well known in this community. But cooperative game theory seems to be much less well known. Personally, I had barely heard of it until a few weeks ago. Here’s my attempt to give a taste of what cooperative game theory is about, so you can decide whether it might be worth your while to learn more about it.
The example I’ll use is the fair division of black-hole negentropy. It seems likely that for an advanced civilization, the main constraining resource in the universe is negentropy. Every useful activity increases entropy, and since entropy of the universe as a whole never decreases, the excess entropy produced by civilization has to be dumped somewhere. A black hole is the only physical system we know whose entropy grows quadratically with its mass, which makes it ideal as an entropy dump. (See http://weidai.com/black-holes.txt where I go into a bit more detail about this idea.)
Let’s say there is a civilization consisting of a number of individuals, each the owner of some matter with mass mᵢ. They know that their civilization can’t produce more than (∑ mᵢ)² bits of total entropy over its entire history, and the only way to reach that maximum is for every individual to cooperate and eventually contribute his or her matter into a common black hole. A natural question arises: what is a fair division of the (∑ mᵢ)² bits of negentropy among the individual matter owners?
Fortunately, Cooperative Game Theory provides a solution, known as the Shapley Value. There are other proposed solutions, but the Shapley Value is well accepted due to its desirable properties such as “symmetry” and “additivity”. Instead of going into the theory, I’ll just show you how it works. The idea is, we take a sequence of players, and consider the marginal contribution of each player to the total value as he or she joins the coalition in that sequence. Each player is given an allocation equal to his or her average marginal contribution over all possible sequences.
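To make the averaging concrete, here is a small brute-force sketch (my own illustration, not from the original post) that computes Shapley values for the game v(S) = (∑ mᵢ)² by enumerating every order in which players can join the coalition:

```python
from itertools import permutations

def shapley_values(masses):
    """Brute-force Shapley values for the coalition game v(S) = (sum of masses in S)**2.

    Each player's allocation is his or her marginal contribution to v,
    averaged over all possible join orders.
    """
    n = len(masses)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        coalition_mass = 0.0
        for i in order:
            before = coalition_mass ** 2
            coalition_mass += masses[i]
            totals[i] += coalition_mass ** 2 - before  # marginal contribution
    return [t / len(orders) for t in totals]

print(shapley_values([1, 2, 3]))
```

For this particular quadratic game the brute-force answer collapses to a closed form one can verify by hand: player i's marginal contribution on joining a coalition of mass M is 2Mmᵢ + mᵢ², and averaging over orders gives φᵢ = mᵢ · ∑ⱼ mⱼ, so the allocations above come out to [6, 12, 18], which indeed sum to the total value 6² = 36.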
Two-Tier Rationalism
Related to: Bayesians vs. Barbarians
Consequentialism1 is a catchall term for a vast number of specific ethical theories, the common thread of which is that they take goodness (usually of a state of affairs) to be the determining factor of rightness (usually of an action). One family of consequentialisms that came to mind when it was suggested that I post about my Weird Forms of Utilitarianism class is called "Two-Tier Consequentialism", which I think can be made to connect interestingly to our rationalism goals on Less Wrong. Here's a summary of two-tier consequentialism2.
(Some form of) consequentialism is correct and yields the right answer about what people ought to do. But (this form of) consequentialism has many bad features:
- It is unimplementable (because to use it correctly requires more calculation than anyone has time to do based on more information than anyone has time to gather and use).
- It is "alienating" (because people trying to obey consequentialistic dictates find them very unlike the sorts of moral motivations they usually have, like "I want to be a nice person" or "so-and-so is my friend")3.
- It is "integrity-busting" (because it can force you to consider alternatives that are unthinkably horrifying, if there is the possibility that they might lead to the "best" consequences).
- It is "virtue-busting" (because it too often requires a deviation from a pattern of behavior that we consider to be an expression of good personal qualities that we would naturally hope and expect from good people).
- It is prone to self-serving abuse (because it's easy, when calculating utilities, to "cook the books" and wind up with the outcome you already wanted being the "best" outcome).
- It is "cooperation-busting" (because individuals don't tend to have an incentive to avoid free-riding when their own participation in a cooperative activity will neither make nor break the collective good).
To solve these problems, some consequentialist ethicists (my class focused on Railton and Hare) invented "two-tier consequentialism". The basic idea is that because of all these bad features of (pick your favorite kind of) consequentialism, being a consequentialist has bad consequences, and therefore you shouldn't do it. Instead, you should layer on top of your consequentialist thinking a second tier of moral principles called your "Practically Ideal Moral Code", which ought to have the following more convenient properties:
Bayesians vs. Barbarians
Previously in series: Collective Apathy and the Internet
Followup to: Helpless Individuals
Let's say we have two groups of soldiers. In group 1, the privates are ignorant of tactics and strategy; only the sergeants know anything about tactics and only the officers know anything about strategy. In group 2, everyone at all levels knows all about tactics and strategy.
Should we expect group 1 to defeat group 2, because group 1 will follow orders, while everyone in group 2 comes up with better ideas than whatever orders they were given?
In this case I have to question how much group 2 really understands about military theory, because it is an elementary proposition that an uncoordinated mob gets slaughtered.
Suppose that a country of rationalists is attacked by a country of Evil Barbarians who know nothing of probability theory or decision theory.
Now there's a certain viewpoint on "rationality" or "rationalism" which would say something like this:
"Obviously, the rationalists will lose. The Barbarians believe in an afterlife where they'll be rewarded for courage; so they'll throw themselves into battle without hesitation or remorse. Thanks to their affective death spirals around their Cause and Great Leader Bob, their warriors will obey orders, and their citizens at home will produce enthusiastically and at full capacity for the war; anyone caught skimming or holding back will be burned at the stake in accordance with Barbarian tradition. They'll believe in each other's goodness and hate the enemy more strongly than any sane person would, binding themselves into a tight group. Meanwhile, the rationalists will realize that there's no conceivable reward to be had from dying in battle; they'll wish that others would fight, but not want to fight themselves. Even if they can find soldiers, their civilians won't be as cooperative: So long as any one sausage almost certainly doesn't lead to the collapse of the war effort, they'll want to keep that sausage for themselves, and so not contribute as much as they could. No matter how refined, elegant, civilized, productive, and nonviolent their culture was to start with, they won't be able to resist the Barbarian invasion; sane discussion is no match for a frothing lunatic armed with a gun. In the end, the Barbarians will win because they want to fight, they want to hurt the rationalists, they want to conquer and their whole society is united around conquest; they care about that more than any sane person would."
Maybe theism is wrong
(This is meant as an entirely rewritten version of the original post. It is still long, but hopefully clearer.)
Theism is often bashed. Part of that bashing is gratuitous and undeserved. Some people therefore feel compelled to defend theism. Their defence of theism goes further than just putting the record straight though. It attempts to show how theism can be a good thing, or right. That is probably going too far.
I would argue several points. And for that I will be using the most idealistic vision of religion I can conjure, keeping in mind that real world examples may not be as utopian. My intended conclusion is that fairness and tolerance are a necessary and humane means to the end of helping people, which cannot, however, be used to justify as right something that is ultimately wrong.
Theism is indeed a good thing in the short and mid term, both for individuals and society, as it holds certain benefits: helping people stick together in close-knit communities, helping people live a more virtuous life by giving themselves incentives to do so, and helping them feel better when life feels unbearable or meaningless.
Another point is that theism also possesses deep similarities with science, and uses optimally rational arguments and induction. Optimally, that is, insofar as the premises of theism allow; those premises (what we could call its priors) are, in Christianity for instance, to be found in the Bible.
Finally, I also wanted to draw on further similarities between religion and secular groups of people. Atheism, humanism, transhumanism, even rationalism as we know it on LW. These similarities lie in the objectives which any of those groups honestly strives to attain. Those goals are, for instance, truth, the welfare of human beings, and their betterment.
Within the world view of each of those groups, each is indeed doing its best to achieve those ends. One of Catholicism's final beacons, used to guide people's life path, can roughly be said to be "what action should I take that will make me more able to love others, and myself?", for instance. This involves understanding and following the word of God, as love and morality are understood to emanate from that source.
And so the Bible is supposed to hold those absolute truths, not so much in a straightforward, explained way, but rather in the same way that the observable universe is supposed to hold absolute truth for secular science. And just as it is possible, in the scientific model, to misconstrue observational and experimental data and build flawed theories from them, so is it possible for a Christian to misunderstand the data presented in the Bible. Rational edifices of thought have therefore been built to derive humanly understandable, cross-checked (inside that edifice), usable-on-a-daily-basis truth from the Bible.
That is about as far as the similarities go, in purity of purpose, intellectual honesty, and fit with the real world. The premise of theism itself is flawed. Theism presupposes the supernatural. Therefore the priors of theism do not correspond to the real state of the universe as we observe it, and this implies two main consequences.
The first is that an intellectual edifice based upon flawed premises, no matter how carefully crafted, will still be flawed itself.
The second runs deeper: the premises of theism are in part incompatible with rationality itself, and hence limit the potential use of rational methods. In other words, some methods of rationality, as well as some particular arguments, are forbidden, or unknown, to what we could tentatively call religious science.
From that, my first conclusion is that theism is wrong. Epistemically wrong, but also doing itself a disservice, as the goals it has set for itself cannot be completed through its program. This program will not be able to hit its targets in optimization space, because of that epistemic flaw. Even though theism possesses short- and mid-term advantages, its whole edifice makes it a dead end, which will at the very least slow down humanity's progress towards nobler objectives like truth or betterment, if not render that progress outright impossible past a certain point.
Yet it seems to me that this mistaken edifice isn't totally insane, far from it, at least at its roots. Hence it should be possible to heal it, or at least to help the people who are part of it.
But religion cannot honestly be called right, no matter how deeply that idea is rooted in our culture and collective consciousness. In the long term, theism deprives us of our potential; it builds a virtual, unnecessary cage around us.
To conclude on that, I wanted to point out that religious belief appears to be a human universal, and probably a hard-coded part of human nature. It seems fair to recognize it in ourselves, if we have that tendency. I know I do, for instance, and fairly strongly so. The same goes for belief in the supernatural.
This should be part of a more general mental discipline of admitting to our faults and biases, rather than trying to hide and make up for them. The only way to dissect and correct those faults is to first thoroughly observe them in our reasoning, even publicly. In a community of rationalists, there should be no question that even the most flawed, irrational of us should be treated as a friend in need of help, if he so desires, and if we have enough resources to provide for his needs. The important thing is to have someone with a willingness to learn and grow past his mistakes. This can indeed be made easier if we are supportive of each other, and unconditionally tolerant.
Yet at the same time, even for that purpose, we can't yield to falseness. We can and must admit, for instance, that religion has good points, that we may not have a licence to change people against their will, and that if people want to be helped, they should feel relaxed about explaining all the relevant information about what they perceive to be their problem. We can't go as far as saying that such a flaw, or problem, is in itself alright, though.
Newcomb's Problem vs. One-Shot Prisoner's Dilemma
Continuation of: http://lesswrong.com/lw/7/kinnairds_truels/i7#comments
Eliezer has convinced me to one-box Newcomb's problem, but I'm not ready to Cooperate in one-shot PD yet. In http://www.overcomingbias.com/2008/09/iterated-tpd.html?cid=129270958#comment-129270958, Eliezer wrote:
PDF, on the 100th [i.e. final] move of the iterated dilemma, I cooperate if and only if I expect the paperclipper to cooperate if and only if I cooperate, that is:
Eliezer.C <=> (Paperclipper.C <=> Eliezer.C)
The problem is, the paperclipper would like to deceive Eliezer into believing that Paperclipper.C <=> Eliezer.C, while actually playing D. This means Eliezer has to expend resources to verify that Paperclipper.C <=> Eliezer.C really is true with high probability. If the potential gain from cooperation in a one-shot PD is less than this cost, then cooperation isn't possible. In Newcomb's Problem, the analogous issue can be assumed away, by stipulating that Omega will see through any deception. But in the standard game theory analysis of one-shot PD, the opposite assumption is made, namely that it's impossible or prohibitively costly for players to convince each other that Player1.C <=> Player2.C.
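As a side note (my own sanity check, not part of the original exchange), the condition Eliezer.C <=> (Paperclipper.C <=> Eliezer.C) can be enumerated mechanically. Read as a bare material biconditional, it turns out to hold exactly when Paperclipper.C is true, regardless of Eliezer's move; the real content of the condition lies in the counterfactual dependence, which material logic doesn't capture:

```python
def iff(a, b):
    """Material biconditional: true when both sides agree."""
    return a == b

# Enumerate all truth assignments for Eliezer.C and Paperclipper.C
for e in (True, False):
    for p in (True, False):
        holds = iff(e, iff(p, e))
        print(f"Eliezer.C={e}, Paperclipper.C={p} -> condition holds: {holds}")
```

The printout shows `holds` tracking `Paperclipper.C` in every row, which is why verification of the dependence, rather than the truth table, is where the cost lies.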
It seems likely that this assumption is false, at least for some types of agents and sufficiently high gains from cooperation. In http://www.nabble.com/-sl4--prove-your-source-code-td18454831.html, I asked how superintelligences can prove their source code to each other, and Tim Freeman responded with this suggestion:
Entity A could prove to entity B that it has source code S by consenting to be replaced by a new entity A' that was constructed by a manufacturing process jointly monitored by A and B. During this process, both A and B observe that A' is constructed to run source code S. After A' is constructed, A shuts down and gives all of its resources to A'.
But this process seems quite expensive, so even SIs may not be able to play Cooperate in one-shot PD, unless the stakes are pretty high. Are there cheaper solutions, perhaps ones that can be applied to humans as well, for players in one-shot PD to convince each other what decision systems they are using?
On a related note, Eliezer has claimed that a truly one-shot PD is very rare in real life. I would agree with this, except that the same issue also arises from indefinitely repeated games where the probability of the game ending after the current round is too high, or the time discount factor is too low, for a tit-for-tat strategy to work.
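That threshold can be made precise with standard repeated-game algebra (a sketch of my own, not from the original post). With PD payoffs T > R > P (temptation, reward, punishment), sustained mutual cooperation pays R/(1-d) per player at continuation probability d, while defecting once and then suffering mutual defection pays T + dP/(1-d); cooperation is sustainable only when the first is at least the second, i.e. d ≥ (T-R)/(T-P):

```python
def min_continuation_prob(T, R, P):
    """Smallest continuation probability (or discount factor) d at which
    tit-for-tat-style cooperation beats a one-shot defection followed by
    permanent mutual defection, given PD payoffs T > R > P.

    Derived from R/(1-d) >= T + d*P/(1-d), which rearranges to
    d >= (T - R) / (T - P)."""
    return (T - R) / (T - P)

# With the textbook payoffs T=5, R=3, P=1, cooperation needs at least a
# 50% chance that the game continues after each round:
print(min_continuation_prob(5, 3, 1))
```

Below that threshold, even two agents who would happily cooperate in an indefinitely repeated game face what is effectively the one-shot problem.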
Selecting Rationalist Groups
Previously in series: Purchase Fuzzies and Utilons Separately
Followup to: Conjuring an Evolution To Serve You
GreyThumb.blog offered an interesting comparison of poor animal breeding practices and the fall of Enron, which I previously posted on in some detail. The essential theme was that individual selection on chickens for the chicken in each generation who laid the most eggs, produced highly competitive chickens—the most dominant chickens that pecked their way to the top of the pecking order at the expense of other chickens. The chickens subjected to this individual selection for egg-laying prowess needed their beaks clipped, or housing in individual cages, or they would peck each other to death.
Which is to say: individual selection is selecting on the wrong criterion, because what the farmer actually wants is high egg production from groups of chickens.
While group selection is nearly impossible in ordinary biology, it is easy to impose in the laboratory: and breeding the best groups, rather than the best individuals, increased average days of hen survival from 160 to 348, and egg mass per bird from 5.3 to 13.3 kg.
The analogy being to the way that Enron evaluated its employees every year, fired the bottom 10%, and gave the top individual performers huge raises and bonuses. Jeff Skilling fancied himself as exploiting the wondrous power of evolution, it seems.
If you look over my accumulated essays, you will observe that the art contained therein is almost entirely individual in nature... for around the same reason that it all focuses on confronting impossibly tricky questions: That's what I was doing when I thought up all this stuff, and for the most part I worked in solitude. But this is not inherent in the Art, not reflective of what a true martial art of rationality would be like if many people had contributed to its development along many facets.
Case in point: At the recent LW / OB meetup, we played Paranoid Debating, a game that tests group rationality. As is only appropriate, this game was not the invention of any single person, but was collectively thought up in a series of suggestions by Nick Bostrom, Black Belt Bayesian, Tom McCabe, and steven0461.
Rationality: Common Interest of Many Causes
Previously in series: Church vs. Taskforce
It is a not-so-hidden agenda of this site, Less Wrong, that there are many causes which benefit from the spread of rationality—because it takes a little more rationality than usual to see their case, as a supporter, or even just a supportive bystander. Not just the obvious causes like atheism, but things like marijuana legalization—where you could wish that people were a bit more self-aware about their motives and the nature of signaling, and a bit more moved by inconvenient cold facts. The Institute Which May Not Be Named was merely an unusually extreme case of this, wherein it got to the point that after years of bogging down I threw up my hands and explicitly recursed on the job of creating rationalists.
But of course, not all the rationalists I create will be interested in my own project—and that's fine. You can't capture all the value you create, and trying can have poor side effects.
If the supporters of other causes are enlightened enough to think similarly...
Then all the causes which benefit from spreading rationality, can, perhaps, have something in the way of standardized material to which to point their supporters—a common task, centralized to save effort—and think of themselves as spreading a little rationality on the side. They won't capture all the value they create. And that's fine. They'll capture some of the value others create. Atheism has very little to do directly with marijuana legalization, but if both atheists and anti-Prohibitionists are willing to step back a bit and say a bit about the general, abstract principle of confronting a discomforting truth that interferes with a fine righteous tirade, then both atheism and marijuana legalization pick up some of the benefit from both efforts.
But this requires—I know I'm repeating myself here, but it's important—that you be willing not to capture all the value you create. It requires that, in the course of talking about rationality, you maintain an ability to temporarily shut up about your own cause even though it is the best cause ever. It requires that you don't regard those other causes, and they do not regard you, as competing for a limited supply of rationalists with a limited capacity for support; but, rather, creating more rationalists and increasing their capacity for support. You only reap some of your own efforts, but you reap some of others' efforts as well.
If you and they don't agree on everything—especially priorities—you have to be willing to agree to shut up about the disagreement. (Except possibly in specialized venues, out of the way of the mainstream discourse, where such disagreements are explicitly prosecuted.)
Your Price for Joining
Previously in series: Why Our Kind Can't Cooperate
In the Ultimatum Game, the first player chooses how to split $10 between themselves and the second player, and the second player decides whether to accept the split or reject it—in the latter case, both parties get nothing. So far as conventional causal decision theory goes (two-box on Newcomb's Problem, defect in Prisoner's Dilemma), the second player should prefer any non-zero amount to nothing. But if the first player expects this behavior—accept any non-zero offer—then they have no motive to offer more than a penny. As I assume you all know by now, I am no fan of conventional causal decision theory. Those of us who remain interested in cooperating on the Prisoner's Dilemma, either because it's iterated, or because we have a term in our utility function for fairness, or because we use an unconventional decision theory, may also not accept an offer of one penny.
And in fact, most Ultimatum "deciders" offer an even split; and most Ultimatum "accepters" reject any offer less than 20%. A 100 USD game played in Indonesia (average per capita income at the time: 670 USD) showed offers of 30 USD being turned down, although this equates to two weeks' wages. We can probably also assume that the players in Indonesia were not thinking about the academic debate over Newcomblike problems—this is just the way people feel about Ultimatum Games, even ones played for real money.
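The gap between the conventional prediction and observed play can be captured in a toy payoff function (a sketch of my own; the threshold numbers are illustrative, not taken from the studies above):

```python
def ultimatum_payoffs(offer, accept_threshold, pot=10):
    """Payoffs (proposer, responder) for a one-shot Ultimatum Game where
    the responder rejects any offer below accept_threshold.
    On rejection, both parties get nothing."""
    if offer >= accept_threshold:
        return pot - offer, offer
    return 0, 0

# The conventional causal-decision-theory responder accepts any positive
# amount, so the proposer's best response is the smallest positive offer:
print(ultimatum_payoffs(1, accept_threshold=1))   # proposer keeps almost everything

# A responder credibly committed to rejecting unfair splits changes the
# proposer's incentives, even at a cost to themselves:
print(ultimatum_payoffs(1, accept_threshold=2))   # both get nothing
print(ultimatum_payoffs(2, accept_threshold=2))   # the fairer offer goes through
```

The point of the empirical results is that real accepters behave like the second responder: the "irrational" commitment to reject is exactly what makes decent offers the proposer's best move.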
There's an analogue of the Ultimatum Game in group coordination. (Has it been studied? I'd hope so...) Let's say there's a common project—in fact, let's say that it's an altruistic common project, aimed at helping mugging victims in Canada, or something. If you join this group project, you'll get more done than you could on your own, relative to your utility function. So, obviously, you should join.
But wait! The anti-mugging project keeps their funds invested in a money market fund! That's ridiculous; it won't earn even as much interest as US Treasuries, let alone a dividend-paying index fund.
Clearly, this project is run by morons, and you shouldn't join until they change their malinvesting ways.
Now you might realize—if you stopped to think about it—that all things considered, you would still do better by working with the common anti-mugging project, than striking out on your own to fight crime. But then—you might perhaps also realize—if you too easily assent to joining the group, why, what motive would they have to change their malinvesting ways?
Well... Okay, look. Possibly because we're out of the ancestral environment where everyone knows everyone else... and possibly because the nonconformist crowd tries to repudiate normal group-cohering forces like conformity and leader-worship...
...It seems to me that people in the atheist/libertarian/technophile/sf-fan/etcetera cluster often set their joining prices way way way too high. Like a 50-way split Ultimatum game, where every one of 50 players demands at least 20% of the money.