Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
I'm proud to announce the preprint of Robust Cooperation in the Prisoner's Dilemma: Program Equilibrium via Provability Logic, a joint paper with Mihaly Barasz, Paul Christiano, Benja Fallenstein, Marcello Herreshoff, Patrick LaVictoire (me), and Eliezer Yudkowsky.
This paper was one of three projects to come out of the 2nd MIRI Workshop on Probability and Reflection in April 2013, and had its genesis in ideas about formalizations of decision theory that have appeared on LessWrong. (At the end of this post, I'll include links for further reading.)
Below, I'll briefly outline the problem we considered, the results we proved, and the (many) open questions that remain. Thanks in advance for your thoughts and suggestions!
Background: Writing programs to play the PD with source code swap
(If you're not familiar with the Prisoner's Dilemma, see here.)
The paper concerns the following setup, which has come up in academic research on game theory: say that you have the chance to write a computer program X, which takes in one input and returns either Cooperate or Defect. This program will face off against some other computer program Y, but with a twist: X will receive the source code of Y as input, and Y will receive the source code of X as input. And you will be given your program's winnings, so you should think carefully about what sort of program you'd write!
Of course, you could simply write a program that defects regardless of its input; we call this program DefectBot, and call the program that cooperates on all inputs CooperateBot. But with the wealth of information afforded by the setup, you might wonder if there's some program that might be able to achieve mutual cooperation in situations where DefectBot achieves mutual defection, without thereby risking a sucker's payoff. (Douglas Hofstadter would call this a perfect opportunity for superrationality...)
Previously known: CliqueBot and FairBot
And indeed, there's a way to do this that's been known since at least the 1980s. You can write a computer program that knows its own source code, compares it to the input, and returns C if and only if the two are identical (and D otherwise). Thus it achieves mutual cooperation in one important case where it intuitively ought to: when playing against itself! We call this program CliqueBot, since it cooperates only with the "clique" of agents identical to itself.
There's one particularly irksome issue with CliqueBot, and that's the fragility of its cooperation. If two people write functionally analogous but syntactically different versions of it, those programs will defect against one another! This problem can be patched somewhat, but not fully fixed. Moreover, mutual cooperation might be the best strategy against some agents that are not even functionally identical, and extending this approach requires you to explicitly delineate the list of programs that you're willing to cooperate with. Is there a more flexible and robust kind of program you could write instead?
As it turns out, there is: in a 2010 post on LessWrong, cousin_it introduced an algorithm that we now call FairBot. Given the source code of Y, FairBot searches for a proof (of less than some large fixed length) that Y returns C when given the source code of FairBot, and then returns C if and only if it discovers such a proof (otherwise it returns D). Clearly, if our proof system is consistent, FairBot only cooperates when that cooperation will be mutual. But the really fascinating thing is what happens when you play two versions of FairBot against each other. Intuitively, it seems that either mutual cooperation or mutual defection would be stable outcomes, but it turns out that if their limits on proof lengths are sufficiently high, they will achieve mutual cooperation!
The proof that they mutually cooperate follows from a bounded version of Löb's Theorem from mathematical logic. (If you're not familiar with this result, you might enjoy Eliezer's Cartoon Guide to Löb's Theorem, which is a correct formal proof written in much more intuitive notation.) Essentially, the asymmetry comes from the fact that both programs are searching for the same outcome, so that a short proof that one of them cooperates leads to a short proof that the other cooperates, and vice versa. (The opposite is not true, because the formal system can't know it won't find a contradiction. This is a subtle but essential feature of mathematical logic!)
Generalization: Modal Agents
Unfortunately, FairBot isn't what I'd consider an ideal program to write: it happily cooperates with CooperateBot, when it could do better by defecting. This is problematic because in real life, the world isn't separated into agents and non-agents, and any natural phenomenon that doesn't predict your actions can be thought of as a CooperateBot (or a DefectBot). You don't want your agent to be making concessions to rocks that happened not to fall on them. (There's an important caveat: some things have utility functions that you care about, but don't have sufficient ability to predicate their actions on yours. In that case, though, it wouldn't be a true Prisoner's Dilemma if your values actually prefer the outcome (C,C) to (D,C).)
However, FairBot belongs to a promising class of algorithms: those that decide on their action by looking for short proofs of logical statements that concern their opponent's actions. In fact, there's a really convenient mathematical structure that's analogous to the class of such algorithms: the modal logic of provability (known as GL, for Gödel-Löb).
So that's the subject of this preprint: what can we achieve in decision theory by considering agents defined by formulas of provability logic?
For the first time in history, it has become possible for a limited group of a few thousand people to threaten the absolute destruction of millions.
-- Norbert Wiener (1956), Moral Reflections of a Mathematician.
Today, the general attitude towards scientific discovery is that scientists are not themselves responsible for how their work is used. For someone who is interested in science for its own sake, or even for someone who mostly considers research to be a way to pay the bills, this is a tempting attitude. It would be easy to only focus on one’s work, and leave it up to others to decide what to do with it.
But this is not necessarily the attitude that we should encourage. As technology becomes more powerful, it also becomes more dangerous. Throughout history, many scientists and inventors have recognized this, and taken different kinds of action to help ensure that their work will have beneficial consequences. Here are some of them.
This post is not arguing that any specific approach for taking responsibility for one's actions is the correct one. Some researchers hid their work, others refocused on other fields, still others began active campaigns to change the way their work was being used. It is up to the reader to decide which of these approaches were successful and worth emulating, and which ones were not.
… I do not publish nor divulge [methods of building submarines] by reason of the evil nature of men who would use them as means of destruction at the bottom of the sea, by sending ships to the bottom, and sinking them together with the men in them.
People did not always think that the benefits of freely disseminating knowledge outweighed the harms. O.T. Benfey, writing in a 1956 issue of the Bulletin of the Atomic Scientists, cites F.S. Taylor’s book on early alchemists:
Alchemy was certainly intended to be useful .... But [the alchemist] never proposes the public use of such things, the disclosing of his knowledge for the benefit of man. …. Any disclosure of the alchemical secret was felt to be profoundly wrong, and likely to bring immediate punishment from on high. The reason generally given for such secrecy was the probable abuse by wicked men of the power that the alchemical would give …. The alchemists, indeed, felt a strong moral responsibility that is not always acknowledged by the scientists of today.
With the Renaissance, science began to be viewed as public property, but many scientists remained cautious about the way in which their work might be used. Although he held the office of military engineer, Leonardo da Vinci (1452-1519) drew a distinction between offensive and defensive warfare, and emphasized the role of good defenses in protecting people’s liberty from tyrants. He described war as ‘bestialissima pazzia’ (most bestial madness), and wrote that ‘it is an infinitely atrocious thing to take away the life of a man’. One of the clearest examples of his reluctance to unleash dangerous inventions was his refusal to publish the details of his plans for submarines.
Later Renaissance thinkers continued to be concerned with the potential uses of their discoveries. John Napier (1550-1617), the inventor of logarithms, also experimented with a new form of artillery. Upon seeing its destructive power, he decided to keep its details a secret, and even spoke from his deathbed against the creation of new kinds of weapons.
But only concealing one discovery pales in comparison to the likes of Robert Boyle (1627-1691). A pioneer of physics and chemistry and possibly the most famous for describing and publishing Boyle’s law, he sought to make humanity better off, taking an interest in things such as improved agricultural methods as well as better medicine. In his studies, he also discovered knowledge and made inventions related to a variety of potentially harmful subjects, including poisons, invisible ink, counterfeit money, explosives, and kinetic weaponry. These ‘my love of Mankind has oblig’d me to conceal, even from my nearest Friends’.
Recent renewed discussions of the parapsychology literature and Daryl Bem's recent precognition article brought to mind the "market test" of claims of precognition. Bem tells us that random undergraduate students were able to predict with 53% accuracy where an erotic image would appear in the future. If this effect was actually real, I would rerun the experiment before corporate earnings announcements, central bank interest rate changes, etc, and change the images based on the reaction of stocks and bonds to the announcements. In other words, I could easily convert "porn precognition" into "hedge fund trillionaire precognition."
If I was initially lacking in the capital to do trades, I could publish my predictions online using public key cryptography and amass an impressive track record before recruiting investors. If anti-psi prejudice was a problem, no one need know how I was making my predictions. Similar setups could exploit other effects claimed in the parapsychology literature (e.g. the remote viewing of the Scientologist-founded Stargate Project of the U.S. federal government). Those who assign a lot of credence to psi may want to actually try this, but for me this is an invitation to use parapsychology as control group for science, and to ponder a general heuristic for crudely estimating the soundness of academic fields for outsiders.
One reason we trust that physicists and chemists have some understanding of their subjects is that they produce valuable technological spinoffs with concrete and measurable economic benefit. In practice, I often make use of the spinoff heuristic: If an unfamiliar field has the sort of knowledge it claims, what commercial spinoffs and concrete results ought it to be producing? Do such spinoffs exist? What are the explanations for their absence?
For psychology, I might cite systematic desensitization of specific phobias such as fear of spiders, cognitive-behavioral therapy, and military use of IQ tests (with large measurable changes in accident rates, training costs, etc). In financial economics, I would raise the hundreds of billions of dollars invested in index funds, founded in response to academic research, and their outperformance relative to managed funds. Auction theory powers tens of billions of dollars of wireless spectrum auctions, not to mention evil dollar-auction sites.
This seems like a great task for crowdsourcing: the cloud of LessWrongers has broad knowledge, and sorting real science from cargo cult science is core to being Less Wrong. So I ask you, Less Wrongers, for your examples of practical spinoffs (or suspicious absences thereof) of sometimes-denigrated fields in the comments. Macroeconomics, personality psychology, physical anthropology, education research, gene-association studies, nutrition research, wherever you have knowledge to share.
ETA: This academic claims to be trying to use the Bem methods to predict roulette wheels, and to have passed statistical significance tests on his first runs. Such claims have been made for casinos in the past, but always trailed away in failures to replicate, repeat, or make actual money. I expect the same to happen here.
Luke tasked me with researching the following question
I‘d like to know if anybody has come up with a good response to any of the objections to ’full information’ or ‘ideal preference’ theories of value given in Sobel (1994). (My impression is “no.”)
The paper in question is David Sobel’s 1994 paper “Full Information Accounts of Well-Being” (Ethics 104, no. 4: 784–810) (his 1999 paper, “Do the desires of rational agents converge?”, is directed against a different kind of convergence and won’t be discussed here).
The starting point is Brandt’s 1979 book where he describes his version of a utilitarianism in which utility is the degree of satisfaction of the desires of one’s ideal ‘fully informed’ self, and Sobel also refers to the 1986 Railton apologetic. (LWers will note that this kind of utilitarianism sounds very similar to CEV and hence, any criticism of the former may be a valid criticism of the latter.) I’ll steal entirely the opening to Mark C Murphy’s 1999 paper, “The Simple Desire-Fulfillment Theory” (rejecting any hypotheticals or counterfactuals in desire utilitarianism), since he covers all the bases (for even broader background, see the Tanner Lecture “The Status of Well-Being”):
(This post is an expanded version of a LW comment I left a while ago. I have found myself referring to it so much in the meantime that I think it’s worth reworking into a proper post. Some related posts are "The Correct Contrarian Cluster" and "What is Bunk?")
When looking for information about some area outside of one’s expertise, it is usually a good idea to first ask what academic scholarship has to say on the subject. In many areas, there is no need to look elsewhere for answers: respectable academic authors are the richest and most reliable source of information, and people claiming things completely outside the academic mainstream are almost certain to be crackpots.
The trouble is, this is not always the case. Even those whose view of the modern academia is much rosier than mine should agree that it would be astonishing if there didn’t exist at least some areas where the academic mainstream is detached from reality on important issues, while much more accurate views are scorned as kooky (or would be if they were heard at all). Therefore, depending on the area, the fact that a view is way out of the academic mainstream may imply that it's bunk with near-certainty, but it may also tell us nothing if the mainstream standards in the area are especially bad.
I will discuss some heuristics that, in my experience, provide a realistic first estimate of how sound the academic mainstream in a given field is likely to be, and how justified one would be to dismiss contrarians out of hand. These conclusions have come from my own observations of research literature in various fields and some personal experience with the way modern academia operates, and I would be interested in reading others’ opinions.
— Eliezer Yudkowsky, Frequentist Statistics are Frequently Subjective
Imagine if, way back at the start of the scientific enterprise, someone had said, "What we really need is a control group for science - people who will behave exactly like scientists, doing experiments, publishing journals, and so on, but whose field of study is completely empty: one in which the null hypothesis is always true.
"That way, we'll be able to gauge the effect of publication bias, experimental error, misuse of statistics, data fraud, and so on, which will help us understand how serious such problems are in the real scientific literature."
Isn't that a great idea?
By an accident of historical chance, we actually have exactly such a control group, namely parapsychologists: people who study extra-sensory perception, telepathy, precognition, and so on.
Andrew Gelman recently responded to a commenter on the Yudkowsky/Gelman diavlog; the commenter complained that Bayesian statistics were too subjective and lacked rigor. I shall explain why this is unbelievably ironic, but first, the comment itself:
However, the fundamental belief of the Bayesian interpretation, that all probabilities are subjective, is problematic -- for its lack of rigor... One of the features of frequentist statistics is the ease of testability. Consider a binomial variable, like the flip of a fair coin. I can calculate that the probability of getting seven heads in ten flips is 11.71875%... At some point a departure from the predicted value may appear, and frequentist statistics give objective confidence intervals that can precisely quantify the degree to which the coin departs from fairness...
Gelman's first response is "Bayesian probabilities don't have to be subjective." Not sure I can back him on that; probability is ignorance and ignorance is a state of mind (although indeed, some Bayesian probabilities can correspond very directly to observable frequencies in repeatable experiments).
My own response is that frequentist statistics are far more subjective than Bayesian likelihood ratios. Exhibit One is the notion of "statistical significance" (which is what the above comment is actually talking about, although "confidence intervals" have almost the same problem). Steven Goodman offers a nicely illustrated example: Suppose we have at hand a coin, which may be fair (the "null hypothesis") or perhaps biased in some direction. So lo and behold, I flip the coin six times, and I get the result TTTTTH. Is this result statistically significant, and if so, what is the p-value - that is, the probability of obtaining a result at least this extreme?
Well, that depends. Was I planning to flip the coin six times, and count the number of tails? Or was I planning to flip the coin until it came up heads, and count the number of trials? In the first case, the probability of getting "five tails or more" from a fair coin is 11%, while in the second case, the probability of a fair coin requiring "at least five tails before seeing one heads" is 3%.
Whereas a Bayesian looks at the experimental result and says, "I can now calculate the likelihood ratio (evidential flow) between all hypotheses under consideration. Since your state of mind doesn't affect the coin in any way - doesn't change the probability of a fair coin or biased coin producing this exact data - there's no way your private, unobservable state of mind can affect my interpretation of your experimental results."
There's a contrarian theory presented by Robin that people go to highly reputable schools, visit highly reputable hospitals, buy highly reputable brands etc. to affiliate with high status individuals and institutions.
But what would a person who completely didn't care about such affiliations do? Pretty much the same thing. Unless you know a lot about schools, hospitals, and everything else, you're better off simply following prestige as proxy for quality (in addition to price and all the other usual criteria). There's no denying that prestige is better indicator of quality than random chance - the question is - is it the best we can do?
It's possible to come up with alternative measures, which might correlate with quality too, like operation success rates for hospitals, graduation rates for schools etc. But if they really indicated quality that well, wouldn't they be simply included in institution's prestige, and lose their predictive status? The argument is highly analogous to one for efficient market hypothesis (or to some extent with Bayesian beauty contest with schools, as prestige might indicate quality of other students). Very often there are severe faults with alternative measures, like with operation success rates without correcting for patient demographics.
If you postulate that you have better indicator of quality than prestige, you need to do some explaining. Why is it not included in prestige already? I don't propose any magical thinking about prestige, but we shouldn't be as eager to throw it away completely as some seem to be.
You may be interested in a paper of medium age I just read. Testing ecological models: the meaning of validation (PDF) tackles a problem many of you are familiar with in a slightly different context.
To entice you to read it, here are some quotes from its descriptions of other papers:
Holling (1978) pronounced it a fable that the purpose of validation is to establish the truth of the model…
Overton (1977) viewed validation as an integral part of the modelling process…
Botkin (1993) expressed concern that the usage of the terms verification and validation was not consistent with their logical meanings…
Mankin et al. (1977) suggested that the objectives of model-building may be achieved without validating the model…
I have another reason for posting this; I’m looking for more papers on model validation, especially how-to papers. Which ones do you consider most helpful?
View more: Next