"Inductive Bias"

12 Eliezer_Yudkowsky 08 April 2007 07:52PM

(Part two in a series on "statistical bias", "inductive bias", and "cognitive bias".)

Suppose that you see a swan for the first time, and it is white.  It does not follow logically that the next swan you see must be white, but white seems like a better guess than any other color.  A machine learning algorithm of the more rigid sort, if it sees a single white swan, may thereafter predict that any swan seen will be white.  But this, of course, does not follow logically - though AIs of this sort are often misnamed "logical".  For a purely logical reasoner to label the next swan white as a deductive conclusion, it would need an additional assumption:  "All swans are the same color."  This is a wonderful assumption to make if all swans are, in reality, the same color; otherwise, not so good.  Tom Mitchell's Machine Learning defines the inductive bias of a machine learning algorithm as the assumptions that must be added to the observed data to transform the algorithm's outputs into logical deductions.

A more general view of inductive bias would identify it with a Bayesian's prior over sequences of observations...


Priors as Mathematical Objects

16 Eliezer_Yudkowsky 12 April 2007 03:24AM

Followup to:  "Inductive Bias"

What exactly is a "prior", as a mathematical object?  Suppose you're looking at an urn filled with red and white balls.  When you draw the very first ball, you haven't yet had a chance to gather much evidence, so you start out with a rather vague and fuzzy expectation of what might happen - you might say "fifty/fifty, even odds" for the chance of getting a red or white ball.  But you're ready to revise that estimate for future balls as soon as you've drawn a few samples.  So then this initial probability estimate, 0.5, is not, repeat not, a "prior".

An introduction to Bayes's Rule for confused students might refer to the population frequency of breast cancer as the "prior probability of breast cancer", and the revised probability after a mammography as the "posterior probability". But in the scriptures of Deep Bayesianism, such as Probability Theory: The Logic of Science, one finds a quite different concept - that of prior information, which includes e.g. our beliefs about the sensitivity and specificity of mammography exams. Our belief about the population frequency of breast cancer is only one small element of our prior information.
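To make the urn example concrete, here is a minimal sketch, not the post's own construction. It assumes draws with replacement and two invented priors over the urn's red fraction (`fixed_prior` and `open_prior` are hypothetical names). Both give 0.5 for the first ball, yet only one of them is ready to revise that estimate after a few samples, which is why the number 0.5 by itself does not pin down the prior.

```python
from fractions import Fraction

def predictive_red(prior, draws):
    """Probability that the next ball is red.

    prior: dict mapping a hypothesized red fraction to its probability.
    draws: string of 'R'/'W' balls already seen (assumed drawn with replacement).
    """
    posterior = dict(prior)
    for ball in draws:
        # Weight each hypothesis by how well it predicted this ball.
        for frac in posterior:
            likelihood = frac if ball == "R" else 1 - frac
            posterior[frac] *= likelihood
        # Renormalize so the weights sum to 1 again.
        total = sum(posterior.values())
        posterior = {f: p / total for f, p in posterior.items()}
    return sum(f * p for f, p in posterior.items())

# Assumption: certain that exactly half the balls are red.
fixed_prior = {Fraction(1, 2): Fraction(1)}

# Assumption: five possible compositions, all equally likely beforehand.
open_prior = {Fraction(k, 4): Fraction(1, 5) for k in range(5)}

first_fixed = predictive_red(fixed_prior, "")
first_open = predictive_red(open_prior, "")
later_fixed = predictive_red(fixed_prior, "RRR")
later_open = predictive_red(open_prior, "RRR")
```

Under these assumptions both priors start at even odds, but after three red balls the fixed prior still says 1/2 while the open prior has moved well above it.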


Bayesian Judo

55 Eliezer_Yudkowsky 31 July 2007 05:53AM

You can have some fun with people whose anticipations get out of sync with what they believe they believe.

I was once at a dinner party, trying to explain to a man what I did for a living, when he said: "I don't believe Artificial Intelligence is possible because only God can make a soul."

At this point I must have been divinely inspired, because I instantly responded: "You mean if I can make an Artificial Intelligence, it proves your religion is false?"


Focus Your Uncertainty

25 Eliezer_Yudkowsky 05 August 2007 08:49PM

Will bond yields go up, or down, or remain the same? If you're a TV pundit and your job is to explain the outcome after the fact, then there's no reason to worry. No matter which of the three possibilities comes true, you'll be able to explain why the outcome perfectly fits your pet market theory. There's no reason to think of these three possibilities as somehow opposed to one another, as exclusive, because you'll get full marks for punditry no matter which outcome occurs.

But wait! Suppose you're a novice TV pundit, and you aren't experienced enough to make up plausible explanations on the spot. You need to prepare remarks in advance for tomorrow's broadcast, and you have limited time to prepare. In this case, it would be helpful to know which outcome will actually occur—whether bond yields will go up, down, or remain the same—because then you would only need to prepare one set of excuses.

Alas, no one can possibly foresee the future. What are you to do? You certainly can't use "probabilities". We all know from school that "probabilities" are little numbers that appear next to a word problem, and there aren't any little numbers here. Worse, you feel uncertain. You don't remember feeling uncertain while you were manipulating the little numbers in word problems. College classes teaching math are nice clean places, therefore math itself can't apply to life situations that aren't nice and clean.  You wouldn't want to inappropriately transfer thinking skills from one context to another.  Clearly, this is not a matter for "probabilities".


Your Strength as a Rationalist

45 Eliezer_Yudkowsky 11 August 2007 12:21AM

(The following happened to me in an IRC chatroom, long enough ago that I was still hanging around in IRC chatrooms.  Time has fuzzed the memory and my report may be imprecise.)

So there I was, in an IRC chatroom, when someone reports that a friend of his needs medical advice.  His friend says that he's been having sudden chest pains, so he called an ambulance, and the ambulance showed up, but the paramedics told him it was nothing, and left, and now the chest pains are getting worse.  What should his friend do?

I was confused by this story.  I remembered reading about homeless people in New York who would call ambulances just to be taken someplace warm, and how the paramedics always had to take them to the emergency room, even on the 27th iteration.  Because if they didn't, the ambulance company could be sued for lots and lots of money.  Likewise, emergency rooms are legally obligated to treat anyone, regardless of ability to pay.  (And the hospital absorbs the costs, which are enormous, so hospitals are closing their emergency rooms...  It makes you wonder what's the point of having economists if we're just going to ignore them.)  So I didn't quite understand how the described events could have happened.  Anyone reporting sudden chest pains should have been hauled off by an ambulance instantly.

And this is where I fell down as a rationalist.  I remembered several occasions where my doctor would completely fail to panic at the report of symptoms that seemed, to me, very alarming.  And the Medical Establishment was always right.  Every single time.  I had chest pains myself, at one point, and the doctor patiently explained to me that I was describing chest muscle pain, not a heart attack.  So I said into the IRC channel, "Well, if the paramedics told your friend it was nothing, it must really be nothing—they'd have hauled him off if there was the tiniest chance of serious trouble."

Thus I managed to explain the story within my existing model, though the fit still felt a little forced...


Absence of Evidence Is Evidence of Absence

39 Eliezer_Yudkowsky 12 August 2007 08:34PM

From Robyn Dawes's Rational Choice in an Uncertain World:

Post-hoc fitting of evidence to hypothesis was involved in a most grievous chapter in United States history: the internment of Japanese-Americans at the beginning of the Second World War.  When California governor Earl Warren testified before a congressional hearing in San Francisco on February 21, 1942, a questioner pointed out that there had been no sabotage or any other type of espionage by the Japanese-Americans up to that time.  Warren responded, "I take the view that this lack [of subversive activity] is the most ominous sign in our whole situation. It convinces me more than perhaps any other factor that the sabotage we are to get, the Fifth Column activities are to get, are timed just like Pearl Harbor was timed... I believe we are just being lulled into a false sense of security."

Consider Warren's argument from a Bayesian perspective.  When we see evidence, hypotheses that assigned a higher likelihood to that evidence gain probability at the expense of hypotheses that assigned a lower likelihood to the evidence.  This is a phenomenon of relative likelihoods and relative probabilities.  You can assign a high likelihood to the evidence and still lose probability mass to some other hypothesis, if that other hypothesis assigns a likelihood that is even higher.

Warren seems to be arguing that, given that we see no sabotage, this confirms that a Fifth Column exists.  You could argue that a Fifth Column might delay its sabotage.  But the likelihood is still higher that the absence of a Fifth Column would produce an absence of sabotage.
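A small numerical sketch of the relative-likelihood argument follows. Every number in it is made up for illustration; neither Dawes nor the hearing record supplies any. The only structural assumption is the one stated above: "no sabotage" is likely if a Fifth Column exists and is biding its time, but even more likely if no Fifth Column exists.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Bayes's Rule for a hypothesis H and its negation, given evidence E."""
    joint_h = prior_h * p_e_given_h
    joint_not_h = (1 - prior_h) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

# All three numbers are hypothetical.
prior_fifth_column = 0.10       # belief in a Fifth Column before looking
p_quiet_if_fifth_column = 0.60  # a Fifth Column might delay its sabotage
p_quiet_if_none = 0.99          # with no Fifth Column, quiet is nearly certain

updated = posterior(prior_fifth_column, p_quiet_if_fifth_column, p_quiet_if_none)
print(f"before: {prior_fifth_column:.3f}  after seeing no sabotage: {updated:.3f}")
```

With any such numbers, as long as the second likelihood is lower than the third, observing no sabotage lowers the probability of a Fifth Column, even though the Fifth Column hypothesis assigned that observation a fairly high likelihood.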


Conservation of Expected Evidence

42 Eliezer_Yudkowsky 13 August 2007 03:55PM

Followup to:  Absence of Evidence Is Evidence of Absence.

Friedrich Spee von Langenfeld, a priest who heard the confessions of condemned witches, wrote in 1631 the Cautio Criminalis ('prudence in criminal cases') in which he bitingly described the decision tree for condemning accused witches:  If the witch had led an evil and improper life, she was guilty; if she had led a good and proper life, this too was a proof, for witches dissemble and try to appear especially virtuous. After the woman was put in prison: if she was afraid, this proved her guilt; if she was not afraid, this proved her guilt, for witches characteristically pretend innocence and wear a bold front. Or on hearing of a denunciation of witchcraft against her, she might seek flight or remain; if she ran, that proved her guilt; if she remained, the devil had detained her so she could not get away.

Spee acted as confessor to many witches; he was thus in a position to observe every branch of the accusation tree, that no matter what the accused witch said or did, it was held a proof against her.  In any individual case, you would only hear one branch of the dilemma.  It is for this reason that scientists write down their experimental predictions in advance.

But you can't have it both ways—as a matter of probability theory, not mere fairness.  The rule that "absence of evidence is evidence of absence" is a special case of a more general law, which I would name Conservation of Expected Evidence:  The expectation of the posterior probability, after viewing the evidence, must equal the prior probability.

P(H) = P(H)
P(H) = P(H,E) + P(H,~E)

P(H) = P(H|E)*P(E) + P(H|~E)*P(~E)

Therefore, for every expectation of evidence, there is an equal and opposite expectation of counterevidence.
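The identity can be checked with exact arithmetic. This is only a sketch: the function name and the three input probabilities are arbitrary choices for illustration, and it assumes P(E) is strictly between 0 and 1 so that both conditional probabilities are defined.

```python
from fractions import Fraction

def posteriors_and_expectation(p_h, p_e_given_h, p_e_given_not_h):
    """Return P(H|E), P(H|~E), and the expected posterior over E and ~E."""
    p_e = p_h * p_e_given_h + (1 - p_h) * p_e_given_not_h
    p_h_given_e = p_h * p_e_given_h / p_e
    p_h_given_not_e = p_h * (1 - p_e_given_h) / (1 - p_e)
    # P(H|E)*P(E) + P(H|~E)*P(~E), the last line of the derivation above.
    expected = p_h_given_e * p_e + p_h_given_not_e * (1 - p_e)
    return p_h_given_e, p_h_given_not_e, expected

# Arbitrary illustrative inputs.
p_h = Fraction(1, 4)
if_e, if_not_e, expected = posteriors_and_expectation(
    p_h, Fraction(4, 5), Fraction(1, 5)
)
```

Here seeing E raises the probability of H to 4/7 and seeing ~E lowers it to 1/13, but weighted by how likely each observation is, the two shifts cancel and the expectation is exactly the prior, 1/4. A test that could only ever move your belief upward is not a test.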


Update Yourself Incrementally

20 Eliezer_Yudkowsky 14 August 2007 02:56PM

Politics is the mind-killer.  Debate is war, arguments are soldiers.  There is the temptation to search for ways to interpret every possible experimental result to confirm your theory, like securing a citadel against every possible line of attack.  This you cannot do.  It is mathematically impossible. For every expectation of evidence, there is an equal and opposite expectation of counterevidence.

But it's okay if your cherished belief isn't perfectly defended.  If the hypothesis is that the coin comes up heads 95% of the time, then one time in twenty you will see what looks like contrary evidence.  This is okay.  It's normal.  It's even expected, so long as you've got nineteen supporting observations for every contrary one.  A probabilistic model can take a hit or two, and still survive, so long as the hits don't keep on coming in.
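One way to see how a probabilistic model absorbs a hit is to add up the evidence flip by flip. The sketch below assumes a rival hypothesis that the text does not mention, a fair coin, and measures each flip's evidence as a log likelihood ratio in favor of the 95%-heads hypothesis.

```python
import math

P_HEADS_BIASED = 0.95  # the hypothesis from the text
P_HEADS_FAIR = 0.50    # assumed rival hypothesis, not from the text

def log_likelihood_ratio(flip):
    """Evidence (in nats) one flip gives for 'biased' over 'fair'."""
    if flip == "H":
        return math.log(P_HEADS_BIASED / P_HEADS_FAIR)
    return math.log((1 - P_HEADS_BIASED) / (1 - P_HEADS_FAIR))

def total_evidence(flips):
    """Sum the evidence from a string of 'H'/'T' flips."""
    return sum(log_likelihood_ratio(f) for f in flips)

typical = total_evidence("H" * 19 + "T")       # nineteen supporting, one contrary
barrage = total_evidence("H" * 10 + "T" * 10)  # the hits keep on coming
```

A single tails counts against the hypothesis more heavily than a single heads counts for it, yet nineteen heads still outweigh it comfortably. When half the flips come up tails, the total swings the other way.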

Yet it is widely believed, especially in the court of public opinion, that a true theory can have no failures and a false theory no successes.


One Argument Against An Army

26 Eliezer_Yudkowsky 15 August 2007 06:39PM

Followup to:  Update Yourself Incrementally

Yesterday I talked about a style of reasoning in which not a single contrary argument is allowed, with the result that every non-supporting observation has to be argued away.  Today I suggest that when people encounter a contrary argument, they prevent themselves from downshifting their confidence by rehearsing already-known support.

Suppose the country of Freedonia is debating whether its neighbor, Sylvania, is responsible for a recent rash of meteor strikes on its cities.  There are several pieces of evidence suggesting this: the meteors struck cities close to the Sylvanian border; there was unusual activity in the Sylvanian stock markets before the strikes; and the Sylvanian ambassador Trentino was heard muttering about "heavenly vengeance".

Someone comes to you and says:  "I don't think Sylvania is responsible for the meteor strikes.  They have trade with us of billions of dinars annually."  "Well," you reply, "the meteors struck cities close to Sylvania, there was suspicious activity in their stock market, and their ambassador spoke of heavenly vengeance afterward."  Since these three arguments outweigh the first, you keep your belief that Sylvania is responsible—you believe rather than disbelieve, qualitatively. Clearly, the balance of evidence weighs against Sylvania.

Then another comes to you and says:  "I don't think Sylvania is responsible for the meteor strikes.  Directing an asteroid strike is really hard. Sylvania doesn't even have a space program."  You reply, "But the meteors struck cities close to Sylvania, and their investors knew it, and the ambassador came right out and admitted it!"  Again, these three arguments outweigh the first (by three arguments against one argument), so you keep your belief that Sylvania is responsible.

Indeed, your convictions are strengthened.  On two separate occasions now, you have evaluated the balance of evidence, and both times the balance was tilted against Sylvania by a ratio of 3-to-1.
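The bookkeeping error in this story can be sketched in log odds, where each piece of evidence is counted exactly once. The likelihood ratios below are invented purely for illustration, since the Freedonia example gives no numbers, and the sketch assumes the arguments are independent of one another.

```python
import math

# Hypothetical likelihood ratios (guilty : innocent) for each argument.
supporting = {
    "strikes near the border": 2.0,
    "unusual stock market activity": 2.0,
    "ambassador's muttering": 2.0,
}
contrary = {
    "billions of dinars in trade": 1 / 3,
    "no space program": 1 / 10,
}

def log_odds_shift(ratios):
    """Total shift in log odds from a collection of likelihood ratios."""
    return sum(math.log(r) for r in ratios)

# Each supporting argument is counted once, when it is first learned.
after_support = log_odds_shift(supporting.values())

# Each contrary argument then lowers the running total. The three
# supporting arguments are NOT added again: they are already in the sum.
after_first_objection = after_support + math.log(contrary["billions of dinars in trade"])
after_second_objection = after_first_objection + math.log(contrary["no space program"])
```

Under these made-up numbers, confidence drops after each objection and ends up below even odds. Rehearsing the same three supporting arguments on each occasion would count them two or three times over, which is how the ratio appears to be 3-to-1 every time.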


Scientific Evidence, Legal Evidence, Rational Evidence

26 Eliezer_Yudkowsky 19 August 2007 05:36AM

Suppose that your good friend, the police commissioner, tells you in strictest confidence that the crime kingpin of your city is Wulky Wilkinsen.  As a rationalist, are you licensed to believe this statement?  Put it this way: if you go ahead and mess around with Wulky's teenage daughter, I'd call you foolhardy.  Since it is prudent to act as if Wulky has a substantially higher-than-default probability of being a crime boss, the police commissioner's statement must have been strong Bayesian evidence.

Our legal system will not imprison Wulky on the basis of the police commissioner's statement.  It is not admissible as legal evidence.  Maybe if you locked up every person accused of being a crime boss by a police commissioner, you'd initially catch a lot of crime bosses, plus some people that a police commissioner didn't like.  Power tends to corrupt: over time, you'd catch fewer and fewer real crime bosses (who would go to greater lengths to ensure anonymity) and more and more innocent victims (unrestrained power attracts corruption like honey attracts flies).

This does not mean that the police commissioner's statement is not rational evidence.  It still has a lopsided likelihood ratio, and you'd still be a fool to mess with Wulky's teenage daughter.  But on a social level, in pursuit of a social goal, we deliberately define "legal evidence" to include only particular kinds of evidence, such as the police commissioner's own observations on the night of April 4th.  All legal evidence should ideally be rational evidence, but not the other way around.  We impose special, strong, additional standards before we anoint rational evidence as "legal evidence".

As I write this sentence at 8:33pm, Pacific time, on August 18th 2007, I am wearing white socks.  As a rationalist, are you licensed to believe the previous statement?  Yes.  Could I testify to it in court?  Yes.  Is it a scientific statement?  No, because there is no experiment you can perform yourself to verify it.  Science is made up of generalizations which apply to many particular instances, so that you can run new real-world experiments which test the generalization, and thereby verify for yourself that the generalization is true, without having to trust anyone's authority.  Science is the publicly reproducible knowledge of humankind.

Like a court system, science as a social process is made up of fallible humans.  We want a protected pool of beliefs that are especially reliable.  And we want social rules that encourage the generation of such knowledge.  So we impose special, strong, additional standards before we canonize rational knowledge as "scientific knowledge", adding it to the protected belief pool.

