# My Bayesian Enlightenment

**Followup to**: The Magnitude of His Own Folly

I remember (dimly, as human memories go) the first time I self-identified as a "Bayesian". Someone had just asked a malformed version of an old probability puzzle, saying:

If I meet a mathematician on the street, and she says, "I have two children, and at least one of them is a boy," what is the probability that they are both boys?

In the *correct* version of this story, the mathematician says "I have two children", and *you* ask, "Is at least one a boy?", and she answers "Yes". Then the probability is 1/3 that they are both boys.

But in the malformed version of the story—as I pointed out—one would common-sensically reason:

If the mathematician has one boy and one girl, then my prior probability for her saying 'at least one of them is a boy' is 1/2 and my prior probability for her saying 'at least one of them is a girl' is 1/2. There's no reason to believe, a priori, that the mathematician will only mention a girl if there is no possible alternative.

So I pointed this out, and worked the answer using Bayes's Rule, arriving at a probability of 1/2 that the children were both boys. I'm not sure whether or not I knew, at this point, that Bayes's rule was called that, but it's what I used.
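The two answers can be checked by brute-force enumeration. The following is only a minimal sketch: the function names are hypothetical, and the likelihood for the "volunteered" version encodes the assumption stated above (with one boy and one girl, she mentions either sex with probability 1/2). A different model of what the mathematician would say gives a different answer.

```python
from fractions import Fraction

# The four equally likely (older, younger) families.
FAMILIES = [("B", "B"), ("B", "G"), ("G", "B"), ("G", "G")]
PRIOR = Fraction(1, 4)


def p_says_yes_when_asked(family):
    """Correct version: you ask 'Is at least one a boy?' and she answers truthfully."""
    return Fraction(1) if "B" in family else Fraction(0)


def p_says_boy_volunteered(family):
    """Malformed version (assumed model): she volunteers 'at least one is a boy'
    or 'at least one is a girl', picking at random when both are true."""
    boys = family.count("B")
    if boys == 2:
        return Fraction(1)
    if boys == 0:
        return Fraction(0)
    return Fraction(1, 2)


def posterior_both_boys(likelihood):
    """Bayes's Rule: P(BB | statement) = P(statement | BB) P(BB) / P(statement)."""
    joint = {family: PRIOR * likelihood(family) for family in FAMILIES}
    evidence = sum(joint.values())
    return joint[("B", "B")] / evidence


print(posterior_both_boys(p_says_yes_when_asked))   # 1/3
print(posterior_both_boys(p_says_boy_volunteered))  # 1/2
```

The only difference between the two runs is the likelihood function, which is where the disagreement in the story lives.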

And lo, someone said to me, "Well, what you just gave is the Bayesian answer, but in orthodox statistics the answer is 1/3. We just exclude the possibilities that are ruled out, and count the ones that are left, without trying to guess the probability that the mathematician will say this or that, since we have no way of really knowing that probability—it's too subjective."

I responded—note that this was completely spontaneous—"What on Earth do you mean? You can't avoid assigning a probability to the mathematician making one statement or another. You're just assuming the probability is 1, and *that's* unjustified."

To which the one replied, "Yes, that's what the Bayesians say. But frequentists don't believe that."

And I said, astounded: "How can there possibly be such a thing as non-Bayesian statistics?"

That was when I discovered that I was of the type called 'Bayesian'. As far as I can tell, I was *born* that way. My mathematical intuitions were such that everything Bayesians said seemed perfectly straightforward and simple, the obvious way I would do it myself; whereas the things frequentists said sounded like the elaborate, warped, mad blasphemy of dreaming Cthulhu. I didn't *choose* to become a Bayesian any more than fishes choose to breathe water.

But this is not what I refer to as my "Bayesian enlightenment". The first time I heard of "Bayesianism", I marked it off as obvious; I didn't go much further in than Bayes's rule itself. At that time I still thought of probability theory as a tool rather than a law. I didn't think there were mathematical laws of intelligence (my best and worst mistake). Like nearly all AGI wannabes, Eliezer_{2001} thought in terms of techniques, methods, algorithms, building up a toolbox full of cool things he could *do*; he searched for tools, not understanding. Bayes's Rule was a really neat tool, applicable in a surprising number of cases.

Then there was my initiation into heuristics and biases. It started when I ran across a webpage that had been transduced from a Powerpoint intro to behavioral economics. It mentioned some of the results of heuristics and biases, in passing, without any references. I was so startled that I emailed the author to ask if this was actually a real experiment, or just anecdotal. He sent me back a scan of Tversky and Kahneman's 1973 paper.

Embarrassing to say, my story doesn't really start there. I put it on my list of things to look into. I knew that there was an edited volume called "Judgment Under Uncertainty: Heuristics and Biases" but I'd never seen it. At this time, I figured that if it wasn't online, I would just try to get along without it. I had so many other things on my reading stack, and no easy access to a university library. I think I must have mentioned this on a mailing list, because Emil Gilliam ~~emailed me to tell me that he'd read Judgment Under Uncertainty~~ was annoyed by my online-only theory, so he bought me the book.

His action here should probably be regarded as scoring a fair number of points.

But this, too, is not what I refer to as my "Bayesian enlightenment". It was an important step toward realizing the inadequacy of my Traditional Rationality skillz—that there was so much more out there, all this new science, beyond just doing what Richard Feynman told you to do. And seeing the heuristics-and-biases program holding up Bayes as the gold standard helped move my thinking forward—but not all the way there.

Memory is a fragile thing, and mine seems to have become more fragile than most, since I learned how memories are recreated with each recollection—the science of how fragile they are. Do other people really have better memories, or do they just trust the details their mind makes up, while really not remembering any more than I do? My guess is that other people do have better memories for certain things. I find structured, scientific knowledge easy enough to remember; but the disconnected chaos of everyday life fades very quickly for me.

I know *why* certain things happened in my life—that's causal structure I can remember. But sometimes it's hard to recall even in *what order* certain events happened to me, let alone in what year.

I'm not sure if I read E. T. Jaynes's *Probability Theory: The Logic of Science* before or after the day when I realized the magnitude of my own folly, and understood that I was facing an adult problem.

But it was PT:TLOS that did the trick. Here was probability theory, laid out not as a clever tool, but as *The Rules*, inviolable on pain of paradox. If you tried to approximate The Rules because they were too computationally expensive to use directly, then, no matter how necessary that compromise might be, you would still end up doing less than optimal. Jaynes would do his calculations different ways to show that the same answer always arose when you used legitimate methods; and he would display different answers that others had arrived at, and trace down the illegitimate step. Paradoxes could not coexist with his precision. Not *an* answer, but *the* answer.

And so—having looked back on my mistakes, and all the *an-answers* that had led me into paradox and dismay—it occurred to me that here was the level above mine.

I could no longer visualize trying to build an AI based on vague answers—like the an-answers I had come up with before—and surviving the challenge.

I looked at the AGI wannabes with whom I had tried to argue Friendly AI, and at their various dreams of Friendliness. (Often formulated spontaneously in response to my asking the question!) Like frequentist statistical methods, no two of them agreed with each other. Having actually studied the issue full-time for some years, I knew something about the problems their hopeful plans would run into. And I saw that if you said, "I don't see why this would fail," the "don't know" was just a reflection of your own ignorance. I could see that if I held myself to a similar standard of "that seems like a good idea", I would also be doomed. (Much like a frequentist inventing amazing new statistical calculations that seemed like good ideas.)

But if you can't do that which seems like a good idea—if you can't do what you don't imagine failing—then what can you do?

It seemed to me that it would take something like the Jaynes-level—not, *here's my bright idea,* but rather, *here's the only correct way you can do this (and why)*—to tackle an adult problem and survive. If I achieved the same level of mastery of my own subject, as Jaynes had achieved of probability theory, then it was at least *imaginable* that I could try to build a Friendly AI and survive the experience.

Through my mind flashed the passage:

Do nothing because it is righteous, or praiseworthy, or noble, to do so; do nothing because it seems good to do so; do only that which you must do, and which you cannot do in any other way.

Doing what it seemed good to do, had only led me astray.

So I called a full stop.

And I decided that, from then on, I would follow the strategy that could have saved me if I had followed it years ago: Hold my FAI designs to the higher standard of not doing that which seemed like a good idea, but only that which I understood on a sufficiently deep level to see that I could not do it in any other way.

All my old theories into which I had invested so much, did not meet this standard; and were not close to this standard; and weren't even on a track leading to this standard; so I threw them out the window.

I took up the study of probability theory and decision theory, looking to extend them to embrace such things as reflectivity and self-modification.

If I recall correctly, I had already, by this point, started to see cognition as manifesting Bayes-structure, which is also a major part of what I refer to as my Bayesian enlightenment—but of this I have already spoken. And there was also my naturalistic awakening, of which I have already spoken. And my realization that Traditional Rationality was not strict enough, so that in matters of human rationality I began taking more inspiration from probability theory and cognitive psychology.

But if you add up all these things together, then that, more or less, is the story of my Bayesian enlightenment.

Life rarely has neat boundaries. The story continues onward.

It was while studying Judea Pearl, for example, that I realized that precision can save you time. I'd put some thought into nonmonotonic logics myself, before then—back when I was still in my "searching for neat tools and algorithms" mode. Reading *Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference*, I could imagine how much time I would have wasted on ad-hoc systems and special cases, if I hadn't known that key. "Do only that which you must do, and which you cannot do in any other way", translates into a time-savings measured, not in the rescue of wasted months, but in the rescue of wasted careers.

And so I realized that it was only by holding myself to this higher standard of precision that I had started to *really* think *at all* about quite a number of important issues. To say a thing with precision is difficult—it is not at all the same thing as saying a thing formally, or inventing a new logic to throw at the problem. Many shy away from the inconvenience, because human beings are lazy, and so they say, "It is impossible" or "It will take too long", even though they never really tried for five minutes. But if you don't hold yourself to that *inconveniently* high standard, you'll let yourself get away with anything. It's a hard problem just to find a standard high enough to make you actually start thinking! It may seem taxing to hold yourself to the standard of mathematical proof where every single step has to be correct and one wrong step can carry you anywhere. But otherwise you won't chase down those tiny notes of discord that turn out to, in fact, lead to whole new concerns you never thought of.

So these days I don't complain as much about the heroic burden of inconvenience that it takes to hold yourself to a precise standard. It can save time, too; and in fact, it's more or less the ante to get yourself thinking about the problem at all.

And this too should be considered part of my "Bayesian enlightenment"—realizing that there were advantages in it, not just penalties.

But of course the story continues on. Life is like that, at least the parts that I remember.

If there's one thing I've learned from this history, it's that saying "Oops" is something to look forward to. Sure, the prospect of saying "Oops" in the future, means that the you of *right now* is a drooling imbecile, whose words your future self won't be able to read because of all the wincing. But saying "Oops" in the future also means that, in the future, you'll acquire new Jedi powers that your present self doesn't dream exist. It makes you feel embarrassed, but also *alive.* Realizing that your younger self was a complete moron means that even though you're already in your twenties, you haven't yet gone over your peak. So here's to hoping that my future self realizes I'm a drooling imbecile: I may *plan* to solve my problems with my present abilities, but extra Jedi powers sure would come in handy.

That scream of horror and embarrassment is the sound that rationalists make when they level up. Sometimes I worry that I'm not leveling up as fast as I used to, and I don't know if it's because I'm finally getting the hang of things, or because the neurons in my brain are slowly dying.

Yours, Eliezer_{2008}.

Part of the sequence *Yudkowsky's Coming of Age*

(end of sequence)

Previous post: "Beyond the Reach of God"

## Comments (56)

If optimal is not synonymous with winning (i.e. doing what is necessary), what is the point of being optimal? If you die of starvation before you manage to pick the most nutritious thing to eat using Bayesian methods, I'm gonna ditch Bayesian methods.

@Will: The point is not that you should necessarily run the algorithm that would be optimal if you had unlimited computational resources. The point is that by understanding what that algorithm does, you have a better chance of coming up with a good approximation which you can run in a reasonable amount of time. If you are trying to build a locomotive it helps to understand Carnot Engines.

What's your justification for having P(she says "at least one is a boy" | 1B,1G) = P(she says "at least one is a girl" | 1B,1G)? Maybe the hypothetical mathematician is from a culture that considers it important to have at least one boy. (China was like that, IIRC)

As a twin, I always found it surprising how easily people assume that children's genders are independent. I saw it more like 'Kid1'<-'Fertilization specifics'->'Kid2', and if, as Wiki says, monozygotic twins occur in about 3 cases per 1000, and same-sex dizygotic twins occur in half the cases of all dizygotic twins [1], then it's *not at all obvious* that two children of the same mother have the same distribution of possible genders as two children of the same father or two random children at all.

[1] Wiki doesn't state the frequency of dizygotic twins.

In recent years I've become more appreciative of classical statistics. I still consider the Bayesian solution to be the correct one; however, a full Bayesian treatment often turns into a total mess. Sometimes, by using a few of the tricks from classical statistics, you can achieve nearly as good performance with a fraction of the complexity.

Thank you for a correct statement of the problem which indeed gives the 1/3 answer. Here's the problem I have with the malformed version: I agree that it's reasonable to assume that if the children were a boy and a girl it is equally likely that the parent would say "at least one is a boy" as "at least one is a girl". But I guess you're assuming the parent would say "at least one boy" if both were boys, "at least one girl" if both were girls, and either "at least one boy" or "at least one girl" with equal probability in the one of each case.

That's the simplest set of assumptions consistent with the problem. But the quote itself is inconsistent with the normal rules of social interaction. Saying "at least one is a boy" takes more words to convey less information than saying "both boys" or "one of each". I think it's perfectly reasonable to draw some inference from this violation of normal social rules, although it is not clear to me what inference should be drawn.

Keep in mind this is a hypothetical character behaving in an unrealistic and contrived manner. If she doesn't heed social norms or effective communication strategies then there's nothing we can infer from those considerations.

If the mathematician has one boy and one girl, then my prior probability for her saying 'at least one of them is a boy' is 1/2 and my prior probability for her saying 'at least one of them is a girl' is 1/2

Why isn't it 3/4 for both? Why are these scenarios mutually exclusive?

Never mind -- missed the "If" clause. (Sorry!)

No, wait -- my question stands!

Do we really want to assign a prior of 0 to the mathematician saying "I have two children, one boy and one girl"?

Sadly, I had *not* read Judgment under Uncertainty, and still haven't. I don't recall ever saying I did, and can't find any email in which I claimed I'd read it.

However, I *do* recall being annoyed in 2002-2003 at Eliezer for joking that there was nothing worth reading that wasn't online and searchable through Google (or worse, that if it wasn't on the Net then it didn't exist). He did mention Judgment under Uncertainty on a mailing list (or on IRC) as something he would like to read, so I decided my donation to SIAI would be this book.

Eliezer doesn't make *that particular* annoying joke anymore. :)

I think a more reasonable conclusion is: yes indeed it is malformed, and the person I am speaking to is evidently not competent enough to notice how this necessarily affects the answer and invalidates the familiar answer, and so they may not be a reliable guide to probability and in particular to what is or is not "orthodox" or "bayesian." What I think you ought to have discovered was not that you were Bayesian, but that you had not blundered, whereas the person you were speaking to had blundered.

"There's no reason to believe, a priori, that the mathematician will only mention a girl if there is no possible alternative."

Erp, I don't understand what this sentence is referring to. Can someone do me a favor and explain what is the "no possible alternative" here?

There are other scenarios when running the "optimal" algorithm is considered harmful. Consider a nascent sysop vaporising the oceans purely by trying to learn how to deal with humanity (if that amount of compute power is needed of course).

Probability theory was not designed around how to win; it was designed as a way to get accurate statements about the world, assuming an observer whose computations have no impact on the world. This is a reasonable formalism for science, but only a fraction of how to win in the real world, and sometimes antithetical to winning. So if you want your system to win, don't necessarily approximate it to the best of your ability.

Ideally we want a theory of how to change energy into winning, not information and a prior into accurate hypotheses about the world, which is what probability theory gives us, and is very good at.

You need accurate information about the world in order to figure out how to "change energy into winning."

@ komponisto:

Also do we really want to assign a prior probability of 0 that the mathematician is a liar! :)

Or both. But getting the hang of things might just mean something like having core structures that are more and more durable which are harder and harder to break, making you feel like you're not leveling up as fast as you used to. Whether not leveling up as fast as before means something more like not arriving at "new theorems" as fast, might be more because of the other matter. If it doesn't cost anything and if it would slow down the neural degeneration process, be as physiologically healthy as you can on current terms.

Cat Dancer,

The frequentist answer of 1/3 is effectively making the implicit assumption that the parent would have said "at least one boy" either if both were boys or if there were one of each, and "at least one girl" if both were girls. Eliezer_{2008}'s 1/2 answer effectively assumes that the parent would have said "at least one boy" if both were boys, "at least one girl" if both were girls, and either with equal probability if there were one of each. "No alternative" assumes the parent is constrained to (truthfully) say either "at least one boy" or "at least one girl", an assumption that strikes me as being bizarre.

Will Pearson, you could not be more wrong. Winning money at games of chance is precisely what probability theory was designed for.

So the clear Bayesian version is: Mathematician says "I have two children", and you say, "Please tell me the sex of one of them", and she says "male". What's the chance both are boys?

One step back, though. The prior probability of being asked: "One's a girl. What's the chance both are boys?" is probably close to 0.

So the correct question to avoid that prior is: "What's the distribution of probabilities over 2 girls, one of each, and 2 boys?", not "What's the chance both are boys?"

Also do we really want to assign a prior probability of 0 that the mathematician is a liar! :)

That's not the point I was making.

I'm not attacking unrealistic idealization. I'm willing to stipulate that the mathematician tells the truth. What I'm questioning is the "naturalness" of Eliezer's interpretation. The interpretation that I find "common-sensical" would be the following:

Let A = both boys, B = at least one boy. The prior P(B) is 3/4, while P(A) = 1/4. The mathematician's statement instructs us to find P(A|B), which by Bayes is equal to 1/3.

Under Eliezer's interpretation, however, the question is to find P(A|C), where C = *the mathematician says* at least one boy (*as opposed to saying* at least one girl).

So if anyone is attacking the premises of the question, it is Eliezer, by introducing the quantity P(C) (which strikes me as contrived) and assigning it a value less than 1.

"But it was PT:TLOS that did the trick. Here was probability theory, laid out not as a clever tool, but as The Rules, inviolable on pain of paradox"

I am unaware of a statement of Cox's theorem where the full *technical* statement of the theorem comes even close to this informal characterization. I'm not saying it doesn't exist, but PT:TLOS certainly doesn't do it.

I found the first two chapters of PT:TLOS to be absolutely, wretchedly awful. It's full of technical mistakes, crazy mischaracterizations of other people's opinions, hidden assumptions and skipped steps (that he tries to justify with handwaving nonsense), and even a discussion of Gödel's theorems that mixes meta levels and completely misses the point.

Cat Dancer, I think by "no alternative," he means the case of two girls.

Of course the mathematician could say something like "none are boys," but the point is whether or not the two-girls case gets special treatment. If you ask "is at least one a boy?" then "no" means two girls and "yes" means anything else.

If the mathematician is just volunteering information, it's not divided up that way. When she says "at least one is a boy," she might be turning down a chance to say "at least one is a girl," and that changes things.

At least, I think that's what he's saying. Most of probability seems as awkward to me as frequentism seems to Eliezer.

Larry D'Anna,

Could you be more specific (citations, etc), so that we can have an exchange between you and Eliezer on this?

George, Brian: thank you for the elaborations. Perhaps the point is that if I have a mental model of when the mathematician will say what, and that model is reasonably accurate, I can use that information to make more accurate deductions?

Which seems fairly obvious... but perhaps that's also the point, that Bayesian statistics allows you to use what information you have.

Eliezer:

How do you decide which books to read? In particular, why did you decide to read PT:LOS? Did Amazon recommend it?

For those who are interested, a fellow named Kevin Van Horne has compiled a nice unofficial errata page for PT:LOS here. (Check the acknowledgments for a familiar name.)

I agree that the nature of that question requires having a mental model of the mathematician, or at least a mental model of mathematicians in general, which for this question we probably don't have.

However, a similar question can more unambiguously be answered with Eliezer's answer of 1/2.

You're at a dinner at a mathematician's house, and he says that he has two kids. A boy walks through the room, and you ask if the boy is his son. He says yes. What is the probability that the other child is a girl?

Larry D'Anna on Jaynes:

I found the first two chapters of PT:TLOS to be absolutely, wretchedly awful. It's full of technical mistakes, crazy mischaracterizations of other people's opinions, hidden assumptions and skipped steps (that he tries to justify with handwaving nonsense), and even a discussion of Gödel's theorems that mixes meta levels and completely misses the point.

Not to mention the totally unnecessary and irrelevant screeds against mainstream pure mathematics in general, which can only serve to alienate potential converts in that discipline (they sure alienated the hell out of me).

Eliezer,

Have you considered in detail the idea of AGI throttling, that is, given a metric of intelligence, and assuming a correlation between existential risk and said intelligence, AGI throttling is the explicit control of the AGI's intelligence level (or optimization power if you like), which indirectly also bounds existential risk.

In other words, what, if any, are the methods of bounding AGI intelligence level? Is it possible to build an AGI and explicitly set it at human level?

Agreed re: the bashing of mainstream math in PT:TLOS. AFAIK, his claims that mainstream math leads to paradoxes are all false; of course trying to act as though various items of mainstream math meant what an uneducated first glance says they mean can make them look bad. (e.g. the Banach-Tarski paradox means either "omg, mathematicians think they can violate conservation of mass!" or "OK, so I guess non-measurable things are crazy and should be avoided") It's not only unnecessary and annoying, but also I think that using usual measure theory would clarify things sometimes. For instance the fact that MaxEnt depends on what kind of distribution you start with, because a probability distribution doesn't actually have an entropy, but only a relative entropy relative to a reference measure, which is of course not necessarily uniform, even for a discrete variable. Jaynes seems to strongly deemphasize this, which is unfortunate: from PT:TLOS it seems as though MaxEnt gives you a prior given only some constraints, when really you also need a "prior prior".

Precision in seventeen syllables or less is very diffic.

Eliezer,

You say that like it's a bad thing. Your statement implies that something that is "necessary" is not necessary.

Just this morning I gave a presentation on the use of Bayesian methods for automatically predicting the functions of newly sequenced genes. The authors of the method I presented used the approximation

P(A, B, C) ~ P(A) x P(B|A) x P(C|A)

because it would have been difficult to compute P(C | B, A), and they didn't think B and C were correlated. Your statement condemns them as "less than optimal". But a sub-optimal answer you can compute is better than an optimal answer that you can't.
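To make the trade-off concrete, here is a small sketch of that factorization on three binary variables. The numbers and function names are invented for illustration and have nothing to do with the gene-function method mentioned above. The approximation is exact when C is independent of B given A, and off by a measurable amount otherwise.

```python
from fractions import Fraction as F
from itertools import product


def joint_from_factors(p_a, p_b_given_a, p_c_given_ab):
    """Exact joint P(A,B,C) via the chain rule P(A) P(B|A) P(C|A,B).
    Each argument gives the probability that the variable equals 1."""
    joint = {}
    for a, b, c in product((0, 1), repeat=3):
        pa = p_a if a else 1 - p_a
        pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
        pc = p_c_given_ab[(a, b)] if c else 1 - p_c_given_ab[(a, b)]
        joint[(a, b, c)] = pa * pb * pc
    return joint


def naive_approximation(joint):
    """P(A) * P(B|A) * P(C|A), with each factor read off the true joint."""
    approx = {}
    for a, b, c in joint:
        p_a = sum(p for (x, _, _), p in joint.items() if x == a)
        p_ab = sum(p for (x, y, _), p in joint.items() if (x, y) == (a, b))
        p_ac = sum(p for (x, _, z), p in joint.items() if (x, z) == (a, c))
        approx[(a, b, c)] = p_a * (p_ab / p_a) * (p_ac / p_a)
    return approx


def max_error(joint):
    """Largest absolute gap between the true joint and the approximation."""
    approx = naive_approximation(joint)
    return max(abs(joint[k] - approx[k]) for k in joint)


p_b_given_a = {0: F(1, 5), 1: F(7, 10)}

# C depends only on A, so the factorization loses nothing.
independent = joint_from_factors(
    F(3, 10), p_b_given_a,
    {(0, 0): F(1, 10), (0, 1): F(1, 10), (1, 0): F(3, 5), (1, 1): F(3, 5)})

# C also depends on B, so the factorization is only an approximation.
dependent = joint_from_factors(
    F(3, 10), p_b_given_a,
    {(0, 0): F(1, 10), (0, 1): F(1, 2), (1, 0): F(3, 5), (1, 1): F(9, 10)})

print(max_error(independent))  # 0
print(max_error(dependent))
```

Whether the second error is small enough to tolerate is the judgment call the authors were making when they assumed B and C were uncorrelated.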

I am willing to entertain the notion that this is not utter foolishness, if you can provide us with some examples - say, ten or twenty - of scientists who had success using this approach. I would be surprised if the ratio of important non-mathematical discoveries made by following this maxim, to those made by violating it, was greater than .05. Even mathematicians often have many possible ways of approaching their problems.

David,

Building an AGI and setting it at "human level" would be of limited value. Setting it at "human level" plus epsilon could be dangerous. Humans on their own are intelligent enough to develop dangerous technologies with existential risk. (Which prompts the question: Are we safer with AI, or without AI?)

Phil,

There are really two things I'm considering. One, whether the general idea of AI throttling is meaningful and what the technical specifics could be (crude example: let's give it only X compute power yielding an intelligence level Y). Two, if we could reliably build a human level AI, it could be of great use, not in itself, but as a tool for investigation, since we could finally "look inside" at concrete realizations of mental concepts, which is not possible with our own minds. As an example, if we could teach a human level AI morality (presumably possible since we ourselves learn it) we would have a concrete realization of that morality as computation that could be looked at outright and even debugged. Could this not be of great value for insights into FAI?

@Phil G:

if you can provide us with some examples - say, ten or twenty - of scientists who had success using this approach.

Phil, the low prevalence of breakthroughs made using this approach is evidence of science's historical link with serendipity. What it is not is evidence that 'Bayesian precision' as Eliezer describes it is not a necessary approach when the nature of the problem calls for it.

Recall the sequence around 'Faster than Einstein'. From a top-down capital-S Science point of view, there's nothing wrong with pootling around waiting for that 'hmmm, that's odd' moment. As you say, science has been ratcheting forward like that for a long while.

However, when you're just one guy with limited resources who wishes to take a mind-boggling step forward in a difficult domain in its infancy, the answer space is small enough that pootling won't get you far at all. (Doubly so when a single misstep kills you dead, as Eliezer's fond of saying.) No-one will start coding a browser and stumble across a sentient piece of code (à la Fleming / Penicillin), let alone a seed FAI. That kind of advance requires a large number of steps, each one technically precise and reliant on its predecessors. Or so I'm told. ;)

People are very fond of saying that General Intelligence may be outside the human sphere of ability - by definition too difficult for us. Well unless someone tries as hard as it's possible to try, how will we ever know?

David, the concept behind the term Singularity refers to our inability to predict what happens on the other side.

However, you don't even have to hold with the theory of a technological Singularity to appreciate the idea that an intelligence even slightly higher than our own (not to mention orders of magnitude faster, and *certainly* not to mention self-optimizing) would probably be able to do things we can't imagine. Is it worth taking the risk?

David - Yes, a human-level AI could be very useful. Politics and economics alone would benefit greatly from the simulations you could run.

(Of course, all of us but manual laborers would soon be out of a job.)

Ben,

The reason why I was considering the idea of "throttling" is precisely in order to reliably set the AI at human level (ie equivalent to an average human) and no higher. This scenario would therefore not entail the greater than human intelligence risk that you are referring to, nor would it (presumably) entail the singularity as usually defined. However, the benefits of a human level AI could be huge in terms of ability to introspect concepts that are shrouded in the mystery associated with the "mental" (vs non-mental in Eliezer's terminology). If the AI is at human level, then the AI can learn morality, then we can introspect and debug moral thinking that currently comes to us as a given. So, could it not be that the fastest path to FAI passes through human level AI? (that is not powerful enough to require FAI in the first place)

Phil,

Yes, I'm sure it would be of great use in many things, but my main question is whether the best route to FAI is through human level (but not higher) AI.

David,

Throttling an AI to human intelligence is like aiming your brand new superweapon at the world with the safety catch on. Potentially interesting, but really not worth the risk.

Besides, Eliezer would probably say that the F in FAI is the point of the code, not a module bolted into the code. There's no 'building the AI and tweaking the morality'. Either it's spot on when it's switched on, or it's unsafe.

Ben,

Using your analogy, I was thinking more along the lines of reliably building a non-super weapon in the first place. Also, I wasn't suggesting that F would be a module, but rather that FAI (the theory) could be easier to figure out via a non-"superlative" AI, after which point you'd _then_ attempt to build the superweapon according to FAI, having had key insights into what morality is.

Imagine OpenCogPrime has reached human-level AI. Presumably you could teach it morality and moral judgements the way you teach humans. At this point, you could actually look inside at the AtomTable and have a concrete mathematical representation of morality. You could even trace what's going on during judgements. Try doing the same by introspecting into your own thoughts.

Human level AI is still dangerous. Look how dangerous we are.

Consider that a human-level AI which is not friendly is likely to be far more unfriendly or difficult to bargain with than any human. (The total space of possible value systems is far, far greater than the space of value systems inhabited by functioning humans.) If there are enough of them, then they can cause the same kind of problem that a hostile society could.

But it's worse than that. A sufficiently unfriendly AI would be like a sociopath or psychopath by human standards. But unlike individual sociopaths among humans (who can become very powerful and do extraordinary damage; consider Stalin), they would not need to fake [human] sanity to work with others if there were a large community of like-minded unfriendly AIs. Indeed, if they were unfriendly enough and more comfortable with violence than, say, your typical European/American, the result could look a lot like the colonialism of the 15th-19th centuries or earlier migrations of more warlike populations, with all humans on the short end of the stick. And that's just looking at the human potential for collective violence. Surely the space of all human-level intelligences contains some that are more brutally violent than the worst of us.

Could we conceivably hold this off? Possibly, but it would be a big gamble, and unfriendliness would ensure that such a conflict would be inevitable. If the AI were significantly more efficient than we are (cost of upkeep and reproduction), that would be a *huge* advantage in any potential conflict. And it's hard to imagine an AI of strictly human level being commercially useful to build unless its efficiency is superior to ours.

Those are good points, although you did add the assumption of a community of uncontrolled, widespread AIs, whereas my idea was related to building one for research as part of a specific venture (e.g. SingInst).

In any case, I have the feeling that the problem of engineering a safe controlled environment for a specific human level AI is much smaller than the problem of attaining Friendliness for AIs _in general_ (including those that are 10x, 100x, 1000x etc more intelligent). Consider also that deciding not to build an AI does not stop everybody else from doing so, so if a human level AI were valuable in achieving FAI as I suggest, then it would be wise for the very reasons you suggest to take that route before the bad scenario plays out.

For me the key to leveling up is to question every assumption (often) and find sources of novelty regularly. I liken cognition to a hill-climbing search on the landscape of theories/models/maps that explain/predict reality. It’s easy to get stuck on peaks of local maximality. Injecting randomness creates a sort of Boltzmann machine of the mind and increases my chances of finding higher peaks.

But I have to be prepared to be more confused — and question more assumptions than I intended to — because chances are my new random placement on the landscape is initially lower than the local maximum I was on prior. This part is scary. People around me don’t understand what I’m saying initially because I necessarily need new words, new language, to describe the new landscape.

And rather than start totally afresh with a new lexicon, I notice it’s more productive (personally and in communication) to overload old terms and let them slowly blend into their new meanings. We all resist the strain, especially those who did not sign up for the jump through hyperspace. They use the hill-climbing techniques that incrementally achieve higher ground (logical deduction, reductionism) in order to deny that we are in new territory at all and “prove” every new claim as false. But unless we eliminate most or all of our old assumptions and embrace the new ones, these techniques will always yield inconsistency.

Thus, it seems like a good idea to resist the urge to bring in the heavy logical artillery until it’s clear we are on the upslope. In practice what this means is adding more novelty, but not as much as last time. This is the Boltzmann technique of simulated annealing: start with a high degree of heat/randomness and turn it down slowly, all the while pounding away with the tools of logic and reduction.

More here: http://emergentfool.com/2010/03/07/science-2-0/
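For readers who have not met simulated annealing outside the metaphor, here is a minimal sketch of the technique on a toy one-dimensional landscape. The `landscape` function, the step size, and the cooling schedule are all made up for illustration; nothing here comes from the linked post.

```python
import math
import random


def landscape(x):
    # Hypothetical bumpy "theory quality" landscape with several local peaks.
    return math.sin(3 * x) + 0.5 * math.sin(7 * x) - 0.05 * (x - 2) ** 2


def anneal(score, start, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Maximize `score` by simulated annealing; returns the best point seen."""
    rng = random.Random(seed)
    current = best = start
    for i in range(steps):
        # Geometric cooling: temperature falls smoothly from t_start to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        candidate = current + rng.gauss(0, 0.5)
        delta = score(candidate) - score(current)
        # Always accept uphill moves; accept downhill moves with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = candidate
        if score(current) > score(best):
            best = current
    return best


start = -3.0
best = anneal(landscape, start)
print(landscape(start), landscape(best))
```

Early on, the high temperature lets the search accept worse positions (the "more confused" phase); late in the run it behaves almost like plain hill climbing.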

It would help if there were examples of how precision saves time.

Having no training in probability, and having come upon the present website less than a day ago, I'm hoping someone here will be able to explain to me something basic. Let's assume, as is apparently assumed in this post, a 50-50 boy-girl chance. In other words, the chance is one out of two that a child will be a boy -- or that it will be a girl. A woman says, "I have two children." You respond, "Boys or girls?" She says, "Well, at least one of them is a boy. I haven't yet been informed of the sex of the other, to whom I've just given birth." You're saying that the chance that the newborn is a boy is one out of three, not one out of two? That's what I gather from the present post, near the beginning of which is the following:

In the correct version of this story, the mathematician says "I have two children", and you ask, "Is at least one a boy?", and she answers "Yes". Then the probability is 1/3 that they are both boys.

No. To get the 1/3 probability you have to assume that she would be just as likely to say what she says if she had 1 boy as if she had 2 (and that she wouldn't say it if she had none). In your scenario she's only half as likely to say what she says if she has one boy as if she has two boys, because if she only has one there's a 50% chance it's the one she's just given birth to.
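The likelihood argument above can be checked with a few lines of arithmetic. This is only a sketch under the usual assumptions (each child is independently a boy with probability 1/2); the hypothesis names and the `posterior` helper are invented for the example.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Bayes' rule over a dict of mutually exclusive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: p / total for h, p in joint.items()}


# Prior over how many of the two children are boys.
priors = {
    "two boys": Fraction(1, 4),
    "one boy": Fraction(1, 2),
    "no boys": Fraction(1, 4),
}

# You ask "Is at least one a boy?": she says yes whenever there is a boy.
asked = posterior(
    priors,
    {"two boys": Fraction(1), "one boy": Fraction(1), "no boys": Fraction(0)},
)

# Newborn scenario: she only knows the older child's sex, so with exactly
# one boy she can say "at least one is a boy" only half the time.
newborn = posterior(
    priors,
    {"two boys": Fraction(1), "one boy": Fraction(1, 2), "no boys": Fraction(0)},
)

print(asked["two boys"])    # 1/3
print(newborn["two boys"])  # 1/2
```

The only thing that changes between the two cases is the likelihood assigned to the "one boy" hypothesis.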

Although I don't see what you're getting at, shinoteki, I appreciate your replying. Maybe you didn't notice; but about half an hour after I posted my comment to which you replied, I posted a comment with a different scenario, which involves no reference to birth order. (That is not to say I see that birth order bears on this.) I will certainly appreciate a reply, from you or from anyone else, to the said latter comment, whose time-stamp is 02 December 2012 06:51:25PM.

Let me try another scenario. A woman says, "I have two children." You respond, "What are their sexes?" She says, "At least one of them is a boy. The other was kidnapped before I was informed of its sex." You're saying that the chance that the kidnapped child is a boy is one out of three, not one out of two? To repeat: That's what I gather from the present post, near the beginning of which is the following:

In the correct version of this story, the mathematician says "I have two children", and you ask, "Is at least one a boy?", and she answers "Yes". Then the probability is 1/3 that they are both boys.

No, the chance that the kidnapped child is a boy is 1/2.

In the correct version of the story, you do not gain access to any information that allows you to differentiate between the mathematician's two children and identify a specific child as a boy.

In your story, you are able to partition the woman's children into "the kidnapped one" and "the other one", and the woman provides you with the information that "the other one" is a boy. The sex of "the kidnapped one" is independent of the sex of "the other one". That is,

P("the kidnapped one" is a boy | "the other one" is a boy) = P("the kidnapped one" is a boy)

Thank you for the reply, Mr. Kasper.

Let me try this. You come upon a man who, as you watch, flips a 50-50 coin. He catches and covers it; that is, the result of the flip is not known. I, who have been standing there, present you the following question:

"What is the chance the coin is heads?"

That's Question A. What is your answer?

The next day, you come upon a different man, who, as you watch, flips a 50-50 coin. Again, he catches it; again, the result is not revealed. I, who have been standing there, address you as follows:

"Just before you arrived, that man flipped that same coin; it came up heads. What is the chance it is now heads?"

That's Question B. What is your answer?

If you and I were having this discussion in person, I would pause here, to allow you to answer Questions A and B. Because this is the internet, where I don't know how many opportunities you'll have to reply to me, I'll continue.

You come upon a man who is holding a 50-50 coin. I am with him. There is the following exchange:

I (to you, re the man with the coin): This man has just flipped this coin two times.

You: What were the results?

I: One of the results was heads. I don’t remember what the other was.

Question C: What is the chance the other was heads?

Let’s step over Question C (though I'll appreciate your answering it). After I tell you that one of the results was heads but that I don't remember what the other was, you say:

"Which do you remember, the first or the second?"

I reply, "I don’t remember that either."

Question D: What is the chance the other was heads?

Let's establish some notation first:

P(H): My prior probability that the coin came up heads. Because we're assuming that the coin is fair before you present any evidence, I assume a 50% chance that the coin came up heads.

P(H|E): My posterior probability that the coin came up heads, or the probability that the coin came up heads, given the evidence that you have provided.

P(E|H): The probability of observing what we have, given the coin in question coming up heads.

P(E&H): The probability of you observing the evidence and the coin in question coming up heads.

P(E&-H): The probability of you observing the evidence and the coin in question coming up tails.

P(E): The unconditional probability of you observing the evidence that you presented. Because the events (E&H) and (E&-H) are mutually exclusive (one cannot happen at the same time as the other) and the events (H) and (-H) are collectively exhaustive (the probability that at least one of these events occurs is 100%), we can calculate P(E):

P(E) = P(E&H) + P(E&-H)

P(E) = P(E|H) P(H) + P(E|-H) P(-H)

Using Bayes' Theorem, we can calculate P(H|E) after we determine P(E|H) and P(E|-H):

P(H|E) = [P(E|H) P(H)] / [P(E|H) P(H) + P(E|-H) P(-H)]
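As a quick sanity check, the formula above can be written as a small function. This is a sketch; the name `bayes` and the example likelihood values are arbitrary choices for illustration.

```python
from fractions import Fraction


def bayes(prior_h, p_e_given_h, p_e_given_not_h):
    """P(H|E) = P(E|H) P(H) / [P(E|H) P(H) + P(E|-H) P(-H)]."""
    numerator = p_e_given_h * prior_h
    evidence = numerator + p_e_given_not_h * (1 - prior_h)
    return numerator / evidence


half = Fraction(1, 2)

# Evidence that is equally likely under H and -H leaves the prior unchanged.
print(bayes(half, Fraction(3, 10), Fraction(3, 10)))  # 1/2

# Evidence twice as likely under H as under -H shifts the posterior to 2/3.
print(bayes(half, Fraction(1), Fraction(1, 2)))  # 2/3
```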

Question A: In this case we can assume that our lack of knowledge is independent of the result of the coin toss; P(E|H) = P(E) = P(E|-H). So

P(H|E) = P(E) (50%) / [P(E) (50%) + P(E) (1 - 50%)] = [P(E) / P(E)] (50% / 100%) = 50%.

Question B: Again here, your probability of observing the first result is independent of the second result. So P(H|E) = 50%.

Question C: Here we can note that there are four mutually exclusive, collectively exhaustive, and equiprobable outcomes. Let's call them (HH), (HT), (TH), and (TT), where the first of the two symbols represents the result that you remember observing. Given that you remember observing a result of heads, our evidence is (HH or HT). The second coin is heads in the case of (HH), which is as probable as (HT). Given that P(HH) = P(HT) = 25%, P(HH or HT) = 50%.

P(HH|HH or HT) = P(HH or HT|HH) P(HH) / P(HH or HT)

P(HH|HH or HT) = 1 (25% / 50%) = 50%

Question D: We can use the same method as in Question C. Since the ordinality of the missed observation is independent from the result of the missed observation, the probability is the same as in Question C, which is 50%.
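A rough simulation can make the contrast concrete. The sketch below assumes the person recalls one of the two flips chosen uniformly at random (my modelling assumption for Questions C and D), and compares that with the "Is at least one heads?" version of the puzzle. The function name, trial count, and seed are arbitrary.

```python
import random


def simulate(trials=200_000, seed=1):
    rng = random.Random(seed)
    remembered_heads = other_also_heads = 0
    at_least_one = both = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5, rng.random() < 0.5]  # True means heads
        # Questions C/D: one flip, picked at random, is the remembered one.
        k = rng.randrange(2)
        if flips[k]:
            remembered_heads += 1
            other_also_heads += flips[1 - k]
        # Asked version: "Is at least one heads?" answered "yes".
        if any(flips):
            at_least_one += 1
            both += all(flips)
    return other_also_heads / remembered_heads, both / at_least_one


p_other_heads, p_both_heads = simulate()
print(round(p_other_heads, 3), round(p_both_heads, 3))
```

Under these assumptions the first estimate should land near 1/2 and the second near 1/3.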

Thank you, Mr. Kasper, for your thorough reply. Because all of this is new to me, I feel rather as I did the time I spent an hour on a tennis court with a friend who had won a tennis scholarship to college. Having no real tennis ability myself, I felt I was wasting his time; I appreciated that he’d agreed to play with me for that hour.

As I began to grasp the reasoning, I decided that each time you state the chance that the coin is heads, you are stating a fact. I asked myself what that means. I imagined the following:

I encounter you after you’ve spent two months traveling the world. You address me as follows:

“During my first month, I happened upon one hundred men who told me—each of them—that he had just flipped a coin twice. In each case, I asked, ‘Was at least one of the results heads?’ Each man said yes, and I knew that, in each case, the probability was 1/3 that both flips had been heads.

“In my second month, I again happened upon one hundred men who told me—each of them—that he had just flipped a coin twice. Each added, ‘One of the results was heads; I don’t remember what the other was.’ I knew that, in each case, the probability was 1/2 that both flips had been heads.

“Just as I was about to return home, I was approached by a man who had video recordings of the coin flips that those two hundred men had mentioned. In watching the recordings, I learned that both flips had been heads in fifty of the first one hundred cases and that, likewise, both flips had been heads in fifty of the second one hundred cases.”

In considering that, Mr. Kasper, I imagined the following exchange, which you may imagine as taking place between you and me. I speak first.

“My dog is in that box.”

“Is that a fact?”

“Yes.”

“In saying it’s a fact, you mean what?”

“I mean I regard it as true.”

“Which means what?”

“Which means I can imagine events that culminate in my saying, ‘I seem to have been mistaken; my dog wasn’t in that box.’”

“For example.”

“You walk over to the box and remove its lid, and I see my dog is not in it.”

“Maybe the dog disappeared—vanished into thin air—while I was walking over to the box.”

“That’s a possibility I wouldn’t be able to rule out; but because it would seem to me unlikely, I would say, ‘I seem to have been mistaken; my dog wasn’t in the box.’”

“How much is 189 plus 76?”

“To tell you that, I would have to get a pencil and paper and add them.”

“Please get a pencil and paper and add them; then tell me the result.”

“I’ve just done as you requested. Using a pencil and paper, I’ve added those two numbers. The result is 265.”

“189 + 76 = 265.”

“Correct.”

“Is that a fact?”

“Yes.”

“Please add them again.”

“I’ve just done as you requested. Using my pencil and paper, I’ve added those numbers a second time. I seem to have been mistaken. The result is 255.”

“Are you sure?”

“Well—”

“Please add them again.”

“I’ve just done as you requested. With my pencil and paper, I’ve added the numbers a third time.”

“And?”

“I was right the first time. The sum is 265.”

“Is that a fact?”

“That the sum is 265?”

“Yes.”

“I would say yes. It’s a fact.”

“How much is two plus two?”

“Four.”

“Did you use your pencil and paper to determine that?”

“No.”

“You used your pencil and paper to add 189 and 76 but not to add two and two.”

“That’s right.”

“Is there any sequence of events that could culminate in your saying, ‘I seem to have been mistaken; two plus two is not four.’”

“No.”

“Is it a fact?”

“That two plus two is four?”

“Yes.”

"Yes. It's a fact."

“In saying that, you mean what?”

“—I don’t know.”

Thank you again.

Mr. Bonaccorsi:

Here are two links to classic posts by Eliezer Yudkowsky that you may find pertinent to the second dialog from your last comment. I hope you enjoy them.

How to Convince Me That 2 + 2 = 3

The Simple Truth

Thank you for those links, Mr. Kasper. In taking a quick first look at the two pieces, I've noticed passages with which I'm familiar, so I must have encountered those posts as I made my initial reconnaissance, so to speak, of this very-interesting website. Now that you've directed my attention to those posts in particular, I'll be able to read them with real attention.

Initially, there are four possibilities, each with probability 1/4:

- A: Boy, Boy
- B: Boy, Girl
- C: Girl, Boy
- D: Girl, Girl

If you learn that *one of them* is a boy, then that eliminates option D, leaving the other three options (A, B, C) with 1/3 probability each. So the probability that both are boys given that at least one is a boy (i.e., Pr[A] given A-or-B-or-C) is 1/3.

On the other hand, if you learn that the *first child* is a boy, that eliminates options D *and* C. You've ruled out *more* possibilities -- whereas before 'Girl, Boy' (C) was an option, now the only options are 'Boy, Boy' (A) and 'Boy, Girl' (B). So there's now a 1/2 chance that both are boys (i.e., Pr[A] given A-or-B). And the same calculation holds if you learned instead that the *second child* is a boy, only with B eliminated in place of C.

Thank you for the reply, RobbBB. As I mentioned in my reply to shinoteki (at 03 December 2012 01:48:47AM), I followed my original post (to which you have just responded) with a post in which there is no reference to birth order. As I also said to shinoteki, that does not mean I see that birth order bears on this. It means simply that I was anticipating the response you, RobbBB, have just posted.

At 06 December 2012 10:18:40AM, as you may see, William Kasper posted a reply to my said second post (the one without reference to birth order). After I post the present comment, I will reply to Mr. Kasper. Thank you again.

This is why I sometimes hate probability. The probabilities here strongly depend on how the family and boys are chosen.

If you took a list of families with exactly two children and threw out the ones that had no boys, then you'd find that of the remaining families, 1/3 have two boys.

If you took a list of boys who have exactly one sibling and asked how many of them had a brother, you'd get the answer 1/2.

The difference is whether the child is chosen at random. Even a minor change in the phrasing of the question can change the correct answer. Always be cautious with probability.
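The two sampling procedures can be counted out exactly. This sketch assumes the four two-child family types are equally common (50-50 births, independent siblings); the variable names are just for the example.

```python
from fractions import Fraction
from itertools import product

# The four equally likely two-child families: BB, BG, GB, GG.
families = list(product("BG", repeat=2))

# Procedure 1: sample families, discard those with no boys.
with_boy = [f for f in families if "B" in f]
p_family = Fraction(sum(f == ("B", "B") for f in with_boy), len(with_boy))

# Procedure 2: sample boys who have exactly one sibling.
# Each boy is a (family, position) pair, so BB contributes two boys.
boys = [(f, i) for f in families for i in range(2) if f[i] == "B"]
p_boy = Fraction(sum(f[1 - i] == "B" for f, i in boys), len(boys))

print(p_family)  # 1/3
print(p_boy)     # 1/2
```

The two-boy family is counted once in the first procedure but twice in the second, which is where the gap between 1/3 and 1/2 comes from.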