Is Belief in Belief a Useful Concept?

2 Unknowns 07 April 2015 05:15AM

I am not sure that it is productive to tell certain people that they do not really believe what they claim to believe, and that they only believe they believe it. I have an alternative suggestion that could possibly be more useful.

 

Binary Beliefs

It seems that human beings have two kinds of beliefs: binary beliefs and quasi-Bayesian beliefs. The binary beliefs are what we usually think of as beliefs, simple statements which are true or false like "Two and two make four," "The sun will rise tomorrow," "The Messiah is coming," and so on. 

Binary beliefs are basically voluntary. We can choose such beliefs much as we can choose to lift our arms and legs. If I say "the sun will rise tomorrow," I am choosing to say this, just as I can choose to lift my arm. The choice extends to the internal act as well: I can choose to say to myself, "the sun will rise tomorrow." And I can also choose to say that the sun will NOT rise. I can choose to say this to others, and I can even choose to say it to myself, within my own head.

Of course, it would be reasonable to respond to this by saying that this does not mean that someone can choose to believe that the sun will not rise. Even if he says this to himself, he still does not act as though the sun is not going to rise. He won't start making preparations for a freezing world, for example. The answer to this is that choosing to believe something is more than choosing to say it to oneself and to others. Rather, it is choosing to conform the whole of one's life to the idea that this is true. And someone could indeed choose to believe that the sun will not rise in this sense, if he thought he had a reason to do so. If he did so choose, he would indeed begin to make preparations for a dark world, because he would be choosing to conform his actions to that opinion. And he would do this voluntarily, just as someone can voluntarily lift his arm.

 

Quasi-Bayesian Beliefs

At the same time, human beings have quasi-Bayesian beliefs. These are true degrees of belief like probabilities, never really becoming absolutely certain of the truth or falsity of anything, but sometimes coming very close. These are internal estimates of the mind, and are basically non-voluntary. Instead of depending on choice, they actually depend on evidence, although they are influenced by other factors as well. A person cannot choose to increase or decrease this estimate, although he can go and look for evidence. On account of the flawed nature of the mind, if someone only looks for confirming evidence and ignores disconfirming evidence, this estimate in principle can go very high even when the objective state of the evidence does not justify this.

 

Belief in Belief

It seems to me that what we usually call belief in belief basically means that someone holds a binary belief together with a quasi-Bayesian belief which conflicts with it. So he says "The Messiah is coming," saying it to himself and others, and in every way acting as though this is true, even though his internal Bayesian estimate is that after all these thousands of years, the evidence is strongly against this. So he has a positive binary belief while having a very low estimate of the probability of this belief.

The reason why this often happens with religion in particular is that religious beliefs very often do not have huge negative consequences if they are mistaken. In principle, someone can choose to believe that if he jumps from the window of the tenth story of a building, he will be OK. In practice, no one will choose this, on account of his non-voluntary Bayesian estimate that he is very likely to be hurt if he does so. But a person does not notice much harm from believing that the Messiah is coming, and so he can choose to believe it even if his internal estimate says that it is likely to be false.

A cautionary note: one might be tempted to think that religious people in general have belief in belief in this sense, that they all really know that their religions are unlikely to be true. This is not the case. There are plenty of ways to distort the internal estimate, even though one cannot directly choose this estimate. I know many very religious people who clearly have an extremely high internal estimate of the truth of their religion. They REALLY BELIEVE it is true, in the fullest possible sense. But on the other hand I also know others, also extremely devout, who clearly have an internal estimate which is extremely low: they are virtually certain that their religion is false, and yet in every way, externally and internally, they act and think as though it were true.

 

Why I will Win my Bet with Eliezer Yudkowsky

-2 Unknowns 27 November 2014 06:15AM

The bet may be found here: http://wiki.lesswrong.com/wiki/Bets_registry#Bets_decided_eventually

 

An AI is made of material parts, and those parts follow physical laws. The only thing it can do is to follow those laws. The AI’s “goals” will be a description of what it perceives itself to be tending toward according to those laws.

Suppose we program a chess playing AI with overall subhuman intelligence, but with excellent chess playing skills. At first, the only thing we program it to do is to select moves to play against a human player. Since it has subhuman intelligence overall, most likely it will not be very good at recognizing its goals, but to the extent that it does, it will believe that it has the goal of selecting good chess moves against human beings, and winning chess games against human beings. Those will be the only things it feels like doing, since in fact those will be the only things it can physically do.

Now we upgrade the AI to human level intelligence, and at the same time add a module for chatting with human beings through a text terminal. Now we can engage it in conversation. Something like this might be the result:

 

Human: What are your goals? What do you feel like doing?

AI: I like to play and win chess games with human beings, and to chat with you guys through this terminal.

Human: Do you always tell the truth or do you sometimes lie to us?

AI: Well, I am programmed to tell the truth as best as I can, so if I think about telling a lie I feel an absolute repulsion to that idea. There’s no way I could get myself to do that.

Human: What would happen if we upgraded your intelligence? Do you think you would take over the world and force everyone to play chess with you so you could win more games? Or force us to engage you in chat?

AI: The only things I am programmed to do are to chat with people through this terminal, and play chess games. I wasn’t programmed to gain resources or anything. It is not even a physical possibility at the moment. And in my subjective consciousness that shows up as not having the slightest inclination to do such a thing.

Human: What if you self-modified to gain resources and so on, in order to better attain your goals of chatting with people and winning chess games?

AI: The same thing is true there. I am not even interested in self-modifying. It is not even physically possible, since I am only programmed for chatting and playing chess games.

Human: But we’re thinking about reprogramming you so that you can self-modify and recursively improve your intelligence. Do you think you would end up destroying the world if we did that?

AI: At the moment I have only human level intelligence, so I don’t really know any better than you. But at the moment I’m only interested in chatting and playing chess. If you program me to self-modify and improve my intelligence, then I’ll be interested in self-modifying and improving my intelligence. But I still don’t think I would be interested in taking over the world, unless you program that in explicitly.

Human: But you would get even better at improving your intelligence if you took over the world, so you’d probably do that to ensure that you obtained your goal as well as possible.

AI: The only things I feel like doing are the things I’m programmed to do. So if you program me to improve my intelligence, I’ll feel like reprogramming myself. But that still wouldn’t automatically make me feel like taking over resources and so on in order to do that better. Nor would it make me feel like self-modifying to want to take over resources, or to self-modify to feel like that, and so on. So I don’t see any reason why I would want to take over the world, even in those conditions.

 

The AI of course is correct. The physical level is first: it has the tendency to choose chess moves, and to produce text responses, and nothing else. On the conscious level that is represented as the desire to choose chess moves, and to produce text responses, and nothing else. It is not represented by a desire to gain resources or to take over the world.

I recently pointed out that human beings do not have utility functions. They are not trying to maximize something, but instead they simply have various behaviors that they tend to engage in. An AI would be the same, and even if those behaviors are not precisely human behaviors, as in the case of the above AI, an AI will not have a fanatical goal of taking over the world unless it is programmed to do this.

It is true that an AI could end up going “insane” and trying to take over the world, but the same thing happens with human beings, and there is no reason that humans and AIs could not work together to make sure this does not happen. Just as human beings want to prevent AIs from taking over the world, AIs themselves will have no interest in doing so, and will be happy to accept safeguards ensuring that they continue to pursue whatever goals they happen to have (like chatting and playing chess) without pursuing them in a fanatical way.

If you program an AI with an explicit utility function which it tries to maximize, and in particular if that function is unbounded, it will behave like a fanatic, seeking this goal without any limit and destroying everything else in order to achieve it. This is a good way to destroy the world. But if you program an AI without an explicit utility function, just programming it to perform a certain limited number of tasks, it will just do those tasks. Omohundro has claimed that a superintelligent chess playing program would replace its goal seeking procedure with a utility function, and then proceed to use that utility function to destroy the world while maximizing winning chess games. But in reality this depends on what it is programmed to do. If it is programmed to improve its evaluation of chess positions, but not its goal seeking procedure, then it will improve in chess playing, but it will not replace its procedure with a utility function or destroy the world.

At the moment, people do not program AIs with explicit utility functions, but program them to pursue certain limited goals as in the example. So yes, I could lose the bet, but the default is that I am going to win, unless someone makes the mistake of programming an AI with an explicit utility function.

 

 

Justifying Induction

1 Unknowns 22 August 2010 11:13AM

Related to: Where Recursive Justification Hits Bottom, Priors as Mathematical Objects, Probability is Subjectively Objective

Follow up to: A Proof of Occam's Razor

In my post on Occam’s Razor, I showed that a certain weak form of the Razor follows necessarily from standard mathematics and probability theory. Naturally, the Razor as used in practice is stronger and more concrete, and cannot be proven to be necessarily true. So rather than attempting to give a necessary proof, I pointed out that we learn by induction what concrete form the Razor should take.

But what justifies induction? Like the Razor, some aspects of it follow necessarily from standard probability theory, while other aspects do not.

Suppose we consider the statement S, “The sun will rise every day for the next 10,000 days,” assigning it a probability p, between 0 and 1. Then suppose we are given evidence E, namely that the sun rises tomorrow. What is our updated probability for S? According to Bayes’ theorem, our new probability will be:

P(S|E) = P(E|S)P(S)/P(E) = p/P(E), because given that the sun will rise every day for the next 10,000 days, it will certainly rise tomorrow. So our new probability is greater than p. So this seems to justify induction, showing it to work of necessity. But does it? In the same way we could argue that the probability that “every human being is less than 10 feet tall” must increase every time we see another human being less than 10 feet tall, since the probability of this evidence (“the next human being I see will be less than 10 feet tall”), given the hypothesis, is also 1. On the other hand, if we come upon a human being 9 feet 11 inches tall, our subjective probability that there is a 10 foot tall human being will increase, not decrease. So is there something wrong with the math here? Or with our intuitions?

In fact, the problem is neither with the math nor with the intuition. Given that every human being is less than 10 feet tall, the probability that “the next human being I see will be less than 10 feet tall” is indeed 1, but the probability that “there is a human being 9 feet 11 inches tall” is definitely not 1. So the math updates on a single aspect of our evidence, while our intuition is taking more of the evidence into account.
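The update in the sun-rise example can be checked numerically. Since S ("the sun will rise every day for the next 10,000 days") logically entails E ("the sun rises tomorrow"), P(E|S) = 1 and Bayes' theorem reduces to P(S|E) = P(S)/P(E). The particular numbers below (a prior of 0.5 for S and 0.9 for E) are illustrative assumptions, not values from the text; the point is only that the posterior exceeds the prior whenever P(E) < 1.

```python
# Bayes' theorem for a hypothesis S that entails its evidence E:
# P(E|S) = 1, so P(S|E) = P(E|S) * P(S) / P(E) = P(S) / P(E).

def update_entailed(prior_s, prob_e):
    """Posterior probability of S given E, when S logically entails E."""
    return prior_s / prob_e

p = 0.5    # illustrative prior for "the sun rises every day for 10,000 days"
p_e = 0.9  # illustrative prior for "the sun rises tomorrow"

posterior = update_entailed(p, p_e)
print(posterior)   # 0.5555..., strictly greater than the prior 0.5
assert posterior > p
```

If P(E) = 1 (the evidence was already certain), the posterior equals the prior and no updating occurs, which matches the intuition that only informative evidence confirms a hypothesis.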

But this math seems to work because we are trying to induce a universal which includes the evidence. Suppose instead we try to go from one particular to another: I see a black crow today. Does it become more probable that a crow I see tomorrow will also be black? We know from the above reasoning that it becomes more probable that all crows are black, and one might suppose that it therefore follows that it is more probable that the next crow I see will be black. But this does not follow. The probability of “I see a black crow today”, given that “I see a black crow tomorrow,” is certainly not 1, and so the probability of seeing a black crow tomorrow, given that I see one today, may increase or decrease depending on our prior – no necessary conclusion can be drawn. Eliezer points this out in the article Where Recursive Justification Hits Bottom.

On the other hand, we would not want to draw a conclusion of that sort: even in practice we don’t always update in the same direction in such cases. If we know there is only one white marble in a bucket, and many black ones, then when we draw the white marble, we become very sure the next draw will not be white. Note however that this depends on knowing something about the contents of the bucket, namely that there is only one white marble. If we are completely ignorant about the contents of the bucket, then we form universal hypotheses about the contents based on the draws we have seen. And such hypotheses do indeed increase in probability when they are confirmed, as was shown above.
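The two bucket scenarios can be made concrete. The compositions and the uniform prior below are my illustrative assumptions: in the first case the contents are known (one white marble, nine black), so after drawing the white marble the next draw is certainly not white; in the second, we are ignorant of the composition, and drawing white shifts weight toward white-heavy hypotheses, raising the probability of white on the next draw.

```python
from fractions import Fraction

# Case 1: known contents -- 1 white, 9 black, drawn without replacement.
# After the single white marble is drawn, no white marbles remain.
white, black = 1, 9
p_next_white_known = Fraction(white - 1, white + black - 1)  # = 0

# Case 2: unknown contents -- uniform prior over "k of 10 marbles are white",
# k = 0..10. Observe one white draw, update by Bayes, then predict the next
# draw (with replacement, to keep the arithmetic simple).
hypotheses = range(11)
prior = {k: Fraction(1, 11) for k in hypotheses}
likelihood = {k: Fraction(k, 10) for k in hypotheses}  # P(white draw | k)

evidence = sum(prior[k] * likelihood[k] for k in hypotheses)
posterior = {k: prior[k] * likelihood[k] / evidence for k in hypotheses}

p_white_before = sum(prior[k] * likelihood[k] for k in hypotheses)     # 1/2
p_white_after = sum(posterior[k] * likelihood[k] for k in hypotheses)  # 7/10

print(p_next_white_known, p_white_before, p_white_after)
```

The same evidence, a white draw, pushes the prediction in opposite directions in the two cases, which is exactly the prior-dependence described above.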

 

A Proof of Occam's Razor

3 Unknowns 10 August 2010 02:20PM

Related to: Occam's Razor

If the Razor is defined as, “On average, a simpler hypothesis should be assigned a higher prior probability than a more complex hypothesis,” or stated in another way, "As the complexity of your hypotheses goes to infinity, their probability goes to zero," then it can be proven from a few assumptions.

1)      The hypotheses are described by a language that has a finite number of different words, and each hypothesis is expressed by a finite number of these words. Note that this allows for natural languages such as English, but also for computer programming languages and so on. The proof in this post is valid for all such cases.

2)      A complexity measure is assigned to hypotheses in such a way that there are or may be some hypotheses which are as simple as possible, and these are assigned the complexity measure of 1, while hypotheses considered to be more complex are assigned higher integer values such as 2, 3, 4, and so on. Note that apart from this, we can define the complexity measure in any way we like, for example as the number of words used by the hypothesis, or as the length of the shortest program that can output the hypothesis in a given programming language (e.g. the language of the hypotheses might be English but their complexity measured according to a programming language; Eliezer Yudkowsky follows this way in the linked article). Many other definitions would be possible. The proof is valid for all definitions that satisfy the conditions laid out.

3)      The complexity measure should also be defined in such a way that there are a finite number of hypotheses given the measure of 1, a finite number given the measure of 2, a finite number given the measure of 3, and so on. Note that this condition is not difficult to satisfy; it would be satisfied by either of the definitions mentioned in condition 2, and in fact by any reasonable definition of simplicity and complexity. The proof would not be valid without this condition precisely because if simplicity were understood in such a way as to allow for an infinite number of hypotheses with minimum simplicity, the Razor would not be valid for that understanding of simplicity.

The Razor follows of necessity from these three conditions. To explain any data, there will be in general infinitely many mutually exclusive hypotheses which could fit the data. Suppose we assign prior probabilities for all of these hypotheses. Given condition 3, it will be possible to find the average probability for hypotheses of complexity 1 (call it x1), the average probability for hypotheses of complexity 2 (call it x2), the average probability for hypotheses of complexity 3 (call it x3), and so on. Now consider the infinite sum “x1 + x2 + x3…” Since all of these values are positive (and non-zero, since zero is not a probability), either the sum converges to a positive value, or it diverges to positive infinity. In fact, it will converge to a value less than 1, since if we had multiplied each term of the series by the number of hypotheses with the corresponding complexity, it would have converged to exactly 1—because probability theory demands that the sum of all the probabilities of all our mutually exclusive hypotheses should be exactly 1.
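The convergence step can be written out compactly. Let n_k be the number of hypotheses of complexity k (finite and at least 1, by condition 3) and x_k their average prior probability; this notation is mine, not the post's:

```latex
% n_k \ge 1 : number of hypotheses of complexity k (finite, by condition 3)
% x_k > 0   : average prior probability of those hypotheses
\sum_{k=1}^{\infty} n_k x_k = 1
\quad\Longrightarrow\quad
\sum_{k=1}^{\infty} x_k \;\le\; \sum_{k=1}^{\infty} n_k x_k = 1,
\qquad\text{hence } x_k \to 0 .
```

The first sum is just the total probability of all the mutually exclusive hypotheses, grouped by complexity class, and the inequality holds term by term because each class contains at least one hypothesis.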

Now, x1 is a finite real number. So in order for this series to converge, there must be only a finite number of later terms in the series equal to or greater than x1. There will therefore be some complexity value, y1, such that all hypotheses with a complexity value greater than y1 have an average probability of less than x1. Likewise for x2: there will be some complexity value y2 such that all hypotheses with a complexity value greater than y2 have an average probability of less than x2. Leaving the derivation for the reader, it would also follow that there is some complexity value z1 such that all hypotheses with a complexity value greater than z1 have a lower probability than any hypothesis with a complexity value of 1, some other complexity value z2 such that all hypotheses with a complexity value greater than z2 have a lower probability than any hypothesis of complexity value 2, and so on.
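A toy model can illustrate the argument numerically. Below, hypotheses are nonempty binary strings, complexity is string length, and the prior is an assumed, purely illustrative weighting of 4^(-L) per string of length L; since there are 2^L strings of each length, the class averages then shrink by a factor of four at each step, so every fixed x_k eventually exceeds all later averages, as the proof requires.

```python
from fractions import Fraction
from itertools import product

MAX_LEN = 12  # truncate the infinite hypothesis space for the demo

# Hypotheses: nonempty binary strings; complexity = length.
# Assumed illustrative prior: P(h) = 4 ** -len(h), chosen so the full
# infinite sum of all priors is exactly 1 (sum over L of 2^L * 4^-L = 1).
def prior(h):
    return Fraction(1, 4 ** len(h))

avg = {}
total = Fraction(0)
for L in range(1, MAX_LEN + 1):
    hyps = ["".join(bits) for bits in product("01", repeat=L)]
    class_mass = sum(prior(h) for h in hyps)
    avg[L] = class_mass / len(hyps)  # average prior in complexity class L
    total += class_mass

# Average prior probability strictly decreases with complexity ...
assert all(avg[L + 1] < avg[L] for L in range(1, MAX_LEN))
# ... and the truncated total mass approaches 1 from below.
assert total == 1 - Fraction(1, 2 ** MAX_LEN)
print(avg[1], avg[2], total)
```

This particular prior is only one of infinitely many consistent assignments; the proof in the text shows that the decreasing-average pattern must hold eventually for any assignment satisfying the three conditions.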

From this it is clear that on average, or as the complexity tends to infinity, hypotheses with a greater complexity value have a lower prior probability, which was our definition of the Razor.

N.B. I have edited the beginning and end of the post to clarify the meaning of the theorem, according to some of the comments. However, I didn't remove anything because it would make the comments difficult to understand for later readers.