Expected Creative Surprises
Imagine that I'm playing chess against a smarter opponent. If I could predict exactly where my opponent would move on each turn, I would automatically be at least as good a chess player as my opponent. I could just ask myself where my opponent would move, if they were in my shoes; and then make the same move myself. (In fact, to predict my opponent's exact moves, I would need to be superhuman - I would need to predict my opponent's exact mental processes, including their limitations and their errors. It would become a problem of psychology, rather than chess.)
So predicting an exact move is not possible, but neither is it true that I have no information about my opponent's moves.
Personally, I am a very weak chess player - I play an average of maybe two games per year. But even if I'm playing against former world champion Garry Kasparov, there are certain things I can predict about his next move. When the game starts, I can guess that the move P-K4 is more likely than P-KN4. I can guess that if Kasparov has a move which would allow me to checkmate him on my next move, that Kasparov will not make that move.
Much less reliably, I can guess that Kasparov will not make a move that exposes his queen to my capture - but here, I could be greatly surprised; there could be a rationale for a queen sacrifice which I have not seen.
And finally, of course, I can guess that Kasparov will win the game...
Supposing that Kasparov is playing black, I can guess that the final position of the chess board will occupy the class of positions that are wins for black. I cannot predict specific features of the board in detail; but I can narrow things down relative to the class of all possible ending positions.
If I play chess against a superior opponent, and I don't know for certain where my opponent will move, I can still endeavor to produce a probability distribution that is well-calibrated - in the sense that, over the course of many games, legal moves that I label with a probability of "ten percent" are made by the opponent around 1 time in 10.
You might ask: Is producing a well-calibrated distribution over Kasparov, beyond my abilities as an inferior chess player?
But there is a trivial way to produce a well-calibrated probability distribution - just use the maximum-entropy distribution representing a state of total ignorance. If my opponent has 37 legal moves, I can assign a probability of 1/37 to each move. This makes me perfectly calibrated: I assigned 37 different moves a probability of 1 in 37, and exactly one of those moves will happen; so I applied the label "1 in 37" to 37 different events, and exactly 1 of those events occurred.
Total ignorance is not very useful, even if you confess it honestly. So the question then becomes whether I can do better than maximum entropy. Let's say that you and I both answer a quiz with ten yes-or-no questions. You assign probabilities of 90% to your answers, and get one answer wrong. I assign probabilities of 80% to my answers, and get two answers wrong. We are both perfectly calibrated but you exhibited better discrimination - your answers more strongly distinguished truth from falsehood.
Suppose that someone shows me an arbitrary chess position, and asks me: "What move would Kasparov make if he played black, starting from this position?" Since I'm not nearly as good a chess player as Kasparov, I can only weakly guess Kasparov's move, and I'll assign a non-extreme probability distribution to Kasparov's possible moves. In principle I can do this for any legal chess position, though my guesses might approach maximum entropy - still, I would at least assign a lower probability to what I guessed were obviously wasteful or suicidal moves.
If you put me in a box and feed me chess positions and get probability distributions back out, then we would have - theoretically speaking - a system that produces Yudkowsky's guess for Kasparov's move in any chess position. We shall suppose (though it may be unlikely) that my prediction is well-calibrated, if not overwhelmingly discriminating.
Now suppose we turn "Yudkowsky's prediction of Kasparov's move" into an actual chess opponent, by having a computer randomly make moves at the exact probabilities I assigned. We'll call this system RYK, which stands for "Randomized Yudkowsky-Kasparov", though it should really be "Random Selection from Yudkowsky's Probability Distribution over Kasparov's Move."
Will RYK be as good a player as Kasparov? Of course not. Sometimes the RYK system will randomly make dreadful moves which the real-life Kasparov would never make - start the game with P-KN4. I assign such moves a low probability, but sometimes the computer makes them anyway, by sheer random chance. The real Kasparov also sometimes makes moves that I assigned a low probability, but only when the move has a better rationale than I realized - the astonishing, unanticipated queen sacrifice.
Randomized Yudkowsky-Kasparov is definitely no smarter than Yudkowsky, because RYK draws on no more chess skill than I myself possess - I build all the probability distributions myself, using only my own abilities. Actually, RYK is a far worse player than Yudkowsky. I myself would make the best move I saw with my knowledge. RYK only occasionally makes the best move I saw - I won't be very confident that Kasparov would make exactly the same move I would.
Now suppose that I myself play a game of chess against the RYK system.
RYK has the odd property that, on each and every turn, my probabilistic prediction for RYK's move is exactly the same prediction I would make if I were playing against world champion Garry Kasparov.
Nonetheless, I can easily beat RYK, where the real Kasparov would crush me like a bug.
The creative unpredictability of intelligence is not like the noisy unpredictability of a random number generator. When I play against a smarter player, I can't predict exactly where my opponent will move against me. But I can predict the end result of my smarter opponent's moves, which is a win for the other player. When I see the randomized opponent make a move that I assigned a tiny probability, I chuckle and rub my hands, because I think the opponent has randomly made a dreadful move and now I can win. When a superior opponent surprises me by making a move to which I assigned a tiny probability, I groan because I think the other player saw something I didn't, and now I'm about to be swept off the board. Even though it's exactly the same probability distribution! I can be exactly as uncertain about the actions, and yet draw very different conclusions about the eventual outcome.
(This situation is possible because I am not logically omniscient; I do not explicitly represent a joint probability distribution over all entire games.)
When I play against a superior player, I can't predict exactly where my opponent will move against me. If I could predict that, I would necessarily be at least that good at chess myself. But I can predict the consequence of the unknown move, which is a win for the other player; and the more the player's actual action surprises me, the more confident I become of this final outcome.
The unpredictability of intelligence is a very special and unusual kind of surprise, which is not at all like noise or randomness. There is a weird balance between the unpredictability of actions and the predictability of outcomes.




Comments (34)
Didn't we already have this exact post?
You had "Optimization", and you may have read a draft elsewhere with some of this material in it. I'm posting it now by way of finally finishing it after years, and by way of taking a vacation (feeling a bit burned-out lately).
On a whimsical note, it is reminiscent of the unpredictability of the Infinite Improbability Drive :-)
"If we walk without rhythm, we won't attract the worm."
Set up a pattern-recognition system directed at your own actions, and when you fall into a predictable rut, do something differently.
Could you give a more precise definition of "calibrated"? Your example of 1/37 for each of 37 different possibilities, justified by saying that indeed one of the 37 will happen, seems facile. Do you mean that the "correct" distribution, relative to your guess, has low relative entropy?
I think what he means by "calibrated" is something like it not being possible for someone else to systematically improve the probabilities you give for the possible answers to a question just from knowing what values you've assigned (and your biases), without looking at what the question is.
I suppose the improvement would indeed be measured in terms of relative entropy of the "correct" guess with respect to the guess given.
anon: The short definition of "well-calibrated" is that if things you assign X% probability to actually do happen X% of the time, you are well-calibrated. If you go around saying that you are 90% certain of your answers on an exam, and you get a 90, you are well-calibrated. If you say that you are 90% certain and you get a 50, you are not well-calibrated.
For the long version, see A Technical Explanation of Technical Explanation". (Highly recommended!)
Anon: well-calibrated means roughly that in the class of all events you think have probability p to being true, the proportion of them that turn out to be true is p.
More formally, suppose you have a probability distribution over something you are going to observe. If the log probability of the event which actually occurs is equal to the entropy of your distribution, you are well calibrated. If it is above you are over confident, if it is below you are under confident. By this measure, assigning every possibility equal probability will always be calibrated.
This is related to relative entropy.
As with jsalvati's comment, upon reading this post I was convinced that either I was hallucinating, that almost the entirety of this post had appeared elsewhere before, or that I've read so much of Eliezer's writing that I'm anticipating his arguments in detail before he makes them.
Fortunately, it turns out that the second (and least disturbing) option was correct, in that a substantial amount of this post already existed at: [link deleted]
Not that I'm complaining, but it was seriously bothering me that I knew I'd heard this stuff before, but Google said it hadn't previously been posted on Overcoming Bias, and I was starting to doubt my sanity...
This notion of calibratedness seems to have bad properties to me. Consider a situation where I'm trying to guess a distribution for the outcomes of a coin flip with a coin which, my information tells me, lands "heads" 99% of the time. Then a guess of 50% and 50% is "calibrated" because of the 50% predictions I make, exactly half come out right. But a guess 49.9% heads and 50.1% tails is horribly calibrated; the "49.9%" predictions come out 99% correct, and the "50.1%" predictions come out 1% correct. So the concept, as defined like this, seems hypersensitive, and therefore not very useful. I think a proper definition must necessarily be in terms of relative entropy, or perhaps considering Bayesian posteriors from subsets of your information, but I still don't see how it should work. Sorry if someone already gave a robust definition that I missed.
Nick: If you don't mean *expected* log probability, then I don't know what you're talking about. And if you do, it seems to me that you're saying that well-calibratedness means that relative entropy of the "correct" distribution relative to yours is equal to your entropy. But then the uniform prior doesn't seem well-calibrated; again, consider a coin that lands "heads" 99% of the time. Then your entropy is 1, while the relative entropy of the "correct" distribution is (-log(99%)-log(1%))/2, which is >2.
Nick: Sorry, I got it backwards. What you seem to be saying is that well-calibratedness means that relative entropy of your distribution relative to the "correct" one is equal to your entropy. This does hold for the uniform guess. But once again, considering a situation where your information tells you the coin will land "heads" with 99% probability, it would seem that the only well-calibrated guesses are 99%-1% and 50%-50%. I don't yet have an intuition for why both of these guesses are strictly "better" in any way than an 80%-20% guess, but I'll think about it. It definitely avoids the sensitivity that seemed to come out of the "rough" definition, where 50% is great but 49.9% is horrible.
Please don't post a link to the draft version of this. This just means that lots of Overcoming Bias readers will be bored for the next couple of weeks, because they'll already have read the old version of something.
Anon: no, I mean the log probability. In your example, the calibratedness will generally be high: - \log 0.499 - H(p) ~= 0.00289 each time you see tails, and - log 0.501 - H(p) ~= - 0.00289 each time you come up tails. It's continuous.
Let's be specific. We have H(p) = - \sum_x p(x) \log p(x), where p is some probability distribution over a finite set. If we observe x0, the say the predictor's calibration is
C(x0) = \sum_x p(x) \log p(x) - \log p(x0) = - \log p(x0) - H(p)
so the expected calibration is 0 by the definition of H(p). The calibration is continuous in p. If \log p(x0) is higher then the expected value of \log p(x) then we are underconfident and C(x0) < 0; if \log p(x0) is lower than expected we are overconfident, and C>0.
With q = p(x) d(x,x0) the non-normalised probability distribution that assigns value only x0, we have
C = D(p||q)
so this is a relative entropy of sorts.
If the possibility of "creative surprises" depends on ignorance about the logical consequences of a game move, it seems that this would be best modeled as an asymmetric information problem.
It's interesting to note that the usual "Dutch-book" arguments for subjective probability break down under asymmetric information - the only way to avoid losing money to a possibly better-informed opponent is refusing to enter some bets, i.e. adopting an upper-lower probability interval.
Of course, such upper-lower spreads show up routinely in illiquid financial markets; I wonder whether any connections have been made between rational pricing conditions in such markets and imprecise probability theories like Dempster-Shafer, etc.
I think there's a sign error in my post -- C(x0) = \log p(x0) + H(p) it should be.
Nick Hay - IIRC the minus-log probability of an outcome is usually called "surprisal" or "self-information". The Shannon entropy of a random variable is just the expected value of its surprisal.
guest: right, so with those definitions you are overconfident if you are suprised more than you expected, underconfident if you are suprised less, calibration being how close your suprisal is to your expectation of it.
If you use upper-lower then you will forego combinations of bets that are sure gains. Dutch Book is inescapable, and only your presumption that the environment will contain agents who try to offer you losing bet combinations but no agents who are nice enough to take them, would lead you think that upper-lower "solves" the problem. No real agent can afford the computing power to be an ideal Bayesian, but the laws are ineluctable: whoever is not Bayesian - regardless of the reasons, regardless of whether it is unavoidable - is losing something.
And whoever is bayesian is also losing something, a quantity of easily acquirable energy. It is a loss one way or the other.
Nick: It seems like a bad idea to me to call a prediction underconfident or overconfident depending on the particular outcome. Shouldn't it depend rather on the "correct" distribution of outcomes, i.e. the Bayesian posterior taking all your information into account? I mean, with your definition, if we do the coin flip again, with 99% heads and 1% tails, and our prediction is 99% heads and 1% tails, then if it comes up heads we're slightly underconfident, and if it comes up tails we're strongly overconfident. Hence there's no such thing as an actually well-calibrated prediction for this (?). If we take into account the existence of a correct Bayesian posterior then it's clear that "expected calibration" is not at all 0. For instance if p is the "correct" probability of heads and q is your prediction then the "expected calibration" would seem to be -p*log(q)-(1-p)*log(1-q)+q*log(q)+(1-q)*log(1-q). And, for instance, if you know for a fact that a certain experiment can go one of 3 ways, and over a long period of time the proportion has been 60%-30%-10%, then not only 33.3%-33.3%-33.3%, but also 45%-45%-10% and 57%-19%-24% have "expected calibration" ~0 by this definition.
You consider the Randomized Yudkowsky-Kasparov system, but this is only one point on a continuum. There are stupider and smarter systems, generated by weaker and stronger chess players, respectively.
One interesting system is Randomized Kasparov-Kasparov, where we show arbitrary chess positions to Kasparov and ask him to assign probability distributions to what he'd do. Chess is a game of perfect information, so arbitrary chess positions can be shown without their history (of course, we have to count physically invisible state as known; a position consists of both the arrangement of the pieces and factoids like "the king has not moved, but the queen's rook has moved, so queenside castling is forbidden but kingside castling is allowed"). I assert without proof that RKK is an extremely smart system, and will turn both RYK and you yourself into chunky salsa.
Here's the question: is Kasparov smarter than RKK? We can imagine that first we develop RKK by properly incentivizing Kasparov (e.g. by telling him that we're going to use it to play against every human on the planet, and for every time that RKK comes out victorious, we'll donate a dollar to his favorite charity, or if RKK loses even once, we'll drown a kitten, or whatever), and then (after we've shown Kasparov the million zillion chess positions and recorded his assigned probability distributions) we turn the tables and properly incentivize him to beat RKK (charity donation, kitten drowning, whatever). Can Kasparov beat RKK? Does it matter if he knows he's playing against RKK?
Then the metaquestion: what if Kasparov was a weaker player? (For values of Kasparov that equal Yudkowsky...) What if he was a stronger player? One limiting case is easy. If Kasparov was very very smart, he could solve chess, and so RKK could solve chess, and they would be equally strong. (That is, if the solution says that White wins, then whoever plays White wins, and so forth.) There's a probability distribution that solves chess, after all (with 100% for the right move in any given situation and 0% for the others).
Will RYK be as good a player as Kasparov? Of course not. Sometimes the RYK system will randomly make dreadful moves which the real-life Kasparov would never make - start the game with P-KN4. I assign such moves a low probability, but sometimes the computer makes them anyway, by sheer random chance.
Don't think this is right. RYK would not make dreadful moves that you would not think Kasparov would make under any circumstances, unless you were assigning odds too high to those moves. So unless you think there's a chance that Kasparov would make that move more often than you would, you would assign a log-odd of -200 dB or something, effective 0 chance.
To clarify: what I'm specifically objecting to, I think, is "by sheer random chance". If it's the chance that's causing those bad moves, then you're not well-calibrated. If it's your uncertainty about Kasparov that's causing the moves, then my objection doesn't apply.
I swear to god I've read these Kasparov posts before...
I think the weakness of R[xx] systems is that they're only working one move at a time. Any position suggests a number of strategies to a good player. Being stuck with a randomized system that tends towards plausible moves means that a move that builds in one direction could be weakened in the next round by a move that's tending towards another good strategy.
Could R[xx] be improved (in theory--I think the calculations would be unmanageable) by considering more than one move at a time? Is there some number of moves where you don't exactly have an R[xx] anymore because the program is more like simulating the way the real chess player thinks?
Nancy, the underlying algorithms of most modern chess engines don't have any sort of persistent strategy from move to move. In fact, most don't even keep any state between moves. (That's an advanced feature.) And yet chess engines are able to compete with top chess players. While they look more than one move ahead, so does RYK. I don't think your objection actually pans out in practice.
re: calibration, it seems like what we want to do is ask ourselves what happens if an agent is asked lots of different probability questions, consider the the true probability as a function of probability stated by the agent, use some prior distribution (describing *our* uncertainty) on all such functions that the agent could have, update this prior using a finite set of answers we have seen the agent give and their correctness, and end up with a posterior distribution on functions (agent's probability -> true probability) from which we can get estimates of how over/underconfident the agent is at each probability level, and use those to determine what the agent "really means" when it says 90%; if the agent is overconfident at all probabilities then it's "overconfident" period, if it's underconfident at all probabilities then it's "underconfident" period, if it's over at some and under at some then I guess it's just "misconfident"? (an agent could be usually overconfident in an environment that usually asked it difficult questions and usually underconfident in an environment that usually asked it easy questions, or vice versa) If we keep asking an agent that doesn't learn the same question like in anon's comment then that seems like a degenerate case. On first think it doesn't seem like an agent's calibration function necessarily depends on what questions you ask it; Solomonoff induction is well-calibrated in the long run in all environments (right?), and you could imagine an agent that was like SI but with all probability outputs twice as close to 1, say. Hope this makes any sense.
pdf23ds, thanks. I wasn't sure whether the whole strategic situation could be implied in each position. Have people been able to learn anything about chess strategy by studying how programs play? Do program vs. program games make sense to people?
pdf23ds, most chess engines store analyzed subtrees into a transposition table. This is not the same as having a persistent strategy, but it does mean that much evaluation work can be saved by keeping a persistent state between moves.
This is similar to how iterative deepening works: in theory you could obtain the same result by performing a depth-first search directly, but iterative deepening provides a hint at the "best" moves which makes search more efficient.
I wasn't under the impression that transposition tables were kept between moves, but now that I think about it, they probably are. On the other hand, I'd wager that the reuse isn't very high, depending on the caching criteria. Maybe around 30%.
Have people been able to learn anything about chess strategy by studying how programs play?
I'm don't know if they've learned anything. Computer style is heavy on tactics. I'm not sure how their strategy is characterized. But it's notable that most new young grandmasters nowadays rely heavily on chess engines to help them train.
Do program vs. program games make sense to people?
Yes. Only chess engine fans are really too interested in them, though.
It'd be interesting to know if perfect checkers games (checkers has been solved) were amenable to human analysis.
A bit more on chess engines: many engines have a number of parameters you can set that determines their playing style, and some might say their "strategy". Things like aggressiveness, willingness to sacrifice, valuation of pieces, willingness to trade, etc. What many of these parameters actually do is change the static board evaluation function. A chess engine looks ahead many moves, and due to time constraints, picks a board position after X moves and runs the static evaluation function, which is a very naive estimate of the quality of the engine's position. A human's static evaluation function is much higher quality, but humans only look ahead a few moves at most. (Well, grandmasters look ahead a dozen or two moves at most but they're still evaluating many, many fewer positions (their pruning factor is higher).)
When I play against a superior player, I can't predict exactly where my opponent will move against me. [...] But I can predict the consequence of the unknown move, which is a win for the other player; and the more the player's actual action surprises me, the more confident I become of this final outcome."
Interestingly, playing an opponent which selects completely randomly from the range of possible moves gives a similar result: you never know what your opponent will do, but you can predict with a fair amount of confidence that you will win. And the more their actions surprise you (because you didn't spend any time thinking about such a remarkably dumb move as that), the more confident you become in your prediction of the result.
bah. moving too fast, forget to close an em tag. in the preceding comment, only the first paragraph should be em'd as a quote.
Will RYK be as good a player as Kasparov? Of course not. Sometimes the RYK system will randomly make dreadful moves which the real-life Kasparov would never make - start the game with P-KN4. I assign such moves a low probability, but sometimes the computer makes them anyway, by sheer random chance.
If you believe that Kasparov would never make that move, you should assign it a probability of 0.
Regardless of whether Grob's Attack is dreadful, this article is. RYK doesn't make dreadful moves that the real-life Kasparov would never make because of "sheer random chance", it does so because it has nothing whatsoever to do with Kasparov -- you could substitute 'I' ("how I imagine a good player might move") for 'K' everywhere in your silly article. The information content of RYK is drawn solely from the chess knowledge of an astoundingly weak player -- you -- deflated by your own uncertainty about your chess judgment. That is, it's the same as RYY, but with the probabilities smeared, reducing the probabilities of the moves you think are good and increasing the probabilities of those you think are bad. If you're a bad enough player, RYK could actually be better than RYY, by sufficiently often avoiding the horrible moves you would make and sometimes making good moves that you never would.
As for calibration, RYK is not at all calibrated to Kasparov's actual behavior, making invocation of his name, and any surprise as to how badly RYK plays, absurd. Your blather about "the creative unpredictability of intelligence" is absurd; Kasparov could be a completely deterministic engine, always making the same moves in the same positions, and nothing would change -- he would still beat your ass every time and RYK would still suck.