## Be secretly wrong

"I feel like I'm not the sort of person who's allowed to have opinions about the important issues like AI risk."

"What's the bad thing that might happen if you expressed your opinion?"

"It would be wrong in some way I hadn't foreseen, and people would think less of me."

"Do you think less of other people who have wrong opinions?"

"Not if they change their minds when confronted with the evidence."

"Would you do that?"

"Yeah."

"Do you think other people think less of those who do that?"

"No."

"Well, if it's alright for other people to make mistakes, what makes YOU so special?"

A lot of my otherwise very smart and thoughtful friends seem to have a mental block around thinking on certain topics, because they're the sort of topics Important People have Important Opinions around. There seem to be two very different reasons for this sort of block:

- Being wrong feels bad.
- They might lose the respect of others.

# Be wrong

If you don't have an opinion, you can hold onto the fantasy that someday, once you figure the thing out, you'll end up having a right opinion. But if you put yourself out there with an opinion that's unmistakably your own, you don't have that excuse anymore.

This is related to the desire to pass tests. The smart kids go through school and are taught - explicitly or tacitly - that as long as they get good grades they're doing OK, and if they try at all they can get good grades. So when they bump up against a problem that might actually be hard, there's a strong impulse to look away, to redirect to something else. So they do.

You have to understand that this system is not real, it's just a game. In real life you have to be straight-up wrong sometimes. So you may as well get it over with.

If you expect to be wrong when you guess, then you're already wrong, and paying the price for it. As Eugene Gendlin said:

What is true is already so. Owning up to it doesn't make it worse. Not being open about it doesn't make it go away. And because it's true, it is what is there to be interacted with. Anything untrue isn't there to be lived. People can stand what is true, for they are already enduring it.

What you would be mistaken about, you're already mistaken about. Owning up to it doesn't make you any more mistaken. Not being open about it doesn't make it go away.

"You're already "wrong" in the sense that your anticipations aren't perfectly aligned with reality. You just haven't put yourself in a situation where you've openly tried to guess the teacher's password. But if you want more power over the world, you need to focus your uncertainty - and this only reliably makes you righter if you repeatedly test your beliefs. Which means sometimes being wrong, and noticing. (And then, of course, changing your mind.)

Being wrong is how you learn - by testing hypotheses.

# In secret

Getting used to being wrong - forming the boldest hypotheses your current beliefs can truly justify so that you can correct your model based on the data - is painful and I don't have a good solution to getting over it except to tough it out. But there's a part of the problem we can separate out, which is - the pain of being wrong publicly.

When I attended a Toastmasters club, one of the things I liked a lot about giving speeches there was that the stakes were low in terms of the content. If I were giving a presentation at work, I had to worry about my generic presentation skills, but also whether the way I was presenting it was a good match for my audience, and also whether the idea I was pitching was a good strategic move for the company or my career, and also whether the information I was presenting was accurate. At Toastmasters, all the content-related stakes were gone. No one with the power to promote or fire me was present. Everyone was on my side, and the group was all about helping each other get better. So all I had to think about was the form of my speech.

Once I'd learned some general presentations at Toastmasters, it became easier to give talks where I did care about the content and there were real-world consequences to the quality of the talk. I'd gotten practice on the form of public speaking separately - so now I could relax about that, and just focus on getting the content right.

Similarly, expressing opinions publicly can be stressful because of the work of generating likely hypotheses, and revealing to yourself that you are farther behind in understanding things than you thought - but also because of the perceived social consequences of sounding stupid. You can at least isolate the last factor, by starting out thinking things through in secret. This works by separating epistemic uncertainty from social confidence. (This is closely related to the dichotomy between social and objective respect.)

Of course, as soon as you can stand to do this in public, that's better - you'll learn faster, you'll get help. But if you're not there yet, this is a step along the way. If the choice is between having private opinions and having none, have private opinions. (Also related: If we can't lie to others, we will lie to ourselves.)

Read and discuss a book on a topic you want to have opinions about, with one trusted friend. Start a secret blog - or just take notes. Practice having opinions at all, that you can be wrong about, before you worry about being accountable for your opinions. One step at a time.

Before you're publicly right, consider being secretly wrong. Better to be secretly wrong, than secretly not even wrong.

(Cross-posted at my personal blog.)

## Logics for Mind-Building Should Have Computational Meaning

**The Workshop**

Late in July I organized and held MIRIx Tel-Aviv with the goal of investigating the currently-open (to my knowledge) Friendly AI problem called "logical probability": the issue of assigning probabilities to formulas in a first-order proof system, in order to use the reflective consistency of the probability predicate to get past the Loebian Obstacle to building a self-modifying reasoning agent that will trust itself and its successors. Vadim Kosoy, Benjamin and Joshua Fox, and myself met at the Tel-Aviv Makers' Insurgence for six hours, and each presented our ideas. I spent most of it sneezing due to my allergies to TAMI's resident cats.

My idea was to go with the proof-theoretic semantics of logic and attack computational construction of logical probability via the Curry-Howard Isomorphism between programs and proofs: this yields a rather direct translation between computational constructions of logical probability and the learning/construction of an optimal function from sensory inputs to actions required by Updateless Decision Theory.

The best I can give as a mathematical result is as follows:

** **

The capital is a set of hypotheses/axioms/assumptions, and the English letters are metasyntactic variables (like "foo" and "bar" in programming lessons). The lower-case letters denote proofs/programs, and the upper-case letters denote propositions/types. The turnstile just means "deduces": the *judgement* can be read here as "an agent whose set of beliefs is denoted will believe that the evidence *a* proves the proposition *A*." The performs a "reversed" substitution, with the result reading: "for all *y* proving/of-type *B*, substitute *x* for *y* in *a*". This means that we algorithmically build a new proof/construction/program from *a* in which any and all constructions proving the proposition *B* are replaced with the logically-equivalent hypothesis *x*, which we have added to our hypothesis-set .

Thus the first equation reads, "the probability of *a* proving *A* conditioned on *b* proving *B* equals the probability of *a* proving *A* when we assume the truth of *B* as a hypothesis." The second equation then uses this definition of conditional probability to give the normal Product Rule of probabilities for the logical product (the operator), defined proof-theoretically. I strongly believe I could give a similar equation for the normal Sum Rule of probabilities for the logical sum (the operator) if I could only access the relevant paywalled paper, in which the λμ-calculus acting as an algorithmic interpretation of the natural-deduction system for classical propositional logic (rather than intuitionistic) is given.

The third item given there is an inference rule, which reads, "if *x* is a free variable/hypothesis imputed to have type/prove proposition *A*, not bound in the hypothesis-set , then the probability with which we believe *x* proves *A* is given by the Solomonoff Measure of type *A* in the λμ-calculus". We can define that measure simply as the summed Solomonoff Measure of every program/proof possessing the relevant type, and I don't think going into the details of its construction here would be particularly productive. Free variables in λ-calculus are isomorphic to unproven hypotheses in natural deduction, and so a probabilistic proof system could learn how much to believe in some free-standing hypothesis via Bayesian evidence rather than algorithmic proof.

The final item given here is trivial: anything assumed has probability 1.0, that of a logical tautology.

The upside to invoking the strange, alien λμ-calculus instead of the more normal, friendly λ-calculus is that we thus reason inside classical logic rather than intuitionistic, which means we can use the classical axioms of probability rather than intuitionistic Bayesianism. We *need* classical logic here: if we switch to intuitionistic logics (Heyting algebras rather than Boolean algebras) we do get to make computational decidability a first-class citizen of our logic, but the cost is that we can then believe *only* computationally provable propositions. As Benjamin Fox pointed out to me at the workshop, Loeb's Theorem then becomes a triviality, with real self-trust rendered no easier.

**The Apologia**

My motivation and core idea for all this was very simple: I am a devout computational trinitarian, believing that logic must be set on foundations which describe reasoning, truth, and evidence in a non-mystical, non-Platonic way. The study of first-order logic and *especially* of incompleteness results in metamathematics, from Goedel on up to Chaitin, *aggravates* me in its relentless Platonism, and especially in the way Platonic mysticism about logical incompleteness so often leads to the belief that minds are mystical. (It aggravates other people, too!)

The slight problem which I ran into is that there's a shit-ton I don't know about logic. I am now working to remedy this grievous hole in my previous education. Also, this problem is really deep, actually.

I thus apologize for ending the rigorous portion of this write-up here. Everyone expecting proper rigor, you may now pack up and go home, if you were ever paying attention at all. Ritual seppuku will duly be committed, followed by hors d'oeuvre. My corpse will be duly recycled to make paper-clips, in the proper fashion of a failed LessWrongian.

**The Parts I'm Not Very Sure About**

With any luck, that previous paragraph got rid of all the serious people.

I do, however, still think that the (beautiful) equivalence between computation and logic can yield some insights here. After all, the whole reason for the strange incompleteness results in first-order logic (shown by Boolos in his textbook, I'm told) is that first-order logic, as a reasoning system, contains sufficient computational machinery to encode a Universal Turing Machine. The bidirectionality of this reduction (Hilbert and Gentzen both have given computational descriptions of first-order proof systems) is just another demonstration of the equivalence.

In fact, it seems to me (right now) to yield a rather intuitively satisfying explanation of why the Gaifman-Carnot Condition (that every instance we see of provides Bayesian evidence in favor of ) for logical probabilities is not computably approximable. What would we need to interpret the Gaifman Condition from an algorithmic, type-theoretic viewpoint? From this interpretation, we would need a proof of our universal generalization. This would have to be a dependent product of form , a function taking any construction to a construction of type , which itself has type **Prop**. To learn such a dependent function from the examples would be to search for an optimal (simple, probable) construction (program) constituting the relevant proof object: effectively, an individual act of Solomonoff Induction. Solomonoff Induction, however, is already only semicomputable, which would then make a Gaifman-Hutter distribution (is there another term for these?) doubly semicomputable, since even generating it involves a semiprocedure.

The *benefit* of using the constructive approach to probabilistic logic here is that we know perfectly well that however incomputable Solomonoff Induction and Gaifman-Hutter distributions might be, both existing humans and existing proof systems succeed in building proof-constructions for quantified sentences *all the time*, even in higher-order logics such as Coquand's Calculus of Constructions (the core of a popular constructive proof assistant) or Luo's Logic-Enriched Type Theory (the core of a popular dependently-typed programming language and proof engine based on classical logic). Such logics and their proof-checking algorithms constitute, going all the way back to Automath, the first examples of computational "agents" which acquire specific "beliefs" in a mathematically rigorous way, subject to human-proved theorems of soundness, consistency, and programming-language-theoretic completeness (rather than meaning that every true proposition has a proof, this means that every program which does not become operationally stuck has a type and is thus the proof of some proposition). If we want our AIs to believe in accordance with soundness and consistency properties we can prove *before* running them, while being composed of computational artifacts, I personally consider this the foundation from which to build.

Where we *can* acquire probabilistic evidence in a sound and computable way, as noted above in the section on free variables/hypotheses, we can do so for propositions which we cannot algorithmically prove. This would bring us closer to our actual goals of using logical probability in Updateless Decision Theory or of getting around the Loebian Obstacle.

**Some of the Background Material I'm Reading**

Another reason why we should use a Curry-Howard approach to logical probability is one of the simplest possible reasons: the burgeoning field of probabilistic programming is already being built on it. The Computational Cognitive Science lab at MIT is publishing papers showing that their languages are universal for computable and semicomputable probability distributions, and getting strong results in the study of human general intelligence. Specifically: they are hypothesizing that we can dissolve "learning" into "inducing probabilistic programs via hierarchical Bayesian inference", "thinking" into "simulation" into "conditional sampling from probabilistic programs", and "uncertain inference" into "approximate inference over the distributions represented by probabilistic programs, conditioned on some fixed quantity of sampling that has been done."

In fact, one might even look at these ideas and think that, perhaps, an agent which could find some way to sample quickly and more accurately, or to learn probabilistic programs more efficiently (in terms of training data), than was built into its original "belief engine" could then rewrite its belief engine to use these new algorithms to perform strictly better inference and learning. Unless I'm as completely wrong as I usually am about these things (that is, very extremely completely wrong based on an utterly unfounded misunderstanding of the whole topic), it's a potential engine for recursive self-improvement.

They also have been studying how to implement statistical inference techniques for their generate modeling languages which do not obey Bayesian soundness. While most of machine learning/perception works according to error-rate minimization rather than Bayesian soundness (exactly because Bayesian methods are *often* too computationally expensive for real-world use), I would prefer someone at least study the implications of employing unsound inference techniques for more general AI and cognitive-science applications in terms of how often such a system would "misbehave".

Many of MIT's models are currently dynamically typed and appear to leave type soundness (the logical rigor with which agents come to believe things by deduction) to future research. And yet: they got to this problem first, so to speak. We really ought to be collaborating with them, with the full-time grant-funded academic researchers, rather than trying to armchair-reason our way to a full theory of logical probability as a large group of amateurs or part-timers and only a small core cohort of full-time MIRI and FHI staff investigating AI safety issues.

(I admit to having a nerd crush, and I am actually planning to go visit the Cocosci Lab this coming week, and want/intend to apply to their PhD program.)

They have also uncovered something else I find highly interesting: human learning of both concepts and causal frameworks seems to take place via hierarchical Bayesian inference, gaining a "blessing of abstraction" to countermand the "curse of dimensionality". The natural interpretation of these abstractions in terms of constructions and types would be that, as in dependently-typed programming languages, constructions have types, and types are constructions, but for hierarchical-learning purposes, it would be useful to suppose that *types* have specific, structured types more informative than **Prop** or **Type**_{n} (for some universe level *n*). Inference can then proceed from giving constructions or type-judgements as evidence at the bottom level, up the hierarchy of types and meta-types to give probabilistic belief-assignments to very general knowledge. Even very different objects could have similar meta-types at some level of the hierarchy, allowing hierarchical inference to help transfer Bayesian evidence between seemingly different domains, giving insight into how efficient general intelligence can work.

**Just-for-fun Postscript**

If we really buy into the model of thinking as conditional simulation, we can use that to dissolve the modalities "possible" and "impossible". We arrive at (by my count) three different ways of considering the issue computationally:

- Conceivable/imaginable: the generative models which constitute my current beliefs do or do not yield a path to make some logical proposition true or to make some causal event happen (planning can be done as inference, after all), with or without some specified level of probability.
- Sensibility/absurdity: the generative models which constitute my current beliefs place a desirably high or undesirably low probability on the known path(s) by which a proposition might be true or by which an event might happen. The level which constitutes "desirable" could be set as the value for a hypothesis test, or some other value determined decision-theoretically. This could relate to Pascal's Mugging: how probable must something be before I consider it
*real*rather than an artifact of my own hypothesis space? - Consistency or Contradiction: the generative models which constitute my current beliefs, plus the hypothesis that some proposition is true or some event can come about, do or do not yield a logical contradiction with some probability (that is, we should believe the contradiction exists only to the degree we believe in our existing models in the first place!).

I mostly find this fun because it lets us talk rigorously about when we should "shut up and do the 1,2!impossible" and when something is very definitely 3!impossible.

## Solutions and Open Problems

**Followup To:** Approaching Logical Probability

Last time, we required our robot to only assign logical probability of 0 or 1 to statements where it's checked the proof. This flowed from our desire to have a robot that comes to conclusions in limited time. It's also important that this abstract definition has to take into account the pool of statements that our actual robot actually checks. However, this restriction doesn't give us a consistent way to assign numbers to unproven statements - to be consistent we have to put limits on our application of the usual rules of probability.

## Approaching Logical Probability

**Followup To:** Logic as Probability

If we design a robot that acts as if it's uncertain about mathematical statements, that violates some desiderata for probability. But realistic robots cannot prove all theorems; they have to be uncertain about hard math problems.

In the name of practicality, we want a foundation for decision-making that captures what it means to make a good decision, even with limited resources. "Good" means that even though our real-world robot can't make decisions well enough to satisfy Savage's theorem, we want to approximate that ideal, not throw it out. Although I don't have the one best answer to give you, in this post we'll take some steps forward.

## Logic as Probability

**Followup To:** Putting in the Numbers

Before talking about logical uncertainty, our final topic is the relationship between probabilistic logic and classical logic. A robot running on probabilistic logic stores probabilities of events, e.g. that the grass is wet outside, P(wet), and then if they collect new evidence they update that probability to P(wet|evidence). Classical logic robots, on the other hand, deduce the truth of statements from axioms and observations. Maybe our robot starts out not being able to deduce whether the grass is wet, but then they observe that it is raining, and so they use an axiom about rain causing wetness to deduce that "the grass is wet" is true.

Classical logic relies on complete certainty in its axioms and observations, and makes completely certain deductions. This is unrealistic when applied to rain, but we're going to apply this to (first order, for starters) math later, which a better fit for classical logic.

The general pattern of the deduction "It's raining, and when it rains the grass is wet, therefore the grass is wet" was modus ponens: if 'U implies R' is true, and U is true, then R must be true. There is also modus tollens: if 'U implies R' is true, and R is false, then U has to be false too. Third, there is the law of non-contradiction: "It's simultaneously raining and not-raining outside" is always false.

We can imagine a robot that does classical logic as if it were writing in a notebook. Axioms are entered in the notebook at the start. Then our robot starts writing down statements that can be deduced by modus ponens or modus tollens. Eventually, the notebook is filled with statements deducible from the axioms. Modus tollens and modus ponens can be thought of as consistency conditions that apply to the contents of the notebook.

## Putting in the Numbers

**Followup To:** Foundations of Probability

In the previous post, we reviewed reasons why having probabilities is a good idea. These foundations defined probabilities as numbers following certain rules, like the product rule and the rule that mutually exclusive probabilities sum to 1 at most. These probabilities have to hang together as a coherent whole. But just because probabilities hang together a certain way, doesn't actually tell us what numbers to assign.

I can say a coin flip has P(heads)=0.5, or I can say it has P(heads)=0.999; both are perfectly valid probabilities, as long as P(tails) is consistent. This post will be about how to actually get to the numbers.

## Foundations of Probability

**Beginning of:** Logical Uncertainty sequence

Suppose that we are designing a robot. In order for this robot to reason about the outside world, it will need to use probabilities.

Our robot can then use its knowledge to acquire cookies, which we have programmed it to value. For example, we might wager a cookie with the robot on the motion of a certain stock price.

In the coming sequence, I'd like to add a new capability to our robot. It has to do with how the robot handles very hard math problems. If we ask "what's the last digit of the 3^^^3'th prime number?", our robot should at some point *give up*, before the sun explodes and the point becomes moot.

If there are math problems our robot can't solve, what should it do if we offer it a bet about the last digit of the 3^^^3'th prime? It's going to have to approximate - robots need to make lots of approximations, even for simple tasks like finding the strategy that maximizes cookies.

Intuitively, it seems like if we can't find the real answer, the last digit is equally likely to be 1, 3, 7 or 9; our robot should take bets as if it assigned those digits equal probability. But to assign some probability to the wrong answer is logically equivalent to assigning probability to 0=1. When we learn more, it will become clear that this is a problem - we aren't ready to upgrade our robot yet.

Let's begin with a review of the foundations of probability.

## Probability and radical uncertainty

In the previous article in this sequence, I conducted a thought experiment in which simple probability was not sufficient to choose how to act. Rationality required reasoning about *meta-probabilities*, the probabilities of probabilities.

Relatedly, lukeprog has a brief post that explains how this matters; a long article by HoldenKarnofsky makes meta-probability central to utilitarian estimates of the effectiveness of charitable giving; and Jonathan_Lee, in a reply to that, has used the same framework I presented.

In my previous article, I ran thought experiments that presented you with various colored boxes you could put coins in, gambling with uncertain odds.

The last box I showed you was blue. I explained that it had a fixed but unknown probability of a twofold payout, uniformly distributed between 0 and 0.9. The overall probability of a payout was 0.45, so the expectation value for gambling was 0.9—a bad bet. Yet your optimal strategy was to gamble a bit to figure out whether the odds were good or bad.

Let’s continue the experiment. I hand you a black box, shaped rather differently from the others. Its sealed faceplate is carved with runic inscriptions and eldritch figures. “I find this one *particularly* interesting,” I say.

## The dangers of zero and one

Eliezer wrote a post warning against unrealistically confident estimates, in which he argued that you can't be 99.99% sure that 53 is prime. Chris Hallquist replied with a post arguing that you can.

That particular case is tricky. There have been many independent calculations of the first hundred prime numbers. 53 is a small enough number that I think someone would notice if Wikipedia included it erroneously. But can you be 99.99% confident that 1159 is a prime? You found it in one particular source. Can you trust that source? It's large enough that no one would notice if it were wrong. You could try to verify it, but if I write a Perl or C++ program, I can't even be 99.9% sure that the compiler or interpreter will interpret it correctly, let alone that the program is correct.

Rather than argue over the number of nines to use for a specific case, I want to emphasize the the importance of not assigning things probability zero or one. Here's a real case where approximating 99.9999% confidence as 100% had disastrous consequences.

## Probability, knowledge, and meta-probability

This article is the first in a sequence that will consider situations where probability estimates are not, by themselves, adequate to make rational decisions. This one introduces a "meta-probability" approach, borrowed from E. T. Jaynes, and uses it to analyze a gambling problem. This situation is one in which reasonably straightforward decision-theoretic methods suffice. Later articles introduce increasingly problematic cases.

View more: Next