Followup To: Approaching Logical Probability

Last time, we required our robot to only assign logical probability of 0 or 1 to statements where it's checked the proof. This flowed from our desire to have a robot that comes to conclusions in limited time. It's also important that this abstract definition has to take into account the pool of statements that our actual robot actually checks. However, this restriction doesn't give us a consistent way to assign numbers to unproven statements - to be consistent we have to put limits on our application of the usual rules of probability.

Total Ignorance

The simplest solution is to assign logical probabilities to proven statements normally, but totally refuse to apply our information to unproven statements. The principle of maximum entropy means that every unproven statement then gets logical probability 0.5.

There is a correspondence between being inconsistent and ignoring information. We could just as well have said that when we move from proven statements to unproven statements, we refuse to apply the product rule, and that would have assigned every unproven statement logical probability 0.5 too. Either way, there's something "non-classical" going on at the boundary between proven statement and unproven statements. If you are certain that 100+98=198, but not that 99+99=198, some unfortunate accident has befallen the rule that if (A+1)+B=C, A+(B+1)=C.

Saying 0.5 for everything is rarely suggested in practice, because it has some glaringly bad properties: if we ask about the last digit of the zillionth prime number, we don't want to say that the answer being 1 has logical probability 0.5, we want our robot to use facts about digits and about primes to say that 1, 3, 7, and 9 have logical probability 0.25 each.

Using Our Pool of Proof-Steps

The most obvious process that solves this problem is to assign logical probabilities by ignoring most of the starting information, but using as information all proven statements containing no variables (that is, can't regenerate a set of axioms that will require us to take infinite time). So if we prove that our prime number can't simultaneously end with 1 and 3, we'll never assign a combined probability to them greater than 1.

This is the most typical family of suggestions (of the ones that won't require infinite resources). See Benja, for example. Another phrasing of this solution is that it's like we start with the inconsistent prior of "1/2 to everything," and then update this according to checked proof steps, until we run out of time. The updating can also be described as if there's a bunch of different tables (models) that assign a true or false value to every statement, and when we learn that the prime can't end with 1 and 3 simultaneously, we rule out the models that say both of those are true and redistribute the probability evenly. To get answers in finite time, we can't actually compute any of these fancy descriptions, but we can compute the updates of just the statement we want.

Even though the probabilities assigned by this approach are more sensible, violations of normal probability still occur at the boundary between proven and unproven statements. We're giving our robot more information to work with, but still not as much as a robot with infinite computing power could use.

A sneaky issue here is that since using checked steps as information takes time, that's less time available to find the solution. This is a general rule - as tricks for assigning logical probabilities to unproven statements get better, they take up more time, so you only want to use them if you don't expect that time to be important. But calculating that tradeoff also takes time! Someone has probably solved this problem in other contexts, but I do not know the solution.

Logical Pattern-Matching

There is another interesting property we might want, which could be called logical pattern-matching. Suppose that our robot is trying to predict a very complicated machine. Further suppose that our robot knows the complete description of the machine, but it is too complicated for our robot to predict the output, or even to find any useful proofs about the machine's behavior.

At time step 1, our robot observes that the machine outputs "1." At time step 2, our robot observes that the machine outputs "2." We might now want our robot to "see the pattern," and guess that the machine is about to output 3.

Our solutions so far don't do this - our robot would need to prove a statement like "if its last output was 2, its next output is logically forbidden from being 5." If our machine is too complicated to prove statements like that about, our previous solutions won't even think of the previous outputs as information.

One way to make our robot care about complexity is to restrict the length of hypotheses to less than the length of the description of the machine. This is like giving the robot the information that the answer comes from a machine, that the description length of this machine is less than some number.

A big problem with this is time. If our robot has to average over the outcome of all these different hypotheses, this takes longer than just using the description itself to find the answer. In a sense, directly using knowledge about the description of the machine is too much for our robot to handle. When we just used the checked proof-steps, that was okay, but as you give the robot more information you also burden it by making it spend more time interpreting that information.

And yet, we want our robot to be able to do logical pattern matching quickly if it actually goes out and observes a complicated machine that prints "1, 2...". But this is another problem I don't know how to solve - we could just say "monte carlo" and wave our hands, but handwaving is frowned upon here, and for good reason.

Further Open Problems

In this post I've already mentioned two open problems: the tradeoff of searching for an exact solution versus having a good approximation, and the correct way to do logical pattern-matching. There are more unsolved problems that also deserve mention.

• Handling infinities. Current proposals have some pretty bad properties if there are an infinite number of possible answers. For example, if you prove that answers 1-1000 all have small logical probability, but don't prove anything about answer 1001, the robot might decide that since you didn't prove anything about it, it has probability 0.5, and is thus a really good idea. An example direction to go might be to restrict our robot to taking actions it's actually proved things about - but we can also come up with perverse situations where that's bad. Is there a better way?

• Integrating this approach into a larger framework of decision-making. This builds off of making tradeoffs with your computing time and handling infinities. Basically, we want our robot to make decisions in limited time, not just output logical probabilities in limited time, and making decisions requires considering your possible actions and the utility of outcomes, which are allowed to be really complicated and require approximation. And then, we need to somehow direct computational resources into different tasks to make the best decision.

• Integrating this approach with second-order arithmetic. If you look at MIRI's paper that uses a probability distribution over logical statements, their approach is quite different - for one, they don't allow for any limits on the robot's resources. And for another, there are all sorts of properties that are important when considering second-order arithmetic that we haven't needed yet. For example, what happens when we ask for P(P(this statement)<0.3)?


Thank you for reading the Logical Uncertainty sequence. I hope that things which were not obvious now seem obvious. If you want your logical probability distribution to have certain nice properties, it is a good idea to only slightly depart from the original desiderata of probability, and build up from there. Jumping straight to an answer is not necessary, and is probably a bad idea anyhow.


End of the sequence Logical Uncertainty

Previous Post: Approaching Logical Probability

New to LessWrong?

New Comment
8 comments, sorted by Click to highlight new comments since: Today at 2:50 PM

I read a comment in this thread by Armok_GoB, and it reminded me of some machine-learning angles you could take on this problem. Forgive me if I make a fool of myself on this, I'm fairly rusty. Here's my first guess as to how I'd solve the following:

open problem: the tradeoff of searching for an exact solution versus having a good approximation

Take a bunch of proven statements, and look at half of them. Generate a bunch of possible heuristics, and score them based on how well they predict the other half of the proven statements given the first half as proven. Keep track of how long it takes to apply a heuristic. Use the weighted combination of heuristics that worked best on known data, given various run-time constraints.

With a table of heuristic combinations and their historical effectiveness and computational time, and the expected value of having accurate information, you can quickly compute the expected value of running the heuristics. Then compare it against the expected computation time to see if it's worth running.

Finally, you can update the heuristics themselves whenever you decide to add more proofs. You can also check short run-time heuristics with longer run-time ones. Things that work better than you expected, you should expect to work better.

Oh, and the value-of-information calculation I mentioned earlier can be used to pick up some cheap computational cycles as well - if it turns out that whether or not the billionth digit of pi is "3" is worth $3.50, you can simply decide to not care about that question.

And to be rigorous, here are the hand-waving parts of this plan:

  1. Generate heuristics. How? I mean, you could simply write every program that takes a list of proofs, starting at the simplest, and start checking them. That seems very inefficient, though. There may be machine learning techniques for this that I simply have not been exposed to.

  2. Given a list of heuristics, how do you determine how well they work? I'm pretty sure this is a known-solved problem, but I can't remember the exact solution. If I remember right it's something along the lines of log-difference, where getting something wrong is worth more penalty points the more certain you are about it.

  3. Given a list heuristics, how do you find the best weighted combinations under a run-time constraint? This is a gigantic mess of linear algebra.

And another problem with it that I just found is that there's no room for meta-heuristics. If the proofs come in two distinguishable groups that are separately amenable to two different heuristics, then it's a really good idea to separate out these two groups and applying the better approach for that group. My approach seems like it'd be likely to miss this sort of insight.

If you are certain that 100+98=198, but not that 99+99=198, some unfortunate accident has befallen the rule that if (A+1)+B=C, A+(B+1)=C.

The "unfortunate accident" may be that you simply didn't have enough time to apply the rule. If you have limited time, you have to stop somewhere. There doesn't have to be anything magical between steps N and N+1, it could merely mean that the timer went off at step N.

[-]jbay10y20

Just to expand, this example looks deceptively suspicious to humans because the proof is so obvious, even to us, and can be done in a single line using that one rule. For simple arithmetic like this, a robot would be able to prove it immediately after being asked whether 99+99=198 with hardly any time or effort. The procedure is so simple that, in practice, it would be efficient to delay generating the proof until it's asked for. Many human readers will mentally prove it while reading the sentence without even noticing that it took any work to do so. It's so fast that there's no need to guess the answer probabilistically beforehand.

So I'm not sure this example is good for an intuitive illustration that "some unfortunate accident" is going on. It's very strange to run out of time in a procedure that takes so little time to complete, so it's hard to reason about the appropriateness of guessing at circumstance.

Maybe a better example would be, having already proved Pythagoras' theorem, to prove that the body diagonal of a unit cube has length sqrt(3). This might take most humans a little while to think about, and if someone ran a stopwatch and said you have five seconds to decide whether it's true or not, the idea of guessing P=0.5 for something you don't have time to prove might be easier for people to understand. =)

Yup, that's the idea :)

While obviously not rigorous enough for something serious, one obvious hack is to do the "0.5 unless proven" thing, and then have a long list of special case dumb heuristics with different weights that update that without any proofs involved at all. The list of heuristics could be gotten from some unsafe source like the programer or another AI or mechanical turk, and then the weights learned by first guessing and then proving to see if it was right, with heuristics that are to bad kicked out entirely.

[-]CCC10y10

At time step 1, our robot observes that the machine outputs "1." At time step 2, our robot observes that the machine outputs "2." We might now want our robot to "see the pattern," and guess that the machine is about to output 3.

At this stage, if I had to guess, I'd guess that the next output would be '4', doubling every time.

That's a good one too. Or " 121212" or "12222222". But not so much "1,2,sofa".

I have some rambling thoughts on the subject. I just hope they aren't too stupid or obvious ;-)

Let's take as a framework the aforementioned example of the last digit of the zillionth prime. We'll say that the agent will be rewarded if it gets it right, on, shall we say, a log scoring rule. This means that the agent is incentivised to give the best (most accurate) probabilities it can, given the information it has. The more unreasonably confident it is, the more it loses, and the same with underconfidence.

By the way, for now I will assume the agent fully knows the scoring rule it will be judges by. It is quite possible that this assumption raises problems of its own, but I will ignore them for now.

So, the agent starts with a prior over the possible answers (a uniform prior?), and starts updating itself. But it wants to figure out how long it will spend doing so, before it should give up and hand in for grading its "good enough" answer. This is the main problem we are trying to solve here.

In the degenerate case in which it has nothing else in the universe other than this to give it utility, I actually think it is the correct answer to work forever (or as long as it can before physically falling apart) on the answer. But we shall make the opposite assumption. Let's call the amount of utility lost to the agent as an opportunity cost in a given unit of time by the name C. (We shall also make the assumption that the agent knows what C is, at least approximately. This is perhaps a slightly more dangerous assumption, but we shall accept it for now.)

So, the agent want to work for as many units of time as it can before the marginal amount of extra utility it would earn from the scoring rule from the work of a unit time is less than C.

The only problem left is figuring out that margin. But, by the assumption that the agent knows the scoring rule, it knows the derivative of the scoring function as well. At any given point in time, it can figure out the amount of change to the potential utility it would get from the change to the probabilities it assigns. Thus, if the agent knows approximately the range in which it may update in the next step, it can figure out whether or not the next stage is worthwhile.

In other words, once it is close enough to the answer that it predicts that a marginal update would move it closer to the answer by an amount that gives less than C utility, it can quit, and not perform the next step.

This makes sense, right? I do suspect that this is the direction to drive at in the solution to this problem.