Applications of logical uncertainty
Lately I've been reading a lot about the problem of logical uncertainty. The problem is that there are logical consequences of your beliefs that you're uncertain about.
Take a statement like "the billionth digit of pi is even". Its truth is settled by your beliefs: either the statement or its negation is provable from them. But you're still uncertain of it. So, the problem is, what's its probability? Well, 1/2, probably, but from what principled theory can you derive that?
That's the question: what probabilities should we assign to logical statements, and what laws of probability apply to them?
With all the time I've been spending on it, I've been wondering, will I ever have the opportunity to use it? My conclusion is, probably not me, personally. But I have come up with a decent list of ways other people might use it.
Combining information from simulation and experiment
This is why I first became interested in the problem--I was working in protein structure prediction.
The problem in protein structure prediction is figuring out how the protein folds. Proteins are big molecules: long, linear chains of atoms with relatively short side-branches. Interestingly, there are easy experiments to learn what a protein would look like all stretched out in a line. But in reality, this long chain is folded up in a complex way. So, the problem is: given the linear, stretched-out structure, what's the folded structure?
We have two general approaches to solve this problem.
One is a physics calculation. A protein folds because of electromagnetic forces, and we know how those work. So, in principle, you can set up a quantum physics simulation and end up with the correct fold. In practice you won't end up with anything, because a precise simulation will never finish. So a lot of work goes into coming up with realistic approximate calculations, and using bigger computers. (Or distributed computing: Rosetta@Home distributes the calculations over the idle time of people's home computers, and you can download it now if you want your computer to help solve protein structures.)
Another is by experiment. We get some kind of reading from the protein structure--X-rays or radio waves, usually--and use that to solve the structure. This can be thought of as Bayesian inference, and, in fact, that's how some people do it--see a book dear to my heart, Bayesian Methods in Structural Bioinformatics. Some of the chapters are about P(observations|structure), and some are about the prior distribution, P(structure). This prior usually comes from experience with previous protein structures.
(A third approach that doesn't really fit into this picture is the videogame FoldIt.)
But, it occurred to me, doesn't the simulation provide a prior?
This is actually a classic case of logical uncertainty. Our beliefs actually imply a fold: we know the linear chain, we know the laws of physics, and from those you can theoretically calculate the folded structure. But we can't actually do it, which is why we need experiments. The experiments are actually resolving logical uncertainty.
My dream is for us to be able to do that explicitly, in the math. To calculate a prior using the laws of physics plus logical uncertainty, and then condition on experimental evidence to resolve that uncertainty.
There are two people I know of doing research that resembles this. One is Francesco Stingo. He published a method for detecting binding between two different kinds of molecules, miRNA and mRNA. His method has a prior that is based in part on chemistry-based predictions of binding, and is updated on the results of microarray experiments. The other is Cari Kaufman, who builds probability distributions over the results of a climate simulation. (The idea seems to be to extrapolate from simulations actually run with similar but not identical parameters.)
What I have in mind would accomplish the same kind of thing in a different way--the prior would come from a general theory for assigning probabilities to statements of mathematics, and "the laws of physics predict these two molecules bind" would be one such statement.
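To make the "simulation as prior, experiment as evidence" idea concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the three candidate folds, their energies, the experimental likelihoods, and the Boltzmann-style weighting that stands in for a real theory of logical uncertainty. It only shows the shape of the calculation, not how anyone actually does it.

```python
import math

# Hypothetical candidate folds with made-up energies (arbitrary units), as if
# from an approximate simulation. Lower energy means a more plausible fold.
approx_energy = {"fold_A": 1.0, "fold_B": 1.5, "fold_C": 3.0}

# Hypothetical values of P(observations | fold) from some experiment.
likelihood = {"fold_A": 0.05, "fold_B": 0.60, "fold_C": 0.10}

def prior_from_simulation(energies, temperature=1.0):
    """Turn approximate energies into a prior via Boltzmann-style weights.

    This weighting is a stand-in assumption, not a principled logical prior.
    """
    weights = {k: math.exp(-e / temperature) for k, e in energies.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def posterior(prior, likelihood):
    """Bayes' theorem over a finite set of hypotheses."""
    unnormalized = {k: prior[k] * likelihood[k] for k in prior}
    total = sum(unnormalized.values())
    return {k: v / total for k, v in unnormalized.items()}

prior = prior_from_simulation(approx_energy)
post = posterior(prior, likelihood)
for fold in prior:
    print(f"{fold}: prior {prior[fold]:.3f}, posterior {post[fold]:.3f}")
```

With these made-up numbers the simulation favors fold_A, but the experiment shifts most of the probability to fold_B. The experiment is correcting a calculation we couldn't finish, which is the sense in which it resolves logical uncertainty.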
Automated theorem proving
I don't know anything about this field, but this kind of makes sense--what if you could decide which lemmas to try and prove based on their probability? That'd be cool.
Or, decide which lemmas are important with value of information calculations.
(AlanCrowe talked about something like this in a comment)
Friendly AI
I don't really understand this, something about a "Löbian obstacle." But Squark has a bit of an overview at "Overcoming the Loebian obstacle using evidence logic". And qmaurmann wrote something on a similar topic, "Meditations on Löb’s theorem and probabilistic logic." eli_sennesh also recently wrote a related post.
There's an interesting result in this line of research. It says that, although it's well known that a formal system cannot contain a predicate representing truth in that system, this is actually just because formal proof systems only assign certainties of 0 and 1. If they instead gave false statements probability infinitesimally close to 0, and true statements infinitesimally close to 1, they could define truth. (Disclaimer: I can't follow all the math, and the result has not yet been published in a peer-reviewed journal.) This is such an odd use of probabilistic logical uncertainty, nothing like the applications I've been imagining, but there it is.
(walkthrough of the paper here)
Philosophy
Philosophically, I want to know how you calculate the rational degree of belief in every proposition. Good philosophical theories don't necessarily need to be practical to use. Even if it's impossible to compute, it's got to give the right answer in simple thought experiments, so I feel like I can reason without paradox. But even getting all these simple thought experiments right is hard. Good philosophical theories are hard to come by.
Bayesian confirmation theory has been a successful philosophical theory for thought experiments involving the confirmation of theories by evidence. It tells you how belief in a hypothesis should go up or down after seeing observational evidence.
But what about mathematical evidence? What about derivations? Wasn't Einstein's derivation that general relativity predicts the precession of Mercury's orbit evidence in favor of the theory? I want a philosophical theory, I want basic principles, that will tell me "yes." That will tell me the rational degree of belief is higher if I know the derivation.
Philosophers frame this as the "problem of old evidence." They think of it like: you come up with a new theory like GR, consider old evidence like Mercury, and somehow increase your confidence in the new theory. The "paradox" is that you didn't learn of any new evidence, but your confidence in the theory still changed. Philosopher Daniel Garber argues that the new evidence is a mathematical derivation, and the problem would be solved by extending Bayesian probability theory to logical uncertainty. (The paper is called "Old Evidence and Logical Omniscience in Bayesian Confirmation Theory," and can be found here)
A more general philosophical theory than Bayesian confirmation theory is Solomonoff induction, which provides a way to assign probabilities to any observation given any past observations. It requires infinite computing power. However, there still is a rational degree of belief for an agent with finite computing power to have, given the computations that they've done. I want the complete philosophical theory that tells me that.
Learning about probabilistic logical uncertainty
I made a list of references here, if you'd like to learn more about the subject.
Logical uncertainty reading list
This was originally part of a post I wrote on logical uncertainty, but it turned out to be post-sized itself, so I'm splitting it off.
Daniel Garber's article Old Evidence and Logical Omniscience in Bayesian Confirmation Theory. Wonderful framing of the problem--explains the relevance of logical uncertainty to the Bayesian theory of confirmation of hypotheses by evidence.
Articles on using logical uncertainty for Friendly AI theory: qmaurmann's Meditations on Löb’s theorem and probabilistic logic. Squark's Overcoming the Loebian obstacle using evidence logic. And Paul Christiano, Eliezer Yudkowsky, Marcello Herreshoff, and Mihaly Barasz's Definability of Truth in Probabilistic Logic. So8res's walkthrough of that paper, and qmaurmann's notes. eli_sennesh just made a post on this: Logics for Mind-Building Should Have Computational Meaning.
Benja's post on using logical uncertainty for updateless decision theory.
cousin_it's Notes on logical priors from the MIRI workshop. Addresses a logical-uncertainty version of Counterfactual Mugging, but in the course of that has, well, notes on logical priors that are more general.
Reasoning with Limited Resources and Assigning Probabilities to Arithmetical Statements, by Haim Gaifman. Shows that you can give up on giving logically equivalent statements equal probabilities without much sacrifice of the elegance of your theory. Also, gives a beautifully written framing of the problem.
manfred's early post, and later sequence. Amazingly readable. The proposal gives up Gaifman's elegance, but actually goes as far as assigning probabilities to mathematical statements and using them, whereas Gaifman never follows through to solve an example afaik. The post or the sequence may be the quickest path to getting your hands dirty and trying this stuff out, though I don't think the proposal will end up being the right answer.
There's some literature on modeling a function as a stochastic process, which gives you probability distributions over its values. The information in these distributions comes from calculations of a few values of the function. One application is in optimizing a difficult-to-evaluate objective function: see Efficient Global Optimization of Expensive Black-Box Functions, by Donald R. Jones, Matthias Schonlau, and William J. Welch. Another is when you're doing simulations that have free parameters, and you want to make sure you try all the relevant combinations of parameter values: see Design and Analysis of Computer Experiments by Jerome Sacks, William J. Welch, Toby J. Mitchell, and Henry P. Wynn.
Maximize Worst Case Bayes Score, by Coscott, addresses the question: "Given a consistent but incomplete theory, how should one choose a random model of that theory?"
Bayesian Networks for Logical Reasoning by Jon Williamson. Looks interesting, but I can't summarize it because I don't understand it.
And, a big one that I'm still working through: Non-Omniscience, Probabilistic Inference, and Metamathematics, by Paul Christiano. Very thorough, goes all the way from trying to define coherent belief to trying to build usable algorithms for assigning probabilities.
Dealing With Logical Omniscience: Expressiveness and Pragmatics, by Joseph Y. Halpern and Riccardo Pucella.
Reasoning About Rational, But Not Logically Omniscient Agents, by Ho Ngoc Duc. Sorry about the paywall.
And then the references from Christiano's report:
Abram Demski. Logical prior probability. In Joscha Bach, Ben Goertzel, and Matthew Ikle, editors, AGI, volume 7716 of Lecture Notes in Computer Science, pages 50-59. Springer, 2012.
Marcus Hutter, John W. Lloyd, Kee Siong Ng, and William T. B. Uther. Probabilities on sentences in an expressive logic. CoRR, abs/1209.2620, 2012.
Bas R. Steunebrink and Jürgen Schmidhuber. A family of Gödel machine implementations. In Jürgen Schmidhuber, Kristinn R. Thórisson, and Moshe Looks, editors, AGI, volume 6830 of Lecture Notes in Computer Science, pages 275-280. Springer, 2011.
If you have any more links, or if you can contribute summaries, post them!
A Limited But Better Than Nothing Way To Assign Probabilities to Statements of Logic, Arithmetic, etc.
If we want to reason with probability theory, we seem to be stuck if we want to reason about mathematics.
You can skip this paragraph and the next if you're familiar with the problem. But if you're not, here's an illustration. Suppose your friend has some pennies that she would like to arrange into a rectangle, which of course is impossible if the number of pennies is prime. Let's call the number of pennies N. Your friend would like to use probability theory to guess whether it's worth trying; if there's a 50% chance that Prime(N), she won't bother trying to make the rectangle. You might imagine that if she counts them and finds that there's an odd number, this is evidence of Prime(N); if she furthermore notices that the digits don't sum to a multiple of three, this is further evidence of Prime(N). In general, each test of compositeness that she knows should, if it fails, raise the probability of Prime(N).
But what happens instead is this. Suppose you both count them, and find that N=53. Being a LessWrong reader, you of course recognize from recently posted articles that N=53 implies Prime(N), though she does not. But this means that P(N=53) <= P(Prime(N)). If you're quite sure of N=53—that is, P(N=53) is near 1—then P(Prime(N)) is also near 1. There's no way for her to get a gradient of uncertainty from simple tests of compositeness. The probability is just some number near 1.
In general, conditional on the axioms, mathematical theorems have probability 1 if they're true, and 0 if they're false. Deriving these probabilities is exactly as difficult as deriving the theorems themselves.
A way of assigning actual probabilities to theorems occurred to me today. I usually see this problem discussed by folks that want to develop formal models of AI, and I don't know if this'll be helpful at all for that. But it's something I can imagine myself using when I want to reason about a mathematical conjecture in a basically reasonable way.
The basic idea is just, don't condition on the axioms. Condition on a few relevant things implied by the axioms. Then, maximize entropy subject to those constraints.
Like, say you're trying to figure out whether a number N is prime. Don't condition on any properties of numbers, any facts about addition or multiplication etc., just treat Prime(N) as a predicate. Then condition on a few facts about Prime(N). I'd start with the prime number theorem and a theorem about the speed of its convergence, which I imagine would lead to a prior distribution P(Prime(N)|N=n, F1) = 1/log(n) (where F1 is Fact1, the fact that the prime number theorem is true). Now, let's say we want to be able to update on noticing a number is odd, so let's add Fact2, which is that all prime numbers (apart from 2) are odd and half of all numbers are odd, which allows us to use Bayes' theorem:
P(Prime(N)|Odd(N), N=n, F1, F2) = P(Odd(N)|Prime(N), F2) P(Prime(N)|N=n, F1) / P(Odd(N)|F2) = 1 * (1/log(n)) / (1/2) = 2/log(n)
which gives us the intuitive conclusion that observing that a number is odd doubles the probability that it's prime.
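Here's a quick Python sketch of that update, in case you'd like to check it against actual primes. The function names are my own inventions for illustration, the prior is just the handwaved 1/log(n) (natural log) capped at 1, and Fact2 is applied while ignoring the prime 2. The exact trial-division test is there only to compare the heuristic with reality; the robot itself wouldn't get to use it.

```python
import math

def prior_prime(n):
    """P(Prime(N) | N=n, F1): the 1/log(n) heuristic, capped at 1 (needs n >= 2)."""
    return min(1.0, 1.0 / math.log(n))

def posterior_prime_given_odd(n):
    """Bayes update on Fact2: P(Odd | Prime) = 1 (ignoring 2), P(Odd) = 1/2."""
    p_odd_given_prime = 1.0
    p_odd = 0.5
    return min(1.0, p_odd_given_prime * prior_prime(n) / p_odd)

def is_prime(n):
    """Exact trial-division test, used only to check the heuristic."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def fraction_of_odds_that_are_prime(lo, hi):
    """Empirical frequency of primes among the odd numbers in [lo, hi)."""
    odds = [k for k in range(lo, hi) if k % 2 == 1]
    return sum(is_prime(k) for k in odds) / len(odds)

n = 10**6
predicted = posterior_prime_given_odd(n)
observed = fraction_of_odds_that_are_prime(n, n + 10_000)
print(f"predicted {predicted:.3f}, observed {observed:.3f}")
```

The predicted number is exactly twice the prior, and the observed frequency among odd numbers near a million should land close to it, since the prime number theorem is a statement about average density.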
It's a weird thing to do, leaving out part of your knowledge. Here's how I think of it, though, so that it's less weird.
Suppose that the way you do statistics is by building Jaynesian robots. I'm referring here to an image used over and over again in Jaynes's book Probability Theory: The Logic of Science. The image is that we're building robots that assign plausibilities to statements, and we try to construct these robots in such a way that these plausibilities are useful to us. That is, we try to make robots whose conclusions we'll actually believe, because otherwise, why bother?
One of his "desiderata", his principles of construction, is that the robot gives equal plausibility assignments to logically equivalent statements, which gets us into all this trouble when we try to build a robot we can ask about theorems. But I'm keeping this desideratum; I'm building fully Jaynesian robots.
All I'm doing is hiding some of my knowledge from the robot. And this actually makes sense to me, because sometimes I'd rather have an ignorant Jaynesian robot than a fully informed one.
This is because speed is always important, and sometimes an ignorant Jaynesian robot is faster. Like, let's say my friend and I are building robots to compete in a Number Munchers-like game, where you're given lots of very large numbers and have to eat only the primes. And let's make the levels timed, too. If my friend builds a Jaynesian robot that knows the fundamental axioms of number theory, it's going to have to thoroughly test the primeness of each number before it eats it, and it's going to run out of time. But let's say I carefully program mine with enough facts to make it effective, but not so many facts that it's slow. I'll win.
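A minimal sketch of what my "carefully programmed" robot might look like, in Python. To be clear about the assumptions: the choice of small primes, the function names, and the reward and penalty numbers are all hypothetical, and the update rule is my own extension of Fact2 to other small primes (a number not divisible by q has its probability of being prime multiplied by q/(q-1), treating the divisibility checks as independent).

```python
import math

# The only "facts about numbers" this robot is given (a hypothetical choice).
SMALL_PRIMES = (2, 3, 5, 7)

def quick_prime_probability(n, small_primes=SMALL_PRIMES):
    """Cheap plausibility that n is prime, from a 1/log(n) prior plus a few tests."""
    if n < 2:
        return 0.0
    p = min(1.0, 1.0 / math.log(n))  # prior from the prime number theorem
    for q in small_primes:
        if n == q:
            return 1.0   # n is one of the known small primes
        if n % q == 0:
            return 0.0   # a proper multiple of q is certainly composite
        # Every prime other than q passes this test, but only a fraction
        # (q - 1) / q of all numbers do, so Bayes multiplies by q / (q - 1).
        p *= q / (q - 1)
    return min(1.0, p)

def should_eat(n, reward=1.0, penalty=1.0):
    """Eat when the expected gain beats the expected loss (made-up payoffs)."""
    p = quick_prime_probability(n)
    return p * reward > (1.0 - p) * penalty

for n in (10**6, 10**6 + 3):
    print(n, round(quick_prime_probability(n), 3), should_eat(n, reward=3.0))
```

Each number costs a handful of divisions no matter how large it is, which is the whole point. Whether the robot should eat at a given probability depends on the game's payoffs, which is why they're left as parameters.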
This doesn't really apply to modeling AI, does it? I mean, it's a way to assign probabilities, but not the best one. Humans often do way better using, I don't know, analogy or whatever they do. So why would a self-modifying AI continue using this, after it finds a better way?
But it's very simple, and it does seem to work. Except for how I totally handwaved turning the prime number theorem into a prior. There could be all sorts of subtleties trying to get from "prime number theorem and some theorem on speed of convergence, and maximizing entropy" to "P(Prime(N)|N=n) = 1/log(n)". I at least know that it's not rigorously provable exactly as stated, since 1/log(n) is above 1 for small n.
And, I honestly don't know how to do any better than this, and still use probability theory. I have that Haim Gaifman paper, "Reasoning with Limited Resources and Assigning Probabilities to Arithmetical Statements", in which he describes an entirely different way of conceptualizing the problem, along with a much better probabilistic test of primeness. It's long though, and I haven't finished it yet.
Does anchoring on deadlines cause procrastination?
The phenomenon of anchoring seems to predict that deadlines will cause you to start a project near the deadline.
In more detail:
Any number you consider as an answer to a question will become an anchor and draw your answer towards it. Since you consider a deadline as a time to finish a project, your decision about when you should actually finish the project will be drawn towards it.
That'll make you start the project later, even though you know consciously that planning to finish a project near the deadline is a bad idea.
It's analogous to an example from Kahneman's Thinking, Fast and Slow: people buy more cans of soup when there's a sign telling them that they can only buy 12.
So, what I'm predicting is that anything that prevents anchoring will reduce procrastination when there's a deadline. Consciously deciding when you plan to finish by adjusting from a much earlier time, maybe?
EDIT: Brendon_Wong points out that "procrastination" really refers to putting things off, which has an emotional cause. I think he's right. What I'm talking about isn't really procrastination, then, but bad planning.
Influence of scientific research
I'm an undergraduate studying molecular biology, and I am thinking of going into science. In Timothy Gowers's "The Importance of Mathematics", he says that many mathematicians just do whatever interests them, regardless of social benefit. I'd rather do something with some interest or technological benefit to people outside of a small group with a very specific education.
Does anybody have any thoughts or links on judging the impact of the work on a research topic?
Clearly, the pursuit of a research topic must be producing truth to be helpful, and I've read Vladimir_M's heuristics regarding this.
Here's something I've tried. My current lab work is on the structure of membrane proteins in bacteria, so this is something I did to see where all this work on protein structure goes. I took a paper that I had found to be a very useful reference for my own work. It's about a protein that forms a pore in the bacterial membrane with a flexible loop, and it experiments with the influence of this loop on the protein's structure. I used the Web of Science database to find a list of about two thousand papers that cited papers that cited this loop paper. I looked through this two-steps-away list for the ones that were not about molecules. Without too much effort, I found a few. The farthest from molecules that I found was a paper on a bacterium that sometimes causes meningitis, discussing a particular stage in its colonization of the human body. A few of the two-steps-away articles were about antibiotics discovery; though molecular, this is a topic that has a great deal of impact outside of the world of research on biomolecules.
Though it occurs to me that it might be more fruitful to look the other way around: to identify some social benefits or interests people have, and see what scientific research is contributing the most to them.