Comment author: Vaniver 22 October 2014 10:14:55PM 0 points [-]

The other is Cari Kaufman, who builds probability distributions over the results of a climate simulation. (the idea seems to be to extrapolate from simulations actually run with similar but not identical parameters)

I was introduced to the idea of 'emulation' of complex models by Tony O'Hagan a few years back, where you use a Gaussian Process to model what a black box simulation will give across all possible inputs, seeded with actual simulation runs that you performed. (This also helps with active learning, in that you can find the regions of the input space where you're most uncertain what the simulation will give, and then run a simulation with those input parameters.) I believe the first application it saw was also in climate modeling.
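To make the emulation idea concrete, here is a minimal sketch in plain Python. It is not O'Hagan's method or code: the simulator `expensive_simulation`, the squared-exponential kernel, its length scale, and the three "already run" inputs are all assumptions chosen for illustration. It conditions a zero-mean Gaussian process on a few simulator runs and then picks the input where the emulator is most uncertain as the next run.

```python
import math

def rbf(a, b, length_scale=0.5):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def emulate(x_run, y_run, x_new, jitter=1e-9):
    """Posterior mean and variance of a zero-mean GP at x_new, given the runs."""
    k_rr = [[rbf(a, b) + (jitter if i == j else 0.0)
             for j, b in enumerate(x_run)] for i, a in enumerate(x_run)]
    k_nr = [rbf(x_new, a) for a in x_run]
    mean = sum(k * w for k, w in zip(k_nr, solve(k_rr, y_run)))
    # Prior variance is 1 for this kernel; subtract what the runs explain.
    var = 1.0 - sum(k * w for k, w in zip(k_nr, solve(k_rr, k_nr)))
    return mean, max(var, 0.0)

def expensive_simulation(x):
    """Hypothetical stand-in for a slow black-box simulator."""
    return math.sin(3.0 * x)

x_run = [0.0, 0.4, 2.0]                       # inputs actually simulated
y_run = [expensive_simulation(x) for x in x_run]
grid = [i / 100 for i in range(201)]          # candidate inputs on [0, 2]
variances = [emulate(x_run, y_run, x)[1] for x in grid]
next_x = grid[variances.index(max(variances))]  # active-learning choice
```

The chosen `next_x` falls in the wide gap between the runs at 0.4 and 2.0, which is where the emulator has the least information. A real application would use a numerical library and fit the kernel parameters rather than fixing them by hand.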

Comment author: alex_zag_al 23 October 2014 01:41:15AM *  0 points [-]

Do you know of any cases where this simulation-seeded Gaussian Process was then used as a prior, and updated on empirical data?

Like...

  • uncertain parameters --simulation--> distribution over state

  • noisy observations --standard bayesian update--> refined distribution over state
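The two steps above can be sketched roughly as follows. This is only an illustration under loud assumptions: `simulate` is a made-up linear stand-in for a real simulator, the parameter and noise distributions are invented Gaussians, and the prior is represented by Monte Carlo samples rather than by a Gaussian Process emulator.

```python
import math
import random

def simulate(parameter):
    """Hypothetical simulator mapping an uncertain parameter to a state."""
    return 2.0 * parameter + 1.0

def gaussian_pdf(x, mean, std):
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

rng = random.Random(0)

# Step 1: uncertain parameters --simulation--> samples from the prior over state.
states = [simulate(rng.gauss(1.0, 0.5)) for _ in range(5000)]
prior_mean = sum(states) / len(states)

# Step 2: noisy observation --Bayesian update--> reweighted (posterior) samples.
observation, noise_std = 4.0, 0.5
weights = [gaussian_pdf(observation, s, noise_std) for s in states]
posterior_mean = sum(w * s for w, s in zip(weights, states)) / sum(weights)
```

With these made-up numbers the prior over the state is roughly N(3, 1), so the posterior mean lands between the simulation-based prior mean and the observation, closer to the observation because the measurement noise is smaller than the prior spread.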

Cari Kaufman's research profile made me think that's something she was interested in. But I haven't found any publications by her or anyone else that actually do this.

I actually think that I misread her research description, latching on to the one familiar idea.

Comment author: Wei_Dai 13 May 2013 08:39:59AM 22 points [-]

This is for people interested in optimizing for academic fame (for a given level of talent and effort and other costs). Instead of trying to get a PhD and a job in academia (which is very costly and due to "publish or perish" forces you to work on topics that are currently popular in academia), get a job that leaves you with a lot of free time, or find a way to retire early. Use your free time to search for important problems that are being neglected by academia. When you find one, pick off some of the low-hanging fruit in that area and publish your results somewhere. Then, (A) if you're impatient for recognition, use your results to make an undeniable impact on the world (see Bitcoin for example), or (B) if you're patient, move on to another neglected topic and repeat, knowing that in a few years or decades, the neglected topic you found will likely become a hot topic and you'll be credited for being the first to investigate it.

Comment author: alex_zag_al 19 October 2014 05:44:37PM 1 point [-]

This reminds me of the story of Robert Edgar, who created the DNA and protein sequence alignment program MUSCLE.

He got a PhD in physics, but considers that a mistake. He did his bioinformatics work after selling a company and having free time. The bioinformatics work was notable enough that it's how I know of him.

His blog post, from which I learned this story: https://thewinnower.com/discussions/an-unemployed-gentleman-scholar

Comment author: cousin_it 19 October 2014 02:14:18PM *  3 points [-]

Coscott's Maximize Worst case Bayes Score and my notes on logical priors might also be relevant.

Comment author: alex_zag_al 19 October 2014 04:44:49PM *  2 points [-]

added, with whatever little bits of summary I could get by skimming.

Comment author: lackofcheese 19 October 2014 05:13:03AM *  4 points [-]

As far as AI is concerned, you don't need to go anywhere near as specialised as FAI to find something where logical uncertainty is directly applicable.

Every search problem in AI is an instance of logical uncertainty, and every search algorithm is a different way of attempting to deal with that uncertainty.

Comment author: alex_zag_al 19 October 2014 04:26:38PM *  3 points [-]

It's true that this is a case of logical uncertainty.

However, I must add that in most of my examples, I bring up the benefits of a probabilistic representation. Just because you have logical uncertainty doesn't mean you need to represent it with probability theory.

In protein structure, we already have these Bayesian methods for inferring the fold, so the point of the probabilistic representation is to plug it into these methods as a prior. In philosophy, we want ideal rationality, which suggests probability. In automated theorem proving... okay, yeah, in automated theorem proving I can't explain why you'd want to use probability theory in particular.

But yes. If you had a principled way to turn your background information and already done computations into a probability distribution for future computations, you could use that for AI search problems. And optimization problems. Wow, that's a lot of problems. I'm not sure how it would stack up against other methods, but it'd be interesting if that became a paradigm for at least some problems.

In fact, now that you've inspired me to look for it, I find that it's being done! Not with the approach you see in Christiano's report, of coming up with a distribution over all mathematical statements, which is the approach I had in mind when writing the post. Rather, with an approach like the one I think Cari Kaufman uses, where you guess based on nearby points. This is accomplished by modeling a difficult-to-evaluate function as a stochastic process with some kind of local correlations, like a Gaussian process, so that you get a probability distribution for the value of the function at each point. What I'm finding is that this is, in fact, an approach people use for optimizing difficult-to-evaluate objective functions. See here for the details: Efficient Global Optimization of Expensive Black-Box Functions, by Jones, Schonlau and Welch.
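As a rough illustration of how such per-point distributions get used in optimization, here is a sketch of the expected-improvement criterion associated with that line of work. The candidate names, predicted means and standard deviations are invented; in practice they would come from a fitted stochastic-process model.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mean, std, best_so_far):
    """Expected amount by which a N(mean, std^2) prediction undercuts
    best_so_far, for a minimization problem."""
    if std == 0.0:
        return max(best_so_far - mean, 0.0)
    z = (best_so_far - mean) / std
    return (best_so_far - mean) * normal_cdf(z) + std * normal_pdf(z)

# Hypothetical model predictions (mean, std) at three candidate points.
candidates = {
    "near known optimum": (0.95, 0.05),
    "unexplored region": (1.50, 1.00),
    "known bad region": (3.00, 0.05),
}
best_so_far = 1.0  # lowest objective value evaluated so far (assumed)
scores = {name: expected_improvement(m, s, best_so_far)
          for name, (m, s) in candidates.items()}
next_point = max(scores, key=scores.get)
```

With these numbers the unexplored region wins even though its predicted mean is worse than the best value found so far, because its large uncertainty leaves room for a big improvement. That trade-off between exploiting good predictions and exploring uncertain ones is the point of using a probability distribution rather than a point estimate.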

Comment author: jkaufman 31 August 2013 11:58:54AM 3 points [-]

Did people in 1711 classify their work into "Math, Phys, Chem, Bio, and Phil"? What if ideas that we call Philosophy now are a subset of what someone in 1711 would be working on?

Comment author: alex_zag_al 19 October 2014 02:13:31PM 0 points [-]

They wouldn't classify their work that way, and in fact I thought that was the whole point of surveying these other fields. Like, for example, a question for philosophers in the 1600s is now a question for biologists, and that's why we have to survey biologists to find out if it was resolved.

Comment author: somnicule 19 October 2014 06:13:34AM 2 points [-]

I think most formulations of logical uncertainty give axioms and proven propositions probability 1, or 1-minus-epsilon.

Comment author: alex_zag_al 19 October 2014 01:30:43PM 4 points [-]

Yes. Because, we're trying to express uncertainty about the consequences of axioms. Not about axioms themselves.

common_law's thinking does seem to be something people actually do. Like, we're uncertain about the consequences of the laws of physics, while simultaneously being uncertain of the laws of physics, while simultaneously being uncertain if we're thinking about it in a logical way. But, it's not the kind of uncertainty that we're trying to model, in the applications I'm talking about. The missing piece in these applications is probabilities conditional on axioms.

Comment author: lukeprog 19 October 2014 03:19:02AM *  4 points [-]
Comment author: alex_zag_al 19 October 2014 07:27:53AM 2 points [-]

Nice. Links added to post and I'll check them out later. The Duc and Williamson papers were from a post of yours, by the way. Some MIRI status report or something. I don't remember.

Applications of logical uncertainty

16 alex_zag_al 18 October 2014 07:26PM

Lately I've been reading a lot about the problem of logical uncertainty. The problem is that there are logical consequences of your beliefs that you're uncertain about.

So, for statements like "the billionth digit of pi is even", it's provable from your beliefs. But you're still uncertain of it. So, the problem is, what's its probability? Well, 1/2, probably, but from what principled theory can you derive that?
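One informal way to see where the 1/2 intuition comes from is to look at the digits we can afford to compute. The sketch below uses a standard unbounded spigot algorithm (Gibbons') to generate digits of pi and measures how often they are even. This is only an empirical frequency over the first 1000 digits, an arbitrary cutoff; it is not the principled theory the post is asking for.

```python
from itertools import islice

def pi_digits():
    """Yield decimal digits of pi one at a time (Gibbons' unbounded spigot)."""
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            yield n
            # Right-hand sides use the old values of q, r and n.
            q, r, n = 10 * q, 10 * (r - n * t), (10 * (3 * q + r)) // t - 10 * n
        else:
            q, r, t, k, n, l = (q * k, (2 * q + r) * l, t * l, k + 1,
                                (q * (7 * k + 2) + r * l) // (t * l), l + 2)

digits = list(islice(pi_digits(), 1000))   # 3, 1, 4, 1, 5, 9, ...
even_fraction = sum(d % 2 == 0 for d in digits) / len(digits)
```

The fraction comes out close to one half, which matches the intuition but does not justify it: nothing here says why the digits already computed should be informative about the billionth one.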

That's the question. What probabilities we should assign to logical statements, and what laws of probability apply to them.

With all the time I've been spending on it, I've been wondering, will I ever have the opportunity to use it? My conclusion is, probably not me, personally. But I have come up with a decent list of ways other people might use it.

Combining information from simulation and experiment

This is why I first became interested in the problem--I was working in protein structure prediction.

The problem in protein structure prediction is figuring out how the protein folds. A protein is a big molecule. They're always these long, linear chains of atoms with relatively short side-branches. Interestingly, there are easy experiments to learn what they'd look like all stretched out in a line like this. But in reality, this long chain is folded up in a complex way. So, the problem is: given the linear, stretched out structure, what's the folded structure?

We have two general approaches to solve this problem.

One is a physics calculation. A protein folds because of electromagnetic forces, and we know how those work. So, you can set up a quantum physics simulation and end up with the correct fold. But you won't really end up with anything because your simulation will never finish if you do it precisely. So a lot of work goes into coming up with realistic approximate calculations, and using bigger computers. (Or distributed computing - Rosetta@Home distributes the calculations over the idle time of people's home computers, and you can download it now if you want your computer to help solve protein structures.)

Another is by experiment. We get some kind of reading from the protein structure--X-rays or radio waves, usually--and use that to solve the structure. This can be thought of as Bayesian inference, and, in fact, that's how some people do it--see a book dear to my heart, Bayesian Methods in Structural Bioinformatics. Some of the chapters are about P(observations|structure), and some are about the prior distribution, P(structure). This prior usually comes from experience with previous protein structures.
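Written out, the inference combining those two ingredients is just Bayes' theorem, with the protein structure as the hypothesis:

```latex
P(\text{structure} \mid \text{observations})
  = \frac{P(\text{observations} \mid \text{structure}) \; P(\text{structure})}
         {P(\text{observations})}
```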

(A third approach that doesn't really fit into this picture is the videogame FoldIt.)

But, it occurred to me, doesn't the simulation provide a prior?

This is actually a classic case of logical uncertainty. Our beliefs actually imply a fold--we know the linear chain, we know the laws of physics, and from that you can theoretically calculate the folded structure. But we can't actually do it, which is why we need experiments. The experiments are actually resolving logical uncertainty.

My dream is for us to be able to do that explicitly, in the math. To calculate a prior using the laws of physics plus logical uncertainty, and then condition on experimental evidence to resolve that uncertainty.

There are two people that I know of, doing research that resembles this. One is Francesco Stingo. He published a method for detecting binding between two different kinds of molecules--miRNA and mRNA. His method has a prior that is based in part on chemistry-based predictions of binding, and updated on the results of microarray experiments. The other is Cari Kaufman, who builds probability distributions over the results of a climate simulation. (the idea seems to be to extrapolate from simulations actually run with similar but not identical parameters)

What I have in mind would accomplish the same kind of thing in a different way--the prior would come from a general theory for assigning probabilities to statements of mathematics, and "the laws of physics predict these two molecules bind" would be one such statement.

Automated theorem proving

I don't know anything about this field, but this kind of makes sense--what if you could decide which lemmas to try and prove based on their probability? That'd be cool.

Or, decide which lemmas are important with value of information calculations.
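A toy version of that kind of prioritization might look like the sketch below. Everything in it is hypothetical: the lemma names, the probabilities that each is provable, the payoffs and the attempt costs are made up, and expected net value is only a crude stand-in for a real value-of-information calculation.

```python
def expected_net_value(prob_provable, payoff, attempt_cost):
    """Expected payoff of attempting a proof, minus the cost of the attempt."""
    return prob_provable * payoff - attempt_cost

# Hypothetical candidates: (probability provable, payoff if proved, attempt cost).
lemmas = {
    "lemma_a": (0.9, 1.0, 0.5),    # likely true, but not very useful
    "lemma_b": (0.3, 10.0, 1.0),   # a long shot that would unlock a lot
    "lemma_c": (0.5, 2.0, 2.0),    # costs more to attempt than it is worth
}

ranked = sorted(lemmas,
                key=lambda name: expected_net_value(*lemmas[name]),
                reverse=True)
```

Here the long shot ranks first, which is the kind of decision that needs actual probabilities and not just a true/false guess about each lemma.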

(AlanCrowe talked about something like this in a comment)

Friendly AI

I don't really understand this, something about a "Löbian obstacle." But Squark has a bit of an overview at "Overcoming the Loebian obstacle using evidence logic". And qmaurmann wrote something on a similar topic, "Meditations on Löb’s theorem and probabilistic logic." eli_sennesh also recently wrote a related post.

There's an interesting result, in this line of research. It says that, although it's well known that a formal system cannot contain a predicate representing truth in that system, this is actually just because formal proof systems only assign certainties of 0 and 1. If they instead gave false statements probability infinitesimally close to 0, and true statements infinitesimally close to 1, they could define truth. (Disclaimer: I can't follow all the math and the result has not yet been published in a peer reviewed journal). This is such an odd use of probabilistic logical uncertainty, nothing like the applications I've been imagining, but there it is.

(walkthrough of the paper here)

Philosophy

Philosophically, I want to know how you calculate the rational degree of belief in every proposition. Good philosophical theories don't necessarily need to be practical to use. Even if it's impossible to compute, it's got to give the right answer in simple thought experiments, so I feel like I can reason without paradox. But even getting all these simple thought experiments right is hard. Good philosophical theories are hard to come by.

Bayesian confirmation theory has been a successful philosophical theory for thought experiments involving the confirmation of theories by evidence. It tells you how belief in a hypothesis should go up or down after seeing observational evidence.

But what about mathematical evidence, what about derivations? Wasn't Einstein's derivation that general relativity predicts the orbit of Mercury evidence in favor of the theory? I want a philosophical theory, I want basic principles, that will tell me "yes." That will tell me the rational degree of belief is higher if I know the derivation.

Philosophers frame this as the "problem of old evidence." They think of it like: you come up with a new theory like GR, consider old evidence like Mercury, and somehow increase your confidence in the new theory. The "paradox" is that you didn't learn of any new evidence, but your confidence in the theory still changed. Philosopher Daniel Garber argues that the new evidence is a mathematical derivation, and the problem would be solved by extending Bayesian probability theory to logical uncertainty. (The paper is called "Old Evidence and Logical Omniscience in Bayesian Confirmation Theory," and can be found here)
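In symbols, the problem is that evidence already known for certain cannot move a Bayesian's beliefs. The notation below is an assumption for illustration: H is the theory, E the old evidence, and "H \vdash E" the statement that E is derivable from H.

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E)}, \qquad
P(E) = 1 \;\Rightarrow\; P(E \mid H) = 1 \;\Rightarrow\; P(H \mid E) = P(H).

\text{With logical uncertainty, } P(H \vdash E) < 1 \text{ is allowed, so learning the derivation can give }
P(H \mid H \vdash E) > P(H).
```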

A more general philosophical theory than Bayesian confirmation theory is Solomonoff induction, which provides a way to assign probabilities to any observation given any past observations. It requires infinite computing power. However, there still is a rational degree of belief for an agent with finite computing power to have, given the computations that they've done. I want the complete philosophical theory that tells me that.

Learning about probabilistic logical uncertainty

I made a list of references here, if you'd like to learn more about the subject.

Logical uncertainty reading list

17 alex_zag_al 18 October 2014 07:16PM

This was originally part of a post I wrote on logical uncertainty, but it turned out to be post-sized itself, so I'm splitting it off.

Daniel Garber's article Old Evidence and Logical Omniscience in Bayesian Confirmation Theory. Wonderful framing of the problem--explains the relevance of logical uncertainty to the Bayesian theory of confirmation of hypotheses by evidence.

Articles on using logical uncertainty for Friendly AI theory: qmaurmann's Meditations on Löb’s theorem and probabilistic logic. Squark's Overcoming the Loebian obstacle using evidence logic. And Paul Christiano, Eliezer Yudkowsky, Marcello Herreshoff, and Mihaly Barasz's Definability of Truth in Probabilistic Logic. So8res's walkthrough of that paper, and qmaurmann's notes. eli_sennesh just made a post on this: Logics for Mind-Building Should Have Computational Meaning.

Benja's post on using logical uncertainty for updateless decision theory.

cousin_it's Notes on logical priors from the MIRI workshop. Addresses a logical-uncertainty version of Counterfactual Mugging, but in the course of that has, well, notes on logical priors that are more general.

Reasoning with Limited Resources and Assigning Probabilities to Arithmetical Statements, by Haim Gaifman. Shows that you can give up on giving logically equivalent statements equal probabilities without much sacrifice of the elegance of your theory. Also, gives a beautifully written framing of the problem.

manfred's early post, and later sequence. Amazingly readable. The proposal gives up Gaifman's elegance, but actually goes as far as assigning probabilities to mathematical statements and using them, whereas Gaifman never follows through to solve an example afaik. The post or the sequence may be the quickest path to getting your hands dirty and trying this stuff out, though I don't think the proposal will end up being the right answer.

There's some literature on modeling a function as a stochastic process, which gives you probability distributions over its values. The information in these distributions comes from calculations of a few values of the function. One application is in optimizing a difficult-to-evaluate objective function: see Efficient Global Optimization of Expensive Black-Box Functions, by Donald R. Jones, Matthias Schonlau, and William J. Welch. Another is when you're doing simulations that have free parameters, and you want to make sure you try all the relevant combinations of parameter values: see Design and Analysis of Computer Experiments by Jerome Sacks, William J. Welch, Toby J. Mitchell, and Henry P. Wynn.

Maximize Worst Case Bayes Score, by Coscott, addresses the question: "Given a consistent but incomplete theory, how should one choose a random model of that theory?"

Bayesian Networks for Logical Reasoning by Jon Williamson. Looks interesting, but I can't summarize it because I don't understand it.

And, a big one that I'm still working through: Non-Omniscience, Probabilistic Inference, and Metamathematics, by Paul Christiano. Very thorough, goes all the way from trying to define coherent belief to trying to build usable algorithms for assigning probabilities.

Dealing With Logical Omniscience: Expressiveness and Pragmatics, by Joseph Y. Halpern and Riccardo Pucella.

Reasoning About Rational, But Not Logically Omniscient Agents, by Ho Ngoc Duc. Sorry about the paywall.

And then the references from Christiano's report:

Abram Demski. Logical prior probability. In Joscha Bach, Ben Goertzel, and Matthew Ikle, editors, AGI, volume 7716 of Lecture Notes in Computer Science, pages 50-59. Springer, 2012.

Marcus Hutter, John W. Lloyd, Kee Siong Ng, and William T. B. Uther. Probabilities on sentences in an expressive logic. CoRR, abs/1209.2620, 2012.

Bas R. Steunebrink and Jürgen Schmidhuber. A family of Gödel machine implementations. In Jürgen Schmidhuber, Kristinn R. Thórisson, and Moshe Looks, editors, AGI, volume 6830 of Lecture Notes in Computer Science, pages 275-280. Springer, 2011.

If you have any more links, post them!

Or if you can contribute summaries.

Comment author: Watercressed 23 November 2013 06:41:29AM *  4 points [-]

One of his "desiderata", his principles of construction, is that the robot gives equal plausibility assignments to logically equivalent statements

I don't see this desideratum. The consistency requirement is that if there are multiple ways of calculating something, then all of them yield the same result. A few minutes of thought didn't lead to any way of leveraging a probability other than 0 or 1 for Prime(53) into two different results.

If I try to do anything with P(Prime(53)|PA), I get stuff like P(PA|Prime(53)), and I don't have any idea how to interpret that. Since PA is a set of axioms, it doesn't really have a truth value that we can do probability with. Technically speaking, Prime(N) means that the PA axioms imply that N has two factors. Since the axioms are in the predicate, any mechanism that forces P(Prime(53)) to be one must do so for all priors.

One final thing: Isn't it wrong to assign a probability of zero to Prime(4), i.e. PA implies that 4 has two factors, since PA could be inconsistent and imply everything?

Comment author: alex_zag_al 18 October 2014 04:43:44PM 0 points [-]

I now think you're right that logical uncertainty doesn't violate any of Jaynes's desiderata. Which means I should probably try to follow them more closely, if they don't create problems like I thought they would.

An Aspiring Rationalist's Ramble has a post asserting the same thing, that nothing in the desiderata implies logical omniscience.
