dspeyer comments on So You Want to Save the World - Less Wrong

41 Post author: lukeprog 01 January 2012 07:39AM




Comment author: lukeprog 27 December 2011 09:44:04PM *  4 points [-]

Stuart Armstrong's explanation of the 5-and-10 problem is:

The five-and-ten problem (sometimes known as the heavy ghost problem) is a problem in certain types of [updateless decision theory]-like decision theories, when the fact that a counterfactual is known to be false makes the algorithm implement it.

Specifically, let there be a decision problem which involves the choice between $5 and $10, a utility function that values the $10 more than the $5, and an algorithm A that reasons something like:

"Look at all propositions of the form '(A decides to do X) implies (Utility=y)', and find the X that maximises y; then do X."

When faced with the above problem, certain types of algorithm can reason:

"The utility of $10 is greater than the utility of $5. Therefore I will never decide to choose $5. Therefore (A decides to do 'choose $5') is a false statement. Since a false statement implies anything, (A decides to do 'choose $5') implies (Utility=y) for any, arbitrarily high, value of y. Therefore this is the utility-maximising decision, and I should choose $5."

That is the informal, natural language statement of the problem. Whether the algorithm is actually vulnerable to the 5-and-10 problem depends on the details of what the algorithm is allowed to deduce about itself.
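The explosion described above can be made concrete with a toy simulation. This is only an illustrative sketch, not any of the decision theories under discussion; the function name and the stand-in "proof search" are invented for the example. The buggy reasoner first concludes which action it will take, which makes the implication about the other action vacuously true for any utility value.

```python
def provable_implications(agent_will_choose):
    """Return provable statements '(A chooses x) implies (utility = u)'.

    If the reasoner has already concluded the agent will never choose an
    action, the implication with that action as antecedent is vacuously
    true for *every* utility value -- the explosion described above.
    """
    facts = {"$5": 5, "$10": 10}          # the honest utilities
    implications = []
    for action, utility in facts.items():
        if agent_will_choose is None or action == agent_will_choose:
            implications.append((action, utility))  # sound implication
        else:
            # 'A chooses action' is believed false, so anything follows:
            implications.append((action, 10**9))    # vacuous, huge utility
    return implications

# Sound reasoner: draws no premature conclusion about its own choice.
sound = max(provable_implications(None), key=lambda p: p[1])
# Unsound reasoner: first "proves" it will pick $10, then argmaxes.
unsound = max(provable_implications("$10"), key=lambda p: p[1])

print(sound[0])    # chooses $10, from the honest utilities
print(unsound[0])  # chooses $5, via the vacuous implication
```

Whether a real algorithm falls into this trap depends, as the comment says, on exactly what it is permitted to deduce about its own behaviour before acting.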

However, some think Drescher's explanation is more accurate. Somebody should write a short paper on the problem so I can cite that instead. :)

Comment author: Vladimir_Nesov 28 December 2011 07:40:43AM *  20 points [-]

This is an incorrect description of 5-and-10. The description given is of a different problem (one of whose aspects is addressed in cousin_it's recent writeup; in that setting the problem is resolved by Lemma 2).

The 5-and-10 problem is concerned with the following (incorrect) line of reasoning by a hypothetical agent:

"I have to decide between $5 and $10. Suppose I decide to choose $5. I know that I'm a money-optimizer, so if I do this, $5 must be more money than $10, so this alternative is better. Therefore, I should choose $5."

Comment author: lukeprog 28 December 2011 02:52:04PM 2 points [-]

Thanks!

Comment author: [deleted] 31 December 2011 07:47:11PM 4 points [-]

Has anyone emailed Judea Pearl, John Harrison, Jon Williamson, et cetera, asking them to look at this?

Comment author: lukeprog 01 January 2012 12:25:07AM 3 points [-]

I doubt it.

Comment author: [deleted] 01 January 2012 02:30:44AM *  7 points [-]

Because academics don't care about blogs? Or doing so would project the wrong image of the Singularity Institute? Or no one thought of doing it? Or someone thought of it, but there were more important things to do first? Perhaps because it's inefficient marketing? Or people who aren't already on lesswrong have failed some rationality competence test? Or you're not sure it's safe to discuss?

Comment author: lukeprog 01 January 2012 02:55:16PM 10 points [-]

My plan has been to write up better, more precise specifications of the open problems before systematically sending them to top academics for comments.

Comment author: XiXiDu 01 January 2012 11:21:12AM 2 points [-]

Has anyone emailed Judea Pearl, John Harrison, Jon Williamson, et cetera, asking them to look at this?

Why don't you do it? I would if I could formulate those problems adequately.

Comment author: mavant 12 November 2013 02:11:23PM 0 points [-]

I don't really understand how this could occur in a TDT-agent. The agent's algorithm is causally dependent on '(max $5 $10), but considering the counterfactual severs that dependence. Observing a money-optimizer (let's call it B) choosing $5 over $10 would presumably cause the agent (call it A) to update its model of B to no longer depend on '(max $5 $10). Am I missing something here?

Comment author: Vladimir_Nesov 12 November 2013 07:13:43PM *  0 points [-]

Correctly getting to the comparison of $5 and $10 is the whole point of the exercise. An agent is trying to evaluate the consequences of its action, A, which is defined by agent's algorithm and is not known explicitly in advance. To do that, it could in some sense consider hypotheticals where its action assumes its possible values. One such hypothetical could involve a claim that A=$5. The error in question is about looking at the claim that A=$5 and making incorrect conclusions (which would result in an action that doesn't depend on comparing $5 and $10).

Comment author: linkhyrule5 26 July 2013 07:24:43AM 0 points [-]

This is probably a stupid question, but is this reducible to the Lobian obstacle? On the surface, it seems similar.

Comment author: Malcolm_Edwards 04 January 2012 06:17:50AM *  0 points [-]

It seems to me that any agent unable to solve this problem would be considerably less intelligent than a human.

Comment author: Benja 25 August 2012 06:25:44AM 1 point [-]

It does seem unlikely that an "expected utility maximizer" reasoning like this would manage to build interstellar spaceships, but that insight doesn't automatically help with building an agent that is immune to this and similar problems.

Comment author: SilasBarta 03 January 2012 08:06:23PM *  0 points [-]

That's strange; Luke normally has a good understanding of his sources, uses and explains them correctly, and so usually recognizes an incorrect explanation.

Comment author: [deleted] 28 December 2011 03:36:55AM 7 points [-]

Clearly, if the algorithm concludes that it will certainly not choose the $5, and then does choose the $5, it concluded wrong. But the reasoning seems impeccable, and there don't seem to be any false premises here. It smacks of the unexpected hanging paradox.

Ooh, but wait. Expanding that reasoning a bit, we have...

The utility of $10 is greater than the utility of $5. Therefore, an algorithm whose axioms are consistent will never decide to choose $5. I am an algorithm whose axioms are consistent. Therefore, I will never decide to choose $5.

The assumption "I am an algorithm whose axioms are consistent" is one that we already know leads to a contradiction, by Löb's theorem. If we can avoid the wrath of Löb's theorem, can we also avoid the five-and-ten problem?
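For reference, Löb's theorem can be stated in provability-logic notation, with the box read as "provable in the theory":

```latex
% Löb's theorem: if the theory proves that provability of P implies P,
% then the theory already proves P outright.
\text{If } \vdash \Box P \rightarrow P, \text{ then } \vdash P.

% Equivalently, as the characteristic axiom of provability logic (GL):
\Box(\Box P \rightarrow P) \rightarrow \Box P
```

Taking P to be a contradiction recovers Gödel's second incompleteness theorem: a consistent theory strong enough for arithmetic cannot prove its own consistency, which is why the quoted assumption leads to trouble.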

(Granted, this probably isn't the best place to say this.)

Comment author: Eliezer_Yudkowsky 28 December 2011 04:55:45PM 7 points [-]

If we can avoid the wrath of Löb's theorem, can we also avoid the five-and-ten problem?

Very likely yes. Now ask if I know how to avoid the wrath of Löb's theorem.

Comment author: wedrifid 28 December 2011 05:02:38PM 7 points [-]

Very likely yes. Now ask if I know how to avoid the wrath of Löb's theorem.

Do you know how to avoid the wrath of Lob's theorem?

Comment author: Eliezer_Yudkowsky 28 December 2011 05:04:36PM 4 points [-]

Not yet.

Comment author: Manfred 28 December 2011 07:21:45PM 3 points [-]

What kind of powers are you hoping for beyond this sort of thing?

Comment author: Anubhav 17 January 2012 08:29:19AM -1 points [-]

Now ask if I know how to avoid the wrath of Löb's theorem

For someone making a desperate effort to not be a cult leader, you really do enjoy arbitrarily ordering people around, don't you?

</humour possibly subject to Poe's law>

Comment author: ec429 28 December 2011 06:07:41PM *  3 points [-]

Now I see why TDT has been causing me unease - you're spot on that the 5-and-10 problem is Löbbish, but what's more important to me is that TDT in general tries to be reflective. Indeed, Eliezer on decision theory seems to be all about reflective consistency, and to me reflective consistency looks a lot like PA+Self.

A possible route to a solution (to the Löb problem Eliezer discusses in "Yudkowsky (2011a)") that I'd like to propose is as follows: we know how to construct P+1, P+2, ... P+w, etc. (forall Q, Q+1 = Q u {forall S, [](Q|-S)|-S}). We also know how to do transfinite ordinal induction... and we know that the supremum of all transfinite countable ordinals is the first uncountable ordinal, which corresponds to the cardinal aleph_1 (though ISTR Eliezer isn't happy with this sort of thing). So, P+Ω won't lose reflective trust for any countably-inducted proof, and our AI/DT will trust maths up to countable induction.

However, it won't trust scary transfinite induction in general - for that, we need a suitably large cardinal, which I'll call kappa, and then P+kappa reflectively trusts any proof whose length is smaller than kappa; we may in fact be able to define a large cardinal property, of a kappa such that PA+(the initial ordinal of kappa) can prove the existence of cardinals as large as kappa. Such a large cardinal may be too strong for existing set theories (in fact, the reason I chose the letter kappa is because my hunch is that Reinhardt cardinals would do the trick, and they're inconsistent with ZFC). Nonetheless, if we can obtain such a large cardinal, we have a reflectively consistent system without self-reference: PA+w_kappa doesn't actually prove itself consistent, but it does prove kappa to exist and thus proves that it itself exists, which to a Syntacticist is good enough (since formal systems are fully determined).

Comment author: cousin_it 28 December 2011 06:29:41PM 0 points [-]

Are you referring to something like ordinal analysis? Can I drop the mention of cardinals and define kappa as the smallest ordinal such that the proof-theoretic ordinal of PA+kappa is kappa? Sorry if these are stupid questions, I don't know very much about the topic.

Also I don't completely understand why such a system could be called reflectively consistent. Can you explain more formally what you mean by "proves that it itself exists"?

Comment author: ec429 28 December 2011 06:52:50PM 1 point [-]

Well, I'm not exactly an expert either (though next term at uni I'm taking a course on Logic and Set Theory, which will help), but I'm pretty sure this isn't the same thing as proof-theoretic ordinals.

You see, proofs in formal systems are generally considered to be constrained to have finite length. What I'm trying to talk about here is the construction of metasyntaxes in which, if A1, A2, ... are valid derivations (indexed in a natural and canonical way by the finite ordinals), then Aw is a valid derivation for ordinals w smaller than some given ordinal. A nice way to think about this is, in (traditionally-modelled) PA the set of numbers contains the naturals, because for any natural number n, you can construct the n-th iterate of ⁺ (successor) and apply it to 0. However, the set of numbers doesn't contain w, because to obtain that by successor application, you'd have to construct the w-th iterate of ⁺, and in the usual metasyntax infinite iterates are not allowed.

Higher-order logics are often considered to talk about infinite proofs in lower-order logics (eg. every time you quantify over something infinite, you do something that would take infinite proving in a logic without quantifiers), but they do this in a semantic way, which I as a Syntacticist reject; I am doing it in a syntactic way, considering only the results of transfinitely iterated valid derivations in the low-order logic.

As far as I'm aware, there has not been a great deal of study of metasyntax. It seems to me that (under the Curry-Howard isomorphism) transfinite-iteration metasyntax corresponds to hypercomputation, which is possibly why my kappa is (almost certainly) much, much larger than the Church-Kleene ordinal which (according to Wikipedia, anyway) is a strict upper bound on proof-theoretic ordinals of theories. w₁CK is smaller even than w₁, so I don't see how it can be larger than kappa.

Comment author: cousin_it 28 December 2011 07:31:21PM *  1 point [-]

Ok, I think I get it. You're talking about proofs of transfinite lengths, and using hypercomputation to check those, right? This seems way powerful, e.g. it pins down all statements in the arithmetical hierarchy (a statement at level N requires an N-dimensional hyper...quadrant... of proofs) and much more besides. A finite computer program probably can't use such a system directly. Do you have any ideas how to express the notion that the program should "trust" such a system? All ideas that come to my mind seem equivalent to extending PA with more and more powerful induction postulates (or equivalently, adding more and more layers of Con()), which just amounts to increasing the proof-theoretic ordinal as in my first comment.

Comment author: ec429 28 December 2011 09:21:17PM 0 points [-]

The computer program 'holds the belief that' this way-powerful system exists; while it can't implement arbitrary transfinite proofs (because it doesn't have access to hypercomputation), it can still modify its own source code without losing a meta each time: it can prove its new source code will increase utility over its old, without its new source code losing proof-power (as would happen if it only 'believed' PA+n; after n provably-correct rewrites it would only believe PA, and not PA+1). Once you get down to just PA, you have a What The Tortoise Said To Achilles-type problem: just because you've proved it, why should you believe it's true?

The trick to making way-powerful systems is not to add more and more Con() or induction postulates - those are axioms. I'm adding transfinite inference rules. As well as all the inference rules like modus ponens, we have one saying something like "if I can construct a transfinite sequence of symbols, and map those symbols to syntactic string-rewrite operations, then the-result-of the corresponding sequence of rewrites is a valid production". Thus, for instance, w-inference is stronger than adding w layers of Con(), because it would take a proof of length at least w to use all w layers of Con().

This is why I call it metasyntax; you're considering what would happen if you applied syntactic productions transfinitely many times.
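The simplest instance of such an infinitary inference rule is the classical ω-rule, which the "w-inference" described above generalizes: from separate proofs of every numerical instance, conclude the universal statement in a single step:

```latex
% The omega-rule: infinitely many premises, one per natural number.
\frac{\vdash \varphi(0) \qquad \vdash \varphi(1) \qquad \vdash \varphi(2) \qquad \cdots}
     {\vdash \forall n\, \varphi(n)}
```

A proof using this rule has infinitely many premises, which is exactly why it is stronger than any finite stack of Con() axioms: using ω layers of Con() in a single derivation would itself require a proof of transfinite length.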

I don't know, in detail, how to express the notion that the program should "trust" such a system, because I don't really know how a program can "trust" any system: I haven't ever worked with/on automated theorem provers, nor any kind of 'neat AI'; my AGI experiments to date have all been 'scruffy' (and I stopped doing them when I read EY on FAI, because if they were to succeed (which they obviously won't, my neural net did nothing but talk a load of unintelligible gibberish about brazil nuts and tetrahedrite) they wouldn't even know what human values were, let alone incorporate them into whatever kind of decision algorithm they ended up having).

I'm really as much discussing how human mathematicians can trust mathematics as I am how AIs can trust mathematics - when we have all that Gödel and Löb and Tarski stuff flying around, some people are tempted to say "Oh, mathematics can't really prove things, therefore not-Platonism and not-infinities", which I think is a mistake.

Comment author: Zetetic 31 December 2011 04:10:45AM 3 points [-]

I might be misunderstanding you, but it looks like you're just describing fragments of infinitary logic, which has a pretty solid amount of research behind it. Barwise actually developed a model theory for it; you can find a (slightly old) overview here (in part C).

Infinitary logic admits precisely what you're talking about. For instance; it models sentences with universal and existential quantifiers in N using conjunctions and disjunctions (respectively) over the index ω.
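Concretely, in the infinitary logic L(ω₁,ω) a first-order quantifier over the standard naturals can be replaced by a countably infinite conjunction or disjunction indexed by ω:

```latex
% Quantifiers over the standard model N, expanded as infinitary
% connectives indexed by omega.
\forall x\, \varphi(x) \;\equiv\; \bigwedge_{n \in \omega} \varphi(n)
\qquad\qquad
\exists x\, \varphi(x) \;\equiv\; \bigvee_{n \in \omega} \varphi(n)
```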

As far as

when we have all that Gödel and Löb and Tarski stuff flying around, some people are tempted to say "Oh, mathematics can't really prove things, therefore not-Platonism and not-infinities", which I think is a mistake.

I don't think the results of Gödel, Löb and Tarski are necessary to conclude that Platonism is at least pretty questionable. I don't know where the "mathematics can't really prove things" bit is coming from - we can prove things in mathematics, and I've never really seen people claim otherwise. Are you implicitly attributing something like Absolute Truth to proofs?

Anyway, I've been thinking about a naturalistic account of anti-realism for a little while. I'm not convinced it's fruitful, but Platonism has come to seem totally incomprehensible to me. I can't see a nice way for it to mesh with what I know about brains and how we form concepts - and the accounts I read can't give any comprehensible account of what exactly the Platonic realm is supposed to be, nor how we access it. It looks like a nice-sounding fiction with a big black box called the Platonic realm that everything we can't really explain gets stuck into. Nor can I give any kind of real final account of my view of mathematics, because it would at least require some new neuroscience (which I mention in that link).

I will say that I don't think that mathematics has any special connection with The Real Truth, and I also think that it's a terrible mistake to think that this is a problem. Truth is evidence based, we figure out the truth by examining our environment. We have extremely strong evidence that math performs well as a framework for organizing information and for reasoning predictively and counterfactually - that it works extremely well in modelling the world around us. I definitely don't think that it's reasonable to think that math grants us license to epistemic certainty.

I wouldn't mind being convinced otherwise, but I'd need an epistemically sound account of Platonism first. I don't find Tegmark to be super convincing, though I do like that he restricts the MUH to computable structures - but I'm also not sure if that strictly qualifies as Platonism like you seem to be using it. At the very least it gives a moderately constructive account of exactly what is being claimed, which is something I would require from any Platonic theory.

Comment author: Nisan 30 January 2012 07:48:56PM 3 points [-]

... up until now I thought "barwise" was a technical term.

Comment author: Zetetic 30 January 2012 11:08:34PM 1 point [-]

Ha! Yeah, it seems that his name is pretty ubiquitous in mathematical logic, and he wrote or contributed to quite a number of publications. I had a professor for a sequence in mathematical logic who had Barwise as his thesis adviser. The professor obtained his doctoral degree from UW Madison when it still had Barwise, Kleene and Keisler so he would tell stories about some class he had with one or the other of them.

Barwise seems to have had quite a few interesting/powerful ideas. I've been wanting to read Vicious Circles for a while now, though I haven't gotten around to it.

Comment author: ec429 31 December 2011 06:05:05PM 0 points [-]

Hmm, infinitary logic looks interesting (I'll read right through it later, but I'm not entirely sure it covers what I'm trying to do). As for Platonism, mathematical realism, and Tegmark, before discussing these things I'd like to check whether you've read on my version of Tegmark-like ideas? I'd rather not repeat all that bit by bit in conversation.

Comment author: Zetetic 01 January 2012 02:25:05AM 0 points [-]

Have you looked into Univalent foundations at all? There was a post about it a while ago, and it seems as though it might be relevant to your pursuits.

I've read your post on Syntacticism and some of your replies to comments. I'm currently looking at the follow up piece (The Apparent Reality of Physics).

Comment author: ec429 28 December 2011 07:00:08PM 0 points [-]

Can you explain more formally what you mean by "proves that it itself exists"?

The fundamental principle of Syntacticism is that the derivations of a formal system are fully determined by the axioms and inference rules of that formal system. By proving that the ordinal kappa is a coherent concept, I prove that PA+kappa is too; thus the derivations of PA+kappa are fully determined and exist-in-Tegmark-space.

Actually it's not PA+kappa that's 'reflectively consistent'; it's an AI which uses PA+kappa as the basis of its trust in mathematics that's reflectively consistent, for no matter how many times it rewrites itself, nor how deeply iterated the metasyntax it uses to do the maths by which it decides how to rewrite itself, it retains just as much trust in the validity of mathematics as it did when it started. Attempting to achieve this more directly, by PA+self, runs into Löb's theorem.

Comment author: Kaj_Sotala 27 December 2011 10:03:26PM 2 points [-]

Thanks. I'm not sure if you actually got my point, though, which was that while I can roughly get the gist of "who doesn't use the 5-and-10 example illustration", it's an odd sentence and doesn't seem grammatical. ("Who doesn't actually use the term '5-and-10 example'", perhaps?)

Comment author: lukeprog 27 December 2011 10:06:01PM 3 points [-]

Okay. I changed it here.