Second-Order Logic: The Controversy

Eliezer Yudkowsky

Highly Advanced Epistemology 101 for Beginners

Second-Order Logic: The Controversy

by Eliezer Yudkowsky

19 min read4th Jan 2013192 comments

64

Gödelian LogicLogic & Mathematics

Personal Blog

Followup to: Godel's Completeness and Incompleteness Theorems

"So the question you asked me last time was, 'Why does anyone bother with first-order logic at all, if second-order logic is so much more powerful?'"

Right. If first-order logic can't talk about finiteness, or distinguish the size of the integers from the size of the reals, why even bother?

"The first thing to realize is that first-order theories can still have a lot of power. First-order arithmetic does narrow down the possible models by a lot, even if it doesn't narrow them down to a single model. You can prove things like the existence of an infinite number of primes, because every model of the first-order axioms has an infinite number of primes. First-order arithmetic is never going to prove anything that's wrong about the standard numbers. Anything that's true in all models of first-order arithmetic will also be true in the particular model we call the standard numbers."

Even so, if first-order theory is strictly weaker, why bother? Unless second-order logic is just as incomplete relative to third-order logic, which is weaker than fourth-order logic, which is weaker than omega-order logic -

"No, surprisingly enough - there's tricks for making second-order logic encode any proposition in third-order logic and so on. If there's a collection of third-order axioms that characterizes a model, there's a collection of second-order axioms that characterizes the same model. Once you make the jump to second-order logic, you're done - so far as anyone knows (so far as I know) there's nothing more powerful than second-order logic in terms of which models it can characterize."

Then if there's one spoon which can eat anything, why not just use the spoon?

"Well... this gets into complex issues. There are mathematicians who don't believe there is a spoon when it comes to second-order logic."

Like there are mathematicians who don't believe in infinity?

"Kind of. Look, suppose you couldn't use second-order logic - you belonged to a species that doesn't have second-order logic, or anything like it. Your species doesn't have any native mental intuition you could use to construct the notion of 'all properties'. And then suppose that, after somebody used first-order set theory to prove that first-order arithmetic had many possible models, you stood around shouting that you believed in only one model, what you called the standard model, but you couldn't explain what made this model different from any other model -"

Well... a lot of times, even in math, we make statements that genuinely mean something, but take a while to figure out how to define. I think somebody who talked about 'the numbers' would mean something even before second-order logic was invented.

"But here the hypothesis is that you belong to a species that can't invent second-order logic, or think in second-order logic, or anything like it."

Then I suppose you want me to draw the conclusion that this hypothetical alien is just standing there shouting about standardness, but its words don't mean anything because they have no way to pin down one model as opposed to another one. And I expect this species is also magically forbidden from talking about all possible subsets of a set?

"Yeah. They can't talk about the largest powerset, just like they can't talk about the smallest model of Peano arithmetic."

Then you could arguably deny that shouting about the 'standard' numbers would mean anything, to the members of this particular species. You might as well shout about the 'fleem' numbers, I guess.

"Right. Even if all the members of this species did have a built-in sense that there was a special model of first-order arithmetic that was fleemer than any other model, if that fleem-ness wasn't bound to anything else, it would be meaningless. Just a floating word. Or if all you could do was define fleemness as floobness and floobness as fleemness, you'd have a loop of floating words; and that might give you the impression that each particular word had a meaning, but the loop as a whole wouldn't be connected to reality. That's why it doesn't help to say that the standard model of numbers is the smallest among all possible models of Peano arithmetic, if you can't define 'smallest possible' any more than you can define 'connected chain' or 'finite number of predecessors'."

But second-order logic does seem to have consequences first-order logic doesn't. Like, what about all that Godelian stuff? Doesn't second-order arithmetic semantically imply... its own Godel statement? Because the unique model of second-order arithmetic doesn't contain any number encoding a proof of a contradiction from second-order arithmetic? Wait, now I'm confused.

"No, that's correct. It's not paradoxical, because there's no effective way of finding all the semantic implications of a collection of second-order axioms. There's no analogue of Godel's Completeness Theorem for second-order logic - no syntactic system for deriving all the semantic consequences. Second-order logic is sound, in the sense that anything syntactically provable from a set of premises, is true in any model obeying those premises. But second-order logic isn't complete; there are semantic consequences you can't derive. If you take second-order logic at face value, there's no effectively computable way of deriving all the consequences of what you say you 'believe'... which is a major reason some mathematicians are suspicious of second-order logic. What does it mean to believe something whose consequences you can't derive?"

But second-order logic clearly has some experiential consequences first-order logic doesn't. Suppose I build a Turing machine that looks for proofs of a contradiction from first-order Peano arithmetic. If PA's consistency isn't provable in PA, then by the Completeness Theorem there must exist nonstandard models of PA where this machine halts after finding a proof of a contradiction. So first-order logic doesn't tell me that this machine runs forever - maybe it has nonstandard halting times, i.e., it runs at all standard N, but halts at -2* somewhere along a disconnected chain. Only second-order logic tells me there's no proof of PA's inconsistency and therefore this machine runs forever. Only second-order logic tells me I should expect to see this machine keep running, and not expect - note falsifiability - that the machine ever halts.

"Sure, you just used a second-order theory to derive a consequence you didn't get from a first-order theory. But that's not the same as saying that you can only get that consequence using second-order logic. Maybe another first-order theory would give you the same prediction."

Like what?

"Well, for one thing, first-order set theory can prove that first-order arithmetic has a model. Zermelo-Fraenkel set theory can prove the existence of a set such that all the first-order Peano axioms are true about that set. It follows within ZF that sound reasoning on first-order arithmetic will never prove a contradiction. And since ZF can prove that the syntax of classical logic is sound -"

What does it mean for set theory to prove that logic is sound!?

"ZF can quote formulas as structured, and encode models as sets, and then represent a finite ZF-formula which says whether or not a set of quoted formulas is true about a model. ZF can then prove that no step of classical logic goes from premises that are true inside a set-model, to premises that are false inside a set-model. In other words, ZF can represent the idea 'formula X is semantically true in model Y' and 'these syntactic rules never produce semantically false statements from semantically true statements'."

Wait, then why can't ZF prove itself consistent? If ZF believes in all the axioms of ZF, and it believes that logic never produces false statements from true statements -

"Ah, but ZF can't prove that there exists any set which is a model of ZF, so it can't prove the ZF-axioms are consistent. ZF shouldn't be able to prove that some set is a model of ZF, because that's not true in all models. Many models of ZF don't contain any individual set well-populated enough for that one set to be a model of ZF all by itself."

I'm kind of surprised in a Godelian sense that ZF can contain sets as large as the universe of ZF. Doesn't any given set have to be smaller than the whole universe?

"Inside any particular model of ZF, every set within that model is smaller than that model. But not all models of ZF are the same size; in fact, models of ZF of every size exist, by the Lowenheim-Skolem theorem. So you can have some models of ZF - some universes in which all the elements collectively obey the ZF-relations - containing individual sets which are larger than other entire models of ZF. A set that large is called a Grothendieck universe and assuming it exists is equivalent to assuming the existence of 'strongly inaccessible cardinals', sizes so large that you provably can't prove inside set theory that anything that large exists."

Whoa.

(Pause.)

But...

"But?"

I agree you've shown that one experiential consequence of second-order arithmetic - namely that a machine looking for proofs of inconsistency from PA, won't be seen to halt - can be derived from first-order set theory. Can you get all the consequences of second-order arithmetic in some particular first-order theory?

"You can't get all the semantic consequences of second-order logic, taken at face value, in any theory or any computable reasoning. What about the halting problem? Taken at face value, it's a semantic consequence of second-order logic that any given Turing machine either halts or doesn't halt -"

Personally I find it rather intuitive to imagine that a Turing machine either halts or doesn't halt. I mean, if I'm walking up to a machine and watching it run, telling me that its halting or not-halting 'isn't entailed by my axioms' strikes me as not describing any actual experience I can have with the machine. Either I'll see it halt eventually, or I'll see it keep running forever.

"My point is that the statements we can derive from the syntax of current second-order logic is limited by that syntax. And by the halting problem, we shouldn't ever expect a computable syntax that gives us all the semantic consequences. There's no possible theory you can actually use to get a correct advance prediction about whether an arbitrary Turing machine halts."

Okay. I agree that no computable reasoning, on second-order logic or anything else, should be able to solve the halting problem. Unless time travel is possible, but even then, you shouldn't be able to solve the expanded halting problem for machines that use time travel.

"Right, so the syntax of second-order logic can't prove everything. And in fact, it turns out that, in terms of what you can prove syntactically using the standard syntax, second-order logic is identical to a many-sorted first-order logic."

Huh?

"Suppose you have a first-order logic - one that doesn't claim to be able to quantify over all possible predicates - which does allow the universe to contain two different sorts of things. Say, the logic uses lower-case letters for all type-1 objects and upper-case letters for all type-2 objects. Like, '∀x: x = x' is a statement over all type-1 objects, and '∀Y: Y = Y' is a statement over all type-2 objects. But aside from that, you use the same syntax and proof rules as before."

Okay...

"Now add an element-relation x∈Y, saying that x is an element of Y, and add some first-order axioms for making the type-2 objects behave like collections of type-1 objects, including axioms for making sure that most describable type-2 collections exist - i.e., the collection X containing just x is guaranteed to exist, and so on. What you can prove syntactically in this theory will be identical to what you can prove using the standard syntax of second-order logic - even though the theory doesn't claim that all possible collections of type-1s are type-2s, and the theory will have models where many 'possible' collections are missing from the type-2s."

Wait, now you're saying that second-order logic is no more powerful than first-order logic?

"I'm saying that the supposed power of second-order logic derives from interpreting it a particular way, and taking on faith that when you quantify over all properties, you're actually talking about all properties."

But then second-order arithmetic is no more powerful than first-order arithmetic in terms of what it can actually prove?

"2nd-order arithmetic is way more powerful than first-order arithmetic. But that's because first-order set theory is more powerful than arithmetic, and adding the syntax of second-order logic corresponds to adding axioms with set-theoretic properties. In terms of which consequences can be syntactically proven, second-order arithmetic is more powerful than first-order arithmetic, but weaker than first-order set theory. First-order set theory can prove the existence of a model of second-order arithmetic - ZF can prove there's a collection of numbers and sets of numbers which models a many-sorted logic with syntax corresponding to second-order logic - and so ZF can prove second-order arithmetic consistent."

But first-order logic, including first-order set theory, can't even talk about the standard numbers!

"Right, but first-order set theory can syntactically prove more statements about 'numbers' than second-order arithmetic can prove. And when you combine that with the semantic implications of second-order arithmetic not being computable, and with any second-order logic being syntactically identical to a many-sorted first-order logic, and first-order logic having neat properties like the Completeness Theorem... well, you can see why some mathematicians would want to give up entirely on this whole second-order business."

But if you deny second-order logic you can't even say the word 'finite'. You would have to believe the word 'finite' was a meaningless noise.

"You'd define finiteness relative to whatever first-order model you were working in. Like, a set might be 'finite' only on account of the model not containing any one-to-one mapping from that set onto a smaller subset of itself -"

But that set wouldn't actually be finite. There wouldn't actually be, like, only a billion objects in there. It's just that all the mappings which could prove the set was infinite would be mysteriously missing.

"According to some other model, maybe. But since there is no one true model -"

How is this not crazy talk along the lines of 'there is no one true reality'? Are you saying there's no really smallest set of numbers closed under succession, without all the extra infinite chains? Doesn't talking about how these theories have multiple possible models, imply that those possible models are logical thingies and one of them actually does contain the largest powerset and the smallest integers?

"The mathematicians who deny second-order logic would see that reasoning as invoking an implicit background universe of set theory. Everything you're saying makes sense relative to some particular model of set theory, which would contain possible models of Peano arithmetic as sets, and could look over those sets and pick out the smallest in that model. Similarly, that set theory could look over a proposed model for a many-sorted logic, and say whether there were any subsets within the larger universe which were missing from the many-sorted model. Basically, your brain is insisting that it lives inside some particular model of set theory. And then, from that standpoint, you could look over some other set theory and see how it was missing subsets that your theory had."

Argh! No, damn it, I live in the set theory that really does have all the subsets, with no mysteriously missing subsets or mysterious extra numbers, or denumerable collections of all possible reals that could like totally map onto the integers if the mapping that did it hadn't gone missing in the Australian outback -

"But everybody says that."

Okay...

"Yeah?"

Screw set theory. I live in the physical universe where when you run a Turing machine, and keep watching forever in the physical universe, you never experience a time where that Turing machine outputs a proof of the inconsistency of Peano Arithmetic. Furthermore, I live in a universe where space is actually composed of a real field and space is actually infinitely divisible and contains all the points between A and B, rather than space only containing a denumerable number of points whose existence can be proven from the first-order axiomatization of the real numbers. So to talk about physics - forget about mathematics - I've got to use second-order logic.

"Ah. You know, that particular response is not one I have seen in the previous literature."

Yes, well, I'm not a pure mathematician. When I ask whether I want an Artificial Intelligence to think in second-order logic or first-order logic, I wonder how that affects what the AI does in the actual physical universe. Here in the actual physical universe where times are followed by successor times, I strongly suspect that we should only expect to experience standard times, and not experience any nonstandard times. I think time is connected, and global connectedness is a property I can only talk about using second-order logic. I think that every particular time is finite, and yet time has no upper bound - that there are all finite times, but only finite times - and that's a property I can only characterize using second-order logic.

"But if you can't ever tell the difference between standard and nonstandard times? I mean, local properties of time can be described using first-order logic, and you can't directly see global properties like 'connectedness' -"

But I can tell the difference. There are only nonstandard times where a proof-checking machine, running forever, outputs a proof of inconsistency from the Peano axioms. So I don't expect to experience seeing a machine do that, since I expect to experience only standard times.

"Set theory can also prove PA consistent. If you use set theory to define time, you similarly won't expect to see a time where PA is proven inconsistent - those nonstandard integers don't exist in any model of ZF."

Why should I anticipate that my physical universe is restricted to having only the nonstandard times allowed by a more powerful set theory, instead of nonstandard times allowed by first-order arithmetic? If I then talk about a nonstandard time where a proof-enumerating machine proves ZF inconsistent, will you say that only nonstandard times allowed by some still more powerful theory can exist? I think it's clear that the way you're deciding which experimental outcomes you'll have to excuse, is by secretly assuming that only standard times exist regardless of which theory is required to narrow it down.

"Ah... hm. Doesn't physics say this universe is going to run out of negentropy before you can do an infinite amount of computation? Maybe there's only a bounded amount of time, like it stops before googolplex or something. That can be characterized by first-order theories."

We don't know that for certain, and I wouldn't want to build an AI that just assumed lifespans had to be finite, in case it was wrong. Besides, should I use a different logic than if I'd found myself in Conway's Game of Life, or something else really infinite? Wouldn't the same sort of creatures evolve in that universe, having the same sort of math debates?

"Perhaps no universe like that can exist; perhaps only finite universes can exist, because only finite universes can be uniquely characterized by first-order logic."

You just used the word 'finite'! Furthermore, taken at face value, our own universe doesn't look like it has a finite collection of entities related by first-order logical rules. Space and time both look like infinite collections of points - continuous collections, which is a second-order concept - and then to characterize the size of that infinity we'd need second-order logic. I mean, by the Lowenheim-Skolem theorem, there aren't just denumerable models of first-order axiomatizations of the reals, there's also unimaginably large cardinal infinities which obey the same premises, and that's a possibility straight out of H. P. Lovecraft. Especially if there are any things hiding in the invisible cracks of space."

"How could you tell if there were inaccessible cardinal quantities of points hidden inside a straight line? And anything that locally looks continuous each time you try to split it at a describable point, can be axiomatized by a first-order schema for continuity."

That brings up another point: Who'd really believe that the reason Peano arithmetic works on physical times, is because that whole infinite axiom schema of induction, containing for every Φ a separate rule saying...

(Φ(0) ∧ (∀x: Φ(x) → Φ(Sx))) → (∀n: Φ(n))

...was used to specify our universe? How improbable would it be for an infinitely long list of rules to be true, if there wasn't a unified reason for all of them? It seems much more likely that the real reason first-order PA works to describe time, is that all properties which are true at a starting time and true of the successor of any time where they're true, are true of all later times; and this generalization over properties makes induction hold for first-order formulas as a special case. If my native thought process is first-order logic, I wouldn't see the connection between each individual formula in the axiom schema - it would take separate evidence to convince me of each one - they would feel like independent mathematical facts. But after doing scientific induction over the fact that many properties true at zero, with succession preserving truth, seem to be true everywhere - after generalizing the simple, compact second-order theory of numbers and times - then you could invent an infinite first-order theory to approximate it.

"Maybe that just says you need to adjust whatever theory of scientific induction you're using, so that it can more easily induct infinite axiom schemas."

But why the heck would you need to induct infinite axiom schemas in the first place, if Reality natively ran on first-order logic? Isn't it far more likely that the way we ended up with these infinite lists of axioms was that Reality was manufactured - forgive the anthropomorphism - that Reality was manufactured using an underlying schema in which time is a connected series of events, and space is a continuous field, and these are properties which happen to require second-order logic for humans to describe? I mean, if you picked out first-order theories at random, what's the chance we'd end up inside an infinitely long axiom schema that just happened to look like the projection of a compact second-order theory? Aren't we ignoring a sort of hint?

"A hint to what?"

Well, I'm not that sure myself, at this depth of philosophy. But I would currently say that finding ourselves in a physical universe where times have successor times, and space looks continuous, seems like a possible hint that the Tegmark Level IV multiverse - or the way Reality was manufactured, or whatever - might look more like causal universes characterizable by compact second-order theories than causal universes characterizable by first-order theories.

"But since any second-order theory can just as easily be interpreted as a many-sorted first-order theory with quantifiers that can range over either elements or sets of elements, how could using second-order syntax actually improve an Artificial Intelligence's ability to handle a reality like that?"

Good question. One obvious answer is that the AI would be able to induct what you would call an infinite axiom schema, as a single axiom - a simple, finite hypothesis.

"There's all sorts of obvious hacks to scientific induction of first-order axioms which would let you assign nonzero probability to computable infinite sequences of axioms -"

Sure. So beyond that... I would currently guess that the basic assumption behind 'behaving as if' second-order logic is true, says that the AI should act as if only the 'actually smallest' numbers will ever appear in physics, relative to some 'true' mathematical universe that it thinks it lives in, but knows it can't fully characterize. Which is roughly what I'd say human mathematicians do when they take second-order logic at face value; they assume that there's some true mathematical universe in the background, and that second-order logic lets them talk about it.

"And what behaviorally, experimentally distinguishes the hypothesis, 'I live in the true ultimate math with fully populated powersets' from the hypothesis, 'There's some random model of first-order set-theory axioms I happen to be living in'?"

Well... one behavioral consequence is suspecting that your time obeys an infinitely long list of first-order axioms with induction schemas for every formula. And then moreover believing that you'll never experience a time when a proof-checking machine outputs a proof that Zermelo-Fraenkel set theory is inconsistent - even though there's provably some models with times like that, which fit the axiom schema you just inducted. That sounds like secretly believing that there's a background 'true' set of numbers that you think characterizes physical time, and that it's the smallest such set. Another suspicious behavior is that as soon as you suspect Zermelo-Fraenkel set theory is consistent, you suddenly expect not to experience any physical time which ZF proves isn't a standard number. You don't think you're in the nonstandard time of some weaker theory like Peano arithmetic. You think you're in the minimal time expressible by any and all theories, so as soon as ZF can prove some number doesn't exist in the minimal set, you think that 'real time' lacks such a number. All of these sound like behaviors you'd carry out if you thought there was a single 'true' mathematical universe that provided the best model for describing all physical phenomena, like time and space, which you encounter - and believing that this 'true' backdrop used the largest powersets and smallest numbers.

"How exactly do you formalize all that reasoning, there? I mean, how would you actually make an AI reason that way?"

Er... I'm still working on that part.

"That makes your theory a bit hard to criticize, don't you think? Personally, I wouldn't be surprised if any such formalized reasoning looked just like believing that you live inside a first-order set theory."

I suppose I wouldn't be too surprised either - it's hard to argue with the results on many-sorted logics. But if comprehending the physical universe is best done by assuming that real phenomena are characterized by a 'true' mathematics containing the powersets and the natural numbers - and thus expecting that no mathematical model we can formulate will ever contain any larger powersets or smaller numbers than those of the 'true' backdrop to physics - then I'd call that a moral victory for second-order logic. In first-order logic we aren't even supposed to be able to talk about such things.

"Really? To me that sounds like believing you live inside a model of first-order set theory, and believing that all models of any theories you can invent must also be sets in the larger model. You can prove the Completeness Theorem inside ZF plus the Axiom of Choice, so ZFC already proves that all consistent theories have models which are sets, although of course it can't prove that ZFC itself is such a theory. So - anthropomorphically speaking - no model of ZFC expects to encounter a theory that has fewer numbers or larger powersets than itself. No model of ZFC expects to encounter any quoted-model, any set that a quoted theory entails, which contains larger powersets than the ones in its own Powerset Axiom. A first-order set theory can even make the leap from the finite statement, 'P is true of all my subsets of X', to believing, 'P is true of all my subsets of X that can be described by this denumerable collection of formulas' - it can encompass the jump from a single axiom over 'all my subsets', to a quoted axiom schema over formulas. I'd sum all that up by saying, 'second-order logic is how first-order set theory feels from the inside'."

Maybe. Even in the event that neither human nor superintelligent cognition will ever be able to 'properly talk about' unbounded finite times, global connectedness, particular infinite cardinalities, or true spatial continuity, it doesn't follow that Reality is similarly limited in which physics it can privilege.

Part of the sequence Highly Advanced Epistemology 101 for Beginners

Previous post: "Godel's Completeness and Incompleteness Theorems"

Gödelian LogicLogic & Mathematics

Personal Blog

64

Godel's Completeness and Incompleteness Theorems

87 comments80 karma

Mixed Reference: The Great Reductionist Project

358 comments61 karma

Mentioned in

80Godel's Completeness and Incompleteness Theorems

15Re: Second-Order Logic: The Controversy

7A basis for pattern-matching in logical uncertainty

Second-Order Logic: The Controversy

2Good_Burning_Plastic

New Comment

192 comments, sorted by

top scoring

Click to highlight new comments since: Today at 10:20 AM

Some comments are truncated due to high volume. (⌘F to expand all)Change truncation settings

[-]Nick_Tarleton11y310

I kept expecting someone to object that "this Turing machine never halts" doesn't count as a prediction, since you can never have observed it to run forever.

[-]cousin_it11y170

Then the statement "this Turing machine halts for every input" doesn't count as a prediction either, because you can never have observed it for every input, even if the machine is just "halt". And the statement "this Turing machine eventually halts" is borderline, because its negation doesn't count as a prediction. What does this give us?

1Johnicholas11y

Perhaps there is a type theory for predictions, with concrete predictions like "The bus will come at 3 o'clock", and functions that output concrete predictions like "Every monday, wednesday and friday, the bus will come at 3 o'clock" (consider the statement as a function taking a time and returning a concrete prediction yes or no). An ultrafinitist would probably not argue with the existence of such a function, even though to someone other than an ultrafinitist, the function looks like it is quantifying over all time. From the ultrafinitist's point of view, you're going to apply it to some concrete time, and at that point it's going to output a some concrete prediction.

1cousin_it11y

Ultrafinitism doesn't seem relevant here. The "type theory" you're thinking of seems to be just the arithmetical hierarchy. A "concrete prediction" is a statement with only bounded quantifiers, "this Turing machine eventually halts" is a statement with one existential quantifier, "this Turing machine halts for every input" is a statement with one universal and one existential quantifier, and so on. Or am I missing something?

0Johnicholas11y

The arithmetical hierarchy is presuming a background of predicate logic; I was not presuming that. Yes, the type theory that I was gesturing towards would have some similarity to the arithmetical hierarchy. I was trying to suggest that the answer to "what is a prediction" might look like a type theory of different variants of a prediction. Perhaps a linear hierarchy like the arithmetical hierarchy, yes, perhaps something more complicated. There could be a single starting type "concrete prediction" and a type constructor that, given source type and a destination type, gives the type of statements defining functions that take arguments of the source type and give arguments of the destination type. The intuitive bridge for me from ultrafinitism to the question "what counts as a prediction?" is that ultrafinitists do happily work with entities like 2^1000 considered as a structure like ("^" "2" "1000"), even if they deny that those structures refer to things that are definitely numbers (of course, they can agree that they are definitely "numbers", given an appropriately finitist definition of "number"). Maybe extreme teetotalers regarding what counts as a prediction would happily work with things such as computable functions returning predictions, even if they don't consider them predictions.

6abramdemski11y

Actually, I think "This Turing machine halts" is the more obviously troublesome one. It gives no computable expectation of when we might observe a halt. (Any computable probability distribution we would assign to halting time would put too much mass at small times as compared to the true distribution of halting times.)

5earthwormchuck16311y

If you take this objection seriously, then you should also take issue with predictions like "nobody will ever transmit information faster than the speed of light", or things like it. After all, you can never actually observe the laws of physics to have been stable and universal for all time. If nothing else, you can consider each as being a compact specification of an infinite sequence of testable predictions: "doesn't halt after one step", "doesn't halt after two steps",... "doesn't halt after n steps".

0[anonymous]9y

Technically speaking, you can observe the loop encoded in the Turing machine's code somewhere -- every nonhalting Turing machine has some kind of loop. The Halting theorems say that you cannot write down any finite program which will recognize and identify any infinite loop, or deductively prove the absence thereof, in bounded time. However, human beings don't have finite programs, and don't work by deduction, so I suspect, with a sketch of mathematical grounding, that these problems simply don't apply to us in the same way they apply to regular Turing machines. EDIT: To clarify, human minds aren't "magic" or anything: the analogy between us and regular Turing machines with finite input and program tape just isn't accurate. We're a lot closer to inductive Turing machines or generalized Turing machines. We exhibit nonhalting behavior by design and have more-or-less infinite input tapes.

4Vaniver9y

I think I would use "every" where you use "any."

1[anonymous]9y

Let's just use quantifiers. ~exist P, forall M i, P (M, i) \/ ~P (M, i). Where P is a finite program running in finite time, M is a Turing machine, and i is an input string.

0[anonymous]9y

I don't understand your notation. You have a disjunction of a formula and its negation, which is tautologically true, hence the whole statement is tautologically false. P is a program; what meaning is intended by the predicate P(M,i)? In fact, what work is the object P doing in that formula? A Turing machine M is given an input string i; there is no other "program" involved.

2orthonormal9y

Formal proof needed; I in fact expect there to be something analogous to a Penrose tiling. Moreover, to adapt Keynes' apocryphal quote, a Turing machine can defer its loop for longer than you can ponder it. And finally, as a general note, if you find that your proof that human beings can solve the halting problem can't be made formal and concise, you might consider the possibility that your intuition is just plain wrong. It is in fact relevant that theoretical computer scientists seem to agree that the halting problem is not solvable by physical processes in the universe, including human beings.

0[anonymous]9y

I didn't say that it can't be made formal. I said that the formalism isn't concise enough for one blog comment. Most normal journal/conference papers are about ten pages long, so that's the standard of concision you should use for a claimed formalism in theoretical CS. And indeed, I think I can fit my formalism, when it's done, into ten pages.

1Donald Hobson6y

If anyone wants a go, here's a Turing machine to try on. Does it halt? .i=4.a=False.while not a:. a=True. for j in range(1,i):. k=i-j. b=False. for p in range(2,j):. b=b or j%p==0. for p in range(2,k):. b=b or k%p==0. a=a and b Written in python for the convenience of coders.

1JoshuaZ9y

No. These aren't "loops" in any meaningful sense of the word. Failing to halt and going into an infinite loop are not the same thing.

0[anonymous]9y

I'd appreciate some explanation on that, to see if you're saying something I haven't heard before or if we're talking past each-other. I don't just include while(true); under "infinite loop", I also include infinitely-expanding recursions that cannot be characterized as coinductive stepwise computations. Basically, anything that would evaluate to the \Bot type in type-theory counts for "infinite loop" here, just plain computational divergence.

1Quill_McGee9y

I think that what Joshua was talking about by 'infinite loop' is 'passing through the same state an infinite number of times.' That is, a /loop/, rather than just a line with no endpoint. although this would rule out (some arbitrary-size int type) x = 0; while(true){ x++; } on a machine with infinite memory, as it would never pass through the same state twice. So maybe I'm still misunderstanding.

1JoshuaZ9y

Ok. By that definition, yes these are the same thing. I don't think this is a standard notion of what people mean by infinite loop though.

1g_pepper9y

Actually, the standard Turing machine has an infinite tape. And, the role of the tape in a Turing machine is equivalent to the input, output and working memory of a computer; the Turing machine's "program" is the finite state machine. It seems to me that the above suggests that humans are capable of cognitive tasks that Turing machines are not. But if this is true, it follows that human-level AGI is not achievable via a computer, since computers are thought to be equivalent in power to Turing machines.

0[anonymous]9y

First of all, swapped "induction" to "deduction", because I'm a moron. There's no real semantic distinction between the original contents of the One Tape, or the finite contents of the Input Tape, or an arbitrarily complicated state-machine program, actually. You can build tape data for a Universal Turing Machine to simulate any other Turing machine. No. You're just making the analogy between Turing machines and real, physical computing devices overly strict. A simple Turing machine takes a finite input, has a finite program, and either processes for finite time before accepting (possibly with something on an output tape or something like that), or runs forever. Real, physical computing devices, both biological and silicon, run coinductive (infinite-loop-until-it-isn't) programs all the time. Every operating system kernel, or message loop, or game loop is a coinductive program: its chief job is to iterate forever, taking a finite time to process I/O in each step. Each step performs some definite amount of semantic work, but there is an indefinite number of steps (generally: until a special "STOP NOW" input is given). To reiterate: because these programs both loop infinitely and perform I/O (with the I/O data not being computable from anything the program has when it begins running, not being "predictable" in any sense by the program), they are physically-realizable programs that simply don't match the neat analogy of normal Turing machines. Likewise, human beings loop infinitely and perform I/O. Which more-or-less gives away half my mathematical sketch of how we solve the problem!

2Richard_Kennaway9y

They match the neat analogy of Turing machines that start with possibly infinitely many non-blank squares on their tapes. All the obvious things you might like to know are still undecidable. Is this machine guaranteed to eventually read every item of its input? Is there any input that will drive the machine into a certain one of its states? Will the machine ever write a given finite string? Will it eventually have written every finite prefix of a given infinite string? Will it ever halt? These are all straightforwardly reducible to the standard halting problem. Or perhaps you would add to the model an environment that responds to what the machine does. In that case, represent the environment by an oracle (not necessarily a computable one), and define some turn-taking mechanism for the machine to ask questions and receive answers. All of the interesting questions remain undecidable.

0g_pepper9y

I agree with this: and with this: However, I don't see how you are going to be able to use these facts to solve the halting problem. I'm guessing that, rather than statically examining the Turing machine, you will execute it for some period of time and study the execution, and after some finite amount of time, you expect to be able to correctly state whether or not the machine will halt. Is that the basic idea?

0[anonymous]9y

Basically, yes. "Halting Empiricism", you could call it. The issue is precisely that you can't do empirical reasoning via deduction from a fixed axiom set (ie: a fixed, finite program halts x : program -> boolean). You need to do it by inductive reasoning, instead.

2Richard_Kennaway9y

What is inductive reasoning? Bayesian updating? Or something else?

0[anonymous]9y

My reply to you is the same as my reply to g_pepper: it's easier for me to just do my background research, double-check everything, and eventually publish the full formalism than it is to explain all the details in a blog comment. You are also correct to note that whatever combination of machine, person, input tape, and empirical data I provide, the X Machine can never solve the Halting Problem for the X Machine. The real math involved in my thinking here involves demonstrating the existence of an ordering: there should exist a sense in which some machines are "smaller" than others, and A can solve B's halting problem when A is "strictly larger" than B, possibly strictly larger by some necessary amount. (Of course, this already holds if A has an nth-level Turing Oracle, B has an mth-level Turing Oracle, and n > m, but that's trivial from the definition of an oracle. I'm thinking of something that actually concerns physically realizable machines.) Like I said: trying to go into extensive detail via blog comment will do nothing but confusingly unpack my intuitions about the problem, increasing net confusion. The proper thing to do is formalize, and that's going to take a bunch of time.

0g_pepper9y

This is interesting. Do you have an outline of how a person would go about determining whether a Turing machine will halt? If so, I would be interested in seeing it. Alternatively, if you have a more detailed argument as to why a person will be able to determine whether an arbitrary Turing machine will halt, even if that argument does not contain the details of how the person would proceed, I would be interested in seeing that. Or, are you just making the argument that an intelligent person ought to be able in every case to use some combination of inductive reasoning, creativity, mathematical intuition, etc., to correctly determine whether or not a Turing machine will halt?

1[anonymous]9y

My reply to you is the same as my reply to Richard Kennaway: it's easier for me to just do my background research, double-check everything, and eventually publish the full formalism than it is to explain all the details in a blog comment.

0g_pepper9y

Fair enough. I am looking forward to seeing the formalism!

0jessicat9y

It's possible to compute whether each machine halts using an inductive Turing machine like this: initialize output tape to all zeros, representing the assertion that no Turing machine halts for i = 1 to infinity . for j = 1 to i . . run Turing machine j for i steps . . if it halts: set bit j in the output tape to 1 Is this what you meant? If so, I'm not sure what this has to do with observing loops. When you say that every nonhalting Turing machine has some kind of loop, do you mean the kind of loop that many halting Turing machines also contain?

0[anonymous]9y

No, I mean precisely the fact that it doesn't halt. You can think of it as an infinitely growing recursion if that helps, but what's in question is really precisely the nonhalting behavior. I'm going to make a Discussion post about the matter, since Luke wants me to share the whole informal shpiel on the subject.

-1Donald Hobson6y

It is possible to make a Turing machine that returns "HALT", "RUNS FOREVER" or "UNSURE" and is never wrong. If the universe as we know it runs on quantum mechanics, it should be possible to simulate a mathematician in a box with unlimited data storage and time, using a finite Turing machine. This would imply the existence of some problem that would leave the mathematician stuck forever. There are plenty of Turing machines where we have no Idea if they halt. If you describe the quantum state of the observable universe to plank length resolution, what part of humans supposedly infinite minds is not described in these 2^2^100 bits of information?

0[anonymous]11y

"This Turing machine won't halt in 3^^^3 steps" is a falsifiable prediction. Replace 3^^^3 with whatever number is enough to guarantee whatever result you need. Edit: But you're right.

8Qiaochu_Yuan11y

Really? First, 3^^^3 is an absurdly large number of steps, much more steps than you could feasibly go through before the heat death of the universe or whatever horrible fate awaits us. Second, after enough steps your confidence in the outcome of any computation can be made arbitrarily low because of possible errors, e.g. errors caused by cosmic rays.

2[anonymous]11y

I already saw your comments; I upvoted you. If this objection was made in your thread, I would have said, "Assuming ultrafinitism is wrong..." It seemed to me that ultrafinitism wasn't his true rejection. Edit: But you're right.

5Qiaochu_Yuan11y

Okay, non-ultrafinitist objection: But how do you know what number that is? The function "the number of steps needed to guarantee that a Turing machine won't halt" is not a computable function of Turing machines, because if it were you could solve the halting problem. So how do you compute a function that's, y'know, not computable?

0[anonymous]11y

First of all, I shouldn't have used the number 3^^^3. That was a stupid idea, and I've forgotten what the notation means. So just replace that with "generic really big number", let's just say a googolplex (and, yes, I'm aware that's nowhere close to 3^^^3). To answer your question, I doubt FAI is going to care about whether Turing machine X halts; it's going to be looking for a proof of a certain result regarding its own algorithms. It's possible that it only needs to prove that its algorithms will work for "generic really big number" amount of steps, and thus could use a bounded quantifier instead of a universal quantifier. Using bounded quantifiers makes your system much weaker, but also moves it closer to being decidable (I mean this in a metaphorical sense; the system still won't be decidable). Also, regarding searching for proofs, we'll have to set an upper bound for how long we search, as well. So we need to specify some specific number in that context, too. Specifically which numbers to select is something I have no idea about; I'm not an AI researcher. Maybe I'll have some kind of idea (but very probably not) when a more formal problem specification is given, but this article is still discussing theoreticals, not applications yet.

7Solvent11y

Are you aware of the busy beaver function? Read this. Basically, it's impossible to write down numbers large enough for that to work.

0[anonymous]11y

Interesting article! What do you mean, though? Are you saying that Knuth's triple up-arrow is uncomputable (I don't see why that would be the case, but I could be wrong.)?

3Qiaochu_Yuan11y

No, Solvent is making the same point as in my answer: in order to write down large enough numbers to guarantee that Turing machines don't halt, you need to write down a function larger than the busy beaver function, and any such function fails to be computable because you can use it to solve the halting problem.

0[anonymous]11y

Ah, I see now, thanks!

0Solvent11y

Basically, the busy beaver function tells us the maximum number of steps that a Turing machine with a given number of states and symbols can run for. If we know the busy beaver of, for example, 5 states and 5 symbols, then we can tell you if any 5 state 5 symbol Turing machine will eventually halt. However, you can see why it's impossible to in general find the busy beaver function- you'd have to know which Turing machines of a given size halted, which is in general impossible.

[-]Polymath11y290

There is a lot of psuedo-arguing going on here. I'm a mathematician who specializes in logic and foundations, I think I can settle most of the controversy with the following summary:

1) Second-order logic is a perfectly acceptable way to formalize mathematics-in-general, stronger than Peano Arithmetic but somewhat weaker than ZFC (comparable in proof-theoretic strength to "Type Theory" of "Maclane Set Theory" or "Bounded Zermelo Set Theory", which means sufficient for almost all mainstream mathematical research, though inadequate for results which depend on Frankel's Replacement Axiom).

2) First-order logic has better syntactic properties than Second-order logic, in particular it satisfies Godel's Completeness Theorem which guarantees that the syntax is adequate for the semantics. Second-order logic has better semantic properties than first-order logic, in particular avoiding nonstandard models of the integers and of set-theoretical universes V_k for various ordinals k.

3) The critics of second-order logic say that they don't understand the concept of "finite integer" and "all subsets" and that the people using second-order logic are m... (read more)

[-]Eliezer Yudkowsky11y160

(Since the average LW reader may not know whether to trust that this commenter is a mathematician specializing in logic and foundations, I remark that the above summary sounds much like the papers I've read on second-order logic. Though 'pseudo-arguing' is an odd way to describe the ancient expositional tradition of dialogue.)

6Wei Dai11y

If sets with cardinalities equal to these higher infinities exist as mathematical entities, then some of them will be logically dependent on us, on what we do. For example those sets may contain intelligent beings who are simulating our universe and taking different actions based on what they observe. I don't know if this is actually the case, or if it is, how much we ought to care, but it would be nice if an FAI could consider and try to answer questions like this, and not be arbitrarily limited in its reasoning by the choice of mathematical language we make when building it.

-1[anonymous]11y

Best mathematical comment on LessWrong in four years. I applaud you, sir or madam.

[-]Qiaochu_Yuan11y250

One of the participants in this dialogue seems too concerned with pinning down models uniquely and also seems too convinced he knows what model he's in. Suppose we live in a simulation which is being run by superbeings who have access to oracles that can tell them when Turing machines are attempting to find contradictions in PA. Whenever they detect that something in the simulation is attempting to find contradictions in PA, that part of the simulation mysteriously stops working after the billionth or trillionth step or something. Then running such Turing machines can't tell us whether we live in a universe where PA is consistent or not.

I also wish both participants in the dialogue would take ultrafinitism more seriously. It is not as wacky as it sounds, and it seems like a good idea to be conservative about such things when designing AI.

Edit: Here is an ultrafinitist fable that might be useful or at least amusing, from the link.

I have seen some ultrafinitists go so far as to challenge the existence of 2^100 as a natural number, in the sense of there being a series of 'points' of that length. There is the obvious 'draw the line' objection, asking where in 2^1, 2^2, 2^3, … , 2^1

... (read more)

5Academian11y

For what it's worth, I'm an ultrafinitist. Since 2005, at least as far as I've been able to tell.

[-]Eliezer Yudkowsky11y290

How long do you expect to stay an ultrafinitist?

[-]Academian11y140

Until I'm destroyed, of course!

... but since Qiaochu asked that we take ultrafinitism seriously, I'll give a serious answer: something else will probably replace ultrafinitism as my preferred (maximum a posteriori) view of math and the world within 20 years or so. That is, I expect to determine that the question of whether ultrafinitism is true is not quite the right question to be asking, and have a better question by then, with a different best guess at the answer... just because similar changes of perspective have happened to me several times already in my life.

[-]Kawoomba11y130

Is that because 2005 is as far from the present time as you dare to go?

2satt11y

What precisely do the Overflowers (and the mathematician reporting that anecdote) mean by the "existence" of a number? For instance, I see that the anecdote-reporter refers to there being a series of points of a particular length, but I assume they don't mean that in an intuitive, literal sense: there are certainly at least 2^100 Planck lengths between me and the other end of the room.

9Qiaochu_Yuan11y

I am not sure. If I tabooed "exist," then my best guess is that ultrafinitists would argue that statements involving really big numbers are not meaningful. For example, they might argue that such statements are not verifiable in the real world. (Edit: as another example, as I mentioned in another comment, ultrafinitists might argue that you cannot count to really big numbers.) Yes, but just barely: 2^100 Planck lengths is something like 2 x 10^{-5} meters, so substitute 2^1000 Planck lengths, which is substantially larger than the diameter of the universe.

1khafra11y

Seems weird to think that some of the possible configurations of bits on my 1.5TB hard drive don't exist. Which ones? I hope none of the really good collections of pr0n are logically unreachable. If that number does exist, then what about really big busy beaver numbers, like bb(2^10^13 )? They're just a series of computations on hard drive contents. And that number is so close to infinity that we might as well just step from ultrafinitism to plain old finitism.

2ygert11y

While I am not an ultrafinitist, I believe the idea is meant to be this: It is not valid to talk about those numbers, because there is no meaningful thing you can do with those numbers that can affect the real world. Therefore, the ultrafinitists say that it is not really logical to treat these numbers as "existing" as they can not affect the real world at all, and why say that something exists if it cannot affect anything at all?

0abramdemski11y

This seems incredibly likely, doesn't it? (As long as we are happy to bound 'logically reachable' to within the observable universe.)

0Qiaochu_Yuan11y

Would you like to go through all of them just to be sure? How long do you think that will take you? Trying to actually compute a sufficiently large busy beaver number, you'll run into the problem that there won't be enough material in the observable universe to construct the corresponding Turing machines and/or that there won't be enough usable energy to power them for the required lengths of time and/or that the heat death of the universe will occur before the required lengths of time. If there's no physical way to go through the relevant computations, there's no physical sense in which the relevant computations output a result.

4[anonymous]11y

It may not be possible to check all of them, but it certainly is possible to check one of them...any one of them. And whichever one you choose to check, you'll find that it exists. So if you claim that some of the possible configurations don't exist, you're claiming they'd have to be among the ones you don't choose to check. But wait, this implies that your choice of which one(s) to check somehow affects which ones exist. It sure would be spooky if that somehow turns out to be the case, which I doubt.

0khafra11y

Exactly. And I could make my choice of which pr0n library to check--or which 1.5TB turing machine to run--dependent on 10^13 quantum coinflips; which, while it would take a while, seems physically realizable.

1Academian11y

Help me out here... I can imagine living a simulation... I just don't understand yet what you mean by living in a model in the sense of logic and model theory, because a model is a static thing. I heard someone once before talk about "what are we in?", as though the physical universe were a model, in the sense of model theory. He wasn't able to operationalize what he meant by it, though. So, what do you mean when you say this? Are you considering the physical universe a first-order structure) somehow? If so, how? And concerning its role as a model, what formal system are you considering it a model of?

1Qiaochu_Yuan11y

That was imprecise, but I was trying to comment on this part of the dialogue using the language that it had established: I was also commenting on this part: The point I was trying to make, and maybe I did not use sensible words to make it, is that This Guy (I don't know what his name is - who writes a dialogue with unnamed participants, by the way?) doesn't actually know that, for two reasons: first, Peano arithmetic might actually be inconsistent, and second, even if it were consistent, there might be some mysterious force preventing us from discovering this fact. Models being static is a matter of interpretation. It is easy to write down a first-order theory of discrete dynamical systems (sets equipped with an endomap, interpreted as a successor map which describes the state of a dynamical system at time t + 1 given its state at time t). If time is discretized, our own universe could be such a thing, and even if it isn't, cellular automata are such things. Are these "static" or "dynamic"?

4abramdemski11y

Indeed, I think it's somewhat unclear what is meant here. The speaker attempts to relate it to physics, referring to the idea that we appear to live in continuous space... but how does the speaker propose to rule out infinitesimals and other nonstandard entities? (The speaker only seems to indicate horror about devils living in the cracks.) Or, for that matter, countable models of the reals, as someone already mentioned. This isn't directly related to the question of what set theory is true, what set theory we live in, etc... (Perhaps the speaker's intention in this line was to assume that we live in a Tegmark multiverse, so that we literally do live in some set theory?) Instead, I think the speaker should have argued that we can refer to this state of affairs, not that it must be the true state of affairs. To give another example, I'm not at all convinced that time must correspond to the standard model of the natural numbers (I'm not even sure it doesn't loop back upon itself eventually, when it comes down to it, though I agree that causal models disallow this and I find it improbable for that reason). Yet, I'm (relatively) happy to say that we can at least refer to this as a possible state of affairs. (Perhaps with a qualifier: "Unless peano arithmetic turns out to be inconsistent...")

2Academian11y

Ah, I was asking you because I thought using that language meant you'd made sense of it ;) The language of us "living in a (model of) set theory" is something I've heard before (not just from you and Eliezer), which made me think I was missing something. Us living in a dynamical system makes sense, and a dynamical system can contain a model of set theory, so at least we can "live with" models of set theory... we interact with (parts of) models of set theory when we play with collections of physical objects. Of course, time has been a fourth dimension for ages ;) My point is that set theory doesn't seem to have a reasonable dynamical interpretation that we could live in, and I think I've concluded it's confusing to talk like that. I can only make sense of "living with" or "believing in" models.

7Eliezer Yudkowsky11y

Set theory doesn't have a dynamical interpretation because it's not causal, but finite causal systems have first-order descriptions and infinite causal systems have second-order descriptions. Not everything logical is causal; everything causal is logical.

1[anonymous]11y

I like your proposal, but why not just standard finitism? What is your objection to primitive recursive arithmetic? As a side note, I personally consider a theory "safe" if it fits with my intuitions and can be proved consistent using primitive recursive arithmetic and transfinite induction up to an ordinal I can picture. So I think of Peano arithmetic as safe, since I can picture epsilon-zero.

1Qiaochu_Yuan11y

This isn't my objection personally, but a sufficiently ultra finitist rejects the principle of induction.

2Eliezer Yudkowsky11y

I think it's more that an ultrafinitist claims not to know that successor is a total function - you could still induct for as long as succession lasts. Though this is me guessing, not something I've read.

0[anonymous]11y

Either that, or they claim that certain numbers, such as 3^^^3, cannot be reached by any number of iterations of the successor function starting from 0. It's disprovable, but without induction or cut it's too long a proof for any computer in the universe to do in a finite human lifespan, or even the earth's lifespan.

0[anonymous]11y

Alternatively, one could start by asking whether 2^50 is a real number or not, if yes go up to 2^75, if no go to 2^25, and in up to 7 steps find a real number that, when doubled, ceases to be a real number. There may be impractical or even noncomputable numbers, but continuity holds that doubling a real number always yields a real number.

[-]Qiaochu_Yuan11y100

I think the point of the fable is that Yesenin-Volpin was counting to each number in his head before declaring whether it was 'real' or not, so if you asked him whether 2^50 was 'real' he'd just be quiet for a really really long time.

2[anonymous]11y

But wouldn't that disprove ultrafinitism? All finite numbers, even 3^^^3, can be counted to (in the absence of any time limit, such as a lifespan), there's just no human who really wants to.

[-]Qiaochu_Yuan11y180

All finite numbers, even 3^^^3, can be counted to

As I understand it, this is precisely the kind of statement that ultrafinitists do not believe.

0MrMind11y

If that's true, then Yesenin-Volpin was just playin the role of a Turing machine trying to determine if a certain program halts (the program that counts from 0 to the input). If 3^^^3 (say) for a finitist doesn't exists, then s/he really has a different concept of number than you have (for example, they are not axiomatized by the Peano Arithmetic). It's fun to observe that ultrafinitism is axiomatic: if it's a coherent point of view, it cannot prove that a certain number doesn't exists, only assume it. I also suspect (but finite model theory is not my field at all) that they have an 'inner' model that mimics standard natural numbers...

9Larks11y

Well, that's what the anti-ultrafinitists say. It is precisely the contention of the ultrafinitists that you couldn't "count to 3^^^3", whatever that might mean.

0SecondWind11y

Hmm. So, it's not sufficient to define a set of steps that determine a number... it must be possible to execute them? That's a rather pragmatic approach. Albeit it one you'd have to keep updating if our power to compute and comprehend lengthier series of steps grows faster than you predict.

1Larks11y

No, ultrafinitism is not a doctrine about our practical counting capacities. Ultrafinitism holds that you may not have actually denoted a number by '3^^^3', because there is no such number.

4Peterdjones11y

Utlrafrinitists tend no to specfify the highest number, to prevent people adding one to it.

2Larks11y

Hence "may not"

0wuncidunci11y

I would have done the following if I had been asked that: calculate which numbers I would have time to count up to before I was thrown out/got bored/died/earth ended/universe ran out of negentropy. I would probably have to answer I don't know, or I think X is a number for some of them, but it's still an answer, and until recently people could not say wether "the smallest n>2 such that there are integers a,b,c satisfying a^n + b^n = c^n" was a number or not. I'm not advocating any kind of finitism, but I agree that the position should be taken seriously.

[-]shminux11y230

I confess that I lost track of the reasoning about which-order-logic-can-do-what-and-why somewhere in the last post or so. I'm also not clear how and why it is important in understanding this "Highly Advanced Epistemology 101 for Beginners". I'm wondering whether (or how) the ability to "properly talk about unbounded finite times, global connectedness, particular infinite cardinalities, or true spatial continuity" is essential or even important for (F)AI research.

8[anonymous]11y

Me too. It's the buildup to the "open problems in FAI". Large parts of the internals of an AI look like systems for reasoning in rigorous ways about math, models, etc.

0[anonymous]11y

If that were the reasoning, it'd be nice if he came out and explained why he believes that to be the case. Becuase just about any A(G)I researcher would take issue with that statement...

0ESRogs11y

Maybe we need a handy summary table of the which's, what's, and why's...

5hairyfigment11y

I assume we need to ask some of these questions in order to decide if, or in what sense, an AGI needs second-order logic.

[-]Johnicholas11y140

This and previous articles in this series emphasize attaching meaning to sequences of symbols via discussion and gesturing toward models. That strategy doesn't seem compatible with your article regarding sheep and pebbles. Isn't there a way to connect sequences of symbols to sheep (and food and similar real-world consequences) directly via a discipline of "symbolcraft"?

I thought pebblecraft was an excellent description of how finitists and formalists think about confusing concepts like uncountability or actual infinity: Writing down "... is uncountably infinite ..." is part of our pebblecraft, and we have some confidence that when we get to the bottom line of the formal argument, we'll be able to interpret the results "in the usual way". The fact that someone (perhaps an alien) might watch our pebblecraft and construe us (in medias res) to be referring to (abstract, mathematical) things other than the (abstract, mathematical) things that we usually construe ourselves to be referring to doesn't stop the pebblecraft magic from working.

2abramdemski11y

I think I agree. This gives me the feeling that every time the discussion revolves around models, it is getting off the point. We can touch proof systems. We can't touch models. We can say that models are part of our pebblecraft... The question, though, is whether we should trust particular parts of our pebblecraft. Should we prefer to work with first-order logic, or 2nd-order logic? Should we believe in such a thing as a standard model of the natural numbers? Should we trust proofs which make use of those concepts?

[-]Johnicholas11y110

Feeling our way into a new formal system is part of our (messy, informal) pebblecraft. Sometimes people propose formal systems starting with their intended semantics (roughly, model theory). Sometimes people propose formal systems starting with introduction and elimination rules (roughly, proof theory). If the latter, people sometimes look for a semantics to go along with the syntax (and vice versa, of course).

For example, lambda calculus started with rules for performing beta reduction. In talking about lambda calculus, people refer to it as "functions from functions to functions". Mathematicians knew that there was no nontrivial set S with a bijection to the set of all functions from S to S. So something else must be going on. Dana Scott invented domain theory partially to solve this problem. Domains have additional structure, such that some domains D do have bijections with the set of all structure-preserving maps from D to D. http://en.wikipedia.org/wiki/Domain_theory Similarly, modal logics started with just proof theory, and then Saul Kripke invented a semantics for them. http://en.wikipedia.org/wiki/Kripke_semantics

There's always a syntactical model which you can ... (read more)

6Vladimir_Nesov11y

Nice. This seems like a surprisingly well-motivated way of reducing everything to physics: there's just "syntactic" machinery made out of physics, and any semantics that might be attributed to parts of this machinery is merely a partially informal device (i.e. a collection of cognitive skills) that human mathematicians might use as an aid for reasoning about the machinery. Even when the machinery itself might in some sense be said to be semantically reasoning about something or other, this description of the machinery can be traced back to human mathematicians who use it as a partially informal device for understanding the machinery, and so it won't strictly speaking be a property of the machinery itself. In other words, in this view semantics is an informal art primarily concerned with advancement of human understanding, and it's not fundamental to the operation of intelligence in general, it's not needed for properly designing things, responding to observations or making decisions, any more than curiosity or visual thinking.

0abramdemski11y

Ok, right. I think I didn't fully appreciate your point before. So the fact that a particular many-sorted first-order logic with extra axioms is proof-theoretically equivalent to (a given proof system for) 2nd-order logic should make me stop asking whether we should build machines with one or the other, and start asking instead whether the many-sorted logic with extra axioms is any better than plain first-order logic (to which the answer appears to be yes, based on our admittedly 2nd-order reasoning). Right?

4Johnicholas11y

The prevalence of encodings means that we might not be able to "build machines with one or the other". That is, given that there are basic alternatives A and B and A can encode B and B can encode A, it would take a technologist specializing in hair-splitting to say whether a machine that purportedly is using A is "really" using A at its lowest level or whether it is "actually" using B and only seems to use A via an encoding. If in the immediate term you want to work with many-sorted first order logic, a reasonable first step would be to implement single-sorted first order logic, and a preprocessor to translate the many-sorted syntax into the single-sorted syntax. Then further development, optimizations and so on, might complicate the story, compacting the preprocessor and single-sorted engine into a single technology that you might reasonably say implements many-sorted logic "natively". The prevalence of encodings means that apparently different foundations don't always make a difference.

3abramdemski11y

In this case, that's not true. The many-sorted logic, with axioms and all to emulate 2nd-order logic, has different properties than plain 1st-order logic (even though we may be emulating it in a plain 1st-order engine). For example, in 2nd-order logic, we can quantify over any names we use. In 1st-order logic, this is not true: we can quantify over 1st-order entities, but we are unable to quantify over 2nd-order entities. So, we can have named entities (predicates and relations) which we are unable to quantify over. One consequence of this is that, in 1st-order logic, we can never prove something non-tautological about a predicate or relation without first making some assumptions about that specific predicate or relation. Statements which share no predicates or relations are logically independent! This limits the influence of concepts on one another, to some extent. This sort of thing does not hold for 2nd-order logic, so its translation doesn't hold for the translation of 2nd-order logic into 1st-order logic, either. (Basically, all statements in this many-sorted logic will use the membership predicate, which causes us to lose the guarantee of logical independence for other named items.) So we have to be careful: an encoding of one thing into another thing doesn't give us everything.

[-]Qiaochu_Yuan11y100

Suggestion: if your goal is AI, maybe abandon logic and set theory in favor of type theory. It seems better suited to computers anyway thanks to computational trinitarianism. The problem of pinning down models uniquely seems to be solved by something called the univalence axiom, which I currently don't understand at all.

6JasonGross11y

The univalence axiom has nothing to do with pinning down models (except perhaps that if you are talking about models internally, it says that you're pinning them down up to isomorphism rather than up to some notion of syntactic equality). Non-standard models are essential for getting an intuition about what is and isn't provable; most of the arguments that I've seen for unexpectedly interesting terms involve picking a non-standard model, finding a point in that model with unexpected properties (e.g., something that is false in the standard model), and then construct that point. See, for example, this post, which constructs a nonstandard point on the circle, one for which you can't exhibit a path from it to the basepoint of the circle. More generally, see The Elements of an Inductive Type for a good explanation of why non-standard models are inescapable.

2vi21maobk9vp11y

I guess that understanding univalence axiom would be helped by understanding the implicit equality axioms. Univalence axiom states that two isomorphic types are equal; this means that if type A is isomorphic to type B, and type C contains A, C has to contain B (and a few similar requirements). Requiring that two types are equal if they are isomorphic means prohibiting anything that we can write to distinguish them (i.e. not to handle them equivalently).

-1abramdemski11y

Just so you know... I want to upvote you for linking to that awesome type theory article (although JeremyHahn posted it to this discussion already), but I also want to vote you down for computational trinitarianism (propositions are not types! arggggg!). ;)

-1abramdemski11y

I'm generally not on board with the whole "programs are proofs" dogma. When I look into the details, all it says is that a successful type-check of a program gives a proof of the existence of a program of that type. In expressively powerful languages, this seems very uninteresting; so far as I know, you can easily construct a program of any type by just discarding the input and then creating an output of the desired type. (Handling the empty type is a little tricky, but I wonder if infinite loops or program failures would do.) So my understanding is that you only get the interesting type structure (where it isn't trivial to prove the existence of a type) by restricting the language. (Typed lambda calculus is not Turing-complete.) So, while I see the curry-howard stuff as beautiful math, I don't think it justifies the 'programs as proofs' mantra that many have taken up! I don't see the appeal.

0Pfft11y

I think Harper (who coined the phrase "computational trinitarianism") doesn't claim that "programs are proofs", but rather that proofs are programs. That is, proofs should be written down in typed calculi and (to be intuinistically acceptable) should have computational content. But programming languages and type systems can be used for more things than just proving theorems.

0abramdemski11y

If course this isn't quite true. Other interesting things which are said include the idea that a proof of a conjunction can be handled as a pair of proofs (one for each element of the conjunction), and a proof for a disjunction can be a proof for either side, and so on; so in a sufficiently rich type system, we can leverage the type machinery to do logic, using product types as conjunctions, sum types as disjunctions, and so on. A proposition is then modeled as a type for a proof. An implication is a function type, proven by a procedure which transforms a proof of one thing into a proof for another. Still, I can't swallow "programs are proofs / propositions are types".

-1lukstafi11y

Actually, by discarding the input and then falling into an infinite loop (you cannot construct the output of the desired type without the input).

0abramdemski11y

I don't know what you mean. Suppose you want a function A -> Int, where A is some type. \x:A. 6, the function which takes x (of type A) and outputs 6, seems to do fine. To put it in c-like form, int six(x) {return 6}. If programs are proofs, then general programming languages correspond to trivialism. Am I missing something?

1Johnicholas11y

The type constructors that you're thinking of are Arrow and Int. Forall is another type constructor, for constructing generic polymorphic types. Some types such as "Forall A, Forall B, A -> B" are uninhabited. You cannot produce an output of type B in a generic way, even if you are given access to an element of type A. The type corresponding to a proposition like "all computable functions from the reals to the reals are continuous" looks like a function type consuming some representation of "a computable function" and producing some representation of "that function is continuous". To represent that, you probably need dependent types - this would be a type constructor that takes an element, not a type, as a parameter. Because not all functions are continuous, the codomain representing "that function is continuous" isn't in general inhabited. So building an element of that codomain is not necessarily trivial - and the process of doing so amounts to proving the original proposition. What I'm trying to say is that the types that type theorists use look a lot more like propositions than the types that mundane programmers use.

-2abramdemski11y

...except in Haskell, where we can make functions for generating instances given a desired type. (Such as Magic Haskeller.) I still feel like it's accurate to say that general-purpose languages correspond to trivialism. Thanks, though! I didn't think of the generic version. So, in general, inhabited types all act like "true" in the isomorphism. (We have A->B wherever B is a specific inhabited type.) I'm fully aware of that, and I enjoy the correspondence. What bothers me is that people seem to go from a specific isomorphism (which has been extended greatly, granted) to a somewhat ill-defined general principle. My frustration comes largely from hearing mistakes. One extreme example is thinking that (untyped!) lambda calculus is the universal logic (capturing first-order logic and any higher-order logic we might invent; after all, it's Turing complete). Lesser cases are usually just ill-defined invocations (ie, guessing that the isomorphism might be relevant in cases where it's not obvious how it would be).

1Johnicholas11y

Magic Haskeller and Augustsson's Djinn are provers (or to say it another way, comprehensible as provers, or to say it another way, isomorphic to provers). They attempt to prove the proposition, and if they succeed they output the term corresponding (via the Curry-Howard Isomorphism) to the proof. I believe they cannot output a term t :: a->b because there is no such term, because 'anything implies anything else' is false.

0gjm9y

The type on the right-hand side is usually something much more complicated and "creating an output of the desired type" is not trivial.

0PhilGoetz9y

What is "trivialism"?

2Good_Burning_Plastic9y

https://en.wikipedia.org/wiki/Trivialism I guess.

0lukstafi11y

It is not that difficult to write general-purpose functions in total languages like Coq. You just provide the functions with "fuel" to "burn" during computation. You don't have to express all features of interest in the type. But then you cannot rely on automation. "Programs are proofs" slogan is interesting from yet another point of view. What are the theorems for which a complex program is the simplest proof? There can be multiple such "theorems" ("principal types" property doesn't hold in complex settings).

0lukstafi11y

Well yes, either direct input or wider context (i.e. environment, library). You can only construct values for tautology types with expressions not referring to context.

[-]hairyfigment11y80

ZF shouldn't be able to prove that some set is a model of ZF, because that's not true in all models. Many models of ZF don't contain any individual set well-populated enough for that one set to be a model of ZF all by itself."

Is this true because of the absent C, true in the sense of the (larger) model, or just false?

4Qiaochu_Yuan11y

ZF can't prove that models of ZF exist because proving models of ZF exist is equivalent to proving that ZF is consistent, and ZF can't prove its own consistency (if it is in fact consistent) by the incompleteness theorem. I don't think ZFC can prove the consistency of ZF either, but I'm not a set theorist.

4earthwormchuck16311y

Also not a set theorist, but I'm pretty sure this is correct. ZF+Con(ZF) proves Con(ZFC) (see http://en.wikipedia.org/wiki/Constructible_universe), so if ZFC could prove Con(ZF) then it would also prove Con(ZFC).

0Eliezer Yudkowsky11y

You can substitute ZFC for ZF throughout the quoted paragraph and it will still be true. I don't know what you want me to get from the link given; it doesn't seem to contradict the quoted paragraph. The part where the link talks about Vk entailing ZFC for inaccessible k is exactly what the quoted paragraph is saying.

[-]hairyfigment11y100

Zah? The linked answer says,

Every model of ZFC has an element that is a model of ZFC.

It goes on to say that this need not hold when we interpret "model of ZFC" in the sense of the (outer) model.

4Eliezer Yudkowsky11y

So after reading that, I don't see how it could be true even in the sense described in the article without violating Well Foundation somehow, but what it literally says at the link is that every model of ZFC has an element which encodes a model of ZFC, not is a model of ZFC, which I suppose must make a difference somehow - in particular it must mean that we don't get A has an element B has an element C has an element D ... although I don't see yet why you couldn't construct that set using the model's model's model and so on. I am confused about this although the poster of the link certainly seems like a legitimate authority. But yes, it's possible that the original paragraph is just false, and every model of ZFC contains a quoted model of ZFC. Maybe the pair-encoding of quoted models enables there to be an infinite descending sequence of submodels without there being an infinite descending sequence of ranks, the way that the even numbers can encode the numbers which contain the even numbers and so on indefinitely, and the reason why ZFC doesn't prove ZFC has a model is that some models have nonstandard axioms which the set modeling standard-ZFC doesn't entail. Anyone else want to weigh in on this before I edit? (PS upvote parent and great-grandparent.)

6Academian11y

Here's why I think you don't get a violation of the axiom of well-foundation from Joel's answer, starting from way-back-when-things-made-sense. If you want to skim and intuit the context, just read the bold parts. 1) Humans are born and see rocks and other objects. In their minds, a language forms for talking about objects, existence, and truth. When they say "rocks" in their head, sensory neurons associated with the presence of rocks fire. When they say "rocks exist", sensory neurons associated with "true" fire. 2) Eventually the humans get really excited and invent a system of rules for making cave drawings like "∃" and "x" and "∈" which they call ZFC, which asserts the existence of infinite sets. In particular, many of the humans interpret the cave drawing "∃" to mean "there exists". That is, many of the same neurons fire when they read "∃" as when they say "exists" to themselves. Some of the humans are careful not to necessarily believe the ZFC cave drawing, and imagine a guy named ZFC who is saying those things... "ZFC says there exists...". 3) Some humans find ways to write a string of ZFC cave drawings which, when interpreted --- when allowed to make human neurons fire --- in the usual way, mean to the humans that ZFC is consistent. Instead of writing out that string, I'll just write in place of it. 4) Some humans apply the ZFC rules to turn the ZFC axiom-cave-drawings and the cave drawing into a cave drawing that looks like this: "∃ a set X and a relation e such that <(X,e) is a model of ZFC>" where <(X,e) is a model of ZFC> is a string of ZFC cave drawings that means to the humans that (X,e) is a model of ZFC. That is, for each axiom A of ZFC, they produce another ZFC cave drawing A' where "∃y" is always replaced by "∃y∈X", and "∈" is always replaced by "e", and then derive that cave drawing from the cave drawing " and " according to the ZFC rules. Some cautious humans try not to believe that X really exists... only that ZFC and the consistency of ZF

0vi21maobk9vp11y

Well Foundation in V_alpha case seems quite simple: you build externally-countable chain of subsets which simply cannot be represented as a set inside the first model of ZFC. So the external WF is not broken because the element-relation inside the models is different, and the inner WF is fine because the chain of inner models of external ZFC is not an inner set. In the standard case your even-numbers explanation nicely shows what goes on — quoting is involved. I need to think a bit to say what woud halt our attempts to build a chain of transitive countable models...

0vi21maobk9vp11y

Well, technically not every model of ZFC has a ZFC-modelling element. There is a model of "ZFC+¬Con(ZFC)", and no element of this monster can be a model of ZFC. Not even with nonstandard element-relation.

5Eliezer Yudkowsky11y

Linked impressive authority says the model has a ZFC-model-encoding element, plus enough nonstandard quoted ZFC axioms in-model that in-model ZFC doesn't think it's a ZFC-model-encoding element. I.e., the formula for "semantically entails ZFC" is false within the model, but from outside, using our own standard list of axioms, we think the element is a model of ZFC.

0vi21maobk9vp11y

Ah, sorry, I didn't notice that the question is about model of ZFC inside a "universe" modelling ZFC+Con^∞(ZFC)

2Qiaochu_Yuan11y

The linked answer is talking about models of ZFC in an external sense and Eliezer is talking about them in an internal sense.

0hairyfigment11y

I'll make this more explicit, and note that at present I don't see it affecting anyone's argument: ZF can't prove its own self-consistency and thus can't prove it has a model. You logician says that's as it should be, since not every model of ZF contains another model - which seems true provided we treat each model more as an interpretation than as an object in a larger universe - and then says that a model has to be "well-populated" - which AFAICT holds for standard models, but not for all other interpretations like those your logician used in the previous sentence.

[-]Rob Bensinger11y70

This discussion seems to equate the question of what language we want an AGI to most often reason with, and what language we think our universe, or (Tegmark?-)reality in general, is likely 'written' in. But there may be practical virtues of one language over the other that do not translate into distinct reasons to believe that reality is written in the more practical language; so the two questions should be kept separate to some extent. 'Think like reality' is a heuristic for helping alleviate dissonance between our beliefs, and better aligning our beliefs... (read more)

1abramdemski11y

We should get this automatically in many cases. Classical logic can think about dialetheism and other possibilities by inventing negative predicates for every predicate, say. The rules relating positive to negative can then be chosen at convenience. (In first-order logic, we would need to invent a negative predicate separately for each positive; in second-order logic, we could make a general statement to the effect that each positive predicate has a negative, and make statements about how they combine.) That's one reason we don't need to worry too much about which logic is correct; as johnicolas keeps pointing out, the existence of encodings means (at least in many cases) we get the same power regardless of what choice we make.

2Rob Bensinger11y

In addition to conventional negation, or in lieu of it? Remember that, on dialetheism, we may not know in advance which falsehoods are going to end up later turning out to also be true. So we'd never be able to assert real negations (as opposed to 'negativized' predicates) if we want to completely protect ourselves from ever going paraconsistent/trivialist (depending on where our reasoning system puts us when we encounter a contradiction). This issue isn't just one of computational power; it's of choosing priors for our AGI's basic interpetations of reality. I distinguished the two questions "what language we want an AGI to most often reason with" v. "what language we want the AGI to think our world is written in;" if we think that the AGI's functionality will best be met if it has a somewhat false view of the world in some respect (which is dangerous but might have non-insane justifications), then the latter question will be distinct from "what language the world is actually written in." But I can agree with you that it's mostly just an engineering problem to answer the first question (i.e., we don't need to settle every metaphysical debate before we can start coding!), without thinking this trivializes the second question.

5abramdemski11y

What I'm saying is it doesn't matter, and therefore doesn't obviously make sense to ask, what language the world is written in. Even if the world actually runs on paraconsistent logic, we can still do fine with classical logic. It will alter our prior somewhat, but not a while lot, because the encoding of one into the other isn't so hard. Eliezer is attempting to make some comparison between the structure of the actual world and the structure of the logic, to determine which logic most seems to be what the world is written in. But the existence of encodings of one logic in another makes this exercise somewhat unnecessary (and also means that the structure of the world is only weak evidence for "what logic it was written in"). To answer your question: in addition, because having strong negation in addition to weak negation only increases the expressive power of the system. It does not increase the risk you mention, because the system is still choosing what to believe according to its learning algorithm, and so will not choose to believe strong negatives that cause problems. EDIT: In response to srn347, I would like to clarify that I do not intend to belittle the importance of comparing the effectiveness of different logics. The existence of an encoding will only get you specific things (depending on the nature of the encoding), and furthermore, encodings to not always exist. So, specifically, I'm only saying that I don't see a need for paraconsistent logic. Other possibilities need to be judged on their merits. However, looking for encodings is an important tool in that judgement.

2Rob Bensinger11y

'Doesn't matter' in what sense? 'Doesn't make sense' in what sense? Be careful not to confuse the literal question 'Is our world actually a simulation programmed in part using second-order logic?' from the more generic 'is our world structured in a second-order-logic-ish way?'. The latter, I believe, is what Eliezer was concerned with -- he wasn't assuming we were in a simulation, he was asking whether our universe's laws or regularities are property-generalizing in a second-order-style way, or schematic in a first-order-style way, the importance being that schematic laws are more likely to vary across spacetime. Similarly, the important question of whether there are dialetheias should not be confused with the more occult question of whether our universe is a simulation written using paraconsistent logics (which, of course, need not involve any true contradictions). This is confusing a number of different issues. Paraconsistent logic need not be dialetheist, and not all dialetheists favor paraconsistent logic (though, for obvious reasons, nearly all of them do). You're reconfusing the issues I disentangled in my earlier posts, while (if I'm not misunderstanding you) trying to make a scrambled version of my earlier point that our choice of logic to have the AGI reason with is partly independent of our choice of logic to have the AGI think the universe structurally resembles. If you think it doesn't make sense to ask what logics our universe structurally resembles, remember that the structural similarity between arithmetic operations and physical processes is what makes mathematics useful in the first place, and that we should not be surprised to see similarities between the simplest systems we can come up with, and a universe with remarkably simple and uniform behavior, at the fundamental level. That's assuming it never makes a mistake early on. If it does, and it's not a paraconsistent reasoner, its epistemology will explode. You underestimate the virtues of par

2abramdemski11y

I'm imagining a better-designed system than that. Incoming facts together with internal reasoning would constantly be invalidating various theories; the system should have a bunch of commitments which explode on it, but rather, should always be seeking out contradictions in order to invalidate theories. Perhaps so. I don't actually understand what it would mean to reason with a paraconsistent logic while still believing a classical logic. Is something like that possible? Or (as it seems) would reasoning via a paraconsistent logic have to reduce the number of tautologies I recognize? I've sort of ignored paraconsistent thought merely because I'm not interested in dialetheism... Yes, re-reading your original post, I can see that I missed that point (and was partially re-stating it). The actual point I'm trying to make is basically to respond to this part: by saying that we get much of that automatically. Really, what I should have said is that we can achieve this by choosing as expressive a logic as possible, to ensure that as many other logics as possible have good embeddings in that logic. This could alternatively be stated as an expression of confusion about why you would want to address this concern separately from the choice of logic, since if we attempted to implement the aforementioned inner-dialog reasoning, we would surely have to provide a logic for the reasoning to take place in.

2Rob Bensinger11y

Not only is it possible, but probably over 99% of people who employ paraconsistent logics believe in classical-logic metaphysics, or at least something a lot closer to classical-logic metaphysics than to dialetheism. Paraconsistent logic is just reasoning in such a way that when you arrive at a contradiction in your belief system, you try to diagnose and discharge (or otherwise quarantine) the contradiction, rather than just concluding that anything follows. Dialetheism is one way (or family of ways) to quarantine the contradiction so that it doesn't yield explosion, but it isn't the standard one. Paraconsistent reasoning is a proof methodology, not a metaphysical stance in its own right. Within reason. We may not want the AGI to be so expressive that it can express things we think are categorically meaningless and/or useless. And the power to express some meaningful circumstances might come with costs that outweigh the expected utility. I suppose one way to go about this is to pick the optimal level of expressivity given the circumstances we expect the AGI to run into, but try to make the AGI able to self-modify (or generate nonclassical subsystems with which it can carry on a reasonable, open-minded dialogue) to increase expressivity if it runs into situations that seem especially anomalous given its conception of what a circumstance or fact is. The basic problem is: How does one assign a probability to there being true propositions that are intrinsically ineffable (for non-complexity-related reasons)? A good starting place is to imagine a being that had far less logical expressivity than we do (e.g., someone who could say 'p' / 'true' or (as a primitive concept) 'unknown' / 'unproven' but could not say 'not-p' / 'false'), and reason by analogy from this base case. Ideally, if two subsystems of an AGI are designed to reason with different logics, and we want the two to argue over the best interpretation of a problem case, we would settle the dispute by some rul

2abramdemski11y

I think you are overestimating the difficulty of the mathematical problem here! To quote JoshuaFox: But once those problems are solved, we do not need to additionally solve the problem you are highlighting, I think... When it comes down to it, wouldn't this be just like some logic that is the common subset of the two, or perhaps some kind of average (between the probability distributions on observations induced by each logic)? Again, I think this will be handled well enough (handled better, to be precise) by a more powerful logic which can express each of the two narrower logics as a hypothesis about the structure in which the environment is best defined. Then the honest argument you describe will be a result of the honest attempt of the agent to estimate probabilities and find plans of action which yield utility regardless of the status of the unknowns.

1[anonymous]11y

If I may interject (assuming it isn't too early to start proposing solutions), it does turn out to be the case that computability logic is a superset of linear logic, which encodes resource-boundedness and avoids material entailment paradoxes, intuitionistic logic, which encodes proof/justification, and classical logic, which encodes truth. To accept less is to sacrifice one or more of the above attributes in terms of expressiveness.

0abramdemski11y

Thanks; I certainly didn't intend to say that all logics are equivalent so it doesn't matter... edited to clarify...

0Eugine_Nier11y

True, but these kinds of tricks might make hash of its utility function.

0abramdemski11y

Very true! But that is a problem that already needs to be solved, separately. :)

-1Eugine_Nier11y

Yes, and friendliness is already a hard enough problem, without people dumping the hard parts of other problems into it.

0abramdemski11y

I don't think you can avoid solving this one in the first place, so I wouldn't call it dumping. (The problem is already inherent in the friendliness problem, so we're not adding anything new to the friendliness problem by keeping this out of the logic problem.) The fact is that goals which have to do with external objects need to make use of "fuzzy" descriptions of those objects. There should be some reasonable way of formalizing fuzziness. (Of course standard fuzzy logic theory is woefully inadequate, but it has roughly the right goal, so we can re-use the name.) Utility needs to be a function of some "empirical clusters" in reality, specified independently from the underlying physics (or complete list of border cases) of those entities. Encoding a different logic into our logic is not much different from encoding a physics, so I think there is not a separate problem to be solved.

[-]Wei Dai11y60

What I gather is that the real question here isn't so much "First or second order logic?", but "Is it sensible to try to convey/instill the idea of 'largest powerset' to/in an AI, and if so how?" "Largest powerset" seems both meaningful and simple (for the purpose of applying Occam's razor), but we don't know how to capture its meaning using syntactic proof rules (or rather, we know that we can't). Maybe we can capture the meaning some other way, for example in the form of a "math intuition module" that can arrive at "probabilistic" conclusions about mathematical structures that are defined in part by "largest powerset"?

2abramdemski11y

We know we can't do that either, although we can do slightly better than we can with monotonic logic (see my response).

2Wei Dai11y

I guess the result you're referring to is that there is no TM whose output converges to 1 if its input is a true statement (according to standard semantics) in second order PA and 0 if it's false. But don't we need stronger results to conclude "we can't do that either"? For example, take a human mathematician and consider his or her beliefs about "the standard numbers" as time goes to infinity (given unlimited computing power). Can you show that those beliefs can't come closer to describing the standard numbers than some class of other mathematical structures?

2abramdemski11y

Yes, you're right. I was being too severe. We can't "capture the meaning" probabilistically any better than we can with classical truth values if we assume that "capture the meaning" is best unpacked as "compute the correct value" or "converge to the correct value". (We can converge to the right value in more cases than we can simply arrive at the right value, of course. Moving from crisp logic to probabilities doesn't actually play a role in this improvement, although it seems better for other reasons to define the convergence with shifting probabilities rather than suddenly-switching nonmonotonic logic.) The main question is: what other notion of "capture the meaning" is relevant here? What other properties are important, once accuracy is accounted for as best is possible? Or should we just settle for as much accuracy as we can get, and then forget about the rest of what "capture the meaning" seems to entail?

[-][anonymous]11y60

What I got out of this post is that second-order logic is a lie.

"The mathematicians who deny second-order logic would see that reasoning as invoking an implicit background universe of set theory. Everything you're saying makes sense relative to some particular model of set theory, which would contain possible models of Peano arithmetic as sets, and could look over those sets and pick out the smallest in that model. Similarly, that set theory could look over a proposed model for a many-sorted logic, and say whether there were any subsets within the l

... (read more)

2vi21maobk9vp11y

Worse. You are being tricked into believing that ZFC is at all a definition of a set at all, while it is just a set of restrictions on what we would tolerate. In some sense, if you believe that there is only one second-order model of natural numbers, you have to make decisions what are the properties of natural numbers that you can range over; as Cohen has taught us, this involves making a lot of set-theoretical decisions with continuum hypothesis being only one of them.

[-]JasonGross11y50

Screw set theory. I live in the physical universe where when you run a Turing machine, and keep watching forever in the physical universe, you never experience a time where that Turing machine outputs a proof of the inconsistency of Peano Arithmetic. Furthermore, I live in a universe where space is actually composed of a real field and space is actually infinitely divisible and contains all the points between A and B, rather than space only containing a denumerable number of points whose existence can be proven from the first-order axiomatization of the r

... (read more)

[-]bryjnar11y50

I'm still worried about the word "model". You talk about models of second-order logic, but what is a model of second-order logic? Classically speaking, it's a set, and you do talk about ZF proving the existence of models of SOL. But if we need to use set theory to reason about the semantic properties of SOL, then are we not then working within a first-order set theory? And hence we're vulnerable to unexpected "models" of that set theory affecting the theorems we prove about SOL within it.

It seems like you're treating "model" a... (read more)

3abramdemski11y

My response to this situation is to say that proof theory is more fundamental and interesting than model theory, and pragmatic questions (which the dialog attempted to ask) are more important than model-theoretic questions. However, to some extent, the problem is to reduce model-theoretic talk to more pragmatic talk. So it isn't surprising to see model-theoretic talk in the post (although I did feel that the discussion was wandering from the point when it got too much into models).

[-]abramdemski11y50

My reply got a bit long:

http://lesswrong.com/r/discussion/lw/g81/re_secondorder_logic_the_controversy/

[-]Tyrrell_McAllister11y40

"ZF can quote formulas as structured, and encode models as sets ...

Was there supposed to be a word after "structured" and before the comma?

[-]vi21maobk9vp11y40

Actually, in NBG you have explicitness of assumptions and of first-order logic — and at the same time axiom of induction is a single axiom.

Actually, if you care about cardinality, you need a well-specified set theory more than just axioms of reals. Second-order theory has a unique model, yes, but it has the notion of "all" subsets, so it just smuggles some set theory without specifying it. As I understand, this was the motivation for Henkin semantics.

And if you look for a set theory (explicit or implicit) for reals as used in physics, I am not e... (read more)

[-]Qiaochu_Yuan11y30

"So the question you asked me last time was, 'Why does anyone bother with first-order logic at all, if second-order logic is so much more powerful?'"

For what it's worth, mathematicians care about first-order logic because it satisfies nicer theorems than second-order logic, in the same way that mathematicians care about Lebesgue integration because it satisfies nicer theorems than Henstock-Kurzweil integration. The statements that mathematicians talk about in many contexts are first-order anyway, so we might as well figure out what there is to... (read more)

4JeremyHahn11y

I do not think the situation is as simple as you claim it to be. Consider that a category is more general than a monoid, but there are many interesting theorems about categories. As far as foundations for mathematical logic go, any one interested in such things should be made aware of the recent invention of univalent type theory. This can be seen as a logic which is inherently higher-order, but it also has many other desirable properties. See for instance this recent blog post: http://golem.ph.utexas.edu/category/2013/01/from_set_theory_to_type_theory.html#more That univalent type theory is only a few years old is a sign we are not close to fully understanding what foundational logic is most convenient. For example, one might hope for a fully directed homotopy type theory, which I don't doubt will appear a few years down the line.

0Qiaochu_Yuan11y

Sure. It has of course been the case that careful increases in generality have also led to increases in power (hence the weasel word "usually"). There is something like a "production-possibilities frontier" relating generality and power, and some concepts are near the frontier (and so one can't generalize them without losing some power) while some are not.

[-]hairyfigment9y20

"ZF can quote formulas as structured, and encode models as sets, and then represent a finite ZF-formula which says whether or not a set of quoted formulas is true about a model. ZF can then prove that no step of classical logic goes from premises that are true inside a set-model, to premises that are false inside a set-model. In other words, ZF can represent the idea 'formula X is semantically true in model Y' and 'these syntactic rules never produce semantically false statements from semantically true statements'."
Wait, then why can't ZF prove

... (read more)

[-]dankane11y20

Eliezer - why is it that you seem to adamant that our physical reality is better modeled by the standard natural numbers rather than some supernatural model? If we try to see whether a physical implementation of a Turing machine provides an inconsistency in PA, then of course it won't because it will run for fewer than Ackermann(20) iterations, and all number theories say that it won't halt by then. If we instead lived in something that looked like the game of life and we did expect it to last infinitely long why exactly would we expect time steps to be in... (read more)

1Rob Bensinger11y

Simplicity is an important consideration. There are many more ways for complex universes to have violated our expectations already, while still making sapience possible, than for simple universes to have done so. That is, the space of simple universes is much more constrained by anthropic principles than is the space of complicated ones; so it is surprising that we seem to have found ourselves in a universe largely describable with very simple and uniform logic and mathematics, since there are many more ways to be complicated than to be simple. And this surprising fact may suggest that exotic logical, mathematical, or metaphysical structures do not exist, or are in some other respect not candidates for our anthropic reasoning. In other words, we can draw a tentative inductive conclusion regarding whether spacetime is nonstandard overall by considering all the ways in which our theories to date have been surprisingly successful and general while appealing almost exclusively to the simplest (which often corresponds to 'most standard,' with a few important caveats) conceivable systems. This won't be conclusive, but it's at least suggestive.

1dankane11y

I guess here's my point. We would not expect to have seen any empirical differences between standard and non-standard models of spacetime yet. There would not be any observable differences until we reach supernatural times. I suppose you might discount the nonstandard models based on their not being computable or something, but I'm not convinced that this is the right thing to do. I also agree that the standard model should be given more weight (due to its conceptual simplicity to pin down) than any particular nonstandard model, but I don't see why it should be given more weight than all of them put together. The even more concise (or at least less potentially problematic) theory would be to say that we are living in some model of these axioms and not bother to specify which.

0Rob Bensinger11y

I'm not sure that it does deserve more weight than all of them put together. That largely rests on how complicated our needlessly complicated models have to be in order to recover the empirical world. If there are a lot of relatively simple ways to perfectly reproduce the empirical predictions of our standard models to date, then that will make a bigger difference than if there are an even greater number of fantastically gerrymandered, convoluted systems. Even though there are more complicated universes that perfectly generate our observations to date than there are simple universes that do so, given proper subsets of our universe, assuming a structurally simple universe tends to yield more accurate predictions than does assuming a structurally complex universe. Although in absolute terms there are so many more ways to be complicated than to be simple that given almost any data, most models predicting that data are exceedingly complicated and gerrymandered... ... and although it is not clear whether sentience exists in a larger proportion of complicated worlds than of simple ones (or vice versa)... ... nevertheless it seems clear that a larger proportion of the simple than of complex worlds that predict some of our experience predict all of it. That is, there are proportionally many more ways to complicatedly predict most of our observations and then spectacularly fail to predict the last few, than there are ways to complicatedly predict most of our observations and then predict the last few as well. This, however, is still speaking very generally; the degree of complexity will make a huge difference here, and not all non-standard models are radically more complicated.

1dankane11y

But when we are talking about which model of set theory to use to model our experiences, all of the non-standard models give exactly the same predictions to date (and will continue to do so until we reach non-standard times). And not all of these models are horribly complicated. There's the model of constructable set theory, where you only have the sets absolutely guaranteed you by the axioms. This gives you a countable model of set theory (and hence not the standard one), but it is still easily describable.

0Eugine_Nier11y

How do we know they haven't?

-2MugaSofer11y

Interesting choice of example. Not objecting to it, just interesting.

0dankane11y

Actually, here's maybe a better way of saying what I'm trying to get at here: What evidence would actually convince you that we lived in a universe given by a non-standard model of the natural numbers? Before you say "I run a computer program which returns a proof of 0=1 in PA", think about it for a while. Ignoring the fact that in practice, you would probably suspect that the proof is wrong, might you instead take this as evidence that PA is actually inconsistent. After all, Godel tells us that we can't really be sure about that. In fact, if you did live in a nonstandard model of PA, and found a nonstandard proof of 0=1, wouldn't it feel from the inside like you found an inconsistency in PA rather than that you found a nonstandard number?

[-]DanArmak11y20

Then you could arguably deny that shouting about the 'standard' numbers wouldn't mean anything

Too many contradictions, I think: it should read "deny that shouting about the standard numbers would mean anything".

(Removed wrong edit here)

[-]Liron11y10

When I learned about Lowenhein-Skolem, I just thought "lol first-order logic sux". I didn't question whether distinguishing among infinite cardinalities was a meaningful thing for humans to do. Eliezer is clearly a higher-order student of logic.

7Eliezer Yudkowsky11y

Er, to be clear about not taking credit, this is a long-running philosophical debate in mathematics. The parts about "Well, what physics, then?" I haven't read elsewhere but that could just be limited reading.

4abramdemski11y

And what about Skolem's Paradox? Does it make you think "lol set theory sux"? :)

[-]moshez11y10

As far as complexity-of-logic-theories-for-reason-of-believing-in-them, that should be proportional to the minimal Turing machine that would check if something is an axiom or not. (Of course, in the case of a finite list, approximating it to the total length of the axioms is reasonable, because the Turing machine that does "check if input is equal to following set:" followed by set adds a constant size to the set -- but that approximation breaks down badly for infinite axiom schema).

[-]albtross11y00

hey that was really interesting about whether or not "all properties" in second order logic falls victim to the same weirdness as the idea of a natural number does in first order logic. I never thought of that before.

oh yeah do you have a reference for "there's tricks for making second-order logic encode any proposition in third-order logic and so on."? I would like to study that too, even though you've already given me a lot of things to read about already!

3abramdemski11y

http://plato.stanford.edu/entries/logic-higher-order/

-5albtross11y

[-]mfb11y00

That might sound weird, but do we have any evidence that our time follows the standard numbers (or a continuous version of them) only? Is it even possible to get such evidence? Maybe our turing machine (looking for contradictions in PA) stops at -2* - "we" cannot see it, as "we" are on the standard numbers only, experiencing only effects of previous standard numbers.

4Qiaochu_Yuan11y

Time is a property of the map, not a property of the territory. The territory probably doesn't have anything that could reasonably be called time, along the lines of the timeless physics posts.

0prase11y

In this approach, anything is a property of the map only. (One thing is to say that time isn't, from the point of view of the fundamental microscopic laws, as universal and important as might intuitively appear. Another thing is to say that it doesn't exist in the territory. There are natural clocks, such as decaying radioctive elements, whose existence has little common with human "maps".)

-2Peterdjones11y

Barbour's universe/multiverse is much more complex than a classical timeful universe. And it isn't especialy likely, either. Specualtion =/= proof. To say the least.

4khafra11y

Could you expand on that? I had the impression that timeless physics was larger in RAM, but not on disk--where it counts for MML, Solomonoff Induction, etc.

0Peterdjones11y

I'm afraid I don't understand your RAM/disk metaphor.

2Nornagest11y

I took it to mean that it requires you to keep track of more state at any given time, but that the whole system has less logical complexity when you write it all down. Sort of like the difference between estimating the future state of a restricted N-body problem with numerical methods vs. solving it directly.

0Peterdjones11y

There are no "given times" in Barbour's universe. And who's the "you"? An internal or external observer?

[-]3p1cd3m0n9y-10

"Doesn't physics say this universe is going to run out of negentropy before you can do an infinite amount of computation?" Actually, there is a [proposal that could create a computer that runs forever.

1TheWakalix6y

Are these time crystals made up of protons? If so, proton decay might eventually cause it to disintegrate.

[-][anonymous]11y-10

"If there's a collection of third-order axioms that characterizes a model, there's a collection of second-order axioms that characterizes the same model. Once you make the jump to second-order logic, you're done - so far as anyone knows (so far as I know) there's nothing more powerful than second-order logic in terms of which models it can characterize."

You clearly can state Continuum Hypothesis in the higher order logic, while a 2nd order formulation seems elusive. Are you sure about it?

6[anonymous]11y

Eliezer is correct. See SEP on HOL.

2vi21maobk9vp11y

Link is good, but I guess direct explanation of this simple thing could be useful. It is not hard to build explicit map between R and R² (more or less interleaving the binary notations for numbers). So the claim of Continuum Hypothesis is: For every property of real numbers P there exists such a property of pairs of real numbers, Q such that: 1) ∀x (P(x) -> ∃! y Q(x,y)) 2) ∀x (¬P(x) -> ∀y¬Q(x,y)) (i.e. Q describes mapping from support of P to R) 3) ∀x1,x2,y: ((Q(x1,y)^Q(x2,y)) -> x1=x2) (i.e. the map is an injection) 4) (∀y ∃x Q(x,y)) ∨ (∀x∀y (Q(x,y)-> y∈N)) (i.e. map is either surjection to R or injection to a subset of N) These conditions say that every subset of R is either the size of R or no bigger than N.

[-][anonymous]11y-20

Second order logic may be able to express all expressible statements in 3rd order logic, 4th order, up to any finite Nth order, but perhaps there exist ∞th order statements that 2nd order logic cannot express. Albeit, such statements could never be fully decidable and would thus be, at best, semidecidable or co-semidecidable. This may not be complete, but science works the same way.

1abramdemski11y

Nope, 2nd-order logic can discuss Nth order where N is an infinite ordinal, as well [for some class of describable ordinals]. Maybe the argument could be made that 2nd-order logic cannot discuss a universe containing all the ordinals, and in that case maybe we could argue that set theory can, and is therefore more powerful. But this is not clear. Independently of that, we might also believe that category theory can describe things beyond set theory, because category theory regularly investigates categories which correspond to "large" sets (such as the set of all sets). There are ways to put category theory into set theory, but these translate the "large" sets into "small" sets such as Grothendeik universes, so we could still argue that the semantics of category theory is bigger. However, even if this is the case, it might be that 2nd-order logic can encode category theory in the same way that it can encode 3rd-order logic. We can add axioms to 2nd-order logic which describe non-standard set theories containing "big" sets, such as NF. This may allow for an "honest" account of category theory, in which categories really can be defined on "big" sets such as the set of all sets.

1Eliezer Yudkowsky11y

I don't think that infinitary logic is the same as ω-order logic.

-2[anonymous]11y

Wouldn't ω-order logic be a subset of infinitary logic? Or do I have it backwards?

5Eliezer Yudkowsky11y

I don't think they have anything to do with each other. Infinitary logic is first-order logic with infinite proof lengths. Second-order logic is finite proof lengths with quantification over predicates. I don't know if there's any particular known relation between what these two theories can express.

-4Decius11y

Third order logic talks about something that we don't understand, and we don't understand it in exactly the same way that the first-order aliens can't understand second order logic. One of the things third order logic might be able to do is determine the number of steps a Turing will take before halting in a constant-time algorithm.

2[anonymous]11y

Actually, the halting problem has been proven impossible, for if third order logic can be used to determine if/when a turing machine halts, a turing machine could run that on itself, but would take necessarily more than N steps to determine that it halts (where N is the number of steps it halts in), which contradicts. Third-order logic may be able to do the inconceivable, but not the noncomputable.

3Gurkenglas10y

That's not the standard proof, at least not the one I know: You haven't proven that it would necessarily take more than N steps. The way it goes is that you assume there is a machine that halts iff the machine you give it runs forever given itself as input; now if you run it on itself, if it halts, it is wrong, and if it doesn't halt, it is also wrong, meaning our assumption was wrong.

-6Decius11y

Moderation Log