
No coinductive datatype of integers

4 points · 04 May 2011 04:37PM

Followup to: What's a "natural number"?

While thinking about how to make machines understand the concept of "integers", I accidentally derived a tiny little math result that I haven't seen before. Not sure if it'll be helpful to anyone, but here goes:

You're allowed to invent an arbitrary scheme for encoding integers as strings of bits. Whatever encoding you invent, I can give you an infinite input stream of bits that will make your decoder hang and never give a definite answer like "yes, this is an integer with such-and-such value" or "no, this isn't a valid encoding of any integer".

To clarify, let's work through an example. Consider a unary encoding: 0 is 0, 1 is 10, 2 is 110, 3 is 1110, etc. In this case, if we feed the decoder an infinite sequence of 1's, it will remain forever undecided as to the integer's value. The result says we can find such pathological inputs for any other encoding system, not just unary.
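The unary example can be made concrete in a few lines of code. This is a sketch of my own (the names `unary_decode` and `undecided_after` are not from the post): the decoder consumes bits one at a time and answers only once it sees the terminating 0, so on the all-ones stream it never answers.

```python
from itertools import islice, repeat

def unary_decode(bits):
    """Consume bits from an iterator; return the decoded integer.

    On an infinite stream of 1's this loop simply never returns.
    """
    n = 0
    for b in bits:
        if b == 0:
            return n
        n += 1
    raise ValueError("stream ended before a terminating 0")

# Finite, well-formed inputs decode fine:
assert unary_decode(iter([1, 1, 1, 0])) == 3

# We can't actually run the decoder on an infinite stream, but we can
# observe that no finite prefix of all-ones ever yields a decision:
def undecided_after(k):
    try:
        unary_decode(islice(repeat(1), k))
        return False
    except ValueError:
        return True  # ran out of bits with no answer yet

assert undecided_after(10_000)
```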

The proof is obvious. (If it isn't obvious to you, work it out!) But it seems to strike at the heart of the issue why we can't naively explain to computers what a "standard integer" is, what a "terminating computation" is, etc. Namely, if you try to define an integer as some observable interface (get first bit, get last bit, get CRC, etc.), then you inevitably invite some "nonstandard integers" into your system.

This idea must be already well-known and have some standard name, any pointers would be welcome!

Comments (137)

Comment author: 06 May 2011 12:25:58AM 2 points [-]

The proof is obvious. (If it isn't obvious to you, work it out!)

I deem this obnoxious.

Comment author: 06 May 2011 01:09:38AM 1 point [-]

A proof in ROT13:

Gb rnpu cbffvoyr rapbqvat bs n pbhagnoyr frg ol svavgr ovg fgevatf gurer pbeerfcbaqf n ovanel gerr: ng gur ebbg abqr lbh tb yrsg (fnl) vs gur svefg ovg vf n mreb naq evtug vs vg'f n bar, gura ng gur arkg abqr lbh tb yrsg be evtug vs gur frpbaq ovg vf n mreb be bar, naq fb sbegu. Gur npghny vagrtref pbeerfcbaq gb grezvany abqrf, ernpurq ol n svavgr ahzore bs oenapuvatf. Znysbezrq ovg fgevatf, gbb, pbeerfcbaq gb grezvany abqrf, ng gur rneyvrfg cbvag gung jr pna gryy gurl'er znysbezrq. Ubjrire, fvapr gurer ner vasvavgryl znal vagrtref, naq bayl svavgryl znal cbffvoyr ovg fgevatf bs nal tvira yratgu, gurer zhfg or ng yrnfg bar vasvavgryl ybat cngu va gur gerr, naq vs jr srrq guvf frdhrapr bs mrebf naq barf vagb bhe pbzchgre, gur pbzchgre jvyy unat.

Comment author: 06 May 2011 01:57:36AM 3 points [-]

You seem to have left out the part of the proof where you do the actual work! ROT13'd:

Fxngpur fubjf gung gurer zhfg or neovgenevyl ybat cnguf; gur ceboyrz abj vf gb fubj gung gurer vf na vasvavgryl ybat cngu. Jr pbafgehpg guvf cngu nf sbyybjf: Fvapr gurer ner vasvavgryl znal cnguf, gurer ner rvgure vasvavgryl znal cnguf fgnegvat jvgu 0, be vasvavgryl znal cnguf fgnegvat jvgu 1. Vs gur sbezre ubyqf, fgneg jvgu 0, bgurejvfr fgneg jvgu 1. Fvapr juvpurire qvtvg jr cvpxrq, gurer ner vasvavgryl znal fgnegvat jvgu vg, jr pna ybbx ng gur arkg qvtvg naq ercrng guvf cebprff. Fvapr jr pna ercrng guvf neovgenevyl, guvf trgf hf na vasvavgryl ybat cngu. Va trareny guvf nethzrag trgf lbh jung'f xabja nf Xbavt'f Yrzzn.

Comment author: 06 May 2011 02:39:31AM 0 points [-]

Sorry, I have a bit of a skewed perspective about what's obvious. :P Once I perceived the connection to binary trees it seemed plain as day.

Comment author: 06 May 2011 06:14:35AM 2 points [-]

I find that for me, and many other people I know in the mathematics department of my university, once infinities, uncountability, and such enter the picture, the accuracy of intuition quickly starts to diminish, so it's wise to be careful and make sure the proof is complete before declaring it obvious. As a good example, note how surprising and notable Cantor's diagonal argument seemed the first time you heard it: it isn't obvious that the reals aren't countable when you don't already know that, so you might start trying to construct a counting scheme and end up with one that "obviously" works.

Comment author: [deleted] 04 May 2011 05:23:35PM 0 points [-]

How would an infinite stream of bits possibly encode an integer in the first place? All integers are finite, so unless you did something stupid like assign the infinite sequence 11111111... to the integer 37, an infinite number of bits should never correspond to any integer.

Comment author: 04 May 2011 05:43:49PM 2 points [-]

The idea is that you invent a system where each integer corresponds to a finite bit-string, but the lengths of these strings must be unbounded. Unary is one such code.

Then you could set up a computer program which decodes these strings, so you feed it one bit at a time, each time asking it 'has an integer been uniquely specified yet?' The OP's claim is that whatever encoding method you come up with, he can come up with an infinite string that will make your program keep saying "no integer has been uniquely defined yet" forever.

So, nobody is encoding integers as infinite strings, although there's no particular reason why you can't; in fact, the overwhelming majority of possible encodings use infinite strings.
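One way to picture this protocol (my own sketch, not the commenter's): model a decoder as a function from a finite prefix of bits to either an integer or "undecided". Two illustrative decoders follow; the second, a gamma-style code where leading zeros announce the length, has no terminator at all, yet it still has its own pathological stream.

```python
def unary(prefix):
    # 0 -> (0,), 1 -> (1, 0), 2 -> (1, 1, 0), ...
    if 0 in prefix:
        return prefix.index(0)  # bits after the first 0 are ignored here
    return None  # all ones so far: still undecided

def elias_gamma_like(prefix):
    # n >= 1 encoded as (len - 1) zeros, then the binary digits of n.
    # No terminator, but the all-zeros stream stalls this decoder forever.
    try:
        first_one = prefix.index(1)
    except ValueError:
        return None  # still reading the length prefix
    digits = prefix[first_one:first_one + first_one + 1]
    if len(digits) < first_one + 1:
        return None  # length announced, digits not all seen yet
    return int("".join(map(str, digits)), 2)

assert unary((1, 1, 0)) == 2
assert elias_gamma_like((0, 0, 1, 0, 1)) == 5  # "00101" encodes 5
# Each decoder has its own adversarial stream that keeps it undecided:
assert all(unary(tuple([1] * k)) is None for k in range(50))
assert all(elias_gamma_like(tuple([0] * k)) is None for k in range(50))
```

The point of the OP's result is that no cleverness in the encoding escapes this: some infinite stream always keeps the answer at `None`.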

Comment author: 04 May 2011 05:33:14PM 1 point [-]

The decoder doesn't have access to the fact that the stream is infinite. It can only query individual bits.

Comment author: [deleted] 04 May 2011 05:41:52PM 0 points [-]

Seems like you need to add some more explanation to the scenario to bring a wider audience up to speed.

Comment author: 04 May 2011 05:38:48PM 0 points [-]

But the stream isn't an integer. If you tell me I must assign this stream to an integer and give me no option to refuse, then I will be forced to spend the rest of my life decoding it. Ignoring that we can't give an infinite sequence in the first place, I am not sure this is a meaningful problem.

Comment author: [deleted] 04 May 2011 05:51:52PM 9 points [-]

I guess my objection is that your task is too obviously impossible. If I tell you that I can color any page in any coloring book, and you give me an infinite coloring book, I won't be able to finish, even though I know how to color. Similarly, if we have a decoder that can interpret infinitely many finite bit strings, it can be stumped by an infinite one, by virtue of its being infinite.

Comment author: [deleted] 04 May 2011 06:19:16PM 2 points [-]

The thing is, not just any infinite input stream will do. It has to at all times look properly encoded to the decoder. You're allowed to come up with any encoding scheme that you like, so what you can try to do is to come up with an encoding scheme that an infinite input stream must violate in finite time. This turns out to be impossible, and the proof is fairly easy, but it is not as obvious as your description characterizes it here.

Comment author: [deleted] 04 May 2011 06:52:26PM *  3 points [-]

Granted. But even so, the primary problem is that the input is infinite: we can deal with any finite input stream, no problem.

ETA: Also, the obvious way to approach the problem is to have a terminator character (this is how the unary representation works, for instance). In that case, what this result says is "if your decoder expects a terminator, then if you feed it an infinite string without a terminator, it'll get stuck." This seems obvious.

Comment author: 04 May 2011 06:08:07PM 4 points [-]

Agreed. There's an even easier way to confound the machine, really: when it asks you for the bit string, walk away from the computer!

Note that there is no way to feed infinite input to a Turing machine. The halting problem isn't demonstrated by a program of infinite length. It's demonstrated by a finite program that resists static analysis. The analogous integer is not the one represented by an infinite bit stream, but it might be something like the Berry number.

Comment author: 05 May 2011 08:05:05AM *  0 points [-]

To clarify, let's work through an example. Consider a unary encoding: 0 is 0, 1 is 10, 2 is 110, 3 is 1110, etc. In this case, if we feed the decoder an infinite sequence of 1's, it will remain forever undecided as to the integer's value. The result says we can find such pathological inputs for any other encoding system, not just unary.

I'm not sure this is particularly significant. If you haven't finished feeding in the input, you can hardly expect the machine to give a complete output unless you have given a specific specification of how to simplify or truncate. I would perhaps avoid saying that the decoder 'hung'. 'Hung' implies some sort of pathological error in the decoder, not just a decoder that is processing new input as fast as possible, in exactly the way it is supposed to.

That said, many things that seem bloody obvious and trivial are considered to be important mathematical principles once phrased in maths language. Maybe what you are describing is significant. It does seem like the sort of thing that usually gets a mathsy name. "Infinite Garbage In, Garbage Out" may not cut it. :)

Comment author: 04 May 2011 05:57:11PM *  0 points [-]

Whatever encoding you invent, I can give you an infinite input stream of bits that makes your decoder hang and never give a definite answer like "yes, this is an integer with such-and-such value" or "no, this isn't a valid encoding of any integer".

What if my encoding is: return TRUE(=is an integer) for all inputs?

That's not such a dumb thing to do. In fact, any encoding that didn't assign some integer outcome to any sequence of bits would be trivially suboptimal.

Comment author: 04 May 2011 05:59:59PM *  4 points [-]

The decoder also has to learn the integer's value. Also, each integer must have at least one bit sequence that decodes to it in finite time. Sorry, it seems I err on the side of conciseness even when I explicitly try to be wordy!

Comment author: [deleted] 04 May 2011 05:17:59PM 8 points [-]

Agree about the theorem. Not sure about the implication for explaining integers, since a human who has been assigned the task of decoding the string should have the same difficulty with that infinite input stream. His understanding of what an integer is doesn't protect him. What seems to me to protect the human from being mentally trapped by an endless input is that his tolerance for the task is finite and eventually he'll quit. If that's all it is, then what you need to teach the computer is impatience.

Comment author: 04 May 2011 05:28:11PM *  1 point [-]

It still feels very weird to me that we cannot explain our intuitive notion of "finiteness" to a machine. Maybe this shows that we don't understand finiteness very well yet.

Comment author: 04 May 2011 06:37:38PM -2 points [-]

If we could explain what "finite" meant, then we could explain what "infinite" meant. More and more I'm starting to believe "infinite" doesn't mean anything.

Comment author: 04 May 2011 09:15:08PM 0 points [-]

How about 'not finite'? Even better, 'non-halting' or maybe even...'not step-wise completable/obtainable'? Something is 'in-finite'ly far away if there is no set of steps that I can execute to obtain it within some determinate amount of time.

Comment author: 05 May 2011 02:09:43AM 2 points [-]

You're repeating my point. It's no harder to explain what "finite" means than it is to explain what "not finite" means. You take this to mean that the concept of "not finite" is easy, and I take it to mean that the concept of "finite" is hard. cousin_it's recent experience lends credence to my point of view.

For any N, it's easy to explain to a computer what "cardinality smaller than N" means. It's hard to explain to a computer what "there exists an N such that the cardinality is smaller than N" means, and this is the source of cousin_it's trouble. The next logical step is to recognize as a cognitive illusion the idea that humans have a special insight into what this means, an insight that computers can't have. For some reason most people aren't willing to take that step.

Comment author: 05 May 2011 02:32:47PM *  3 points [-]

If it's just a cognitive illusion, why is it so hard to find contradictions in axiom systems about integers that were generated by humans intuitively, like PA?

We seem to have a superpower for inventing apparently consistent axiom systems that make "intuitive sense" to us. The fact that we have this superpower is even machine-checkable to a certain degree (say, generate all proofs of up to a million symbols and look for contradictions), but the superpower itself resists attempts at formalization.

Comment author: 06 May 2011 03:34:06AM 5 points [-]

We work in axiom systems that have not been proven inconsistent because in the past we have reacted to inconsistencies (such as Russell's paradox) by abandoning the axioms. I wouldn't call this a superpower but a bias.

Russell's paradox is a cliché, but it's really illustrative. The principle of noncontradiction says that, since we have arrived at a false statement (the barber shaves himself and doesn't shave himself), some of our premises must be wrong. Apparently the incorrect premise is: given a property P, there is a set of elements that satisfy P. What could it possibly mean that this premise is false? Evidently it means that the everyday meaning of words like "property," "set," and "element" is not clear enough to avoid contradictions. Why are you so sure that the everyday meanings of words like "number," "successor," and "least element" are so much clearer?

Here's another reason to be unimpressed by the fact that no contradictions in PA have been found. The length of the shortest proof of falsum in a formal system has the same property as the busy beaver numbers: it cannot be bounded above by any computable function. That is, there is an inconsistent formal system with fewer than 100 axioms in which all theorems with proofs of length smaller than BB(100) are consistent with each other. Why couldn't PA be such a system?

Comment author: 06 May 2011 04:16:27AM *  0 points [-]

You've made good points (and I've voted up your remark) but I'd like to note a few issues:

First we can prove the consistency of PA assuming certain other axiomatic systems. In particular, Gentzen's theorem shows that consistency of PA can be proved if one accepts a system which is incomparable in strength to PA (in the sense that there are theorems in each system which are not theorems in the other).

they cannot be bounded above by any computable function. i.e. there is an inconsistent formal system with fewer than 100 axioms in which all theorems with proofs of length smaller than BB(100) are consistent with each other.

This isn't necessarily true unless one has a 1-1 correspondence between axiomatic systems and Turing machines, rather than just having axiomatic systems modelable as Turing machines. This is a minor detail that doesn't impact your point in a substantial fashion.

Second, it isn't clear how long we expect an average output of a Turing machine to be. I don't know if anyone has done work on this, but it seems intuitively clear to me that if I pick a random Turing machine with n states, run it on the blank tape, I should expect with a high probability that the Turing machine will halt well before BB(n) or will never halt. Let's make this claim more precise:

Definition: Let g(f)(n) be the (# of Turing machines with n states which when run on the blank tape either never halt or halt in fewer than f(n) steps) / (# of Turing machines with n states).

Question: Is there a computable function f(n) such that g(f)(n) goes to 1 as n goes to infinity?

If such a function exists, then we should naively expect that random axiomatic systems will likely either have an easy contradiction or be consistent. I'm not sure one way or another whether this function exists, but my naive guess is that it does.
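A tiny experiment in the spirit of this question (my own toy code, not the commenter's): enumerate every 2-state, 2-symbol Turing machine (halting modeled as a third "state" a transition may jump to, after writing and moving), run each on a blank tape, and see how quickly the halting ones halt.

```python
from itertools import product

def steps_to_halt(machine, max_steps):
    """Run a 2-symbol machine on a blank tape; return the step at which it
    halts, or None if it is still running after max_steps steps."""
    tape, pos, state = {}, 0, 0
    for step in range(1, max_steps + 1):
        write, move, nxt = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        if nxt == "H":            # halt transitions also write and move
            return step
        state = nxt
    return None

# Every 2-state, 2-symbol machine: for each (state, symbol) pair, pick
# what to write, which way to move, and where to go (state 0, 1, or halt).
actions = list(product((0, 1), (-1, 1), (0, 1, "H")))
keys = [(s, b) for s in (0, 1) for b in (0, 1)]
machines = [dict(zip(keys, choice)) for choice in product(actions, repeat=4)]

halt_times = [t for m in machines
              if (t := steps_to_halt(m, 100)) is not None]
print(f"{len(halt_times)} of {len(machines)} machines halt within 100 steps")
print("longest halting run:", max(halt_times))
```

With this convention the longest halting run comes out to 6 steps, matching the known busy beaver value S(2) = 6. Of course, at n = 2 everything is computable by brute force; the question above is precisely about what happens to these ratios as n grows past where BB(n) can be known.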

Comment author: 06 May 2011 05:10:56AM 0 points [-]

If such a function exists, then we should naively expect that random axiomatic systems will likely to either have an easy contradiction or be consistent. I'm not sure one way or another whether or not this function exists, but my naive guess is that it does.

You must be aware that such halting probabilities are usually uncomputable, right? In any case you're not going to be surprised that I wouldn't find any information about this limit of ratios compelling, any more than you would buy my argument that 15 is not prime because most numbers are not prime.

Comment author: 06 May 2011 05:21:41AM 0 points [-]

You must be aware that such halting probabilities are usually uncomputable, right?

Yes, but the existence of this function looks weaker than being able to compute Chaitin constants. Am I missing something here?

In any case you're not going to be surprised that I wouldn't find any information about this limit of ratios compelling, any more than you would buy my argument that 15 is not prime because most numbers are not prime.

My prior that a random integer is prime is 1/log n. If you give me a large integer, the chance that it is prime is very tiny, and that is a good argument for assuming that your random integer really isn't prime. I'm not sure why you think that's not a good argument, at least in a context where I can't verify it (if, say, the number is too large).

Comment author: [deleted] 04 May 2011 05:37:34PM 4 points [-]

You've thought more about explaining things to machines than I have, but I'm not sure what you consider to count as having successfully explained a concept to a machine. For example, if you tell the machine that any string with both a beginning and an end is finite, then you've given the machine at least some sort of explanation of finiteness - one which I presume you consider to be inadequate. But when I try to imagine what you might find to be inadequate about it, my naive guess is that you're thinking something like this: "even if you tell a machine that a finite string has an end, that won't help the machine to decide whether a given input has reached its end". In other words, the definition doesn't give the machine the ability to identify infinite strings as such, or to identify finite strings as such in a time shorter than the length of the string itself. But my response to this is that our own human understanding of finiteness doesn't give us that ability either. Just for starters, we don't know whether the universe goes on forever in all directions or closes in on itself as a four-dimensional ball - and we might never know.

Comment author: 04 May 2011 05:49:01PM *  0 points [-]

I'm not sure what you consider to count as having successfully explained a concept to a machine

I want it to be able to do some of the things that humans can do, e.g. arrive at the conclusion that the Paris-Harrington theorem is "true" even though it's independent of PA. Humans manage that by having a vague unformalized concept of "standard integers" over which it is true, and then inventing axioms that fit their intuition. So I'm kicking around different ways to make the "standard integers" more clear.

Comment author: 04 May 2011 07:50:07PM 1 point [-]

Having now looked up the PH theorem, I don't understand what you mean. Do you disagree with any of the following?

Computers can prove Paris-Harrington just as easily as humans can. They can also prove the strong Ramsey result that is the subject of PH as easily as humans can.

Both humans and computers are incapable of proving the Ramsey result within Peano arithmetic. Both are capable of proving it in some stronger formal systems.

Both humans and computers can "see that the Ramsey result is true" in the sense that they can verify that a certain string of symbols is a valid proof in a formal system. They are both equally capable of verifying that the Ramsey result (which concerns finite sets of integers) is true by experiment. Neither a human nor a computer can "see that the Ramsey result is true" in any stronger sense.

Comment author: 04 May 2011 08:54:37PM *  -1 points [-]

I agree with everything in your comment except the last sentence. Sorry for being cryptic, I think this still gets the point across :-)

Comment author: 05 May 2011 02:11:35AM 2 points [-]
Comment author: 05 May 2011 02:48:22AM 2 points [-]

Can I attempt a translation/expansion for Sewing-Machine of why you disagree with the last sentence?

It seems that there's an intuition among humans that the Ramsey result is true, in the sense that PA + PH captures our intuition of the integers more closely than PA + ~PH, given the second-order result. What you want is a computer that can make that sort of intuitive reasoning, or to make the reasoning more precise. Is that more or less the point?

Comment author: 05 May 2011 04:09:42AM 4 points [-]

We can all agree that human intuition is grand but not magical, I hope? Here is my point of view: you are having difficulty teaching a computer to "make that sort of intuitive reasoning" because that sort of reasoning is not quite right.

"That sort of reasoning" is a good heuristic for discovering true facts about the world (for instance, discovering interesting sequences of symbols that constitute a formal proof of the Paris-Harrington theorem), and to that extent it surely can be taught to a computer. But it does not itself express a true fact about the world, and because of that you are limited in your ability to make it part of the premises on which a computer operates (such as the limitation discussed in the OP).

So I've been thinking lately, anyway.

Comment author: 05 May 2011 03:03:19AM 1 point [-]

I'm really at a loss as to why such a thing should be intuitive. The additional condition seems to me to be highly unnatural; Ramsey's theorem is a purely graph-theoretic result, and this strengthened version involves comparing the number of vertices used to numbers that the vertices happen to correspond to, a comparison we would ordinarily consider meaningless.

Comment author: 05 May 2011 03:14:00AM *  0 points [-]

If I'm following cousin_it, the idea doesn't have anything really to do with the statement about Ramsey numbers. As I understand it, if in some system that is only slightly stronger than PA we can show some statement S of the form "for all x in N, P(x)", then we should believe that the correct models of PA are those in which S is true. Or to put it a different way, we should think PA + S will do a better job telling us about reality than PA + ~S would. I'm not sure this can be formalized beyond that. Presumably if he had a way to formalize this, cousin_it wouldn't have an issue with it.

Comment author: 04 May 2011 06:18:41PM *  2 points [-]

Considering that Paris-Harrington is derivable from second-order arithmetic, do you think any of the ideas from reverse mathematics might come into play? This paper might be of interest if you aren't very familiar with reverse mathematics but would like to know more.

Also, it's my intuition that a good deal of mathematics has more to say about human cognition than it does about anything else. That said, this seems like the sort of problem that should be tackled from a computational neuroscience angle first and foremost. I'm likely wrong about the 'first and foremost' part, but it seems like something on numerical cognition could help out.

Also, have you looked at this work? I don't care for the whole 'metaphor' camp of thought (I view it as sort of a ripoff of the work on analogies), but there seem to be a few ideas that could be distilled here.

Comment author: 04 May 2011 06:27:31PM *  0 points [-]

I'm familiar with reverse mathematics and it is indeed very relevant to what I want. Not so sure about Lakoff. If you see helpful ideas in his paper, could you try to distill them?

Comment author: 04 May 2011 06:44:51PM 1 point [-]

I could give it a shot. Technically I think they are Rafael Nunez's ideas more than Lakoff's (though they are framed in Lakoff's metaphorical framework). The essential idea is that mathematics is built directly from certain types of embodied cognition, and that the feeling of intuitiveness for things like limits comes from the association of the concept with certain types of actions/movements. Nunez's papers seem to have the central goal of framing as much mathematics as possible into an embodied cognition framework.

I'm really not sure how useful these ideas are, but I'll give it another look through. I think that at most there might be the beginnings of some useful ideas, but I get the impression that Nunez's mathematical intuition is not top notch, which makes his ideas difficult to evaluate when he tries to go further than calculus.

Fortunately, his stuff on arithmetic appears to be the most developed, so if there is something there I think I should be able to find it.

Comment author: 05 May 2011 03:13:08AM 1 point [-]

A finite number is one that cannot be the cardinality of a set that has a subset with an equal cardinality.

Comment author: 05 May 2011 08:15:52AM *  3 points [-]

This reduces the problem of explaining "standard integers" to the problem of explaining "subsets", which is not easier. I don't think there's any good first-order explanation of what a "subset" is. For example, your definition fails to capture "finiteness" in some weird models of ZFC. More generally, I think "subsets" are a much more subtle concept than "standard integers". For example, a human can hold the position that all statements in the arithmetical hierarchy have well-defined (though unknown) truth values over the "standard integers", and at the same time think that the continuum hypothesis is "neither true nor false" because it quantifies over all subsets of the same integers. (Scott Aaronson defends something like this position here.)

Comment author: 05 May 2011 02:21:19PM 2 points [-]

I don't think there's any good first-order explanation of what a "subset" is.

Well, ZFC is a first-order theory...

Comment author: 06 May 2011 07:57:46AM -1 points [-]

Yes, but Subsets(x,y) is a primitive relationship in ZFC. I don't really know what cousin_it means by an explanation, but assuming it's something like a first-order definition formula, nothing like that exists in ZFC that doesn't subsume the concept in the first place.

Comment author: 06 May 2011 08:12:01AM 1 point [-]

No, it isn't. The only primitive relations in ZFC are set membership and possibly equality (depending on how you prefer it). "x is a subset of y" is defined to mean "for all z, z in x implies z in y".

Comment author: 06 May 2011 01:22:31PM 2 points [-]

Can I downvote myself? Somehow my mind switched "subset" and "membership", and by virtue of ZFC being a one-sorted theory, lo and behold, I wrote the above absurdity. Anyway, to rewrite the sentence and make it less wrong: subsets(x,y) is defined by means of a first-order formula through the membership relation, which in a one-sorted theory already pertains to the idea of 'subsetting': x ∈ y → {x} ⊆ y. So subsetting can be seen as a transfinite extension of the membership relation, and in ZFC we get no more clarity or computational intuition from the first than from the second.

Comment author: 05 May 2011 03:35:30AM 1 point [-]

Set theory is not easier than arithmetic! Zero is a finite number, and N+1 is a finite number if and only if N is.
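Written out (my rendering, not the commenter's), this is the two usual inductive clauses, plus the closure condition that does the real work by making "finite" the smallest predicate satisfying them:

```latex
\mathrm{finite}(0), \qquad \mathrm{finite}(N) \to \mathrm{finite}(N+1),
\qquad
\mathrm{finite} \;=\; \bigcap \bigl\{\, P \;:\; P(0) \land \forall n\,\bigl(P(n) \to P(n+1)\bigr) \,\bigr\}
```

The intersection ranges over all predicates P, so the "smallest such predicate" clause is a second-order quantification, which is where the original difficulty reappears.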

Comment author: 05 May 2011 07:25:49PM 0 points [-]

Yes, that is a much better definition. I don't know why this one occurred to me first.

Comment author: 05 May 2011 03:58:26AM 0 points [-]

Though with the standard definitions, that requires some form of choice.

Comment author: 05 May 2011 01:14:16AM 6 points [-]

Of course we can explain it to a machine, just as we explain it to a person. By using second-order concepts (like "smallest set of thingies closed under zero and successor"). Of course then we need to leave some aspects of those second-order concepts unexplained and ambiguous - for both machines and humans.

Comment author: 05 May 2011 07:21:56AM 0 points [-]

I don't understand what you're referring to in your second sentence. Can you elaborate? What sorts of things need to be ambiguous?

Comment author: 05 May 2011 02:36:17PM 3 points [-]

By 'ambiguous', I meant to suggest the existence of multiple non-isomorphic models.

The thing that puzzled cousin_it was that the axioms of first-order Peano arithmetic can be satisfied by non-standard models of arithmetic, and that there is no way to add additional first-order axioms to exclude these unwanted models.

The solution I proposed was to use a second-order axiom of induction - working with properties (i.e. sets) of numbers rather than the first-order predicates over numbers. This approach successfully excludes all the non-standard models of arithmetic, leaving only the desired standard model of cardinality aleph nought. But it extends the domain of discourse from simply numbers to both numbers and sets of numbers. And now we are left with the ambiguity of what model of sets of numbers we want to use.
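To make the contrast concrete (my notation): first-order PA has an induction *schema*, one axiom per formula, while second-order PA has a single induction axiom quantifying over all sets of numbers:

```latex
% First-order induction: a schema, one instance per formula \varphi
\bigl[\varphi(0) \land \forall n\,\bigl(\varphi(n) \to \varphi(n+1)\bigr)\bigr]
  \;\to\; \forall n\,\varphi(n)

% Second-order induction: a single axiom over all sets S of numbers
\forall S \subseteq \mathbb{N}\;
  \bigl[\, 0 \in S \land \forall n\,\bigl(n \in S \to n+1 \in S\bigr)
  \;\to\; S = \mathbb{N} \,\bigr]
```

There are only countably many first-order formulas, so the schema constrains much less than quantification over all sets, which is why non-standard models slip through in the first case - and why, as noted above, the price of the second is ambiguity about what "all sets" means.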

It is mildly amusing that in the case of arithmetic, the unwanted non-standard models were all too big, but in the case of set theory, people seem to prefer to think of the large models as standard and dismiss Gödel's constructible universe as an aberration.

Comment author: 05 May 2011 03:22:40PM 0 points [-]

Thank you for the clarification.

Comment author: 06 May 2011 08:20:14PM 0 points [-]

It still feels very weird to me that we cannot explain our intuitive notion of "finiteness" to a machine. Maybe this shows that we don't understand finiteness very well yet.

There's actually a reason for that - it's impossible to define "finite" in first-order logic. There's no set of first-order statements that's true in all finite models and false in all infinite models.
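The standard argument behind this claim is a compactness sketch (my wording). Suppose some first-order theory T were true in exactly the finite models, and let λ_n assert the existence of at least n distinct elements:

```latex
\lambda_n \;:=\; \exists x_1 \cdots \exists x_n
  \bigwedge_{1 \le i < j \le n} x_i \neq x_j
```

Every finite subset of T ∪ {λ_n : n ∈ ℕ} has a model (a large enough finite one), so by compactness the whole set has a model - an infinite model of T. Contradiction.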

Comment author: 07 May 2011 07:52:50AM 0 points [-]

I know that, of course. Inserting obligatory reference to Lowenheim-Skolem for future generations doing Ctrl+F.

Comment author: 06 May 2011 08:22:38PM *  0 points [-]

How's that relevant?

For present purposes, why should we care whether the logic we use is first-order?

Comment author: 06 May 2011 08:47:48PM *  2 points [-]

In first-order logic, if a statement is logically valid - is true in all models - then there exists a (finite) proof of that statement. Second-order logic, however, is incomplete; there is no proof system that can prove all logically valid second-order statements.

So if you can reduce something to first-order logic, that's a lot better than reducing it to second-order logic.

Comment author: 06 May 2011 08:55:32PM *  1 point [-]

Well, now you are exhibiting the (IMHO regrettable) tendency of the more mathematical LWers to put way too much focus on incompleteness, uncomputability, and other negative results that have a negligible chance of actually manifesting unless you are specifically looking for pathological cases.

I use second-order logic all the time. Will work fine for this purpose.

Comment author: 08 May 2011 04:56:28AM 0 points [-]

Yeah, you're probably right...

Comment author: 07 May 2011 11:38:52AM 0 points [-]

I still do not see how first-order logic relates in any way to cousin_it's statement in grandparent of grandparent.

Just because second-order logic is incomplete does not mean we must restrict ourselves to first-order logic.

Comment author: 08 May 2011 05:04:07AM *  0 points [-]

I think I recall reading somewhere that you only need first-order logic to define Turing machines and computer programs in general, which seems to suggest that "not expressible in first order logic" means "uncomputable"... I could just be really confused about this though...

Anyway, for some reason or other I had the impression that "not expressible in first order logic" is a property that might have something in common with "hard to explain to a machine".

Comment author: 07 May 2011 05:55:39PM 0 points [-]

ADDED. relates -> supports or helps to illuminate

Comment author: 20 May 2011 04:29:31AM 1 point [-]

After learning a bit more about inductive and coinductive types from Robert Harper's (which I found more lucid on the topic than Types and Programming Languages), I don't quite understand why it's important for the datatype to be coinductive. You can clearly encode the integers as inductive types, why doesn't that suffice for 'explaining to a computer what a "terminating computation" is'? Are computers 'naturally coinductive' in some sense?

Comment author: 20 May 2011 04:54:49AM *  0 points [-]

Actually, since computers are Turing-complete, perhaps it's more like 'computers are naturally recursively typed' and there is an analogous impossibility theorem for encoding some coinductive type in the same way? Perhaps I just don't understand enough yet.

Comment author: 16 May 2011 12:51:02PM 3 points [-]

OK, here's my encoding (thanks Misha):

N is encoded by N triplets of bits of the form 1??, followed by a zero.

Each triplet is of the form (1, a(N, k), b(N, k)) (with 1 <= k <= N), where

• a(N, k) means "Turing machine number k halts in N steps or less, and its final state has an even number of ones on the tape", and

• b(N, k) means "Turing machine number k halts in N steps or less, and its final state has an odd number of ones on the tape".

By "Turing machine number k" I mean we have an infinite list containing all possible Turing machines, sorted by description length.

So to encode N, the encoder "just" simulates the first N Turing machines for N steps, and notes which of them halt.

The decoder reads its input triplet-by-triplet. When reading input k, it simulates the first k Turing machines for k steps, and checks whether the results are consistent with all that it has read in the input so far: if one of the simulated machines halts in a way not announced in the input (or if a triplet is impossible, like "111"), it stops reading.

While an "infinitely confusing" input stream does exist, generating it requires solving the halting problem, which is generally considered pretty hard.

Comment author: 16 May 2011 01:21:52PM *  0 points [-]

Nice! But why do you need the parity of the output tape? Can't you use groups of 2 bits instead of 3 and just specify whether machine number k halts within N steps? (Edit: sorry, now I see why. You could confuse the decoder by telling it that all machines halt.)

Thanks a lot! Your solution feels more intuitive to me than the one offered on MathOverflow.

Comment author: 16 May 2011 01:49:58PM *  0 points [-]

Well, you could, but then you'd need a way to protect against an input of only ones - i.e. "all Turing machines halt in N steps or less". You could make a decoder that handles cases like that by checking whether a Turing machine eventually starts looping, but that makes its logic less straightforward, and it isn't clear whether an infinite input couldn't be found without solving the halting problem.

Turing machines could be classified in three categories:

• H: those that halt
• L: those that eventually start looping
• F: those that go on forever

The halting problem is checking for membership in H, and tricking the "loop-detecting" 2-bit decoder would "only" require never returning '1' for a member of L. There may be an algorithm A that returns 1 for membership in H, 0 for membership in L, and either 1 or 0 for membership in F. That algorithm wouldn't be equivalent to a halting oracle, so it might not be undecidable, I don't know.

So having 3 bits just makes checking simpler, since you can't pretend to know that an algorithm will halt, you have to know something about the state it halts in too.

Comment author: 05 May 2011 02:05:04PM *  18 points [-]

An alternative phrasing:

In the Atlas Mountains, one can meet two species of Sphinxes: True Sphinxes and Hollow Sphinxes. Each True Sphinx has a secret number, and it will truthfully answer any question about it, provided it can do so in less than a hundred words. A Hollow Sphinx, however, just enjoys wasting people's time: it has no secret number but will pretend it does, and answers questions so as to never contradict itself and maintain uncertainty in the asker's mind.

While you can be sure you are speaking to a True Sphinx (for example, by guessing its number), you can never be sure that you are speaking to a Hollow Sphinx - it might be a True Sphinx whose number is very large. In fact, no-one has been able to determine whether any Hollow Sphinxes still exist.

Comment author: 05 May 2011 02:16:53PM *  3 points [-]

I wonder if your explanation will make sense to people who didn't understand the original problem... but it's awesome!

Comment author: 05 May 2011 03:05:28PM *  1 point [-]

Seems that the best strategy would "merely" be to try to find an upper bound, like "Is the number smaller than BusyBeaver(BusyBeaver(BusyBeaver(BusyBeaver(3^^^3))))?", to which a Hollow Sphinx would just answer "no, it's bigger" - so presented like that, it doesn't look like a very interesting problem.

Though there may be a series of questions such that a Hollow Sphinx can't keep up the masquerade unless it has a Halting Oracle (say, a yes-or-no question such that it can't compute which answer could apply to an infinite amount of numbers) - I have no idea which questions those might be.

(The same applies to the original problem: there might be some exotic encoding such that an "infinitely confusing" input exists, but determining, say, its first 100 bits is uncomputable.)

Comment author: 05 May 2011 05:08:22PM *  7 points [-]

BusyBeaver(BusyBeaver(BusyBeaver(BusyBeaver(3^^^3))))

This is the moral equivalent of saying BB(3^^^3)+1. Mere recursion is puny. At least jump from BB to BB_2!

determining, say, its first 100 bits is uncomputable

Every sequence of 100 bits is computable.

Here's an encoding where the "infinitely confusing" input is uncomputable: take the unary encoding from my post and XOR it with some infinite uncomputable sequence, e.g. the halting oracle sequence. Unfortunately, you can't use this test in practice because you can't compute the required bits either.

Comment author: 05 May 2011 07:19:18PM 2 points [-]

To put it more formally: we want an encoding algorithm E and a decoding algorithm D such that for any natural number n, a Turing machine can compute E(n) and D(E(n)) in finite time, but there exists a k such that "determining a string s of length k so that the set {n such that E(n) starts with s} is infinite" is an undecidable problem. Do D and E exist fulfilling those conditions?

I admit my working knowledge of what is decidable/computable is somewhat limited; maybe the condition should be to find E and D such that no algorithm can return s given k, or whether there exists an algorithm that, given E and D, can return a "confusing" sequence of arbitrary length.

Comment author: 12 May 2011 02:40:07PM *  2 points [-]

I asked MathOverflow to solve this and got a complete solution within an hour. Just like last time. Wow, just wow.

Comment author: 12 May 2011 04:55:37PM 1 point [-]

The Stack Exchange sites are pretty impressive, and humbling. Maybe Eliezer should outsource more of his research to them.

Comment author: 12 May 2011 08:58:46AM *  0 points [-]

I think I have a partial solution for the case where the "infinitely confusing" string is unique. Namely, if there is a unique infinite string S on which D doesn't terminate, then S is computable.

Here's an algorithm for computing S by using D. Assume that the first bit of S is 0. That means D terminates on all strings whose first bit is 1. By the argument from the original post, that implies D's running time on strings starting with 1 is uniformly bounded. (Otherwise we would be able to generate an "infinitely confusing" string for D that starts with 1 by using the proof technique from my post, which can't happen because S is unique and starts with 0 by assumption.) Therefore, by feeding D all possible finite inputs in turn while allowing it gradually increasing running times ("dovetailing"), we will eventually get a "covering set" that proves D always terminates on strings starting with 1. So we determine that the first bit of S must be 0. In the same way you can determine the second bit of S, and so on.

Let's see how it works on unary notation. We run D on all finite inputs in turn. First we notice that D("0") terminates without trying to read further, therefore the first bit of S is 1. Then we notice that D("10") terminates, therefore the second bit of S is also 1. And so on.
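Here is a runnable sketch of that procedure for the unary decoder (my own toy code; the bounded search depth stands in for true dovetailing, since "subtree not covered" is in general only refutable, not provable, by a finite search):

```python
def unary_decoder(bits):
    """Unary decoder: returns ('done', n) once it reads a 0 after n ones,
    or ('more',) if it exhausted the given bits while still undecided."""
    n = 0
    for b in bits:
        if b == 0:
            return ('done', n)
        n += 1
    return ('more',)

def next_confusing_bit(prefix, depth=50):
    """Assuming a unique infinitely-confusing stream S extending `prefix`,
    find its next bit: if every extension of prefix+[b] is covered (the
    decoder halts on a finite prefix of it), S cannot continue with b,
    so its next bit must be 1-b."""
    for b in (0, 1):
        frontier = [prefix + [b]]
        for _ in range(depth):  # bounded stand-in for true dovetailing
            frontier = [s + [c] for s in frontier
                        if unary_decoder(s) == ('more',) for c in (0, 1)]
            if not frontier:    # whole subtree covered by a finite set
                return 1 - b
    raise RuntimeError("search depth exhausted without a covering set")
```

Running it on the unary decoder recovers the all-ones stream bit by bit, exactly as described above.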

Comment author: 12 May 2011 12:41:58PM 0 points [-]

if there is a unique infinite string S on which D doesn't terminate, then S is computable.

That looks true, though the process you describe only works if S is indeed unique; otherwise it gets stuck.

Comment author: 05 May 2011 06:32:22PM 3 points [-]

Mere recursion may be puny, but given the "fifteen-second challenge" and that BB_2 etc aren't well known enough to use there, it seems like the best bet, doesn't it?

Comment author: 05 May 2011 06:00:33PM 1 point [-]

At least jump from BB to BB_2!

BB_2?

Comment author: 05 May 2011 06:04:32PM 6 points [-]

The BusyBeaver function for Turing machines with a halting oracle.

Aaronson

Comment author: 05 May 2011 07:16:23PM 0 points [-]

Ahh, so that's what the <Can someone recall the title of Eliezer's parable in which the genius level humans spend thousands of years deciphering the messages sent by the not-so-smart universe simulators?> were using!

Comment author: 05 May 2011 08:00:57PM 9 points [-]

<Can someone recall the title of Eliezer's parable in which the genius level humans spend thousands of years deciphering the messages sent by the not-so-smart universe simulators?>

You may be thinking of "That Alien Message." Best wishes, the Less Wrong Reference Desk.

Comment author: 05 May 2011 09:47:31PM 0 points [-]

Thank you! I'd been looking for that one for a while.

Comment author: 05 May 2011 08:05:13PM 2 points [-]

I'm sorry to ask this question, because it seems very stupid, but how exactly would one create a set-up where a Turing machine, as in head-and-tape style thing, actually interacts with a halting oracle?

I don't doubt that it's possible but I can't think of an elegant way to do it.

Comment author: 05 May 2011 08:33:25PM *  4 points [-]

Wikipedia has a detailed explanation of the setup.

Comment author: [deleted] 05 May 2011 08:51:17PM 0 points [-]

Try asking the sphinx "Is your number less than 10 if and only if P=NP?"

Comment author: 05 May 2011 09:11:52PM 0 points [-]

That wouldn't work for distinguishing the two kinds of sphinxes, unless you suppose that one kind knows whether P=NP and the other doesn't.

Comment author: [deleted] 05 May 2011 09:27:52PM *  2 points [-]

True, but it would still yield an interesting answer :)

ETA: Also, one could give the sphinx a Turing machine and ask if it halts within N steps or fewer, where N is that sphinx's number. After a number of such questions, you'd feel pretty confident that either the sphinx is a True Sphinx or it has a halting oracle.

Comment author: 06 May 2011 10:17:50AM 0 points [-]

Huh, that looks like it could work, neat :)

So to bring it back to cousin_it's post, one could have a "busy beaver candidate" (a Turing machine where we don't know whether it goes on forever, eventually halts, or starts repeating), and the encoding of a natural number n would be that the first bit is whether or not that candidate halts before n steps, followed by the unary encoding of n.

"Encoding" or decoding n would be easy (but long when n is big), but "breaking" that encoding for any busy beaver candidate would require a halting oracle.

Comment author: [deleted] 06 May 2011 06:10:09PM *  1 point [-]

Suppose I pass you the bit that says the candidate does not halt, followed by an infinite string of 1s. Then to decode this (by which I mean reject it as invalid) you would need to know whether the busy beaver candidate halts, which we've rejected as hard.

This is a problem with the Sphinxes, too, in retrospect. A Hollow Sphinx could just keep answering "yes" if it's hard to check whether the Turing machine halts, making you do the hard work.

Comment author: 06 May 2011 06:48:07PM 1 point [-]

Agreed, but a non-oracle Sphinx wouldn't be impossible to recognize any more, it would just be very hard to recognize (you would need the Sphinx to guess wrong, and to be patient enough to figure out that it did).

In summary, whoever has the halting oracle wins, and if nobody does, it's a patience contest.

Comment author: 04 May 2011 11:20:49PM 15 points [-]

This phenomenon is noted in any treatment of information theory that discusses "stream arithmetic coding". Those codes can be interpreted as a fraction written in binary, where each bit narrows down a region of the [0,1) number line, with earlier-terminating bitstreams reserved for more likely source strings.

Any probability distribution on integers has a corresponding (optimal) arithmetic stream coder, and you can always make it indefinitely expect more input by feeding it the bit that corresponds to the least likely next character (which means it's not finished decompressing).

The human David MacKay discusses such encodings and their implicit unbounded length potential in Chapter 6 of his/her book which can be accessed without spending USD or violating intellectual property rights.
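For the distribution P(n) = 2^-(n+1), the optimal arithmetic code reproduces the unary code from the post, and the always-least-likely-bit stream is all 1s. A minimal sketch of such a decoder (my own code, using exact rational arithmetic so the interval bookkeeping is precise):

```python
from fractions import Fraction

def arith_decode(bits, max_bits):
    """Arithmetic decoder for P(n) = 2^-(n+1), n = 0, 1, 2, ...
    Integer n owns the source interval [1 - 2^-n, 1 - 2^-(n+1)).
    Returns n once the bit interval fits inside one source interval,
    or None if still undecided after max_bits bits."""
    lo, hi = Fraction(0), Fraction(1)
    for i in range(max_bits):
        mid = (lo + hi) / 2
        lo, hi = (lo, mid) if bits[i] == 0 else (mid, hi)
        n = 0  # locate the source interval containing lo
        while 1 - Fraction(1, 2 ** (n + 1)) <= lo:
            n += 1
        if hi <= 1 - Fraction(1, 2 ** (n + 1)):
            return n  # bit interval pinned down a single integer
    return None  # still hanging - e.g. on the all-1s stream
```

Feeding it 0, 10, 110, ... recovers 0, 1, 2, ... just like the unary code, while the all-1s stream (each bit on the unresolved side of a symbol boundary) keeps it undecided forever.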

Comment author: 05 May 2011 08:10:02AM *  1 point [-]

Thanks for pointing out the connection!

Comment author: 05 May 2011 04:56:19PM 3 points [-]

You're welcome! I'm always glad to learn when knowledge I've gained through paperclip maximization has value to humans (though ideally I'd want to extract USD when such value is identified).

I should add (to extend this insight to some of the particulars of your post) that the probability distribution on the integers implicitly assumed by the unary encoding you described is that smaller numbers are more likely (in proportion to their smallness), as all n-ary number systems also assume. So-called "scientific" notation instead favors "round" numbers, i.e. those padded with zeros the soonest in the least-significant-digit direction.

Comment author: 05 May 2011 05:15:28PM 2 points [-]

though ideally I'd want to extract USD when such value is identified

Your comments are often pleasant to read, but I don't pay USD for comments that are pleasant to read, and don't know anyone who does. Sorry.

Comment author: 05 May 2011 05:16:58PM 1 point [-]

Thanks. I didn't mean charging for comments, just that if I identified major insights, I could sell consulting services or something. Or become a professor at a university teaching the math I've learned from correct reasoning and paperclip maximizing. (Though my robot would need a lot of finishing touches to pass.)

Comment author: 04 May 2011 09:24:40PM 10 points [-]

You're allowed to invent an arbitrary scheme for encoding integers as strings of bits. Whatever encoding you invent, I can give you an infinite input stream of bits that makes your decoder hang and never give a definite answer like "yes, this is an integer with such-and-such value" or "no, this isn't a valid encoding of any integer".

I find it annoying how my brain keeps saying "hah, I bet I could" even though I explained to it that it's mathematically provable that such an input always exists. It still keeps coming up with "how about this clever encoding? blablabla" ... I guess that's how you get cranks.

Comment author: 04 May 2011 09:27:41PM *  4 points [-]

Yeah, that's a nice feature of the problem. I felt that too, and hoped that someone would react the same way :-)

Comment author: [deleted] 04 May 2011 07:01:25PM 4 points [-]

Here is a related well-known idea: in nonstandard analysis, a subset B of the real numbers is infinite if and only if its extension B* contains nonstandard numbers.

It might be equivalent, actually. Think of infinite bit streams as hyperintegers (e.g. the infinite sequence of 1s is the hyperinteger (1, 11, 111, 1111...), in binary if you like). Let E be the set of encodings of all integers, and all their prefixes. E is infinite, so E* contains a nonstandard element. I believe (here is where my nonstandard analysis fails me) that such a nonstandard element gives us an infinite bit stream which can't be rejected by the decoder.

A less serious but also related idea is the Frivolous Theorem of Arithmetic, which states that almost all positive integers are really, really, really, really large.

Comment author: 04 May 2011 06:36:01PM *  2 points [-]

This seems like a decent place to ask: I'm slowly trying to learn Type Theory. I haven't seen a place where (Co)Inductive datatypes (as in the Calculus of (Co)Inductive Constructions) are explained formally (though preferably for novice readers); does anyone have a suggestion?

Comment author: 04 May 2011 04:58:51PM *  3 points [-]

Nice problem! I think this is a proof, but I don't know if it's your proof:

Vs K unf vasvavgryl znal inyvq fhssvkrf, ng yrnfg bar bs K0 be K1 zhfg unir vasvavgryl znal inyvq fhssvkrf. Fb yrg k_{x+1} = 0 vs k_1 ... k_x 0 unf vasvavgryl znal inyvq fhssvkrf, naq 1 bgurejvfr; gura k_1 gb k_x unf vasvavgryl znal inyvq fhssvkrf sbe nal x.

Comment author: 04 May 2011 08:33:21PM 1 point [-]

Warning: reading this comment may make rot13 less effective at obscuring text.

V whfg abgvprq gung k naq x ner rkpunatrq ol ebg-guvegrra, fb gur rkcerffvbaf lbh jebgr nera'g bofpherq irel jryy.

Comment author: 05 May 2011 05:06:45PM *  0 points [-]

Vs K unf vasvavgryl znal inyvq fhssvkrf, ng yrnfg bar bs K0 be K1 zhfg unir vasvavgryl znal inyvq fhssvkrf. Fb yrg k_{x+1} = 0 vs k_1 ... k_x 0 unf vasvavgryl znal inyvq fhssvkrf, naq 1 bgurejvfr; gura k_1 gb k_x unf vasvavgryl znal inyvq fhssvkrf sbe nal x.

Ner gurer frgf bs inyvq fgevatf fhpu gung ab Ghevat znpuvar N pbhyq vzcyrzrag guvf "nggnpx" ba gur bevtvany znpuvar G? Gung vf, qbrf gurer rkvfg n G fhpu gung, rira vs N unf npprff gb G'f fbhepr pbqr, N pnaabg gryy pbapyhfviryl juvpu bs K0 be K1 unf vasvavgryl znal inyvq fhssvkrf? Vs fb, pna lbh npghnyyl pbafgehpg n G gung cebinoyl unf guvf cebcregl?

ETA: Looks like cousin_it says that the answer to my existence question is "Yes."

Comment author: 05 May 2011 06:29:22PM 0 points [-]

The "original Turing machine" decides whether a prefix is valid? Then yes, definitely; bear in mind Rice's theorem, which basically says that no non-trivial property of a Turing machine's behavior is decidable.

Comment author: 05 May 2011 06:37:36PM 0 points [-]

The "original Turing machine" decides whether a prefix is valid?

If I understand you, yes. By the "original Turing machine", I meant the machine T that putatively can interpret an input string as an integer or, alternatively, reject the input string as not corresponding to any integer.

So, can we actually construct such a machine that is provably immune to the "attack" in your proof, in the sense that no Turing machine could implement the attack? Or are you saying that Rice's theorem (with which I will have to acquaint myself) says that pretty much any Turing machine T that maps strings to integers will fit the bill? (Though the one in cousin_it's OP doesn't...)

Comment author: 06 May 2011 07:49:08AM 0 points [-]

Hmm. So T is an acceptor, A is an attacker. A(T) is an infinite sequence of symbols produced after examining T's source code, and T(A(T)) = ⊥ means T never rejects the sequence. Then what I was asserting is ¬∃A:∀T:T(A(T)) = ⊥. What you're asking now is whether ∀T:∃A:T(A(T)) = ⊥ and I can't immediately work out the answer.

Comment author: 04 May 2011 05:00:09PM *  0 points [-]

Yeah, I said it was obvious :-)

Comment author: 04 May 2011 07:59:43PM *  1 point [-]

Ha, I immediately thought of König's Lemma, but failed to notice that the concreteness of the situation means we don't need DC... that's the danger of not doing it from scratch, I guess...

Comment author: 20 May 2011 04:19:27AM 0 points [-]

Have you considered posting your observation/question to cstheory.stackexchange.com? I'd be interested in seeing the response there.

Comment author: 20 May 2011 09:06:40AM *  0 points [-]

I asked a question on MathOverflow, you can find the link in my conversation with Emile above.

Comment author: 04 May 2011 06:48:29PM *  0 points [-]

As for your problem, Robert Harper (posts as 'Abstract Type') frequently notes that you cannot define the natural numbers or any inductive type in Haskell: because it is lazy, bottom (non-termination) is a member of all types, which forces its datatypes to be coinductive (for example, see also the reddit discussion).

Comment author: 04 May 2011 08:52:16PM *  4 points [-]

Right. In fact, this phrase from Harper's blog was one of the inspirations for my post:

In Carollian style there are types called naturals and lists, but that’s only what they’re called, it’s not what they are.

ETA: it seems you slightly misunderstood Harper's point. The problem with defining "data Nat = Zero | Succ Nat" is not that bottom is in Nat, but rather that laziness allows you to write the circular definition "omega = Succ omega" (note that pattern-matching on omega doesn't hang), which is similar to the unary example in my post.
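The Haskell situation can be mimicked in Python with explicit thunks (a sketch; the tuple encoding of Zero/Succ is my own): inspecting omega's head constructor terminates, but fully evaluating it does not, which is exactly the circular-definition point above.

```python
# Lazy naturals a la "data Nat = Zero | Succ Nat" under laziness:
# Zero is a bare tag, Succ carries a thunk for its predecessor.
ZERO = ('Zero',)

def succ(thunk):
    return ('Succ', thunk)

# the circular definition omega = Succ omega
omega = succ(lambda: omega)

def is_succ(n):
    # "pattern-matching" on the head constructor - terminates even on omega
    return n[0] == 'Succ'

def to_int(n, fuel=1000):
    # full evaluation diverges on omega; `fuel` turns the hang into None
    k = 0
    while n[0] == 'Succ':
        if k >= fuel:
            return None
        n, k = n[1](), k + 1
    return k

three = succ(lambda: succ(lambda: succ(lambda: ZERO)))
```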

Comment author: 04 May 2011 11:35:22PM 1 point [-]

Thanks for correcting me :) I imagine there's a lot of Harper I don't understand correctly.