
Comment author: Manfred 25 April 2015 10:21:56AM 0 points [-]

The recent advances in deep learning come in part from the scruffies/experimental researchers saying "screw hard theory" and just forging ahead

Yeah I get that vibe. But ultimately I'm looking into this because I want to understand the top-down picture, not because of specific applications (Well, integrating logical uncertainty with induction would be nice, but if that works it's more of a theoretical milestone than an application). Research on neural networks is actually more specific than what I want to read right now, but it looks like that's what's available :P (barring a literature search I should do before reading more than a few NN papers)

Comment author: V_V 26 April 2015 11:27:57PM *  1 point [-]

One of the issues is that if you try to prove generalization bounds for anything that has a universal representation property (e.g. neural networks), you always get terrible bounds that don't reflect empirical performance on real-world learning tasks.

I think this is because there are probably some learning tasks which are intrinsically intractable (this is certainly the case if one-way functions exist), so any bound that quantifies over all learning tasks will be at least exponential.

Comment author: Kutta 24 April 2015 11:23:09AM *  2 points [-]

I'd like to ask a not-too-closely related question, if you don't mind.

A Curry-Howard analogue of Gödel's Second Incompleteness Theorem is the statement "no total language can embed its own interpreter"; see the classic post by Conor McBride. But Conor also says (and it's quite obviously true), that total languages can still embed their coinductive interpreters, for example one that returns in the partiality monad.

So, my question is: what is the logical interpretation of a coinductive self-interpreter? I feel I'm not well-versed enough in mathematical logic for this. I imagine it would have a type like "forall (A : 'Type) -> 'Term A -> Partiality (Interp A)", where 'Type and 'Term denote types and terms in the object language and "Interp" returns a type in the meta-language. But what does "Partiality" mean logically, and is it anyhow related to the Löbstacle?
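For concreteness, here is a minimal Haskell sketch of the Delay type underlying the partiality monad that Conor's coinductive interpreter returns into. The names `Delay`, `runFor`, and `never` are illustrative, not from any standard library, and the Monad instance is omitted; in a genuinely total language `Delay` would be a coinductive type rather than ordinary lazy data.

```haskell
-- The partiality (Delay) type: a value now, or one more step of delay.
-- In a total language this is coinductive; Haskell's laziness lets us
-- approximate it with ordinary data.
data Delay a = Now a | Later (Delay a)

-- Observe a Delay computation with a step budget ("fuel").
-- Nothing means "has not halted yet", never "does not halt".
runFor :: Int -> Delay a -> Maybe a
runFor _ (Now x)   = Just x
runFor n (Later d)
  | n <= 0    = Nothing
  | otherwise = runFor (n - 1) d

-- The image of a non-terminating term under a coinductive interpreter.
never :: Delay a
never = Later never
```

Roughly, an interpreter of type `'Term A -> Partiality (Interp A)` never claims its argument halts; it only promises to productively emit `Later` steps, which is how it evades the totality obstruction.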

Comment author: V_V 26 April 2015 11:13:30PM 0 points [-]

As we've seen, such a sublanguage (perhaps called `Ask', a part of Haskell which definitely excludes Hell) cannot contain all the angels, but it certainly admits plenty of useful ones who can always answer mundane things you might Ask. It's ironic, but not disastrous that lucifer, the evaluation function by which Ask's angels bring their light, is himself an angel, but one who must be cast into Hell.

LoL!

Comment author: eli_sennesh 22 April 2015 12:04:55AM 0 points [-]

Unless you designed the program yourself and have a good understanding of what it is doing, and sometimes even if you designed the program yourself, proving termination may not be trivial.

If it was a trivial problem it wouldn't be worth tackling.

Comment author: V_V 22 April 2015 11:02:49AM 0 points [-]

My point is that the observation that a program has been running for a long time and doesn't have simple "while(true) {...}" loops doesn't give us much information about its termination.

But I'm not sure what you are aiming at.
For most practical purposes, a program that takes a very long time before producing a result is as good as a program that will never produce a result. Termination is important in theoretical computer science as the prototypical undecidable property used to prove other non-existence results, but if you are concerned about specific programs that you plan to actually run on physical machines, why are you interested in termination?

Comment author: eli_sennesh 20 April 2015 10:07:34PM *  0 points [-]

I guess there's another question of how any of this makes sense if the universe is computable. We can still use information about which Turing machines halt in part of our generative model for a computable universe, even though "x doesn't halt" is never actually observed.

Think more like an inductive human being than a deductive formal system: "x doesn't ever halt" would require proof, but "x really seems not to halt for a very long time and seems to contain an infinitely-growing recursion there" gives us very valid reason to believe "x doesn't halt", and guides us in trying to assemble a proof.

(Likewise, as a side note, it can actually be quite difficult to prove "x does halt (on all inputs)", because merely being observed to halt on many inputs is no guarantee at all. Unless the looping/recursion is trivially structural on the input, proving "x does halt" to a type system's termination checker can be quite difficult!)
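To illustrate that last point (an illustrative sketch, in Haskell for familiarity): structural recursion is easy for a termination checker, while McCarthy's 91 function is total on all inputs yet not structurally decreasing, so a naive checker rejects it even though it always halts.

```haskell
-- Structural recursion on the input: any termination checker accepts this,
-- because each call peels one constructor off the argument.
len :: [a] -> Int
len []       = 0
len (_ : xs) = 1 + len xs

-- McCarthy's 91 function: provably total (it returns 91 for n <= 100 and
-- n - 10 otherwise), but the nested call mc91 (mc91 (n + 11)) GROWS its
-- argument, so proving termination needs a clever measure, not structural
-- descent.
mc91 :: Int -> Int
mc91 n
  | n > 100   = n - 10
  | otherwise = mc91 (mc91 (n + 11))
```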

Perhaps you could make a statement like: Solomonoff induction wins on computable universes for the usual reason, and it doesn't lose too many bits on uncomputable universes in some circumstances because it does at least as well as something that has a uniform prior over Chaitin's omega.

I think that considering "computable" or "uncomputable" to be ontological properties of universes is... contrary to the philosophical viewpoint being taken here? At least from my point of view here, you're not breaking an Ontological Law of the Platonic Plane of Maths if you "decide the undecidable"; you're breaking a law of thermodynamics (namely, you're trying to act as if you possessed infinite entropy with which to compute). It's physically impossible, not Platonically impossible.

If I may invoke a phrase, I'm basically taking the point of view that a formal system is a map of some mathematical territory (which Turing machines halt), but that it is possible to have increasingly detailed and accurate maps (systems with stronger axioms) of the same territory (increasing the quantity of Turing machines whose behavior we can decide). The hard nut to crack is: how do we acquire better maps?

Further, the universe doesn't necessarily have to be computable in any ontological sense. To talk about the physical, empirical universe, we only need to make predictions about events which take place a finite period of time into the future. If we're composing programs whose termination we want to guarantee, we can use inductive and coinductive type systems to guarantee termination and stepwise termination where we need them, and add hard time limits where we can't be sure.

As Schmidhuber put it regarding Goedel Machines: there may be very rare situations in which some part of reality is just Platonically, ontologically incomputable, but a human being is bound by the same laws of entropy as anything else, so the Platonic, ontological level of things doesn't really matter for purposes of doing real-world math or AI. (We find it convenient to pretend it does matter, but actually we only care about the theorems we can demonstrate to be true among real-world human beings (which includes Goedel Statements for Peano Arithmetic) rather than about Platonic Forms of Pure Randomness.)

Comment author: V_V 21 April 2015 02:09:08PM 0 points [-]

Think more like an inductive human being than a deductive formal system: "x doesn't ever halt" would require proof, but "x really seems not to halt for a very long time and seems to contain an infinitely-growing recursion there" gives us very valid reason to believe "x doesn't halt", and guides us in trying to assemble a proof.

Busy beavers seem not to halt for a very long time and spend most of their time in a seemingly infinitely-growing recursion, and then they halt.

You could argue that busy beavers are pathological examples of little practical interest, but then many programs that perform useful computation may run for a very long time without entering easily identifiable "loops" before they halt.
Unless you designed the program yourself and have a good understanding of what it is doing, and sometimes even if you designed the program yourself, proving termination may not be trivial.

Comment author: hairyfigment 20 April 2015 05:29:47PM 0 points [-]

I think the grandparent silently assumes we've solved logical uncertainty, and can use this to decide an axiom does not produce a contradiction by observing the lack of explosion. Though "routing through logic" seems like an incredibly bad way to say that.

Comment author: V_V 20 April 2015 05:56:36PM 0 points [-]

I think the grandparent silently assumes we've solved logical uncertainty, and can use this to decide an axiom does not produce a contradiction by observing the lack of explosion.

Wouldn't that imply having solved undecidability?

Comment author: eli_sennesh 20 April 2015 04:42:20AM 0 points [-]

The equivalent of "kilograms of mass" might be something like bits of Chaitin's omega. If you have n bits of Chaitin's omega, you can solve the halting problem for any Turing machine of length up to n.

That is the domain of things which I am referring to, yes.

you can't actually learn upper bounds on Chaitin's omega except by observing uncomputable processes (for example, a halting oracle confirming that some Turing machine doesn't halt).

This is where things get fuzzier. I think that by routing through logic and observing the empirical consequences of ordinal consistency axioms, we could learn to believe those axioms and thus eventually climb the ordinal hierarchy to gain knowledge of the halting behavior of increasingly many Turing machines. However, my original idea was much plainer: just do a kind of Solomonoff induction over Turing machines encoding axiom sets.

I'm trying to understand how the things you're saying could give us more powerful theories from empirical data without allowing us to recursively enumerate all non-halting Turing machines.

Even empirical data only allows us to enumerate a finite number of nonhalting Turing machines, without being able to decide the halting/nonhalting behavior of Turing machines too large for our empirical information.
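For reference, the standard dovetailing argument behind the quoted claim (that n bits of Omega decide halting for programs of length up to n) runs as follows, fixing a prefix-free universal machine U:

```latex
\Omega = \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|},
\qquad
L_t = \sum_{p\ \text{halted within}\ t\ \text{dovetailed steps}} 2^{-|p|}.
```

L_t is nondecreasing and converges to Omega, so at some stage L_t >= 0.w_1...w_n (the n-bit prefix read as a binary fraction). From then on, any not-yet-halted program p with |p| <= n can never halt: it would contribute 2^{-|p|} >= 2^{-n}, giving Omega >= L_t + 2^{-n} >= 0.w_1...w_n + 2^{-n} > Omega, a contradiction. Note that the procedure itself is computable; all the uncomputability is concentrated in its input, the bits of Omega.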

Comment author: V_V 20 April 2015 03:59:26PM *  0 points [-]

This is where things get fuzzier. I think that by routing through logic and observing the empirical consequences of ordinal consistency axioms

I'm not sure I understand what you mean here.

Assume we have a formal system and we are using it to predict observations. In classical logic, if the formal system is inconsistent, then ex falso quodlibet, therefore all possible observations will be allowed by it.

Comment author: V_V 20 April 2015 03:46:27PM *  4 points [-]

This is the sketch and the intuition. As a theory, it does one piece of very convenient work: it explains why we can't solve the Halting Problem in general (we do not possess correct formal systems of infinite size with which to reason about halting), but also explains precisely why we appear to be able to solve it in so many of the cases we "care about" (namely: we are reasoning about programs small enough that our theories are strong enough to decide their halting behavior -- and we discover new formal axioms to describe our environment).

We also appear not to be able to solve the halting problem for many small programs we care about, or at least we haven't been able to solve it yet.
The simplest example is probably the Collatz conjecture, but in general any mathematical conjecture can be reformulated as determining whether some small program that enumerates and checks all candidate proofs halts.
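As an illustrative sketch in Haskell: the Collatz iteration is a tiny program, empirically halting on every input ever tried, yet nobody can prove it total.

```haskell
-- Steps of the Collatz iteration until reaching 1. Whether this halts for
-- every positive n is the open Collatz conjecture: a very small program
-- whose termination we cannot currently decide.
collatzSteps :: Integer -> Integer
collatzSteps 1 = 0
collatzSteps n
  | even n    = 1 + collatzSteps (n `div` 2)
  | otherwise = 1 + collatzSteps (3 * n + 1)
```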

Comment author: jkaufman 06 April 2015 12:49:22PM 2 points [-]

"Futarchy" is an obscure, non-descriptive name. Better call it something like "prophetocracy"

"Futarchy" is the standard term for this governmental system. Perhaps Hanson should have chosen a different name, but that's the name it's been going under for about a decade, and I don't think "prophetocracy" would be an improvement.

Comment author: V_V 06 April 2015 10:04:40PM 0 points [-]

It's not a very well-known word, anyway. Would the cost of changing it outweigh the benefit of a relatively self-descriptive word?

In response to Why bitcoin?
Comment author: trifith 02 April 2015 01:59:28AM *  4 points [-]

Bitcoiner here, so bear that in mind

A 51% attack is hard. It's come close to happening once with a mining pool called GHash.io. The community quickly responded, and as of right now, the largest pool has about 17% of the network hashing power, and GHash.io has about 4%. The current state of network power distribution can be viewed at https://blockchain.info/pools , as well as other blockchain watching services. It is in any particular miner's interest to be in the largest pool possible, but counter to any miner's purpose to be in any pool with the possibility of making a 51% attack, since a successful 51% attack would be the end of bitcoin's market value, which makes their investment in mining equipment worthless. This has, so far, been successful.

Each wallet does not need to contain the entire blockchain. It is generally recommended that, if you are not running a full node contributing to the network, you use a thin wallet that queries the blockchain, or a trusted blockchain monitoring service, in order to assess balances and get the data needed to make new transactions in a much less data-intensive manner.

Governments are not keen to give up their monopoly on the money supply, but there are limits to how much they can do to maintain it. There have been several instances historically where government control of the money supply is undermined by the economic reality of people just using other things to conduct business, regardless of the legal status of such business.

Mining is not "wasted" computing power, but rather power being put to the specific purpose of verifying and securing the integrity of the bitcoin blockchain database. It is no more a waste of resources than the concrete, steel, and construction time of a bank vault. It's just not doing anything else, like creating an emergency shelter, or storing non-bitcoin valuables.

If you have a non-currency-based idea to get people to properly secure a publicly editable, decentralized ledger such that malicious actors cannot alter it to suit their needs, I'm more than willing to hear it. Until I see such a proposal, I don't see the addition of currency to a blockchain as a complication, but rather as a necessary function. There has to be something which incentivises people to engage in the process of securing the ledger, and that is the receipt of valuable entries on the ledger, those entries being usable as currency.

In response to comment by trifith on Why bitcoin?
Comment author: V_V 04 April 2015 01:27:27PM *  1 point [-]

A 51% attack is hard. It's come close to happening once with a mining pool called GHash.io. The community quickly responded, and as of right now, the largest pool has about 17% of the network hashing power, and GHash.io has about 4%. The current state of network power distribution can be viewed at https://blockchain.info/pools , as well as other blockchain watching services.

According to the site you linked, the four largest pools control over 50% of hashing power. Would it be unrealistic for them to collude? How do you know that it has not already happened?
Also, there is 19% of hashing power listed as "unknown". Presumably this comes from miners who are not part of any pool, but how can you exclude collusion there as well?

It is in any particular miner's interest to be in the largest pool possible, but counter to any miner's purpose to be in any pool with the possibility of making a 51% attack, since a successful 51% attack would be the end of bitcoin's market value, which makes their investment in mining equipment worthless. This has, so far, been successful.

That sounds like a tragedy of the commons scenario, which isn't promising.

Also, publicly distinct pools could collude in secret. I presume that whoever pulls it off might make perhaps a hundred million dollars by double-spending before the attack is discovered and Bitcoin's value plummets to zero. And if you are a bitcoiner who is convinced that somebody is going to do that sooner or later, then you have an incentive to be the first to do it.

More generally, one of the common criticisms of Bitcoin is that the infrastructure is currently supported by "stupid money", since mining has a lower RoI than other forms of investment. In fact, there is some speculation that the RoI may actually be negative once hardware costs are considered, and that current miners are just trying to partially recoup the cost of the hardware they wrongly purchased when Bitcoin was at ~$1,000.
This doesn't bode well for the long-term sustainability of the infrastructure.

Comment author: V_V 04 April 2015 12:37:11PM 1 point [-]

I'm not a big fan of decision making by conditional prediction markets (btw, "futarchy" is an obscure, non-descriptive name. Better call it something like "prophetocracy"), but I think that proponents like Robin Hanson propose that the value system is not set once and for all but regularly updated by a democratically elected government. This should avoid the failure mode you are talking about.
