IlyaShpitser comments on Deliberate Grad School - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (153)
Isn't that motte/bailey: "philosophy, a diseased discipline" is not a very discriminating title. The best line of his post is this:
And this is definitely ok!
But again, I am not super interested in arguing with people about whether philosophy is worthwhile. I have better things to do. I was only pointing out in response to the OP that I have been harping on LW's silly anti-academic sentiment for ages, that's all.
Not sure it's motte-and-bailey. I do think there are several serious pathologies in large swathes of contemporary philosophy. And I say this not as a dilettante, but as a professional philosopher. There are areas of philosophy where these pathological tendencies are being successfully held at bay, and I do think there are promising signs that those areas are growing in influence. But much of mainstream philosophy, especially mainstream metaphysics and epistemology, does suffer from continued adherence to what I consider archaic and unhelpful methodology. And I think that's what Luke is trying to point out. He does go overboard with his rhetoric, and I think he lacks a feel for the genuine insights of the Western philosophical tradition (as smart and insightful as I think Yudkowsky is, I really find it odd that someone who purports to be reasonably familiar with philosophy would cite him as their favorite philosopher). But I think there is a sound point lurking under there, and not merely a banal "motte"-style point.
I absolutely agree with you on the silliness of the anti-academic sentiment.
Would you mind explaining your perspective? I'm always interested to hear more angles on this, since with my current sample-size being roughly three (Dennett, Railton, Churchland), I tend to think I have an incomplete picture.
Everyone on LW should consider Francis Bacon their patron saint, imo :).
LW seems to have picked the Presbyterian minister Thomas Bayes as its patron saint with the Franciscan friar William of Ockham running a close second :-)
If I had to pick one, I'd go with Laplace.
In defense of Luke, when I've spent the time to read through philosophy books by strong-naturalist academic philosophers, they've often devoted page-counts easily equivalent in length to "Philosophy: a diseased discipline" to carefully, charitably, academically, verbosely tearing non-naturalist philosophy a new asshole. Luke's post has tended to be a breath of fresh air that I reread after reading any philosophy paper that doesn't come from a strongly naturalist perspective.
It sincerely worries me that the academics in philosophy who do really excellent work, work that does apply to the real world-that-is-made-of-atoms, work that does map-the-territory, have to spend large amounts of effort just beating down obviously bad beliefs over and over again. You should be able to shoot down a bad idea once, preferably in the peer-review phase, and not have to fight it again and again like a bad zombie.
(Examples of obviously bad ideas: p-zombies, Platonism, Bayesian epistemology (the latter two may require explanation).)
Now, to signal fairness even where I'm blatantly opinionated, plenty of people on LW are indeed irritatingly "men of one idea", that usually being some variation on AIXI. And in fact, plenty of people on LW hold philosophical opinions I consider obviously bad, like mathematical Platonism.
But the answer to those bad things hasn't usually been "more philosophy", as if any philosophy is good philosophy, but instead more naturalism, investing more effort to accommodate conceptual theorizing to the world-that-is-made-of-atoms.
Since significant portions of academic philosophy (for instance, Thomas Nagel) are instead devoted to the view - one that I once expected to be contrarian but which I now find depressingly common - that science and naturalism are wrong, or that they are unjustified, or that they are necessarily incapable of answering some-or-another important question - having one page on a contrarian intellectual-hipsters' website devoted to ragging on these ought-to-be-contrarian views is a bit of a relief.
Could you provide that explanation?
Sure.
If we take Platonism to be the belief that abstract objects (take, for instance, the objects of ZFC set theory) actually exist in a mind-independent way, if not in a particularly well-specified way, then it occurs because people mistake the contents of their mental models of the world for being real objects, simply because those models map the world well and compress sense-data well. In fact, those models often compress most sense-data better than the "more physicalist" truth would: they can be many orders of magnitude smaller (in bits of program devoted to generative or discriminative modelling).
However, just because they're not "real" doesn't mean they don't causally interact with the real world! The point of a map is that it corresponds to the territory, so the point of an abstraction is that it corresponds to regularities in the territory. So naive nominalism isn't true either: the abstractions and what they abstract over are linked, so you really can't just move names around willy-nilly. In fact, some abstractions will do better or worse than others at capturing the regularities in sense-data (and in states of the world, of course), so we end up saying that abstractions can exist on a sliding scale from "more Platonic" (those which appear to capture regularities we've always seen in all our previous data) to "more nominalist" (those which capture spurious correlations).
Now, for "Bayesian epistemology", I'm taking the Jaynesian view, which is considered extreme but stated very clearly and precisely, that reasoning consists in assigning probabilities to propositions. People who oppose Bayesianism will usually then raise the Problem of the Prior, and the problem of limited model classes, and so on and so forth. IMHO, the better criticism is simply: propositions are not first-order, actually-existing objects (see above on Platonism)! Consider a proposition to be a set of states some model can be in or not be in, and we can still use Bayesian statistics, including the kinds of complex Bayesian modelling used to model the mind, without endorsing Bayesian philosophy, which would require us to believe in spooky things called "propositions" and "logic" -- while also not believing in certain spooky things called "continuous random variables", which don't really fit into Cox's Theorem very well, if I understood Jaynes correctly.
How do you account for the fact that numbers are the same for everyone? Of course, not everyone knows the same things about numbers, but neither does everyone know the same things about Neptune. Nevertheless, the abstract objects of mathematics have the same ineluctability as physical objects. Everyone who looks at Neptune is looking at the same thing, and so is everyone who studies ZFC. These abstract objects can be used to make models of things, but they are not themselves those models.
You could equally say that everyone who looks at the rules of chess sees the same thing. In order to show some inevitability to ZFC, you have to show unconnected parties arriving at it independently.
On the one hand, why? I'm quite happy to say that chess exists. Not everyone will ever see chess, but not everyone will ever see Neptune. Among all the games that could be played, chess is but one grain of sand on the beach. But the grain of sand exists regardless of whether anyone sees it.
On the other hand, there has been, I believe, a substantial tendency for people devising alternative axioms for the concepts of sets to come up with things equiconsistent to ZFC or to subsets of ZFC, and with fairly direct translations between them. Compare also the concept of computability, where there is a very strong tendency for different ways to answer the question "what is computation?" to come up with equivalent definitions.
It is (I think) true that if you try to come up with an alternative foundation for mathematics you are likely to get something that's equivalent to some subset of ZFC perhaps augmented with some kind of large cardinal axiom. But that doesn't mean that ZFC is inevitable, it means that if you construct two theories both intended to "support" all of mathematics without too much extravagance, you can often more or less implement one inside the other.
But that doesn't mean that ZFC specifically has any particular inevitability. Consider, e.g., NFU + Infinity + Choice (as used e.g. in Randall Holmes's book "Elementary set theory with a universal set") which I'll call NFUIC henceforward. This is consistent relative to ZFC, and indeed relative to something rather weaker than ZFC, and NFUIC + "all Cantorian sets are strongly Cantorian" (never mind exactly what that means) is equiconsistent with ZFC + some reasonably plausible large-cardinal axioms. OK, fine, so there's a sense in which NFUIC is ZFC-like, at least as regards consistency strength. But NFUIC's sets are most definitely not the same as ZFC's sets. NFUIC has a universal set and ZFC doesn't; ZFC's sets are the same sizes as their sets-of-singletons and NFUIC's often aren't; NFU has lots and lots of urelements and ZFC has just the single empty set; etc. NFUIC is very unlike ZFC despite these relationships in terms of consistency strength.
[EDITED to add:] Here's an analogy. You get the same computable functions whether you start with (1) Turing machines, (2) register machines, (3) lambda calculus, or (4) Post production systems. But those are still four very different foundations for computing, they suggest quite different possible hardware realizations and different kinds of notation, they have quite different performance characteristics, etc. (The execution times are admittedly all bounded by polynomials in one another. We could add (5) quantum Turing machines, in which case that would no longer be known to be true.) It's very interesting that these all turn out to be equivalent in power in some sense, but I wouldn't call that convergence or suggest that it tells us that (e.g.) lambda-calculus terms have any sort of more exalted metaphysical status than they would if it weren't for that equivalence.
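To make the analogy concrete, here is a minimal Python sketch of two of those foundations computing the same function (addition). Everything in it is an illustrative assumption rather than anything from the comment: the Church-numeral encoding stands in for lambda calculus, and the two-instruction machine (`inc` and `decjz`, names made up here) stands in for a register machine.

```python
# Model 1: lambda calculus, with numbers encoded as Church numerals.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
church_add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_church(k):
    """Encode a non-negative Python int as a Church numeral."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

def from_church(n):
    """Decode a Church numeral by counting applications of f."""
    return n(lambda v: v + 1)(0)

# Model 2: a toy register machine with two (hypothetical) instructions.
#   ("inc", r, nxt)            : increment register r, jump to nxt
#   ("decjz", r, nxt, if_zero) : if r == 0 jump to if_zero,
#                                else decrement r and jump to nxt
ADD_PROGRAM = [
    ("decjz", 1, 1, 2),  # 0: if r1 == 0 halt (jump past the end), else r1 -= 1
    ("inc", 0, 0),       # 1: r0 += 1, loop back to instruction 0
]

def run_register_machine(program, registers):
    """Run until the program counter leaves the program; return the registers."""
    regs = list(registers)
    pc = 0
    while 0 <= pc < len(program):
        instr = program[pc]
        if instr[0] == "inc":
            _, r, nxt = instr
            regs[r] += 1
            pc = nxt
        else:
            _, r, nxt, if_zero = instr
            if regs[r] == 0:
                pc = if_zero
            else:
                regs[r] -= 1
                pc = nxt
    return regs

def lambda_add(a, b):
    return from_church(church_add(to_church(a))(to_church(b)))

def register_add(a, b):
    return run_register_machine(ADD_PROGRAM, [a, b])[0]
```

The two routines agree on every input, yet the code for each looks nothing like the other, which is the sense in which the foundations are "equivalent in power" while remaining very different things.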
Yes, ZFC may be not quite such a starkly isolated landmark of thinginess as computability is, which is why I said "a strong tendency". And anyway, these alternative formalisations of set theory mostly have translations back and forth. Even ZFA (which has sets-within-sets-within-etc infinitely deep) can be modelled in ZFC. It's not a subject I've followed for a long time, but back when I did, Quine's NF was the only significant system of set theory for which this had not been done. I don't know if progress has been made on that since.
(ETA: I found this review of NF from 2011. Its consistency was still open then.)
As for computable functions, yes, the different ways of getting at the class have different properties, but that just makes them different roads leading to the same Rome.
Randall Holmes says he has a proof of the consistency of NF relative to ZFC (and in fact something weaker, I think). He's said this for a while, he's published a few versions of his proof (mostly different in presentation in the interests of clarity, rather than patching bugs), and I think the general feeling is that he probably does have a proof but it hasn't yet been thoroughly checked by others. (Who may be holding off because he's still changing his mind about the best way of writing it down.)
The question is whether the rules of chess have mind-independent existence.
Where are these grains, i.e. the rules of every possible game? Are they in our universe, or in some heavenly Library of Babel?
So how do we cash out the idea that these things are converging on an abstract object, rather than just converging? One way is to put forward the counterfactual that if the abstract object were different, then the convergence would occur differently. But that seems rather against the spirit of what you are aiming at.
Ask Max Tegmark. :)
I don't believe in his Level IV multiverse, though. That is, I do draw a distinction between physical and abstract objects.
That they are converging is enough. To quote an old saw variously attributed, in mathematics existence is freedom from contradiction.
If you don't know what your theory is, why undertake to defend it?
Vague again. Realists think mathematical existence is a real, literal existence for which non-contradiction is a criterion, whereas antirealists think mathematical existence is a mere metaphor with no more content than non-contradiction.
Perhaps I should finish with my usual comment that studying philosophy is useful because it allows you to articulate your theories, or, failing that, to notice when there are no clear concepts behind your words.
Two correct maps of the same territory, designed to highlight the same regularities and obscure the same sources of noise, will be either completely the same or, in the noisy case, will approximate each other.
Just because there's no Realm of Forms doesn't mean that numbers can be different for different people without losing their ability to compressively predict regularities in the environment.
What is the territory, that numbers are a map of? I can use them to assemble a map, for example, s=0.5at^2 as a map, or model, of uniformly accelerating bodies, but the components of this are more like the ink and paper used to make a map than they are like a map.
I have a bunch of maps, literal printed maps of various places, and the maps certainly exist as physical objects, alongside the places that they are maps of. They exist independently of me, and independently of whether anyone uses them as a map or as wrapping paper. Likewise, it seems to me, numbers.
If there is no Realm of Forms, what territory are you referring to?
The ordinary physical universe, presumably.
As TheAncientGeek said, the ordinary physical universe. "Abstract" objects abstract over concrete objects.
And where in the ordinary physical universe do these abstractions live?
Again: they abstract over concrete objects. You get a map that represents lots of territories at the same time by capturing their common regularities and throwing out the details that make them different.
The motivation actually seems to be the Correspondence Theory of Truth, which is mentioned several times in subsequent comments.
Indeed, even though when you use a Lossy-Correspondence/Compression Theory of Truth, abstract objects become perfectly sensible as descriptions of regularities in concrete objects.
Not really, because most maths is unphysical, i.e. physics is picking out the physically applicable parts of maths, i.e. the rest has nothing to correspond to.
If I remember my Lakoff & Núñez correctly, they were arguing that even the most abstract and un-physical-seeming of maths is constructed on foundations that derive from the way we perceive the physical world.
Let me pick up the book again... ah, right. They define two kinds of conceptual metaphor:
Their argument is that for any kind of abstract mathematics, if you trace back its origin for long enough, you finally end up at some grounding and linking metaphors that have originally been derived from our understanding of physical reality.
As an example of the technique, they discuss the laws of arithmetic as having been derived from four grounding metaphors: Object Collection (if you put one and one physical objects together, you have a collection of two objects), Object Construction (physical objects are made up of smaller physical objects; used for understanding expressions like "five is made up of two plus three" or "you can factor 28 into 7 times 4"), Measuring Stick (physical distances correspond to numbers; gave birth to irrational numbers, when the Pythagorean theorem was used to prove their existence by assuming that there's a number that corresponds to the length of the hypotenuse), and Motion Along A Path (used in the sixteenth century to invent the concept of the number line, and the notion of a number as lying between two other numbers).
Now, they argue that these grounding metaphors, each by themselves, are not sufficient to define the laws of arithmetic for negative numbers. Rather you need to combine them into a new metaphor that uses parts of each, and then define your new laws in terms of that newly-constructed metaphor.
Defining negative numbers is straightforward using these metaphors: if you have the concept of a number line, you can define negative numbers as "point-locations on the path on the side opposite the origin from positive numbers", so e.g. -5 is the point five steps to the left of the origin point, symmetrical to +5 which is five steps to the right of the origin point.
Next we can use Motion Along A Path to define addition and subtraction: adding positive numbers is moving towards the right, addition of negative numbers is moving towards the left, subtraction of positive numbers is moving towards the left, and subtraction of negative numbers is moving towards the right. Multiplication by a positive number is also straightforward: if you are multiplying something by n, you just perform the movement action n times.
But multiplication by a negative number has no meaning in the source domain of motion. You can't "do something a negative number of times". A new metaphor must be found, constrained by the fact that it needs to fit the fact that we've found 5 * (-2) = -10 and that, by the law of commutation (also straightforwardly derivable from the grounding metaphors), (-2) * 5 = -10.
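The Motion Along A Path reading of arithmetic described above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own (integer positions, unit steps, and the function names are all made up here), not Lakoff & Núñez's formalism.

```python
def move(position, steps, direction):
    """Take `steps` unit steps; direction is +1 (right) or -1 (left)."""
    for _ in range(steps):
        position += direction
    return position

def add(position, n):
    # Adding a positive number moves right; adding a negative one moves left.
    return move(position, abs(n), +1 if n >= 0 else -1)

def subtract(position, n):
    # Subtraction reverses the direction of the corresponding addition.
    return move(position, abs(n), -1 if n >= 0 else +1)

def multiply_by_positive(a, n):
    """Perform the movement 'add a' n times, starting from the origin."""
    if n < 0:
        # The source domain of motion has no "negative number of times".
        raise ValueError("cannot repeat a movement a negative number of times")
    position = 0
    for _ in range(n):
        position = add(position, a)
    return position
```

Under this sketch `multiply_by_positive(-2, 5)` lands on -10, while `multiply_by_positive(5, -2)` has no interpretation and raises an error, which is exactly the gap a further metaphor has to fill.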
Now:
So in other words, we have taken some grounding metaphors and built a new metaphor that blends elements of them, and after having constructed that new metaphor, we use the terms of that combined metaphor to define a new metaphor on top of that.
While this example was in the context of an obviously physically applicable part of maths, their argument is that all of maths is built in this way, starting from physically grounded metaphors which are then extended and linked to build increasingly abstract forms of mathematics... but all of which are still, in the end, constrained by the physical regularities they were originally based on:
To take a step back: the discussion is about mathematical Platonism, a theory of mathematical truth which is apparently motivated by the Correspondence theory of truth. That is being rivaled by another theory, also motivated by CToT, wherein the truth-makers of mathematical statements are physical facts, not some special realm of immaterial entities. The relevance of my claim that there are unphysical mathematical truths is that it is an argument against the second claim.
Lakoff and Nunez give an account of the origins and nature of mathematical thought that, while firmly anti-Platonic, doesn't back a rival theory of mathematical truth, because that is not in fact their area of interest. Their interest is in mathematical thinking.
Who said that? Actual formal systems run on a coherence theory of truth: if the theory is consistent (and I do mean consistent according to a meta-system, so Goedel and Loeb aren't involved right now), then it's a theory. It may also be a totally uninteresting theory, or a very interesting theory. The truth-maker for a mathematical statement is just whether it has a model (and if you really wanted to, you could probably compile that into something about computation via the Curry-Howard Correspondence and some amount of Turing oracles). But the mere truth of a statement within a formal system is not the interesting thing about the statement!
Insofar as logic consists in information-preserving operations, the non-physically-applicable parts of math still correspond to the real world, in that they preserve the information about the real world which was put into formulating/locating the starting formal system in the first place.
This is what makes mathematics so wondrously powerful: formality = determinism, and determinism = likelihood functions of 0 or 1. So when doing mathematics, you get whole formal systems where the theorems are always at least as true as the axioms. As long as any part of the system corresponds to the real world (and many parts of it do) and the whole system remains deterministic, then the whole system compresses information about the real world.
Whereas the physically inapplicable parts don't retain real-world correspondence. Correspondence isn't an intrinsic, essential part of maths.
I don't follow your argument re Bayesian epistemology, in fact, I find it not at all obvious. The argument looks like insisting on a different vocabulary while doing the same things, and then calling it statistics rather than epistemology.
Can you give a pointer to where he disbelieves in these? He does refer to them apparently unproblematically here and there, e.g. in deducing what a noninformative prior on the chords of a circle should be.
1) Dissolving epistemology to get statistics of various kinds underneath is a good thing, especially since the normal prescription of Bayesian epistemology is, "Oh, just calculate the posterior", while in Bayesian statistics we usually admit that this is infeasible most of the time and use computational methods to approximate well.
2) The difference between Bayesian statistics and Bayesian epistemology is slight, but the difference between Bayesian statistics and the basic nature of traditional philosophical epistemology that the Bayesian epistemologists were trying to fit Bayesianism into is large.
3) The differences start to become large when you stop using spaces composed of N mutually-exclusive logical propositions arranged into a Boolean algebra. For instance, computational uncertainty and logical omniscience are nasty open questions in Bayesian epistemology, while for an actual statistician it is admitted from the start that models do not yield well-defined answers where computations are infeasible.
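As a small illustration of "use computational methods to approximate", here is a hedged Python sketch of a grid approximation to a posterior. The coin-flip model, the uniform prior, and the function names are assumptions chosen only because this model also has a closed form to compare against; real applications use the approximation precisely when no such closed form is available.

```python
def grid_posterior_mean(heads, tails, grid_size=1001):
    """Approximate the posterior mean of a coin's bias under a uniform prior
    by evaluating prior * likelihood on a finite grid and normalizing."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Uniform prior, so the unnormalized posterior is just the likelihood.
    weights = [theta**heads * (1 - theta)**tails for theta in grid]
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total

def exact_posterior_mean(heads, tails):
    """Closed form for the same model: the mean of Beta(1 + heads, 1 + tails)."""
    return (1 + heads) / (2 + heads + tails)
```

For 7 heads and 3 tails the two functions agree closely, and only the grid version generalizes to likelihoods with no conjugate prior (at the cost of a grid that grows exponentially with the number of parameters).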
I can't, since the precise page number would have to be a location number in my Kindle copy of Jaynes' book.
A brief quote will do, enough words to find them in my copy.
I found a partial answer to the question I asked in the sibling comment. By chance I happened to need to generate random chords of a circle covering the circle uniformly. In searching on the net for Jaynes' solution I came across a few fragments of Jaynes' views on infinity. In short, he insists on always regarding continuous situations as limits of finite ones (e.g. as when the binomial distribution tends to the normal), which is unproblematic for all the mathematics he wants to do. That is how the real numbers are traditionally formalised anyway. All of analysis is left unscathed. His wider philosophical objections to such things as Cantor's transfinite numbers can be ignored, since these play no role in statistics and probability anyway.
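The binomial-to-normal limit mentioned above is easy to check numerically. The following Python sketch is my own illustration, not Jaynes' construction: it measures the largest pointwise gap between the Binomial(n, p) probabilities and the normal density with the same mean and standard deviation, and the helper names are made up.

```python
import math

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mean, sd):
    """Density of the normal distribution with the given mean and sd."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def max_gap(n, p=0.5):
    """Largest pointwise gap between Binomial(n, p) and the matching normal."""
    mean, sd = n * p, math.sqrt(n * p * (1 - p))
    return max(
        abs(binomial_pmf(n, k, p) - normal_pdf(k, mean, sd))
        for k in range(n + 1)
    )
```

Calling `max_gap` for n = 10, 100, 400 gives a shrinking sequence of gaps: every object involved is finite, and the continuous distribution only ever appears as the thing the finite ones approach.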
I don't know about the technicalities regarding Cox's Theorem, but I do notice a substantial number of papers arguing about exactly what hypotheses it requires or does not require, and other papers discussing counterexamples (even to the finite case). The Wikipedia article has a long list of references, and a general search shows more. Has anyone written an up to date review of what Cox-style theorems are known to be sound and how well they suffice to found the mathematics of probability theory? I can google /"Cox's theorem" review/ but it is difficult for me to judge where the results sit within current understanding, or indeed what the current understanding is.
I don't know. But I will say this: I am distrustful of a foundation which takes "propositions" to be primitive objects. If the Cox's Theorem foundation for probability requires that we assume a first-order logic foundation of mathematics in general, in which propositions cannot be considered as instances of some larger class of things (as they can in, for personal favoritism, type theory), then I'm suspicious.
I'm also suspicious of how Cox's Theorem is supposed to map up to continuous and non-finitary applications of probability -- even discrete probability theory, as when dealing with probabilistic programming or the Solomonoff measure. In these circumstances we seem to need the measure-theoretic approach.
Further: if "the extension of classical logic to continuous degrees of plausibility" and "rational propensities to bet" and "measure theory in spaces of normed measure" and "sampling frequencies in randomized conditional simulations of the world" all yield the same mathematical structure, then I think we're looking at something deeper and more significant than any one of these presentations admits.
In fact, I'd go so far as to say there isn't really a "Bayesian/Frequentist dichotomy" so much as a "Bayesian-Frequentist Isomorphism", in the style of the Curry-Howard Isomorphism. Several things we thought were different are actually the same.
That word, "obviously", I don't think it means what you think it means :-)
I'm not a fan of mathematical Platonism, but physical realists, however hardline, face some very difficult problems regarding the ontological status of physical law, which make Platonism hard to rule out. (And no, the perennially popular "laws are just descriptions" isn't a good answer).
P-zombies as a subject worth discussing, or as something that can exist in our universe? But most of the people who discuss PZs don't think they can exist in our universe. There is some poor-quality criticism of philosophy about as well.
The problems with Bayes are sufficiently non-obvious to have eluded many or most at LW.
On the one hand, I think that page in specific is actually based on outdated Bayesian methods, and there's been a lot of good work in Bayesian statistics for complex models and cognitive science in recent years.
On the other hand, I freaking love that website, despite its weirdo Buddhist-philosophical leanings and one or two things it gets Wrong according to my personal high-and-mighty ideologies.
And on the gripping hand, he is very, very right that the way the LW community tends to phrase things in terms of "just Bayes it" is not only a mischaracterization of the wide world of statistics, it's even an oversimplification of Bayesian statistics as a subfield. Bayes' Law is just the update/training rule! You also need to discuss marginalization; predictive distributions; maximum-entropy priors, structural simplicity priors, and Bayesian Occam's Razor, and how those are three different views of Occam's Razor that have interesting similarities and differences; model selection; the use of Bayesian point-estimates and credible-hypothesis tests for decision-making; equivalent sample sizes; conjugate families; and computational Bayes methods.
Then you're actually learning and doing Bayesian statistics.
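A few of the items on that list (the update rule, conjugate families, point estimates, predictive distributions via marginalization, equivalent sample sizes) fit into one tiny Python sketch. The Beta-Bernoulli coin model and all function names are assumptions picked for illustration; nothing here comes from the linked page or the comment.

```python
from fractions import Fraction

def update(alpha, beta, heads, tails):
    """Bayes' Law for a Beta(alpha, beta) prior and coin-flip data.
    The Beta family is conjugate, so the posterior is again a Beta."""
    return alpha + heads, beta + tails

def posterior_mean(alpha, beta):
    """A Bayesian point estimate of the coin's bias."""
    return Fraction(alpha, alpha + beta)

def predictive_two_heads(alpha, beta):
    """Posterior predictive probability that the next two flips are both
    heads, obtained by marginalizing over the unknown bias."""
    n = alpha + beta
    return Fraction(alpha * (alpha + 1), n * (n + 1))

def equivalent_sample_size(alpha, beta):
    """How many pseudo-observations the distribution is 'worth'."""
    return alpha + beta
```

Starting from a uniform Beta(1, 1) prior and observing 7 heads and 3 tails gives Beta(8, 4). The predictive probability of two more heads is then 6/13, which is not the square of the point estimate 2/3: plugging in a point estimate and marginalizing over the posterior are different operations, so the update rule alone is not the whole story.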
On the miniature nongripping hand, I can't help but feel that the link between probability, thermodynamics, and information theory means Eliezer and the Jaynesians are probably entirely correct that as a physical fact, real-world event frequencies and movements of information obey Bayes' Law with respect to the information embodied in the underlying physics, whether or not I can model any of that well or calculate posterior distributions feasibly.
Starting out by expecting a view opposed to your own to be contrarian is a typical form of overconfidence, and not just overconfidence about other people's opinions.
Sometimes, yes. However, I rather expect that naturalism should be the consensus.