AShepard comments on Some Heuristics for Evaluating the Soundness of the Academic Mainstream in Unfamiliar Fields - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
I'm surprised that you don't mention the humanities as a really bad case where there is little low-hanging fruit and high ideological content. Take English literature for example. Barrels of ink have been spilled in writing about Hamlet, and genuinely new insights are quite rare. The methods are also about as unsound as you can imagine. Freud is still heavily cited and applied, and postmodern/poststructuralist/deconstructionist writing seems to be accorded higher status the more impossible to read it is.
Ideological interest is also a big problem. This seems almost inevitable, since the subject of the humanities is human culture, which is naturally bound up with human ideals, beliefs, and opinions. Academic disciplines are social groups, so they have a natural tendency to develop group norms and ideologies. It's unsurprising that this trend is reinforced in those disciplines that have ideologies as their subject matter. The result is that interpretations which do not support the dominant paradigm (often a variation on how certain sympathetic social groups are repressed, marginalized, or "otherized"), are themselves suppressed.
One theory of why the humanities are so bad is that there is no empirical test for whether an answer is right or not. Incorrect science leads to incorrect predictions, and even incorrect macroeconomics leads to suboptimal policy decisions. But it's hard to imagine what an "incorrect" interpretation of Hamlet even looks like, or what the impact of having an incorrect interpretation would be. Hence, there's no pressure towards correct answers that offsets the natural tendency for social communities to develop and enforce social norms.
I wonder if "empirical testability" should be included alongside the low-hanging fruit heuristic.
Well put. You've concisely stated a heuristic that is very powerful but rarely used where it needs to be.
Be warned: it's actually a source of sadness for me whenever I start asking the question, "if X performed Y badly, what would be the impact?" -- because the conclusion is often "not much, so why does the world create incentives that led to them trying to do Y 'well' in the first place?"
AShepard:
Well, I have mentioned history. Other humanities can be anywhere from artsy fields where there isn't even a pretense of any sort of objective insight (not that this necessarily makes them worthless for other purposes), to areas that feature very well researched and thought-out scholarship if ideological issues aren't in the way, and if it's an area that hasn't been already done to death for generations (which is basically my first heuristic).
Perhaps surprisingly, it doesn't seem to me that empirical testability is so important. Lousy work can easily be presented with plenty of empirical data carefully arranged and cherry-picked to support it. To recognize the problem in such cases and sort out correct empirical validation from spin and propaganda is often a problem as difficult as sorting out valid from invalid reasoning in less empirically-oriented work.
It does make them, if not worthless, at least worth less for other purposes.
I spent a lot of time in college in the humanities, art (Bachelor of Fine Art degree, eventually), Philosophy, English (beyond the basic Comp and Rhetoric classes) etc.
The less objective the standards applied, the worse the product and the less effort the artist/author (and yes, I'm generalizing here) put into his work.
I had one class at a very anti-objective school where the teacher (and I almost never use that term, especially for instructors at that school) was fairly strict about meeting her standards, and the final critiques were amusing. Kids who skated by in other classes on a modicum of effort, little talent and a tractor load of post-modernist bullshit (mostly regurgitated and badly understood) got hammered for not working to the fairly loose requirements.
Art is not some special case of human effort where intellect and informed taste have no bearing. It is currently (since roughly the 1950s) a place where intellect and informed taste have been told they aren't welcome so the children could keep playing with their mud. And I don't say this out of bitterness--I have very little talent for the "high" arts, and merely wish the people producing it these days were better at thinking than they are.
I disagree on the "artsy" fields. I feel like art history has reached a dead end because of the structure of the art market. As the area considered "art" for academic purposes has become more concentrated and expensive, scholarship has been undermined and I think we've seen a general unwillingness to engage new topics simply because they don't lend themselves very well to museums or gallery sales.
Sounds like a good idea until you realize that you are throwing out most math and philosophy with the bathwater.
How about accepting either empirical testability or a requirement that all claims be logically proven? (Much of microeconomics and game theory slides in under 'provable' rather than 'testable'. Quite a bit of philosophy fails under both criteria, but some of it approaches 'provable'.)
Even better, demand that there be strict rules in the discipline which the research must obey--be it logical provability, empirical testability, or whatever else. It is still possible to make up unreasonable rules, but the production of bullshit is a lot easier without rules. Which is the case with deconstructionism and related fields.
prase:
Strict formal rules are a two-edged sword. If well designed, they indeed serve as a powerful barrier against nonsense. However, they can also be perverted, with extremely bad results.
In many disciplines that have been affected by the malaises discussed in this thread, what happens is that a perverse formal system develops, which then serves as a template for producing impressive-looking bullshit work. This sometimes leads to the very heights of ass-covering irresponsibility, since everyone involved -- authors, editors, reviewers, grant committees... -- can hide behind the fact that the work satisfies all the highest professional expert standards if questioned about it. At worst, these perverse formal standards can also serve as a barrier against actual quality work that doesn't conform to their template.
Just to be clear: by strict rules I don't mean anything with significant subjective judgement involved, like peer review. I mean rather things like demanding testability, mathematical proofs, logical consistency and such. Also, not so much rules governing the social life of the respective community as rules applied to the hypotheses.
Also, I haven't said that rules are sufficient. One can still publish trivial theories that nobody is interested in testing, mathematical proofs of obscure unimportant theorems, or logically consistent tautologies. But at least the rules remove arbitrariness and make it possible to objectively assess quality and to decide whether a hypothesis is good or bad, according to the standards of the discipline.
The discipline's standard of good hypothesis may not universally correspond to a true hypothesis, but I suspect that if the standards of the discipline are strict enough, either the correspondence is there, or it is easily visible that the discipline is based on wrong premises, because it endorses some easily identifiable falsehoods. (It would be too big a coincidence if a formal system regularly produced false statements, but no trivially false statements.)
On the other hand, when the rules aren't formal enough, the discipline still makes complex false claims, but nobody can clearly demonstrate that its methods are unreliable, because the methods (if there are any) can always be flexed to avoid producing embarrassingly trivial errors.
prase:
Trouble is, there are examples of fields where the standards satisfy all this, but the work is nevertheless misleading and remote from reality.
Take the example of computer science, which I'm most familiar with. In some of its subfields, the state of the art has reached a dead end, in that any obvious path for improving things hits against some sort of exponential-time or uncomputable problem, and the possible heuristics for getting around it have already been explored to death. Breaking a new path in this situation could be done only by an extraordinary stroke of genius, if it's possible at all.
So what people do is to propose yet another complex and sophisticated but ultimately feeble heuristic wrapped into thick layers of abstruse math, and argue that it represents an improvement of some performance measure by a few percentage points. Now, if you look at a typical paper from such an area, you'll see that the formalism is accurate mathematically and logically, the performance evaluation is carefully measured over a set of standard benchmarks according to established guidelines, and the relevant prior work is meticulously researched and cited. You have to satisfy these strict formal standards to publish.
Trouble is, nearly all this work is worthless, and quite obviously so. From a practical engineering perspective, implementing these complex algorithms in a practical system would be a Herculean task for a minuscule gain. The hypertrophied formalism often uses numerous pages of abstruse math to express ideas that could be explained intuitively and informally in a few simple sentences to someone knowledgeable in the field -- and would in turn be immediately and correctly dismissed as impractical. Even the measured performance improvements are rarely evaluated truly ceteris paribus and in ways that reveal all the strengths and weaknesses of the approach. It's simply impossible to devise a formal standard that would ensure that reliably -- these things can be figured out only with additional experimentation or with a practical engineering hunch.
Except perhaps in the purest mathematics, no formal standard can function well in practice if legions of extraordinarily smart people have the incentive to get around it. And if there are no easy paths to quality work, the "publish or perish" principle makes it impossible to compete and survive unless one exerts every effort to game the system.
That's right, and I don't disagree. Formal standards are never a panacea. But do you suppose that, in the cases you describe, things would go better without those formal standards?
I am still not sure we mean exactly the same thing when talking about formal rules. Take the example of pure mathematics, which you have already mentioned. Surely, abstruse formalist descriptions of practically uninteresting and maybe trivial problems appear there too, now and then. And revolutionary breakthroughs perhaps more often result from intuitive insights of geniuses than from diligent, rigorous formal work. Many papers, in all fields, could be made more readable, accessible, and effective in disseminating new results by shedding the lofty jargon of scientific publications. But mathematicians certainly wouldn't do better if they got rid of mathematical proofs.
I do not suggest that all ideas in respectable fields of science should be propagated in the form of publications checked against lists of formal requirements: citation index, proofs of all logical statements, p-values below 0.01, certificates of double-blindedness. Not in the slightest. Conjectures, analogies, illustrations, whatever enhances understanding is welcome. I only want the possibility of applying the formal criteria. If a conjecture is published, and it turns out to be interesting, there should be an ultimate method to test whether it is true. If there is an agreed method to test the results objectively, people aren't free to publish whatever they want and expect never to be proven wrong.
If you compare the results of computer science to postmodern philosophy, you may see my point. In CS most results may be useless and incomprehensible. In postmodern philosophy, which is essentially without formal rules, all results are useless and incomprehensible, and as a bonus, meaningless or false.
I agree about the awful state of fields that don't have any formal rules at all. However, I'm not so concerned about these because, to put it bluntly, nobody important takes them seriously. A much greater problem, in my opinion, are fields that appear to have all the trappings of valid science and scholarship, but where it's in fact hard for an outsider to evaluate whether and to what extent they're actually cargo-cult science. This is especially so because the results of some such fields (most notably economics) are used as a basis for real-world decision-making with far-reaching consequences.
Regarding the role of formalism, mathematics is unique in that the internal correctness of the formalism is enough to establish the validity of the results. Sure, they may be more or less interesting, but if the formalism is valid, then it's valid math, period.
In contrast, in areas that make claims about the real world, the important thing is not just the validity of the formalism, but also how well it corresponds to reality. Work based on a logically impeccable formalism can still be misleading garbage if the formalism is distant enough from reality. This is where the really hard problem is. The requirements about the validity of the formalism are easily enforced, since we know how to reduce those to a basically algorithmic procedure. What is really hard is ensuring that the formalism provides an accurate enough description of reality -- and given an incentive to do so, smart people will inevitably figure out ways to stretch and evade this requirement, unless there is a sound common-sense judgment standing in the way.
Further, more rigorous formalism isn't always better. It's a trade-off. More effort put into greater formal rigor -- including both the author's effort to formulate it and the reader's effort to understand it -- means fewer resources for other ways of improving the work. Physicists, for example, normally just assume that their functions are well-behaved, in a way that would be unacceptable in mathematics, and they're justified in doing so. In more practical technical fields like computer science, what matters is whether the results are useful in practice; formal rigor is useful if it helps avoid confusion about complicated things, but worse than useless if applied to things where intuitive understanding is good enough to get the job done.
The crucial lesson, like in so many other things, is that whenever one deals with the real world, formalism cannot substitute for common sense. It may be tremendously helpful and enable otherwise impossible breakthroughs, but without an ultimate sanity check based on sheer common sense, any attempt at science is a house built on sand.
I don't think we have a real disagreement. I haven't said that more rigorous formalism is always better -- quite the contrary. I was writing about objective methods of evaluating the results. Physicists can ignore mathematical rigor because they have experimental tests which ultimately decide whether their theory is worth attention. Computer scientists can finally write down their algorithm and see whether it works. These are objective rules which validate the results.
Whether the rules are sensible or not is decided by common sense. My point is that it is easier to decide that about the rules of the whole field than about individual theories, and that's why objective rules are useful.
Of course, saying "common sense" in fact means that we don't know how we decided, and doesn't specify the judgement very much. One man's common sense may be another man's insanity.
Oh yes, I didn't mean to imply that you disagreed with everything I wrote in the above comment. My intent was to give a self-contained summary of my position on the issue, and the specific points I raised were not necessarily in response to your claims.
Even in mathematics, you can find contrarian opinions that much of the field is meaningless. What we have is (or at least seems to be) logically proved from the basis of certain assumptions, but we could as easily have picked very different assumptions and proved different theorems instead. There is a prevailing opinion that certain assumptions (the mainstream foundations of mathematics) are correct or at least useful, but correctness ultimately reduces to an aesthetic judgement, and usefulness is known to be exaggerated.
Pure mathematics per se may not be empirically testable, but once you establish certain correspondences - small integers correspond to pebbles in a bag, or increments to physical counting devices - then the combination of conclusion+correspondence often is testable, and often comes out to be true.
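The pebbles-in-a-bag correspondence can be made concrete. The sketch below is my own illustration (not from the thread): integers map to bags of pebbles, addition maps to pouring two bags together, and the mathematical conclusion plus the correspondence yields an empirically checkable prediction.

```python
# Toy model of the correspondence: the integer n maps to a bag of n
# pebbles, and integer addition maps to pouring two bags together.

def bag(n):
    """A bag of n indistinguishable pebbles, modeled as a list."""
    return ["pebble"] * n

def pour_together(bag_a, bag_b):
    """The physical operation of combining two bags into one."""
    return bag_a + bag_b

# The mathematical claim 2 + 3 = 5, plus the correspondence, predicts
# that pouring a 2-pebble bag into a 3-pebble bag yields 5 pebbles:
assert len(pour_together(bag(2), bag(3))) == 2 + 3
```

The "test" here is trivial, but that is the point of the comment above: it is the combination of theorem and correspondence that is empirical, not the theorem alone.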
In some cases, combinations of correspondences+mathematically true conclusion gives a testably false conclusion about the real world, such as the Banach-Tarski paradox.
The problem here isn't the mathematics, but the correspondence. Physical balls are only measurable sets to a first approximation.
Yes.
However, imagine some abstruse mathematical theory that, in some "evaluate it on its own terms" sense, is true, but every correspondence that we attempt to make to the empirical world fails. I would claim that the failure to connect to an empirical result is actually a potent criticism of the theory - perhaps a criticism of irrelevance rather than falsehood, but a reason to prefer other fields within mathematics nevertheless.
I don't know of any such irrelevant mathematical theories, and to some extent I believe there aren't any. The vast majority of current mathematical theories can be formalized within something like the Calculus of Constructions or ZF set theory, and so they could be empirically tested by observing the behavior of a computing device programmed to do brute-force proofs within those systems.
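Brute-force proof search in ZF or the Calculus of Constructions is far beyond a short sketch, but the idea of "observing a computing device programmed to do brute-force proofs" can be illustrated with a toy formal system. The sketch below (my choice of example, not one mentioned in the thread) uses Hofstadter's MIU string-rewriting system and breadth-first search over derivations:

```python
from collections import deque

def miu_successors(s):
    """All strings derivable from s by one application of an MIU rule."""
    out = set()
    if s.endswith("I"):                 # Rule 1: xI -> xIU
        out.add(s + "U")
    if s.startswith("M"):               # Rule 2: Mx -> Mxx
        out.add(s + s[1:])
    for i in range(len(s) - 2):         # Rule 3: replace "III" with "U"
        if s[i:i+3] == "III":
            out.add(s[:i] + "U" + s[i+3:])
    for i in range(len(s) - 1):         # Rule 4: delete "UU"
        if s[i:i+2] == "UU":
            out.add(s[:i] + s[i+2:])
    return out

def provable(target, axiom="MI", max_steps=10000):
    """Breadth-first brute-force search for a derivation of target."""
    seen, queue, steps = {axiom}, deque([axiom]), 0
    while queue and steps < max_steps:
        s = queue.popleft()
        if s == target:
            return True
        steps += 1
        for t in miu_successors(s):
            # Crude pruning bound so the search space stays finite.
            if t not in seen and len(t) <= 2 * len(target):
                seen.add(t)
                queue.append(t)
    return False
```

For example, `provable("MIU")` is True (one application of Rule 1), while `provable("MU")` comes back False -- "MU" is the famously underivable string of this system, and within the length bound the search space is exhausted.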
My guess is that mathematicians' intuitions are informed by a pervasive (yet mostly ignored in the casual philosophy of mathematics) habit of "calculating". Calculating means different things to different mathematicians, but computing with concrete numbers (e.g. factoring 1735) certainly counts, and some "mechanical" equation juggling counts. The "surprising utility" of pure mathematics derives directly from information about the real world injected via these intuitions about which results are powerful.
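As a concrete instance of the kind of "calculating" described above, here is a short trial-division sketch of factoring 1735 (the code is my illustration; only the number comes from the comment):

```python
def factorize(n):
    """Trial division: return the prime factors of n in ascending order."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:       # divide out each prime factor fully
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                   # whatever remains is itself prime
        factors.append(n)
    return factors

# The concrete calculation mentioned above:
# factorize(1735) -> [5, 347]
```

The mechanical, checkable character of such a computation is exactly what distinguishes "calculating" from open-ended theorizing.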
This suggests that fields within mathematics that do not do much calculating or other forms of empirical testing might become decoupled from reality and essentially become artistic disciplines, producing tautology after tautology without relevance or utility. I'm not deep enough into mathematical culture to guess how often that happens or to point out any subdisciplines in particular, but a scroll through arxiv makes it look pretty possible: http://arxiv.org/list/math/new
In my perfect world, all mathematical papers would start with pointers or gestures back to the engineering problems that motivated this problem, and end with pointers or gestures toward engineering efforts that might be forwarded by this result.
I don't have any good examples of actual irrelevant/artistic mathematics, but possibly:
"Unipotent Schottky bundles on Riemann surfaces and complex tori" http://arxiv.org/abs/1102.3006
would be an example of how opaque to outsiders (and therefore potentially irrelevant) pure mathematics can get. I'm confident (primarily based on surface features) that this paper in particular isn't self-referential, but I have no clue where it would be applied (cryptography? string theory? really awesome computer graphics?).
...
Why do mathematicians put up with this? I'll need to describe mathematical culture a little first. These days mathematicians are divided into little cliques of perhaps a dozen people who work on the same stuff. All of the papers you write get peer reviewed by your clique. You then make a point of reading what your clique produces and writing papers that cite theirs. Nobody outside the clique is likely to pay much attention to, or be able to easily understand, work done within the clique. Over time people do move between cliques, but this social structure is ubiquitous. Anyone who can't accept it doesn't remain in mathematics.
Among other things, it sounds like you're expecting inferential distances to be short.
My intent was to demonstrate a particular possible threat to the peer review system. As the number of people who can see whether you're grounded in reality gets smaller, the chance of the group becoming an ungrounded mutual admiration society gets larger. I believe one way to improve the peer review system would be to explicitly claim that your work is motivated by some real-world problem and applicable to some real-world solution, and back those claims up with a citation trail for would-be groundedness-auditors to follow.
Actually, there's a vaguely similar preprint: http://arxiv.org/PS_cache/arxiv/pdf/1102/1102.3523v1.pdf
The danger I see is mathematicians endorsing mathematics research because it serves explicitly mathematical goals. It's possible, even moderately likely, that a proof of the Riemann Hypothesis (for example) would be relevant to something outside of mathematics. Still, I'd like us to decide to attack it because we expect it to be useful, not merely because it's difficult and therefore allows us to demonstrate skill.
Why such prejudice against "explicitly mathematical goals"? Why on Earth is this a danger? One way or another, people are going to amuse themselves -- via art, sports, sex, or drugs -- so it might as well be via mathematics, which even the most cynically "hard-headed" will concede is sometimes "useful".
But more fundamentally, the heuristic you're using here ("if I don't see how it's useful, it probably isn't") is wrong. You underestimate the correlation between what mathematicians find interesting and what is useful. Mathematicians are not interested in the Riemann Hypothesis because it may be useful, but the fact that they're interested is significant evidence that it will be.
What mathematics is, as a discipline, is the search for conceptual insights on the most abstract level possible. Its usefulness does not lie in specific ad-hoc "applications" of particular mathematical facts, but rather in the fact that the pursuit of mathematical research over a span of decades to centuries results in humans' possessing a more powerful conceptual vocabulary in terms of which to do science, engineering, philosophy, and everything else.
Mathematicians are the kind of people who would have invented negative numbers on their own because they're a "natural idea", without "needing" them for any "application", back in the day when other people (perhaps their childhood peers) would have seen the idea as nothing but intellectual masturbation. They are people, in other words, whose intuitions about what is "natural" and "interesting" are highly correlated with what later turns out to be useful, even when other people don't believe it and even when they themselves can't predict how.
This is what we see in grant proposals -- and far from changing the status quo, all it does is get the status quo funded by the government.
It's easier to concoct "real-world applications" of almost anything you please than it is to explain the real reason mathematics is useful to the kind of people who ask about "real-world applications".
There's one funny quote I like about partially uniform k-quandles that comes to mind. Somewhat more relevantly, there's also Von Neumann on the danger of losing concrete applications.
I'm not sure that conceptual soundness has any meaning in fields which don't even in principle admit of predictive power or provably correct solutions. It might be possible to imagine a rigorous approach to, say, textual criticism, but in actual practice the work that gets done is approached along aesthetic lines, and the people running humanities departments seem aware of and happy with this.
Of course, this wouldn't apply to the related field of social science, and many of its subfields do seem to fail both of Vladimir's tests.