Will_Newsome comments on Secrets of the eliminati - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
It's more of a tentative starting point of a more thorough analysis, halfway between a question and an assertion. If we wanted to be technical ISTM that we could bring in ideas from coding theory, talk about Kraft's inequality, et cetera, and combine those considerations with info from the heuristics and biases literature, in order to make a decently strong counterargument that certain language choices can reasonably be called irrational or "crazy". Thing is, that counterargument in turn can be countered with appeals to different aspects of human psychology e.g. associative learning and maybe the neuroscience of the default system (default network), and maybe some arguments from algorithmic probability theory (again), so ISTM that it's a somewhat unsettled issue and one where different people might have legitimately different optimal strategies.
Is this a (partial) joke? Do you have some particular reason for not taking these reactions seriously?
Comment reply 2 of 2.
Like,
By the way aren't those hated enemies of Reason so contemptible? Haha! So contemptible! Om nom nom signalling nom contempt nom nom "rationality" nom.
Why do you think this is so important? As far as I can tell, this is not how humanity made progress in the past. Or was it? Did our best scientists and philosophers find or build "various decently-motivated-if-imperfect models of the same process/concept/thing so as to form a constellation of useful perspectives on different facets of it"?
Or do you claim that humanity made progress in the past despite not doing what you suggest, and that we could make much faster progress if we did? If so, what do you base your claim on (besides your intuition)?
This actually seems to me exactly how humanity has made progress - countless fields and paradigms clashing and putting various perspectives on problems and making progress. This is a basic philosophy of science perspective, common to views as dissimilar as Kuhn and Feyerabend. There's no one model that dominates in every field (most models don't even dominate their field; if we look at the ones considered most precise and successful like particle physics or mathematics, we see that various groups don't agree on even methodology, much less data or results).
But I think the individuals who contributed most to progress did so by concentrating on particular models that they found most promising or interesting. The proliferation of models only happens on a social level. Why think that we can improve upon this by consciously trying to "find or build various decently-motivated-if-imperfect models"?
None of that defends the assertion that humanity made progress by following one single model, which is what I was replying to, as shown by a highly specific quote from your post. Try again.
I didn't mean to assert that humanity made progress by following one single model as a whole. As you point out, that is pretty absurd. What I was saying is that humanity made progress by (mostly) having each individual human pursue a single model. (I made a similar point before.)
I took Will's suggestion to be that we, as individuals, should try to pursue many models, even ones that we don't think are most promising, as long as they are "decently motivated". (This is contrary to my intuitions, but not obviously absurd, which is why I wanted to ask Will for his reasons.)
I tried to make my point/question clearer in the rest of the paragraph after the sentence you quoted, but looking back I notice that the last sentence there was missing the phrase "as individuals" and therefore didn't quite serve my purpose.
I think you're looking at later stages of development than I am. By the time Turing came around the thousands-of-years-long effort to formalize computation was mostly over; single models get way too much credit because they herald the triumph at the end of the war. It took many thousands of years to get to the point of Church/Goedel/Turing. I think that regarding justification we haven't even had our Leibniz yet. If you look at Leibniz's work he combined philosophy (monadology), engineering (expanding on Pascal's calculators), cognitive science (alphabet of thought), and symbolic logic, all centered around computation, though at that time there was no such thing as 'computation' as we know it (and now we know it so well that we can use it to listen to music or play chess). Archimedes is a much earlier example but he was less focused. If you look at Darwin he spent the majority of his time as a very good naturalist, paying close attention to lots of details. His model of evolution came later.
With morality we happen to be up quite a few levels of abstraction where 'looking at lots of details' involves paying close attention to themes from evolutionary game theory, microeconomics, theoretical computer science &c. Look at CFAI to see Eliezer drawing on evolution and evolutionary psychology to establish an extremely straightforward view of 'justification', e.g. "Story of a Blob". It's easy to stumble around in a haze and fall off a cliff if you don't have a ton of models like that and more importantly a very good sense of the ways in which they're unsatisfactory.
Those reasons aren't convincing by themselves of course. It'd be nice to have a list of big abstract ideas whose formulation we can study on both the individual and memetic levels. E.g. natural selection and computation, and somewhat smaller less-obviously-analogous ones like general relativity, temperature (there's a book about its invention), or economics. Unfortunately there's a lot of success story selection effects and even looking closely might not be enough to get accurate info. People don't really have introspective access to how they generate ideas.
Side question: how long do you think it would've taken the duo of Leibniz and Pascal to discover algorithmic probability theory if they'd been roommates for eternity?
I think my previous paragraph answered this with representative reasons. This is sort of an odd way to ask the question 'cuz it's mixing levels of abstraction. Intuition is something you get after looking at a lot of history or practicing a skill for a while or whatever. There are a lot of chess puzzles I can solve just using my intuition, but I wouldn't have those intuitions unless I'd spent some time on the object level practicing my tactics. So "besides your intuition" means something like "and please give a fine-grained answer" and not literally "besides your intuition". Anyway, yeah, personal experience plus history of science. I think you can see it in Nesov's comments from back when, e.g. his looking at things like game semantics and abstract interpretation as sources of inspiration.
You're right, and perhaps I should better familiarize myself with earlier intellectual history. Do you have any books you can recommend, on Leibniz for example?
This one perhaps. I haven't read it but feel pretty guilty about that fact. Two FAI-minded people have recommended it to me, though I sort of doubt that they've actually read it either. Ah, the joys and sorrows of hypothetical CliffsNotes.
ETA: I think Vassar is the guy to ask about history of science or really history of anything. It's his fault I'm so interested in history.
Comment reply 1 of 2.
I don't recall attempting to make any (partial) jokes, no. I'm not sure what you're referring to as "these reactions". I'll try to respond to what I think is your (not necessarily explicit) question. I'm sort of responding to everyone in this thread.
When I suspect that a negative judgment of me or some thing(s) associated with me might be objectively correct or well-motivated---when I suspect that I might be objectively unjustified in a way that I hadn't already foreseen, even if it would be "objectively" unreasonable for me/others to expect me to have seen so in advance---well, that causes me to, how should I put it?, "freak out". My omnipresent background fear of being objectively unjustified causes me to actually do things, like update my beliefs, or update my strategy (e.g. by flying to California to volunteer for SingInst), or help people I care about (e.g. by flying back to Tucson on a day's notice if I fear that someone back home might be in danger). This strong fear of being objectively (e.g. reflectively) morally (thus epistemically) antijustified---contemptible, unvirtuous, not awesome, imperfect---has been part of me forever. You can see why I would put an abnormally large amount of effort into becoming a decent "rationalist", and why I would have learned abnormally much, abnormally quickly from my year-long stint as a Visiting Fellow. (Side note: It saddens me that there are no longer any venues for such in-depth rationality training, though admittedly it's hard/impossible for most aspiring rationalists to take advantage of that sort of structure.) You can see why I would take LW's reactions very, very seriously---unless I had some heavyweight ultra-good reasons for laughing at them instead.
(It's worth noting that I can make an incorrect epistemic argument and this doesn't cause me to freak out as long as the moral-epistemic state I was in that caused me to make that argument wasn't "particularly" unjustified. It's possible that I should make myself more afraid of ever being literally wrong, but by default I try not to compound my aversions. Reality's great at doing that without my help.)
"Luckily", judgments of me or my ideas, as made by most humans, tend to be straightforwardly objectively wrong. Obviously this default of dismissal does not extend to judgments made by humans who know me or my ideas well, e.g. my close friends if the matter is moral in nature and/or some SingInst-related people if the matter is epistemic and/or moral in nature. If someone related to SingInst were to respond like Less Wrong did then that would be serious cause for concern, "heavyweight ultra-good reasons" be damned; but such people aren't often wrong and thus did not in fact respond in a manner similar to LW's. Such people know me well enough to know that I am not prone to unreflective stupidity (e.g. prone to unreflective stupidity in the ways that Less Wrong unreflectively interpreted me as being).
If they were like, "The implicit or explicit strategy that motivates you to make comments like that on LW isn't really helping you achieve your goals, you know that right?", then I'd be like, "Burning as much of my credibility as possible with as little splash damage as possible is one of my goals; but yes, I know that half-trolling LW doesn't actually teach them what they need to learn.". But if they responded like LW did, I'd cock an eyebrow, test if they were trolling me, and if not, tell them to bring up Mage: The Ascension or chakras or something next time they were in earshot of Michael Vassar. And if that didn't shake their faith in my stupidity, I'd shrug and start to explain my object-level research questions.
The problem of having to avoid the object-level problems when talking to LW is simple enough. My pedagogy is liable to excessive abstraction, lack of clear motivation, and general vagueness if I can't point out object-level weird slippery ideas in order to demonstrate why it would be stupid to not load your procedural memory with lots and lots of different perspectives on the same thing, or in order to demonstrate the necessity and nature of many other probably-useful procedural skills. This causes people to assume that I'm suggesting certain policies only out of weird aesthetics or a sense of moral duty, when in reality, though aesthetic and moral reasons also count, I'm actually frustrated because I know of many object-level confusions that cannot be dealt with satisfactorily without certain knowledge and fundamental skills, and also can't be dealt with without avoiding many, many, many different errors that even the best LW members are just not yet experienced enough to avoid. And that would be a problem even if my general audience wasn't already primed to interpret my messages as semi-sensical notes-to-self at best. ("General audience", for sadly my intended audience mostly doesn't exist, yet.)
This cleared things up somewhat for me, but not completely. You might consider making a post that explains why your writing style differs from other writing and what you're trying to accomplish (in a style that is more easily understood by other LWers) and then linking to it when people get confused (or just habitually).
I use this strategy playing basketball with my younger cousin. If I win, I win. And if I lose, I wasn't really trying.
This strategy is pretty transparent to Western males with insecurities revolving around zero-sum competitions.
His reason for not taking the reactions seriously is "because he can".
Could you expand on this? Following Wikipedia, Kraft's inequality seems to be saying that if we're translating a message from an alphabet with n symbols to an alphabet with r symbols by means of representing the symbols s_i in the first alphabet by words l_i spelled in the second alphabet, then in order for the message to be uniquely decodable, it must be the case that r^(-|l_1|) + r^(-|l_2|) + ... + r^(-|l_n|) ≤ 1, where |l_i| denotes the length of the word l_i.
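To make the "budget" reading of Kraft's inequality concrete, here is a minimal Python sketch; the function name and the example codeword lengths are my own illustration, not anything from the discussion:

```python
# Kraft's inequality: a uniquely decodable code over an r-symbol
# alphabet with codeword lengths l_1..l_n must satisfy
# sum over i of r^(-l_i) <= 1.

def kraft_sum(lengths, r=2):
    """Sum r^(-l) over the given codeword lengths."""
    return sum(r ** -l for l in lengths)

# A valid binary prefix code, {0, 10, 110, 111}, has lengths 1, 2, 3, 3:
print(kraft_sum([1, 2, 3, 3]))  # 0.5 + 0.25 + 0.125 + 0.125 = 1.0

# Lengths 1, 1, 2 overspend the budget: no binary prefix code exists.
print(kraft_sum([1, 1, 2]))     # 1.25 > 1
```

Shorter codewords eat more of the fixed budget, which is the sense in which they are "expensive".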
However, I don't understand how this is relevant to the question of whether some human choices of language are crazy. For example, when people object to the use of the word God in reference to what they would prefer to call a superintelligence, it's not because they believe that using the word God would somehow violate Kraft's inequality, thereby rendering the intended message ambiguous. There's nothing information-theoretically wrong with the string God; rather, the claim is that that string is already taken to refer to a different concept. Do you agree, or have I misread you?
Hm hm hm, I'm having trouble sorting this out. The full idea I think I failed to correctly reference is that giving certain concepts short "description lengths"---where description length doesn't mean number of letters, but something like semantic familiarity---in your language is equivalent to saying that the concepts signified by those words represent things-in-the-world that show up more often. But really the whole analogy is of course flawed from the start because we need to talk about decision theoretically important things-in-the-world, not probabilistically likely things-in-the-world, though in many cases the latter is the starting point for the former. Like, if we use a language that uses the concept of God a lot but not the concept of superintelligence---and here it's not the length of the strings that matter, but like the semantic length, or like, how easy or hard it is to automatically induce the connotations of the word; and that is the non-obvious and maybe just wrong part of the analogy---then that implies that you think that God shows up more in the world than superintelligence. I was under the impression that one could start talking about the latter using Kraft's inequality but upon closer inspection I'm not sure; what jumped out at me was simply: "More specifically, Kraft's inequality limits the lengths of codewords in a prefix code: if one takes an exponential function of each length, the resulting values must look like a probability mass function. Kraft's inequality can be thought of in terms of a constrained budget to be spent on codewords, with shorter codewords being more expensive." Do you see what I'm trying to get at now with my loose analogy? If so, might you help me reason through or debug the reference?
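The probability reading in the quoted passage ("if one takes an exponential function of each length, the resulting values must look like a probability mass function") can be sketched in Python. The helper names are mine, and "length" here is literal codeword length in bits, not the semantic length the comment is reaching for:

```python
import math

# Codeword lengths l_i implicitly assign each concept the probability
# 2^(-l_i); conversely, a concept of probability p "deserves" a name
# of about -log2(p) bits (the Shannon/Kraft correspondence).

def implicit_probabilities(lengths):
    return [2 ** -l for l in lengths]

def ideal_length(p):
    return -math.log2(p)

# Lengths 1, 2, 3, 3 implicitly claim the four concepts occur with
# probabilities 0.5, 0.25, 0.125, 0.125:
print(implicit_probabilities([1, 2, 3, 3]))

# A concept that comes up an eighth of the time earns a 3-bit name:
print(ideal_length(0.125))  # 3.0
```

On this reading, giving a concept a short name is an implicit bet that the thing it refers to shows up often, which is the claimed link between vocabulary choices and probability assignments.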
Sure. Short words are more expensive because there are fewer of them; because short words are scarce, we want to use them to refer to frequently-used concepts. Is that what you meant? I still don't see how this is relevant to the preceding discussion (see the grandparent).
Also, for clearer communication, you might consider directly saying things like "Short words are more expensive because there are fewer of them" rather than making opaque references to things like Kraft's inequality. Technical jargon is useful insofar as it helps communicate ideas; references that may be appropriate in the context of a technical discussion about information theory may not be appropriate in other contexts.
That's not quite what I mean, no. It's not the length of the words that I actually care about, really, and thus upon reflection it is clear that the analogy is too opaque. What I care about is the choice of which concepts to have set aside as concepts-that-need-little-explanation---"ultimate convergent algorithm for arbitrary superintelligences" here, "God" at some theological hangout---and how that reflects which things-in-the-world one has implicitly claimed are more or less common (but really it'd be too hard to disentangle from things-in-the-world one has implicitly claimed are more or less important). It's the differential "length" of the concepts that I'm trying to talk about. The syntactic length, i.e. the number of letters, doesn't interest me.
Referencing Kraft's inequality was my way of saying "this is the general type of reasoning that I have cached as perhaps relevant to the kind of inquiry it would be useful to do". But I think you're right that it's too opaque to be useful.
Edit: To try to explain the intuition a little more, it's like applying the "scarce short strings" theme to the concepts directly, where the words are just paintbrush handles. That is how I think one might try to argue that language choices can be objectively "irrational" anyway.
I don't think the analogy holds. The reason Kraft's inequality works is that the number of possible strings of length n over a b-symbol alphabet is exactly b^n. This places a bound on the number of short words you can have. Whereas if we're going to talk about the "amount of mental content" we pack into a single "concept-needing-little-explanation," I don't see any analogous bound: I don't see any reason in principle why a mind of arbitrary size couldn't have an arbitrary number of complicated "short" concepts.
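The counting bound appealed to here is easy to make explicit (binary alphabet assumed; the helper functions are illustrative):

```python
# Over a b-symbol alphabet there are exactly b**n strings of length n,
# which is what caps the supply of short words.

def strings_of_length(n, b=2):
    return b ** n

def strings_up_to(n, b=2):
    # Total nonempty strings of length <= n: b + b^2 + ... + b^n.
    return sum(b ** k for k in range(1, n + 1))

print(strings_of_length(3))  # 8 binary strings of length exactly 3
print(strings_up_to(3))      # 14 binary strings of length at most 3
```

This is the bound that holds for strings; the objection is that no analogous hard limit is known for how many "short" concepts a mind can carry.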
For concreteness, consider that in technical disciplines, we often speak and think in terms of "short" concepts that would take a lot of time to explain to outsiders. For example, eigenvalues. The idea of an eigenvalue is "short" in the sense that we treat it as a basic conceptual unit, but "complicated" in the sense that it's built out of a lot of prerequisite knowledge about linear transformations. Why couldn't a mind create an arbitrary number of such conceptual "chunks"? Or if my model of what it means for a concept to be "short" is wrong, then what do you mean?
I note that my thinking here feels confused; this topic may be too advanced for me to discuss sanely.
On top of that there's this whole thing where people are constantly using social game theory to reason about what choice of words does or doesn't count as defecting against local norms, what the consequences would be of failing to punish non-punishers of people who use words in a way that differs from ways that are privileged by social norms, et cetera, which makes a straight-up information-theoretic approach somewhat off-base for even more reasons other than just the straightforward ambiguities imposed by considering implicit utilities as well as probabilities. And that doesn't even mention the heuristics and biases literature or neuroscience, which take the theoretical considerations and laugh at them.
Ah, I'm needlessly reinventing some aspects of the wheel.