Hi. I'm Gareth McCaughan. I've been a consistent reader and occasional commenter since the Overcoming Bias days. My LW username is "gjm" (not "Gjm" despite the wiki software's preference for that capitalization). Elsewhere I generally go by one of "g", "gjm", or "gjm11". The URL listed here is for my website and blog, neither of which has been substantially updated for several years. I live near Cambridge (UK) and work for Hewlett-Packard (who acquired the company that acquired what remained of the small company I used to work for, after they were acquired by someone else). My business cards say "mathematician" but in practice my work is a mixture of simulation, data analysis, algorithm design, software development, problem-solving, and whatever random engineering no one else is doing. I am married and have a daughter born in mid-2006. The best way to contact me is by email: firstname dot lastname at pobox dot com. I am happy to be emailed out of the blue by interesting people. If you are an LW regular you are probably an interesting person in the relevant sense even if you think you aren't.
If you're wondering why some of my very old posts and comments are at surprisingly negative scores, it's because for some time I was the favourite target of old-LW's resident neoreactionary troll, sockpuppeteer and mass-downvoter.
(Brief self-review for LW 2023 review.)
Obviously there's nothing original in my writeup as opposed to the paper it's about. The paper still seems like an important one, though I haven't particularly followed the literature and wouldn't know if it's been refuted or built upon by other later work. In particular, in popular AI discourse one constantly hears things along the lines of "LLMs are just pushing symbols around and don't have any sort of model of the actual world in them", and this paper seems to me to be good evidence that transformer networks, even quite small ones, can build internal models that aren't just symbol-pushing.
There's a substantial error in the post as it stands (corrected in comments, but I never edited the post): I claim that the difference in legal-move-prediction ability between the "network trained on a smallish number of good games" and "network trained on a much larger number of random games" cases arises because the network isn't big enough to capture both "legal" and "good strategy" well, when in fact it seems more likely that the difference is mostly because the random-game training set is so much larger. I'm not sure what the etiquette is around making edits now in such cases.
I'm confused by what you say about italics. Mathematical variables are almost always italicized, so how would italicizing something help to clarify that it isn't a variable?
It seems like a thing that literally[1] everyone does sometimes. "Let's all go out for dinner." "OK, where shall we go?" As soon as you ask that question you're "optimizing for group fun" in some sense. Presumably the question is intending to ask about some more-than-averagely explicit, or more-than-averagely sophisticated, or more-than-averagely effortful, "optimizing for group fun", but to me at least it wasn't very clear what sort of thing it was intending to point at.
[1] Almost literally.
Yeah, I do see the value of keeping things the same across multiple years, which is why I said "might be worth" rather than "would be a good idea" or anything of the sort.
To me, "anti-agathics" specifically suggests drugs or something of the kind. Not so strongly that it's obvious to me that the question isn't interested in other kinds of anti-aging measures, but strongly enough to make it not obvious whether it is or not.
There is arguably a discrepancy between the title of the question "P(Anti-Agathics)" and the actual text of the question; there might be ways of "reaching an age of 1000 years" that I at least wouldn't want to call "anti-agathics". Uploading into a purely virtual existence. Uploading into a robot whose parts can be repaired and replaced ad infinitum. Repeated transfer of consciousness into some sort of biological clones, so that you get a new body when the old one starts to wear out.
My sense is that the first of those is definitely not intended to be covered by the question, and the second probably isn't; I'm not sure about the third. "Magical" options like survival of your immortal soul in a post-mortem heaven or hell, magical resurrection of your body by divine intervention, and reincarnation, are presumably also not intended.
In future years, it might be worth tweaking the wording by e.g. inserting the word "biological" or some wording like "in something that could credibly be claimed to be the body they are now living in". Or some other thing that better matches the actual intent of the question.
I have just noticed something that I think has been kinda unsatisfactory about the probability questions since for ever.
There's a question about the probability of "supernatural events (including God, ghosts, magic, etc.)" having occurred since the beginning of the universe. There's another question about the probability of there being a god.
I notice an inclination to make sure that the first probability is >= the second, for the obvious reason. But, depending on how the first question is interpreted, that may be wrong.
If the existence of a god is considered a "supernatural event since the beginning of the universe" then obviously that's the case. But note that one thing a fair number of people have believed is that a god created the universe and then, so to speak, stepped away and let it run itself. (It would be hard to distinguish such a universe from a purely natural one, but perhaps you could identify some features of the universe that such a being would be more or less likely to build in.) In that case, you could have a universe that (1) was created by a god in which (2) no supernatural events have ever occurred or will ever occur.
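To spell the point out, here is a rough sketch in symbols. The labels are mine and not part of the survey: write G for "there is a god" and S for "supernatural events have occurred since the beginning of the universe".

```latex
\begin{align*}
P(G) &= P(G \wedge S) + P(G \wedge \neg S), \\
P(S) &= P(G \wedge S) + P(\neg G \wedge S), \\
P(S) \ge P(G) &\iff P(\neg G \wedge S) \ge P(G \wedge \neg S).
\end{align*}
```

If the mere existence of a god counts as a supernatural event, then G and not-S can't both hold, the right-hand side of the last line is zero, and the first probability must be at least as large as the second. If it doesn't count, then the deistic case is exactly G and not-S, and anyone who gives it more weight than "no god, but ghosts or magic or the like" can consistently answer with P(G) > P(S).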
The unsatisfactory thing, to be clear, is the ambiguity.
For future years, it might be worth considering either (1) replacing "God" with something like "acts of God" or "divine interventions" in the first question, or (2) adding an explicit clarification that the mere existence of a god should be considered a "supernatural event" in the relevant sense even if what that god did was to make a universe that runs naturally, or (3) tweaking the second question to explicitly exclude "deistic" gods.
"Trauma" meaning psychological as opposed to physical damage goes back to the late 19th century.
I agree that there's a widespread tendency to exaggerate the unpleasantness/harm done by mere words. (But I suggest there's an opposite temptation too, to say that obviously no one can be substantially harmed by mere words, that physical harm is different in kind from mere psychological upset, etc., and that this is also wrong.)
I agree that much of the trans community seems to have embraced what looks to me like a severely hyperbolic view of how much threat trans people are under. (But, usual caveats: it's very common for the situation of a minority group to look and feel much worse from the inside than from the outside, and generally this isn't only a matter of people on the inside being oversensitive, it's also a matter of people on the outside not appreciating how much unpleasantness those on the inside face. So my guess is that that view is less hyperbolic than it looks to me.)
I agree that the term "deadname" is probably popular partly because "using my deadname" has more of an obviously-hostile-move sound than "using my old name" or similar. But if we avoid every term with any spin attached, we'll have to stop calling people liberals (as if no one else cared about freedom) or conservatives (as if their opponents were against preserving valuable things) or Catholics (the word means "universal") or pro-life or pro-choice or, or, or, or. For my part, I avoid some spinny terms but not others, on the basis of gut feeling about how much actual wrongness is baked into them and how easy it is to find other language, which (I don't know how coherently) cashes out as being broadly OK with "liberal" and "conservative", preferring to avoid "pro-life" and "pro-choice" or at least making some snarky remarks about the terms before using them, avoiding the broadest uses of the term "transphobia", etc. And for me "deadname" seems obviously basically OK even though, yes, the term was probably chosen partly for connotations one might take issue with. Your mileage may vary.
I don't think "deadname" is a ridiculous term just because no one died. The idea is that the name is dead: it's not being used any more. Latin is a "dead language" because (roughly speaking) no one speaks or writes in Latin. "James" is a "dead name" because (roughly speaking) no one calls that person "James" any more.
This all seems pretty obvious to me, and evidently it seems the opposite way to you, and both of us are very smart [citation needed], so probably at least one of us is being mindkilled a bit by feeling strongly about some aspect of the issue. I don't claim to know which of us it is :-).
I think you're using "memetic" to mean "of high memetic fitness", and I wish you wouldn't. No one uses "genetic" in that way.
An idea that gets itself copied a lot (either because of "actually good" qualities like internal consistency, doing well at explaining observations, etc., or because of "bad" (or at least irrelevant) ones like memorability, grabbing the emotions, etc.) has high memetic fitness. Similarly, a genetically transmissible trait that tends to lead to its bearers having more surviving offspring with the same trait has high genetic fitness. On the other hand, calling a trait genetic means that it propagates through the genes rather than being taught, formed by the environment, etc., and one could similarly call an idea or practice memetic if it comes about by people learning it from one another rather than (e.g.) being instinctive or a thing that everyone in a particular environment invents out of necessity.
When you say, e.g., "lots of work in that field will be highly memetic despite trash statistics, blatant p-hacking, etc." I am pretty certain you mean "of high memetic fitness" rather than "people aware of it are aware of it because they learned of it from others rather than because it came to them instinctively or they reinvented it spontaneously because it was obvious from what was around them".
(It would be possible, though I'd dislike it, to use "memetic" to mean something like "of high memetic fitness for 'bad' reasons" -- i.e., liable to be popular for the sort of reason that we might not appreciate without the notion of memes. But I don't think that can be your meaning in the words I quoted, which seem to presuppose that the "default" way for a piece of work to be "memetic" is for it to be of high quality.)
Yes, that sounds much more normal to me.
Though in the particular case here, something else seems off: when you write f(x) you would normally italicize both the "f" and the "x", as you can see in the rendering in this very paragraph. I can't think of any situation in actual mathematical writing where you would italicize one and not the other in order to make some distinction between function-names and variable names.
For that matter, I'm not wild about making a distinction between "variables" and "functions". If you write f(x) and also sin(x) then it would be normal for "f" and "x" to be italicized and not "sin". I was going to say that the reason is that f and x are in fact both variables, and it just happens that one of them takes values that are functions, whereas sin is a fixed function and you'll never see anything like "let sin = 3" or "let sin = cos" -- but actually that isn't quite right either, because named mathematical constants like e are usually italicized. I think the actual distinction is that single-letter names-of-things get italicized and multiple-letter ones usually don't.
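For what it's worth, LaTeX's defaults seem to follow the same rule. The snippet below is only a small illustration, assuming a standard article class with the amsmath package loaded; "argmax" is just an arbitrary example of a multi-letter name.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Single-letter names are set in italics by default in math mode,
% whether they denote a variable, a function, or a constant like e.
$f(x)$, $e^{x}$

% Predefined multi-letter function names such as \sin and \cos are set upright.
$\sin(x)$, $\cos(x)$

% Typing the letters directly gives italic s, i, n,
% which reads as the product of three single-letter names.
$sin(x)$

% \operatorname (from amsmath) sets any other multi-letter name upright.
$\operatorname{argmax}(x)$
\end{document}
```

So the software draws the line between single-letter and multiple-letter names, not between variables and functions.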