Quick Takes


In Defence of Jargon

People used to say (maybe still do? I'm not sure) that we should use less jargon to increase the accessibility of writings on LW, i.e. make it easier for outsiders to read.

I think this is mostly a confused take. The underlying problem is inferential distance. Getting rid of the jargon is actually unhelpful, since it hides the fact that there is an inferential distance.

When I want to explain physics to someone and I don't know what they already know, I start by listing relevant physics jargon and asking them which words they know. This i... (read more)

Showing 3 of 12 replies
Steven Byrnes
I’ve seen that in physics vs chemistry vs engineering (I even made a translation guide for some niche topic way back when) but can’t immediately think of good examples related to rationalism or AI alignment.

A couple of terms that I've commented on here recently —

  • "Delmore effect" (of unclear origin) is the same as the "bikeshed effect" (from open-source software, circa 1999) which was itself a renaming of Parkinson's "law of triviality" (1957) — meaning that people spend more effort forming (and fighting over) opinions on the less-important parts of a project because they're easier to understand or lower-stakes.
  • "Stereotype of the stereotype" (newly coined here on LW last month) is the same as "dead unicorn trope" (from TVTropes) — meaning an idea that people s
... (read more)
Kaj_Sotala
CFAR handbook, p. 43 ("further resources" section of the "inner simulator" chapter, which the "murphyjitsu" unit is a part of):

If anyone here happens to be an expert in the combinatorics of graphs, I'd love to have a call to get help on some problems we're trying to work out. The problems aren't quite trivial, but I suspect an expert would pretty straightforwardly know which techniques to apply.

Questions I have include:

  • When to try for an exact expression versus an asymptotic expression
  • Are there approximations people use other than Stirling's?
  • When to use random graph methods
  • When graph automorphisms make a problem effectively unsolvable
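To make the first and last bullets concrete, here is a small worked example (illustrative only, not one of the actual problems being asked about): counting labeled graphs on n vertices is exact and trivial, while counting unlabeled graphs runs straight into automorphisms, and the standard move is an asymptotic answer via Stirling's approximation, using the fact that almost all graphs have no nontrivial automorphisms.

```latex
% Illustrative: exact vs. asymptotic counts of graphs on n vertices
\[
  \#\{\text{labeled graphs on } n \text{ vertices}\} = 2^{\binom{n}{2}},
  \qquad
  \#\{\text{unlabeled graphs on } n \text{ vertices}\} \sim \frac{2^{\binom{n}{2}}}{n!}
  \quad (n \to \infty),
\]
\[
  \text{where Stirling's approximation}\quad
  n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n}
  \quad\text{gives}\quad
  \frac{2^{\binom{n}{2}}}{n!}
  = 2^{\binom{n}{2} - n\log_2 n + n\log_2 e - \frac{1}{2}\log_2(2\pi n) + o(1)}.
\]
```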

There's an upside of conventional education which no one on any side of any debate ever seems to bring up, but which was a major benefit (possibly the major benefit) of my post-primary studies. Namely: it lets students discover what they have a natural aptitude for (or lack thereof) relative to a representative peer group. The most valuable things I learned in my Engineering courses at university were:

  • I'm pretty mediocre at Engineering, especially sub-subjects which aren't strictly Structural and/or Mechanical.

  • In particular, I'm significantly worse than ... (read more)

Showing 3 of 4 replies
Garrett Baker
Why is college a particularly better place to learn this than on-the-job training?

Lower switching costs when you're in the middle of a degree, maybe? You can just take courses in a closely related domain, or work as an assistant in a different lab, in a much more fluid and straightforward manner, versus having to apply to a different job and get through the interviews and pay a significant upfront cost before you even get to the nuts and bolts of stuff.

Garrett Baker
This seems like a mischaracterization of his view. I'm pretty sure he thinks it's wrong to subsidize such signaling mechanisms.

First off, signaling is relative, so if (say) everyone goes to high school and only the very best go to college, from a signaling perspective this is just as useful a signal as everyone going to college and only the very best going to grad school. Therefore we should not spend public dollars getting more people to go to college.

Second, in the signaling framework there are no externalities to schooling kids, so there is no market failure to correct with (say) the government subsidizing the debts of college students.

Third, due to the first point, if any major market failure is present it's the tendency to get into signaling spirals, where the positive signal of (say) a high school education degrades over time, making everyone spend more years and dollars in college getting what was once the same signal as a high school diploma. More years of schooling here is a cost, which everyone would prefer to pay less of. So insofar as there's any case for government involvement it ought to be a tax, not a subsidy.

Slightly hot take: Longtermist capacity/community building is pretty underdone at current margins, and retreats (focused on AI safety, longtermism, or EA) are also underinvested in. By "longtermist community building", I mean longtermist specifically, rather than AI safety. I think retreats are generally underinvested in at the moment. I'm also sympathetic to thinking that general undergrad and high school capacity building (AI safety, longtermist, or EA) is underdone, but this seems less clear-cut.

I think this underinvestment is due to a mix of mistakes on the part of Open Philanth... (read more)

Retreats make things feel much more real to people and result in people being more agentic and approaching their choices more effectively.

Strongly agreed on this point; it's pretty hard to substitute for the effect of being immersed in a social environment like that.

I was fired from my first job out of college, and in retrospect that was a gift. It taught me that new jobs were easy to get (as a programmer in the late 00s) and took away my fear of job hunting, which otherwise would have been enormous. I watched so many programmer friends stay in miserable jobs when they had a plethora of options, because job hunting was too scary. Being fired early rescued me from that.

(this is based on / expanded from a response I wrote to a tweet that was talking about how autistic people struggle in the world because the world follows unwritten rules that are more important than the written ones.)

I think most autistic people should invest more in understanding the unwritten rules. the system can be cruel and unfair, but it's important to know how to interact with it. and it's actually a really interesting system to map out, with its own rhyme and reason.

it's entirely understandable that people feel burned by bad past experiences, and to have ... (read more)

Showing 3 of 6 replies

like imagine if "pter" were a single character: words like helicopter and pterodactyl both contain "pter", but you'd probably think of "helicopter" as an atomic unit with its own unique identity

I often do chunk them, but if you've picked up a bit of taxonomic Greek pter means 'wing', so we have helico-pter 'spiral/rotating wing' and ptero-dactyl 'wing fingers' - both cases where breaking down the name tells you something about what the things are!

Vladimir_Nesov
Inability to put equal effort into everything throughout the day reifies into heuristics about which things get the effort/engagement. In principle, if you are going to spend 2 hours on something, why take it any less seriously/playfully during those 2 hours than anything else, even if you are not planning to put 10,000 hours into it in total? And so you get silly heuristics where you do put 10,000 hours into something, but systematically never do it seriously/playfully, and so never become proficient. It's not enough to be very intelligent to get proficient at moderately complicated things if you systematically avoid learning anything about them.

Fair allocation of effort that ensures progress requires that the silly heuristics of systematic avoidance of effort are not in total control. This can happen naturally if you are lucky enough that your heuristics happen to be less silly, or if you have infinite energy and motivation and really do habitually put similar effort into everything throughout the day. But if that's not the case, it's often possible to take deliberate control of your curiosity and allocate it in a way where any single thing you interact with a nontrivial amount does get a fair portion of effort.

It's an obscure enough principle that I'm not sure many people are practicing it, and so any reports of systematic inability to learn something need to account for this confounder of silly-on-reflection systematic avoidance of (productive) effort towards learning a particular topic, which is not just about the time (let alone discomfort) dedicated to it.
Linda Linsefors
I don't know about other autists, but my primary problem with the neurotypical world isn't that I don't understand it, it's that they don't understand me. It doesn't matter how well I can decode the social norms if I can't also control my involuntary emotional expressions, and also do other things ranging from impossible to unpleasant.

I do understand social white lies. It's not that complicated. But I still find it unpleasant to speak them. When I was younger I got into trouble for literally being unable to utter words like "thanks" and "apology" when I did not mean them. (My native language does not have the ambiguous "sorry".) I am now able to tell white lies, but it makes me feel bad, in a way that has nothing to do with morals. The dissonance is just intrinsically hurtful to my soul, in a way that non-autistic people don't understand and typically don't respect.

Another common thing is that people assume that if I don't succeed in hiding my negative emotion, this is an invitation/request for them to try to help me, and then proceed to try to do that, even though they have zero skill in this. And then they refuse to listen to anything I say, including not leaving me alone when I ask to be left alone.

I don't want to hang out in a space where the norms are set up to be comfortable for people unlike me, at the cost of making it unpleasant for people like me, and then be told that it's a skill issue and I should just learn the rules. I accept that the wider norms will be set up to be good for average people (i.e. not me). I just prefer to not go there.

Many people agree that 'artificial intelligence' is a poor term that is vague and has existing connotations. People use it to refer to a whole range of different technologies.

However, I struggle to come up with any better terminology. If not 'artificial intelligence', what term would be ideal for describing the capabilities of multi-modal tools like Claude, Gemini, and ChatGPT?

I also agree "AI" is overloaded and has existing connotations (ranging from algorithms to applications)! I would think "generative models" or "generative AI" works better (and one can specify "multimodal generative models" if one wants to be super clear), but I'm also curious to see what other people would propose.

winstonBosan
You probably don’t like the term LLM because it doesn’t describe capability. And most models are multimodal these days, so it is not just natural language. You also wouldn’t like the term autoregressive/next-token predictor, again because it says what it does, not what it is capable of. AI is a pretty good term, as overloaded as it is.

Thoughts on how to onboard volunteers/collaborators

Volunteers are super flaky, i.e. they often abandon projects with no warning. I think the main reason for this is the planning fallacy: people promise more than they actually have time for. Best case, they will just be honest about this when they notice. But more typically the person will feel some mix of stress, shame and overwhelm that prevents them from communicating clearly. From the outside this looks like the person promised to do a lot of things and then just ghosted the project. (This is a common pattern and not ... (read more)

Can we train models to be honest without any examples of dishonesty?

The intuition: We may not actually need ground-truth labels about whether or not a model is lying in order to reduce its tendency to lie. Maybe it's enough to know the relative tendencies between two similar samples?

Outline of the approach: For any given Chain of Thought, we don't know if the model was lying or not, but maybe we can construct two variants A and B where one is more likely to be lying than the other. We then reward the one that is more honest relative to the one that... (read more)
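A minimal sketch of how that relative signal could be turned into a training objective, assuming each variant can be given a scalar honesty score (e.g. by a learned reward model); the function name and setup are illustrative, not from the post:

```python
import torch
import torch.nn.functional as F

def pairwise_honesty_loss(score_more_honest: torch.Tensor,
                          score_less_honest: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry-style objective: prefer the variant judged more honest
    over the other using only their relative ordering, with no ground-truth
    honesty label for either sample."""
    return -F.logsigmoid(score_more_honest - score_less_honest).mean()

# Example: scores for a batch of (A, B) CoT pairs, where A was constructed
# to be the more-honest variant of the two.
score_a = torch.randn(4, requires_grad=True)
score_b = torch.randn(4, requires_grad=True)
loss = pairwise_honesty_loss(score_a, score_b)
loss.backward()
```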

I listen to podcasts while doing chores, and often feel like I'm learning something but end up unable to remember anything. So, experiment: I'm going to try writing brief summaries after the fact. I'm going to skip anything where that doesn't feel appropriate, e.g. fiction. By default, nothing here is fact checked, either against reality or against the episode itself.

Planet Money #796 (23 Sep 2017): The Basic Income Experiment

This is a 99% Invisible episode on UBI.

UBI is an idea supported by some on both left and right. Finland is currently trying an exp... (read more)

Showing 3 of 47 replies
philh
99% Invisible: The Titanic's Best Lifeboat

One of the best known things about Titanic is "she only had lifeboats for half her passengers and crew, which is all she was required to at the time". But that's oversimplified.

Initially ships just didn't have lifeboats. They had small boats for ferrying cargo and crew to shore, but not many. Most shipwrecks were near land, but if you went down far out you'd basically just die. Some sailors deliberately didn't learn to swim, to avoid drawing it out.

At some point someone invented the first lifeboat (called an "unimmersible" or something), with cork and air pockets for buoyancy. But sailing ships didn't have room for them, so they had to be launched from shore if there was a wreck.

Then steam ships were invented, and suddenly there's room for the lifeboats. But they're actually pretty shit. Hard to launch, few or no provisions, not great at surviving storms. Multiple occasions of lifeboats just disappearing, sometimes showing up years or decades later with their occupants dead. Once, the women and children were put on lifeboats from a sinking ship, all the boats were capsized or destroyed, and all the women and children died. The ship sank slowly, and the men were rescued the next day.

So the idea in ship design was that the ship would be her own best lifeboat. They'd be designed to sink slowly. And now shipping lanes were busier, and there was radio for distress signals reaching hundreds of miles. A half complement of lifeboats would be plenty, between two ships, to ferry everyone across. Titanic actually had fewer passengers than she was allowed to have, and four more collapsible lifeboats than required.

When she hit the iceberg, there was a bunch of bad luck. Hitting head on would have been mostly fine, but it would probably have killed a bunch of crew. The officer on duty said to kill the engines and try to turn, but that meant it scraped the side and opened 5 out of the 4 compartments that could have been opened with

Related: SS Eastland capsized maybe in part because of extra weight from the new lifeboats.

philh
Fall of Civilizations #19: The Mongols

The Eurasian steppe is big and flat. Not many trees, lots of grass. Supports big herds. Thousands of years ago the ancestor of the horse arrived. At first it was used for meat and milk, then later for riding. The Proto-Indo-Europeans came from the area and spread themselves around a bunch. Much later the Huns came from the area and made Rome sad.

Around the 12th century, the Mongols were one of a bunch of different nomadic groups living in the area. They had a not-so-great bit of territory in the north. They were themselves divided into smaller groups. There was lots of fighting among everyone. Lots of bits where I'm not sure if the people in question are Mongols or not. A lot of the story comes from "The Secret History of the Mongols". Not sure if it's clear how reliable it is.

Temujin's mother fell in love and got married. She and her husband were travelling somewhere when they were attacked. She told her husband to ride away and save himself so he could marry someone else later, and he did. She got captured and taken as someone's second wife.

Temujin grew up with various siblings (all brothers?), including one older half-brother (from his father and his father's first wife). There were some tensions between them. I think it's around now he gets a blood brother.

At some point he rides somewhere else, to stay with another family for a bit? He falls in love with someone there. I think he's something like 14 at this time. Before they can marry, he hears that his father has been killed by relatives of his mother's previous husband, and returns.

His mother and her sons are all abandoned by their tribe and expected to die over winter. But she keeps them alive, and when the tribe comes back they're surprised and a bit disconcerted. At some point he teams up with a brother to kill the eldest brother. The tribe is a bit worried that as he grows he's gonna bear a grudge against them, and he's sent into slavery. He escapes that, g

If the singularity occurs over two years, as opposed to two weeks, then I expect most people will be bored throughout much of it, including me. This is because I don't think one can feel excited for more than a couple weeks. Maybe this is chemical.

Nonetheless, these would be the two most important years in human history. If you ordered all the days in human history by importance/'craziness', most of the top-ranked days would fall within these two years.

So there will be a disconnect between the objective reality and how much excitement I feel.

Showing 3 of 7 replies
Thane Ruthenis
Not necessarily. If humans don't die or end up depowered in the first few weeks of it, it might instead be a continuous high-intensity stress state, because you'll need to be paying attention 24/7 to constant world-upturning developments, frantically figuring out what process/trend/entity you should be hitching your wagon to in order to not be drowned by the ever-rising tide, with the correct choice dynamically changing at an ever-increasing pace. "Not being depowered" would actually make the Singularity experience massively worse in the short term, precisely because you'll be constantly getting access to new tools and opportunities, and it'd be on you to frantically figure out how to make good use of them. The relevant reference class is probably something like "being a high-frequency trader": this is pretty close to how I expect a "slow" takeoff to feel, yep.
S. Alex Bradt
This comment has been tumbling around in my head for a few days now. It seems to be both true and bad. Is there any hope at all that the Singularity could be a pleasant event to live through?

Well, an aligned Singularity would probably be relatively pleasant, since the entities fueling it would consider causing this sort of vast distress a negative and try to avoid it. Indeed, if you trust them not to drown you, there would be no need for this sort of frantic grasping-at-straws.

An unaligned Singularity would probably also be more pleasant, since the entities fueling it would likely try to make it look aligned, with the span of time between the treacherous turn and everyone dying likely being short.

This scenario covers a sort of "neutral-alignme... (read more)

Long have I searched for an intuitive name for motte & bailey that I wouldn't have to explain too much in conversation. I might have finally found it. The "I was merely saying fallacy". Verb: merelysay. Noun: merelysayism. Example: "You said you could cure cancer and now you're merelysaying you help the body fight colon cancer only."

Showing 3 of 5 replies

There are a lot of similar terms, but motte and bailey is a uniquely apt metaphor for describing a specific rhetorical strategy. I think the reason it often feels unhelpful in practice is because it’s unusually unnecessary to be so precise when our goal is just to call out bullshit. I personally like “motte and bailey” quite a bit, but as a tool for my own private thinking rather than as a piece of rhetoric to persuade others with.

Drake Morrison
I would guess something like historical momentum is the reason people keep using it. Nicholas Shackel coined the term in 2005, and then it got popularized in 2014 by SSC. 20 years is a long time for people to be using the term.
sjadler
20 years is a long time, sure, but I don’t think that would be a good argument for keeping it! (I understand you’re likely just describing, not justifying.)

Motte & bailey has a major disadvantage: nobody who hears it for the first time has any understanding of what it means. Even as someone who knows the concept, I’m still not even 100% positive that motte and bailey do in fact mean “overclaim” and “retreat” respectively.

People are welcome to use the terms they want, of course. But I’d think there should be a big difference between M&B and some simpler name in order to justify M&B.

Parent comment for: Why the focus on wise AI advisors?

Chris_Leong
With quotes from: Yoshua Bengio, Oppenheimer, Toby Ord, Edward Wilson, Nick Bostrom, Carl Sagan, Pope Francis, Antonio Guterres, Stephen Hawking

I think that I've historically underrated learning about historical events that happened in the last 30 years, compared to reading about more distant history.

For example, I recently spent time learning about the Bush presidency, and found learning about the Iraq war quite thought-provoking. I found it really easy to learn about things like the foreign policy differences among factions in the Bush admin, because e.g. I already knew the names of most of the actors and their stances are pretty intuitive/easy to understand. But I still found it interesting to ... (read more)

Cole Wyeth
How do you recommend studying recent history?

I don't really have a better suggestion than reading the obvious books. For the Bush presidency, I read/listened to both "Days of Fire", a book by Peter Baker (a well-regarded journalist), and "Decision Points" by Bush. And I watched/listened to a bunch of interviews with various people involved with the admin.

Drake Morrison
I have long thought that I should focus on learning history with a recency bias, since knowing about the approximate present screens off events of the past. 

Is interp easier in worlds where scheming is a problem? 

The key conceptual argument for scheming is that, insofar as future AI systems are decomposable into [goals] + [search], there are many more misaligned goals compatible with low training loss than aligned goals. But if an AI were really so cleanly factorable, we would expect interp / steering to be easier / more effective than on current models (this is the motivation for retargeting the search).

While I don't expect the factorization to be this clean, I do think we should expect interp to be easi... (read more)

It is possible that state tracking could be the next reasoning-tier breakthrough in frontier model capabilities. I believe there is strong evidence in favor of this.

State space models already power the fastest available voice models, such as Cartesia's Sonic (time-to-first-audio advertised as under 40ms). There are examples of SSMs such as Mamba, RWKV, and Titans outperforming transformers in research settings. 

Flagship LLMs are also bad at state tracking, even with RL for summarization. Forcing an explicit... (read more)
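For readers less familiar with SSMs, here is a minimal sketch of the linear recurrence these architectures are built around, which is also why they are natural candidates for state tracking: a fixed-size hidden state is carried explicitly across the sequence. The input-dependent gating/selectivity that makes Mamba competitive is omitted, and all dimensions and matrices are illustrative placeholders.

```python
import numpy as np

# Minimal linear state-space recurrence: h_t = A h_{t-1} + B x_t, y_t = C h_t.
d_state, d_model = 16, 8
A = 0.9 * np.eye(d_state)              # state transition (here: simple decay)
B = np.random.randn(d_state, d_model)  # input -> state projection
C = np.random.randn(d_model, d_state)  # state -> output projection

def ssm_scan(xs: np.ndarray) -> np.ndarray:
    """Carry a fixed-size hidden state across the whole sequence."""
    h = np.zeros(d_state)
    ys = []
    for x in xs:                       # constant memory per step
        h = A @ h + B @ x
        ys.append(C @ h)
    return np.stack(ys)

ys = ssm_scan(np.random.randn(100, d_model))  # 100-step example sequence
```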

You will always oversample from the most annoying members of a class.

This is inspired by recent arguments on twitter about how vegans and poly people "always" bring up those facts. I contend that it's simultaneously true that most vegans and poly people are not judgmental, and that it doesn't matter, because that's not who people remember. Omnivores don't notice the 9 vegans who quietly ordered an unsatisfying salad, only the vegan who brought up factory farming conditions at the table. Vegans who just want to abstain from animal products remember the omn... (read more)

Showing 3 of 16 replies
Ben Pace
Not that this is directly relevant to your thesis comparing different groups today; but I do assume that Judaism had a massive evangelical period in its early growth (e.g. 2,000 years ago) that let it get so big that it could afford to pivot to being less evangelical today.
DirectedEvolution
Most of the critical comments I see on HN involve accusing LW of being a cult, being too stupid to realize people can't be fully rational, or being incredibly arrogant and overconfident about analysis based on ass-numbers and ill-researched personal opinion. I don't see that much engagement with LW arguments around AI specifically.

On Twitter at least, a fair number of the cult allegations seem to be from people (honestly fairly cult-ish themselves) who don't like what LW people say about AI, at least in the threads I'm likely to follow. But I defer to your greater HN expertise!

everyone is a few hops away from everyone else. this applies in both directions: when I meet random people they always have some weak connection to other people I know, but also when I think of a collection of people as a cluster, most specific pairs of people within that cluster barely know each other except through other people in the cluster.

It’s worth noting that, though it’s true that for a sufficiently large cluster most pairs of people are not strongly connected, they are significantly more likely to be connected than in a random graph. This is the high clustering coefficient property of small-world graphs like the social graph.
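A quick way to see the difference, as a hedged illustration (assuming networkx is available; exact numbers vary run to run): compare the average clustering of a Watts-Strogatz small-world graph against an Erdős–Rényi random graph of matched edge density.

```python
import networkx as nx

n, k, p_rewire = 1000, 10, 0.1
small_world = nx.watts_strogatz_graph(n, k, p_rewire)   # small-world model

p_edge = k / (n - 1)                                     # match expected edge density
random_graph = nx.erdos_renyi_graph(n, p_edge)

print(nx.average_clustering(small_world))   # typically ~0.4-0.5
print(nx.average_clustering(random_graph))  # ~ p_edge, i.e. ~0.01
```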

xAI's safety team is 3 people.

Showing 3 of 11 replies
leogao
I want to defend interp as a reasonable thing for one to do for superintelligence alignment, to the extent that one believes there is any object level work of value to do right now. (maybe there isn't, and everyone should go do field building or something. no strong takes rn.)

I've become more pessimistic about the weird alignment theory over time and I think it's doomed just like how most theory work in ML is doomed (and at least ML theorists can test their theories against real NNs, if they so choose! alignment theory has no AGI to test against.)

I don't really buy that interp (specifically ambitious mechinterp, the project of fully understanding exactly how neural networks work down to the last gear) has been that useful for capabilities insights to date. fmpov, the process that produces useful capabilities insights generally operates at a different level of abstraction than mechinterp operates at. I can't talk about current examples for obvious reasons but I can talk about historical ones. with Chinchilla, it fixes a mistake in the Kaplan paper token budget methodology that's obvious in hindsight; momentum and LR decay, which have been around for decades, are based on intuitive arguments from classic convex optimization; transformers came about by reasoning about the shape and trajectory of computers and trying to parallelize things as much as possible. also, a lot of stuff Just Works and nobody knows why.

one analogy that comes to mind is if your goal is to make your country's economy go well, it certainly can't hurt to become really good friends with a random subset of the population to understand everything they do. you'll learn things about how they respond to price changes or whether they'd be more efficient with better healthcare or whatever. but it's probably a much much higher priority for you to understand how economies respond to the interest rate, or tariffs, or job programs, or so on, and you want to think of people as crowds of homo economicus wit

My current view is that alignment theory, if it's the good stuff, should work on deep learning as soon as it comes out; if it doesn't, it's not likely to be useful later unless it helps produce stuff that works on deep learning. Wentworth, Ngo, and Causal Incentives are the main threads that already seem to have achieved this somewhat. SLT and DEC seem potentially relevant.

I'll think about your argument for mechinterp. If it's true that the ratio isn't as catastrophic as I expect it to turn out to be, I do agree that making microscope AI work would be incredible in allowing for empiricism to finally properly inform rich and specific theory.

Cole Wyeth
This seems reasonable. Personally, I’m not that worried about capabilities increases from mech interp; I simply don’t expect it to work very well.

Multiple times have I seen an argument like this:

Imagine a fully materialistic universe strictly following some laws, which are such that no agent from inside the universe is able to fully comprehend them...

(https://www.lesswrong.com/posts/YTmNCEkqvF7ZrnvoR/zombies-substance-dualist-zombies?commentId=iQfr65fKr5nSFriCs)

I wonder if that is possible? For computables, it is always possible to construct a quine (standing for the agent) with arbitrary embedded contents (for the rest of the universe/laws/etc), and it wouldn't even be that large - it only needs to... (read more)
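For concreteness, the quine construction is easy to exhibit; here is a minimal, illustrative Python version in which the embedded string stands in for "the rest of the universe", and the program reproduces its own source exactly:

```python
# The three lines below print themselves exactly (this comment is not part of
# the reproduced source); "laws" stands in for arbitrary embedded content.
laws = "arbitrary description of the rest of the universe"
s = 'laws = "arbitrary description of the rest of the universe"\ns = {!r}\nprint(s.format(s))'
print(s.format(s))
```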

All you need is a bounded universe with laws having complexity greater than can be embedded within that bound, and that premise holds.

You can even have a universe containing agents with unbounded complexity, but laws with infinite complexity describing a universe that only permits agents with finite complexity at any given time.
