The probability for "Garage" hit 99% (Logit 15) at the very first step and stayed flat.
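For anyone wondering how a single logit maps to "99%": it's the gap to the runner-up that matters. A quick illustration (only the logit of 15 is from the actual observation; the other tokens and logits below are made-up placeholders):

```python
import math

# Hypothetical next-token logits: "Garage" at 15, runner-up around 10, rest lower.
logits = {"Garage": 15.0, "Kitchen": 10.0, "Basement": 8.0, "Attic": 7.0}

z = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / z for tok, v in logits.items()}
print(probs["Garage"])  # ~0.99 -- a ~5-point gap over the runner-up already saturates the softmax
```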
Is the problem that these questions are too easy, so the LLM outputs reasoning because that's sometimes helpful, even though in this case it doesn't actually need it?
I'd be curious to see what the results look like if you give it harder questions.
What do we see if we apply interpretability tools to the filler tokens or the repeats of the problem? Can we use the model's internals to better understand why this helps?
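For concreteness, here's roughly the kind of probe I have in mind: a minimal logit-lens pass over the filler-token positions. This is just a sketch assuming a TransformerLens-style setup; the model, prompt, and filler positions are placeholders, not the actual experiment.

```python
# Logit-lens sketch: project the residual stream at each layer through the unembedding
# and look at what the filler-token positions are "predicting" at that depth.
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2")          # placeholder model
prompt = "Q: <problem statement> ... ... ... A:"           # "..." standing in for filler tokens
tokens = model.to_tokens(prompt)
_, cache = model.run_with_cache(tokens)

filler_positions = [5, 6, 7]  # placeholder indices of the filler tokens
for layer in range(model.cfg.n_layers):
    resid = cache["resid_post", layer]                      # [batch, pos, d_model]
    layer_logits = model.unembed(model.ln_final(resid))     # project to vocab
    top = layer_logits[0, filler_positions].argmax(dim=-1)
    print(layer, model.to_str_tokens(top))                  # what each filler slot "wants to say"
```

If the filler positions converge on problem-relevant tokens at intermediate layers, that would be some evidence they're doing useful computation rather than just padding.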
Yeah, I predicted that this would be too hard for models to learn without help, so I'm really curious to see how it managed to do this. Is it using the positional information to randomize which token does which part of the parallel computation, or something else?
I'll have to think about what other weird things I can write about to get my variance up. I won't stop until I'm number one!
While this might be useful to you in particular, I'm not convinced this generalizes very well to the whole US.
(It's also unusual that your median American eats 3500 calories of meat every day, but these numbers should be comparable over time)
I decided to incur the personal cost of not getting vaccinated as a sign of protest.
If you think vaccination is a good idea but forcing people to prove vaccination status is bad, wouldn't it make more sense to get vaccinated but refuse to provide proof to gatekeepers? I'd worry that being unvaccinated sends the wrong signal: that you're anti-vaccine rather than anti-vaccine-mandate.
One thing to consider is food expenditure over time:
People spent twice as much on food in 1950, despite eating out half as much and mostly cooking foods we'd consider extremely cheap today.
As far as I know, they could generally all afford as much food as they wanted.
Unless your great-grandparents were rich, this seems unlikely. People massively underestimate how much worse and more expensive food was in the recent past. In the 1950s, people spent a lot of their time thinking about how to get cheaper food, and ate food we would consider not particularly good (pretty much everything was canned).
Similarly, the best food your mom knows how to make is probably not representative of what your family was eating regularly in your grandparents' generation.
Going to the gym for 20 minutes every other weekend burns more calories than a 9-hour shift cooking food every day.
I think this is incorrect by a huge margin. Working out actually doesn't burn very many additional calories at all.
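A back-of-envelope comparison (every kcal figure here is my own rough assumption, not a number from the thread):

```python
# Rough weekly comparison of the two activities; all per-minute figures are ballpark assumptions.
gym_kcal     = 20 * 8 * 0.5          # 20 min at ~8 kcal/min, once every other week -> ~80 kcal/week
cooking_kcal = 9 * 60 * 1.5 * 7      # 9 h/day of light activity at ~1.5 kcal/min above resting -> ~5700 kcal/week
print(gym_kcal, cooking_kcal)        # the cooking shift wins by a factor of ~70 under these assumptions
```

Even if my per-minute numbers are off by a factor of two in either direction, the daily 9-hour shift dominates an occasional 20-minute workout.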
If you mean the transformer could literally output this as CoT... that's an interesting point. You're right that "I should think about X" will let it think about X at an earlier layer again. This is still lossy, but maybe not as much as I was thinking.
I'm not sure if his approach is actually productive for this, but for the longest time, the standard response to Eliezer's concerns was that they're crazy sci-fi. Now that they're not crazy sci-fi, the response is that they're obvious. Constantly reminding people that his crazy predictions were right (and everyone else was wrong in predictable ways) is a strategy to get people to actually take his future predictions seriously (even though they're obviously crazy sci-fi).