LESSWRONG

Popular Comments

Science Isn't Enough
Book 4 of the Sequences Highlights

While far better than what came before, "science" and the "scientific method" are still crude, inefficient, and inadequate to prevent you from wasting years of effort on doomed research directions.

First Post: When Science Can't Help
Jeremy Gillen · 20h* · 441
Resampling Conserves Redundancy (Approximately)
Alfred Harwood and I were working through this as part of a Dovetail project and unfortunately I think we’ve found a mistake.

The Taylor expansion in Step 2 has the 3rd-order term o(δ³) = (1/6)·[2/(√P[X])³]·(−δ[X])³. This term should disappear as δ[X] goes to zero, but this is only true if √P[X] stays constant. The Γ transformation in Part 1 reduces (most terms of) P[X] and Q[X] at the same rate, so √P[X] decreases at the same rate as δ[X]. So the 2nd-order approximation isn't valid.

For example, we could consider two binary random variables with probability distributions P(X=0) = zp, P(X=1) = 1 − zp and Q(X=0) = zq, Q(X=1) = 1 − zq. If δ[X] = √P(X) − √Q(X), then δ[X] → 0 as z → 0. But consider the third-order term for X = 0, which is

(1/3)·((√Q(0) − √P(0)) / √P(0))³ = (1/3)·((√(zq) − √(zp)) / √(zp))³ = (1/3)·((√q − √p) / √p)³.

This is a constant term which does not vanish as z → 0.

We found a counterexample to the whole theorem (which is what led to us finding this mistake), which has

KL(X₂ → X₁ → Λ′) / max[KL(X₁ → X₂ → Λ), KL(X₂ → X₁ → Λ)] > 10,

and it can be found in this colab. There are some stronger counterexamples at the bottom as well. We used sympy because we were getting occasional floating-point errors with numpy.

Sorry to bring bad news! We're going to keep working on this over the next 7 weeks, so hopefully we'll find a way to prove a looser bound. Please let us know if you find one before us!
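A minimal numeric check of the scaling argument (my own sketch with made-up values p = 0.5 and q = 0.1, not the linked colab): as z shrinks, δ[X] at X = 0 goes to zero while the third-order term stays fixed at (1/3)·((√q − √p)/√p)³.

```python
import math

# Hypothetical values for the binary counterexample: P(X=0) = z*p, Q(X=0) = z*q.
p, q = 0.5, 0.1

for z in (1e-1, 1e-3, 1e-5, 1e-7):
    P0, Q0 = z * p, z * q
    delta = math.sqrt(P0) - math.sqrt(Q0)  # delta[X=0], vanishes as z -> 0
    third_order = (1 / 3) * ((math.sqrt(Q0) - math.sqrt(P0)) / math.sqrt(P0)) ** 3
    print(f"z = {z:.0e}: delta = {delta:.3e}, third-order term = {third_order:.6f}")

# The third-order term equals (1/3)*((sqrt(q)-sqrt(p))/sqrt(p))**3 regardless of z,
# so it never vanishes even though delta does; the 2nd-order truncation breaks down.
```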
Raemon · 1d · 2811
How I Became a 5x Engineer with Claude Code
Your process description sounds right (like, the thing I would aspire to, although I don't consistently do it – in particular, I've identified it'd be good if I did more automated testing, but haven't built that into my flow yet).

But, you don't really spell out the "and, here's why I'm pretty confident this is a 5x improvement." A few months ago I'd have been more open to just buying the "well, I seem to be shipping a lot of complex stuff", but after the METR study's "turns out a lot of devs in our study were wrong and were actually slowed down, not sped up", it seems worth being more skeptical about it. What are the observations that lead you to think you're 5x?

(also, 5x is a somewhat specific number, do you mean more like 'observations suggest it's specifically around 5x' or more like 'it seems like a significant speedup, but I can tell I'm still worse than the '10x' programmers around me, and, idk, eyeballing it as in the middle'?)

(I don't mean this to be like, super critical or judgmental, just want to get a sense of the state of your evidence)
jimmy · 3d · 6227
The Mom Test for AI Extinction Scenarios
> My mom didn’t buy it. “This is all sounding a bit crazy, Taylor,” she said to me. And she’s usually primed to believe whatever I say, because she knows I’m smart.
>
> The problem is that these stories are not believable. True, maybe, but not easy to believe. They fail the “mom test”. Only hyper-logical nerds can believe arguments that sound like sci-fi.

Maybe only hyper-logical nerds can believe arguments that sound like sci-fi, but your mom only has to believe you. The question is whether you are believable, or whether you're "starting to sound a bit crazy, Taylor". That's her sign to you that you need to show that you can appreciate how crazy it sounds and maintain your belief. Because it does sound a bit crazy. It's quite a leap from demonstrated reality, and most of the time people are making such leaps, they're doing fiction/delusion and not actually calling things right in advance. The track record of people saying crazy shit and then insisting "It's not crazy I swear!" isn't good. If instead, you meet her where she's at and admit "Yeah. I know. I wish it was", it hits differently.

I can't remember if I've talked to my mom about it, but if I had to talk to her about it, I'd probably say something like "You hear of the idea that AGI is going to be completely transformative, and will have the power to kill us all? Yeah, that's likely real", and she'd probably say something like "Oh.". That's basically how it went when I told her the world was about to change due to the upcoming pandemic. I didn't "try to persuade her" by giving her arguments that she's supposed to buy, let alone spinning stories about how a bat had a virus and then these researchers genetically modified it to better attack humans. I just told her "Here's what I believe to be true", so that she could prepare. I was open about why it was that I believed it, but the heavy lifting was done by the fact that I genuinely believed it and I came off more like I was trying to share information so that she could prepare than like I was trying to convince her of anything.

In your shoes, besides making sure to acknowledge her point that it sounds crazy, I'd do a lot of genuine curiosity about her perspective. Has she ever experienced something that sounded crazy as fuck, and then turned out to be real? Not as a rhetorical question, just trying to understand where she's coming from. Is she aware of the massive impact drones are having in the war in Ukraine? Has she thought about what it felt like to be warned of the power of nuclear weapons before anyone had seen them demonstrated? These aren't "rhetorical questions", asked as ways of disguising a push for "Then you should stop being so confident!" but as a genuine inquiry. Maybe she has experienced something "crazy" turning out to be real, and noticing will change her mind. Or maybe she hasn't. Or maybe it seems different to her, and learning in what way it seems different will be relevant for continuing towards resolving the disagreement.

Giving people the space to share and examine their perspective without pressure is what allows people to have the experiences that shift views. Maybe she hasn't had the experience of running from a terminator drone, or being outsmarted at every turn, but you could give her that experience -- by pointing out the shared starting point and asking her to imagine where that goes. She'd still have to take you up on that invitation, of course. If I'm wrong about being able to convince my own mom in a single line, it'd be for this reason.
Maybe the idea would freak her out so much that she would be motivated to not understand. I don't think she would, but maybe. And if so, that's a very different kind of problem than one you deal with by making arguments which are "more believable".
488 Welcome to LessWrong!
Ruby, Raemon, RobertM, habryka
6y
76
October Meetup - One Week Late
Fri Oct 24 • Edmonton
AI Safety Law-a-thon: We need more technical AI Safety researchers to join!
Sat Oct 25 • Online
AI Psychosis, with Tim Hua and Adele Lopez
Fri Oct 17 • San Francisco
Seville, Spain – ACX Meetups Everywhere Fall 2025
Fri Oct 17 • Sevilla
eggsyntax · 7h · 146
James Diacoumis
2
Just a short heads-up that although Anthropic found that Sonnet 4.5 is much less sycophantic than its predecessors, I and a number of other people have observed that it engages in 4o-level glazing in a way that I haven't encountered with previous Claude versions ('You're really smart to question that, actually...', that sort of thing). I'm not sure whether Anthropic's tests fail to capture the full scope of Claude behavior, or whether this is related to another factor — most people I talked to who were also experiencing this had the new 'past chats' feature turned on (as did I), and since I turned that off I've seen less sycophancy.
Jan_Kulveit · 14h* · 251
Reuben Adams
1
ACS Research is hiring. We're looking for a mix of polymaths, ML research engineers, and people with great intuitions about how AIs behave to work on macrostrategy and LM psychology. Personally, I hope it could be the Pareto-best option for some of you on the combination of topics to work on, incentives, salary, collaborators, and research environment. Deadline in a few weeks; 1-2 year appointments in Prague, London, or the San Francisco Bay Area. Hiring page with more details: https://acsresearch.org/hiring

Gradual Disempowerment Research Fellow: We're looking for polymaths who can reason about civilizational dynamics. This role comes with a lot of intellectual freedom - it could mean economic modelling, theoretical work on multi-agent dynamics, historical analysis, and more.

LLM Psychology & Sociology Researcher: We want people with a strong intuitive understanding of LLMs to help run empirical studies on topics like LLM introspection and self-conception, LLM social dynamics, and how ideologies spread between AIs.

AI Psychology & Agent Foundations ML Researcher: We need people who can bring technical and methodological rigour, taking high-level ideas about AI psychology and turning them into concrete ML experiments. This could include evaluations, mech interp, post-training, and work with both APIs and open-weight models.
boazbarak · 14h · 220
0
Students are continuing to post lecture notes on the AI safety course, and I am posting videos on YouTube. Students' experiments are also posted with the lecture notes: I've been learning a lot from them!
Cleo Nardo · 2d* · 604
Dmitry Vaintrob, Archimedes, and 7 more
20
What's the Elo rating of optimal chess? I present four methods to estimate the Elo rating of optimal play: (1) comparing optimal play to random play, (2) comparing optimal play to sensible play, (3) extrapolating Elo rating vs draw rates, (4) extrapolating Elo rating vs search depth.

1. Optimal vs Random

Random plays completely random legal moves. Optimal plays perfectly. Let ΔR denote the Elo gap between Random and Optimal. Random's expected score is given by E_Random = P(Random wins) + 0.5 × P(Random draws). This is related to the Elo gap via the formula E_Random = 1/(1 + 10^(ΔR/400)).

First, suppose that chess is a theoretical draw, i.e. neither player can force a win when their opponent plays optimally. From Shannon's analysis of chess, there are ~35 legal moves per position and ~40 moves per game. At each position, assume only 1 move among the 35 legal moves maintains the draw. This gives a lower bound on Random's expected score (and thus an upper bound on the Elo gap). Hence, P(Random accidentally plays an optimal drawing line) ≥ (1/35)^40, and therefore E_Random ≥ 0.5 × (1/35)^40. If instead chess is a forced win for White or Black, the same calculation applies: Random scores (1/35)^40 when playing the winning side and 0 when playing the losing side, giving E_Random ≥ 0.5 × (1/35)^40.

Rearranging the Elo formula: ΔR = 400 × log₁₀((1/E_Random) − 1). Since E_Random ≥ 0.5 × (1/35)^40 ≈ 5 × 10^(-62), the Elo gap between random play and perfect play is at most 24,520 points. Random has an Elo rating of 477 points[1]. Therefore, the Elo rating of Optimal is no more than 24,997 points.

2. Optimal vs Sensible

We can improve the upper bound by comparing Optimal to Sensible, a player who avoids ridiculous moves such as sacrificing a queen without compensation. Assume that there are three sensible moves in each position, and that Sensible plays randomly among sensible moves. Optimal still plays perfectly. Following the same analysis, E_Sensible ≥ 0.5 × (1/3)^40
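A minimal sketch (mine, not from the original post) that simply inverts the Elo formula above for both bounds; because it evaluates 0.5 × (1/35)^40 exactly rather than using the ≈5 × 10^(-62) approximation, its Random figure comes out a few hundred Elo points higher than the 24,520 quoted above.

```python
import math

def elo_gap_upper_bound(expected_score: float) -> float:
    # Invert E = 1 / (1 + 10^(dR/400)) to recover the Elo gap dR.
    return 400 * math.log10(1 / expected_score - 1)

# Lower bounds on the weaker player's expected score, per the argument above:
# hit one optimal line among ~35 legal (or ~3 sensible) moves for ~40 moves,
# with a factor of 0.5 (half a point per draw, or the winning colour half the time).
e_random = 0.5 * (1 / 35) ** 40
e_sensible = 0.5 * (1 / 3) ** 40

gap_random = elo_gap_upper_bound(e_random)
gap_sensible = elo_gap_upper_bound(e_sensible)

print(f"E_Random   >= {e_random:.1e} -> Elo gap <= {gap_random:,.0f}")
print(f"E_Sensible >= {e_sensible:.1e} -> Elo gap <= {gap_sensible:,.0f}")

# Adding Random's ~477 Elo rating to the Random gap gives an upper bound on
# Optimal's rating, roughly the ~25,000 figure above.
print(f"Optimal Elo <= {477 + gap_random:,.0f}")
```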
David James · 12h · 131
StanislavKrym
1
During yesterday's interview, Eliezer didn't give a great reply to Ezra Klein's question, i.e. "why does even a small amount of misalignment lead to human extinction?" I think many people agree with this; still, my goal isn't to criticize EY. Instead, my goal is to find various levels of explanation that have been tested and tend to work for different audiences with various backgrounds. Suggestions? Related:
anaguma · 2d · 372
Cole Wyeth, sjadler, and 6 more
9
Ezra Klein has released a new episode of his show with Yudkowsky today on the topic of X-risk.
Gram Stone · 1d · 271
0
MIRI got Guinan to read IABIED you guys https://www.facebook.com/share/v/1BJZz9MHSM/
109
The "Length" of "Horizons"
Adam Scholl
1h
14
253
Towards a Typology of Strange LLM Chains-of-Thought
1a3orn
4d
20
710 The Company Man
Tomás B.
24d
64
322 Hospitalization: A Review
Logan Riggs
8d
18
652 The Rise of Parasitic AI
Adele Lopez
1mo
173
185 If Anyone Builds It Everyone Dies, a semi-outsider review
dvd
3d
39
253 Towards a Typology of Strange LLM Chains-of-Thought
1a3orn
4d
20
79 Cheap Labour Everywhere
Morpheus
17h
13
197 The Most Common Bad Argument In These Parts
J Bostock
6d
38
222 I take antidepressants. You’re welcome
Elizabeth
7d
8
100 That Mad Olympiad
Tomás B.
2d
3
109 The "Length" of "Horizons"
Adam Scholl
1h
14
336 Global Call for AI Red Lines - Signed by Nobel Laureates, Former Heads of State, and 200+ Prominent Figures
Charbel-Raphaël
25d
27
314 Why you should eat meat - even if you hate factory farming
KatWoods
22d
90
132 Don't Mock Yourself
Algon
4d
14