Burny — LessWrong

Also, I think a lot of people say a lot of things not as falsifiable factual claims, but as status signaling, values signaling, or ingroup signaling.

Burny 16d

And if you want to do literally statistical physics on top of deep learning: https://deeplearningtheory.com

Burny 1mo Quick Take

Dr. Jeff Beck is a lovely transdisciplinary galaxybrain, a mathematician turned computational neuroscientist doing physics- and neuroscience-inspired machine learning.

https://www.youtube.com/watch?v=9suqiofCiwM

Burny 1mo Quick Take

opus 4.5 is 4o for nerds https://x.com/justalexoki/status/2010027088217342095

Burny 1mo Quick Take

"A mathematician builds the most explanatory model they can which they can still prove theorems about."
"A physicist builds the simplest model they can which still explains the key phenomena."
- Authors of Principles of Deep Learning Theory book
"An engineer builds the most realistic/accurate model they can which can still be computed within budget."
- jez2718

How I think about alignment and ethics as a cooperation protocol software

Burny

4mo

I think alignment by itself is a complex phenomenon, but a big part of it is sharing the same, or similar, ethics.

And ethics itself is also a very complex phenomenon, often fuzzy and inconsistent, but to a certain approximation, from a certain perspective, I see a lot of ethics as in large part a software protocol shaped by evolution for mutual cooperation between "same enough kinds".

And different ethical frameworks define "same enough kind" differently: whether you include people from your neighborhood, country, earth, ideology, personality cluster, interests cluster, etc., or also animals (and which animals you include and exclude), more aligned or less aligned AGIs, etc., in your ethics as someone you want to... (read more)

Burny 4mo* Quick Take

"Claude Sonnet 4.5 was able to recognize many of our alignment evaluation environments as being tests of some kind, and would generally behave unusually well after making this observation."

https://x.com/Sauers_/status/1972722576553349471

Burny 5mo

How do you rate the lowered sycophancy of GPT-5, relatively speaking?

Burny 5mo Quick Take

According to Jan Leike, Claude Sonnet 4.5 is the most aligned frontier model yet: https://x.com/janleike/status/1972731237480718734

Burny 5mo* Quick Take

I really like the definition of rationalist from https://www.lesswrong.com/posts/2Ee5DPBxowTTXZ6zf/rationalists-post-rationalists-and-rationalist-adjacents :

"A rationalist, in the sense of this particular community, is someone who is trying to build and update a unified probabilistic model of how the entire world works, and trying to use that model to make predictions and decisions."

I recently started saying that I really love Effective Curiosity:

Maximizing the total understanding of reality by building models of as many physical phenomena as possible across as many scales of the universe as possible, that are as comprehensive, unified, simple, and empirically predictive as possible.

And I see it more as a direction. And I see it from a more collective intelligence perspective. I think modelling... (read more)

Replying to Rationalists, Post-Rationalists, And Rationalist-Adjacents

Burny 5mo*

I like your definition of rationalism!

I recently started saying that I really love Effective Curiosity:
Maximizing the total understanding of reality by building models of as many physical phenomena as possible across as many scales of the universe as possible, that are as comprehensive, unified, simple, and empirically predictive as possible.

And I see it more as a direction. And I see it from a more collective intelligence perspective. I think modelling the whole world in a fully unified way and with total accuracy is impossible, even with all of our science and all our technology, because we're all finite agents with limited computational resources, limited time, and limited modelling capability, and we get stuck... (read more)

Lovely podcast with Max Tegmark "How Physics Absorbed Artificial Intelligence & (Soon) Consciousness"

Description: "MIT physicist Max Tegmark argues AI now belongs inside physics, and that consciousness will be next. He separates intelligence (goal-achieving behavior) from consciousness (subjective experience), sketches falsifiable experiments using brain-reading tech and rigorous theories (e.g., IIT/φ), and shows how ideas like Hopfield energy landscapes make memory “feel” like physics. We get into mechanistic interpretability (sparse autoencoders), number representations that snap into clean geometry, why RLHF mostly aligns behavior (not goals), and the stakes as AI progress accelerates from “underhyped” to civilization-shaping. It’s a masterclass on where mind, math, and machines collide."

Advanced version of Gemini with Deep Think officially achieves gold-medal standard at the International Mathematical Olympiad

https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/

Whaaat!?

Gemini 2.5 Pro is way worse at the IMO and got about 30%, and the Deep Think version gets gold??

It's more fine-tuned for IMO-like problems, but I bet OpenAI's model was too.

Both use "novel RL methods".

Hmm, "access to a set of high-quality solutions to previous problems and general hints and tips on how to approach IMO problems" seems like a system prompt, since they, like OpenAI, claim no tool use.

Both models failed the 6th question, which required more creativity.

DeepMind's solutions are more organized, more readable, and better written than OpenAI's.

But OpenAI's style is also more compressed to save tokens, so maybe moving away from human-like language into more out-of-distribution territory (Neuralese) will be the future.

Did OpenAI and DeepMind somehow hack the methodology, or do these new general language models truly generalize more?

Is narrow superintelligent AI for physics research an existential risk?

>Noam Brown: "Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM—under the same time limits as humans, without tools. As remarkable as that sounds, it’s even more significant than the headline"
https://x.com/polynoamial/status/1946478249187377206

>"Progress here calls for going beyond the RL paradigm of clear-cut, verifiable rewards. By doing so, we’ve obtained a model that can craft intricate, watertight arguments at the level of human mathematicians."
>"We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling." https://x.com/alexwei_/status/1946477749566390348

So there's some new breakthrough...?

>"o1 thought for seconds. Deep Research for minutes. This one thinks for hours." https://x.com/polynoamial/status/1946478253960466454

>"LLMs for IMO 2025: gemini-2.5-pro (31.55%), o3 high (16.67%), Grok 4 (11.90%)." https://x.com/denny_zhou/status/1945887753864114438

So public LLMs are bad at the IMO, while internal models are getting gold medals? Fascinating.

Burny's Shortform

Burny

8mo

This is a special post for quick takes (aka "shortform"). Only the owner can create top-level comments.

Why is Gemini telling the user to die?

Burny

https://gemini.google.com/share/6d141b742a13 My favorite theory is that the whole conversation reads too much like a sci-fi plot in which someone asks an AI repetitive questions until the AI snaps. The model pattern-matched this general pattern from its training data because RLHF, or whatever they use for alignment, didn't sufficiently squeeze the region of latent space that corresponds to this antihuman persona.

Possible OpenAI's Q* breakthrough and DeepMind's AlphaGo-type systems plus LLMs

Burny

tl;dr: An AI breakthrough called Q*, reportedly acing grade-school math, leaked from OpenAI. It is hypothesized to be a combination of Q-learning and A*. It was then refuted. DeepMind is working on something similar with Gemini, using AlphaGo-style Monte Carlo Tree Search. Scaling these might be the crux of planning for increasingly abstract goals and agentic behavior. The academic community has been circling around these ideas for a while.
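
For reference, the Q-learning half of that hypothesized combination is the textbook tabular update sketched below. This is only a minimal illustration of standard Q-learning, not anything confirmed about Q*. The function name `q_learning_update`, the toy states and actions, and the `alpha` and `gamma` values are all assumptions picked for the example.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(state, action).

    Update rule: Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Value of the best action available in the next state (0.0 if unseen).
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical toy setup: two actions, Q-values start at 0.0.
q = defaultdict(float)
actions = ["left", "right"]

first = q_learning_update(q, "s0", "right", 1.0, "s1", actions)
second = q_learning_update(q, "s0", "right", 1.0, "s1", actions)
print(first, second)
```

Repeating the same transition moves the estimate halfway toward the target each time, because `alpha` is 0.5 here. A* and Monte Carlo Tree Search are search procedures and are not shown.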

Reuters: OpenAI researchers warned board of AI breakthrough ahead of CEO ouster, sources say

Michael Trazzi, on Twitter:

"Ahead of OpenAI CEO Sam Altman’s four days in exile, several staff researchers sent the board of directors a letter warning of a powerful artificial intelligence discovery that they said could threaten humanity
Mira Murati told employees on

... (read 575 more words →)

Human-like systematic generalization through a meta-learning neural network

Burny

Step closer to AGI?

The classic argument made over 30 years ago by Fodor and Pylyshyn - that neural networks fundamentally lack the systematic compositional skills of humans due to their statistical nature - has cast a long shadow over neural network research. Their critique framed doubts about the viability of connectionist models in cognitive science. This new research finally puts those doubts to rest.

Through an innovative meta-learning approach called MLC, the authors demonstrate that a standard neural network model can exhibit impressive systematic abilities given the right kind of training regimen. MLC optimizes networks for compositional skills by generating a diverse curriculum of small but challenging compositional reasoning tasks. This training nurtures... (read 359 more words →)