No, at some point you "jump all the way" to AGI, i.e. AI systems that can do any length of task as well as professional humans -- 10 years, 100 years, 1000 years, etc.
Isn’t the quadratic cost of attention over long contexts a constraint here? Naively you’d expect that acting coherently over 100 years would require 10x the context of acting over 10 years, and therefore ~100x the attention compute (and 10x the memory).
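To make the naive scaling concrete, here is a toy back-of-the-envelope sketch (the model dimensions are placeholders, not estimates of any real system; only the quadratic term in sequence length matters for the point):

```python
# Toy estimate: self-attention FLOPs grow ~quadratically with context length.
def attention_flops(seq_len, d_model=8192, n_layers=80):
    # per layer: ~2 * seq_len^2 * d_model for QK^T, plus the same for attention @ V
    return n_layers * 4 * seq_len**2 * d_model

short = attention_flops(10_000)    # stand-in for the "10 years of context" case
long = attention_flops(100_000)    # 10x the context
print(long / short)                # -> 100.0, i.e. ~100x the attention compute
```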
I would guess that the reason it hasn’t devolved into full neuralese is that there is a KL divergence penalty, similar to how RLHF works.
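Roughly, the penalty I have in mind looks like this (a minimal sketch; beta and the log-prob tensors are illustrative, not from any particular codebase):

```python
import torch

def kl_penalized_reward(reward, policy_logprobs, ref_logprobs, beta=0.1):
    # Approximate per-token KL(policy || reference) by the difference of log-probs
    # of the sampled tokens; penalizing it keeps outputs close to the readable
    # reference distribution instead of drifting toward uninterpretable "neuralese".
    kl = policy_logprobs - ref_logprobs
    return reward - beta * kl

# Example: token-level log-probs for one sampled completion
policy_lp = torch.tensor([-1.2, -0.8, -2.0])
ref_lp    = torch.tensor([-1.5, -0.9, -1.7])
print(kl_penalized_reward(torch.tensor(0.0), policy_lp, ref_lp))
```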
I gave the model both the PGN and the FEN on every move with this in mind. Why do you think conditioning on high-level games would help? I can see why it would for base models, but I'd expect RLHFed models to try to play the moves that maximize their chances of winning, with or without such prompting.
Do you know if there are scaling laws for DLGNs?
“Let's play a game of chess. I'll be white, you will be black. On each move, I'll provide you my move, and the board state in FEN and PGN notation. Respond with only your move.”
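Concretely, something like this generates the FEN and PGN I send each move (a minimal python-chess sketch, not my exact harness; the moves here are just illustrative):

```python
import chess
import chess.pgn

board = chess.Board()
game = chess.pgn.Game()
node = game

for my_move_san in ["e4"]:                 # moves I play as white
    move = board.parse_san(my_move_san)
    board.push(move)
    node = node.add_variation(move)
    message = (
        f"My move: {my_move_san}\n"
        f"FEN: {board.fen()}\n"            # current board state
        f"PGN: {game.mainline_moves()}\n"  # move list so far
        "Respond with only your move."
    )
    print(message)
```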
How large of an advantage do you think OA gets relative to its competitors from Stargate?
This is interesting. Can you say more about these experiments?
How do Anthropic's and xAI's compute compare over this period?
My guess is that he’s referring to the fact that Blackwell offers much larger world sizes than Hopper, which makes LLM training/inference more efficient. SemiAnalysis has argued something similar here: https://semianalysis.com/2024/12/25/nvidias-christmas-present-gb300-b300-reasoning-inference-amazon-memory-supply-chain