All of Thane Ruthenis's Comments + Replies

Fair enough, I suppose calling it an outright wrapper was an oversimplification. It still basically sounds like just the sum of the current offerings.

Noam Brown: "These aren't the raw CoTs but it's a big step closer."

I think that's a pretty plausible way the world could be, yes.

I still expect the Singularity somewhere in the 2030s, even under that model.

3Cole Wyeth
I think things will be "interesting" by 2045 in one way or another - so it sounds like our disagreement is small on a log scale :) 

GPT-5 ought to come around September 20, 2028, but Altman said today it'll be out within months

I don't think what he said meant what you think it meant. Exact words:

In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3

The "GPT-5" he's talking about is not the next generation of GPT-4, not an even bigger pretrained LLM. It is some wrapper over GPT-4.5/Orion, their reasoning models, and their agent models. My interpretation is that "GPT-5" the product and GPT-5 the hypothetical 100x-bigger GPT mo... (read more)

4Julian Bradshaw
It's unclear exactly what the product GPT-5 will be, but according to OpenAI's Chief Product Officer today it's not merely a router between GPT-4.5/o3.

followed by the promised ‘GPT-5’ within months that Altman says is smarter than he is

Ha, good catch. I don't think GPT-5 the future pretrained LLM 100x the size of GPT-4 which Altman says will be smarter than he is, and GPT-5 the Frankensteinian monstrosity into which OpenAI plan to stitch all of their current offerings, are the same entity.

Edit: It occurred to me that this reading may not be obvious. Look at Altman's Exact Words:

In both ChatGPT and our API, we will release GPT-5 as a system that integrates a lot of our technology, including o3. We will no

... (read more)

Technology for efficient human uploading. Ideally backed by theory we can independently verify as correct and doing what it's intended to do (rather than e. g. replacing the human upload with a copy of the AGI who developed this technology).

What are the current best theories regarding why Altman is doing the for-profit transition at all?

I don't buy the getting-rid-of-profit-caps motive. It seems to live in a fantasy universe in which they expect to have a controllable ASI on their hands, yet... still be bound by economic rules and regulations? If OpenAI controls an ASI, OpenAI's leadership would be able to unilaterally decide where the resources go, regardless of what various contracts and laws say. If the profit caps are there but Altman wants to reward loyal investors, all profits will go t... (read more)

6Noosphere89
The reason is to get OpenAI critical funding: right now, AI products are huge money losers, and if they want their company to survive, they need funding until AI can become a stable source of profit.

If OpenAI controls an ASI, OpenAI's leadership would be able to unilaterally decide where the resources go, regardless of what various contracts and laws say. If the profit caps are there but Altman wants to reward loyal investors, all profits will go to his cronies. If the profit caps are gone but Altman is feeling altruistic, he'll throw the investors a modest fraction of the gains and distribute the rest however he sees fit. The legal structure doesn't matter; what matters is who physically types what commands into the ASI control terminal.

Sama knows this but the investors he is courting don't, and I imagine he's not keen to enlighten them.

jbash127

Altman might be thinking in terms of ASI (a) existing and (b) holding all meaningful power in the world. All the people he's trying to get money from are thinking in terms of AGI limited enough that it and its owners could be brought to heel by the legal system.

I do think RL works "as intended" to some extent, teaching models some actual reasoning skills, much like SSL works "as intended" to some extent, chiseling-in some generalizable knowledge. The question is to what extent it's one or the other.

Some more evidence that whatever the AI progress on benchmarks is measuring, it's likely not measuring what you think it's measuring:

AIME I 2025: A Cautionary Tale About Math Benchmarks and Data Contamination

AIME 2025 part I was conducted yesterday, and the scores of some language models are available here: 
https://matharena.ai thanks to @mbalunovic, @ni_jovanovic et al.

I have to say I was impressed, as I predicted the smaller distilled models would crash and burn, but they actually scored at a reasonable 25-50%.

That was surprising to me! Since these

... (read more)
4Noosphere89
To be fair, non-reasoning models do much worse on these questions, even when it's likely that the same training data has already been given to GPT-4. Now, I could believe that RL is more or less working as an elicitation method, which plausibly explains the results, though still it's interesting to see why they get much better scores, even with very similar training data.

One way to make this work is to just not consider your driven-to-madness future self an authority on the matter of what's good or not. You can expect to start wishing for death, and still take actions that would lead you to this state, because present!you thinks that existing in a state of wishing for death is better than not existing at all.

I think that's perfectly coherent.

3dr_s
I mean, I guess it's technically coherent, but it also sounds kind of insane. That way Dormammu lies. Why would one even care about their future self if they're so unconcerned about that self's preferences?

No, I mean, I think some people actually hold that any existence is better than non-existence, so death is -inf for them and existence, even in any kind of hellscape, is above-zero utility.

2dr_s
I just think any such people lack imagination. I am 100% confident there exists an amount of suffering that would have them wish for death instead; they simply can't conceive of it.

I think "enforce NAP then give everyone a giant pile of resources to do whatever they want with" is a reasonable first-approximation idea regarding what to do with ASI, and it sounds consistent with Altman's words.

But I don't believe that he's actually going to do that, so I think it's just (3).

  • They are not necessarily "seeing" -inf in the way you or me are. They're just kinda not thinking about it, or think that 0 (death) is the lowest utility can realistically go.
  • What looks like an S-risk to you or me may not count as -inf for some people.
2dr_s
True, but that's just for relatively "mild" S-risks like "a dystopia in which AI rules the world, sees all, and electrocutes anyone who commits a crime by the standards of the year it was created in, forever". It's a bad outcome, you could classify it as an S-risk, but it's still among the most aligned AIs imaginable and relatively better than extinction. I simply don't think many people think about what an S-risk literally worse than extinction would look like. To be fair, I also think these aren't very likely outcomes, as they would require an AI very aligned to human values - if aligned for evil.
6Aristotelis Kostelenos
I think humanity's actions right now are most comparable to those of a drug addict. We as a species don't have the necessary equivalent of executive function and self-control to abstain from racing towards AGI. And if we're gonna do it anyway, those that shout about how we're all gonna die just ruin everyone's mood.

Good point regarding GPT-"4.5". I guess I shouldn't have assumed that everyone else has also read Nesov's analyses and immediately (accurately) classified them as correct.

that is a surprising amount of information

Is it? What of this is new?

To my eyes, the only remotely new thing is the admission that "there’s a lot of research still to get to [a coding agent]".

gwern147

The estimate of the compute of their largest version ever (which is a very helpful way to phrase it) at only <=50x GPT-4 is quite relevant to many discussions (props to Nesov) and something Altman probably shouldn't've said.

The estimate of test-time compute at 1000x effective-compute is confirmation of looser talk.

The scientific research part is of uncertain importance but we may well be referring back to this statement a year from now.

7ryan_greenblatt
It's just surprising that Sam is willing to say/confirm all of this given that AI companies normally at least try to be secretive.

You're not accounting for enemy action. They couldn't have been sure, at the outset, how successful the AI Notkilleveryoneism faction would be at raising alarm, and in general, how blatant the risks would become to outsiders as capabilities progressed. And they have been intimately familiar with the relevant discussions, after all.

So they might've overcorrected, and considered that the "strategic middle ground" would be to admit the risk is plausible (but not as certain as the "doomers" say), rather than to deny it (which they might've expected to become a delusional-looking position in the future, so not a PR-friendly stance to take).

Or, at least, I think this could've been a relevant factor there.

Eli Tyre1516

My model is that Sam Altman regarded the EA world as a memetic threat, early on, and took actions to defuse that threat by paying lip service / taking openphil money / hiring prominent AI safety people for AI safety teams.

Like, possibly the EAs could have created a widespread vibe that building AGI is a cartoon evil thing to do, sort of the way many people think of working for a tobacco company or an oil company.

Then, after ChatGPT, OpenAI was a much bigger fish than the EAs or the rationalists, and he began taking moves to extricate himself from them.

Any chance you can post (or PM me) the three problems AIs have already beaten?

This really depends on the definition of "smarter". There is a valid sense in which Stockfish is "smarter" than any human. Likewise, there are many valid senses in which GPT-4 is "smarter" than some humans, and some valid senses in which GPT-4 is "smarter" than all humans (e. g., at token prediction). There will be senses in which GPT-5 will be "smarter" than a bigger fraction of humans compared to GPT-4, perhaps being smarter than Sam Altman under a bigger set of possible definitions of "smarter".

Will that actually mean anything? Who knows.

By playing with... (read more)

If anyone is game for creating an agentic research scaffold like that Thane describes

Here's the basic structure in a bit more detail, as envisioned after 5 minutes' thought (a rough code sketch follows the list):

  • You feed a research prompt to the "Outer Loop" of a model, maybe have a back-and-forth fleshing out the details.
  • The Outer Loop decomposes the research into several promising research directions/parallel subproblems.
  • Each research direction/subproblem is handed off to a "Subagent" instance of the model.
  • Each Subagent runs search queries on the web and analyses the results, up to the limits of it
... (read more)
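For concreteness, here's a minimal code sketch of that structure. Everything in it is a hypothetical stand-in: `call_llm` and `web_search` are placeholders for whatever model API and search tool an actual implementation would use, and the prompts are illustrative only.

```python
# A minimal sketch of the Outer Loop / Subagent structure described above.
# `call_llm` and `web_search` are hypothetical stand-ins for whatever model
# API and search tool an actual implementation would use.
from concurrent.futures import ThreadPoolExecutor


def call_llm(prompt: str) -> str:
    """Placeholder for a call to the underlying model."""
    raise NotImplementedError


def web_search(query: str) -> str:
    """Placeholder for a web-search tool returning result snippets."""
    raise NotImplementedError


def subagent(subproblem: str, max_queries: int = 5) -> str:
    """One Subagent instance: searches the web, analyses results, writes a report."""
    notes: list[str] = []
    for _ in range(max_queries):
        query = call_llm(
            f"Subproblem: {subproblem}\nNotes so far: {notes}\n"
            "Propose the single most useful next search query, or reply DONE."
        )
        if query.strip() == "DONE":
            break
        notes.append(call_llm(
            f"Analyse these search results for '{subproblem}':\n{web_search(query)}"
        ))
    return call_llm(f"Write a report on '{subproblem}' from these notes:\n{notes}")


def outer_loop(research_prompt: str, n_directions: int = 4) -> str:
    """The Outer Loop: decompose, dispatch Subagents in parallel, synthesize."""
    plan = call_llm(
        f"Decompose this research prompt into {n_directions} parallel "
        f"subproblems, one per line:\n{research_prompt}"
    )
    subproblems = [line for line in plan.splitlines() if line.strip()][:n_directions]
    with ThreadPoolExecutor() as pool:
        reports = list(pool.map(subagent, subproblems))
    return call_llm(
        f"Synthesize a final report for '{research_prompt}' from:\n\n" + "\n\n".join(reports)
    )
```

The interesting design questions (how the Outer Loop critiques Subagent reports, when it spawns new directions, how it merges them) are exactly the parts elided here.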

I really wonder how much of the perceived performance improvement comes from agent-y training, as opposed to not sabotaging the format of the model's answer + letting it do multi-turn search.

Compared to most other LLMs, Deep Research is able to generate reports up to 10k words, which is very helpful for providing comprehensive-feeling summaries. And unlike Google's analogue, it's not been fine-tuned to follow some awful deliberately shallow report template[1].

In other words: I wonder how much of Deep Research's capability can be replicated by putting Claud... (read more)

2Nathan Helm-Burger
Worth taking model wrapper products into account. For example: * Perplexity * You.com
7Seth Herd
Good questions. I don't have much of a guess about whether this is discernably "smarter" than Claude or Gemini would be in how it understands and integrates sources. If anyone is game for creating an agentic research scaffold like that Thane describes, I'd love to help design it and/or to know about the results.

I very much agree with that limitation on Google's deep research. It only accepts a short request for the report, and it doesn't seem like it can (at least easily) get much more in-depth than the default short gloss. But that doesn't mean the model isn't capable of it.

Along those lines, Notebook LM has similar limitations on its summary podcasts, and I did tinker with that enough to get some more satisfying results. Using keywords like "for an expert audience already familiar with all of the terms and concepts in these sources" and "technical details" did definitely bend it in a more useful direction. There I felt I ran into limitations on the core intelligence/expertise of the system; it wasn't going to get important but fine distinctions in alignment research. Hearing its summaries was sort of helpful in that new phrasings and strange ideas (when prompted to "think creatively") can be a useful new-thought-generation aid.

We shall see whether o3 has more raw reasoning ability that it can apply to really doing expert-level research.

Ah, I think I see my error.

Consider a CoT of the form "[problem statement]: ...........", which always outputs dots regardless of the problem statement. Can that still encode useful computations, with the information being passed through intermediate self-attention? Yes, but: this only increases parallel-computing capabilities, not serial-computing capabilities.

And in the framework I'd outlined, under the Paraphraser, for the purposes of scheme-y thinking, even a legible CoT would consist of such "filler" tokens.

So yep, you're right, this does force all scheming to be taking place within one forward pass.

I don't see why it would be harder to do.

Suppose you tasked a model with refactoring some database, and it secretly wants to sabotage its functionality. It outputs a CoT segment that goes, "I need to design unit tests for this module". Each word of this CoT is mapped to some internal representation, and the internal representation consists of "on-task" human-intended reasoning plus misaligned scheming (corresponding to, e. g., a tactical decision to fail to include unit tests for some subtle issue). "I" is mapped to one such combined representation, "need" ... (read more)

4Daniel Kokotajlo
I don't think I understand. Let's work through an example.

The AI is being told to write good code, and wants to write subtly-vulnerable code instead. If it just were to honestly try to write good code, it would take (let's say) 100 serial steps of reasoning to succeed. To write subtly-vulnerable code, let's say, requires 300 serial steps of reasoning. The model's strategy is to write subtly-vulnerable code but in a way that looks like it's trying to write good code and just getting a bit unlucky and taking longer than usual (let's suppose the humans have a poor ability to judge how long it's supposed to take, but still some ability, such that it can't take e.g. 10x longer without arousing suspicion).

If there's no paraphraser, this is 'easy.' I mean it's still difficult but it's 'easy' in the sense that the AI can basically just use a code -- each nugget of reasoning might have 10 different semantically equivalent expressions, or maybe even 100, so it can 'add extra layers of meaning' to the nuggets -- in addition to their meaning on the surface-level i'm-trying-to-write-good-code reading, they can have a hidden meaning having to do with the security-vulnerability-related properties of the code. And it can make a chain of reasoning that is basically about the security stuff while looking like a reasonable but unlucky attempt to write good code.

But a paraphraser wrecks that strategy. The model can still add extra layers of meaning to the nuggets, but now it has loads of trouble making the chain of thought look like an honest attempt to write good code. To use your example, if 'I should generate unit tests now' and all semantically equivalent variants of that idea are assigned the hidden meaning 'I should check to see if the vulnerability is too easy to spot' for example, then... what is it going to write next? It's too constrained; in order to look non-suspicious there is a limited set of options it can follow up with... ok, hmm, I'm at the limits of my understanding here.
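To make the "use a code" idea concrete, here's a toy illustration of how the choice among semantically equivalent phrasings could carry hidden bits, and why random re-paraphrasing destroys that channel. The nuggets, variants, and payloads below are all made up for illustration.

```python
import random

# Each reasoning "nugget" has several semantically equivalent surface forms.
# A schemer can use WHICH variant it emits as a covert channel: the variant
# index encodes hidden bits on top of the innocent surface meaning.
NUGGET_VARIANTS = {
    "write_tests": ["I need to design unit tests for this module.",
                    "Next, I should add unit tests here.",
                    "Let me write some tests for this module now."],
    "refactor_db": ["Time to refactor the database layer.",
                    "I'll restructure the database access code.",
                    "Now I should clean up the database module."],
}


def encode(nugget: str, hidden_bits: int) -> str:
    """Emit the variant whose index encodes the hidden payload."""
    variants = NUGGET_VARIANTS[nugget]
    return variants[hidden_bits % len(variants)]


def decode(nugget: str, text: str) -> int:
    """A later forward pass recovers the payload from which variant was used."""
    return NUGGET_VARIANTS[nugget].index(text)


def paraphrase(nugget: str, text: str) -> str:
    """The Paraphraser: re-renders the same surface meaning with a random variant,
    destroying the correlation between variant choice and hidden payload."""
    return random.choice(NUGGET_VARIANTS[nugget])


# Without a paraphraser the channel works; with one, decoding becomes noise.
msg = encode("write_tests", hidden_bits=2)
assert decode("write_tests", msg) == 2
noisy = paraphrase("write_tests", msg)
print(decode("write_tests", noisy))  # uniformly random over {0, 1, 2}
```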

Some thoughts on protecting against LLM steganography.

tl;dr: I suspect the Paraphraser idea is near-completely ineffectual as you scale the capabilities, both for scheming and for general CoT interpretability.

In the reasoning-as-communication framework, the amount of information a fixed-length message/CoT can transfer is upper-bounded by information-theoretic constraints. That is, if we have a ten-bit message, it cannot communicate more than ten bits of information: it cannot always uniquely specify one internal state of an AI model from more than ... (read more)
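One way to write down the bound being invoked (my notation, not the comment's): let $X$ be the model's internal state, $M$ an $n$-token CoT over vocabulary $V$, and $P$ a paraphraser whose randomness is independent of $X$.

```latex
% Bound on what an n-token CoT can carry about the internal state X
% (V is the token vocabulary; P is a paraphraser independent of X).
\[
  I(X; M) \;\le\; H(M) \;\le\; n \log_2 |V| \ \text{bits}
\]
\[
  I\bigl(X; P(M)\bigr) \;\le\; I(X; M)
  \qquad \text{(data-processing inequality, } X \to M \to P(M)\text{)}
\]
```

The paraphraser can only shrink the usable channel down to whatever information survives paraphrasing; how much that still is, at scale, is what the rest of the comment is about.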

4Daniel Kokotajlo
I like to distinguish between "Subtle strategic bias" and "Steganography." Paraphrasers help (and maybe solve?) the latter, but do nothing about the former. Example of subtle strategic bias: The model knows it's misaligned and therefore deliberately sandbags on the alignment research you are asking it to do. "I'm sorry sir I just am not making much progress," it says. You look at the CoT, and it's just doing a bunch of sensible but fruitless lines of reasoning. Example of steganography: The model needs to insert a subtle security vulnerability into the code it is writing -- but it doesn't know one right away, it has to invent one. And that requires lots of sequential thought. So it needs to try out different ideas, vet them, critique them, refine them, etc. Can't do that in a single forward pass. So it needs to pass information forward to its future self somehow... without humans reading the CoT knowing that's what it's doing. THIS is the sort of thing that is much harder to do with a paraphraser than without a paraphraser.

Such an entity as described could absolutely be an existential threat to humanity

I agree. I think you don't even need most of the stuff on the "superhuman" list, the equivalent of a competent IQ-130 human upload probably does it, as long as it has the speed + self-copying advantages.

The problem with this neat picture is reward-hacking. This process wouldn't optimize for better performance on fuzzy tasks, it would optimize for performance on fuzzy tasks that looks better to the underlying model. And much like RLHF doesn't scale to superintelligence, this doesn't scale to superhuman fuzzy-task performance.

It can improve the performance a bit. But once you ramp up the optimization pressure, "better performance" and "looks like better performance" would decouple from each other and the model would train itself into idiosyncratic uselessne... (read more)

4Nathan Helm-Burger
Well... One problem here is that a model could be superhuman at: * thinking speed * math * programming * flight simulators * self-replication * cyberattacks * strategy games * acquiring and regurgitating relevant information from science articles And be merely high-human-level at: * persuasion * deception * real world strategic planning * manipulating robotic actuators * developing weapons (e.g. bioweapons) * wetlab work * research * acquiring resources * avoiding government detection of its illicit activities Such an entity as described could absolutely be an existential threat to humanity. It doesn't need to be superhuman at literally everything to be superhuman enough that we don't stand a chance if it decides to kill us. So I feel like "RL may not work for everything, and will almost certainly work substantially better for easy to verify subjects" is... not so reassuring.
5CBiddulph
Yeah, it's possible that CoT training unlocks reward hacking in a way that wasn't previously possible. This could be mitigated at least somewhat by continuing to train the reward function online, and letting the reward function use CoT too (like OpenAI's "deliberative alignment" but more general). I think a better analogy than martial arts would be writing. I don't have a lot of experience with writing fiction, so I wouldn't be very good at it, but I do have a decent ability to tell good fiction from bad fiction. If I practiced writing fiction for a year, I think I'd be a lot better at it by the end, even if I never showed it to anyone else to critique. Generally, evaluation is easier than generation. Martial arts is different because it involves putting your body in OOD situations that you are probably pretty poor at evaluating, whereas "looking at a page of fiction" is a situation that I (and LLMs) are much more familiar with.

The above toy model assumed that we're picking one signal at a time, and that each such "signal" specifies the intended behavior for all organs simultaneously...

... But you're right that the underlying assumption there was that the set of possible desired behaviors is discrete (i. e., that X in "kidneys do X" is a discrete variable, not a vector of reals). That might've indeed assumed me straight out of the space of reasonable toy models for biological signals, oops.

Yeah but if something is in the general circulation (bloodstream), then it’s going everywhere in the body. I don’t think there’s any way to specifically direct it.

The point wouldn't be to direct it, but to have different mixtures of chemicals (and timings) to mean different things to different organs.

Loose analogy: Suppose that the intended body behaviors ("kidneys do X, heart does Y, brain does Z" for all combinations of X, Y, Z) are latent features, basic chemical substances and timings are components of the input vector, and there are dramatically more ... (read more)

4johnswentworth
Just because the number of almost-orthogonal vectors in d dimensions scales exponentially with d, doesn't mean one can choose all those signals independently. We can still only choose d real-valued signals at a time (assuming away the sort of tricks by which one encodes two real numbers in a single real number, which seems unlikely to happen naturally in the body). So "more intended behaviors than input-vector components" just isn't an option, unless you're exploiting some kind of low-information-density in the desired behaviors (like e.g. very "sparse activation" of the desired behaviors, or discreteness of the desired behaviors to a limited extent).
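A quick numerical illustration of both halves of this point (a sketch, with arbitrary dimensions): random directions in d-dimensional space are nearly orthogonal, so far more than d of them fit, yet any such collection still has rank at most d, so only d real-valued signals can be set independently at once.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 1000, 5000   # dimension, number of candidate "signal" directions (k >> d)

# k random unit vectors in R^d: pairwise inner products concentrate around 0
# (scale ~ 1/sqrt(d)), i.e. far more than d almost-orthogonal directions exist.
V = rng.standard_normal((k, d))
V /= np.linalg.norm(V, axis=1, keepdims=True)
cross = V[:500] @ V[500:1000].T   # a sample of pairwise inner products
print("typical |inner product|:", np.abs(cross).mean())   # ~0.025 for d=1000

# ...but the collection still spans at most d dimensions: only d real-valued
# signals can be chosen independently at any given moment.
print("rank of 2000 of the vectors:", np.linalg.matrix_rank(V[:2000]))   # -> 1000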

Have you looked at samples of CoT of o1, o3, deepseek, etc. solving hard math problems?

Certainly (experimenting with r1's CoTs right now, in fact). I agree that they're not doing the brute-force stuff I mentioned; that was just me outlining a scenario in which a system "technically" clears the bar you'd outlined, yet I end up unmoved (I don't want to end up goalpost-moving).

Though neither are they being "strategic" in the way I expect they'd need to be in order to productively use a billion-token CoT.

Anyhow, this is nice, because I do expect that probably

... (read more)

Not for math benchmarks. Here's one way it can "cheat" at them: suppose that the CoT would involve the model generating candidate proofs/derivations, then running an internal (learned, not hard-coded) proof verifier on them, and either rejecting the candidate proof and trying to generate a new one, or outputting it. We know that this is possible, since we know that proof verifiers can be compactly specified.

This wouldn't actually show "agency" and strategic thinking of the kinds that might generalize to open-ended domains and "true" long-horizon tasks. In ... (read more)
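A minimal sketch of the guess-and-verify loop being described; the two inner functions are placeholders for what would, in this scenario, be learned behavior expressed inside the CoT, not real code from any model.

```python
from typing import Optional


def propose_candidate(problem: str, attempt: int) -> str:
    """Placeholder: the model's candidate proof/derivation generator (learned, possibly crude)."""
    raise NotImplementedError


def verify(problem: str, candidate: str) -> bool:
    """Placeholder: a learned proof/derivation checker run inside the CoT."""
    raise NotImplementedError


def solve_by_guess_and_check(problem: str, budget: int) -> Optional[str]:
    """Spend serial CoT tokens on propose-verify cycles until a candidate checks out."""
    for attempt in range(budget):
        candidate = propose_candidate(problem, attempt)
        if verify(problem, candidate):
            return candidate
    return None
```

Nothing in this loop requires strategic, open-ended planning; it just converts extra serial compute into more verified guesses, which is why high benchmark scores obtained this way wouldn't settle the question at issue.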

6Daniel Kokotajlo
Have you looked at samples of CoT of o1, o3, deepseek, etc. solving hard math problems? I feel like a few examples have been shown & they seem to involve qualitative thinking, not just brute-force-proof-search (though of course they show lots of failed attempts and backtracking -- just like a human thought-chain would). Anyhow, this is nice, because I do expect that probably something like this milestone will be reached before AGI (though I'm not sure)

Prooobably ~simultaneously, but I can maybe see it coming earlier and in a way that isn't wholly convincing to me. In particular, it would still be a fixed-length task; much longer-length than what the contemporary models can reliably manage today, but still hackable using poorly-generalizing "agency templates" instead of fully general "compact generators of agenty behavior" (which I speculate humans to have and RL'd LLMs not to). It would be some evidence in favor of "AI can accelerate AI R&D", but not necessarily "LLMs trained via SSL+RL are AGI-comp... (read more)

6Daniel Kokotajlo
OK. Next question: Suppose that next year we get a nice result showing that there is a model with serial inference-time scaling across e.g. MATH + FrontierMath + IMO problems. Recall that FrontierMath and IMO are subdivided into different difficulty levels; suppose that this model can be given e.g. 10 tokens of CoT, 100, 1000, 10,000, etc. and then somewhere around the billion-serial-token-level it starts solving a decent chunk of the "medium" FrontierMath problems (but not all) and at the million-serial-token level it was only solving MATH + some easy IMO problems. Would this count, for you?

I wish we had something to bet on better than "inventing a new field of science,"

I've thought of one potential observable that is concrete, should be relatively low-capability, and should provoke a strong update towards your model for me:

If there is an AI model such that the complexity of R&D problems it can solve (1) scales basically boundlessly with the amount of serial compute provided to it (or to a "research fleet" based on it), (2) scales much faster with serial compute than with parallel compute, and (3) the required amount of human attention ("... (read more)

6Daniel Kokotajlo
Nice. What about "Daniel Kokotajlo can feed it his docs about some prosaic ML alignment agenda (e.g. the faithful CoT stuff) and then it can autonomously go off and implement the agenda and come back to him with a writeup of the results and takeaways. While working on this, it gets to check in with Daniel once a day for a brief 20-minute chat conversation." Does that seem to you like it'll come earlier, or later, than the milestone you describe?

Well, he didn't do it yet either, did he? His new announcement is, likewise, just that: an announcement. Manifold is still 35% on him not following through on it, for example.

"Intensely competent and creative", basically, maybe with a side of "obsessed" (with whatever they're cracked at).

Supposedly Trump announced that back in October, so it should already be priced in.

(Here's my attempt at making sense of it, for what it's worth.)

2Garrett Baker
Politicians announce all sorts of things on the campaign trail; that usually is not much indication of what post-election policy will be.
kwiat.dev102

Trump "announces" a lot of things. It doesn't matter until he actually does them.

Here's a potential interpretation of the market's apparent strange reaction to DeepSeek-R1 (shorting Nvidia).

I don't fully endorse this explanation, and the shorting may or may not have actually been due to Trump's tariffs + insider trading, rather than DeepSeek-R1. But I see a world in which reacting this way to R1 arguably makes sense, and I don't think it's an altogether implausible world.

If I recall correctly, the amount of money globally spent on inference dwarfs the amount of money spent on training. Most economic entities are not AGI labs training n... (read more)

8Vladimir_Nesov
The bet that "makes sense" is that quality of Claude 3.6 Sonnet, GPT-4o and DeepSeek-V3 is the best that we're going to get in the next 2-3 years, and DeepSeek-V3 gets it much cheaper (less active parameters, smaller margins from open weights), also "suggesting" that quality is compute-insensitive in a large range, so there is no benefit from more compute per token. But if quality instead improves soon (including by training DeepSeek-V3 architecture on GPT-4o compute), and that improvement either makes it necessary to use more compute per token, or motivates using inference for more tokens even with models that have the same active parameter count (as in Jevons paradox), that argument doesn't work. Also, the ceiling of quality at the possible scaling slowdown point depends on efficiency of training (compute multiplier) applied to the largest training system that the AI economics will support (maybe 5-15 GW without almost-AGI), and improved efficiency of DeepSeek-V3 raises that ceiling.

I don't have deep expertise in the subject, but I'm inclined to concur with the people saying that the widely broadcast signals don't actually represent one consistent thing, despite your plausible argument to the contrary.

Here's a Scott Alexander post speculating why that might be the case. In short: there was an optimization pressure towards making internal biological signals very difficult to decode, because easily decodable signals were easy targets for parasites evolving to exploit them. As a result, the actual signals are probably represented as "un... (read more)

8Steven Byrnes
Yeah but if something is in the general circulation (bloodstream), then it’s going everywhere in the body. I don’t think there’s any way to specifically direct it. …Except in the time domain, to a limited extent. For example, in rats, tonic oxytocin in the bloodstream controls natriuresis, while pulsed oxytocin in the bloodstream controls lactation and birth. The kidney puts a low-pass filter on its oxytocin detection system, and the mammary glands & uterus put a high-pass filter, so to speak.
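A toy signal-processing sketch of that low-pass/high-pass picture (all numbers and filter choices are mine, purely illustrative): the same circulating signal gets read very differently depending on which filter the receiving tissue applies.

```python
import numpy as np

t = np.linspace(0, 100, 1000)                  # arbitrary time units, dt ~= 0.1
tonic  = np.full_like(t, 5.0)                  # steady baseline level
pulsed = ((t % 10) < 0.2) * 20.0               # brief spikes every 10 time units

def low_pass(signal, window=100):
    """Kidney-style readout: slow moving average, tracks the tonic level."""
    return np.convolve(signal, np.ones(window) / window, mode="same")

def high_pass(signal, window=100):
    """Mammary/uterus-style readout: responds only to fast deviations."""
    return signal - low_pass(signal, window)

interior = slice(100, -100)                    # ignore filter edge artifacts
for name, sig in [("tonic", tonic), ("pulsed", pulsed)]:
    print(f"{name:6s} low-pass mean: {low_pass(sig)[interior].mean():5.2f}   "
          f"high-pass peak: {np.abs(high_pass(sig)[interior]).max():5.2f}")
# tonic  -> large low-pass reading, ~zero high-pass reading
# pulsed -> small low-pass reading, large high-pass reading
```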

Hmm. This does have the feel of gesturing at something important, but I don't see it clearly yet...

Free association: geometric rationality.

MIRI's old results argue that "corrigibility via uncertainty regarding the utility function" doesn't work, because if the agent maximizes expected utility anyway, it doesn't matter one whit whether we're taking expectation over actions or over utility functions. However, the corrigibility-via-instrumental-goals does have the feel of "make the agent uncertain regarding what goals it will want to pursue next". Is there, t... (read more)

However, the corrigibility-via-instrumental-goals does have the feel of "make the agent uncertain regarding what goals it will want to pursue next".

That's an element, but not the central piece. The central piece (in the subagents frame) is about acting-as-though there are other subagents in the environment which are also working toward your terminal goal, so you want to avoid messing them up.

The "uncertainty regarding the utility function" enters here mainly when we invoke instrumental convergence, in hopes that the subagent can "act as though other subage... (read more)

Coming back to this in the wake of DeepSeek r1...

I don't think the cumulative compute multiplier since GPT-4 is that high, I'm guessing 3x, except perhaps for DeepSeek-V3, which wasn't trained compute optimally and didn't use a lot of compute, and so it remains unknown what happens if its recipe is used compute optimally with more compute.

How did DeepSeek accidentally happen to invest precisely the amount of compute into V3 and r1 that would get them into the capability region of GPT-4/o1, despite using training methods that clearly have wildly different r... (read more)

5Vladimir_Nesov
Selection effect. If DeepSeek-V2.5 was this good, we would be talking about it instead.

Original GPT-4 is 2e25 FLOPs and compute optimal, V3 is about 5e24 FLOPs and overtrained (400 tokens/parameter, about 10x-20x), so a compute optimal model with the same architecture would only need about 3e24 FLOPs of raw compute[1]. Original GPT-4 was trained in 2022 on A100s and needed a lot of them, while in 2024 it could be trained on 8K H100s in BF16. DeepSeek-V3 is trained in FP8, doubling the FLOP/s, so the FLOPs of original GPT-4 could be produced in FP8 by mere 4K H100s. DeepSeek-V3 was trained on 2K H800s, whose performance is about that of 1.5K H100s. So the cost only has to differ by about 3x, not 20x, when comparing a compute optimal variant of DeepSeek-V3 with original GPT-4, using the same hardware and training with the same floating point precision.

The relevant comparison is with GPT-4o though, not original GPT-4. Since GPT-4o was trained in late 2023 or early 2024, there were 30K H100s clusters around, which makes 8e25 FLOPs of raw compute plausible (assuming it's in BF16). It might be overtrained, so make that 4e25 FLOPs for a compute optimal model with the same architecture. Thus when comparing architectures alone, GPT-4o probably uses about 15x more compute than DeepSeek-V3.

Returns on compute are logarithmic though, advantage of a $150 billion training system over a $150 million one is merely twice that of $150 billion over $5 billion or $5 billion over $150 million. Restrictions on access to compute can only be overcome with 30x compute multipliers, and at least DeepSeek-V3 is going to be reproduced using the big compute of US training systems shortly, so that advantage is already gone.

[1] That is, raw utilized compute. I'm assuming the same compute utilization for all models.
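For anyone who wants to follow the arithmetic, here's a worked version. Every input is the comment's own estimate (not an independently verified figure), and the hardware comparison assumes equal training time and utilization.

```python
# Worked arithmetic behind the estimates in the comment above; every input
# number is the comment's own guess, not an independently verified figure.

gpt4_flops       = 2e25   # original GPT-4, assumed compute optimal
v3_flops         = 5e24   # DeepSeek-V3 as trained (overtrained, ~400 tokens/param)
v3_optimal_flops = 3e24   # compute-optimal model with the V3 architecture
gpt4o_raw_flops  = 8e25   # plausible raw compute for GPT-4o (BF16, ~30K H100s)
gpt4o_opt_flops  = 4e25   # compute-optimal model with the GPT-4o architecture

print("original GPT-4 vs compute-optimal V3, raw FLOPs:",
      round(gpt4_flops / v3_optimal_flops, 1), "x")          # ~6.7x

# Hardware framing: GPT-4's FLOPs in FP8 ~ 4K H100s; V3 used 2K H800s ~ 1.5K H100s.
print("hardware cost ratio:", round(4_000 / 1_500, 1), "x")  # ~2.7x, i.e. ~3x not 20x

print("GPT-4o vs compute-optimal V3 (architectures alone):",
      round(gpt4o_opt_flops / v3_optimal_flops, 1), "x")     # ~13x, i.e. roughly 15x
```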
4Noosphere89
I buy that 1 and 4 are the case, combined with DeepSeek probably being satisfied that GPT-4-level models were achieved. Edit: I did not mean to imply that the GPT-4ish neighbourhood is where LLM pretraining plateaus at all, @Thane Ruthenis.

Hm, that's not what it looks like to me? Look at the 6-month plot:

The latest growth spurt started January 14th, well before the Stargate news went public, and this seems like just its straight-line continuation. The graph past the Stargate announcement doesn't look "special" at all. Arguably, we can interpret it as Stargate being part of the reference class of events that are currently driving GE Vernova up – not an outside-context event as such, but still relevant...

Except for the fact that the market indeed did not respond to Stargate immediately. Which ... (read more)

Overall, I think that the new Stargate numbers published may (call it 40%) be true, but I also think there is a decent chance this is new administration Trump-esque propaganda/bluster (call it 45%),

I think it's definitely bluster, the question is how much of a done deal it is to turn this bluster into at least $100 billion.

I don't think this changes the prior expected path of datacenter investment at all. It's precisely how the expected path was going to look; the only change is how relatively high-profile/flashy this is being. (Like, if they invest $3... (read more)

Thane RuthenisΩ10232

So, Project Stargate. Is it real, or is it another "Sam Altman wants $7 trillion"? Some points:

  • The USG invested nothing in it. Some news outlets are being misleading about this. Trump just stood next to them and looked pretty, maybe indicated he'd cut some red tape. It is not an "AI Manhattan Project", at least, as of now.
  • Elon Musk claims that they don't have the money and that SoftBank (stated to have "financial responsibility" in the announcement) has less than $10 billion secured. If true, while this doesn't mean they can't secure an order of magnitude
... (read more)
6Robert Cousineau
Here is what I posted on "Quotes from the Stargate Press Conference":

On Stargate as a whole: This is a restatement with a somewhat different org structure of the prior OpenAI/Microsoft data center investment/partnership, announced early last year (admittedly for $100b).

Elon Musk states they do not have anywhere near the 500 billion pledged actually secured. I do take this as somewhat reasonable, given the partners involved just barely have $125 billion available to invest like this on a short timeline: Microsoft has around 78 billion cash on hand at a market cap of around 3.2 trillion. SoftBank has 32 billion dollars cash on hand, with a total market cap of 87 billion. Oracle has around 12 billion cash on hand, with a market cap of around 500 billion. OpenAI has raised a total of 18 billion, at a valuation of 160 billion.

Further, OpenAI and Microsoft seem to be distancing themselves somewhat - initially this was just an OpenAI/Microsoft project, and now it involves two others and Microsoft just put out a release saying "This new agreement also includes changes to the exclusivity on new capacity, moving to a model where Microsoft has a right of first refusal (ROFR)."

Overall, I think that the new Stargate numbers published may (call it 40%) be true, but I also think there is a decent chance this is new administration Trump-esque propaganda/bluster (call it 45%), and little change from the prior expected path of datacenter investment (which I do believe is unintentional AINotKillEveryone-ism in the near future).

Edit: Satya Nadella was just asked about how funding looks for Stargate, and said "Microsoft is good for investing 80b." This 80b number is the same number Microsoft has been saying repeatedly.

McLaughlin announced 2025-01-13 that he had joined OpenAI

He gets onboarded only on January 28th, for clarity.

gwern100

My point there is that he was talking to the reasoning team pre-hiring (forget 'onboarding', who knows what that means), so they would be unable to tell him most things - including if they have a better reason than 'faith in divine benevolence' to think that 'more RL does fix it'.

I think GPT-4 to o3 represent non-incremental narrow progress, but only, at best, incremental general progress.

(It's possible that o3 does "unlock" transfer learning, or that o4 will do that, etc., but we've seen no indication of that so far.)

I'm sure he wants hype, but he doesn't want high expectations that are very quickly falsified

There's a possibility that this was a clown attack on OpenAI instead...

Valid, I was split on whether it's worth posting vs. it'd be just me taking my part in spreading this nonsense. But it'd seemed to me that a lot of people, including LW regulars, might've been fooled, so I erred on the side of posting.

As I'd said, I think he's right about the o-series' theoretic potential. I don't think there is, as of yet, any actual indication that this potential has already been harnessed, and therefore that it works as well as the theory predicts. (And of course, the o-series scaling quickly at math is probably not even an omnicide threat. There's an argument for why it might be – that the performance boost will transfer to arbitrary domains – but that doesn't seem to be happening. I guess we'll see once o3 is public.)

The thing is - last time I heard about OpenAI rumors it was Strawberry. 

That was part of my reasoning as well, why I thought it might be worth engaging with!

But I don't think this is the same case. Strawberry/Q* was being leaked-about from more reputable sources, and it was concurrent with dramatic events (the coup) that were definitely happening.

In this case, all evidence we have is these 2-3 accounts shitposting.

6Alexander Gietelink Oldenziel
Thanks.  Well 2-3 shitposters and one gwern.  Who would be so foolish to short gwern? Gwern the farsighted, gwern the prophet, gwern for whom entropy is nought, gwern augurious augustus
Thane Ruthenis*Ω371208

Alright, so I've been following the latest OpenAI Twitter freakout, and here's some urgent information about the latest closed-doors developments that I've managed to piece together:

  • Following OpenAI Twitter freakouts is a colossal, utterly pointless waste of your time and you shouldn't do it ever.
  • If you saw this comment of Gwern's going around and were incredibly alarmed, you should probably undo the associated update regarding AI timelines (at least partially, see below).
  • OpenAI may be running some galaxy-brained psyops nowadays.

Here's the sequence of even... (read more)

1Hzn
I think super human AI is inherently very easy. I can't comment on the reliability of those accounts. But the technical claims seem plausible.
7Alexander Gietelink Oldenziel
Thanks for the sleuthing.   The thing is - last time I heard about OpenAI rumors it was Strawberry.  The unfortunate fact of life is that too many times OpenAI shipping has surpassed all but the wildest speculations.
9Seth Herd
Thanks for doing this so I didn't have to! Hell is other people - on social media. And it's an immense time-sink. Zvi is the man for saving the rest of us vast amounts of time and sanity.

I'd guess the psyop spun out of control with a couple of opportunistic posters pretending they had inside information, and that's why Sam had to say lower your expectations 100x. I'm sure he wants hype, but he doesn't want high expectations that are very quickly falsified. That would lead to some very negative stories about OpenAI's prospects; even if they're equally silly, they'd harm investment hype.
4Joseph Miller
I feel like for the same reasons, this shortform is kind of an engaging waste of my time. One reason I read LessWrong is to avoid twitter garbage.
avturchin122

It all started from Sam's six-word story. So it looks like organized hype.

4Cervera
I don't think any of that invalidates that Gwern is, as usual, usually right.

I personally put a relatively high probability of this being a galaxy brained media psyop by OpenAI/Sam Altman.  

Eliezer makes a very good point that confusion around people claiming AI advances/whistleblowing benefits OpenAI significantly, and Sam Altman has a history of making galaxy brained political plays (attempting to get Helen fired (and then winning), testifying to Congress that it is good he has oversight via the board and that he should not be in full control of OpenAI, and then replacing the board with underlings, etc.).

Sam is very smart and politically capable.  This feels in character. 

If you're wondering why OAers are suddenly weirdly, almost euphorically, optimistic on Twitter

For clarity, which OAers this is talking about, precisely? There's a cluster of guys – e. g. this, this, this – claiming to be OpenAI insiders. That cluster went absolutely bananas the last few days, claiming ASI achieved internally/will be in a few weeks, alluding to an unexpected breakthrough that has OpenAI researchers themselves scared. But none of them, as far as I can tell, have any proof that they're OpenAI insiders.

On the contrary: the Satoshi guy straight... (read more)

Thane Ruthenis*Ω1234-1

Current take on the implications of "GPT-4b micro": Very powerful, very cool, ~zero progress to AGI, ~zero existential risk. Cheers.

First, the gist of it appears to be:

OpenAI’s new model, called GPT-4b micro, was trained to suggest ways to re-engineer the protein factors to increase their function. According to OpenAI, researchers used the model’s suggestions to change two of the Yamanaka factors to be more than 50 times as effective—at least according to some preliminary measures.

The model was trained on examples of protein sequences from many species, as

... (read more)
5Logan Riggs
For those also curious, Yamanaka factors are specific genes that turn specialized cells (e.g. skin, hair) into induced pluripotent stem cells (iPSCs) which can turn into any other type of cell. This is a big deal because you can generate lots of stem cells to make full organs[1] or reverse aging (maybe? they say you just turn the cell back younger, not all the way to stem cells).

You can also do better disease modeling/drug testing: if you get skin cells from someone w/ a genetic kidney disease, you can turn those cells into iPSCs, then into kidney cells which will exhibit the same kidney disease because it's genetic. You can then better understand how the [kidney disease] develops and how various drugs affect it.

So, it's good to have ways to produce lots of these iPSCs. According to the article, SOTA was <1% of cells converted into iPSCs, whereas the GPT suggestions caused a 50x improvement to 33% of cells converted. That's quite huge, so hopefully this result gets verified. I would guess this is true and still a big deal, but concurrent work got similar results.

Too bad about the tumors. Turns out iPSCs are so good at turning into other cells that they can turn into infinite cells (ie cancer). iPSCs were used to fix spinal cord injuries (in mice) which looked successful for 112 days, but then a follow-up study said [a different set of mice also w/ spinal iPSCs] resulted in tumors. My current understanding is this is caused by the method of delivering these genes (ie the Yamanaka factors) through a retrovirus, which I'd guess is the method Retro Biosciences uses.

I also really loved the story of how Yamanaka discovered iPSCs:

[1] These organs would have the same genetics as the person who supplied the [skin/hair cells] so risk of rejection would be lower (I think)