Quick takes by DAL
[-]DAL

If AI executives really are as bullish as they say they are on progress, then why are they willing to raise money anywhere in the ballpark of current valuations? 

Dario Amodei suggested the other day that AI will take over all or nearly all coding work within months.  Given that software is a multi-trillion dollar industry, how can you possibly square that statement with agreeing to raise money at a valuation for Anthropic in the mere tens of billions?  And that's setting aside any other value whatsoever for AI.

The whole thing sort of reminds me of the Nigerian prince scam (i.e., the Nigerian prince is coming into an inheritance of tens of millions of dollars but desperately needs a few thousand bucks to claim it, and will cut you in for incredible profit as a result) just scaled up a few orders of magnitude.  Anthropic/OpenAI are on the cusp of technologies worth many trillions of dollars, but they're so desperate for a couple billion bucks to get there that they'll sell off big equity stakes at valuations that do not remotely reflect that supposedly certain future value.  

If the companies need capital - and I believe that they do - what better option do they have?

I think you’re imagining cash-rich companies choosing to sell portions for dubious reasons, when they could just keep it all for themselves.

But in fact, the companies are burning cash, and to continue operating they need to raise at some valuation, or else not be able to afford the next big training run.

The valuations at which they are raising are, roughly, where supply and demand equilibrate for the amounts of cash that they need in order to continue operating. (Possibly they could raise at higher valuations by taking on less-scrupulous investors, but to date I believe some of the companies have tried to avoid this.)

[-]DAL

I don't doubt they need capital.  And the Nigerian prince who needs $5,000 to claim the $100 million inheritance does too.  It's the fact that he/they can't get capital at something coming anywhere close to the claimed value that's suspicious.

Amodei is forecasting AI that writes 90% of code in three to six months, according to his recent comments.  Is Anthropic really burning cash so fast that they can't wait a quarter, demonstrate to investors that AI has essentially solved software, and then raise at 10x the valuation?

Is Amodei forecasting that, in 3 to 6 months, AI will produce 90% of the value derived from written code, or just that AI will produce 90% of code, by volume? It would not surprise me if 90% of new "art" (defined as non-photographic, non-graph images) by volume is currently AI-generated, and I would not be surprised to see the same thing happen with code.

And in the same way that "AI produces 90% of art-like images" is not the same thing as "AI has solved art", I expect "AI produces 90% of new lines of code" is not the same thing as "AI has solved software".
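To make that gap concrete, here's a toy calculation with completely made-up numbers, just to show how "share of lines written" and "share of value produced" can come apart:

```python
# Toy numbers only: AI writes most of the lines, humans keep the
# hardest, highest-value slice, and the two shares come apart.
lines = {"boilerplate": 900, "hard_core_logic": 100}        # lines written
value_per_line = {"boilerplate": 1, "hard_core_logic": 50}  # relative value
ai_share = {"boilerplate": 0.95, "hard_core_logic": 0.40}   # fraction AI-written

ai_lines = sum(lines[k] * ai_share[k] for k in lines)
total_lines = sum(lines.values())

ai_value = sum(lines[k] * value_per_line[k] * ai_share[k] for k in lines)
total_value = sum(lines[k] * value_per_line[k] for k in lines)

print(f"AI share of lines written: {ai_lines / total_lines:.0%}")  # ~90%
print(f"AI share of value created: {ai_value / total_value:.0%}")  # ~48%
```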

[-]DAL

Yea, fair enough.  His prediction was: "I think we will be there in three to six months, where AI is writing 90% of the code. And then, in 12 months, we may be in a world where AI is writing essentially all of the code" 

The second one is more hedged ("may be in a world"), but "essentially all the code" must translate to a very large fraction of all the value, even if that last 1% or whatever is of outsized economic significance.

Amodei is forecasting AI that writes 90% of code in three to six months according to his recent comments.

I vaguely recall hearing something like this, but with crucial qualifiers that disclaim the implied confidence you are gesturing at. I expect I would've noticed more vividly if this statement didn't come with clear qualifiers. Knowing the original statement would resolve this.

[-]DAL

The original statement is: 

"I think we will be there in three to six months, where AI is writing 90% of the code. And then, in 12 months, we may be in a world where AI is writing essentially all of the code"

So, as I read that, he's not hedging on 90% in 3 to 6 months, but he is hedging on "essentially all" (99% or whatever that means) in a year.

Here's the place in the interview where he says this (at 16:16). So there were no crucial qualifiers for the 3-6 months figure, which in hindsight makes sense, since the timeframe is near enough that it likely refers to his impression of an AI already available internally at Anthropic[1]. Maybe also corroborated in his mind by some knowledge of the capabilities of a reasoning model based on GPT-4.5, which is almost certainly available internally at OpenAI.


  1. Probably a reasoning model based on a larger pretrained model than Sonnet 3.7. He recently announced in another interview that a model larger than Sonnet 3.7 is due to come out in "relatively small number of time units" (at 12:35). So probably the plan is to release in a few weeks, but something could go wrong and then it'll take longer. Possibly long reasoning won't be there immediately if there isn't enough compute to run it, and the 3-6 months figure refers to when he expects enough inference compute for long reasoning to be released. ↩︎

I appreciate the question you’re asking, to be clear! I’m less familiar with Anthropic’s funding / Dario’s comments, but I don’t think the magnitudes of ask-vs-realizable-value are as far off for OpenAI as your comment suggests?

E.g., compare OpenAI's recently reported raise at a $157B valuation vs. what its maximum profit cap likely was under the old (still current, afaik) structure.

The comparison gets a little confusing, because it’s been reported that this investment was contingent on for-profit conversion, which does away with the profit cap.

But I definitely don’t think OpenAI’s recent valuation and the prior profit-cap would be magnitudes apart.

(To be clear, I don't know the specific cap value, but you can estimate it - for instance by analyzing MSFT's initial funding amount, which is reported to have a 100x capped-profit return, and then adjusting for what % of the company you think MSFT got.)
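A back-of-the-envelope version of that estimate (every number below is a rough reported figure or an outright guess, not a confirmed value):

```python
# Rough estimate of OpenAI's old profit cap via the method above.
# All inputs are reported figures or guesses, not confirmed numbers.
msft_initial_investment = 1e9   # reported ~$1B initial MSFT investment (2019)
cap_multiple = 100              # reported 100x cap on that round's return
msft_profit_share = 0.49        # guess at MSFT's share of capped profits

msft_capped_return = msft_initial_investment * cap_multiple   # ~$100B
implied_total_cap = msft_capped_return / msft_profit_share    # ~$204B

recent_valuation = 157e9        # reported valuation at the latest raise

print(f"Implied capped value: ~${implied_total_cap / 1e9:.0f}B")
print(f"Recent valuation:      ${recent_valuation / 1e9:.0f}B")
# Same order of magnitude, i.e. not "magnitudes apart".
```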

(This also makes sense to me for a company in a very competitive industry, with high regulatory risk, and where companies are reported to still be burning lots and lots of cash.)

One potential angle: automating software won't be worth very much if multiple players can do it and profits are competed to zero. Look at compilers - almost no one is writing assembly or their own compilers, and yet the compiler writers haven't earned billions or trillions of dollars. With many technologies, the vast majority of value is often consumer surplus never captured by producers.

In general I agree with your point. If evidence of transformative AI were close, you'd strategically delay fundraising as long as possible. However, if you're uncertain about your ability to deliver, about investors' ability to recognize transformative potential, or about the competition, you might hedge and raise sooner than you need to. Raising too early never kills a business. But raising too late always does.

This is a key point for a different discussion: job loss and the effect on economies. Suppose writing software is almost all automated. Nobody is going to get the trillions currently spent on it. If just two companies, say Anthropic and OpenAI, have agents that automate it, they'll compete and drive prices down to near the compute costs (or collude until others make systems that can compete...)

Now those trillions aren't being spent on writing code. Where do they go? Anticipating how businesses will use their surplus as they pay less in wages is probably something someone should be doing. But I don't know of any economists taking seriously claims like AI doing all coding in a few years, let alone a few months.

I'm afraid we're going to get blindsided because economists aren't taking the possibility of unprecedented rapid job loss seriously.

[-]DAL

So, I certainly wouldn't expect the AI companies to capture all the value; you're right that competition drives the profits down.  But, I also don't think it's reasonable to expect profits to get competed down to zero.  Innovations in IT are generally pretty easy to replicate, technically speaking, but tech companies operate at remarkably high margins.   Even at the moment, your various LLMs are similar but are not exact substitutes for one another, which gives each some market power. 

[-]lc

If AI executives really are as bullish as they say they are on progress, then why are they willing to raise money anywhere in the ballpark of current valuations?

The story is that they need the capital to build the models that they think will do that.

[-]Ann

Commoditization / no moat? Part of the reason for rapid progress in the field is that there's plenty of fruit left and that fruit is often shared, and a lot of new models come from more fully exploiting research insights that are already out there at a smaller scale. If a company tried to monopolize it, progress wouldn't be as fast, and if a company can't monopolize it, prices are driven down over time.

AI has probably increased valuations for Big Tech (particularly Nvidia) by at least a few trillion over the past two years. So part of this is that investors think OpenAI/Anthropic will only capture around 10% of total AI profits.
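Spelling that reading out with rough numbers (both inputs are loose assumptions, not measured figures):

```python
# Loose arithmetic behind the "~10% capture" reading; both inputs are
# rough assumptions rather than measured figures.
big_tech_ai_gain = 3e12          # "a few trillion" added to Big Tech valuations
lab_valuations = 157e9 + 60e9    # roughly, reported OpenAI + Anthropic valuations

capture_share = lab_valuations / (big_tech_ai_gain + lab_valuations)
print(f"Implied share of AI value captured by the labs: {capture_share:.0%}")  # ~7%
```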

[-]DAL

It's worth thinking through what today's DeepSeek-induced, trillion-dollar-plus drop in AI-related stocks means.

There are two basic explanations for DeepSeek's success training models with a lot less compute:

  1. Imitation is Easy: DeepSeek is substantially just retreading the same ground as the other players.  They're probably training on o1 outputs, etc.  DeepSeek proves that it's easy to match breakthroughs, but not to generate them.  Further advances will still require tons of compute.
  2. DeepSeek is really clever: Facing compute constraints, DeepSeek engineers were forced to find a better way to do the work, and they did.  That cleverness will likely translate into forward progress, and there's no reason it would be limited to imitation.

If #1 is true, then I think it implies that we're headed towards a big slowdown in AI progress.  The whole economic value proposition for building models just changed.  If your frontier model can be imitated at a tiny fraction of the cost after a few months, what good is it? Why would VCs invest money in your training runs?

If #2 is true, then we may be headed towards incredibly rapid AI progress, and the odds of recursively self-improving AI are much higher.  If what you really need to build better models is tons and tons of compute, then AI can't speed itself up much.  If what you need is just lots of cleverness, then it's much easier to imagine a fast takeoff.

#1 is likely better for alignment in that it will slow things down from the current frenetic pace (the possible downside is that if you can imitate a cutting-edge model cheaply and easily, then hostile actors may deliberately build misaligned models).

#1 also seems to have big implications for government/legal involvement in AI.  If the private sector loses interest in funding models that can be easily imitated, then further progress will tend to rely on either: government investment (as in basic science) or aggressive IP law that allows commercialization of progress by preventing imitators (as we do in drug development).  Either of those means a much bigger role for the public sector.

training on o1 outputs

Outputs of o1 don't include reasoning traces, so not particularly useful compared to outputs of chatbot models, and very expensive, so only a modest amount can be collected.

Imitation helps with post-training, but the compute-heavy part is pretraining, and obtaining good quality with little pretraining is a novel feat that isn't known to be explainable by good post-training, or by including a lot of outputs from good models in the pretraining/annealing mix.

Outputs of o1 don't include reasoning traces, so not particularly useful compared to outputs of chatbot models, and very expensive, so only a modest amount can be collected.

It would be more precise to say outputs of o1 aren't supposed to include the reasoning traces. But in addition to the reasoning traces OA voluntarily released, people have been observing what seem to be leaks, and given that the history of LLM robustness to jailbreaks can be summarized as 'nil', it is at least conceivable that someone used a jailbreak+API to exfiltrate a bunch of traces. (Remember that Chinese companies like ByteDance have definitely been willfully abusing the OA API for the purposes of knowledge distillation/cloning and evading bans etc, in addition to a history of extremely cutthroat tactics that FANG would blanch at, so it's a priori entirely plausible that they would do such things.)

I don't believe DeepSeek has done so, but it is technically possible. (Regardless of whether anyone has done so, it is now partially moot given that r1 traces, per the DS paper and third-party reports thus far, work so well for distillation that everyone can kickstart their own r1-clone with r1 reasoning traces and work from there. There may be more reason to try to exfiltrate o3+ traces, but OA may also decide not to bother, as users are claiming to value and/or enjoy reading the raw traces, and since the secret & capability is out, maybe there's not much point in hiding them any longer.)

I think most of the imitation happens in the pretraining. I don't know about o1, but DeepSeek v3 is at minimum trained on a ton of 4o outputs, although they are slightly cagey about this. Just as a first test, I had ChatGPT write a sock poem:

4o's poem:


Socks, the keepers of toes so warm,
Guardians of feet through life's storm.
In stripes, in polka dots, or plain,
They dance with us through joy and pain.

A cozy pair on winter’s chill,
Or mismatched mates—yet charming still.
From drawer to floor, they roam and stray,
But where one goes, the other may.

Oh, humble socks, so oft unseen,
Yet vital threads of life's routine.
A quiet comfort, soft and neat,
The unsung heroes of weary feet.
 


Then I gave v3 just the first two stanzas, and asked it to continue the poem:

v3's continuation:



They hide in shoes, they lounge in beds,
Sometimes they’re lost, or so it’s said.
Yet always there, a faithful friend,
From morning’s start to evening’s end.

They cushion steps, they soften blows,
Through every stride, their comfort shows.
Oh, humble socks, so oft unseen,
The quiet heroes of routine.

So here’s to socks, both bold and meek,
For every day, for every week.
A simple joy, a small delight,
They keep us snug, from day to night.

 

The shot-in-the-dark guess of the "humble socks, so oft unseen... routine" couplet is a fucking soul read. v3 knows 4o, in a way that I kind of hope no model ever knows a person.
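For anyone who wants to rerun this kind of probe, here's a minimal sketch. It assumes the official openai Python client, and that DeepSeek v3 is reachable through its OpenAI-compatible endpoint (the base_url and model name below are assumptions); the prompts are simplified versions of what's described above.

```python
# Minimal sketch of the probe: have 4o write a poem, then ask DeepSeek v3
# to continue its opening stanzas and eyeball the overlap.
# The DeepSeek base_url and model name are assumptions.
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
deepseek_client = OpenAI(api_key="YOUR_DEEPSEEK_KEY",
                         base_url="https://api.deepseek.com")

poem = openai_client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user",
               "content": "Write a short rhyming poem about socks."}],
).choices[0].message.content

first_two_stanzas = "\n\n".join(poem.split("\n\n")[:2])

continuation = deepseek_client.chat.completions.create(
    model="deepseek-chat",  # assumed to serve v3
    messages=[{"role": "user",
               "content": f"Continue this poem:\n\n{first_two_stanzas}"}],
).choices[0].message.content

print(poem)
print("--- continuation ---")
print(continuation)
# Distinctive phrases from 4o's style reappearing in the continuation
# (like the "humble socks, so oft unseen" couplet) are the tell.
```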
