In recent months and years I have seen sober analyses suggesting that price-performance in computing (that is, the number of calculations per second you can buy for a dollar) has a doubling time of something like 2-3 years. I do not think these figures are good guides to what we should expect in the future, and I wish to explain why.
Over the years I have often returned to Kurzweil's[1] plot of price-performance in the 20th century. I occasionally update the plot on my own and marvel that the trend has persisted essentially unabated since it was published, forming a continuous and consistent trend from 1900 through 2023. For your reference and for the sake of clarity I have taken the original plot and added one recent point, the AMD RX 7600 GPU, which boasts 21.4 TFLOP/s (single-precision) at a price of $269.99 as of this week. Take my word for it that the points between 1995 and 2023 remain essentially on-trend.
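To place that new point on the plot's vertical axis, the arithmetic is simply:

$$\frac{21.4 \times 10^{12}\ \text{FLOP/s}}{\$269.99} \times \$1000 \approx 7.9 \times 10^{13}\ \text{FLOP/s per } \$1000$$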
This plot[2] has no single "doubling time" because it is super-exponential, i.e. there is an exponent inside the exponent, and the effective doubling time gets shorter over time. I have not found any published reference to how the white dashed band is calculated, but my own best fit is a double exponential of the form C(t) = A·exp(b·e^(ct)), where C is the price-performance of compute in FLOP/s per $1000 and t in this case is years since 1900. The instantaneous doubling time for this trend as of today would be about 0.93 years, less than half of even the most Pollyannaish of the recent forecasts, and it obviously gets shorter each year.
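Given that form, the instantaneous doubling time follows directly from the logarithmic derivative, which also makes explicit why it shrinks every year:

$$\ln C(t) = \ln A + b\,e^{ct}, \qquad \frac{d \ln C}{dt} = b\,c\,e^{ct}, \qquad T_{\text{double}}(t) = \frac{\ln 2}{b\,c\,e^{ct}}$$

Plugging today's t into a fit of this form is what yields the ~0.93-year figure; as t grows, the denominator grows exponentially and the doubling time keeps falling.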
The discrepancy between this <1 year doubling time and the >2 year doubling time observed in more recent publications is explained by the fact that trends calculated on the basis of narrow, recent time-frames will only capture one paradigm, e.g. the "GPU paradigm", which, like all individual paradigms in technology, exhibits S-curve behavior, starting out slow, accelerating, and then flattening. I also note that serious authors tend to present doubling-time figures that lean in the pessimistic direction.
Of course, it is entirely possible that 2023 marks the end of the validity of the above super-exponential trend. Perhaps, for some reason, no new computing paradigm arises to put us back on the white dashed band.
I feel it is important to consider that predictions for the end of the decade are wildly different depending on whether we extrapolate using this super-exponential trend or a "merely" exponential trend. Using an exponential extrapolation from today, we would expect compute to be roughly 10x cheaper per FLOP/s by 2030. The super-exponential extrapolation projects something closer to a 1000x improvement in price-performance. Take these numbers as rough, since the extrapolations depend very much on the minutiae of how you do your curve fit. Regardless of the details, it is a difference of orders of magnitude.
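As a back-of-envelope check on those two figures (the 2.5-year doubling time below is an assumption taken from the middle of the recent 2-3 year estimates, not the actual curve fit):

```python
import math

years = 2030 - 2023  # 7-year horizon

# Exponential case: a fixed doubling time in the recent 2-3 year range.
doubling_time = 2.5  # years (assumed, mid-range of recent estimates)
exp_factor = 2 ** (years / doubling_time)
print(f"Exponential: ~{exp_factor:.0f}x cheaper per FLOP/s by 2030")  # ~7x, i.e. roughly 10x

# Super-exponential case: what average doubling time does a 1000x
# improvement over 7 years imply?
implied_doubling = years / math.log2(1000)
print(f"1000x by 2030 implies an average doubling time of ~{implied_doubling:.2f} years")
# ~0.70 years -- i.e. a doubling time that continues falling below today's ~0.93 years
```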
I don't know exactly how we could achieve a 1000x improvement in price-performance in 7 years, but responsible forecasting requires that we be open to the possibility of unforeseeable paradigm shifts, and I wouldn't want to bet against a curve that has held up for 123 years. If you had tried to make decade-scale forecasts using an exponential trend at any point over the last 100 years, you would have been consistently wrong, by a margin that only grew with each passing decade. It seems particularly important that we avoid being wrong this decade.
Discussion in the comments prompted me to add this table of data to the original post, so that it would be more visible and provide a shared frame of reference:
| Hardware | Precision | TFLOPS | Price ($) | FLOP/s per $1000 |
|---|---|---|---|---|
| Nvidia GeForce RTX 4090 | FP8 | 82.58 | $1,600 | 5.16E+13 |
| AMD RX 7600 | FP8 | 21.5 | $270 | 7.96E+13 |
| TPU v5e | INT8 | 393 | $4,730* | 8.31E+13 |
| H100 | FP16 | 1979 | $30,603 | 6.47E+13 |
| H100 | FP8 | 3958 | $30,603 | 1.29E+14 |
*The price point of the TPU v5e is estimated based on a variety of sources, and adjusted based on my calculations from the hourly usage rates.
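The last column is just the TFLOPS column converted to FLOP/s and normalized to a $1000 spend; a quick script to reproduce it from the table above:

```python
# Reproduce the FLOP/s per $1000 column from the TFLOPS and price columns.
entries = [
    ("RTX 4090 (FP8)", 82.58, 1600),
    ("RX 7600 (FP8)",  21.5,  270),
    ("TPU v5e (INT8)", 393,   4730),
    ("H100 (FP16)",    1979,  30603),
    ("H100 (FP8)",     3958,  30603),
]
for name, tflops, price_usd in entries:
    flops_per_1000usd = tflops * 1e12 / price_usd * 1000
    print(f"{name}: {flops_per_1000usd:.2e} FLOP/s per $1000")
```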
[1] I can't figure out if Kurzweil was the one to originally publish this plot, but I know that the first place I saw it was in The Singularity is Near in 2005.

[2] For reference, the "one human brain" estimate comes from FLOPS = 86 billion neurons × 1,000 synapses/neuron × 200 Hz ≈ 10^16-10^17 FLOPS, a mode of estimation that I suspect Kurzweil would admit is tendentious.
The Tom's Hardware article is interesting, thanks. It makes the point that the price quoted may not include the full 'cost of revenue' for the product, in that it might be the bare die price and not the tested and packaged part (yields from fabs aren't 100%, so extensive functional testing of every part adds cost). The article also notes that R&D costs aren't included in that figure; the R&D at NVIDIA (and TSMC, Intel, AMD, etc.) is what keeps that exponential perf-per-dollar trend moving along.
For my own curiosity, I looked into current and past income statements for a few companies. NVIDIA's latest income statement, for the fiscal year ending 1/31/2024, shows $61B in revenue, $17B cost of revenue (which would include the die cost as well as testing and packaging), $9B of R&D, and a total operating income of $33B. AMD, for their fiscal year ending 12/31/2023, had $23B revenue, $12B cost of revenue, $6B R&D, and $0.4B operating income. Certainly NVIDIA is making more profit, but the original author and Wikipedia picked the AMD RX 7600 as the 2023 price-performance leader, and there isn't much room in AMD's income statement to lower those prices. While NVIDIA could cut their revenue in half and still make a profit in 2023, in 2022 their profit was $4B on $27B in revenue.

FWIW, Goodyear Tire, selected by me 'randomly' as an example of a company making a product with less year-to-year technology innovation, had $20B revenue for the most recent year, $17B cost of revenue, and no R&D expense. So if silicon technology someday plateaus (even if ASI can help us build transistors smaller than atoms, the Planck length is out there at some point), then maybe silicon companies will start cutting prices down toward bare manufacturing costs.

As a last study, the Wikipedia page on FLOPS cited the Pentium Pro from Intel as part of the 1997 perf-per-dollar system. For 1997, Intel reported $25B in revenues, $10B cost of sales (die, testing, packaging, etc.), $2B in R&D, and an operating income of $10B; so it, too, was spending a decent amount on R&D in order to stay on the Moore's law curve.
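Rough gross-margin arithmetic from the figures above (my own rounding; it just shows how little headroom AMD has relative to NVIDIA or 1997-era Intel):

```python
# (revenue, cost of revenue) in $B, taken from the income statements cited above
companies = {
    "NVIDIA FY2024":     (61, 17),
    "AMD FY2023":        (23, 12),
    "Intel FY1997":      (25, 10),
    "Goodyear (latest)": (20, 17),
}
for name, (revenue, cost_of_revenue) in companies.items():
    gross_margin = 1 - cost_of_revenue / revenue
    print(f"{name}: gross margin ~{gross_margin:.0%}")
# NVIDIA ~72%, AMD ~48%, Intel '97 ~60%, Goodyear ~15%
```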
I agree with Foyle's point that even with successful AGI alignment the socioeconomic implications are huge, but that's a discussion for another day...