I see some discussion here and in the associated Reddit thread about more efficient and smaller models. I believe GPT-4 is at roughly one trillion parameters. I was under the impression that model sizes were increasing at about 10x/year, so that could mean GPT-5 is 10 trillion and GPT-6 (or equivalent) is 100 trillion parameters by 2026. Does that sound about right, or is there some sort of algorithmic change likely to happen that will let LLMs improve without the parameter count growing 10x/year?
On a related note, I've heard backend cluster sizes are growing at similar rates: 32K nodes with 8 GPUs per node today, increasing 10x per year. That seems improbable to me, since it would mean 320K nodes in 2025 and 3.2M nodes in 2026.
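For what it's worth, here's a quick back-of-the-envelope sketch of what the 10x/year assumption implies. The 2024 starting points (1 trillion parameters, 32K nodes with 8 GPUs each) are just the rough figures I quoted above, not confirmed numbers.

```python
# Back-of-the-envelope extrapolation under the (assumed) 10x/year growth rate.
# Starting figures are the rough 2024 numbers quoted above, not confirmed.

params_2024 = 1e12        # ~1 trillion parameters (rumored GPT-4 scale)
nodes_2024 = 32_000       # 32K nodes
gpus_per_node = 8
growth_per_year = 10      # the 10x/year assumption in question

for years_ahead in range(3):  # 2024, 2025, 2026
    year = 2024 + years_ahead
    params = params_2024 * growth_per_year ** years_ahead
    nodes = nodes_2024 * growth_per_year ** years_ahead
    print(f"{year}: ~{params:.0e} parameters, "
          f"{nodes:,} nodes (~{nodes * gpus_per_node:,} GPUs)")

# 2024: ~1e+12 parameters, 32,000 nodes (~256,000 GPUs)
# 2025: ~1e+13 parameters, 320,000 nodes (~2,560,000 GPUs)
# 2026: ~1e+14 parameters, 3,200,000 nodes (~25,600,000 GPUs)
```

That last line is the part that strikes me as implausible, hence the question.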
Any thoughts or info you might have here?