“The resources used to train the model can be repurposed to run millions of instances of it (this matches projected cluster sizes by ~2027), and the model can absorb information and generate actions at roughly 10x-100x human speed. … We could summarize this as a ‘country of geniuses in a datacenter’.”
Dario Amodei, CEO of Anthropic, Machines of Loving Grace
“Let’s say each copy of GPT-4 is producing 10 words per second. It turns out they would be able to run something like 300,000 copies of GPT-4 in parallel. And by the time they are training GPT-5 it will be a more extreme situation where just using the computer chips they used to train…”
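As a rough check on where a figure like 300,000 could come from, here is a minimal back-of-envelope sketch in Python. Every input is an illustrative assumption (a rumored GPT-4-scale training cluster, effective per-chip throughput, active parameter count); only the 10-words-per-second generation speed comes from the quote itself.

```python
# Back-of-envelope reproduction of the "300,000 parallel copies" estimate.
# All constants below are rough public rumors/estimates, not figures from the quote,
# except WORDS_PER_SECOND.

TRAINING_CHIPS = 25_000        # assumed GPT-4-scale cluster of A100-class chips
FLOPS_PER_CHIP = 1e14          # ~100 TFLOP/s effective (~1/3 of peak bf16 throughput)
ACTIVE_PARAMS = 280e9          # assumed active parameters per token (MoE rumor)
FLOPS_PER_TOKEN = 2 * ACTIVE_PARAMS  # ~2 FLOPs per active parameter per generated token
TOKENS_PER_WORD = 1.3          # rough English tokenization ratio
WORDS_PER_SECOND = 10          # the quote's per-copy generation speed

cluster_flops = TRAINING_CHIPS * FLOPS_PER_CHIP       # total FLOP/s repurposed for inference
tokens_per_second = cluster_flops / FLOPS_PER_TOKEN   # aggregate tokens/s across the cluster
per_copy_tokens = WORDS_PER_SECOND * TOKENS_PER_WORD  # tokens/s one copy consumes
copies = tokens_per_second / per_copy_tokens

print(f"{copies:,.0f} parallel copies")  # ~340,000: same order as the quote's 300,000
```

The exact number is not the point; the conclusion is driven almost entirely by one ratio, the compute of the training cluster against the per-token cost of running the finished model.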