https://arxiv.org/abs/2202.05924
What do you need to develop advanced Machine Learning systems? Leading companies don’t know. But they are very interested in figuring it out. They dream of replacing all these pesky workers with reliable machines who take no leave and have no morale issues.
So when they heard that throwing processing power at the problem might get you far along the way, they did not sit idly on their GPUs. But, how fast is their demand for compute growing? And is the progress regular?
Enter us. We have obsessively analyzed trends in the amount of compute spent training milestone Machine Learning models.
Our analysis shows that:
Mathematica is the most powerful solver I’ve come across (it’s basically Wolfram Alpha with additional computational time).
I’m confused why you think looking at the rate and lumpiness of historical progress on narrowly circumscribed performance metrics is not meaningful, because it seems like you do seem to think that drawing straight lines is fine when compute is on the x-axis—which seems like a similar exercise. What’s going on there?