> even with copious amounts of test-time compute
There is no copious amount of test-time compute yet. I would argue that test-time compute has barely been scaled at all: current spend on RL is only a few million dollars, and I expect this to be scaled up by a few orders of magnitude this year.
I predict that Pokemon Red will be finished very fast (<3 months), and everyone who was disappointed by CPP (Claude Plays Pokemon) and pushed back their AI timelines will have to readjust them.
Any slowdown seems implausible given Anthropic's timelines, which I consider a good reason to be skeptical of data- and compute-cost-related slowdowns, at least until Nobel-prize level. Moreover, the argument that we will very quickly gain something like 15 OOMs of effective compute once models can improve themselves is also quite plausible.