I'm thinking of doing a paper for GovAI on model sizes and whether they matter. Any or all of the following would be helpful:

1) If you found out that the largest published neural net model sizes (by # of parameters) will grow by 10000x in the next 6 years, would that make your timelines shorter or longer and by how much?

2) Same question with 100x.

3) Same question, but with 100x less (or 100x more) growth than whatever you expected.

4) Same question, but with 10x less growth than in the last 6 years.

Assume all else is equal if possible, e.g. we have the same compute, memory, algorithms, and data available. Assume that the reason for slower growth is an inability to grow models, not necessarily a lack of will. Ignore conditional-execution models where only small parts of the model are used at a time.

I'm okay with quick, unexplained qualitative answers. Well-reasoned quantitative ones are, of course, encouraged.

Background:

- Some ML projects, like GPT-2 (figure 1), have recorded high (but diminishing) returns to bigger models, typically looking like power laws (see the sketch after this list).

- Some people (e.g. Geoff Hinton) have informally argued that model size is analogous to synapse count in the brain, which correlates with brain size, which in turn has high returns.
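
To make the power-law framing in the first bullet concrete, here is a minimal sketch with made-up numbers (these are illustrative values, not GPT-2's actual curve): fit loss ≈ a · N^(-b) to (parameter count, loss) pairs, and each 10x in parameters then buys a fixed multiplicative loss reduction, i.e. high relative but diminishing absolute returns.

```python
# Minimal sketch of fitting a power law, loss ~ a * N^(-b), to hypothetical
# (parameter count, validation loss) pairs. The numbers below are made up
# for illustration; they are not measurements from GPT-2 or any real model.
import numpy as np

params = np.array([1e7, 1e8, 1e9, 1e10])   # model sizes (number of parameters)
loss = np.array([4.2, 3.6, 3.1, 2.7])      # hypothetical validation losses

# A power law is a straight line in log-log space: log(loss) = log(a) - b*log(N).
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
a, b = np.exp(intercept), -slope

print(f"fitted power law: loss ~ {a:.2f} * N^(-{b:.3f})")
# Each 10x increase in parameters multiplies the loss by the same fixed ratio,
# so relative returns stay constant while absolute gains shrink.
print(f"loss ratio per 10x in parameters: {10 ** (-b):.3f}")
```

Under such a fit, a 10000x increase in parameters corresponds to four of those fixed multiplicative steps, which is one way to translate the questions above into loss terms.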

Bonus question: Without looking it up, by what factor do you think the largest published model sizes have increased between 2012 and 2019? By what factor do you guess they'll increase in the next 6 years?


Answers:

SoerenMind


For the record, two people whom I consider authorities on this told me some version of "model sizes matter a lot".