I'm far more used to thinking about weird maths, distributed algorithms, or abstract philosophical problems than about concrete machine learning architectures. But based on everything I see about GPT-3, it seems like a good idea to learn more about it, even if only to participate in the discussion without spouting nonsense.

So I'm asking: what do you think are the must-reads on GPT-3 specifically, and what prerequisites, if any, do I need to understand them?


Peter Jin

nostalgebraist's blog is a must-read regarding GPT-x, including GPT-3. Perhaps start here ("the transformer... 'explained'?"), which helps contextualize GPT-x within the history of machine learning.

(Though I should note that nostalgebraist holds a contrarian "bearish" position on GPT-3 in particular; for the "bullish" case instead, read Gwern.)

Thanks for the answer! I knew about the "transformer explained" post, but I was not aware of its author's position on GPT-3.

Juraj Vitko

Here's a list of resources that may be of use to you. The GPT-3 paper isn't very specific about implementation details, because the changes that led to it were mostly incremental (especially relative to GPT-2, and even more so the farther back we look in the Transformer lineage). So the scope of what you need to read to understand GPT-3 is broader than one might expect.
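To make the "incremental" point concrete: the per-GPT-3-paper architectural change from GPT-2 was mainly scale, plus alternating dense and locally banded sparse attention patterns; the core operation the whole lineage shares is causal scaled dot-product attention. Below is a minimal single-head NumPy sketch of that operation (the function name and toy shapes are mine, and learned projections, multiple heads, and the rest of the Transformer block are omitted), just to show how small the shared kernel is:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Causal scaled dot-product attention for one head.

    Q, K, V: arrays of shape (seq_len, d_k).
    Returns an array of shape (seq_len, d_k).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise token affinities
    # Causal mask: each position may attend only to itself and earlier positions.
    future = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores[future] = -np.inf
    # Numerically stable softmax over the key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # weighted sum of value vectors

# Toy usage: 4 tokens, 8-dimensional head.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```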

Thanks! I'll try to work through those.