In January 2023, beren and Eric Winsor cataloged basic distributional properties of weights, activations, and gradients in GPT-2 models, providing a systematic view of model internals (thanks to Ryan Greenblatt for the pointer). This post extends their investigation in two directions.
First, examining their characterization of transformer activations as "nearly Gaussian with outliers," I conducted detailed distributional analyses of post-residual activations. My findings align with observations made in comments and unpublished work: the distributions are better described by heavier-tailed families, with the logistic distribution dominating. What can appear as outliers or artifacts manifests in my analysis as consistent mixture distributions, with minor modes appearing systematically on both sides of the primary distribution.
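As a rough illustration of the kind of comparison involved (not the post's original analysis), the sketch below pulls post-residual activations from a mid-depth GPT-2 layer and fits both a Gaussian and a logistic by maximum likelihood, comparing log-likelihood and a KS statistic. The model, layer index, and prompt are illustrative assumptions.

```python
# Minimal sketch: Gaussian vs. logistic fit to GPT-2 post-residual activations.
# Layer choice and prompt are assumptions, not the post's actual setup.
import torch
from scipy import stats
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

text = "The quick brown fox jumps over the lazy dog. " * 20
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    out = model(**inputs)

# hidden_states[k] is the residual stream after block k; pick a mid-depth layer.
acts = out.hidden_states[6].flatten().numpy()

# Fit both candidate distributions by maximum likelihood.
norm_params = stats.norm.fit(acts)
logistic_params = stats.logistic.fit(acts)

# Compare total log-likelihood and KS statistic; under the post's claim,
# the logistic fit should win on both.
for name, dist, params in [("normal", stats.norm, norm_params),
                           ("logistic", stats.logistic, logistic_params)]:
    ll = dist.logpdf(acts, *params).sum()
    ks = stats.kstest(acts, dist.cdf, args=params).statistic
    print(f"{name:9s}  log-likelihood={ll:12.1f}  KS={ks:.4f}")
```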
Second, prompted by Buck...
Love the way you laid things out here! Lots to discuss, but I'll focus on one specific question. We've communicated privately so you know I'm very bullish on PD as a potential new paradigm. Don't take the below as general skepticism!
I don't understand this claim, except perhaps in a trivial sense which I'm assuming you don't mean. My confusion stems from my intuition that we don't have a good reason or evidence to assume that the model never needs...