I'm kind of puzzled by the amount of machinery that seems to be going into these arguments, because it seems to me that there is a discrete analog of the same arguments which is probably both more realistic (as neural networks are not actually continuous, especially with people constantly decreasing the precision of the floating point numbers used in implementation) and simpler to understand.
Suppose you represent a neural network architecture as a map from a finite set of discrete parameter settings to the set of all possible computable functions from the input space to the output space you're considering. In thermodynamic terms, we could identify the parameter settings as "microstates" and the corresponding functions that the NN architecture maps them to as "macrostates".
Furthermore, suppose that this set of functions comes together with a loss function evaluating how good or bad a particular function is. Assume you optimize using something like stochastic gradient descent on the composition of the loss function with the architecture map, with a particular learning rate.
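As a toy illustration of this setup, here is a minimal sketch that enumerates every microstate of a very small discrete "architecture" and groups them by the macrostate (function) they implement. The single threshold unit, the ternary parameter values, the AND target, and all names below are assumptions made for illustration; none of them come from the comment itself.

```python
from collections import Counter
from itertools import product

# Hypothetical toy architecture (an assumption for illustration): one threshold
# unit with two binary inputs and ternary parameters (w1, w2, b) in {-1, 0, 1}.
INPUTS = list(product([0, 1], repeat=2))           # (0,0), (0,1), (1,0), (1,1)
MICROSTATES = list(product([-1, 0, 1], repeat=3))  # all 27 parameter settings


def architecture(params):
    """Map a microstate (parameter setting) to its macrostate (truth table)."""
    w1, w2, b = params
    return tuple(int(w1 * x1 + w2 * x2 + b > 0) for x1, x2 in INPUTS)


def loss(function, target):
    """Number of inputs on which the function disagrees with the target."""
    return sum(f != t for f, t in zip(function, target))


# Degeneracy: how many microstates realise each macrostate.
degeneracy = Counter(architecture(p) for p in MICROSTATES)

AND = (0, 0, 0, 1)  # illustrative target function
for function, count in sorted(degeneracy.items(), key=lambda kv: -kv[1]):
    print(function, "microstates:", count, "loss vs AND:", loss(function, AND))
```

In this toy case the constant-zero function is realised by many more parameter settings than the AND function, which is the kind of microstate counting the thermodynamic analogy refers to.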
Then, in general, we have the following results:
Is there some additional content of singular learning theory that goes beyond the above insights?
Edit: I've converted this comment to a post, which you can find here.
Edit: Originally the sequence was going to contain a post about SLT for Alignment, but this can now be found here instead, where a new research agenda, Developmental Interpretability, is introduced. I have also now included references to the lectures from the recent SLT for Alignment Workshop in June 2023.
I'm looking forward to both this series, and the workshop!
I think I (and probably many other people) would find it helpful if there was an entry in this sequence which was purely the classical story told in a way/with language which makes its deficiencies clear and the contrasts with the Watanabe version very easy to point out. (Maybe a -1 entry, since 0 is already used?)
The way I've structured the sequence means these points are interspersed throughout the broader narrative, but it's a great question, so I'll provide a brief summary here, and as the posts are released I will link to the relevant sections in this comment.
Hi! I am in the process of reading this sequence and would love some supplemental lecture materials (particularly at the intersection with alignment research). I was very excited by the prospect of the lectures from the June summit; however, the YouTube channel appears to 404 now. Is there somewhere else I can listen to these lectures?
TL;DR: In this sequence I distill Sumio Watanabe's Singular Learning Theory (SLT) by explaining the essence of its main theorem - Watanabe's Free Energy Formula for Singular Models - and illustrating its implications with intuition-building examples. I then show why neural networks are singular models, and demonstrate how SLT provides a framework for understanding phases and phase transitions in neural networks.
Epistemic status: The core theorems of Singular Learning Theory have been rigorously proven and published by Sumio Watanabe across 20 years of research. Precisely what it says about modern deep learning, and its potential application to alignment, is still speculative.
Acknowledgements: This sequence has been produced with the support of a grant from the Long Term Future Fund. I'd like to thank all of the people who have given me feedback on each post: Ben Gerraty, @Jesse Hoogland, @mfar, @LThorburn, Rumi Salazar, Guillaume Corlouer, and in particular my supervisor and editor-in-chief Daniel Murfet.
Theory vs Examples: The sequence is a mixture of synthesising the main theoretical results of SLT, and providing simple examples and animations that illustrate its key points. As such, some theory-based sections are slightly more technical. Some readers may wish to skip ahead to the intuitive examples and animations before diving into the theory - these are clearly marked in the table of contents of each post.
Prerequisites: Anybody with a basic grasp of Bayesian statistics and multivariable calculus should have no problems understanding the key points. Importantly, despite SLT pointing out the relationship between algebraic geometry and statistical learning, no prior knowledge of algebraic geometry is required to understand this sequence - I will merely gesture at this relationship. Jesse Hoogland wrote an excellent introduction to SLT which serves as a high level overview of the ideas that I will discuss here, and is thus recommended pre-reading to this sequence.
SLT for Alignment Workshop: This sequence was prepared in anticipation of the SLT for Alignment Workshop 2023 and serves as a useful companion piece to the material covered in the Primer Lectures.
Thesis: The sequence is derived from my recent master's thesis, which you can read about at my website.
Developmental Interpretability: Originally the sequence was going to contain a short outline of a new research agenda, but this can now be found here instead.
Introduction
In 2009, Sumio Watanabe wrote these two profound statements in his groundbreaking book Algebraic Geometry and Statistical Learning, where he proved the first main results of Singular Learning Theory (SLT). Up to this point, this work has gone largely under-appreciated by the AI community, probably because it is rooted in highly technical algebraic geometry and distribution theory. On top of this, the theory is framed in the Bayesian setting, which contrasts with the SGD-based setting of modern deep learning.
But this is a crying shame, because SLT has a lot to say about why neural networks, which are singular models, are able to generalise well in the Bayesian setting, and it is very possible that these insights carry over to modern deep learning.
At its core, SLT shows that the loss landscape of singular models, the KL divergence K(w), is fundamentally different to that of regular models like linear regression, consisting of flat valleys instead of broad parabolic basins. Correspondingly, the measure of effective dimension (complexity) in singular models is a rational quantity called the RLCT [1], which can be less than half the total number of parameters. This fact means that classical results of Bayesian statistics like asymptotic normality break down, but what Watanabe shows is that this is actually a feature and not a bug: different regions of the loss landscape have different tradeoffs between accuracy and complexity because of their differing information geometry. This is the content of Watanabe's Free Energy Formula, from which the Widely Applicable Bayesian Information Criterion (WBIC) is derived, a generalisation of the standard Bayesian Information Criterion (BIC) for singular models.
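The accuracy-complexity tradeoff described above can be sketched numerically. The block below assumes the leading-order form of the free energy of a region as I understand it, roughly n times its loss plus its RLCT times log n, and compares it with a BIC-style estimate that fixes the complexity at half the number of parameters. The two phases, their loss and RLCT values, the parameter count, and the function names are all hypothetical numbers chosen for illustration, not results from the sequence.

```python
import math


def free_energy(n, loss, rlct):
    """Leading-order free energy of a phase: n * loss + rlct * log(n)."""
    return n * loss + rlct * math.log(n)


def bic(n, loss, num_params):
    """BIC-style estimate, with complexity fixed at num_params / 2."""
    return n * loss + (num_params / 2) * math.log(n)


# Two hypothetical phases of one model with 10 parameters (so d / 2 = 5).
# These numbers are illustrative assumptions only.
NUM_PARAMS = 10
SIMPLE = {"loss": 0.05, "rlct": 1.0}    # less accurate, less complex
ACCURATE = {"loss": 0.00, "rlct": 4.0}  # more accurate, more complex


def preferred_phase(n):
    """Return the name of the phase with the lower free energy at sample size n."""
    f_simple = free_energy(n, **SIMPLE)
    f_accurate = free_energy(n, **ACCURATE)
    return "simple" if f_simple < f_accurate else "accurate"


for n in (10, 100, 1000, 10000):
    print(n, preferred_phase(n))
```

With these made-up numbers the simpler phase has the lower free energy at small sample sizes and the more accurate phase takes over as n grows, because the loss term scales with n while the complexity term scales only with log n. Both phases also have a smaller complexity penalty than the BIC-style estimate would assign, since their assumed RLCTs are below d/2.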
With this in mind, SLT provides a framework for understanding phases and phase transitions in neural networks. It has been mooted that understanding phase transitions in deep learning may be a key part of mechanistic interpretability, for example in Induction Heads, Toy Models of Superposition, and Progress Measures for Grokking via Mechanistic Interpretability, which relate phase transitions to the formation of circuits. Furthermore, the existence of scaling laws and other critical phenomena in neural networks suggests that there is a natural thermodynamic perspective on deep learning. As it stands there is no agreed-upon theory that connects all of this, but in this sequence we will introduce SLT as a bedrock for a theory that can tie these concepts together.
In particular, I will demonstrate the existence of first and second order phase transitions in simple two layer feedforward ReLU neural networks which we can understand precisely through the lens of SLT. By the end of this sequence, the reader will understand why the following phase transition in the Bayesian posterior corresponds to a changing accuracy-complexity tradeoff of the different phases in the loss landscape:
Key Points of the Sequence
To understand phase transitions in neural networks from the point of view of SLT, we need to understand how different regions of parameter space can have different accuracy-complexity tradeoffs, a feature of singular models that is not present in regular models. Here is the outline of how these posts get us there:
Resources
Though these resources are relatively sparse for now, expanding the reach of SLT and encouraging new research is the primary long-term goal of this sequence.
SLT Workshop for Alignment Primer
In June 2023, a summit, "SLT for Alignment", was held, which produced over 20 hours of lectures. The details of these talks can be found here, with recordings found here.
Research groups
Research groups I know of working on SLT:
Literature
The two canonical textbooks due to Watanabe are:
The two main papers that were precursors to these books:
This sequence is based on my recent thesis:
MDLG recently wrote an introduction to SLT:
Other theses studying SLT:
Other introductory blogs:
[1] Short for the algebro-geometric Real Log Canonical Threshold, which I define in DSLT1.