I think Eliezer's presentation of the Bayesianism vs frequentism arguments in science came from E. T. Jaynes' posthumous book Probability Theory: The Logic of Science, which was written about arguments that took place over Jaynes' lifetime, well before the Sequences were written.
Doesn't weight decay/L2 regularization tend to get rid of the "singularities", though? Once you alter the loss function to prefer lower-norm weights, there are no longer directions you can move in weight space that change the model while leaving the loss the same. A classic example of L2 regularization removing these "singularities"/flat directions is the L2-regularized support vector machine with hinge loss, which motivated me to check the same thing for neural nets. I tried some numerical experiments and found zero eigenvalues of ...
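For what it's worth, here is a minimal sketch of the kind of numerical check I mean (not my original experiment; I'm using PyTorch, and the tiny architecture, random data, penalty strength `lam`, and the `1e-6` eigenvalue cutoff are all arbitrary choices for illustration). It roughly trains a small MLP on the unregularized MSE loss, then compares the Hessian eigenvalue spectrum at that point with and without an L2 term. Whether exact zeros show up for the plain loss depends on the architecture/data, but the L2 term should push the smallest eigenvalues up by roughly `2 * lam`:

```python
import torch

torch.manual_seed(0)

# Tiny dataset and a 2-3-1 tanh MLP, with parameters kept as one flat vector
# so we can take the Hessian with respect to all weights at once.
X = torch.randn(20, 2)
y = torch.randn(20, 1)
sizes = [(3, 2), (3,), (1, 3), (1,)]            # shapes of W1, b1, W2, b2
n_params = sum(torch.Size(s).numel() for s in sizes)

def unpack(theta):
    """Split the flat parameter vector back into W1, b1, W2, b2."""
    params, i = [], 0
    for s in sizes:
        n = torch.Size(s).numel()
        params.append(theta[i:i + n].reshape(s))
        i += n
    return params

def mse_loss(theta):
    W1, b1, W2, b2 = unpack(theta)
    h = torch.tanh(X @ W1.T + b1)
    pred = h @ W2.T + b2
    return ((pred - y) ** 2).mean()

def l2_loss(theta, lam=1e-2):
    # Same loss plus a weight-decay-style L2 penalty on all parameters.
    return mse_loss(theta) + lam * (theta ** 2).sum()

# Roughly train to a (near-)minimizer of the unregularized loss first, since
# the flat directions of interest live at minimizers, not at random init.
theta = (torch.randn(n_params) * 0.5).requires_grad_(True)
opt = torch.optim.Adam([theta], lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    mse_loss(theta).backward()
    opt.step()
theta_star = theta.detach()

# Compare Hessian spectra at the same point, with and without the L2 term.
for name, f in [("plain MSE", mse_loss), ("MSE + L2", l2_loss)]:
    H = torch.autograd.functional.hessian(f, theta_star)
    eigs = torch.linalg.eigvalsh(H)
    n_zero = (eigs.abs() < 1e-6).sum().item()
    print(f"{name}: {n_zero} (near-)zero eigenvalues out of {len(eigs)}, "
          f"smallest = {eigs.min().item():.2e}")
```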