I stumbled across this document. I believe it may have influenced a young Eliezer Yudkowsky. He's certainly shown reverence for the author before.
This essay includes everything. A rant against frequentism and for the superiority of Bayes. A rant against modern academic institutions. A rant against mainstream quantum physics. A section about how mainstream AI is too ad hoc and not grounded in perfect Bayesian math. A closing section about sticking to your non-mainstream beliefs and ignoring critics.
I'm not really qualified to speak about most of it. The part about AI, particularly, bothered me. He attacks neural networks, and suggests that Bayesian networks are best.
I initially wrote a big rant about how terribly he misunderstands neural networks. But the more I think about it, the more I like the idea of Bayesian networks. The idea of ideal, perfect, universal methods appeals to my mind a great deal.
And that's a serious problem for me. I once got very into libertarianism over that. And then into crazy AI methods that are totally impractical in reality.
And thinking about it some more: Bayesian networks are cool, but I don't think they could replace all of ML. I mean, half of what neural networks do isn't just better inference. Sometimes we have plenty of training data and overfitting isn't much of an issue. The hard part is just getting a model to fit the data at all.
Bayes' theorem doesn't say anything about optimization. Bayesian inference is terribly expensive to approximate. And Jaynes' rant against non-linear functions doesn't even make any sense outside of boolean functions (and even there it isn't necessarily optimal; you would have to learn a lookup table for each node that explodes exponentially with the number of inputs). And if you are going to go full Bayesian, why stop there? Why not go to full Solomonoff Induction (or some approximation of it, at least)?
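To put a number on the lookup-table point, here is a minimal sketch. It assumes each node stores one table entry per combination of its input values; the function name `table_rows` and the `arity` parameter are made up for illustration and are not from Jaynes' essay.

```python
def table_rows(num_inputs: int, arity: int = 2) -> int:
    """Count the input combinations a full lookup table for one node must cover.

    Assumes every input takes one of `arity` discrete values
    (arity=2 is the boolean case discussed above).
    """
    if num_inputs < 0 or arity < 1:
        raise ValueError("num_inputs must be >= 0 and arity must be >= 1")
    return arity ** num_inputs


if __name__ == "__main__":
    # Each extra boolean input doubles the table.
    for k in (2, 10, 20, 30):
        print(f"{k} boolean inputs -> {table_rows(k)} rows")
```

With 30 boolean inputs a single node already needs over a billion rows, which is why a full table per node stops being practical quickly.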
I don't think he suggests Bayesian networks (which, to me, mean the causal networks of Pearl et al). Rather, he is literally suggesting trying to learn by Bayesian inference. His comments about nonlinearity, I think, are just to the effect that one shouldn't have to introduce nonlinearity with sigmoid activation functions; one should get nonlinearity naturally from Bayesian updates. But yeah, I think it's quite impractical.
E.g. suppose you wanted to build an email spam filter, and wanted P(spam). A (non-naive) Bayesian approach to this classification problem...
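As a rough illustration of the P(spam) setup, here is a sketch that applies plain Bayes' theorem to a single email. It is only the update step, under the assumption that the two likelihoods are already known; it does not say how a non-naive model would estimate them, which is the expensive part. The function name and all numbers are hypothetical.

```python
def posterior_spam(prior_spam: float, lik_spam: float, lik_ham: float) -> float:
    """Return P(spam | email) by Bayes' theorem.

    prior_spam: P(spam) before looking at the email.
    lik_spam:   P(email features | spam), assumed given.
    lik_ham:    P(email features | not spam), assumed given.
    """
    evidence = lik_spam * prior_spam + lik_ham * (1.0 - prior_spam)
    if evidence == 0.0:
        raise ValueError("email has zero probability under both classes")
    return lik_spam * prior_spam / evidence


if __name__ == "__main__":
    # Made-up numbers: 20% of mail is spam, and this email's features are
    # 50 times more likely under spam than under non-spam.
    print(posterior_spam(prior_spam=0.2, lik_spam=0.05, lik_ham=0.001))
```

With these made-up inputs the posterior comes out at roughly 0.93. The update itself is cheap; the cost lies in getting `lik_spam` and `lik_ham` from a joint model over all the features without assuming they are independent.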
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.