Leon comments on A Fervent Defense of Frequentist Statistics - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (125)
Ah, good point. It's like the prior, considered as a regularizer, is too "soft" to encode the constraint we want.
A Bayesian could respond that we rarely actually want sparse solutions -- in what situation is a physical parameter identically zero? -- but rather solutions which have many near-zeroes with high probability. The posterior would satisfy this I think. In this sense a Bayesian could justify the Laplace prior as approximating a so-called "slab-and-spike" prior (which I believe leads to combinatorial intractability similar to the fully L0 solution).
Also, without L0 the frequentist doesn't get fully sparse solutions either. The shrinkage is gradual; sometimes there are many tiny coefficients along the regularization path.
[FWIW I like the logical view of probability, but don't hold a strong Bayesian position. What seems most important to me is getting the semantics of both Bayesian (= conditional on the data) and frequentist (= unconditional, and dealing with the unknowns in some potentially nonprobabilistic way) statements right. Maybe there'd be less confusion -- and more use of Bayes in science -- if "inference" were reserved for the former and "estimation" for the latter.]
See this comment. You actually do get sparse solutions in the scenario I proposed.
Cool; I take that back. Sorry for not reading closely enough.