Stuart_Armstrong comments on Why we should err in both directions - Less Wrong Discussion
Comments (7)
And a monomodal assumption as well.
But many real-world distributions are approximately like that, so it's good.
I think the same argument works if there could be multiple peaks (even if my picture doesn't cover that case) -- you just need the local properties around the optimum for the argument to go through. But in that case you can't assume a local optimum is a global optimum, so it's harder to apply.
As you say, in many cases we don't need to worry about these complications, so I haven't spent too much time on that.