Clarity comments on Open thread, Aug. 03 - Aug. 09, 2015 - Less Wrong Discussion
Rhetorical solution: Multi-armed bandit problem
Disclaimer: I'm not a computer scientist. I read up on the problem to see what the takeaways might be for decision theory. Since I'm not trained in any formal logic, I don't know how to represent this solution in symbols. I think of the problem in terms of questions like: am I spending too much time becoming smarter, rather than doing things that are smart?
Exploitation dominates exploration because, unless exploration is by definition a subset of exploitation, exploring would not be optimising expected utility for the given optimisation problem.
If exploration is a subset of exploitation, then exploitation will have a higher expected utility than exploration, unless some components of exploitation have negative utility, and those wouldn't be included in exploitation anyway.
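One way to put a claim like this to an empirical test would be a small simulation. The sketch below is only an illustration under stated assumptions, not a formalisation of the argument above: it assumes Bernoulli arms (each pays 1 with a fixed probability, otherwise 0) and an epsilon-greedy player, and the names `run_bandit`, `arm_probs` and `epsilon` are hypothetical. Setting `epsilon` to 0 gives a player who only exploits its current estimates; a larger value makes it explore at random more often.

```python
import random


def run_bandit(arm_probs, epsilon, steps=1000, seed=0):
    """Return the average reward per pull of an epsilon-greedy player.

    arm_probs: assumed payout probability of each Bernoulli arm.
    epsilon:   probability of pulling a random arm (exploration).
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms    # pulls per arm so far
    values = [0.0] * n_arms  # running mean reward per arm
    total = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            # (ties go to the lowest-numbered arm).
            arm = max(range(n_arms), key=lambda a: values[a])

        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward

    return total / steps


if __name__ == "__main__":
    arms = [0.2, 0.5, 0.8]  # made-up payout probabilities
    for eps in (0.0, 0.1, 0.5):
        print(eps, run_bandit(arms, eps))
```

Comparing the printed averages for different `epsilon` values, and for different seeds and arm probabilities, would show how much the answer depends on what the player knows at the start, which is the part the verbal argument leaves implicit.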
Thoughts?