Clarity comments on Open thread, Aug. 03 - Aug. 09, 2015 - Less Wrong

5 Post author: MrMind 03 August 2015 07:05AM

Comment author: Clarity 03 August 2015 07:18:16AM * -1 points

Rhetorical solution: Multi-armed bandit problem

Disclaimer: I'm not a computer scientist. I read up on the problem to see what the takeaways might be for decision theory. Since I'm not trained in formal logic, I don't know how to represent this solution in symbols. I think of the problem in terms of questions like: am I spending too much time becoming smarter rather than doing things that are smart?

  • Exploitation dominates exploration because, unless exploration is by definition a subset of exploitation, exploring would not be optimising expected utility for the given optimisation problem.

  • If exploration is a subset of exploitation, then exploitation will have a higher expected utility than exploration, unless components of exploitation have negative utility, and those wouldn't be included in exploitation anyway.
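For anyone who wants to poke at the two terms concretely, here is a rough sketch of one standard strategy, epsilon-greedy, on arms that pay 0 or 1. It is not a formalisation of my argument above, just a toy to experiment with. The name `run_bandit`, the parameters `true_means`, `epsilon`, `steps` and `seed`, and the example payout probabilities are all made up for illustration. "Exploit" here means pulling the arm with the best estimated payout so far; "explore" means pulling an arm at random.

```python
import random


def run_bandit(true_means, epsilon, steps, seed=0):
    """Epsilon-greedy on 0/1-reward arms; returns (total_reward, pull_counts).

    true_means: hypothetical payout probability of each arm (unknown to the player).
    epsilon:    probability of exploring (random arm) on any given step.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm has been pulled
    estimates = [0.0] * n_arms   # running average reward seen per arm
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick any arm at random
        else:
            # exploit: pick the arm with the best estimate (ties go to the lowest index)
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        # incremental update of the running average for this arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, counts


if __name__ == "__main__":
    # Assumed example arms; compare pure exploitation, a mix, and pure exploration.
    for eps in (0.0, 0.1, 1.0):
        total, counts = run_bandit([0.2, 0.5, 0.8], eps, steps=1000)
        print(eps, total, counts)
```

Setting `epsilon` to 0 gives pure exploitation of the current estimates and setting it to 1 gives pure exploration, so varying it between those is one way to compare the two on whatever arms you assume.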

Thoughts?