
Clarity comments on Open thread, Aug. 03 - Aug. 09, 2015 - Less Wrong Discussion

5 Post author: MrMind 03 August 2015 07:05AM


Comments (177)


Comment author: Clarity 03 August 2015 07:18:16AM * -1 points

Rhetorical solution: Multi-armed bandit problem

Disclaimer: I'm not a computer scientist. I read up on the problem to see what the takeaways might be for decision theory. Since I'm not trained in any formal logic, I don't know how to represent this solution in symbols. I think of the problem in terms of questions like: am I spending too much time becoming smarter, rather than doing things that are smart?

  • Exploitation dominates exploration because, unless exploration is by definition a subset of exploitation, exploring would not be optimising expected utility for the given optimisation problem.

  • If exploration is a subset of exploitation, then exploitation will have a higher expected utility than exploration, unless some components of exploitation have negative utility, in which case they wouldn’t be included in exploitation anyway.
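
For anyone who wants to poke at this with numbers rather than words, here is a minimal sketch of a bandit simulation. It is not from the comment above; it assumes Bernoulli arms, an epsilon-greedy rule, and made-up payoff probabilities, and the name `run_bandit` is hypothetical. Setting `epsilon=0.0` gives pure exploitation of current estimates, and any larger value mixes in random exploration.

```python
import random


def run_bandit(true_means, epsilon, steps, seed=0):
    """Epsilon-greedy on Bernoulli arms (an illustrative assumption).

    Returns (total_reward, pull_counts).
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k          # how often each arm was pulled
    estimates = [0.0] * k     # running mean reward per arm
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick any arm at random
        else:
            # exploit: pick the arm with the best current estimate
            # (ties go to the lowest index)
            arm = max(range(k), key=lambda a: estimates[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        # incremental update of the running mean
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total, counts


# Hypothetical payoff probabilities, chosen only for illustration.
arms = [0.2, 0.5, 0.8]
greedy_total, greedy_counts = run_bandit(arms, epsilon=0.0, steps=1000)
mixed_total, mixed_counts = run_bandit(arms, epsilon=0.1, steps=1000)
print("pure exploitation:", greedy_total, greedy_counts)
print("10% exploration:  ", mixed_total, mixed_counts)
```

With zero initial estimates and this tie-breaking rule, the `epsilon=0.0` run only ever pulls the first arm, so what counts as "exploitation" here depends entirely on what the agent already knows. Whether that supports or undercuts the two bullets is the question being asked.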

Thoughts?