You're going on the road of actually introducing necessary hacks. That's good. I don't think simply setting threshold probability or capping the utility on a Bayesian agent results in the most effective agent given specific computing time, and it feels to me that you're wrongfully putting a burden of both the definition of what your agent is, and the proof, on me.
You got to define what the best threshold is, or what is the reasonable cap, first - those have to be somehow determined before you have your rational agent that works well. Clearly I can't show that it is exploitable for any values, because assuming hypothesis probability threshold of 1-epsilon and utility cap of epsilon, the agent can not be talked into doing anything at all. edit: and trivially, by setting threshold too low and cap too high, the agent can be exploited.
We were talking about LW rationality. If LW rationality didn't give you procedure for determining the threshold and the cap, then I already demonstrated the point I was making. I don't see huge discussion here on the optimal cap for utility, and on the optimal threshold, and on best handling of the hypotheses below threshold, and it feels to me that rationalists have thresholds set too low and caps set too high. You can of course have an agent that will decide with commonsense and then set threshold and cap as to match it, but that's rationalization not rationality.
Someone who claims to have read "the vast majority" of the Sequences recently misinterpreted me to be saying that I "accept 'life success' as an important metric for rationality." This may be a common confusion among LessWrongers due to statements like "rationality is systematized winning" and "be careful… any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility."
So, let me explain why Actual Winning isn't a strong measure of rationality.
In cognitive science, the "Standard Picture" (Stein 1996) of rationality is that rationality is a normative concept defined by logic, Bayesian probability theory, and Bayesian decision theory (aka "rational choice theory"). (Also see the standard textbooks on judgment and decision-making, e.g. Thinking and Deciding and Rational Choice in an Uncertain World.) Oaksford & Chater (2012) explain:
Is it meaningful to attempt to develop a general theory of rationality at all? We might tentatively suggest that it is a prima facie sign of irrationality to believe in alien abduction, or to will a sports team to win in order to increase their chance of victory. But these views or actions might be entirely rational, given suitably nonstandard background beliefs about other alien activity and the general efficacy of psychic powers. Irrationality may, though, be ascribed if there is a clash between a particular belief or behavior and such background assumptions. Thus, a thorough-going physicalist may, perhaps, be accused of irrationality if she simultaneously believes in psychic powers. A theory of rationality cannot, therefore, be viewed as clarifying either what people should believe or how people should act—but it can determine whether beliefs and behaviors are compatible. Similarly, a theory of rational choice cannot determine whether it is rational to smoke or to exercise daily; but it might clarify whether a particular choice is compatible with other beliefs and choices.
From this viewpoint, normative theories can be viewed as clarifying conditions of consistency… Logic can be viewed as studying the notion of consistency over beliefs. Probability… studies consistency over degrees of belief. Rational choice theory studies the consistency of beliefs and values with choices.
Thus, one could have highly rational beliefs and make highly rational choices and still fail to win due to akrasia, lack of resources, lack of intelligence, and so on. Like intelligence and money, rationality is only a ceteris paribus predictor of success.
So while it's empirically true (Stanovich 2010) that rationality is a predictor of life success, it's a weak one. (At least, it's a weak predictor of success at the levels of human rationality we are capable of training today.) If you want to more reliably achieve life success, I recommend inheriting a billion dollars or, failing that, being born+raised to have an excellent work ethic and low akrasia.
The reason you should "be careful… any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility" is because you should "never end up envying someone else's mere choices." You are still allowed to envy their resources, intelligence, work ethic, mastery over akrasia, and other predictors of success.