A possible implementation of your idea would be an agent that, at every juncture, evaluates the expected consequences of the actions it could take. If there is a unique action that will lead to the agent's goals being satisfied with subjective probability above the threshold p, then the agent takes that action. If there are multiple such actions, the agent chooses one randomly, perhaps with preference given to the action "do nothing" or "self-destruct". If there are no such actions, then the agent takes an action that maximizes the subjective probability of its goals being satisfied.
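To make the decision rule concrete, here is a minimal sketch in Python. It is only one reading of the description above: the function name `choose_action`, the `success_prob` mapping (action to subjective probability of the goals being satisfied), and the way "preference" is modelled (restricting the random pick to the preferred actions whenever one of them clears the threshold) are all assumptions made for illustration.

```python
import random

def choose_action(success_prob, p, preferred=("do nothing", "self-destruct"), rng=random):
    """Pick an action given subjective success probabilities and a threshold p.

    success_prob: dict mapping each available action to the agent's subjective
    probability that its goals are satisfied if it takes that action.
    preferred: hypothetical tie-break set; the original text only says
    "perhaps with preference given to" these actions.
    """
    if not success_prob:
        raise ValueError("the agent needs at least one available action")

    # Actions whose success probability is above the threshold p.
    satisficing = [a for a, q in success_prob.items() if q > p]

    if len(satisficing) == 1:
        # A unique satisficing action: take it.
        return satisficing[0]

    if satisficing:
        # Several satisficing actions: choose randomly, favouring the
        # preferred ones if any of them is among the candidates.
        favoured = [a for a in satisficing if a in preferred]
        return rng.choice(favoured or satisficing)

    # No action clears the threshold: fall back to maximizing.
    return max(success_prob, key=success_prob.get)
```

For example, `choose_action({"a": 0.3, "b": 0.5}, 0.8)` falls through to the last branch, which is exactly where the agent stops satisficing and behaves like a maximizer.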
A problem with this is that if the agent's subjective probability of achieving its goals is ever below the threshold, then the agent will have reason to modify itself to become an optimizer.
[Final Update: Back to 'Discussion'; struck out the initial framing, which was misleading.]
[Update: Moved to 'Main'. Also, judging by the comments, it appears that most have misunderstood the puzzle and read way too much into it; user 'Manfred' seems to have got the point.]
[Note: This little puzzle is my first article. Preliminary feedback suggests some of you might enjoy it while others might find it too obvious, hence the cautious submission to 'Discussion'; will move it to 'Main' if, and only if, it's well-received.]
In his recent paper "The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents", Nick Bostrom states:
Let us take it on from here.
It is tempting to say that a machine can never halt after achieving its goal because it cannot know with full certainty whether it has achieved its goal; it will continually verify, possibly to increasing degrees of certainty, whether it has achieved its goal, but never halt as such.
What if, starting from a naive goal G, the machine's goal were redefined as "achieve G with probability p" for some p < 1? It appears this would not work either, since the machine would never be fully certain of being p certain of having achieved G (and so on).
Yet one can specify a set of conditions for which a program will terminate, so how is the argument above fallacious?
Solution in ROT13: Va beqre gb unyg fhpu na ntrag qbrfa'g arrq gb *xabj* vg'f c pregnva, vg bayl arrqf gb *or* c pregnva; nf gur pbaqvgvba vf rapbqrq, gur unygvat jvyy or gevttrerq bapr gur ntrag ragref gur fgngr bs c pregnvagl, ertneqyrff bs jurgure vg unf (shyy) xabjyrqtr bs vgf fgngr.