Saving 3^^^^3 people is more than worth a bit of vulnerability to blackmail. If 3^^^^3 people are in danger, the AI wishes to believe 3^^^^3 people are in danger, and in that case "never surrender to blackmail" is a strictly worse strategy.
Also, DP0 isn't even a coherent decision process. The expected utilities will fail to converge if "there's no upper bound to the disutility such a hypothetical agent may claim" and these claims are interpreted with some standard assumptions, so the agent has no way of even comparing the expected utilities of actions.
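To make the non-convergence point concrete, here is a toy sketch. The numbers are purely illustrative assumptions, not anything from the thread: suppose the k-th hypothetical blackmailer gets prior weight 2^-k but claims disutility 3^k. The function name and both parameters are made up for the example.

```python
from fractions import Fraction


def partial_expected_disutility(n_terms, prior_base=2, claim_base=3):
    """Sum of prior_k * claim_k for k = 1..n_terms.

    Assumed toy model: prior_k = prior_base**-k, claim_k = claim_base**k.
    Exact fractions are used so rounding cannot hide the growth.
    """
    total = Fraction(0)
    for k in range(1, n_terms + 1):
        prior = Fraction(1, prior_base ** k)   # probability of hypothesis k
        claimed_disutility = claim_base ** k   # harm that hypothesis k threatens
        total += prior * claimed_disutility
    return total


# With claims growing faster than priors shrink, the partial sums keep growing.
for n in (5, 10, 20, 40):
    print(n, float(partial_expected_disutility(n)))

# With bounded claims (claim_base=1) the same sum stays below 1.
print(float(partial_expected_disutility(40, claim_base=1)))
```

Under these assumptions each term is (3/2)^k, so the partial sums grow without bound and there is no finite expected disutility to compare across actions. If the claims are bounded, the sum settles down, which is why the "no upper bound" clause is doing the work.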
"If 3^^^^3 people are in danger, the AI wishes to believe 3^^^^3 people are in danger"
This isn't about beliefs; this is about decisions. The process of epistemic rationality needn't be modified, only the process of instrumental rationality. Regardless of how much probability the AI assigns to the danger for 3^^^^3 people, it needn't be the right choice to decide based on a mere probability of such danger multiplied by the disutility of the harm done.
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.