A few examples (in approximately increasing order of controversy):
If you proceed anyway...
- Identify knowledge that may be dangerous. Forewarned is forearmed.
- Try to cut dangerous knowledge out of your decision network. Don’t let it influence other beliefs or your actions without your conscious awareness. You can’t succeed completely at this, but it might help.
- Deliberately lower dangerous priors by acknowledging the possibility that your brain is contaminating your reasoning, and then overcompensating, because you know that you're still too overconfident.
- Spend a disproportionate amount of time seeking contradictory evidence. If believing something could have a great cost to your values, make a commensurately great effort to be right.
- Just don’t do it. It’s not worth it. And if I found out, I’d have to figure out where you live, track you down, and kill you.
I don't think that makes sense. Changing back is no more desirable than any other change.
Once you've changed, you've changed. Changing your utility function is undesirable in prospect, but it isn't bad in retrospect. You strive to avoid it; but once it's happened, you're glad it did.
Right; that's what happens by default. But if you find that you're overly reluctant to take useful actions that change your values as a side effect (because your future self will want to keep its new values), you might want to precommit to rolling back certain changes; or, if you can't keep track of all the side effects, it's conceivable you'd want to make that a general habit. I could see this being either a good or a bad idea on net.
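To spell out the asymmetry in the exchange above (the notation here is mine, not the commenters'): write $U$ for the current utility function and $U'$ for the changed one. Judged by $U$, switching is weakly dispreferred, since an agent that goes on to maximize $U'$ can do no better by $U$'s lights than one that keeps maximizing $U$:

$$\mathbb{E}[U \mid \text{maximize } U'] \le \mathbb{E}[U \mid \text{maximize } U].$$

After the switch, the agent judges by $U'$, and the symmetric inequality

$$\mathbb{E}[U' \mid \text{maximize } U'] \ge \mathbb{E}[U' \mid \text{maximize } U]$$

says it has no reason to regret the change. That is the sense in which a utility-function change is something you strive to avoid beforehand, yet are glad of once it has happened.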