A few examples (in approximately increasing order of controversy):
If you proceed anyway...
- Identify knowledge that may be dangerous. Forewarned is forearmed.
- Try to cut dangerous knowledge out of your decision network. Don’t let it influence other beliefs or your actions without your conscious awareness. You can’t succeed completely at this, but it might help.
- Deliberately lower dangerous priors: acknowledge the possibility that your brain is contaminating your reasoning, then overcompensate, because you know that you’re still overconfident.
- Spend a disproportionate amount of time seeking contradictory evidence. If believing something could have a great cost to your values, make a commensurately great effort to be right.
- Just don’t do it. It’s not worth it. And if I found out, I’d have to figure out where you live, track you down, and kill you.
I'd say it's rather a form of conceptual confusion: you can't change a concept ("change" is itself a "timeful" concept, meaningful only as a property within structures which are processes in the appropriate sense). But it's plausible that creating agents with a slightly different explicit preference will result in a better outcome than, all else equal, giving those agents your own preference. Of course, you'd probably need to be a superintelligence to make decisions like this correctly, at which point creating agents with a given preference might cease to be a natural concept.