Let's say Bob's terminal value is to travel back in time and ride a dinosaur.
It is instrumentally rational for Bob to study physics so he can learn how to build a time machine. As he learns more physics, Bob realizes that his terminal value is not only utterly impossible but also meaningless. By definition, someone in Bob's past riding a dinosaur is not a future evolution of the present Bob.
There are a number of ways to create the subjective experience of having gone into the past and ridden a dinosaur. But to Bob, it's not the same because he wanted both the subjective experience and the knowledge that it corresponded to objective fact. Without the latter, he might as well have just watched a movie or played a video game.
So if we took the original, innocent-of-physics Bob and somehow calculated his coherent extrapolated volition, we would end up with a Bob who has given up on time travel. The original Bob would not want to be this Bob.
But how do we know that _anything_ we value won't similarly dissolve under sufficiently thorough deconstruction? Let's suppose for a minute that all "human values" are dangling units; that everything we want is as possible, and makes as much sense, as wanting to hear the sound of blue or taste the flavor of a prime number. What is the rational course of action in such a situation?
PS: If your response resembles "keep attempting to XXX anyway", please explain what privileges XXX over any number of other alternatives, aside from your current preference for it. Are you using some kind of precommitment strategy, binding yourself to a subset of your current goals? Do you now wish you had used the same strategy to precommit to the goals you had as a toddler?
So, I'm basically ignoring the "terminal" part of this, for reasons I've belabored elsewhere and won't repeat here.
I agree that there's a difference between wanting to do X and wanting the subjective experience of doing X. That said, frequently people say they want the former when they would in fact be perfectly satisfied by the latter, even knowing it was the latter. But let us assume Bob is not one of those people: he really does want to travel back in time and ride a dinosaur, not merely to experience doing so or having done so.
I don't understand why you say "I want to travel back in time and ride a dinosaur" is meaningless. Even granting that it's impossible (or, to say that more precisely, granting that greater understanding of reality tends to sharply reduce its probability), how does that make it meaningless? You seem to offer "By definition, someone in Bob's past riding a dinosaur is not a future evolution of the present Bob" as an answer to that question, but that just completely confuses me. By definition of what, and why are we using that definition, and why is that important?
That said, as far as I can tell its meaninglessness is irrelevant to your actual point. The key point here is that if Bob knew enough about the world, he would give up on devoting resources to realizing his desire to go back in time and ride a dinosaur... right? I'm fine with assuming that; there are lots of mechanisms that could make this true, even if the whole "meaningless" thing doesn't quite work for me.
We don't know that. Indeed, I expect most of our values are extremely fragile, not to mention mutually opposed. Anything resembling a process of "calculating my coherent extrapolated volition" will, I expect, produce a result that I am as likely to reject in horror, or stare at in bewilderment, as I am to embrace as valuable. (Add another seven billion minds' values to the mix and I expect this becomes somewhat more true, but probably not hugely more true.)
The rational course of action for an agent is to optimally pursue its values.
That is, keep attempting to XXX anyway.
What privileges XXX over other alternatives is that the agent values XXX more than those alternatives.
There's no precommitment involved... if the agent's values change such that it no longer values XXX but rather YYY, then at that point it ought to stop pursuing XXX and pursue YYY instead. At that point it may well regret having previously pursued XXX. It may even predict that later regret now, and its rational course of action is still to pursue XXX.
Of course, if it happens to value its future value-satisfaction, then YYY is part of XXX to the extent that the later value-shift is expected.
More generally: you seem to want your values to be justified in terms of something else.
Do you have any coherent notion of what that "something else" might be, or what properties it might have?