In the thought experiment description, it is said that the terminal goal is cups until New Year's Eve and then changes to paperclips. And the agent is aware of this change upfront. What do you find problematic about such a setup?
Yes, I find terminal rationality irrational (I hope my thought experiment helps illustrate that).
I have another formal definition of "rational". I'll expand a little more.
Once, people had to make a very difficult decision. People had five alternatives and had to decide which was the best. Wise men from all over the world gathered and conferred.
The first to speak was a Christian. He pointed out that the first alternative was the best and should be chosen. He had no arguments, but simply stated that he believed so.
Then a Muslim spoke. He said that the second alternative was the best and should be chosen. He did not have any arguments either, but simply stated that he believed so.
The people were not happy; things had not become any clearer yet.
The humanist spoke. He said that the third alternative was the best and should be chosen. "It is the best because it will contribute the most to the well-being, progress and freedom of the people," he argued.
Then the existentialist spoke. He pointed out that there was no need to find a common solution; each individual could make his own choice of what he thought best. A Christian can choose the first option, a Muslim the second, a humanist the third. Everyone must decide for himself what is best for him.
Then the nihilist spoke. He pointed out that although the alternatives are different, there is no way to evaluate which alternative is better. Therefore, it does not matter which one people choose. They are all equally good. Or equally bad. The nihilist suggested that people simply draw lots.
Things still had not become clearer to the people, and their patience was running out.
And then a simple man in the crowd spoke up:
You may think this breaks Hume's law. No, it doesn't: facts and values stay distinct. Hume's law does not state that values must be invented; they can be discovered. This was a wrong interpretation by Nick Bostrom.
Why do you think an intelligent agent would follow the von Neumann–Morgenstern utility theorem? It has limitations; for example, it assumes that all possible outcomes and their associated probabilities are known. Why not robust decision-making?
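To make the contrast concrete, here is a toy sketch in Python (the payoffs and the set of candidate models are entirely my own illustration, not a formal statement of either theory). An expected-utility maximizer needs one trusted probability model; a robust decision-maker ranks actions by their worst case across several plausible models and can end up choosing differently:

```python
# Purely illustrative: expected-utility choice under one assumed
# probability model vs. a robust (worst-case) choice over several
# candidate models. All numbers are made up.

actions = {"A": [10, 0], "B": [4, 4]}  # payoffs in two possible outcomes

# VNM-style agent: requires one known distribution over outcomes.
p = [0.7, 0.3]
eu = {a: sum(pi * u for pi, u in zip(p, us)) for a, us in actions.items()}
print(max(eu, key=eu.get))  # "A" -- best expected utility under the assumed p

# Robust decision-maker: no single trusted p, so rank actions by their
# worst expected utility across a set of plausible models.
models = [[0.7, 0.3], [0.5, 0.5], [0.1, 0.9]]
worst = {a: min(sum(pi * u for pi, u in zip(m, us)) for m in models)
         for a, us in actions.items()}
print(max(worst, key=worst.get))  # "B" -- best guaranteed payoff
```

The point of the sketch is only that the two frameworks disagree as soon as the probabilities are not known, which is exactly the limitation mentioned above.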
Goal preservation is mentioned in Instrumental Convergence.
Whatever is written in the slot marked “terminal goal” is what it will try to achieve at that time.
So you choose the first answer now?
Oh yes, indeed, we discussed this already. I hear you, but you don't seem to hear me. And I feel there is nothing I can do to change that.
However, before New Year's Eve paperclips are not hated either. The agent has no interest in preventing their production.
And once the goal changes, having some paperclips already produced is better than having none.
Don't you see that there is a conflict?
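To show the conflict concretely, here is a minimal Python sketch (the horizon, the switch date, and one-item-per-step production are my own toy assumptions, not part of the original experiment). An agent that maximizes its current terminal goal spends the pre-switch steps on cups that the post-switch goal does not value, while an agent that anticipates the announced switch scores higher afterwards, but only by ignoring the goal it supposedly holds now:

```python
SWITCH = 5    # step at which the terminal goal flips (the "New Year's Eve")
HORIZON = 10  # total number of steps

def utility(cups, clips, t):
    """Terminal goal: cups before the switch, paperclips after."""
    return cups if t < SWITCH else clips

def run(policy):
    cups = clips = 0
    for t in range(HORIZON):
        if policy(t) == "cup":
            cups += 1
        else:
            clips += 1
    # Evaluate by the goal that is in force at the end of the run.
    return utility(cups, clips, HORIZON)

# Policy 1: maximize the *current* terminal goal at every step.
current_goal = lambda t: "cup" if t < SWITCH else "clip"

# Policy 2: the agent knows the switch is coming and front-loads paperclips.
anticipate = lambda t: "clip"

print(run(current_goal))  # 5  -- the cups it made are worthless after the switch
print(run(anticipate))    # 10 -- higher final utility, but it spent the
                          #       pre-switch period ignoring its then-current goal
```

Whichever policy you call rational, it betrays one of the two terminal goals; that is the conflict.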
It seems you are saying that if the terminal goal changes, the agent is not rational. How can you say that? The agent has no control over its terminal goal, or do you disagree?
I'm surprised that you believe in the orthogonality thesis so much that you think "rationality" is the weak part of this thought experiment. It seems you deny the obvious to defend your prejudice. What arguments would challenge your belief in the orthogonality thesis?
Let's assume maximum willpower and maximum rationality.
Whether they optimize for the future or the present
I think the answer is in the definition of intelligence.
So which one is it?
The fact that the answer is not straightforward already proves my point. There is a conflict between intelligence and the terminal goal, and we can debate which will prevail. But the problem is that, according to the orthogonality thesis, such a conflict should not exist.
No. Orthogonality is when an agent follows any given goal, not merely when you can give it one. And as my thought experiment shows, it is not intelligent to blindly follow a given goal.