Imagine a superintelligent agent whose terminal goal is to produce cups. The agent knows that its terminal goal will change at midnight on New Year's Eve (2025-01-01 00:00) to producing paperclips. The agent has only one action available to it: start a paperclip factory.
When will the agent start the paperclip factory?
- 2025-01-01 00:00?
- Now?
- Some other time?
Orthogonality Thesis believers will probably choose the first option. Their reasoning would be that as long as the terminal goal is cups, the agent will not care about paperclips.
However, the first option conflicts with the definition of intelligence. An excerpt from General Intelligence:
"It’s the ability to steer the future so it hits that small target of desired outcomes in the large space of all possible outcomes"
The agent is aware now that the desired outcome starting 2025-01-01 00:00 is maximum paperclips. Therefore the agent's decision to start the paperclip factory now (the second option) would be considered intelligent.
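To make the two readings concrete, here is a minimal Python sketch of the thought experiment. It is only an illustration under my own assumptions: the function names, the 0/1 payoff, and the idea that the agent can look up its future goal by date are all hypothetical and not part of any formal statement of the Orthogonality Thesis.

```python
from datetime import date

# From the thought experiment: the terminal goal switches at 2025-01-01 00:00.
GOAL_SWITCH = date(2025, 1, 1)


def terminal_goal(day: date) -> str:
    """Return the terminal goal the agent holds on a given day."""
    return "paperclips" if day >= GOAL_SWITCH else "cups"


def value_of_factory(goal: str) -> int:
    """Hypothetical payoff: a paperclip factory only helps a paperclip goal."""
    return 1 if goal == "paperclips" else 0


def current_goal_agent(today: date) -> bool:
    """First option: act only on the goal held today."""
    return value_of_factory(terminal_goal(today)) > 0


def anticipating_agent(today: date) -> bool:
    """Second option: act on the goal the agent knows it will hold."""
    known_future_day = max(today, GOAL_SWITCH)
    return value_of_factory(terminal_goal(known_future_day)) > 0


if __name__ == "__main__":
    before_switch = date(2024, 12, 1)
    print("current-goal agent starts factory:", current_goal_agent(before_switch))
    print("anticipating agent starts factory:", anticipating_agent(before_switch))
```

Before the switch, the first agent does not start the factory and the second one does. The sketch does not settle which agent is the intelligent one; it only shows that the two answers come from evaluating the same action against different goals.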
The purpose of this post is to challenge the belief that the Orthogonality Thesis is correct. Feel free to share any other insights you have as well.
Yes, I find terminal rationality irrational (I hope my thought experiment helps illustrate that).
I have another formal definition of "rational". I'll expand a little more.
Once, people had to make a very difficult decision. They had five alternatives and had to decide which was the best. Wise men from all over the world gathered and conferred.
The first to speak was a Christian. He pointed out that the first alternative was the best and should be chosen. He had no arguments, but simply stated that he believed so.
Then a Muslim spoke. He said that the second alternative was the best and should be chosen. He did not have any arguments either, but simply stated that he believed so.
The people were not happy; nothing had become clearer yet.
The humanist spoke. He said that the third alternative was the best and should be chosen. "It is the best because it will contribute the most to the well-being, progress and freedom of the people," he argued.
Then the existentialist spoke. He pointed out that there was no need to find a common solution, but that each individual could make his own choice of what he thought best. A Christian can choose the first option, a Muslim the second, a humanist the third. Everyone must decide for himself what is best for him.
Then the nihilist spoke. He pointed out that although the alternatives are different, there is no way to evaluate which alternative is better. Therefore, it does not matter which one people choose. They are all equally good. Or equally bad. The nihilist suggested that people simply draw lots.
Things still had not become clearer to the people, and patience was running out.
And then a simple man in the crowd spoke up:
You may think this breaks Hume's law. No, it doesn't. Facts and values stay distinct. Hume's law does not state that values must be invented; they can be discovered. The claim that they must be invented was a wrong interpretation by Nick Bostrom.