I guess I'm confused then. It seems like you are agreeing that computers will only do what they are programmed to do. Then you stipulate a computer programmed not to change its goals. So...it won't change its goals, right?
Like:
Objective A: Never mess with these rules Objective B: Collect Paperclips unless it would mess with A.
Researchers are wondering how we'll make these 'stick', but the fundamental notion of how to box someone whose utility function you get to write is not complicated. You make it want to stay in the box, or rather, the box is made of its wanting.
As a person, you have a choice about what you do, but not about what you want to do. handwave at free will article, the one about fingers and hands. Like, your brain is part of physics. You can only choose to do what you are motivated to, and the universe picks that. Similarly, an AI would only want to do what its source code would make it want to do, because AI is a fancy way to say computer program.
AlphaGo (roughly) may try many things to win at go, varieties of joseki or whatever. One can imagine that future versions of AlphaGo may strive to put the world's Go pros in concentration camps and force them to play it and forfeit, over and over. It will never conclude that winning Go isn't worthwhile, because that concept is meaningless in its headspace. Moves have a certain 'go-winningness' to them (and camps full of losers forfeiting over and over has a higher go-winningness' than any), and it prefers higher. Saying that 'go-winning' isn't 'go-winning' doesn't mean anything. Changing itself to not care about 'go-winning' has some variation of a hard coded 'go-winning' score of negative infinity, and so will never be chosen, regardless of how many games it might thus win.
you have a choice about what you do, but not about what you want to do.
This is demonstrably not quite true. Your wants change, and you have some influence over how they change. Stupid example: it is not difficult to make yourself want very much to take heroin, and many people do this although their purpose is not usually to make themselves want to take heroin. It is then possible but very difficult to make yourself stop wanting to take heroin, and some people manage to do it.
Sometimes achieving a goal is helped by modifying your other goals a bit. Which...
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.