If it's worth saying, but not worth its own post, then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday, and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "
For what it's worth, though, as far as I can tell we don't have the ability to create an AI that will reliably maximize the number of paperclips in the real world, even with infinite computing power. As Manfred said, model-based goals seem to be a promising research direction for getting AIs to care about the real world, but we don't currently know how to get such an AI to reliably "value paperclips". Model-based goals run into a lot of problems even in the POMDP setting, let alone when the agent's model of the world or observation space can change. So I wouldn't expect anyone to be able to propose a fully coherent, complete answer to your question in the near term.
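To make the distinction a bit more concrete, here is a toy sketch (my own illustration, not anything from the papers mentioned in this thread) of a utility defined over the agent's model of the hidden state versus a reward defined over raw observations. The state space, the probabilities, and every name in it (`update_belief`, `model_based_utility`, `observation_based_reward`) are made up for illustration, and it sidesteps all the hard parts, such as where the model comes from and what happens when the model's ontology changes.

```python
def update_belief(prior, likelihood, observation):
    """Bayes update over hidden states.

    prior: dict mapping state -> P(state)
    likelihood: dict mapping state -> {observation: P(observation | state)}
    """
    unnormalized = {s: prior[s] * likelihood[s][observation] for s in prior}
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


def expected_utility(belief, utility):
    """Expected value of a utility function defined on hidden states."""
    return sum(p * utility(s) for s, p in belief.items())


# Hypothetical hidden states: (paperclip_exists, camera_tampered).
prior = {
    (True, False): 0.45,
    (False, False): 0.45,
    (True, True): 0.05,
    (False, True): 0.05,
}

# Assumption: a tampered camera always shows a paperclip,
# while an honest camera is right 90% of the time.
likelihood = {
    (True, False): {"clip": 0.9, "no_clip": 0.1},
    (False, False): {"clip": 0.1, "no_clip": 0.9},
    (True, True): {"clip": 1.0, "no_clip": 0.0},
    (False, True): {"clip": 1.0, "no_clip": 0.0},
}


def model_based_utility(state):
    """Cares about whether a paperclip exists in the modeled world."""
    paperclip_exists, _camera_tampered = state
    return 1.0 if paperclip_exists else 0.0


def observation_based_reward(observation):
    """Cares only about what the camera shows."""
    return 1.0 if observation == "clip" else 0.0


belief = update_belief(prior, likelihood, "clip")
model_value = expected_utility(belief, model_based_utility)
obs_value = observation_based_reward("clip")
```

In this toy setup the observation-based reward is fully satisfied by seeing "clip", whereas the model-based value stays below 1 because the belief keeps some probability on "no paperclip, camera tampered". An observation-based agent would be happy to tamper with the camera; the model-based one gains nothing from doing so. The open problems mentioned above are exactly the parts this sketch assumes away.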
It might be useful to think about how humans "solve" this problem, and whether or not you can port this behavior over to an AI.
If you're interested in this topic, I would recommend MIRI's paper on value learning as well as the relevant Arbital Technical Tutorial.