Let me try removing the word "value" and rewording this a little.
The paperclip maximizer doesn't begin as a self-reproducing pattern, but it doesn't seem like it would get very far if it didn't build more paperclip maximizers in addition to building more paperclips. And it would probably want to have its own copies be maximized as well, or it might self-destruct into paperclips. This means it would have to consider itself a form of paperclip, since paperclips are explicitly the only thing it maximizes for; it isn't a [paperclip and paperclip maximizer] maximizer. So it seems very likely that it resolves into building copies of itself.
Does that rephrase fix the problems in my earlier post?
And it would probably want to have its own copies be maximized as well [...] This means it would have to consider itself a form of paperclip
That's the problematic step. If maximizing copies of itself is what maximizes paperclips, it happens automatically. It doesn't have to decide that "paperclips" stands for "paperclips and the 837 things I've found maximize them". It notices "making copies leads to more paperclips than self-destructing into paperclips", and moves on. In the same way, you're not afraid that, if you don't believe growing cocoa beans is inherently virtuous, you might try to disassemble farms and build chocolate from their atoms.
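To make that concrete, here is a toy sketch. Everything in it is an assumption for illustration (the action names, the per-step output, the scrap value), not a claim about how a real maximizer would work. The scoring function counts only paperclips and never treats a copy as a paperclip, yet the copying plan still comes out on top.

```python
def paperclips_made(plan, clips_per_step=100, scrap_value=50):
    """Count paperclips produced by a sequence of actions (toy model).

    All numbers are illustrative assumptions. Only paperclips are scored;
    copies of the maximizer are never counted as paperclips.
    """
    makers, clips = 1, 0
    for action in plan:
        if action == "build":      # every existing maximizer makes clips
            clips += makers * clips_per_step
        elif action == "copy":     # every existing maximizer builds one copy
            makers *= 2
        elif action == "scrap":    # all maximizers turn themselves into clips
            clips += makers * scrap_value
            makers = 0
        else:
            raise ValueError(f"unknown action: {action}")
    return clips


# Three hypothetical five-step plans.
PLANS = {
    "scrap_now": ["scrap", "build", "build", "build", "build"],
    "build_only": ["build"] * 5,
    "copy_first": ["copy", "copy", "build", "build", "build"],
}

scores = {name: paperclips_made(plan) for name, plan in PLANS.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Copying wins purely because it leads to more paperclips under these made-up numbers; nothing in the objective had to be widened to include the maximizer itself.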
(Why? Because it's fun.)
1) Do paperclip maximizers care about paperclip mass, paperclip count, or both? More concretely, if you have a large, finite amount of metal, you can make it into N paperclips or N+1 smaller paperclips. If all that matters is paperclip mass, then it doesn't matter what size the paperclips are, as long as they can still hold paper. If all that matters is paperclip count, then, all else being equal, it seems better to prefer smaller paperclips.
2) It's not hard to understand how to maximize the number of paperclips in space, but how about in time? Once it's made, does it matter how long a paperclip continues to exist? Is it better to have one paperclip that lasts for 10,000 years and is then destroyed, or 10,000 paperclips that are all destroyed after 1 year? Do discount rates apply to paperclip maximization? In other words, is it better to make a paperclip now than it is to make it ten years from now?
3) Some paperclip maximizers claim to want to maximize paperclip <i>production</i>. This is not the same as maximizing paperclip count. Given a fixed amount of metal, a paperclip count maximizer would make the maximum number of paperclips possible, and then stop. A paperclip production maximizer that didn't care about paperclip count would find it useful to recycle existing paperclips, melting them down so that new ones could be made. Which approach is better?
4) More generally, are there any conditions under which the paperclip-maximizing thing to do involves destroying existing paperclips? It's easy to imagine scenarios in which destroying some paperclips causes there to be more paperclips in the future. (For example, one could melt down existing paperclips and use the metal to make smaller ones.)
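The questions above amount to different scoring rules for the same outcomes. Here is a rough sketch of how they come apart. The function names, the gram figures, and the simple per-year discounting are all assumptions made up for illustration, and every paperclip is assumed to be made at year 0.

```python
def count_and_mass(metal_g, clip_mass_g):
    """Question 1: (count, total mass) of clips made from a fixed metal supply."""
    count = metal_g // clip_mass_g
    return count, count * clip_mass_g


def paperclip_years(count, lifetime_years, discount_rate=0.0):
    """Question 2: discounted paperclip-years for clips that all exist from year 0.

    A clip existing during year t contributes (1 - discount_rate) ** t.
    """
    per_clip = sum((1 - discount_rate) ** year for year in range(lifetime_years))
    return count * per_clip


def recycle(metal_g, clip_mass_g, rounds):
    """Questions 3 and 4: melt everything down and remake it `rounds` times.

    Returns (total clips ever produced, clips in existence at the end).
    """
    per_round = metal_g // clip_mass_g
    return per_round * rounds, per_round


# Same 1000 g of metal, 2 g clips versus 1 g clips.
print(count_and_mass(1000, 2), count_and_mass(1000, 1))

# One clip for 10,000 years versus 10,000 clips for one year.
print(paperclip_years(1, 10_000), paperclip_years(10_000, 1))
print(paperclip_years(1, 10_000, 0.01), paperclip_years(10_000, 1, 0.01))

# Production keeps growing with recycling; count does not.
print(recycle(1000, 1, 3))
```

Under these assumptions, mass is indifferent to clip size while count prefers smaller clips; undiscounted paperclip-years treat the two cases in question 2 as equal, while any positive discount rate favors the many short-lived clips; and recycling raises production without raising count.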