Pretty accurate description of the mostly-the-same attempt. Also agreed that reverse-engineering your labelers is hard.
I think that for the other examples you would then need to take the more dangerous alternative and try to figure out what you valued about your original concept. It seems like you can do this with a mix of built-in hardware (if you're a human) and trying to come up with explanations for what would cause you to value something.
Like, for physical contiguity I value the fact that if I interact with a "person" then the result of that interaction will be causally relevant to them at some later time. That's very important if I want to like, interact with people in some way that I'll care about having done later. It would suck to make lunch plans with someone, and then have that be completely irrelevant to their later behavior/memory.
I also think it's worth mentioning that humans don't always value things because they have a conceptual framework that implies they should. I'm not saying that ontologies don't change, but it doesn't feel like most updates restructure the concepts that a value is expressed in. For instance, I used to be pretty pro-abortion, but recently found out that, emotionally at least, I find it very upsetting for reasons mostly unrelated to my previous justifications.
In a previous post, I argued that nihilism is often shortchanged around here. However, I'm far from certain that it is correct, and in the meantime I think we should be careful not to discard our values one at a time by engaging in "selective nihilism" when faced with an ontological crisis, without even realizing that's what's happening. Karl recently reminded me of the post Timeless Identity by Eliezer Yudkowsky, which I noticed seems to be an instance of this.
As I mentioned in the previous post, our values seem to be defined in terms of a world model where people exist as ontologically primitive entities ruled heuristically by (mostly intuitive understandings of) physics and psychology. In this kind of decision system, both identity-as-physical-continuity and identity-as-psychological-continuity make perfect sense as possible values, and it seems humans do "natively" have both values. A typical human being is both reluctant to step into a teleporter that works by destructive scanning, and unwilling to let their physical structure be continuously modified into a psychologically very different being.
If faced with the knowledge that physical continuity doesn't exist in the real world at the level of fundamental physics, one might conclude that it's crazy to continue to value it, and this is what Eliezer's post argued. But if we apply this reasoning in a non-selective fashion, wouldn't we also conclude that we should stop valuing things like "pain" and "happiness" which also do not seem to exist at the level of fundamental physics?
In our current environment, there is widespread agreement among humans as to which macroscopic objects at time t+1 are physical continuations of which macroscopic objects existing at time t. We may not fully understand what exactly it is we're doing when judging such physical continuity, and the agreement tends to break down when we start talking about more exotic situations. If and when we do fully understand our criteria for judging physical continuity, they're unlikely to have a simple definition in terms of fundamental physics. But all of this is true for "pain" and "happiness" as well.
I suggest we keep all of our (potential/apparent) values intact until we have a better handle on how we're supposed to deal with ontological crises in general. If we convince ourselves that we should discard some value, and that turns out to be wrong, the error may be unrecoverable once we've lived with it long enough.