So basically: redomain your utility function by composing it with an adaptor, where the adaptor is a map from new-ontology -> old-ontology. Construct the adaptor by reverse-engineering your algorithms. Have I got that right?
Edit: No this sucks. Sometimes the old ontology doesn't make sense. I must think more. /Edit
That's a good statement of the problem, but I can see that "reverse engineer your algorithms" is the hard part, and we've just bottled it up as a black box. There's no obvious way to deal with cases that couldn't exist in your old ontology (brain damage can't exist in a simple dualist ontology, for example), or cases where there's a disagreement (teleportation and destructive-scan + print are different when things are ontologically basic, but more advanced physics says they are the same).
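To make the adaptor idea concrete, here is a minimal toy sketch of "compose the old utility function with a map from new-ontology to old-ontology". Everything in it is a made-up illustration rather than anything proposed in the thread: the names `old_utility`, `adaptor`, and `redomain`, and the state encodings, are all assumptions. The adaptor returns `None` for states the old ontology cannot represent, which is exactly where the hard part stays unsolved.

```python
from typing import Callable, Dict, Optional

# Hypothetical toy ontologies, for illustration only.
OldState = str               # old ontology: a person is a primitive that is alive or dead
NewState = Dict[str, bool]   # new ontology: facts about physical structure

def old_utility(state: OldState) -> float:
    """Utility as originally defined, over old-ontology states."""
    return {"person_alive": 1.0, "person_dead": 0.0}[state]

def adaptor(state: NewState) -> Optional[OldState]:
    """Map a new-ontology state to an old-ontology state.

    Returns None when the old ontology has no counterpart for the state.
    """
    if state.get("brain_damage", False):
        # A simple dualist ontology has no slot for this case.
        return None
    return "person_alive" if state.get("pattern_running", False) else "person_dead"

def redomain(
    utility: Callable[[OldState], float],
    adapt: Callable[[NewState], Optional[OldState]],
) -> Callable[[NewState], Optional[float]]:
    """Compose utility with the adaptor to get a utility over new-ontology states."""
    def new_utility(state: NewState) -> Optional[float]:
        old = adapt(state)
        # Undefined wherever the adaptor cannot translate the state.
        return None if old is None else utility(old)
    return new_utility

new_utility = redomain(old_utility, adaptor)
```

In this sketch `new_utility({"pattern_running": True})` inherits the old value for a living person, while any state with `"brain_damage": True` comes back as `None`: the composition alone gives no answer there, and something beyond the adaptor has to decide what was valued.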
Some help may come from the fact that we seem to have some built-in support for ontology-shifting. It does happen successfully, though perhaps not always without loss. On the other hand, people with the same ontology don't seem to diverge much by getting there through different update chains.
Pretty accurate description of the mostly-the-same attempt. Also agreed that reverse-engineering your labelers is hard.
I think that for the other examples you would then need to take the more dangerous alternative and try to figure out what it was about your original concept that you valued. It seems like you can do this with a mix of built-in hardware (if you're a human) and trying to come up with explanations of what would cause you to value something.
Like, for physical contiguity I value the fact that if I interact with a "person" then the result of th...
In a previous post, I argued that nihilism is often shortchanged around here. However, I'm far from certain that it is correct, and in the meantime I think we should be careful not to discard our values one at a time by engaging in "selective nihilism" when faced with an ontological crisis, without even realizing that's what's happening. Karl recently reminded me of the post Timeless Identity by Eliezer Yudkowsky, which I noticed seems to be an instance of this.
As I mentioned in the previous post, our values seem to be defined in terms of a world model where people exist as ontologically primitive entities ruled heuristically by (mostly intuitive understandings of) physics and psychology. In this kind of decision system, both identity-as-physical-continuity and identity-as-psychological-continuity make perfect sense as possible values, and it seems humans do "natively" have both values. A typical human being is both reluctant to step into a teleporter that works by destructive scanning, and unwilling to let their physical structure be continuously modified into a psychologically very different being.
If faced with the knowledge that physical continuity doesn't exist in the real world at the level of fundamental physics, one might conclude that it's crazy to continue to value it, and this is what Eliezer's post argued. But if we apply this reasoning in a non-selective fashion, wouldn't we also conclude that we should stop valuing things like "pain" and "happiness", which also do not seem to exist at the level of fundamental physics?
In our current environment, there is widespread agreement among humans as to which macroscopic objects at time t+1 are physical continuations of which macroscopic objects existing at time t. We may not fully understand what exactly we're doing when judging such physical continuity, the agreement tends to break down when we start talking about more exotic situations, and if/when we do fully understand our criteria for judging physical continuity, they are unlikely to have a simple definition in terms of fundamental physics. But all of this is true for "pain" and "happiness" as well.
I suggest we keep all of our (potential/apparent) values intact until we have a better handle on how we're supposed to deal with ontological crises in general. If we convince ourselves that we should discard some value, and that turns out to be wrong, the error may be unrecoverable once we've lived with it long enough.