The interpretation of "reflective equilibrium" that I currently have in mind is something like this (written by Eliezer), which I think is pretty close to Yvain's version as well:
I see the project of morality as a project of renormalizing intuition. We have intuitions about things that seem desirable or undesirable, intuitions about actions that are right or wrong, intuitions about how to resolve conflicting intuitions, intuitions about how to systematize specific intuitions into general principles.
And this may not be too different from what you have in mind when you say "unrestricted iterated self-modification", but I wanted to point out that we could easily diverge in reflective equilibrium even without "hardware" self-modification, just by thinking and applying our intuitions, if those intuitions, and especially the meta-level ones, differ at the start. (And I do think this is pretty obvious, so it confuses me when Eliezer does not acknowledge it when he talks about CEV.)
So this interpretation of "reflective equilibrium" is almost useless, right?
I'm not sure what you mean, but in this case it seems at least useful for demonstrating that we don't have an argument showing that our "actual values" are complex. (Do you mean it's not useful as a way to build FAI?)
(Do you mean it's not useful as a way to build FAI?)
Yes.
we don't have an argument showing that our "actual values" are complex
Do you agree that FAI probably needs to have a complex utility function, because most simple ones lead to futures we wouldn't want to happen? The answer to that question doesn't seem to depend on notions like reflective equilibrium or Yvain's "actual values", unless I'm missing something again.
In the Wiki article on complexity of value, Eliezer wrote:
But in light of Yvain's recent series of posts (i.e., if we consider our "actual" values to be the values we would endorse in reflective equilibrium, instead of our current apparent values), I don't see any particular reason, whether from evolutionary psychology or elsewhere, that they must be complex either. Most of our apparent values (which admittedly are complex) could easily be mere behaviors that we would discard after sufficient reflection.
For those who might wish to defend the complexity-of-value thesis, what reasons do you have for thinking that human value is complex? Is it from an intuition that we should translate as many of our behaviors into preferences as possible? If other people do not have a similar intuition, or perhaps even have a strong intuition that values should be simple (and therefore would be more willing to discard things that are on the fuzzy border between behaviors and values), could they think that their values are simple, without being wrong?