hairyfigment comments on An overall schema for the friendly AI problems: self-referential convergence criteria - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Wait... what? No.
You don't solve the value-alignment problem by writing down your confusions about the foundations of moral philosophy, because writing down confusion leaves you just as confused as before. No amount of intelligence can solve an ill-posed problem, other than by pointing out that the problem is ill-posed.
You solve it by removing the need to do moral philosophy at all, and instead specifying a computation that corresponds to your moral psychology and its real, actually-existing, specifiable properties.
And then telling metaphysics to take a running jump to boot, and crunching down on Strong Naturalism brand crackers, which come in neat little bullet shapes.
Near as I can tell, you're proposing some "good meta-ethical rules," though you may have skipped the difficult parts. And I think the claim, "you stop when your morality is perfectly self-consistent," was more a factual prediction than an imperative.
I didn't skip the difficult bits, because I didn't propose a full solution. I stated an approach to dissolving the problem.
And do you think that approach differs from the one you quoted?
It involves reasoning about facts rather than metaphysics.