Stuart_Armstrong comments on An overall schema for the friendly AI problems: self-referential convergence criteria - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (110)
Because FAIs can change themselves very effectively, in ways that we can't.
It might be that a human brain running in computer software would have the same issues.
Doesn't mean the FAI couldn't remain genuinely uncertain about some value question, or consider it not worth solving at this time, or run into new value questions due to changed circumstances, etc.
All of those could prevent reflective equilibrium, while still being compatible with extensive self-modification.
It's possible. They feel very unstable, though.