torekp comments on Concept Safety: Producing similar AI-human concept spaces - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Compare jacob_cannell's earlier point.
Do we know, or can we reasonably infer, what those optimization criteria were like, so that we can implement them in our AI? If not, how likely is it that the optimal solution would change, and by how much?