gregconen comments on Fusing AI with Superstition - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (75)
This solves nothing. If we knew the failure mode exactly, we could forbid it explicitly, rather than resort to some automatic self-destruct system. We, as humans, do not know exactly what the AI will do to become Unfriendly; that's a key point to understand. Since we don't know the failure mode, we can't design a superstition to stop it, anymore than we can outright prohibit it.
This is, in fact, worse than explicit rules. It requires the AI to actively want to do something undesirable, instead of it occurring as a side effect.