gwern comments on The Strangest Thing An AI Could Tell You - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (574)
This is a little tricky, I'll admit, but if we could just ignore whatever the AI says - which is in a different modality from whatever it is we're ignoring - then doesn't that defeat the whole thought experiment? By the same logic, you could just ignore the anosognosic module that you, in a fit of absence of mind, wrote into your AI and subsequently overlooked in all your reviews.
(Yes, a module full of code like that would look absolutely nothing like what was being censored, but it's not like the statement '90% of SIDS cases are actually irritated mothers murdering their kids' looks anything like an irritated mother murdering her child either.)