hairyfigment comments on The genie knows, but doesn't care - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Yeah, I can think of two general ways to interpret this:
1. In a variant of CEV, the AI uses our utterances as evidence for what we would have told it if we thought more quickly etc. No single utterance carries much risk because the AI will collect lots of evidence and this will likely correct any misleading effects.
2. Having successfully translated the quoted instruction into formal code, we add another possible point of failure.