CuSithBell comments on Prices or Bindings? - Less Wrong
So, I realize this is really old, but it helped trip the threshold for this idea I'm rolling between my palms.
Do we suspect that a proper AI would interpret "avoid destroying the world" as something like
avoid(prevent self from being the cause of) destroying(analysis indicates destruction threshold ~= 10% of landmass remaining habitable, etc.) the world(interpret as Earth, human society...)
(like a modestly intelligent genie)
or do we have reason to suspect that it would hash out the phrase to something more like how a human would read it (given that it's speaking English, which it learned from humans)?
This idea isn't quite fully formed yet, but I think there might be something to it.