cousin_it comments on The virtual AI within its virtual world - Less Wrong Discussion
Comments (34)
Yeah, this should work correctly, assuming that the AI's prior specifies just one mathematical world, rather than e.g. a set of possible mathematical worlds weighted by simplicity. I posted about something similar five years ago.
The application to "fake cancer" is something that hadn't occurred to me, and it seems like a really good idea at first glance.
Thanks, that's useful. I'll think about how to formalise this correctly. Ideally I want a design where we're still safe if a) the AI knows, correctly, that pressing a button will give it extra resources, but b) still doesn't press it because it's not part of its description.