Sebastian_Hagen comments on Superintelligence 11: The treacherous turn - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (50)
Relevant post: Value is Fragile. Truly Friendly goal systems would probably be quite complicated. Unless you make your tests even more complicated and involved (and do it in just the right way - this sounds hard!), the FAI is likely to be outperformed by something with a simpler utility function that nevertheless performs adequately on your test cases.
Yes, I agree that getting the right tests is probably hard. What you need is to achieve the point where the FAI's utility function + the utility function that fits the test cases compresses better than the unfriendly AI's utility function + the utility function that fits the test cases.
To prevent human children taking a treacherous turn we spend billions: We isolate children from dangers, complexity, perversitiy, drugs, porn, aggression and presentations of these. To create a utility function that covers many years of caring social education is AI complete. A utility function is not enough - we have to create as well the opposite: the taboo and fear function.