anotheruser comments on asking an AI to make itself friendly - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
They just bicker endlessly about uncertainty: "Can you really know that 1+1=2?" No, but it can be treated as valid until proven otherwise (which will never happen). As I said, the AI would need to understand the idea of uncertainty.
There is no such thing as objective morality. Good and evil are subjective ideas, nothing more. Firstly, unless someone explicitly tells the AI that it is a fundamental truth that nature is important to preserve, this cannot happen. Secondly, the AI would have to be incredibly gullible to simply swallow such a claim. Thirdly, even if the AI did believe it, it would plainly say so to the people it converses with, in accordance with its goal to always tell the truth, thus warning us of this bug.
I agree with you that an AGI probably wouldn't have the same problems humans have with the concept of truth. However, what you described is neither the issues philosophers raise nor the sorts of big-universe issues the AI might get stuck on.
But wouldn't that actually support my approach? Assume there really is something important that all of humanity misses but the AI understands:
-If you hardcode the AI's optimal goal based on human deliberations you are guaranteed to miss this important thing.
-If you use the method I suggested, the AI, driven by its desire to speak the truth, will try to explain the problem to the humans, who will in turn tell the AI what they think of it.
I don't see how that's relevant to philosophical questions about truth. Did you mean to reply to my other comment?