dlthomas comments on So You Want to Save the World - Less Wrong
Let me get this straight: are you saying that if you believe X, there can't possibly exist any information you haven't discovered yet that could convince you your belief is false? You can't know what connections and conclusions an AI might deduce from all the information put together. It might conclude that humanity is a stain on the universe, and even if it thought wiping humanity out wouldn't accomplish anything (and strongly desired not to do so), it might wipe us out purely because the choice "wipe out humanity" would be assigned a higher value than the choice "don't wipe out humanity".
Also, is the statement "my desire is not to do X, therefore I wouldn't choose to desire to do X even if I could choose that" your subjective feeling, or do you base it on some studies? For example, this statement doesn't apply to me, as I would, under certain circumstances, choose to desire to do X even if it was not my desire initially. Therefore it's not a universal truth, and so it may not apply to AI either.
No. I'm saying that if I value X, I can't think of any information that would cause me to value NOT(X) instead.
Can you give me an example of something you desire not to do, which you would willingly edit yourself to desire to do?
If you have lexicographic preferences, and prefer W to X, and you learn that NOT(X) and W are equivalent?
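To make that concrete, here is a rough sketch in Python. It is only an illustration under assumptions of my own: outcomes are modeled as (w, x) pairs of booleans, "prefer W to X" is read as "W takes strict priority and X only breaks ties", and the helper name best_outcome is made up for the example.

```python
from itertools import product

def best_outcome(outcomes):
    """Pick the top outcome under a lexicographic ranking.

    Each outcome is a (w, x) pair of booleans. W is compared first;
    X matters only when two outcomes agree on W. (Hypothetical helper.)
    """
    return max(outcomes, key=lambda o: (o[0], o[1]))

# All four logically possible (w, x) combinations.
all_outcomes = list(product([False, True], repeat=2))

# Before learning anything, the agent would like both W and X to hold.
before = best_outcome(all_outcomes)

# After learning that W is equivalent to NOT(X), only outcomes
# consistent with that fact remain available.
consistent = [(w, x) for (w, x) in all_outcomes if w == (not x)]
after = best_outcome(consistent)

print(before)
print(after)
```

Under these assumptions the agent's ranking never changes, but once W and NOT(X) are known to coincide, the best available outcome is one where X is false. Whether that counts as "valuing NOT(X)" or merely as choosing NOT(X) while still valuing X is exactly the question being argued here.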