Kaj_Sotala comments on [link] New essay summarizing some of my latest thoughts on AI safety - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (27)
Taboo natural representations?
Without defining a natural representation (since I don't know how to), here are four properties that I think a representation should satisfy before it's called natural (I also give these in my response to Vika):
(1) Good performance on different data sets in the same domain.
(2) Good transference to novel domains.
(3) Robustness to visually imperceptible perturbations to the input image.
(4) "Canonicality": replacing the learned features with a random invertible linear transformation of the learned features should degrade performance.
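To make property (4) concrete, here is a toy sketch of the test it describes. Everything in it is an assumption made for illustration: the synthetic Gaussian "features", the label rule, the helper `stump_accuracy`, and the choice of an axis-aligned readout (thresholding one coordinate at zero). It is not taken from any of the papers under discussion.

```python
import numpy as np

def stump_accuracy(features, labels):
    """Best accuracy from thresholding a single feature coordinate at zero."""
    best = 0.0
    for j in range(features.shape[1]):
        pred = features[:, j] > 0
        acc = float(np.mean(pred == labels))
        # 1 - acc is the accuracy of the same threshold with the sign flipped.
        best = max(best, acc, 1.0 - acc)
    return best

rng = np.random.default_rng(0)
n, d = 2000, 8

# Hypothetical "learned features": the label is readable off coordinate 0,
# i.e. the basis itself carries meaning.
features = rng.standard_normal((n, d))
labels = features[:, 0] > 0

# Random linear map; a Gaussian matrix is invertible with probability 1,
# and the rank check guards against a degenerate draw.
A = rng.standard_normal((d, d))
assert np.linalg.matrix_rank(A) == d
mixed = features @ A

acc_original = stump_accuracy(features, labels)
acc_mixed = stump_accuracy(mixed, labels)
print(f"axis-aligned readout, original features: {acc_original:.3f}")
print(f"axis-aligned readout, mixed features:    {acc_mixed:.3f}")
```

The transformation loses no information (multiplying by the inverse of `A` recovers the original features), so any drop comes purely from the basis no longer being special. Note that the result depends on the readout: a linear model refit on top of the mixed features could undo `A` and would not degrade, so the test is only informative for whatever consumes the features coordinate by coordinate.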
Thanks.
So to clarify, my claim was not that we already have algorithms producing representations that fulfill all of these criteria. But it seems to me that something like word embeddings is moving in the direction of fulfilling them. E.g. something like this bit from the linked post:
sounds to me like it would represent clear progress towards at least #1 and #2 of your criteria.
I agree that the papers on adversarial examples that you cited earlier are evidence that many current models are still not capable of meeting criterion #3. On the other hand, the second paper does seem to present clear signs that the reasons for the pathologies are being uncovered and addressed, and that future algorithms will be able to avoid this class of pathology. (Caveat: I do not yet fully understand those papers, so I may be interpreting them incorrectly.)