Vika comments on [link] New essay summarizing some of my latest thoughts on AI safety - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (27)
Thanks for the handy list of criteria. I'm not sure how (3) would apply to a recurrent neural net for language modeling, since it's difficult to make an imperceptible perturbation of text (as opposed to an image).
Regarding (2): given the impressive performance of RNNs in different text domains (English, Wikipedia markup, Latex code, etc), it would be interesting to see how an RNN trained on English text would perform on Latex code, for example. I would expect it to carry over some representations that are common to the training and test data, like the aforementioned brackets and quotes.