jacob_cannell comments on The Unfriendly Superintelligence next door - LessWrong

48 Post author: jacob_cannell 02 July 2015 06:46PM




Comment author: jacob_cannell 26 June 2015 09:49:40PM 1 point

Yes, agreed with just about all of that.

Even if machine learning tends to average across batches, the decision about how to cluster the data is usually a function of the kinds of questions you are trying to answer with the data. It seems to me raw data is more useful than clustered, averaged data, because it has not presupposed the types of questions that will be asked.

Yes, there is probably a fundamental information tradeoff between anonymization and data effectiveness, but it isn't clear that this will be much of a limit in practice.

Secondly, people should be able to opt in to various levels of anonymization risk, and perhaps that could be tied to financial incentives, so that you can effectively sell your data to some degree.