
Slider comments on Fairness in machine learning decisions - Less Wrong Discussion

-2 Post author: Stuart_Armstrong 05 August 2016 09:56AM



Comment author: Dagon 05 August 2016 06:02:26PM 2 points

I think there's a fundamental goal conflict between "fairness" and precision. If the socially-unpopular feature is in fact predictive, then you either explicitly want a less-predictive algorithm, or you end up using other features that correlate with S strongly enough that you might as well just use S.
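The proxy effect Dagon describes can be seen in a toy simulation (a minimal numpy sketch; the data and threshold are illustrative assumptions, not anything from the thread): even when the sensitive attribute S is withheld, a correlated feature lets a trivial classifier recover it most of the time.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
s = rng.integers(0, 2, n)          # sensitive attribute S (never given to the model)
x = s + rng.normal(0.0, 0.5, n)    # an "innocent" feature that correlates with S

# A threshold rule that only ever sees x still recovers S most of the time.
s_hat = (x > 0.5).astype(int)
print("recovery accuracy:", (s_hat == s).mean())
```

With this correlation strength the recovery accuracy lands around 84%, which is the point: dropping S from the inputs does not drop S from the decision.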

If you want to ensure a given distribution of S independent of classification, then include that in your prediction goals: have your cost function include a homogeneity penalty. Note that you're now pretty seriously tipping the scales against what you previously thought your classifier was predicting. Better and simpler to design and test the classifier in a straightforward way, but don't use it as the sole decision criterion.
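The homogeneity-penalty idea can be sketched as a logistic regression whose cost adds a term `lam * gap**2`, where `gap` is the difference in mean predicted score between the two groups (a toy numpy sketch; the synthetic data, penalty weight, and training loop are all illustrative assumptions, not anything from the thread):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
s = rng.integers(0, 2, n)                       # group membership S
x = np.column_stack([s + rng.normal(0, 1, n),   # feature correlated with S
                     rng.normal(0, 1, n)])      # unrelated feature
y = (x[:, 0] + 0.5 * x[:, 1] + rng.normal(0, 1, n) > 0.5).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(lam, steps=2000, lr=0.1):
    """Gradient descent on logistic loss + lam * (group score gap)**2."""
    X = np.column_stack([x, np.ones(n)])        # add intercept column
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad_ll = X.T @ (p - y) / n             # plain logistic-loss gradient
        gap = p[s == 1].mean() - p[s == 0].mean()
        dp = p * (1 - p)                        # d(sigmoid)/d(logit)
        grad_gap = (X[s == 1] * dp[s == 1, None]).mean(0) \
                 - (X[s == 0] * dp[s == 0, None]).mean(0)
        w -= lr * (grad_ll + 2 * lam * gap * grad_gap)
    p = sigmoid(X @ w)
    return p[s == 1].mean() - p[s == 0].mean()  # residual group score gap

gap_plain = fit(lam=0.0)
gap_fair = fit(lam=10.0)
print("gap without penalty:", gap_plain)
print("gap with penalty:   ", gap_fair)
```

The penalized fit shrinks the between-group score gap, which illustrates Dagon's caveat: the penalty works precisely by pushing the classifier away from what it would otherwise predict.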

Redlining (or more generally, deciding who gets credit) is a great example for this. If you want accurate risk assessment, you must take into account data (income, savings, industry/job stability, other kinds of debt, etc.) that correlates with ethnic averages. The problem is not that the risk classifiers are wrong, the problem is that correct risk assessments lead to unpleasant loan distributions. And the sane solution is to explicitly subsidize the risks you want to encourage for social reasons, not to lie about the risk by throwing away data.

Comment author: Slider 18 August 2016 02:30:42PM 0 points

"If you want accurate risk assessment, you must take into account data (income, savings, industry/job stability, other kinds of debt, etc.) that correlates with ethnic averages."

While not strictly true, this is true in essence. The failure point is telling, though. What you need is to make generalizations that are more general than single individuals. But it is not at all forced that the categorization dimension be ethnicity. Why would it not be gender? Why not whether you have a certain gene?

When you take such a grouping of individuals and say "this average is meaningful to the decision that I am going to make", that step is no longer strictly needed.

In dissociated theoretical talk you could argue for, and back up, some groupings as being more meaningful than others. But the whole discrimination problem comes from people applying a set of groupings that are merely common or well known, without regard to their fit or justifiability for the task at hand. That is, we first fix the categories and then argue about their ranks, rather than letting the rankings define the categories.