
110phil comments on The Importance of Goodhart's Law - Less Wrong

75 Post author: blogospheroid 13 March 2010 08:19AM


Comments (113)


Comment author: 110phil 13 March 2010 03:07:39PM 4 points [-]

I wish there were some examples (other than the Soviet nails) ... if I had some better idea of what G and G* might actually represent, I'd be able to more easily get my head around the rest of the post.

Comment author: Unnamed 14 March 2010 02:36:07AM *  19 points [-]

In education, this is one of the criticisms of high-stakes testing: you'll just get schools teaching to the test, in ways that aren't correlated with real learning (the test is G*, real knowledge/learning is G). People say the same thing about the SAT and test prep - kids get into better colleges because they paid to learn tricks for answering multiple-choice questions. The Wire does a great job of showing the police force's efforts to "juke the stats" (e.g. counting robberies as larcenies) so that crime statistics (G*) look better even while crime (G) is getting worse. Athletes get criticized for playing for their stats (G*), or trying to pad their stats, instead of playing to win, when the stats are supposed to be a measure of how much a player has contributed to his team's chances of winning (G). I'm not sure if it's historically accurate, but I've heard that body count (G*) was used by the US as one of the main metrics of success (G) in the Vietnam war, and as a result we ended up with a bunch of dead bodies but a misguided war.

In general, any time you measure something you care about in order to incentivize people, or to hold people accountable, or to keep track of what's going on, and the thing you measure isn't exactly the same as the thing that you care about, there's a risk of figuring out ways to improve the measurement that don't translate into improvements on the thing that you care about.
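To make the divergence concrete, here is a minimal toy sketch in Python. Everything in it is an illustrative assumption rather than anything from the post: an agent has a fixed effort budget, real effort raises both G and G*, and "gaming" effort raises only G*, at a hypothetical payoff of 2 per unit. An agent who picks the split that maximizes G* then puts nothing into G.

```python
def true_value(real_effort: float) -> float:
    """G: the thing we care about. Only real effort counts (toy assumption)."""
    return real_effort


def measured_value(real_effort: float, gaming_effort: float,
                   gaming_payoff: float = 2.0) -> float:
    """G*: the proxy. Gaming moves it more cheaply than real work
    (gaming_payoff is a hypothetical parameter)."""
    return real_effort + gaming_payoff * gaming_effort


def best_split(budget: float, steps: int = 10) -> tuple[float, float]:
    """Try evenly spaced splits of the budget and return the (real, gaming)
    split with the highest measured value G*."""
    candidates = [budget * i / steps for i in range(steps + 1)]
    best_real = max(candidates,
                    key=lambda real: measured_value(real, budget - real))
    return best_real, budget - best_real


# An agent rewarded on G* alone chooses how to spend 10 units of effort.
real, gaming = best_split(10.0)
g = true_value(real)
g_star = measured_value(real, gaming)

# For comparison: an agent who spends the whole budget on real work.
honest_g = true_value(10.0)
honest_g_star = measured_value(10.0, 0.0)

print(f"optimizing G*: G = {g}, G* = {g_star}")
print(f"honest effort: G = {honest_g}, G* = {honest_g_star}")
```

In this toy setup the G*-maximizing agent looks twice as good on the measurement as the honest one while delivering none of what was wanted. The size of the gap depends entirely on the assumed gaming payoff; with a payoff below 1, gaming would not be worth it and the two agents would coincide.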

Comment author: Morendil 14 March 2010 01:25:02PM 18 points [-]

I'm surprised no one has yet brought up (G*) the LW karma system as a proxy for (G) contributing to "refining the art of human rationality".

Comment author: jimmy 14 March 2010 07:05:19PM 12 points [-]

LW karma is an interesting example because no one has direct access to the karma-giving algorithm.

It's a bit like telling the nail factory that you're going to evaluate them on something, but not telling them whether it's nail mass or number or something else until the end of the evaluation period.

If the one being evaluated knows nothing about how he's going to be evaluated except that it's going to be a proxy for goodness, then he can't really cheat. However, he might know that the criteria are going to be very simple, so he makes one very massive nail and many miniature ones.

Comment author: Kaj_Sotala 14 March 2010 09:05:13PM 30 points [-]

This reminds me of the way I hear they do state censorship in China. The censoring agencies don't actually give out any specific guidelines on what is allowed and what isn't, instead just clamping down on cases they do consider to be over the line. As a consequence, everyone self-censors more than they might with specific guidelines: with the guidelines, you could always try to twist their letter to violate their spirit. Instead, people are constantly unsure of just exactly what will get them in trouble, so they err on the side of caution.

While I strongly oppose state censorship, I can't help but admire the genius in the system.

Comment author: Emile 06 December 2011 09:16:23PM 5 points [-]

Also, unlike Saudi Arabia, they don't make many efforts to block pornography. As a result, the average Chinese teen is less likely to know how to access blocked sites than the average Saudi teen is (or so I read; I'm not aware of any study on that).

Comment author: [deleted] 06 December 2011 07:07:24PM *  2 points [-]

Depressing. This would mean that most informal norms of censorship are much more resilient and effective than most formal laws censoring material.

Arguably this makes them much harder to dislodge than even the intentionally vague Chinese law. Since, I guess, you can't really be prosecuted under it by pointing out that there is a censorship law, right?

Comment author: TheAncientGeek 16 April 2015 09:20:26AM *  1 point [-]

Or Section 28, which didn't forbid the discussion of homosexuality in the classroom, only its promotion... but since promotion wasn't defined, schools erred on the side of not mentioning it.

Comment author: CronoDAS 14 March 2010 05:40:38PM 1 point [-]

I never thought of the LW karma system as a proxy for that.

Comment author: Morendil 14 March 2010 05:57:26PM 1 point [-]

What is your interpretation of it? It seems a pretty plausible hypothesis to me that it's a proxy for something, and has come to be relied upon as such. If we think Goodhart's Law applies in the case of karma, the final prediction in the "speculative origin" section might be something to be concerned about.

Comment author: CronoDAS 14 March 2010 06:23:32PM 4 points [-]

I think of it as a proxy for "valued member of the community" - if someone has karma, then people like their posts and comments. I'm mostly here to have fun and pass the time, and I happen to find discussing rationality to be fun. I don't really expect refining the art of human rationality to be well-correlated with a popularity contest.

Comment author: Morendil 14 March 2010 06:33:30PM 2 points [-]

And do you think Goodhart's Law, as presented in the post, applies here? That is, we should expect that eventually people (through gaming the system) end up with high karma without that in fact reliably correlating with being valued members of the community?

Comment author: CuSithBell 12 May 2011 08:38:14PM *  7 points [-]

As a data point, one thing I've noticed that seems to give a disproportionate amount of karma is arguing with someone who's wrong and unwilling to listen. It's easy to think they might come around eventually, and each point you make against them is worth a few points of karma from the amused onlookers or fellow arguers - which might tell you that you're making a valuable contribution, and so encourage you to keep arguing with trolls. This is my impression, at least.

Edit: (The problem being - determining the point of diminishing returns.)

Comment author: Jack 14 March 2010 07:31:32PM *  4 points [-]

Except we're like the self-employed in this regard. You can't do anything with karma. It won't impress your boss. It is just a way of quantifying how valued you are by the community. An employee doesn't really care about G at all. She cares about G* because that's what impresses the boss which furthers her own goals. But if you are your own boss you do care about G, G* is just an easy way to measure it. For me at least, this is the case with karma. I can't do anything with the number but it suggests that people like me.

So perhaps revenue sharing is a way to help address the problem. Instead of trying to come up with ways to measure what you care about, make the people beneath you care about it too. Of course this is a lot easier with money than it is with values.

Comment author: Alicorn 14 March 2010 09:30:17PM 9 points [-]

My boss cares about karma.

Comment author: CronoDAS 14 March 2010 06:35:56PM 4 points [-]

Only if people care about having high karma. It's probably fairly easy to game karma by making multiple accounts and voting yourself up, but why bother?

Comment author: wedrifid 12 May 2011 08:50:47PM 1 point [-]

"And do you think Goodhart's Law, as presented in the post, applies here? That is, we should expect that eventually people (through gaming the system) end up with high karma without that in fact reliably correlating with being valued members of the community?"

What? You mean Karma doesn't reliably correlate with objective worth of the individual? Damn.

Comment author: JenniferRM 14 March 2010 06:53:05AM 7 points [-]

The health and/or beauty of a woman (G) and her scale reported weight (G*) which might be somewhat correlated under some circumstances, but are definitely not identical and can diverge rather sharply due to crazy diets.

Comment author: CronoDAS 13 March 2010 07:09:47PM 5 points [-]
Comment author: Sniffnoy 13 March 2010 07:06:12PM 5 points [-]

Well there's a few described here, for instance: http://lesswrong.com/lw/le/lost_purposes/

Comment author: JamesAndrix 14 March 2010 07:17:20AM 3 points [-]

Products that are good for humanity, and products that are profitable

Comment author: dlthomas 06 December 2011 07:30:20PM 1 point [-]

Call time (G*) or calls taken (G*) in a call center, where what they care about is customer satisfaction (G) (at least inasmuch as it serves profitability).