CronoDAS comments on How to Measure Anything - Less Wrong

50 Post author: lukeprog 07 August 2013 04:05AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (48)

You are viewing a single comment's thread.

Comment author: CronoDAS 03 August 2013 06:40:09PM 3 points [-]

So... how do you measure programmer productivity? ;)

Comment author: lukeprog 03 August 2013 07:34:21PM 7 points [-]

What do you mean by "programmer productivity" and why do you care about it? What are you observing when you observe increased programmer productivity?

Comment author: SaidAchmiz 03 August 2013 07:47:23PM 2 points [-]

Implementation of desired things per unit time.

Comment author: [deleted] 05 August 2013 09:18:55PM 0 points [-]

I haven't read the article so I could be full of shit, but essentially:

If you have the list of desired things ready, there should be an ETA on the work time necessary for each desired thing as well as confidence on that estimate. Confidence varies with past data and expected competence, e.g. how easily you believe you can debug the feature if you begin to draft it. Or such. Then you have a set of estimates for each implementable feature.

Then you put in time on that feature over the day tracked by some passive monitoring program like ManictTime or something like it.

The ratio of time spent on work that counted towards your features over the work that didn't is your productivity metric. As time goes on your confidence is calibrated in your feature-implementation work time estimates.

Comment author: fubarobfusco 04 August 2013 04:16:00AM -1 points [-]

How do you measure desired things?

Comment author: SaidAchmiz 04 August 2013 06:16:22AM 1 point [-]

It's not very hard.

Just recently the task before me was to implement an object selection feature in an app I'm working on.

I implemented it. Now, the app lets the user select objects. Before, it didn't.

Prior to that, the task before me was to fix a data corruption bug in a different app.

I fixed it. Now, the data does not get corrupted when the user takes certain actions. Before, it did.

You see? Easy.

Comment author: fubarobfusco 04 August 2013 10:32:54AM 1 point [-]

So, I agree that you accomplished these desired things. However, before you accomplished them, how accurately did you know how much time they would take, or how useful they would be?

For that matter, if someone told you, "That wasn't one desired thing I just implemented; it was three," is it possible to disagree?

(My point is that "desired thing" is not well-defined, and so "desired things per unit time" cannot be a measurement..)

Comment author: SaidAchmiz 04 August 2013 04:01:12PM 0 points [-]

However, before you accomplished them, how accurately did you know how much time they would take

I didn't. I never said I did.

or how useful they would be?

Uh, pretty accurately. Object selection is a critical feature; the entire functionality of the app depends on it. The usefulness of not having your data be corrupted is also obvious. I'm not really sure what you mean by asking whether I know in advance how useful a feature or bug fix will be. Of course I know. How could I not know? I always know.

For that matter, if someone told you, "That wasn't one desired thing I just implemented; it was three," is it possible to disagree?

Ah, now this is a different matter. Yes, "desired thing" is not a uniform unit of accomplishment. You have to compare to other desired things, and other people who implement them. You can also group desired things into classes (is this a bug fix or a feature addition? how big a feature? how much code must be written or modified to implement it? how many test cases must be run to isolate this bug?).

Comment author: fubarobfusco 04 August 2013 09:24:22PM 0 points [-]

Yes, "desired thing" is not a uniform unit of accomplishment.

Right! So, "implementation of desired things per unit time" is not a measure of programmer productivity, since you can't really use it to compare the work of one programmer and another.

There are obvious cases, of course, where you can — here's someone who pounds out a reliable map-reduce framework in a weekend; there's someone who can't get a quicksort to compile. But if two moderately successful (and moderately reasonable) programmers disagree about their productivity, this candidate measurement doesn't help us resolve that disagreement.

Comment author: SaidAchmiz 05 August 2013 07:21:33PM 1 point [-]

Well, if your goal is comparing two programmers, then the most obvious thing to do is to give them both the same set of diverse tasks, and see how long they take (on each task and on the whole set).

If your goal is gauging the effectiveness of this or that approach (agile vs. waterfall? mandated code formatting style or no? single or pair programming? what compensation structure? etc.), then it's slightly less trivial, but you can use some "fuzzy" metrics: for instance, classify "desired things" into categories (feature, bug fix, compatibility fix, etc.), and measure those per unit time.

As for disagreeing whether something is one desired thing or three — well, like I said, you categorize. But also, it really won't be the case that one programmer says "I just implemented a feature", and another goes "A feature?! You just moved one parenthesis!", and a third goes "A feature?! You just wrote the entire application suite!".

Comment author: Lumifer 06 August 2013 07:37:27PM 1 point [-]

Well, if your goal is comparing two programmers, then the most obvious thing to do is to give them both the same set of diverse tasks, and see how long they take (on each task and on the whole set).

That might work in an academic setting, but doesn't work in a real-life business setting where you're not going to tie up two programmers (or two teams, more likely) reimplementing the same stuff just to satisfy your curiosity.

And of course programming is diverse enough to encompass a wide variety of needs and skillsets. Say, programmer A is great at writing small self-contained useful libraries, programmer B has the ability to refactor a mess of spaghetti code into something that's clear and coherent, programmer C writes weird chunks of code that look strange but consume noticeably less resources, programmer D is a wizard at databases, programmer E is clueless about databases but really groks Windows GUI APIs, etc. etc. How are you going to compare their productivity?

Comment author: khafra 06 August 2013 07:18:12PM 0 points [-]

It's a two-step process, right? First, you measure how long a specific type of feature takes to implement; from a bunch of historic examples or something. Then, you measure how long a programmer (or all the programmers using a particular methodology or language, whatever you're measuring), take to implement a new feature of the same type.

Comment author: Heka 31 December 2013 12:32:34AM 2 points [-]

Hubbard writes about performance measurement in chapter 11. He notes that management typically knows what are the relevant performance metrics. However it has trouble prioritizing between them. Hubbard's proposal is to let the managament create utility charts of the required trade-offs. For instance on curve for programmer could have on-time completion rate in one axis and error-free rate in the other (page 214). Thus the management is required to document how much one must increase to compensate for drop in the other. The end product of the charts should be a single index for measuring employee performance.