vi21maobk9vp comments on Diseased disciplines: the strange case of the inverted chart - Less Wrong

47 Post author: Morendil 07 February 2012 09:45AM




Comment author: vi21maobk9vp 07 February 2012 06:34:38AM 1 point [-]

By definition, no cheap experiment can give meaningful data about high-cost bugs.

Comment author: Pavitra 08 February 2012 11:30:41AM 4 points [-]

That sounds intuitively appealing, but I'm not quite convinced that it actually follows.

Comment author: vi21maobk9vp 09 February 2012 06:09:29AM 0 points [-]

You can try to find people who produce such an experiment as a side effect, but in that case you don't get to specify the parameters (which may mean some variable goes uncontrolled - or not).

The overall cost of the experiment for all involved parties will still not be low, though (although the marginal cost of the experiment, relative to just doing business as usual, can probably be reduced).

A "high-cost bug" seems to imply tens of hours spent on fixing overall. Otherwise it is not clear how to measure the cost: in my experience, quite similar bugs can take anywhere from 5 minutes to a couple of hours to locate and fix, with no clear signs distinguishing the two cases. How a debugging session goes depends on your shape that day, after all. On the other hand, the fix should be a relatively small part of the entire project; otherwise it seems to be not a bug but the entire project goal (which skews data about both locating the bug and the cost of integrating the fix).

If 10-20 hours (and how could you predict how high-cost a bug will be?) are to be a small part of a project, you are talking about at least hundreds of man-hours (not a good measure of project complexity, but an estimate of cost). Now you need to repeat the experiment, and you need to try alternative strategies to get more data on early detection versus late detection, and so on.

It may be that you have access to some resource you can spend on this but not on anything better (say, a hundred students with a few hours per week for a year, dedicated to some programming practice where you have relative freedom), or that you can influence the set of measurements taken on some real projects. But the experiment will only be cheap if someone else covers the main cost (probably for a good unrelated reason).

Also notice that if you cannot influence how things are done, only how they are measured, you need to specify what is measured much better than the cited papers do. What is the moment of introduction of a bug? What is the cost of fixing a bug? Note that fixing a high-cost bug may include making improvements that were put off before - and that putting off could be a reasoned decision, or just irrational. It would be nice if someone proposed a methodology for measuring enough control variables in such a project - not because it would let us run this experiment, but because it would be a very useful piece of research on software project costs in general.

Comment author: fubarobfusco 09 February 2012 04:51:22PM 0 points [-]

A "high-cost bug" seems to imply tens of hours spent on fixing overall. Otherwise it is not clear how to measure the cost: in my experience, quite similar bugs can take anywhere from 5 minutes to a couple of hours to locate and fix, with no clear signs distinguishing the two cases.

A high-cost bug can also be one that reduces the benefit of having the program by a large amount.

For instance, suppose the "program" is a profitable web service that makes $200/hour of revenue when it is up, and costs $100/hour to operate (in hosting fees, ISP fees, sysadmin time, etc.), thus turning a tidy profit of $100/hour. When the service is down, it still costs $100/hour but makes no revenue.

Bug A is a crashing bug that causes data corruption that takes time to recover; it strikes once, and causes the service to be down for 24 hours, which time is spent fixing it. This has the revenue impact of $200 · 24 = $4800.

Bug B is a small algorithmic inefficiency; fixing it takes an eight-hour code audit, and causes the operational cost of the service to come down from $100/hour to $99/hour. This has the revenue impact of $1 · 24 · 365 = $8760/year.

Bug C is a user interface design flaw that makes the service unusable to the 5% of the population who are colorblind. It takes five minutes of CSS editing to fix. Colorblind people spend as much money as everyone else, if they can; so fixing it increases the service's revenue by 4.75%, to $209.50/hour. This has the revenue impact of $9.50 · 24 · 365 = $83,220/year.

Which bug is the highest-cost? Seems clear to me.
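The arithmetic above can be checked in a few lines of Python (all figures are taken from the example itself; treating bug A as a one-time loss and bugs B and C as annualized recurring losses is the framing the example implies):

```python
HOURS_PER_YEAR = 24 * 365

# Bug A: a single 24-hour outage, losing $200/hour of revenue (one-time).
bug_a = 200 * 24

# Bug B: an inefficiency adding $1/hour to operating cost (recurring).
bug_b = 1 * HOURS_PER_YEAR

# Bug C: a UI flaw forgoing $9.50/hour of revenue from excluded users (recurring).
bug_c = 9.50 * HOURS_PER_YEAR

print(bug_a)  # 4800
print(bug_b)  # 8760
print(bug_c)  # 83220.0
```

Note that the comparison is only clear-cut because B and C recur: over a single day, bug A dominates; over a year, the five-minute CSS fix is worth more than seventeen times the crash.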

Comment author: vi21maobk9vp 10 February 2012 06:41:05AM 1 point [-]

The definition of cost you use (damage-if-unfixed-by-release) is distinct from all the previous definitions of cost (cost-to-fix-when-found). Neither is easy to measure. Actual cited articles discuss the latter definition.

I asked to include the original description of the values plotted in the article, but it is not there yet.

Of course, the existence of a high-cost bug in your sense implies that the project is not just a cheap experiment.

Furthermore, following your example makes the claim the article contests - a plausible story without facts behind it - a matter of simple arithmetic (the longer the bug lives, the higher the time multiplier of its cost). On the other hand, given that many bugs become irrelevant because of some upgrade or rewrite before they are even found, it is even harder to estimate the number of bugs, let alone the cost of each one. Also, how an inefficiency affects operating costs can be difficult enough to estimate that nobody knows whether it is better to fix a cost-increaser or add a new feature to increase revenue.

Comment author: Morendil 10 February 2012 07:16:21AM *  0 points [-]

I asked to include the original description of the values plotted in the article, but it is not there yet.

Is that a request addressed to me? :)

If so, all I can say is that what is being measured is very rarely operationalized in the cited articles: for instance, the Grady 1999 "paper" isn't really a paper in the usual sense, it's a PowerPoint, with absolutely no accompanying text. The Grady 1989 article I quote even states that these costs weren't accurately measured.

The older literature, such as Boehm's 1976 article "Software Engineering", does talk about cost to fix, not total cost of the consequences. He doesn't say what he means by "fixing". Other papers mention "development cost required to detect and resolve a software bug" or "cost of reworking errors in programs" - those point more strongly to excluding the economic consequences other than programmer labor.

Comment author: vi21maobk9vp 10 February 2012 10:52:59AM 0 points [-]

Of course. My point is that you focused a bit too much on the misciting instead of going for the quick kill and saying that they measure something underspecified.

Also, if you think that their main transgression is citing things wrong, the exact labels from the graphs you show seem a natural thing to include. I don't expect you to tell us what they measured - I expect you to quote them precisely on that.

Comment author: Morendil 10 February 2012 11:56:02AM 2 points [-]

their main transgression is citing things wrong

The main issue is that people just aren't paying attention. My focus on citation stems from observing that a pair of parentheses, a name and a year seem to function, for a large number of people in my field, as a powerful narcotic suspending their critical reason.

I expect you to quote them precisely on that.

If this is a tu quoque argument, it is spectacularly mis-aimed.

Comment author: vi21maobk9vp 11 February 2012 06:56:12AM 1 point [-]

as a powerful narcotic suspending their critical reason.

The distinction I made is about the level of suspension. It looks like people suspend their reasoning about whether statements have a well-defined meaning at all, not just their reasoning about the mere truth of the facts presented. I find the former far worse than the latter.

I expect you to quote them precisely on that. If this is a tu quoque argument, it is spectacularly mis-aimed.

It is not about you - sorry for stating it slightly wrong. I thought about the unfortunate implications but found no good way to avoid them. I needed to contrast "copy" and "explain".

I had no intention of saying you were being hypocritical, but the discussion had started to depend on a short piece of data - highly relevant, from my point of view - that you had but did not include. I actually was wrong about one of my assumptions about the original labels...

Comment author: Morendil 11 February 2012 11:06:37AM 0 points [-]

No offence taken.

As to your other question: I suspect that the first author to mis-cite Grady was Karl Wiegers in his requirements book (from 2003 or 2004), he's also the author of the Serena paper listed above. A very nice person, by the way - he kindly sent me an electronic copy of the Grady presentation. At least he's read it. I'm pretty damn sure that secondary citations afterwards are from people who haven't.

Comment author: RichardKennaway 08 February 2012 01:22:08PM 2 points [-]

Or to put that another way, there can't be any low-hanging fruit, otherwise someone would have plucked it already.