Comment author: stefanhendriks 07 February 2012 07:40:35PM 1 point [-]

Since I think more people should know about this, I have posted a question about it on Stack Overflow: http://stackoverflow.com/questions/9182715/is-it-significantly-costlier-to-fix-a-bug-at-the-end-of-the-project

Comment author: Morendil 07 February 2012 08:54:28AM 0 points [-]

I'm interested in your source for that graph.

Googling a bit for stuff by Sommerville, I come across a pie chart for "distribution of maintenance effort" which has all the hallmarks of a software engineering meme: old study, derived from a survey (such self-reports are often unreliable owing to selection bias and measurement bias), but still held to be current and generally applicable and cited in many books even though more recent research casts doubt on it.

Here's a neat quote from the linked paper (LST is the old study):

(Possibly) participants in the survey from which LST was derived simply did not have adequate data to respond to the survey. The participating software maintenance managers were asked whether their response to each question was based on reasonably accurate data, minimal data, or no data. In the case of the LST question, 49.3% stated that their answer was based on reasonably accurate data, 37.7% on minimal data, and 8.7% on no data. In fact, we seriously question whether any respondents had ‘‘reasonably accurate data’’ regarding the percentage of effort devoted to the categories of maintenance included in the survey, and most of them may not have had even ‘‘minimal data.’’

I love it that 10% of managers can provide a survey response based on "no data". :)

Comment author: stefanhendriks 07 February 2012 07:22:43PM 1 point [-]

I've read the paper you refer to; very interesting data indeed. The quote is one of five possible explanations of why the results differ so much, but it is certainly a plausible one.

This post has turned up my interest/doubt knob for now. I will question more 'facts' in the SE world from now on.

About Sommerville, his website: http://www.comp.lancs.ac.uk/computing/resources/IanS/

The book I refer to: http://www.comp.lancs.ac.uk/computing/resources/IanS/SE8/index.html

You can download presentations of his chapters here: http://www.comp.lancs.ac.uk/computing/resources/IanS/SE8/Presentations/index.html

I have based my findings on the presentations for now, since I don't have the book nearby. You can look them up yourself (download the chapters from the link above).

Chapter 7 says:

Requirements error costs are high, so validation is very important. Fixing a requirements error after delivery may cost up to 100 times the cost of fixing an implementation error.

Chapter 21, on Software Maintenance, claims (might need to verify this as well? ;)):

[Maintenance costs are] usually greater than development costs (2× to 100× depending on the application).

Since I don't have the book nearby I can't say for certain where exactly it was stated, but I'm fairly sure it was in that book.

Comment author: Morendil 06 February 2012 07:59:59PM 5 points [-]

ISTM that you're making a great argument that the defects claim is in the same category as the "10% of the brain" claim. Let me explain.

To a layman, not well versed in neuroanatomy, the 10% thing has surface plausibility because of association between brain size and intelligence (smaller brained animals are dumber, in general), and because of the observed fact that some humans are massively smarter than others (e.g. Einstein, the paradigmatic case). Therefore, someone with the same size brain who's only "normal" in IQ compared to Einstein must not be using all of that grey matter.

Of course, as soon as you learn more of what we actually know about how the brain works, for instance the results on modularity, the way simulated neural networks perform their functions, and so on - then the claim loses its plausibility, as you start asking which 90% we're supposed not to be using, and so on.

Similarly, someone with a poor understanding of "defects" assumes that they are essentially physical in nature: they are like a crack in cement, and software seems like layer upon layer of cement, so that if you need to reach back to repair a crack after it's been laid over, that's obviously harder to fix.

But software defects are nothing like defects in physical materials. The layers of which software is built are all equally accessible, and software doesn't crack or wear out. The problem is a lot more like writing a novel in which a heroine is dark-haired, complete with lots of subtle allusions or maybe puns referencing that hair color, and then deciding that she is blonde after all.

As you observe, the cost of fixing a defect is not a single category, but in fact decomposes into many costs with fuzzy boundaries:

  • the cost of observing the erroneous behaviour in the first place (i.e. testing, whether a tester does it or a user)
  • the cost of locating the mistake in the code
  • the cost of devising an appropriate modification
  • the cost of changing the rest of the software to reflect the modification
  • the economic consequences of having released the defect to the field
  • the economic consequences of needing a new release
  • all other costs (I'm sure I'm missing some)

These costs are going to vary greatly according to the particular context. The cost of testing depends on the type of testing, and each type of testing catches different types of bugs. The cost of releasing new versions is very high for embedded software, very low for Web sites. The cost of poor quality is generally low in things like games, because nobody's going to ask for their money back if Lara Croft's guns pass through a wall or two; but it can be very high in automated trading software (I've personally touched software that had cost its owners millions in bug-caused bad trades). Some huge security defects go undetected for a long time, causing zero damage until they are found (look up the 2008 Debian bug).

The one thing that we know (or strongly suspect) from experience is always monotonically increasing as we add more code is "the cost of changing the rest of the software to reflect the modification". This increase applies whatever the change being made, which is why the "cost of change curve" is plausible. (The funny part of the story is that there never was a "cost of change curve", it's all a misunderstanding; the ebook tells the whole story.)

Of course, someone who is a) sophisticated enough to understand the decomposition and b) educated enough to have read about the claim is likely to be a programmer, which means that by the availability heuristic they're likely to think that the cost they know best is what dominates the entire economic impact of defects.

In fact, this is very unlikely to be the case in general.

And in the one case where I have seen a somewhat credible study with detailed data (the Hughes Aircraft study), the data went counter to the standard exponential curve: it was expensive to fix a defect during the coding phase, but the (average per-defect) cost then went down.

Comment author: stefanhendriks 07 February 2012 08:11:31AM 0 points [-]

You make me curious about your book; perhaps I'll read it. Thanks for the extensive answer. Couldn't agree more with what you're saying. I can see why this 'cost of change curve' might not actually exist at all.

It made me wonder: I recently found a graph by Sommerville telling exactly this story about the cost of change. I wonder what his source for that graph is... ;)

Comment author: stefanhendriks 06 February 2012 06:34:49PM 0 points [-]

Interesting read to begin with. Nice analogy. I do support the thought that claims made (in any field) should have data to back them up.

I do think at this point that, even though there is no 'hard scientific data' to support it, don't we have enough experience to know that once software is in operation, bugs found there cost more to fix than they would have initially?

(Bugs, in my opinion, are also features that do not meet expectations.)

Even though the chart may be taken out of context and stretched a bit too far, I don't think it belongs with infamous claims like "you only use 10% of your brain". That claim, by the way, is easier to 'prove' wrong: you could measure brain activity and calculate what percentage of the whole is used. Software, however, is much more complex.

It is much harder to prove whether defects actually cost more to fix later than earlier. I don't think the bugs themselves are actually more costly. Sure, some bugs will be more costly because of increased complexity (compared to the not-yet-released version), but most of the cost will come from missed opportunities. A concrete example would be an e-commerce website only supporting Visa cards, while the customer expected it to also support Mastercard and other credit card vendors. Clearly the website will miss income; the cost of this missed opportunity will be much greater than the cost of actually implementing the support. (Yes, you need to back this up with numbers, but you get the point :))
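To make the missed-opportunity point concrete, here is a back-of-the-envelope sketch. All figures are invented purely for illustration (traffic, conversion rate, card-holder share, and developer cost are assumptions, not data from any real site):

```python
# Hypothetical comparison: the cost of a "defect" (missing Mastercard
# support) dominated by missed revenue rather than by the fix itself.

monthly_visitors = 50_000     # assumed site traffic
conversion_rate = 0.02        # assumed fraction of visitors who would buy
avg_order_value = 60.0        # assumed average order value (dollars)
mastercard_share = 0.30       # assumed fraction of buyers with only Mastercard
months_until_fixed = 3        # assumed time before the gap is closed

# Revenue lost while the payment method is missing
missed_revenue = (monthly_visitors * conversion_rate * avg_order_value
                  * mastercard_share * months_until_fixed)

# Cost of actually implementing the support
dev_days_to_implement = 10    # assumed developer effort
daily_dev_cost = 500.0        # assumed loaded cost per developer-day
implementation_cost = dev_days_to_implement * daily_dev_cost

print(f"missed revenue:      ${missed_revenue:,.0f}")
print(f"implementation cost: ${implementation_cost:,.0f}")
```

With these made-up numbers the missed opportunity ($54,000) dwarfs the implementation cost ($5,000) by an order of magnitude, which is the shape of the argument even if the exact figures differ per project.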

Kudos for pointing out this 'flaw', it takes some balls to do so ;)