Testing .. just do it


True, though paraphrased discussion (circa 2002):

Me: We should have a full test harness and throw low level errors so we can be sure the code responds properly.

Boss’s boss: We don’t have time for that.

Me: Well, what do we have time for, fixing problems under duress?

Boss’s boss (fumes, sputters and decides to lie): Yes, believe it or not, that is cheaper!

Yeah, I was being a little bit of a wise guy, but since both our work lives were being half-consumed by critical issues, it was germane and honest.  Those issues needed multiple hops, either internally or at customer sites, just to gather data and then surmise solutions.  Sometimes we tried those solutions in production because we lacked the means of reproducing the data and memory states that were the root cause.

Causing defects to occur on your terms should be a no-brainer exercise for development organizations.  But it’s not, no matter how much is written and preached about test driven development in the Agile framework.  My quip about organizations having the time to fix problems rather than prevent them is not a critical, wise-guy statement.  It describes absolutely standard business tactics.
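To make "on your terms" concrete, here is a minimal sketch of throwing a low level error from a test.  It is not code from any system I worked on: save_report and its return convention are hypothetical names I made up for illustration, and I'm assuming Python's standard unittest.mock is an acceptable stand-in for whatever harness you have.

```python
import errno
from unittest import mock


def save_report(path, text):
    """Hypothetical function under test.

    Returns True on success and False when the disk is full.
    Any other OS error is passed up to the caller.
    """
    try:
        with open(path, "w") as f:
            f.write(text)
        return True
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return False
        raise


def test_save_report_handles_full_disk():
    # Force the low level failure instead of waiting for a customer to hit it.
    disk_full = OSError(errno.ENOSPC, "No space left on device")
    with mock.patch("builtins.open", side_effect=disk_full):
        assert save_report("report.txt", "data") is False
```

The point is that the "disk full" state shows up in seconds on a developer's machine, with no hops to a customer site to gather data.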

For five years I ran a recurring meeting with engineers.  In it we applied a practice called Orthogonal Defect Classification (ODC) to the resolutions of past defects and customer problems.  Most engineers found it interesting to go over the work they had done so we could gather data about how to improve.  The process is one of many attempts at formal analysis of system defects.  And as analysis it produced graphs but little change in the way things were done.  The number one sticky fact I took away was that 41% of our bugs were a failure to check the results between system components – often cross-team calls that can change any time and without warning.
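That 41% category is easier to picture with an example.  What follows is only a sketch under my own assumptions: the reply shape (a dict with "status" and "quota"), the ComponentError exception and the checked_quota helper are all hypothetical, not anything from the code we classified.

```python
class ComponentError(Exception):
    """Raised when another component's reply is not what we were promised."""


def checked_quota(response):
    """Validate a reply from another team's component before using it.

    The expected shape, {"status": "ok", "quota": <non-negative int>},
    is an assumption made for this sketch.
    """
    if not isinstance(response, dict):
        raise ComponentError(f"expected a dict, got {type(response).__name__}")
    if response.get("status") != "ok":
        raise ComponentError(f"unexpected status: {response.get('status')!r}")
    quota = response.get("quota")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if isinstance(quota, bool) or not isinstance(quota, int) or quota < 0:
        raise ComponentError(f"unusable quota: {quota!r}")
    return quota
```

If the other team changes the reply without warning, the failure is raised at the boundary with a message that names the problem, rather than surfacing later as a corrupted memory state somewhere downstream.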

So then, part of the problem is that of metrics: gathering them, believing them and acting upon them.  My ODC meetings were only one instance.  I have a friend who led a project team to make a body of problematic legacy code more reliable.  It was a funded effort with design and priorities.  His team ended up closing (true story) thousands of defect reports and eliminated issues that had provably existed for 15 years.  When they reported their progress, those numbers were celebrated, but phases 2 and 3 of the project were never approved.  That’s because the cost in the field could never be measured.  Since the problems were eliminated, measuring that cost meant counting something that no longer existed.  And since the problems they quashed were a small subset of the whole body of issues encountered, it was an uphill battle to convince management of the value of their work.

I have no other word for another cultural problem – it’s filthy.  There exist caste systems in technical organizations (no, not all technical organizations) where those who test are considered truly less important than those who develop.  Traditionally they may have had lower salaries, had fewer and lower technical skills, etc., so the work they did was likewise considered ancillary to that done by developers.  This is perhaps the aspect of corporate life in technical organizations that is most violently at odds with test driven development and Agile process itself.  Testing is inferior work?  PLEASE!!  May that attitude die a dishonorable death.

Almost all the literature about TDD and Agile is tacitly aimed at new(ish) products and new(ish) teams.  It’s rare to find someone who understands the problems legacy software faces under this approach.  A refreshing exception is Michael Feathers’ Working Effectively with Legacy Code.  He doesn’t work on the code I work on, but he gets it regarding the problems of old and new.


There are a number of quips used by opponents to resist applying new and rigorous testing to old code.  I have found I needed counter-quips to combat the dismissive oversimplification of the business case.  One quip I’ve heard too much is “Well, you can’t boil the ocean” – which implies that identical rigor needs to be applied to multiple millions of lines of code or else it’s not worth starting.  My counter-quip is “Yeah, but a bay or inlet boils nicely and you know the very bays of which I speak.”

It’s true.  In any system, there is no mystery about where the problems lie.  And there truly is no mystery about the value of increased testing rigor in those inlets and “bays”.  So all that keeps it from happening is corporate bad habits and errant calculations about business expenses.

And certainly anything I develop personally will include testing commensurate with the complexity of the technology.  That’s a promise.
