Goodhart's Law and Projects

This post originally appeared on BetterProjects.net on May 9, 2010.


An interesting phenomenon I’ve noticed during testing phases of projects is that the number of defect reasons tracked grows the longer the project drags on. It is as if, in order to shift blame away from people and onto processes and systems, we create increasingly elaborate ways of explaining away and obscuring the real problems faced during a development cycle.

In light of that observation and similar observations during other phases of the project lifecycle, I was intrigued when BoingBoing linked to a concept called Goodhart’s law. The basic principle is that once a measure is used as a target for specific behavior, its usefulness as a measurement decreases significantly. If you look back at my example of defect reasons, you’ll see the principle at work. Let’s put together a quick scenario to illustrate this.

A project has just gone into its testing phase and a QA team member logs a new defect. The defect goes through triage and is assigned to a developer. The developer looks at the code and verifies that the behavior described is reproducible, but feels that the code is working as designed. At this point, he changes the defect reason from ‘code’ to ‘requirements’ and reassigns the defect to a BA.

The BA reads the defect and believes the developer misunderstood the intent of the requirement and thus the solution as implemented doesn’t fit with what the business area needs. The BA adds additional comments to the defect, reassigns the bug to the solution manager and changes the defect reason back to ‘code’. The product manager, protecting her developer, has a new status added called ‘enhancement’ and pushes the defect back around to the business owner.

Time passes and this scenario repeats itself over months and years. Every few iterations of the product, a new defect reason is added to cover a new outcome that was previously undiscovered. Eventually, there are so many codes that it looks like each group is doing a fantastic job because it only has one or two defects per reason per release cycle, regardless of the fact that the aggregate defect total is continually increasing.
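To put some numbers on that last point, here is a minimal sketch in Python. Every figure in it is made up for illustration and none of it comes from a real project. The helper name defects_per_reason is hypothetical as well. It shows how the per-reason average can stay flat while the total number of defects keeps climbing, simply because the list of reasons grows at the same pace.

```python
# Hypothetical numbers, invented purely for illustration.
# Each tuple: (release number, total defects logged, defect reasons in use)
releases = [
    (1, 10, 5),
    (2, 16, 8),
    (3, 24, 12),
    (4, 40, 20),
]


def defects_per_reason(total_defects, reason_count):
    """Average number of defects filed under each defect reason."""
    return total_defects / reason_count


for release, total, reasons in releases:
    average = defects_per_reason(total, reasons)
    print(
        f"release {release}: total defects={total:3d}  "
        f"reasons={reasons:3d}  defects per reason={average:.1f}"
    )
```

With these invented figures, the per-reason average sits at 2.0 in every release even though the total quadruples from 10 to 40. Anyone judged only on the per-reason number would look like they were holding steady.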

Those of us who have been involved with many projects laugh at this scenario because we’ve seen it happen all too often. It’s not that such detail is inherently bad; it’s just not necessarily useful as projects progress.

It’s not just testing where we see this type of behavior, either. We see the same thing happen when determining which projects to undertake, eliciting requirements, analyzing requirements, creating project plans or just about any other part of a project.

So what do we do about this? Do we refuse to ever create new defect reasons? Do we stop using defect categorizations as a metric for project and team success? The answer lies in understanding what behavior you are targeting and the best way to achieve that behavior. We need to take a long and thorough look at the real problem and then model how different potential solutions will impact the organization. If we are not performing a deep and comprehensive analysis prior to implementing a change, we greatly increase the risk of having our solution be nothing more than a new problem.