Metrics: The Good, The Bad, and The Ugly

Metrics and the Superorganism

“A foolish consistency is the hobgoblin of little minds.”

– Ralph Waldo Emerson

Metrics are frequently the greatest challenge managers and executives face. The vast majority of companies are so bad at defining metrics below the highest possible level (e.g. standard accounting KPIs) that they would be better off with no metrics at all.

The worst possible approach to performance metrics is constantly changing them. However, when executives lose their sense of company “proprioception,” they look for easy-to-digest numbers that can be provided as a substitute for regaining a genuine understanding of the organization.

“Managers who don’t know how to measure what they want settle for wanting what they can measure.” – Russell Ackoff

In my next few posts I’ll build on the idea that metrics can be the mental “cue” that complements the proprioception of the executives of a superorganism (the company). In weightlifting, maximizing force and controlling the movement often requires “cues” – proprioception is a complex system to interpret and control, and the central nervous system doesn’t take direct orders from the lifter’s mind. On the squat this means a coach may say “Back on the heels!” or “Screw your feet into the floor.”

Metrics are just like this when used correctly. Naturally, my focus will be on agile software development in an environment with scaling issues – but the relationship between employee motivation and company outcomes holds true across every for-profit superorganism.


Process Control Methods

Six Sigma is a Statistical Process Control methodology. Statistical Process Control is a perfect fit when the process environment is stable and the goal is static, repeatable, and known before the work begins. As a rule of thumb, if planning is done up-front and the same product is built repeatedly, statistical process control will yield meaningful throughput metrics. Imagine one automated drill in a factory that produces billions of parts per year. The holes drilled are one step in an easy-to-view process, and performance is simple to judge – each hole must be within 0.02mm of the specified diameter or other steps in production may fail and the overall product will be unacceptable.
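The drill example can be sketched in a few lines. This is a minimal, hypothetical illustration (the nominal diameter, tolerance, and sample values are all invented), not a full control-chart implementation:

```python
def out_of_spec_rate(diameters_mm, nominal_mm=5.0, tolerance_mm=0.02):
    """Fraction of drilled holes falling outside nominal +/- tolerance."""
    failures = [d for d in diameters_mm if abs(d - nominal_mm) > tolerance_mm]
    return len(failures) / len(diameters_mm)

# Hypothetical sample of hole diameters from one shift
sample = [5.003, 4.991, 5.012, 4.985, 5.019, 5.024, 4.998, 5.001]
rate = out_of_spec_rate(sample)
print(f"out-of-spec rate: {rate:.1%}")  # 5.024 exceeds the tolerance: 1/8 = 12.5%
```

The point is that the target (5.00mm ± 0.02mm) is fixed before a single part is drilled, so the metric needs no reinterpretation as production runs.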

However, software development is not like this at all. The business context, market expectations, tools, and process environment are in a state of constant evolution, so an Empirical Process Control methodology (typically some combination of agile, Scrum, Kanban, and XP) is needed, one in which the process gathers and inspects short-term results after they have occurred to inform the next set of short-term adaptive goals. A single engineer writing code – given the unlimited number of possible demands and her or his nearly unlimited paths to supply each demand – has a virtually infinite potential to succeed or fail, and the throughput process has zero visibility.
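The inspect-and-adapt cycle can be sketched as a loop: act on a short-term goal, inspect the actual result, and let that result set the next goal. All of the names below are hypothetical stand-ins for real team activities:

```python
def empirical_process_control(initial_goal, build_increment, inspect, adapt_goal, sprints=6):
    """Short cycles: act on a goal, inspect the actual result, adapt the next goal.

    Unlike statistical process control, the target itself evolves with every
    inspection instead of being fixed up-front."""
    goal = initial_goal
    history = []
    for sprint in range(sprints):
        result = build_increment(goal)    # act on the current short-term goal
        finding = inspect(result)         # examine what actually happened
        history.append((sprint, goal, finding))
        goal = adapt_goal(goal, finding)  # the next goal is informed by the result
    return history

# Toy demo: each sprint's finding becomes the next sprint's goal
history = empirical_process_control(
    initial_goal=10,
    build_increment=lambda goal: goal * 2,
    inspect=lambda result: result + 1,
    adapt_goal=lambda goal, finding: finding,
    sprints=3,
)
```

Note the contrast with the drill: there is no fixed tolerance to measure against, only a trail of inspected results that each reshape the goal.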


Control Metrics vs Performance Indicators

While the visibility and tangible nature of throughput make it possible to determine the relationship between statistical process control metrics and company Key Performance Indicators, the continual change inherent in innovation-based software makes empirical process control metrics far more difficult to tie back to KPIs. This is because accounting metrics are statistical in nature, resting on the consistency and predictability of the value of the currency being used and on the stability of the institutions assuring the long-term meaningfulness of the accounting measurements.

In other words, the success of statistical process control against a 0.02mm tolerance requires a shared understanding and valuation of the unit of measure “millimeter,” just as accounting measures are all built on a shared valuation of the company’s monetary unit. Just as companies can choose to measure in inches or centimeters, most countries currently value their currencies’ exchange rates relative to the dollar. Currently, the world economy is a faceless superorganism that is cohesive to the exact extent that every individual is mutually self-interested in the shared valuation of the U.S. dollar. Even those who deny the right of the U.S. dollar to play this role live in a world in which their continued engagement in the global economy requires them to interact with rationally self-interested entities that will ultimately compare the non-USD valuation of their currency with the USD-based practices of others.

On that foundation, when you are an executive leading a brick-and-mortar retail chain, managerial accounting statistics are relatively simple to choose. As an example, retail stores create semi-homogeneous shopping environments across sufficiently similar demographic regions, train and compensate based on similar sales techniques, then track revenue precursor metrics: traffic (customers through the door), conversion (sales per customer), items per ticket, dollars per ticket, and so on. Daily statistics give a clear indication of the trend toward longer-term goals.
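These precursor metrics are simple ratios over daily point-of-sale data. A minimal sketch, with illustrative numbers rather than real benchmarks:

```python
def daily_kpis(traffic, transactions, items_sold, revenue):
    """Revenue-precursor metrics for one store-day."""
    return {
        "conversion": transactions / traffic,           # sales per customer through the door
        "items_per_ticket": items_sold / transactions,
        "dollars_per_ticket": revenue / transactions,
    }

# Hypothetical single day at one store
kpis = daily_kpis(traffic=400, transactions=60, items_sold=150, revenue=2700.0)
print(kpis)  # conversion 0.15, items_per_ticket 2.5, dollars_per_ticket 45.0
```

Because every store computes the same ratios against the same industry benchmarks, the numbers are comparable across stores and across time – exactly the property the innovation-based company lacks.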

Similar metrics can be applied to an e-commerce solution to the extent that a high sample rate can give insight into the conversion funnel. The empirical tuning of these KPIs has a long feedback loop based on the assumption that missing the target statistic is a failure for which managers can be held accountable and the process can be corrected. To restate – when a sales team or web site fails to hit a target, the KPI is not immediately put under review, because KPIs are based on benchmarks consistent across an industry. Instead, the sales organization is expected to improve to meet the benchmark.

Note that the KPIs are valuable to the owners of each company (whether a sole proprietor or millions of shareholders) only insofar as they are comparable between two groups or time periods, stable over time, simple enough for the audience to understand, and honest in origin (whether trusted or proven). If you analyze enough annual 10-K reports to shareholders, you will note that in addition to the metrics that “everyone” reports, you can also report virtually any measure you believe gives a positive and accurate portrayal of the current and future potential value of shares. A carefully written explanation for each of these is required both in law and in practice. If it is too difficult for shareholders to understand, it will likely be ignored. If it misrepresents the value of shares, there are grounds for action by governing institutions and lawsuits by those with a fiduciary interest in the value of the company.


What about innovation?

In companies that pursue innovation as a competitive advantage, any “benchmark” is by definition emergent. The learning organization is constantly seeking out new tools, processes, practices, and behaviors that will lead to a unique product offering, ideally with a significant learning-curve advantage over competitors. Unlike the retail company’s use of weekly conversion as a leading indicator for quarterly profit, most innovation-based companies are unable to find a leading indicator for “successful innovation” because the success being pursued has not yet been discovered!

While managerial accounting can trace the market risk of an investor based on quarterly earnings, the innovation-based startup company is often creating supply for a product prior to also creating the demand AND the market for what it will supply!

So, if you do not know your price, your market, your product, or your demand, how can you possibly guide the process of product creation to ensure successful innovation? The executive must:

  • Ensure cohesion around a shared vision and values
  • Ensure the identity and evaluation of the company is consistent enough to isolate variables in experiments
  • Strip every risk down to its smallest possible impact

This is where empirical process control for agile software development starts to look more effective for the innovation-based company. The emerging next practices are monitored using hypotheses and experiments, such that metrics are selected as appropriate to a given learning opportunity. The problem is that Scrum metrics meaningful to a team understanding itself now may have no meaning later, after the team has evolved. Because these metrics are not comparable across teams, and not meaningful when compared at one time versus another, they cannot be used as an indication of performance for individuals or the company!

Metrics and KPIs are still possible, but executive leadership must be careful not to lead the learning organization into non-learning behaviors or constrain the innovating company into anti-innovating practices. Note that in the retail store example, the validity of performance indicators tying into accounting metrics down to the store level makes them meaningful to the Store Manager. If the associates staffing the store are wage-only earners, comparing conversion by employee is pretty nonsensical. However, if a new initiative like a Loyalty Card is rolled out, incentivizing employees at the Point of Sale for sign-ups can be effective at driving change.

This is an important distinction. The length of time a fast food chain cook keeps the French fries in the hopper is not a performance metric – it is a minimum requirement. Behaving in a way that encourages sales is a minimum requirement for the store associate. It is the manager’s duty and power to be engaged enough to know whether minimum requirements are being met and to reward, correct, or discipline accordingly. The Store Manager therefore manages toward a company-wide performance indicator, while the employee manages their own behavior in a way that ensures continued employment.

H. T. Johnson wrote a terrific analysis of the tension between Lean principles and accounting metrics and supplies this summary:

Quantum physicists have suggested that undisturbed systems in the universe naturally stay in multiple states simultaneously, unless someone intervenes with a measurement device. Then all states collapse, except the one being measured. Perhaps what you measure is what you get. More likely, what you measure is all you get. What you don’t (or can’t) measure is lost.  – H. T. Johnson, “Lean Dilemma”


This is commonly called the Hawthorne Effect in organizational psychology – the moment an organization knows a behavior is being observed, the behavior will change from its natural state (typically to match the state the members of the organization believe to be desirable). Because each person’s time and energy are finite, shifting to a new behavior always comes at the expense of another behavior. This does not mean observation should never occur at the enterprise level; it means that every metric must be extremely strategic.

Moreover, the undisturbed natural state of the system includes the organic self-maintenance of the minimum requirements for membership. This self-maintaining state is the emergent Prestige Economy of the company as a superorganism.

Metrics that have an impact on the relative value of individuals within the superorganism must be selected purposefully, either to protect behaviors against change or to drive new adaptation. In a company where it is not yet known what outcome means success, executives must be that much more careful with metrics, because organizational energy will be expended in response to the metric – potentially at the expense of behaviors that would result in future but as-yet-unknown success.

We can break metrics into three simple categories:

  1. The Good – Metrics that reinforce known successful behaviors – or behaviors less certain but expected to drive success and judged by executive leadership as worth the risk – thereby reinforcing organizational attention, energy, and output in the direction of strategic goals, with the known trade-off of less energy being expended on less-important processes.
  2. The Bad – Metrics that reinforce non-priority behaviors, often because leadership does not possess a sufficient understanding of strategic goals and what drives their achievement.
  3. The Ugly – Metrics that shift attention directly into bad behaviors, leaving the process that was being tracked worse rather than better. These are typically a symptom of systemic co-dependency – the members of the organization see that there is a parent-child relationship, feel misunderstood, and act out or lie in order to avoid punishment or gain attention from leadership.


Measure Strategically

Good metrics can only come from leadership, because only leadership is empowered and accountable for strategic trade-offs.  Know what you want for your organization, what you must prioritize, and what you are willing to sacrifice.  Measure the few key numbers you are certain positively reinforce what you want.  If you find yourself relying on consensus, or measuring in aggregate anything organization members are already measuring for their own purposes, slap your own wrist.  You have failed at leading.

A pilot has dozens of metrics available to her. In context, many of them are important to a safe and comfortable flight. You may have noticed that – while Average Aggregate Flight Altitude could be meaningful for scientific research – it is meaningless on an airline annual 10-K. Many organizations do the equivalent of asking a pilot, “What is the most important metric in your cockpit?” Don’t be that leader.

This is part of a series!

Part 1 – Metrics: The Good, The Bad, and The Ugly

Part 2 – How to Fail at Performance Metrics

Part 3 – Rules For Measuring Success

Part 4 – Measuring What Matters to Innovation

Throughout the series I tie together ideas from two great resources:

Kevin Simler’s Minimum Viable Superorganism

Steven Borg’s From Vanity to Value, Metrics That Matter: Improving Lean and Agile, Kanban, and Scrum
