How to Fail at Performance Metrics

In my last post we reviewed the Hawthorne Effect and other exciting topics.  Check it out!

Throughput Metrics:

So how do we find statistical process metrics that lead to better empirical process output (without dire consequences)?  The ramifications of an “ugly” metric cannot be overstated.  The goal of implementing agile is to reap the benefits of higher team velocity, better fit-to-market, better quality products, and faster time-to-market, while establishing culture and innovation as a competitive advantage.  These are lofty goals. The engineers and other functions you have gathered together likely joined with a desire for meaningful software creation. The natural, undisturbed, and unmeasured systemic state of such a group should be a collaborative effort to create products envisioned by executive leadership. Introducing an ugly metric will near-instantaneously disrupt whatever was gained through agile by driving symptoms of codependency in the organization. It will be a betrayal and will undermine the creative process.

“Managers who don’t know how to measure what they want settle for wanting what they can measure.” – Russell Ackoff

 

First of all, “managers who don’t know how to measure what they want” need to try harder and ask for help from thought leaders, a Google search, or fellow leaders. There is no excuse for allowing a company to hum along without any guidance from its visionary executive leader(s). An enormous number of metrics are possible.  An experienced statistician could produce probability distributions showing the likelihood of correlation between any number of variables and an expected outcome. This does not make them valuable to an executive or appropriate for an organization. A metric must be easy (enough) to understand. Although a fair number of humans (especially engineers) can compute two-variable “fuzzy weighted logic” in their heads, I defy you to find an entire for-profit organization where every person can compute and make informed decisions based on complex multivariate calculus and probability distributions.


 

Vanity Metrics:

We have seen so far that the right reason to have a metric is as a purposeful tool for implementing executive vision while the wrong reason to introduce a metric is to correct the insecurity of executives when they feel “out of touch”. The latter are vanity metrics. They make the executive feel better at the risk of redirecting energy toward behaviors that run counter to success. One example is utilization.  It may feel good to track as a manager, because companies that pay people have taken a risk and want an appropriate return on the social contract known as “salary”.

Unlike some metrics, it is unlikely that utilization gets tracked with a purposeful tradeoff against lead time or cycle time. In other words, to the extent a company adopts agile and prioritizes “responding to change” – or responsiveness in general – maximizing utilization is mathematically counter to agile because it is detrimental to responsiveness.

This has been thoroughly analyzed in queuing theory. If you imagine any one engineer:

  • Demands arrive to the employee at a variable rate.
  • Work is accomplished at a variable rate.
  • There is one worker.
  • The possible queue of demands is potentially infinite.

This type of queue is an M/M/1/∞ queue. Now you may have heard Google has 20% time as a benefit, but when looking at an M/M/1 queue – applied to highway flow, server traffic, or people – the point at which the trade-off between capacity utilization and responsiveness becomes unacceptable is not solved statistically. All that is known is that handling additional requests will eventually need additional capacity.

“As the freeway approaches 100% capacity, it ceases being a freeway. It becomes a parking lot.”

Jim Benson, Personal Kanban: Mapping Work | Navigating Life
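To make the freeway picture concrete, here is a minimal sketch of the standard M/M/1 result for the average time a request spends in the system (waiting plus being worked on), which is 1 / (service rate - arrival rate). The function names and the sample rates are assumptions chosen for illustration; they are not data from any real team.

```python
def utilization(arrival_rate: float, service_rate: float) -> float:
    """Fraction of the worker's capacity that incoming demand consumes."""
    return arrival_rate / service_rate


def mm1_mean_time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Average time a request spends queued plus in service in an M/M/1 queue.

    Returns infinity when demand meets or exceeds capacity, because the
    queue then grows without bound.
    """
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("arrival_rate must be >= 0 and service_rate must be > 0")
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical worker who completes one request per day on average.
    service_rate = 1.0
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        arrival_rate = rho * service_rate
        days = mm1_mean_time_in_system(arrival_rate, service_rate)
        print(f"utilization {rho:.0%}: average turnaround {days:.1f} days")
```

Under these assumptions turnaround doubles between 50% and 75% utilization and keeps climbing steeply from there. The formula shows the shape of the trade-off; it does not say which point on the curve is acceptable for your business.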

 

This is the problem with tracking utilization. What is the “right” utilization number? Executive strategy defines acceptable trade-offs. Unless you clearly articulate a benchmark and its importance, your employees will assume utilization is tracked against 100% of 40 hours and will fill their time until they cannot respond quickly to new requests. The Hawthorne Effect of tracking utilization purposelessly is over-commitment and burnout.

However, as a leader of an organization, you must establish expectations for managers. When are additional resources hired to ensure the desired level of responsiveness? As a rule of thumb, how much work – assuming there is significant work to do – should be assigned to any given employee? Is it okay to keep utilization at 50% for some employees? When is overtime acceptable? Acceptable management practices must be defined based on goals for responsiveness.

This is the difference between “utilizing” an hourly wage warehouse employee by having them sweep the floor an extra time on a given day due to downtime versus cutting a salary-based ambulance and firefighting team due to low “utilization”. The hourly employee typically would not want reduced wages because of a lack of work, and there is always a floor to sweep while they wait – the manager knows they are supposed to keep the employee busy. In contrast, responsiveness to a major fire or someone going into cardiac arrest is prioritized through “excess” capacity, mitigating the risk that demand on the capacity to respond to fires or medical emergencies ever exceeds 100%.

We can see now that tracking capacity and utilization is far less important than tracking responsiveness. In agile software delivery there are two types of metrics that ought to be meaningfully tracked and compared to achievement of company financial goals:

  1. Responsiveness to Change – In aggregate, from the time it is known a market demand has changed, how long does it take to “pivot” and address shifting market conditions?
  2. Feedback Timeliness – For any given point in the process, this is the length of time it takes to validate the intended change was implemented in response to change.
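As a rough illustration of how these two metrics could be computed, the sketch below averages them over a set of change records. The `ChangeRecord` fields and the sample dates are hypothetical; your own tracking system will have different names and may timestamp different events.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean

SECONDS_PER_DAY = 86400


@dataclass
class ChangeRecord:
    demand_known: datetime      # when the shift in market demand became known
    change_released: datetime   # when the response reached users
    change_validated: datetime  # when feedback confirmed the intended change


def responsiveness_days(records: list[ChangeRecord]) -> float:
    """Average days from knowing about a change to releasing a response."""
    return mean(
        (r.change_released - r.demand_known).total_seconds() / SECONDS_PER_DAY
        for r in records
    )


def feedback_timeliness_days(records: list[ChangeRecord]) -> float:
    """Average days from release to validation of the intended change."""
    return mean(
        (r.change_validated - r.change_released).total_seconds() / SECONDS_PER_DAY
        for r in records
    )


# Made-up sample data for illustration only.
sample = [
    ChangeRecord(datetime(2024, 1, 1), datetime(2024, 1, 11), datetime(2024, 1, 13)),
    ChangeRecord(datetime(2024, 2, 1), datetime(2024, 2, 21), datetime(2024, 2, 22)),
]

if __name__ == "__main__":
    print(f"Responsiveness to change: {responsiveness_days(sample):.1f} days")
    print(f"Feedback timeliness: {feedback_timeliness_days(sample):.1f} days")
```

Either number only becomes meaningful once it is trended over time and compared against the company's financial goals.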

 

 

Proxy Metrics:

If the metric you want is nearly impossible to compute reliably, or cannot gain sufficient organization-wide understanding and traction around your vision, you need to find proxy metrics that everyone can agree are indirect leading or trailing indicators that the organization is properly taking the small daily steps that result in annual success. While a good expression of executive vision likely expresses strategic commitment and trade-off at a broad level, employees need an indication of how to make the daily hard decisions that directly impact their status and prestige within the superorganism.

Without this sense of “blessing” surrounding the commitment of time and resources, employees are powerless. Expect diffusion of responsibility and self-protective over-documenting of decisions that are made. In contradistinction, an executive seeking “the good” metrics needs a sharp eye on how a metric will create positive reinforcement of decisions that fit the long-term direction in which the company is moving. If a metric does not reinforce the empowerment and authority you have blessed employees with, so that they make the correct decisions you expect them to make, it is a dreadful metric.

This is part of a series!

Part 1 – Metrics: The Good, The Bad, and The Ugly

Part 2 – How to Fail at Performance Metrics

Part 3 – Rules For Measuring Success

Part 4 – Measuring What Matters to Innovation

Throughout the series I tie together ideas from two great resources:

Kevin Simler’s Minimum Viable Superorganism

Steven Borg’s From Vanity to Value, Metrics That Matter: Improving Lean and Agile, Kanban, and Scrum

Strength to Compete

Excess of strength is the only proof of strength.  

We must strive, fight, and harden ourselves, continuously improve and overcome, to outstrip and outpace our rivals.  We must brace ourselves, proud and resilient, against risk – and even welcome loss when justified – because even in a wound there is the power to heal.  It is a first-principle from the military school of life:  

What does not kill me makes me stronger.

As warriors we expose our weakness happily, welcome vulnerability, fail often, delighted, and inspect, adapt, evolve, and innovate – hard and fast.  We make pain our truth, we make learning our competitive strategy, we make ourselves immune to the setbacks that ruin the weak around us.  In the face of tragedy the warrior in our soul celebrates, and even honors life as the most worthy adversary we will ever face; because, more consistently than any other rival, “life” brings its most formidable weapons against us.  Every artist needs his torture, even more the disrupter and creator of values.

The warrior-champion is born out of, and evermore accustomed to, suffering, and extols his existence by means of tragedy and hardship, because he knows the value of a thing often lies not in what one attains with it, but in what one pays for it; what it truly costs him. Liberated by perseverance, gritting our teeth against pain and loss, war becomes a training in freedom – after all, what is freedom?

Freedom is the will to self-responsibility.

Freedom is a state of spirit; that one embodies the will to self-responsibility.  That one preserves the distance that divides us, even in an embrace.  That one is ready to sacrifice men to one’s cause, oneself not excepted.  Freedom means that the instincts that delight in war and victory within us have gained mastery over all other instincts.  The truly free man is a creator, destroying the past and disrupting the present, a warrior constantly overcoming resistance, five steps from becoming a tyrant while standing on the threshold of servitude.  He combats the tyranny of the pitiless, dreadful instincts with maximum authority and discipline toward himself.  After all, what is strength?

Strength is the will to self-discipline.

It is great danger, our thorough and deliberate exposure to risk, and winning against it, that makes us deserving of reverence.  It is only the real danger of losing everything that first teaches us to know our resources, our virtues, our shield and spear, our very spirit; it is danger that compels us to be strong.  Thus the first-principle:

One must need strength; otherwise one will never have it.

The strongest among us, champions respected throughout history, have felt precisely this way – freedom is something a man attains but can never own, something one always pursues, something for which we must fight, a state one continuously conquers.

Stay strong, rise to the fight!



– An adaptation, extension, and paraphrasing of the works of Friedrich Nietzsche

Photo Attribution:  Rob Weir’s photo of “Atlas (1937) Statue” by Lee Lawrie, Rockefeller Center, NYC

What Westside Barbell has taught me about Scaling Agile

Agile Portfolio Management:

There is a new way of doing things in delivering a complex product portfolio.  It focuses on delivering value both incrementally and iteratively.  It utilizes empirical process control and hypothesis-driven planning.  It utilizes test-driven development in both convergent and emergent delivery, even when budget and scope are fixed.  It utilizes a Lean kaizen approach to maximize velocity.

This philosophy is, by nature, object-oriented and modular.  No one framework is right for every product, so it is highly customizable.  It may sound new to you, but it has been around for quite a while.  But wait – I’m not talking about Agile, Scrum, or Lean software principles – I’m talking about Westside Barbell’s approach to powerlifting.


Waterfall Weightlifting:

Powerlifting is a sport in which the lifter competes for the highest single-repetition maximum in the Squat, Deadlift, and Bench Press for their weight class.  The traditional approach to training powerlifters relied on linear periodization – a method still very valuable for beginning athletes because each phase builds on the last while progressing toward competition-specific strength.

At a basic level, here is a 12-week competition plan:

3 Week Hypertrophy Phase (muscle size, stamina): Sets of 12 to 15
3 Week Strength Phase (movement form, ability to move weight): Sets of 5 to 7
3 Week Power Phase (Explosive speed, maximum weight at progressively higher volume): Sets of 1 to 3
3 Week Peak & Rest (Highest weight, lowest volume): Sets of 1 to 3, tapering off to a few rest days
Competition: Three chances to get three lifts correct, competing against others who are doing the same

For agilists, this correlates perfectly with the “waterfall” approach we try to leave behind:

Hypertrophy phase: Business planning, creative design, and thorough documentation
Strength phase: Database layer, middle-tier
Power phase: Client-side logic, front end development
Peaking phase: Testing, beta release, focus group and stakeholder reviews
Rest days: Code freeze and marketing
Competition: Release to the market, in which you may not recover from failure

Then the lifter starts over.  If there was a big loss (e.g. an injury) pre-competition, the weightlifter might not compete at all – just like a software project that gets cancelled after key engineers leave or technical debt gets too high to meet the release date.  More problematically, if there is a big loss or injury at the competition, the lifter may never compete again – just like the software team with a botched release that gets “reassigned” or laid off.


Repeating the Cycle:

The weightlifter who perseveres, win or lose, still has big “waterfall” problems.  The lifter rests a little and repeats the linear progression cycle, an exercise in bodily context-switching.  When the next hypertrophy phase starts post competition, most of what was developed in the previous cycle is gone!  The same is true of each phase.  When the lifter resumes focus on 3-rep max, some hypertrophy and stamina are lost.  As the lifter peaks for competition, the 1-rep max may increase but the 5-7 rep range decreases.  Studies show that after a few weeks in the subsequent hypertrophy phase, up to 15% of single-repetition strength is lost.  Returning to foundational planning (increasing stamina and size) sacrifices a considerable amount of the value already captured (the ability to perform the same single-rep max).

What does this specificity-switching cost the lifter?  As a beginner, not very much – any work will improve size, conditioning, and maximal strength, and fantastic progress can occur.  The discipline of repeating the movement pattern likewise increases maximal strength even with little planning.  However, once the lifter goes from a beginning athlete – a time when nearly anything will improve the lifts – to an intermediate athlete, subsequent peaking phases will see little or no increase.

The process requires disruption if total stagnation is to be avoided.

If this sounds like delivering software in waterfall, it is!  As you read this quote from a strength coach describing the “waterfall” lifting approach, think about the Waterfall PMO:

Having now gotten away from this type of training and looking back as an outsider, I can see where the program is lacking and why I had so many problems. I used to feel it was the only way to train (mostly because it was all I ever knew). It was also the only type of program for which I could find a lot of research. Some of the limitations to this linear style of periodization include:

  • It’s a percentage-based program
  • It starts with a high volume
  • It only has one peak
  • Your abilities aren’t maintained
  • The program has no direction to the future

– Dave Tate via T-Nation.com

Here are the parallel problems we see with waterfall:

  • “It’s a percentage-based program” – accounting-based statistical process controls are applied to an emergent system
  • “It starts with a high volume” – a significant portion of the budget is spent planning, designing, and fighting about features that no user wants (and if the project is cancelled, 100% of this sunk cost never drives user or owner value capture)
  • “It only has one peak” – a major release attempts to market itself to all segments simultaneously and a flop may kill the product line completely
  • “Your abilities aren’t maintained” – once the waterfall project plan is set in motion, market evaluation, user feedback, and stakeholder review are non-existent
  • “The program has no direction to the future” – a waterfall project plan is delivered based on the knowledge available at the beginning of the project when the least is known and has no intrinsic method of looking to the future relationship between the user market that might exist and the software that could be produced.

Westside Barbell’s “Conjugate Method”

The Conjugate Method attempts to balance all phases across preparation for competition. At the “enterprise level” three movement patterns are continuously tested as the measure of the process. At the “business level” a new variation of a similar movement may become the focus for 3 to 5 weeks (e.g. training rack pulls instead of full deadlifts when “lock out”, the upper portion of the movement, is the weak link). At the “team level” (the lifter + coach), the two-week sprint has a consistent set of ceremonies and artifacts (workout plan, workout log, the workout, etc.).

Here is an example:

Week 1
Monday – Max effort lower body day (squat + low back + hamstrings), focuses on strength and power
Wednesday – Max effort upper body (bench press), focuses on strength and power
Friday – Dynamic effort lower body (squat, deadlift), focuses on speed and hypertrophy
Sunday – Dynamic effort upper body (bench press), focuses on speed and hypertrophy
Week 2
Monday – Max effort lower body day (deadlift + low back + hamstrings), focuses on strength and power
Wednesday – Max effort upper body (bench press), focuses on strength and power
Friday – Dynamic effort lower body (squat, deadlift), focuses on speed and hypertrophy
Sunday – Dynamic effort upper body (bench press), focuses on speed and hypertrophy

This correlates nicely with “core” Scrum concepts:

  1. Maximal strength is tested every week – working software every sprint
  2. The metric (1-rep max / story points delivered) is improved over time (strength / velocity) through hypothesis and experiments (empirical process control)
  3. The entire body is trained for size, stamina, strength, and power every week – vertical slicing and user stories
  4. The lifter gets to experiment with new exercises without fear of wrecking a 15-week cycle – sprint retrospective, sprint planning
  5. The coach focuses exercise planning on addressing weak points – a ScrumMaster, removing impediments
  6. The powerlifting competition is not a unique event with a long lead time – working software every sprint, TDD, XP, continuous integration and release

Now the lifter, like our Scrum team, gets to plan, experiment, and deliver often.  The overall roadmap (Lean + Scrum) might have a basic end-game or vision (increasing 1-rep competition max performed on 3 lifts the same day is equivalent to convergent product delivery), but planning only looks forward up to 5 weeks, with commitment at 1 to 2 weeks.  Likewise, the lifter and coach are always looking at the most recent data and the newest lessons learned, and quickly react to whether a behavior, practice, or process should be continued or not – just as the Product Owner, ScrumMaster, and Team are always planning and executing based on the most recent market and team data.


Applications to the SDLC:

Now we can extend the metaphor and draw conclusions.  The powerlifter’s body equates to a complex large-scale digital portfolio.  The lifter needs to increase the value of three programs that focus on convergent product delivery while also developing several programs that utilize emergent product delivery.  In waterfall these two program methods are separated by functional division and project lifecycle; in conjugate (Scrum) the two are handled in tandem.

For the powerlifter, the three convergent products are squat, deadlift, and bench press.  Quality must stay constant or the increase in value does not qualify.  The same is true in software products – adding a high-value feature while allowing a 50% increase in crashes on launch is absolutely unacceptable.  Your users will disqualify you!  Whether you have a three-application enterprise CRM program or a three-iOS app consumer program (see LinkedIn or Facebook as examples), adding an exciting feature to an app that causes mass user drop out is a risk no business can tolerate in today’s market.  The competition is too fierce, the barrier to entry too low; someone will blow you away.

At the same time, the powerlifter needs to maintain several emergent delivery programs, some for function (increasing grip strength), some for fun (increasing bicep size).  Ongoing workout plans, building size, stamina, and maintaining joint health, addressing weak points by focusing on a new accessory exercise for 5 weeks – all of these priorities must be balanced and evolved.  Keeping a workout log is the only way to be sure that exercise volume, intensity, and density are increasing.  The relationship between the convergent product value and the emergent product investment is the only metric rationally applicable.  The same is true in software delivery.  Emergent-delivery programs like R&D, marketing, UX, product planning are all critical to the health and success of the portfolio as a whole – but the end goal must be clear.

  • Over-planning and under-delivering is not acceptable.
  • Over-researching and under-user-pleasing is not acceptable.
  • Over-designing and under-testing is not acceptable.
  • Over-marketing and under-releasing is not acceptable.

Conclusion:

The Conjugate Method as an analogy for Agile, Scrum, XP, and Lean at scale works for me because I love lifting.  I realize it may not be right for you, especially if neither agile nor weightlifting is familiar territory.  So, like everything, find how this applies to your life so that you can find inspiration in the ordinary – then start a conversation about it.  I’m happy to discuss anytime:  224.223.5248

Never Hold Back, Never Give Up

Fail fast and learn quickly.  Free yourself from fear.  Take control of who you are, the value you create, and the judgement of that value.  Be “The Good” as you define it for the world around you.

#life #power #strength  #philosophy  #psychology