AgileIQ Blog

One Metric to Rule Them All and in the Darkness Bind Them

by Wes Williams

I think metrics and measurements are good when they are used correctly for the context and the team I am working with. I use metrics to help a team see what its issues are. Once the team sees those issues, we use metrics to determine, as early as possible, whether the changes we are making are having a positive or negative impact on those issues and on the rest of the system.

Measurements ARE necessary to know that we are headed in the right direction.

There are plenty of articles out there about abusing metrics. I thought it well known that all metrics need to be balanced (e.g. code coverage going up and complexity going down), and of course they need to be trended to be useful.

Now I have a request to find one or two metrics to apply to all teams to determine how effective Agile and coaching are at improving the teams. Does someone really think that one or two metrics can be used to determine effectiveness?

All teams do not have the same highest priority issue(s). Teams with terrible user stories and acceptance criteria do not need the same metrics as a team trying to fix high coupling code issues.

Ok, enough complaining! To help me, and I hope others, I want to write about 1) the goals of specific metrics, 2) the dangers and abuses of those metrics, and 3) how to balance those metrics against each other.

Average Velocity trend

Goals:

  • Predictability!! What can be done by a specific date or when can something be completed.
  • Velocity is a capacity measure, NOT a productivity measure.
  • Velocity allows a team to know how much business value they can deliver over time.
  • Developing a consistent velocity allows for more accurate (i.e. predictable) release and iteration planning.
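
As a rough sketch of how a velocity trend can feed a forecast, the snippet below smooths velocity over a trailing window and turns a remaining backlog into a best-case/worst-case range of iterations. The function names, the three-iteration window, and the sample numbers are assumptions made up for illustration, not part of any tool.

```python
from math import ceil
from statistics import mean


def rolling_average(velocities, window=3):
    """Average velocity over each trailing window of iterations."""
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        mean(velocities[i - window + 1 : i + 1])
        for i in range(window - 1, len(velocities))
    ]


def iterations_to_finish(backlog_points, velocities, window=3):
    """Return (best_case, worst_case) iterations for the remaining backlog.

    Assumption: the fastest and slowest of the most recent `window`
    iterations bound what the team is likely to do next.
    """
    recent = velocities[-window:]
    if not recent or min(recent) <= 0:
        raise ValueError("need recent iterations with positive velocity")
    best = ceil(backlog_points / max(recent))
    worst = ceil(backlog_points / min(recent))
    return best, worst


# Hypothetical story points completed in the last five iterations.
velocities = [18, 22, 20, 24, 21]
print(rolling_average(velocities))
print(iterations_to_finish(120, velocities))  # (5, 6)
```

Reporting a range rather than a single number keeps the conversation on predictability, which is the goal here, and away from productivity.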

Possible abuses:

  • Calling this a measure of productivity. Focusing on velocity alone could even hurt productivity. Teams can artificially increase velocity in many ways: stop writing unit tests or acceptance tests, increase estimates, stop fixing story defects, and reduce customer collaboration, just to name a few.
  • Comparing velocity between teams. Velocity is a team value and not a global value. Many variables affect a team's velocity, including relative estimating base, support requirements, number of defects, political environment of the product or project, and more.
  • Calculating velocity by individual. This leads to a focus on individual performance vs. team performance (i.e. sub optimization).
  • Using velocity to commit to the content of an iteration when the value is not valid. Velocity is a simple concept and provides a lightweight measure, but it is also a very mature measure. To be useful it requires estimation maturity, applied consistently over a period of time by a stable team. If it lacks these elements, it can be abused by management or by the team itself. The latter happens when a team makes assumptions about the validity of the metric when, without the mature elements in place, it is not usable at all.

Balancing metrics:

  • Percentage of rework vs. stories done on average each iteration. This can help a team see how much of their work in each iteration is delivering new value to the team's customers.
  • Planned work vs. unplanned work trend. A lot of unplanned work will cause a team’s velocity to be of less value because it hinders the team's ability to plan. Having a low value for unplanned work will make the team’s planning more consistent and accurate.
  • Code quality metrics such as code test coverage, cyclomatic complexity, static error checking, and performance. A team that is increasing their velocity by not focusing on code quality is making a short term decision that will have a negative impact over time.

Delivered Features vs. Rework Resolution trend

Goals:

  • Makes _waste_ visible so that it can be eliminated.
  • Gives the team a good understanding of how much of their iteration capacity is consumed by rework (i.e. _waste_).
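
One simple way to make that waste visible is to track the share of each iteration's capacity that went to rework. The sketch below assumes the team records feature points and rework points separately for each iteration. The function name and the sample data are hypothetical.

```python
def rework_share(feature_points, rework_points):
    """Fraction of an iteration's completed points spent on rework."""
    total = feature_points + rework_points
    if total == 0:
        return 0.0
    return rework_points / total


# Hypothetical (feature_points, rework_points) for three iterations.
iterations = [(20, 5), (18, 6), (15, 10)]
trend = [round(rework_share(f, r), 2) for f, r in iterations]
print(trend)  # [0.2, 0.25, 0.4]
```

A rising share, as in this made-up data, is the kind of trend worth raising with the team before it shows up as a long regression period.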

Possible abuses/issues:

  • Lagging indicator of the team quality.
  • Story defects are not worked on until a regression period, giving a short term indication of fewer defects.
  • Increasing story estimates and/or reducing defect estimates.
  • Hiding defects as stories.

Balancing metrics:

  • An inconsistent velocity. Delaying defect correction until later will make the velocity trend erratic with large spikes.
  • Planned vs. unplanned scope. A team that is delaying defect correction will tend to have more unplanned work due to poor quality issues.
  • Number of defects in the backlog. Ideally this number should be on a downward trend. An upward trend of the number of defects in the backlog could indicate the team is delaying defect correction.
  • Increasingly long regression periods at the end of each release.

Completed Work vs. Carryover trend

Goals:

  • Show how well the team executes the iteration (i.e. delivers on their commitments).
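
A minimal sketch of the carryover side of this trend, assuming the team records the points it committed to and the points it completed in each iteration (the names and numbers are illustrative only):

```python
def carryover_rate(committed_points, completed_points):
    """Share of committed points that were not finished in the iteration."""
    if committed_points <= 0:
        raise ValueError("committed_points must be positive")
    carried = max(committed_points - completed_points, 0)
    return carried / committed_points


# Hypothetical (committed, completed) points for three iterations.
history = [(30, 24), (28, 28), (32, 20)]
rates = [carryover_rate(committed, done) for committed, done in history]
print(rates)
```

A rate that stays at zero deserves as much attention as one that is high, because it may mean the team is planning less work than it is capable of.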

Possible abuses:

  • Planning less work than the team is capable of to allow for interruptions or poor estimating.
  • Delaying refactoring (or other good practices such as TDD/unit testing) in order to complete work, at the expense of keeping the code at a level that makes change cheaper and easier in the future.

Balancing metrics:

  • A velocity trend that is not improving or is going down could be caused by planning less than the real capacity of the team.
  • Planned vs. unplanned work can indicate whether the team is being interrupted, which causes task switching that could explain the carryover.
  • Downward test coverage trend and/or upward cyclomatic complexity trend could indicate that the code is becoming more difficult to change and much more difficult to estimate accurately.

Planned vs. Unplanned Scope trend

Goals:

  • Show how good the team is at planning.
  • Show how often the team is being interrupted within the iteration to work on something that wasn't originally planned.
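
Because a trend matters more than any single iteration, a sketch like the following could compute the unplanned share per iteration and then a least-squares slope across iterations. The function names and data are assumptions for illustration. A positive slope only suggests that interruptions are growing and does not say why.

```python
def unplanned_share(planned_points, unplanned_points):
    """Fraction of an iteration's work that was not in the original plan."""
    total = planned_points + unplanned_points
    if total == 0:
        return 0.0
    return unplanned_points / total


def trend_slope(values):
    """Least-squares slope of values against iteration index 0..n-1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two iterations to see a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical (planned, unplanned) points for four iterations.
iterations = [(24, 2), (22, 4), (20, 6), (18, 9)]
shares = [unplanned_share(p, u) for p, u in iterations]
print(shares)
print(trend_slope(shares) > 0)  # True: the unplanned share is growing
```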

Possible abuses:

  • Large placeholders to allow unplanned work to come in and appear to be part of the planned work.

Balancing metrics:

  • Delivered Features vs. Rework Resolution trend
  • Completed Work vs. Carryover trend

Code Coverage vs. Cyclomatic Complexity trend

Goals:

  • Reduce the cost of change. Clean code tends to make the application easier to understand and safer to change.
  • Indicate that the system is being tested at an accurate level.
  • Indicate that the code quality is good: loosely coupled, simple as possible, etc.

Possible abuses:

  • Focusing only on one code metric, e.g. 100% code coverage with generated tests will not make the code easier to understand or change.
  • Focusing on code quality alone and not focusing on the business goals of the customer.
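
To illustrate reading the two code metrics together rather than one at a time, here is a small sketch. It assumes coverage and average cyclomatic complexity are sampled once per iteration, and it compares only the first and last samples. Both the comparison rule and the names are simplifying assumptions.

```python
def direction(values, tolerance=0.0):
    """Return 'up', 'down', or 'flat' comparing the last sample to the first."""
    delta = values[-1] - values[0]
    if delta > tolerance:
        return "up"
    if delta < -tolerance:
        return "down"
    return "flat"


def balanced_reading(coverage, complexity):
    """Return (coverage_direction, complexity_direction, needs_review).

    needs_review is True when coverage is falling or complexity is rising,
    even if the other metric looks healthy on its own.
    """
    cov = direction(coverage)
    cx = direction(complexity)
    return cov, cx, (cov == "down" or cx == "up")


# Hypothetical samples: coverage in percent, average complexity per method.
print(balanced_reading([60, 70, 80], [12, 14, 15]))  # ('up', 'up', True)
print(balanced_reading([60, 70], [12, 10]))          # ('up', 'down', False)
```

The first case is the one a single-metric view would miss: coverage is rising, yet the code is getting harder to change.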

Balancing metrics:

  • Velocity trend
  • Delivered Features vs. Rework Resolution trend
  • Afferent and efferent coupling trends
  • Abstractness trend
  • Package dependency cycles
  • Number of changes in class(es)

This is far from an exhaustive list of metrics! But I hope the idea helps: think about a metric, what your goal is in measuring it, and how you can stop yourself or others from gaming it by balancing it with other metrics.

Comments

Wes -- 
 
A thoughtful and useful post. I do have a quibble: 
 
> Planned vs. Unplanned Scope trend 
>  
> Goals: 
>  
> Show how good the team is at planning. 
> Show how often the team is being interrupted within the 
> iteration to work on something that wasn't originally 
> planned. 
 
I'm not sure that this metric shows how good the team is at "planning"; I completely agree it can show how often the team is interrupted, and perhaps initiate a team discussion with the interrupters to see if the work is truly emergent and urgent. 
 
If *really* good planning could prevent all unexpected work or accommodation of change, perhaps the team could shift into the business of "planning" the stock market leaders, horse-race winners, or drawn lottery numbers ;-)  
 
Cheers.  
 
+ Michael 
Posted @ Tuesday, January 22, 2013 6:52 PM by Michael Tardiff
Michael,  
 
I agree, and I don't believe there is such a thing as a 'good plan' in the sense of one that can predict all unexpected work. Seeing unplanned scope on occasion does not bother me. It is a consistent trend that says we are not planning for things we should be expecting. There are several ways a team could handle that.

• Create an expedited queue that is limited. This is the idea that David Anderson talks about in the class of service section of his Kanban book

• Move to a fully flow-based system, like Lean/Kanban systems, that removes the artificial planning boundary and requires all stakeholders to prioritize the work together and set appropriate WIP limits

• As you stated, meet with the team/person to understand the real priority and hopefully the cause of the work

Since I wrote the article, I really feel that in many contexts the flow-based system is the better way to go.
 
Thanks for the comments!
Posted @ Wednesday, January 23, 2013 10:35 AM by Wes Williams
I should add if I were to rewrite this I would throw out the term good plan and focus more on a forecast and the need to adjust with experiments and a focus on understanding causes as well.
Posted @ Wednesday, January 23, 2013 1:17 PM by Wes Williams
Good article; however, I will take the side of management here. Where is the management reporting? This seems like it's for the team, which is fine. But as a manager I want to know: are we on schedule, budget, and scope? Heresy in the agile community, where there aren't really any managers anymore, but there it is.
Posted @ Wednesday, January 23, 2013 4:15 PM by Dan Williams
This is management reporting, but it isn't a magic 8 ball (there isn't a magic 8 ball). All of these metrics together give feedback on the question 'what are the team's capabilities in the current context?'. The real question is what I will do with the information they give me. Will I look for blame, or will I use it to adjust the system and help the team improve their capabilities?
 
Metrics can only guide us and are only good in terms of letting us see the variability in a system. This is usually seen through trends. Managers who want a fixed scope, fixed budget and fixed date, and let's be real that is what a bad manager wants, have asked the team for the impossible.  
 
As a related example of these ideas, even Scrum has taken the word commitment out of the Scrum Guide and replaced it with goal. Still not quite the correct word, but a step in a better direction. Goal at least has some acknowledgement that people are committed to doing a good job without asking for further commitment, that the work is not completely predictable even if certain practices do have predictable outcomes, and that many things that will affect the team's ability to meet a goal are outside of their control.
Posted @ Wednesday, January 23, 2013 4:54 PM by Wes Williams
I should add, this is not advice I give without having practiced it. I have not given a fixed date estimate in years, though I have had it requested of me many times. What I do give is an estimate as a range, with the likely and possible impediments and risks that I believe to be outside of the team's control, as well as a list of those I believe to be likely or possible that are within the team's control. This was rarely what was asked for by the immediate manager. It was, however, appreciated in most cases, and it allowed the manager to have a good conversation with senior leaders and stakeholders. It also allowed later conversations to move more quickly to what is the best thing to do now that 'X' has happened.
 
... and I was never fired for giving this type of information.
Posted @ Wednesday, January 23, 2013 5:06 PM by Wes Williams
I like the article, Wes. There was one phrase that got me thinking: 
 
> Velocity allows a team to know how much business value they can deliver over time. 
 
Are you saying that there is a direct correlation between story points (velocity) and business value? In my experience, there isn't; I don't see an 8 point story necessarily providing more value than a 3 point story. We coach that story points are an estimate of effort, not value. 
 
Assigning value to a story is in the product owner's realm, not the team's. The product owner can then use that value, along with the team's estimate of effort, to assign priority. 
 
Thoughts?
Posted @ Wednesday, January 30, 2013 2:13 PM by Steve Bement
Steve, that is a very good catch of a misleading wording. Velocity is not a measure of how much value the business will get. I simply mean it is not a duration estimate or productivity estimate but a rough comparative size of the business value. I am explicitly saying it should also only represent business value comparative size. 
 
It may be more accurate to say that whatever I measure as part of velocity I should always measure as part of velocity. Do not mix units, e.g. duration and comparative size.

Another part of making it consistent is always measuring the same types of work. I called out that I usually only count business value things in my velocity. I don't recommend creating 'stories' for defects, meetings, etc. If those items are in my velocity they also have to be in my plan for it to be accurate. That makes release planning much more difficult. But if I allow defects, meetings, etc. to lower my velocity it becomes a much better number for planning. Of course it would be possible to have some planned defect correction and some unplanned, but the unplanned defects should not be counted in velocity.
 
This last paragraph is a concept that deserves a blog post of its own but I hope it helps you understand what I mean in this case by a measure of business value. It is not amount but comparative size of business value. Amount in this case should have been replaced with comparative size.
Posted @ Wednesday, January 30, 2013 3:19 PM by Wes Williams