Agile at scale is BigVisible’s specialty. Our coaches have been developing and refining practices for scaling agile methodologies in large enterprises for over a decade. The “BV Way” was developed as BigVisible’s approach to implementation at scale. Recently other scaling frameworks have emerged and are gaining tremendous interest in large enterprises.
The current market leader among these enterprise scaling models is the Scaled Agile Framework (SAFe). BigVisible has embraced SAFe and hybrid solutions that incorporate SAFe and elements of other scaling models, along with our BV Way.
Each approach has its strengths and weaknesses, but any scaling of agile to large programs and portfolios must be built on a firm foundation of team practices that support scale and complexity.
As the agile adoption curve has progressed to the point that many mainstream organizations have experienced benefits using agile at the team level, many now want to multiply those benefits by using agile practices to manage large programs and product suites involving many teams. Some are moving to “agile-ize” their steering and portfolio functions across dozens or hundreds of projects or products.
At this juncture, organizations trying to use agile methods at scale usually find themselves grappling with substantial additional complexity in planning, and in rolling up or consolidating information at higher levels of planning and tracking. Early attempts are often troubled by poor transparency of organizational capacity at different levels, which leads to high variability in forecasting (a fancy way of saying their plans are inaccurate).
In conversations with clients and prospective clients, we often find they are eager to scale agile at the program and portfolio levels before they have reached even modest levels of proficiency in technical and project management practices for delivery execution at the team level. Some think their teams are doing Scrum pretty well; others think that introducing program- and portfolio-level planning structures will somehow strengthen their teams’ weak execution from the top down. This is a dangerous misconception, analogous to thinking that adding floors to a tottering building will somehow strengthen its weak foundation.
When I was VP of Professional Services at Rally, we developed a model for maturing an organization’s Agile capabilities. The “Flow, Pull, Innovate” model nicely illustrates the importance of establishing foundational delivery capabilities at the team level before adding scale and complexity:
[Figure: the “Flow, Pull, Innovate” model. Source: Rally Software]
BigVisible’s “BV Way” similarly ensures that the critical base factors of software production, the development teams, are demonstrably in control BEFORE we expect program and portfolio planning functions to be predictable and managed.
Businesses need to make intelligent tradeoff decisions for scarce development resources. To do this they need reliable data that is accumulated from observations of historical performance. This historical data populates a statistical model that supports higher-level plans, larger organizational structures, and longer time horizons through the aggregation of data.
To forecast accurately, our base-element statistics must provide enough data points, with a low standard deviation. Our confidence in the predictive utility of our data, its accuracy and precision, should be known and communicated transparently. Rolling up weak stats with high variability and/or artificially high precision will inspire false confidence. Both are damaging to the delivery organization’s credibility, morale, and the agile brand in the organization.
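To make “low standard deviation” concrete, here is a minimal sketch that summarizes a team’s velocity history. The function name, the sample velocities, and the use of the coefficient of variation (standard deviation divided by mean) as a single comparable number are my assumptions for illustration, not part of the BV Way or any particular tool.

```python
from statistics import mean, stdev


def velocity_summary(velocities):
    """Summarize per-iteration velocities (hypothetical helper).

    Returns the number of data points, the mean, the sample standard
    deviation, and the coefficient of variation (stdev / mean).
    """
    if len(velocities) < 2:
        raise ValueError("need at least two iterations of data")
    avg = mean(velocities)
    if avg <= 0:
        raise ValueError("mean velocity must be positive")
    sd = stdev(velocities)
    return {"n": len(velocities), "mean": avg, "stdev": sd, "cv": sd / avg}


# Two made-up teams with the same average velocity of 30 points.
steady = velocity_summary([30, 32, 28, 31, 29])
erratic = velocity_summary([10, 45, 22, 50, 23])

print(f"steady:  mean={steady['mean']}, cv={steady['cv']:.2f}")
print(f"erratic: mean={erratic['mean']}, cv={erratic['cv']:.2f}")
```

Both teams average 30 points, but the second team’s spread is roughly ten times wider, so a roll-up that reports only the averages would hide most of the risk.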
Assuming there’s a consensus that it’s bad to rely on a complex structure built on a foundation of marshmallows, what are some key traits and behaviors demonstrated by agile delivery teams that do provide a solid foundation for scaling?
Here are some specific indicators of Delivery Team maturity:
1. Measures of velocity, throughput, and/or cycle time: If these stats aren’t tracked, you can’t know the team’s capacity and you can’t forecast. If the data varies widely (high standard deviation) the team’s capacity estimates inspire low confidence, and they probably have underlying issues that contribute variability and risk. These stats can be gamed, so ensure process disciplines are adhered to (e.g., accepting only done stories of productizable increments).
2. Short cycle times with high-fidelity feedback: Longer cycle times and larger batches yield infrequent feedback/sampling rates and mask variability. Relatively more data points are needed if cycle times are long (4-week iterations, for example).
3. Consistently high acceptance rates: Teams should consistently deliver by the end of the iteration what they committed to in the beginning. If the percentage of story points accepted over those committed (plus pulled in) is consistently >90%, for example, it indicates several positive behaviors and capabilities are probably present: the team understands their work, their relative estimates of PBIs are consistent, they swarm to complete stories and limit work in progress. Teams with low acceptance rates tend to generate highly variable velocities.
4. Thorough Definition of Done (D of D): High acceptance rates don’t mean much if we accept software that isn’t fully validated and integrated, or produce software that may not deploy predictably. A Definition of Done that allows acceptance of anything less than a potentially shippable increment creates batches of risk and introduces variability near release time.
5. Comprehensive test strategy and continuous delivery: It is critical to fast feedback that every code change initiates progressive stages of high-coverage automated test suites in a continuous delivery pipeline. Even if not releasing to production, demonstrating the capability offers assurance that code assets integrate and deliver value predictably. No pipeline promotion step (except maybe a “DEPLOY TO PROD” button) should be vulnerable to human frailties. Software that relies on humans to read or remember or type to build, test, deploy, configure, migrate (or whatever), will suffer exponentially greater variability when scaled to larger programs.
6. Measured and managed technical debt: High levels of technical debt are often related to a weak D of D, or it may simply be the fruit of sins past. Technical debt increases variability and risk. Technical debt includes untested code, code lacking automated test coverage, unmerged code, unintegrated system components, design flaws, and code smells that require refactoring. The lack of enabling infrastructure can also be considered a form of debt or underinvestment.
7. A well-groomed backlog: We generally advise teams and Product Owners to maintain 1.5 to 2.5 iterations of “ready” stories in the backlog (maybe more going into release planning). Well-groomed and ready means more than story cards with estimates. It means ready to code: acceptance criteria everyone on the team really understands, and a viable high-level design approach everyone has an informed consensus on.
8. Continuous planning at all 5 levels: Agile planning at 5 levels suggests an appropriate level of granularity for work items at each time horizon. How many teams have you seen who do standups and sprint planning, and think that’s sufficient to kick scrum butt? Without a roadmap and vision, and a release plan to map strategy to tactical execution, a team is operating myopically with too short a planning horizon. They will fail to address risks, architecture, and design in a timely way (through dialog, spikes, reference implementation, UI guidelines, feature tests/experiments, etc). These teams will fail to provide transparency to stakeholders at the level (feature and release) required to make key decisions.
These elements are mutually reinforcing. It’s hard for a team to deliver well and consistently over the long term with significant shortcomings in any area above. Critically, because variability compounds exponentially with the scale and complexity of a system, it is perilous to believe your program is in control if its constituent teams exhibit significant shortcomings in any of these areas.
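Two of the indicators above reduce to simple arithmetic, sketched here under stated assumptions. The acceptance rate follows the definition in indicator 3 (points accepted over points committed plus pulled in), and the backlog depth follows indicator 7 (ready points expressed in iterations of work). The function names, the sample numbers, and the choice to divide ready points by average velocity are hypothetical, chosen only for illustration.

```python
def acceptance_rate(committed, pulled_in, accepted):
    """Story points accepted over points committed plus pulled in."""
    planned = committed + pulled_in
    if planned <= 0:
        raise ValueError("planned points must be positive")
    return accepted / planned


def ready_backlog_depth(ready_points, average_velocity):
    """Iterations of "ready" work in the backlog, assuming depth is
    measured as ready story points divided by average velocity."""
    if average_velocity <= 0:
        raise ValueError("average velocity must be positive")
    return ready_points / average_velocity


# Made-up history: (committed, pulled_in, accepted) per iteration.
history = [(30, 0, 29), (28, 4, 30), (32, 0, 24)]
rates = [acceptance_rate(c, p, a) for c, p, a in history]

# Flag iterations that fall below the example 90% threshold.
below_target = [i + 1 for i, r in enumerate(rates) if r < 0.90]

depth = ready_backlog_depth(ready_points=60, average_velocity=30)
depth_ok = 1.5 <= depth <= 2.5

print([f"{r:.0%}" for r in rates], "below target:", below_target)
print(f"ready backlog depth: {depth:.1f} iterations, in range: {depth_ok}")
```

As the note on metrics at the end of this article warns, numbers like these are only as good as the discipline behind them; a 97% acceptance rate against a weak Definition of Done says very little.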
Please understand that I am NOT suggesting that if your teams aren’t perfect you mustn’t do release planning, or shouldn’t start implementing SAFe, or that you should delay implementing a “Release Train” to manage your big complex program. By all means, do those things in an informed and deliberate manner, while understanding the limitations of the plan, and aggressively improving the capabilities of the teams.
Indeed, planning should be done early and often, and planning at all five levels continuously is an important element of achieving flow and pull. Just remember that it’s essential to set expectations about the accuracy of plans generated, and to use the appropriate level of precision for the confidence you have in your stats, and for the time horizons in the plan, as errors compound over time.
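As a rough illustration of why precision should shrink as the horizon grows, the sketch below turns a velocity history into a best/likely/worst forecast of iterations remaining. Bounding the forecast at one standard deviation either side of the mean is a simplifying assumption of mine, not a recommended statistical method, and the function name and sample data are hypothetical.

```python
import math
from statistics import mean, stdev


def forecast_iterations(remaining_points, velocities):
    """Return (best, likely, worst) whole iterations to finish the backlog.

    Assumes, for illustration only, that velocity stays within one
    sample standard deviation of its historical mean.
    """
    avg = mean(velocities)
    sd = stdev(velocities)
    low_velocity = avg - sd
    if low_velocity <= 0:
        raise ValueError("velocity too variable to bound the forecast")
    best = math.ceil(remaining_points / (avg + sd))
    likely = math.ceil(remaining_points / avg)
    worst = math.ceil(remaining_points / low_velocity)
    return best, likely, worst


velocities = [24, 36, 30, 27, 33]  # made-up history, mean of 30 points

near_term = forecast_iterations(90, velocities)
long_term = forecast_iterations(600, velocities)

print("90 points: ", near_term)
print("600 points:", long_term)
```

With this data the 90-point forecast spans 3 to 4 iterations, while the 600-point forecast spans 18 to 24. The same team-level variability produces a one-iteration window on the short plan and a six-iteration window on the long one, which is why a long-range plan quoted to the iteration overstates what the stats can support.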
Some corporate cultures make it very hard to share a plan without it being misconstrued as a cast-in-concrete “promise” in a destructive game that inevitably ends in finger pointing, recriminations, demoralized employees, and consequently slower value delivery. Strive to set stakeholders’ expectations that a release plan generated at an early stage of agile maturity is:
1. based on a changeable (agile) backlog, thus is NOT a commitment
2. based on weak foundational stats, thus inherently inaccurate
If you want to scale agile, any significant shortcomings in critical team-level capabilities should immediately trigger an aggressive improvement program involving training, intensive coaching, technology investment, and organization and employee development. This is critical to scale to the program and portfolio levels with the transparency and consistent value delivery your organization demands, and it’s best to begin while leadership and team members have the energy and enthusiasm to effectively embrace the changes necessary to succeed with agile.
A Note on Metrics and Measures
Important Note: In any discussion of metrics and measures, I feel compelled to repeat some classic considerations to help avoid their (all too common) misuse:
- Measure what’s important, not what’s easy to measure.
- Poor metrics are worse than none (and incentives even more so).
- No one metric tells a whole story; a balanced set is essential.
- Understanding trends is more useful than trying to manage by snapshots.
- Mismanagement will engender gaming. People are smart. Feeding gamed metrics to disengaged managers is easy if managers create reasons to do so.
Numbers don’t replace leadership and people, insight and active engagement, and can’t replace the need to collaborate and care.