I’ve been thinking about Agile architecture recently, since a seasoned consultant friend (Payson Hall of Catalysis, Inc.) sent me a literal cry for help. I’m going to begin by quoting his email at length because what he wrote is so descriptive of a certain challenge and mindset. “Children running around…” I love it! Read on…
Help me Obi-Wan-Halim… you are my only hope!
I need to make some time to rave with you.
As a former developer I am completely on board with Agile principles as a means of organizing a development team. I’m sold. Agile seems particularly relevant when doing small simple projects, or when doing greenfield development.
I’m not suggesting Agile isn’t up to the task of complex system development (I know you have done some complex systems), I just am tired of children running around and telling me that “architecture is an emergent property of Agile and I shouldn’t worry about it.”
Here’s the context of my concern: There is a large legacy system that was developed over 30 or 40 years that is nearing end of life. It processes millions of dollars’ worth of transactions each month. It has something like 150 interfaces. It may be an aging rat’s nest, but it works.
I have no doubt that a handful of smart people could analyze this system and partition functionality, then build module specifications and a sequence road map for components that could be built and swapped in using Agile methods… but I am watching a bunch of children wave around the Agile manifesto and tell me that my concerns about complexity are because I have gray hair and don’t understand modern development approaches. They think the product owner should prioritize based on business value and everything will be swell.
I need you to either talk me down… tell me it’s time to retire… or give me some references to people who have used Agile to build really complex systems to replace legacy systems – how their teams were structured, how they managed architecture, and how they moved things into production “every few weeks” without a huge burden of regression testing.
You can’t build a skyscraper by starting with a one-story house and deciding you want to “go up.” If you are going to go up 50 stories you need to lay in the foundation and plumbing and electrical with sufficient robustness at the start (swap in security, data integrity, assignment of functionality, and air traffic control for transaction sequence if you want a software metaphor).
Well, this made me smile. I rolled up my sleeves, rubbed my hands together, and gave my knuckles a good crack. Here’s what I told him:
Payson, what you are hearing in what these “kids” are saying or seem to be saying … just … no. Agile architecture does not mean no overview and no planning – although it does offer significant advantages over traditional approaches in many situations, including the one you describe. With your seniority, you can help the organization steer a middle course between extremes on both sides.
Agile architecture principles are the same as for Agile in general – where there is lots of planning. It just doesn’t happen all up front. The idea, as you know, is to break the work up into small, well-structured pieces; get rapid feedback and learn as you go; keep an eye on shifting business and technical priorities; and let the plan evolve based on your learning. There’s no simple formula for this – every aspect, done well, requires the ongoing judgment and collaboration of the delivery teams, the product owner, and often other stakeholders; in a large Agile program or organization these will often include architects!
What we’re looking for with Agile architecture is for the architects to work with the delivery teams, not dispense pronouncements in advance from far away and high above the action. Let the architects step into some responsibility for teaching, mentoring, and coaching the teams so that team members can develop their abilities and deliver with excellence. And let architects learn from the team’s experiences as they engage with the actual codebase and development challenges. In this way, the architecture can evolve with intelligence. You have ongoing adaptation based on feedback-driven learning. You get none of these benefits from big up-front design and early monolithic commitments.
Looking at your immediate challenge: For end-of-life replacement, partitioning a complex existing system and replacing old code in increments with new architecture and strong interfaces is a sound idea, really the only one I know of for rebuilding large legacy systems. And replacements should be tackled in business value order, except where technical dependencies or risk reduction suggest otherwise, and as conditioned by the cost to replace the various elements. But I suggest you start from the business view. Where will the sponsor get the biggest bang for their buck? Where is money hemorrhaging? What parts of the code cause the most production outages, or other defect-correction load on dev teams (which causes reduction in delivery of new features, and the costs of delay that follow from that)? What are the biggest drivers of user support calls? Where are the rigidity and fragility of the current code most impeding the need to move quickly for regulatory or legal compliance or product innovation? Which subsets of the data domain or which interfaces get the most use or carry the highest dollar volume? And so on. Of course, balancing these considerations can be a complex multi-stakeholder exercise, for which many organizations lack the maturity.
(Organizational maturity is a whole topic of its own; look for another post on that some day…)
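To make the ordering idea above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the component names, the 1-10 scores, and the simple "(value + risk reduction) / cost" ratio are hypothetical, and the greedy selection is deliberately naive. It only shows the shape of the reasoning: take the highest-scoring piece next, unless a technical dependency says something else has to go first.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A hypothetical slice of the legacy system, scored by stakeholders."""
    name: str
    business_value: int    # assumed 1-10 scale
    risk_reduction: int    # assumed 1-10 scale
    replacement_cost: int  # assumed 1-10 scale, must be > 0
    depends_on: list[str] = field(default_factory=list)

    @property
    def score(self) -> float:
        # Illustrative ratio only: more value and risk reduction per unit of cost ranks higher.
        return (self.business_value + self.risk_reduction) / self.replacement_cost


def replacement_order(components: list[Component]) -> list[str]:
    """Greedy ordering: best-scoring component whose dependencies are already replaced."""
    by_name = {c.name: c for c in components}
    done: list[str] = []
    remaining = set(by_name)
    while remaining:
        # Sort names so ties are broken the same way on every run.
        ready = [
            by_name[n]
            for n in sorted(remaining)
            if all(dep in done for dep in by_name[n].depends_on)
        ]
        if not ready:
            raise ValueError("circular dependency among remaining components")
        best = max(ready, key=lambda c: c.score)
        done.append(best.name)
        remaining.remove(best.name)
    return done


# Hypothetical scores; in practice these come out of the multi-stakeholder conversation.
components = [
    Component("billing", business_value=9, risk_reduction=7, replacement_cost=4,
              depends_on=["ledger"]),
    Component("reporting", business_value=4, risk_reduction=2, replacement_cost=3),
    Component("ledger", business_value=3, risk_reduction=3, replacement_cost=6),
]

print(replacement_order(components))  # ['reporting', 'ledger', 'billing']
```

Notice that billing has the best score but still comes last, because it cannot be swapped in before the ledger. A real product owner would probably pull the ledger forward for exactly that reason, which is the kind of judgment a formula like this cannot supply on its own.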
So – by all means, put a sound foundation in place at the beginning. Show respect to existing architectural requirements, security and data privacy and other standards, goals, and conventions in the containing organization and systems, while using what you know today to advance the game. Maybe you’ll take this opportunity to introduce microservices. But I see no sound basis to insist on partitioning the entire complex system into specific planned increments before starting to develop the first piece. Sketch the territory, by all means, and then begin by delivering the first piece. Establish an initial plan and architecture, but hold them loosely while you are in early learning – and don’t go too far down the road before you pause to see what you’ve learned and adjust if needed. If you’re building with sound engineering, it will almost always be cheaper to refactor as you go than to continue with a design that turned out not to be such a hot idea after all, or to delay your early returns while too much foundation work takes up time.
To take one specific example, you don’t need and should not attempt to architect, say, a standard glue layer for all 150 external interfaces at the beginning of the project. Don’t start by taking time to dream up a standard to handle every possible wrinkle that could come up. Instead, lay in a sound foundation as a just-sufficient starting point and then use common sense and good communication among teams to evolve the architecture so it serves the needs that emerge incrementally as you go.
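As a rough sketch of what a "just-sufficient starting point" might look like for those interfaces, consider the Python below. The names (`OutboundInterface`, `BankFileAdapter`, `TaxServiceAdapter`, `dispatch`) and the message formats are all hypothetical. The point is only that the shared contract starts with the one operation the first two interfaces actually need, and grows when a third or thirtieth interface demands more.

```python
from typing import Protocol


class OutboundInterface(Protocol):
    """The smallest contract the first adapters need; extend it only when a real need appears."""

    def send(self, transaction: dict) -> str:
        ...


class BankFileAdapter:
    """Hypothetical adapter: formats a transaction as a comma-separated line."""

    def send(self, transaction: dict) -> str:
        return f"{transaction['id']},{transaction['amount']:.2f}"


class TaxServiceAdapter:
    """Hypothetical adapter: formats the same transaction as a pipe-delimited message."""

    def send(self, transaction: dict) -> str:
        return f"TAX|{transaction['id']}|{transaction['amount']:.2f}"


def dispatch(transaction: dict, interfaces: dict[str, OutboundInterface], target: str) -> str:
    """Route a transaction to one named interface; new code never touches format details."""
    return interfaces[target].send(transaction)


# Start with the interfaces replaced so far, not all 150.
registry: dict[str, OutboundInterface] = {
    "bank": BankFileAdapter(),
    "tax": TaxServiceAdapter(),
}

print(dispatch({"id": "T1", "amount": 12.5}, registry, "bank"))  # T1,12.50
print(dispatch({"id": "T1", "amount": 12.5}, registry, "tax"))   # TAX|T1|12.50
```

When a later interface needs, say, acknowledgements or retries, that is the moment for the teams to talk and decide whether the shared contract should grow, rather than guessing at it on day one.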
Now, it’s true – all of this places some important demands on communication, coordination, and discipline within the multiple teams of a large program or enterprise systems complex. It’s the farthest thing from a free-for-all. You need arrangements and humans that can integrate the flow of intelligence and creativity both top-down and bottom-up, as well as laterally. Another signal of the need for organizational maturity.
As to regression testing – yes, as you say, it is a huge challenge. Automated testing can be a big part of the answer, and it’s true that test automation is difficult to retrofit for a large system with tightly coupled components: as we have often seen, changes that seem local can have surprising and unpredictable distant consequences, while details of intended functionality are often lost in history and obscured by defects in the current code. Domain SMEs and other stakeholders will have to participate in test development to confirm expected behaviors. (As it turns out, these same difficulties also plague manual testing.)
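One common way to start when the intended behavior has been lost is a characterization (or "golden master") test: record what the legacy code does today, then check each replacement against that record. The Python sketch below is an illustration under stated assumptions, not a recipe. `legacy_fee` is an invented stand-in for an opaque legacy routine, and the fee rules in it are made up.

```python
import json


def legacy_fee(amount_cents: int, customer_type: str) -> int:
    """Invented stand-in for a legacy routine whose rules nobody fully remembers."""
    fee = amount_cents * 2 // 100
    if customer_type == "GOV":
        fee = 0
    if amount_cents > 1_000_000:
        fee += 500  # large-transaction surcharge, applied even to GOV customers
    return fee


def record_golden(func, cases):
    """Capture current behavior for each input; keys are JSON so the record can be saved to a file."""
    return {json.dumps(case): func(*case) for case in cases}


def check_against_golden(func, golden):
    """Return (inputs, expected, actual) for every case where the new code disagrees."""
    mismatches = []
    for key, expected in golden.items():
        args = json.loads(key)
        actual = func(*args)
        if actual != expected:
            mismatches.append((args, expected, actual))
    return mismatches


def replacement_fee(amount_cents: int, customer_type: str) -> int:
    """A plausible rewrite that assumes GOV customers never pay anything."""
    if customer_type == "GOV":
        return 0
    fee = amount_cents * 2 // 100
    if amount_cents > 1_000_000:
        fee += 500
    return fee


cases = [(10_000, "RETAIL"), (10_000, "GOV"), (2_000_000, "GOV")]
golden = record_golden(legacy_fee, cases)

print(check_against_golden(replacement_fee, golden))
# [([2000000, 'GOV'], 500, 0)]
```

The test cannot say which answer is right. It only surfaces the disagreement, and that is exactly where the domain SMEs come in: is the surcharge on government customers a rule or a thirty-year-old defect?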
Building automated regression tests for a legacy system has to be budgeted, including time from both the IT team and stakeholders. On the other hand, if you rely on manual regression testing (which you may have to, except where you have extreme isolation in the bits you are committing), it will not only force a slower pace overall, but the expense of manual runs will force you to test less often and in larger increments. This increases risk, makes debugging more difficult, and delays critical learning. In a situation like this, you can just forget about going to production every two or three weeks. Not gonna happen. (Side note – Manual testing at scale is also a challenge. I was once asked to approve a manual test plan for a quarterly release – two thousand pages of cut-and-paste errors and incomprehensible instructions.) In my experience, large manual regression cycles tend to produce a false sense of security in people who are far from the front lines.
So, to summarize – Agile can make big, hairy situations like this somewhat less impossible, but not necessarily easy. Think about it this way: the organization has spent decades digging itself into this hole. Why should it be possible, with any method, to climb out quickly just because someone is in a hurry?
I understand your impulse to retire. Fun as this may be at some level, the frustration factors are enormous, and can carry a significant personal cost to dedicated professionals like yourself who are struggling through.
While you stay, you might want to ask yourself questions like these: What’s important here? What do I value in this? How do I want to show up? What will be my next experiment? How does my work here align to my life purpose?
Engaging this kind of inquiry can lead to significant value for your life.
A closing note – How does this topic fit within the ACI mission?
The Agile Coaching Institute (ACI) has articulated a number of skill sets within an overall Agile Coaching Competency Framework. I am writing to you today primarily from the Agile-Lean Practitioner and Mentoring wedges of the Framework.
Thanks to Payson Hall and Lyssa Adkins for invaluable editorial assistance. Remaining shortcomings are my own.