In my last post, I covered Branching, also known as “Forked Development”, and discussed what happens when it’s time to merge branches. Today, I’ll introduce some Source Code Management tools.
Every Agile team should define an SCM strategy.
- Trunk is pristine. No Junk in Trunk!
- Single branch development, unless emergency bug fix.
- Development branches always taken from Head of Trunk.
- Releases to production always taken from a tag/label.
- Bug fix branches always taken from a tag/label.
- Every commit contains a log message.
- Log messages adhere to a defined template (more on this later).
- Commit Early, Commit Often [Avoid Merge Hell].
By now we have a pretty good idea what SCM and Revision Control are all about. We also discussed a variety of guidelines on how best to use an SCM system within an Agile team.
It is important to note that SCM is the first and most important part of any Release Management System. The SCM system allows the team to track changes to the source code (who, what, when, why), as well as manage what version of the source is compiled for release (always from a tag/label), including any subsequent emergency bug fixes.
The question that should be burning in your mind right now is… which one do I use?!
You can look at these two lists from Wikipedia that compare various SCM solutions across multiple platforms: List of revision control software, and Comparison of revision control software. Unless you have some experience with these systems, choosing which one is right for you is fraught with assumptions, conjecture, and sales hype.
First, what platform you are developing for is usually not a problem; most SCM systems today work on all of the big 3: Windows, Unix, Mac. One of the standouts here is Team Foundation Server (TFS), the Microsoft offering, which of course only runs on Windows platforms. It is geared to DotNet development projects only, although you can store and track any file format you wish.
Microsoft TFS is a good solution only if you are a dedicated Microsoft Solution provider. A major downside to TFS is the cost (very expensive) and the care and feeding of the system (setup/configuration/maintenance). As with ClearCase by IBM Rational Software, you need a full-time Configuration Management Guru to support your SCM system. But if you are going down that road, there are plenty of cool features that Microsoft has cleverly integrated across their entire tool line.
Next, I’ll discuss some other software options for SCM: ClearCase, AccuRev, StarTeam, Perforce, CVS, and SVN (Subversion).
In my previous post on the topic of source code management, I discussed the ins and outs of the Concurrent Versions System (CVS) and outlined some best practices. This brings us to another topic of hot debate on SCM: Branching, otherwise known as “Forked Development”. You have heard of some people speaking with forked tongues… try to avoid this; it generally causes problems and is easily abused!
Forked Development is the practice of having two or more parallel versions of the same product under development at the same time. These separate versions have a common base, but their development efforts are kept separate until some release date far into the future. When it comes time to merge branches, there is always “Merge Hell”.
Don’t get me wrong, Branching can be a good thing, if used properly. The following image, snagged from Wikipedia, demonstrates a common branching/tagging strategy:
Trunk is the mainline of development. This usually represents what is currently in production. When development of a new feature begins, a branch is created from the latest version of trunk (1 -> 2). When that feature is complete and ready to release to production, the branch is merged back into trunk (3 -> 4). When the merged changes are verified, that version of trunk is labeled/tagged for release (T1).
The next feature request comes in and a new branch is created (5). While the team is working on this feature, an emergency bug is found in production. A second branch is created from the last published version of trunk (4); it should have been taken from the tag/label T1 instead. The bug is fixed and merged back into trunk, which is then tagged/labeled for release as T2. In the meantime development continues on the branch taken at (5), but it is eventually abandoned. No changes are made to trunk from that branch.
When the next feature request work is started, a branch will be drawn from what is called Top Of Branch, or Head (of trunk).
The reason that a bug fix should be taken from the released Tag/Label is to give clarity as to the purpose. Even though branching from trunk instead of the tag gives the same functionality, later browsing of the project tree will not be as clear.
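The branching and tagging flow described above can be sketched as a toy model. This is only an illustration, not how any particular SCM tool works internally: the `Repository` class, its method names, and the branch names `feature-a`, `feature-b`, and `bugfix` are all hypothetical, and the revision numbers do not try to match the numbering in the Wikipedia figure.

```python
class Repository:
    """Toy model: each line of development is a list of revision numbers."""

    def __init__(self):
        self.lines = {"trunk": [1]}  # line name -> revisions, oldest first
        self.tags = {}               # tag name -> revision number
        self.origin = {}             # branch name -> what it was branched from
        self.merges = []             # (branch, target, new revision)
        self._next = 2

    def head(self, line="trunk"):
        return self.lines[line][-1]

    def commit(self, line):
        rev = self._next
        self._next += 1
        self.lines[line].append(rev)
        return rev

    def branch(self, name, source="trunk"):
        # Branch from a tag if `source` names one, otherwise from the
        # head of the named line of development.
        base = self.tags[source] if source in self.tags else self.head(source)
        self.lines[name] = [base]
        self.origin[name] = source

    def merge(self, branch, into="trunk"):
        # A merge is recorded as one new revision on the target line.
        rev = self.commit(into)
        self.merges.append((branch, into, rev))
        return rev

    def tag(self, name, line="trunk"):
        self.tags[name] = self.head(line)


repo = Repository()
repo.branch("feature-a")            # taken from head of trunk
repo.commit("feature-a")
repo.merge("feature-a")
repo.tag("T1")                      # release 1

repo.branch("feature-b")            # next feature, again from head of trunk
repo.commit("feature-b")

repo.branch("bugfix", source="T1")  # emergency fix, taken from the tag
repo.commit("bugfix")
repo.merge("bugfix")
repo.tag("T2")                      # release 2

print(repo.origin)  # shows at a glance that "bugfix" came from T1
print(repo.tags)
```

Because the bug fix branch records the tag as its origin, anyone browsing the history later can see which release the fix was meant for, which is the clarity argument made above.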
This brings me to another topic on SCM: Managing your history and meta data. It is very important to think of how the revision history will look, and will be used, when you create branches, tags, or even log messages for commits. I’ll cover this topic in my next post.
The foundation of any Agile Software Development effort is people, that is true, but in order to support and enable those people to do great things you need a solid structural foundation, an infrastructure of tools and related practices with which they can succeed.
The Release Management Stack provides the tools and practices that all developers, product owners and scrum masters need to define, design, implement, deploy and track the progress and quality of your latest great venture. This stack is composed of three base layers: Source Code Management (SCM), Build Automation and Continuous Integration.
Today we are going to start at the very bottom, Source Code Management.
- What is SCM?
- How does SCM fit into the Agile flow?
- Why is SCM important to a successful Agile Team?
- Which SCM system should I use for my team?
SCM stands for “Source Code Management” or “Software Configuration Management”, depending on who you talk to. It has its roots in revision control systems. One of the earliest, SCCS (Source Code Control System), was published in 1972 and was later superseded by the appropriately named RCS (Revision Control System).
Roger Pressman, in his book Software Engineering: A Practitioner’s Approach, states that SCM is a “set of activities designed to control change by identifying the work products that are likely to change, establishing relationships among them, defining mechanisms for managing different versions of these work products, controlling the changes imposed, and auditing and reporting on the changes made.”
Basically any SCM system is a specialized file system database. The specialization of this database is in how different versions of files are stored, labeled and retrieved. Today there are dozens of SCM solutions, both open source and commercial options, that serve and extend this basic need of managing version histories of our important IP. Yes, many vendors have added multiple features above and beyond the call, that whole “value added” credo pushing SCM past Source Code Management and more towards Software Configuration Management, but when it comes down to brass tacks, what 95% of the development community wants is a file system database with a specific set of revision control features.
Many SCM solutions offer many more features than this. In my opinion this draws away from the core functionality of the tool and waters down the overall experience.
Let’s say you want to buy an entertainment system. Do you buy an all-in-one solution, or a component system? For some people with limited space or means to manage various components, the all-in-one solution is a good choice. What you gain in the simplicity of setup, configuration and maintenance, you lose in specialization, and sometimes quality of features.
In my opinion the best software out there attempts to solve a fairly focused need, yet allows for integration with other systems and tools. This is similar to the *nix pipeline. A common practice of Unix users is to pipeline several commands together, the output of one command feeding into the input of the next. The individual commands have a narrowly focused use, and as such are able to dive deep into the functionality and quality of the features provided. Each command in the pipeline has a specialized functionality, and in that focused feature set, that command is king.
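To make the pipeline idea concrete, here is a small sketch that imitates a Unix pipeline with Python generators. The function names mirror familiar commands (`grep`, `cut`, `sort`, `uniq -c`) but are simplified stand-ins written for this example, and the sample log lines are made up.

```python
from itertools import groupby


def grep(pattern, lines):
    """Keep only the lines that contain `pattern`."""
    return (line for line in lines if pattern in line)


def cut(field, lines, sep=" "):
    """Keep only one field from each line."""
    return (line.split(sep)[field] for line in lines)


def sort(lines):
    """Sort all lines (this stage has to read its whole input first)."""
    return iter(sorted(lines))


def uniq_count(lines):
    """Collapse runs of identical lines into (count, line) pairs."""
    return ((sum(1 for _ in group), key) for key, group in groupby(lines))


log = [
    "alice commit",
    "bob update",
    "alice commit",
    "carol commit",
]

# Roughly: grep commit | cut -d' ' -f1 | sort | uniq -c
result = list(uniq_count(sort(cut(0, grep("commit", log)))))
print(result)
```

Each stage does one narrow job and knows nothing about the others, so any of them can be swapped out without touching the rest.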
When you go for the all-in-one approach you are often constrained to adopt the workflow process of the tool vendor, giving control over to the vendor to determine how you should work and what you should be able to do. When you go for the modularized, or component, model, you can select the tools that best fit your practices and style, link them together to create the perfect system for you, and have a higher degree of confidence that each individual tool will perform its job specialty better than a more generic tool might.
Getting back to SCM systems, the simplest, and unfortunately all too often used, method of revision control is the file copy-rename system. This is where you copy the file you are working on and rename the copy with a date-time stamp before you modify the original. If you mess up the document in some way, or want to “revert” back to a previous version you simply copy the renamed file and remove the date-time stamp from the file name, basically replacing your working copy with the archived version. I’ve seen this done as a standard practice in some companies as late as 2007. I was horrified.
Yes, this is essentially what an SCM system does for you, but there are a few distinct differences. The file-copy-rename method is a local-machine-only solution, so no other team members have access to the saved version history. The exception is when the whole team works on a single shared drive, but that has many inherent pitfalls and is greatly discouraged.
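For reference, the copy-rename routine described above amounts to something like the following sketch. The helper names `archive` and `revert`, the stamp format, and the file name `report.txt` are assumptions chosen for the example.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def archive(path, now=None):
    """Copy `path` to a sibling file with a date-time stamp in its name."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    backup = path.with_name(f"{path.stem}.{stamp}{path.suffix}")
    shutil.copy2(path, backup)
    return backup


def revert(path, backup):
    """Replace the working copy with the archived version."""
    shutil.copy2(backup, path)


with tempfile.TemporaryDirectory() as folder:
    doc = Path(folder) / "report.txt"
    doc.write_text("first draft")

    saved = archive(doc)           # e.g. report.<date>-<time>.txt
    doc.write_text("oops, broke it")

    revert(doc, saved)
    print(doc.read_text())         # back to the archived content
```

Note what is missing: there is no record of who made the copy or why, and nothing is shared with the rest of the team.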
Modern SCM systems generally do not store complete, intact copies of every version of your documents. Instead they store a series of ‘deltas’. A delta is a compact description of the differences between two versions of the same file. When you ask for a specific version of a file, the SCM system applies the relevant deltas in order to re-create the exact content of the version you requested.
A benefit of this delta system is storage. Instead of storing complete copies of every version of a file, you only store the diff description of what changed.
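A minimal sketch of the delta idea, using Python's standard `difflib` module. Real SCM systems use their own, more compact delta formats; the `make_delta`, `apply_delta`, and `checkout` helpers and the tuple layout below are assumptions made purely for illustration.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Describe new_lines as a list of operations against old_lines."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged text is stored as a reference, not as content.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only new or changed lines are stored in full.
            delta.append(("add", new_lines[j1:j2]))
        # "delete" needs no entry: the old lines are simply not copied.
    return delta


def apply_delta(old_lines, delta):
    """Rebuild the newer version from the older one plus a delta."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(old_lines[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result


def checkout(base, deltas, version):
    """Version 0 is the base; version n applies the first n deltas."""
    lines = base
    for delta in deltas[:version]:
        lines = apply_delta(lines, delta)
    return lines


v0 = ["alpha", "beta", "gamma"]
v1 = ["alpha", "beta2", "gamma", "delta"]
v2 = ["alpha", "gamma", "delta"]

# The "repository" keeps one full copy plus two small deltas.
deltas = [make_delta(v0, v1), make_delta(v1, v2)]
print(checkout(v0, deltas, 2))
```

Only the base version is kept whole; every later version is rebuilt on demand by replaying deltas.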
SCM systems generally store meta data about every file as well: who committed a change, when, and why (a log comment explaining the changes). Users can browse the listed history of the file, viewing all this meta data to determine the version they are interested in. Furthermore they can use special diff/merge tools to view specific versions and see exactly what changed between any two versions of a file or filesystem. The modern day visual diff tools are very useful for quickly spotting small changes in a file.
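The who/when/why record and the diff between two versions might look like this in miniature. The `Revision` fields, the authors, the dates, and the file contents are invented for the example, and the log format is only loosely modeled on what command-line SCM tools print.

```python
import difflib
from dataclasses import dataclass


@dataclass
class Revision:
    number: int
    author: str    # who
    date: str      # when
    message: str   # why
    lines: list    # file content at this revision


def log(history):
    """One summary line per revision, oldest first."""
    return [f"r{r.number} | {r.author} | {r.date} | {r.message}" for r in history]


def diff(history, old, new):
    """Unified diff between two revision numbers of the same file."""
    a, b = history[old - 1], history[new - 1]
    return list(difflib.unified_diff(
        a.lines, b.lines,
        fromfile=f"r{a.number}", tofile=f"r{b.number}", lineterm="",
    ))


history = [
    Revision(1, "alice", "2009-03-01", "Initial import",
             ["timeout = 30", "retries = 3"]),
    Revision(2, "bob", "2009-03-02", "Raise timeout for slow links",
             ["timeout = 60", "retries = 3"]),
]

print("\n".join(log(history)))
print("\n".join(diff(history, 1, 2)))
```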
What happens when more than one person needs to work with a set of documents? Some early SCM systems used a system of locking files. When you checked out a file to work on it, that file was essentially locked to all other users; only you could commit changes to it. For another team member to be able to make changes, you either had to commit your changes or drop them and release the lock. Otherwise your teammate had to wait for you to finish before they could add their changes. Any way you go, it still comes out as wasted time and money.
So, file locking is a bad practice. Don’t do it.
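To see why locking stalls a team, here is a toy model of the checkout-locks-the-file behavior described above. The `LockingRepository` class, its method names, and the user names are hypothetical and do not represent the API of any real tool.

```python
class FileLockedError(Exception):
    """Raised when someone else holds the lock on a file."""


class LockingRepository:
    def __init__(self):
        self.files = {}  # path -> committed content
        self.locks = {}  # path -> user currently holding the lock

    def checkout(self, path, user):
        holder = self.locks.get(path)
        if holder is not None and holder != user:
            raise FileLockedError(f"{path} is locked by {holder}")
        self.locks[path] = user

    def commit(self, path, user, content):
        if self.locks.get(path) != user:
            raise FileLockedError(f"{user} does not hold the lock on {path}")
        self.files[path] = content
        del self.locks[path]  # committing releases the lock

    def release(self, path, user):
        # Drop any local changes and give up the lock.
        if self.locks.get(path) == user:
            del self.locks[path]


repo = LockingRepository()
repo.checkout("main.c", "alice")

try:
    repo.checkout("main.c", "bob")  # bob is blocked until alice is done
    blocked = False
except FileLockedError:
    blocked = True

repo.commit("main.c", "alice", "int main(void) { return 0; }")
repo.checkout("main.c", "bob")      # only now can bob start working
print(blocked, repo.locks)
```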
In my next post in this series, I take a look at the Concurrent Versions System (CVS), Branching, and other SCM topics.