Acceptance Test-Driven Development: Are We Flogging a Dead Horse?

Or: Functional Testing Practices I Do and Don’t Like and Why


Horse portrait by Yago Partal

Functional Testing Practices I Don’t Like

I’m going to start with what I don’t like, as it will create more context for what I do like, and because it is nicer to end on a happy note. But first, let’s get the definitions out of the way.

What exactly is ATDD?

The first thing I don’t like about ATDD is the sheer amount of confusion in the industry about what the term means.

In order to move past this hurdle and on with my argument, here is a definition of the term and some others related to it, so we have a common understanding.

Acceptance tests: otherwise known as customer tests and previously known as functional tests, these are tests that match or replace the acceptance criteria of a story. The concept has its roots in Extreme Programming (XP). The idea is that the customer will accept the story when all the acceptance tests pass. Or, put another way, a passing suite of acceptance tests constitutes part of the Definition of Done for a story, iteration, or sprint.

Because acceptance criteria are typically expressed at a high level of user functionality, Automated Acceptance Tests (AAT) are often written using GUI driving testing frameworks (e.g., Selenium). These tools are sometimes referred to as Acceptance Testing Tools/Frameworks.

The Agile community has long wanted acceptance tests to be written by the customer (hence the newer name of customer tests), ideally before work starts on the story. The acceptance tests are written before the code; this leads to the term Acceptance Test Driven Development (ATDD) or Automated Acceptance Test Driven Development (AATDD), because it follows the pattern of test before code as practiced by Test Driven Development (TDD). I will address TDD later.

If you have implemented Scrum as your Agile practice, read “product owner” in place of customer. You will find me switching between the two terms in this article.

Product owners didn’t buy into the vision

The reality of the above vision of acceptance testing is that most product owners were unable or unwilling to write such tests, particularly if they required using a scripting or programming language.

When the Agile community witnessed this reluctance, they thought that if product owners could create tests using familiar applications like Excel or Word, then the practice would become more palatable and widely accepted. This resulted in testing tools such as FitNesse. I’m going to step out on a limb and say that for the most part these tools didn’t work in the hoped-for manner. Even using this medium to write tests, product owners were still reluctant to produce testing artifacts and found the output of FitNesse confusing and of little value to them.

So the community went back to the drawing board and valiantly tried another approach. Enter Behavior Driven Development (BDD). The BDD approach was to create a framework where the tests are written in plain English and saved as plain text documents. The thinking was that if we make it simpler, product owners and customers will want to write tests. Agilists hoped that if the customers could do this, then the clever devs and their BDD frameworks would do all the other magic to turn these plain text documents into automated acceptance tests.
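To make that "magic" concrete, here is a minimal sketch of how a BDD-style framework might map plain-English lines onto code. It is written in Python purely for illustration. The cart scenario, the step wording, and the names STEPS, step, and run_scenario are all assumptions made up for this example; they are not the API of any real BDD tool.

```python
import re

# Hypothetical step registry: pairs of (compiled pattern, handler function).
STEPS = []


def step(pattern):
    """Register a function as the handler for lines matching `pattern`."""
    def register(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return register


@step(r"Given a cart with (\d+) items?")
def given_cart(ctx, count):
    ctx["cart"] = int(count)


@step(r"When the customer adds (\d+) items?")
def when_add(ctx, count):
    ctx["cart"] += int(count)


@step(r"Then the cart holds (\d+) items?")
def then_cart(ctx, count):
    assert ctx["cart"] == int(count), f"expected {count}, got {ctx['cart']}"


def run_scenario(text):
    """Run each non-blank line of a scenario against the registered steps."""
    ctx = {}
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        for pattern, func in STEPS:
            match = pattern.fullmatch(line)
            if match:
                func(ctx, *match.groups())
                break
        else:
            # No registered step understood this line of plain English.
            raise LookupError(f"no step matches: {line!r}")
    return ctx


# The part a product owner would be asked to write.
SCENARIO = """
Given a cart with 2 items
When the customer adds 1 item
Then the cart holds 3 items
"""
```

Calling run_scenario(SCENARIO) executes the three steps in order. Note that only the SCENARIO text is "plain English"; every sentence in it still needs a developer-written step function behind it, and that is the fixture work I count as a cost later in this article.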

I have yet to see a product owner get excited about this breakthrough and throw her arms into the air declaring “Hallelujah! Finally I am in control of the quality of this project!” Nor have I seen a product owner thanking the Agile community for developing the tool that he has been waiting for all his life.

In short, I feel that Agilists are guilty of not understanding the product owners’ needs and are telling them how they should be doing their job. This is akin to software companies with the view that the users are the problem and need to be educated – “if only the customers would get just how awesome this product is then they would use it the way that we designed it – the way they are supposed to use it. The users need all these awesome features we have created for them but they just don’t understand the concept.”

Product owners aren’t coders, they’re business people

Product owners have their role because they understand requirements, customers, business drivers, marketing, sales, and more. They are often business analysts and spend much of their time talking with customers, stakeholders, sales, marketing, and usability experts; visiting sites/customers; and making themselves available to the team to answer questions, providing scope and resolving any uncertainty that may arise in a story during an iteration. Thinking of acceptance criteria in a certain format or in a programming language is often incompatible with their skill set and interests. In my experience, this is why they are reluctant to write acceptance tests. They seem happy to create acceptance criteria when asked, but are often reluctant to do anything more than that.

Acceptance tests and especially BDD tests are hard to refactor

Acceptance tests tend to just grow in a corner of the codebase without being refactored. The acceptance test regression suite (the accumulated acceptance tests) gets added to each iteration, and no thought goes into duplication and quality: do we need to refactor these tests? Are they still relevant? Test code is a first-class citizen and needs refactoring. Acceptance tests often don’t lend themselves to being refactored, and I have yet to see a way to refactor plain text BDD tests.

Acceptance and BDD test suites are slow and this only gets worse over time

An acceptance test regression suite keeps getting bigger and bigger and slower and slower. You may find, when working on a story, that one small change can break a hundred or hundreds of these tests, forcing you to go back and edit each of these failing tests. More often, the entire suite is thrown out and the ATDD/BDD exercise is declared a failure.

When the running of an acceptance test regression suite starts taking an extended period of time, you lose the benefit of short feedback cycles. The likelihood of a dev team taking action on a failed build reduces in relation to the length of time the build takes, i.e., the longer the build, the less likely it is going to be maintained and acted upon when it fails. Without refactoring, any test suite will suffer from accretion and entropy, and eventually be abandoned – particularly if the suite starts to take longer and longer amounts of time to run.

Acceptance and BDD test suites are fragile and cost time to maintain

Acceptance tests are known for suffering from false positive test failures. A development team that is tired of expending vast amounts of time and energy trying to fix fragility is unlikely to take any action when the build fails. It is hard to ascertain whether the failure was a bug in the code, a bug in the test, a change in the environment, or just the suite framework being flaky. Fixing bugs that are intermittent and hard to reproduce is notoriously difficult and time-consuming. When this happens repeatedly, the solution is often to comment out the test, remove the test, or abandon the entire suite.

ATDD focuses on the wrong part of the testing pyramid

Don’t get me wrong, I like integration, system, and end-to-end tests. I’m in favor of testing the entire pyramid (see the pyramid below). I’m not suggesting throwing the baby out with the bathwater when I criticize ATDD and BDD. An advanced development team should be able to get functional tests to run quickly and be stable and appropriate for the product. But too often new teams leap on ATDD as a best practice, and the testing pyramid is turned upside down as their primary focus moves to the top of the pyramid where ATDD lives. In the pyramid diagram you will notice that unit testing and Test Driven Development (TDD) should be the primary focus of the team and the foundation block of all testing. When a team is proficient at TDD with unit tests, they are more likely to create the appropriate level of testing at the higher levels of the pyramid. In my opinion, ATDD should not be undertaken by beginner (or Shu-level) development teams.

ATDD is too often mistaken for TDD (Test-Driven Development)

“We do TDD here” is a comment I hear too often from teams that are not doing any unit testing but have focused their testing efforts on functional testing only and believe that this is what TDD is. Even if you are writing unit tests, that doesn’t mean you are doing TDD. Unless you are writing each unit test alongside the code, just before the code that makes it pass, you are not doing TDD.

ATDD can encourage big design up front instead of emergent design

While everything else in Agile follows the philosophies of just in time (JIT) and simple design, many of the assertions that arise from acceptance tests before code implementation are based on assumptions in functionality that turn out to be wrong once the code and design have emerged. This then forces you to rewrite the acceptance tests during the iteration. Having loose or vague requirements (as favored by Agile) means that the way you implement a story may be very different from the way that you thought it would be implemented before you started the story. The ability to develop the solution is embedded in Agile processes and is indeed one of Agile’s strengths. Because of this I’m going to claim that many acceptance tests go against Agile architecture and emergent design.

This mostly happens when your acceptance tests are driving a GUI. Now I am not opposed to designing a GUI up front, and indeed it is helpful for the developers to have a design or mock to work from when implementing the story. But often, it is only when you start implementing a design (coding) that you find all the flaws in the design and all the edge cases that the designer did not think of when creating the design. This results in the design and/or functionality changing, or emerging (part of emergent design) during an iteration. It is a waste of time and effort to have written tests against the GUI before the GUI is implemented.

ATDD is time consuming and not lean

The practice of ATDD can consume time in these areas:

  • Creating fixtures to drive BDD tests
  • Rewriting the tests after emergent design has changed the original design
  • Maintaining a fragile suite that suffers from false positives
  • Waiting for slow tests to run

This is not an insignificant amount of time and definitely not lean in nature.

BDD context switching

I prefer acceptance testing tools that sit on top of the same technology that you are using for unit testing, like JWebUnit, which runs on JUnit. You can edit and run the tests all in the same IDE. BDD testing involves switching between IDEs and/or language technologies, and that feels unnecessarily disjointed to me.

When do you call the horse dead?

The industry has had long enough to try the concept of acceptance testing out on the community of product owners, and the uptake has proven poor. I call that a failure.

So how much longer are we going to flog this dead horse? Compare this to the uptake and acceptance of TDD as a practice. TDD has stuck and works. ATDD did not, and the Agile community at large seems to keep wanting to revive it by changing its shape and creating new frameworks and approaches instead of ditching it.

Testing Practices I Do Like

Writing good, clean, and readable tests

Your tests are a first class citizen and form part of the documentation (or specification) of your application. Learning to write good tests, wherever they are in the pyramid, is an art in itself and one that must be learned by every Agile developer and practiced by every Agile team. Bob Martin has a great chapter on Unit Testing in Clean Code: A Handbook of Agile Software Craftsmanship. The pattern of keeping test code clean applies to all types of automated tests.

Refactoring tests

All code, including test code, suffers from rot and needs to be kept fresh. Learn to refactor your tests. I highly recommend Gerard Meszaros’ book on this topic, xUnit Test Patterns: Refactoring Test Code.
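As a small illustration of what refactoring test code can look like, here is a sketch in Python using the standard unittest module. It shows the end state after duplicated setup has been pulled into one helper. The Order class and the make_order helper are hypothetical, invented for this example, and this is just one of many possible refactorings rather than a recipe from Meszaros’ book.

```python
import unittest
from dataclasses import dataclass, field


@dataclass
class Order:
    """Hypothetical domain object, invented for this example."""
    prices: list = field(default_factory=list)
    coupon_percent: int = 0

    def total(self):
        subtotal = sum(self.prices)
        return subtotal * (100 - self.coupon_percent) / 100


def make_order(prices=(10, 20), coupon_percent=0):
    """Test helper: the one place that knows how to build a valid Order."""
    return Order(prices=list(prices), coupon_percent=coupon_percent)


class OrderTotalTest(unittest.TestCase):
    # Each test states only the detail it cares about; the rest is defaulted.
    def test_total_without_coupon(self):
        self.assertEqual(make_order().total(), 30)

    def test_total_with_coupon(self):
        self.assertEqual(make_order(coupon_percent=50).total(), 15)

    def test_empty_order_totals_zero(self):
        self.assertEqual(make_order(prices=()).total(), 0)
```

If the Order constructor changes, only make_order needs editing instead of every test that builds an order. That is the kind of change that otherwise breaks tests by the hundred.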

Use the right tool for the right job: what specification frameworks are good for

If your business has areas of functionality or business logic that have to meet compliance or legal requirements and be documented, then this can be a good use of a BDD or other specification testing tool. You can meet the compliance requirement around documentation and build a valuable test artifact at the same time. Win/win!

BDD test protocol

Given, When, Then is a nice way to think about your tests. It has become very popular among unit test frameworks as well. I like it. It is similar to the Arrange, Act, Assert pattern but it builds the pattern into the test protocol and thus improves the readability of tests.
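Here is a minimal sketch of Given, When, Then applied to ordinary unit tests, written in Python with the standard unittest module. The Account class is a hypothetical class under test, made up for this example. The comments simply label the same three phases that Arrange, Act, Assert would.

```python
import unittest


class Account:
    """Hypothetical class under test, invented for this example."""

    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class WithdrawalTest(unittest.TestCase):
    def test_withdrawal_reduces_balance(self):
        # Given an account holding 100
        account = Account(balance=100)
        # When the owner withdraws 30
        account.withdraw(30)
        # Then 70 remains
        self.assertEqual(account.balance, 70)

    def test_overdraw_is_rejected(self):
        # Given an account holding 10
        account = Account(balance=10)
        # When the owner tries to withdraw 30, then the request is refused
        with self.assertRaises(ValueError):
            account.withdraw(30)
        # And the balance is untouched
        self.assertEqual(account.balance, 10)
```

Some unit test frameworks build these three words into their syntax. Plain comments like the ones above give much of the same readability without adding a tool.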

Following the testing pyramid – Agile Testing

High performance teams I have worked with know how to use end-to-end and integration test frameworks effectively to ensure that they are writing quality code and are not breaking functionality as they go. They know when to write these tests and what to test. They know how to refactor these tests to avoid fragility and keep them fresh and relevant. They know how to avoid duplication in these tests, and how and when to implement configuration management. They take ownership of writing and maintaining these tests (as they do for the quality of the entire product in general). They know how and when to break these tests into suites to optimize build performance – for example, splitting out a smoke test suite that runs on every local build and a more complete regression suite that runs on a CI server after each check in. They know how to create suites of tests and how to schedule and chain builds – for example, running stress/performance testing and/or benchmarking every night. And not to be forgotten, the base of the testing pyramid is solid! That is to say, they have embraced TDD and have a high level of unit test code coverage.
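The smoke/regression split can be sketched with nothing more than a marker and a suite builder. The example below uses Python’s standard unittest module. The smoke decorator, the build_suite function, and the test names are assumptions invented for illustration, and the test bodies are stand-ins for real end-to-end checks. Real projects often lean on their test runner’s own tagging or filtering features instead.

```python
import unittest


def smoke(test_func):
    """Mark a test as part of the fast smoke suite (hypothetical convention)."""
    test_func.smoke = True
    return test_func


class CheckoutTests(unittest.TestCase):
    @smoke
    def test_home_page_loads(self):
        self.assertTrue(True)  # stand-in for a fast end-to-end check

    def test_full_checkout_flow(self):
        self.assertTrue(True)  # stand-in for a slow end-to-end check

    def test_refund_flow(self):
        self.assertTrue(True)  # stand-in for a slow end-to-end check


def build_suite(case, smoke_only):
    """Collect every test in `case`, or only those marked with @smoke."""
    suite = unittest.TestSuite()
    for name in unittest.defaultTestLoader.getTestCaseNames(case):
        is_smoke = getattr(getattr(case, name), "smoke", False)
        if is_smoke or not smoke_only:
            suite.addTest(case(name))
    return suite


# Fast subset, intended for every local build.
smoke_suite = build_suite(CheckoutTests, smoke_only=True)
# Everything, intended for the CI server after each check-in.
regression_suite = build_suite(CheckoutTests, smoke_only=False)
```

Either suite can then be handed to unittest.TextTestRunner().run(...). The point of the split is that the quick subset keeps the local feedback cycle short while the full suite still runs on every check-in.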

This is what I call “Agile Testing,” and it refers to the entire pyramid. I have asked non-coding Agile consultant colleagues to use this term when talking about best practices for Agile teams instead of the (somewhat rhetorical) terms ATDD and TDD. The rule is if you can’t write it, don’t talk about it – instead use the term “Agile Testing.”

Agile Testing Pyramid

Healthy testing pyramid


Unhealthy inverted testing pyramid – when ATDD takes center stage

(Source: Flipping the Automated Testing Triangle: the Upshot)

(Horse portrait by Yago Partal)

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 United States License.