Adam Petersen - Software Development Pages, Articles Section

Design in Test-Driven Development

With its roots in Japanese just-in-time manufacturing, Test-Driven Development (TDD) turns the traditional development process on its head. In this article I will discuss when to use TDD, how to use it successfully, and how it relates to up-front design.


The Roots of TDD

While best known for originating in Extreme Programming, TDD really has its roots in the Toyota Production System (TPS). Taiichi Ohno, the brilliant engineer behind TPS, was obsessed with eliminating waste in production. The main problem was how to supply the number of parts needed just-in-time. Ohno approached the problem by reversing the production flow: “a later process goes to an earlier process to pick up only the right part in the quantity needed at the exact time needed” [TPS]. And here’s the core of TPS: now the earlier process only has to make the number of parts actually needed, thereby approaching zero inventory. It also puts focus on quality by detecting deficiencies early in the process. The communication between the steps in the production chain is handled by kanban (sign board).

How does TPS relate to software? Toyota is about cars, isn't it? Well, I consider eliminating waste very relevant for software development too. A very common type of waste is code that's written but not integrated until weeks or even months later. In this case, the main waste arises from the late feedback and the missed opportunity to improve the code from the knowledge gained. It also indicates waste because something was developed without any true need for it, at least not an immediate one, and the time spent could have been invested in an activity adding immediate value.

TDD achieves just-in-time exactly the same way as Toyota does: by inverting the steps in the traditional process using a failing unit test as its kanban. That is, the failing testcase is the need that triggers production and ensures that no unnecessary code (i.e. waste) is developed. Of course it also provides immediate feedback.

TDD Crash Course

TDD is dead simple. At least in theory. Here are the only two rules [BECK]:

1. Write new code only if an automated test has failed
2. Eliminate duplication

Simple to remember but hard to apply. As Kent Beck puts it: “These are two simple rules, but they generate complex individual and group behavior with technical implications” [BECK].

The first rule forces us, at least if we’re hardcore TDDers, to write a test before writing any production code. I always wanted to be a space pilot, so using Java and the unit test framework JUnit I’ll try to explore the characteristics and responsibilities of a spaceship:

// SpaceshipTest.java
import junit.framework.TestCase;

public class SpaceshipTest extends TestCase {

   public SpaceshipTest(String testName) {
      super(testName);
   }

   public void testDrivingMode() {
      Spaceship spaceship = new Spaceship();
      // JUnit's convention is assertEquals(expected, actual)
      assertEquals(0, spaceship.speed());

      final int speedOfLight = 299792458; // m/s

      // Test stub: gives the test full control over the driving mode
      DrivingMode hyperSpeed = new DrivingMode() {
         public int topSpeed() {
            return speedOfLight;
         }
      };

      spaceship.shiftDrivingModeTo(hyperSpeed);
      assertEquals(speedOfLight, spaceship.speed());
   }
}

There are not many lines of test code and no production code at all, yet I have specified several design decisions in the test above:

  • Object creation: I have decided how a spaceship comes into existence and the initial state of a spaceship object (a speed of zero m/s). A typical spaceship will probably have many more characteristics, but I’ll leave that for now. With TDD, I’ll try to address one problem at a time (TDDers refer to this as organic design) and right now I want to drive at high speed, both with the development and with the spaceship.
  • API design: The unit tests are the first usage of the code and provide immediate feedback, actually even before the code exists, on how easy it is to use the API.
  • Decoupling: The Spaceship class is decoupled from the concrete driving modes and only knows about the DrivingMode interface. This is in line with the design principle of programming to an interface, not an implementation and is a typical TDD pattern. Besides being good design, it allows full control of the unit under test through the test stub (the anonymous class implementing the DrivingMode interface). I avoid depending upon concrete classes particularly because they add another factor of uncertainty and in TDD I never want more than one at a time.
  • Side-effects specified: Shifting driving mode means that the new speed will equal the top speed in the current mode. It’s stated explicitly in code what it means to shift driving mode.

The first test-case for a class is typically the one that involves the most exploration. When I have it in place, I write the first version of the Spaceship:

// DrivingMode.java
public interface DrivingMode {
   public int topSpeed();
}

// Spaceship.java
public class Spaceship {

   // null until a driving mode has been selected
   private DrivingMode drivingMode;

   public int speed() {
      int speed = 0;

      if (drivingMode != null) {
         speed = drivingMode.topSpeed();
      }

      return speed;
   }

   // public, like speed(), so the test can call it from any package
   public void shiftDrivingModeTo(DrivingMode newMode) {
      drivingMode = newMode;
   }
}

Unless the implementation is obvious (as it is above), I start with a stub implementation where I return a hardcoded value. The main reason is that it ensures that I’m testing the right thing, which gets harder as the body of code grows. It also makes it easier to explore alternatives: because I haven’t really put much effort into the code, it is easier to just delete it and start over if I am dissatisfied with it, something that’s much harder mentally if I go for a full implementation directly. After I’ve run the test and got a green bar, I’ll check for duplications and potential improvements. What I don’t like above is the speed() method: depending on state, the speed is set in two different places, and the conditional is an unnecessary complexity. Let’s factor it out by taking full advantage of DrivingMode:

// Parked.java
public class Parked implements DrivingMode {
   public int topSpeed() {
      return 0;
   }
}

// Spaceship.java
public class Spaceship {

   private DrivingMode drivingMode;

   public Spaceship() {
      // A new spaceship starts out parked, so drivingMode is never null
      drivingMode = new Parked();
   }

   public int speed() {
      return drivingMode.topSpeed();
   }

   // public, like speed(), so the test can call it from any package
   public void shiftDrivingModeTo(DrivingMode newMode) {
      drivingMode = newMode;
   }
}

After my refactoring I run the test again and ensure that I still have a green bar in JUnit. In this last example, the testcases take on their second role; instead of driving the design they now function as regression tests.
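The stub-first step mentioned earlier (returning a hardcoded value before writing any real logic) could look roughly like the sketch below. It is an illustration only: StubbedSpaceship and StubFirstDemo are hypothetical names, the lambda assumes Java 8 or later, and plain boolean checks stand in for JUnit's assertEquals so that the sketch runs on its own.

```java
// Sketch of the "start with a stub" step. All names except DrivingMode are hypothetical.
interface DrivingMode {
   int topSpeed();
}

// Hardcoded stub: just enough to satisfy the first assertion of the test.
class StubbedSpaceship {
   int speed() {
      return 0; // hardcoded value, no real logic yet
   }

   void shiftDrivingModeTo(DrivingMode newMode) {
      // Intentionally empty: the check after the shift fails and demands real code.
   }
}

class StubFirstDemo {
   // Stand-in for assertEquals(expected, ship.speed()).
   static boolean reportsSpeed(StubbedSpaceship ship, int expected) {
      return ship.speed() == expected;
   }

   public static void main(String[] args) {
      StubbedSpaceship ship = new StubbedSpaceship();
      System.out.println("initial speed ok: " + reportsSpeed(ship, 0));

      ship.shiftDrivingModeTo(() -> 299792458);
      System.out.println("speed after shift ok: " + reportsSpeed(ship, 299792458));
   }
}
```

The first check passes and the second one fails, which is the point: the failure shows that the test really exercises the shift, and it is the trigger for replacing the hardcoded value with real logic.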

Iterate again and again

These small iterations are the foundation of TDD, and the single most common error developers make as they start to test-drive is taking steps that are too large. This is a hard balance: taking steps that are too small is inefficient, while taking steps that are too large is a sure way to lose the feedback that TDD provides. With small steps, when a unit test fails it is immediately clear where the problem is (if it isn’t, you’re not taking small enough steps). Every time I have to enter the debugger during development I know that I’ve rushed ahead with the coding and have to take smaller steps.

Small steps are also a good way to stay on track. In the average large software organization there are lots of disturbing factors such as phone calls, e-mails, and background noise. With a small testcase, there’s less information to regain when I pick up the coding after a distraction. Unit tests help relieve my mind by keeping knowledge in the world instead of in the head. It is also the way I prefer to leave a coding session at the end of the day: a small, failing unit test that functions as a memory aid, a written and executable note to my future self.

I believe that it is impossible to give a general guideline on the size of the steps; the optimal step probably varies depending on the personality and experience of the programmer. For example, when I use a new language or start working in a new problem domain, the steps I take are shorter than the ones above.

Design at different levels

These days it seems popular to bash Extreme Programming (XP), where TDD is a vital component. Matt Stephens and Doug Rosenberg even devoted a whole book to dissecting and, partly, ridiculing XP. While they credit unit testing as important and state that it can complement more traditional up-front design, they make sure to push their own silver bullets (Use Cases and Sequence Diagrams, which is hardly surprising as Doug has written two books about them): “The clean allocation of operations to classes you can achieve on a sequence diagram will eliminate the need for a whole bunch of Constant Refactoring After Programming”. This raises the question: how much design do I want up-front, and how does it impact the role of my unit tests?

The first question is impossible to answer without a context. For example, I used to work on safety-critical software for the railway industry. One of the safety techniques was diversified programming, which basically means that the same program is written twice by two independent teams. The two programs are run in parallel and the results are compared between the programs at predefined points, everything in real-time. If the programs don’t agree on the result, it means an emergency stop of all trains (hardly popular, particularly not for the passengers). It is a very expensive way to develop. Think about how hard it is to get one program working. Trying to test correctness into diversified software is a dead end, and it is obvious that the main design has to be defined up-front if the two programs are ever going to agree on their state of the world. Here well-defined use cases and complementary models are invaluable, particularly to make the transition from problem space, as defined by the requirements, to the solution space and the design. Still, at a certain point it makes more sense to switch to code as a design medium. The reason is requirements explosion.

Requirements Explosion

The single most frequent question I’ve gotten with respect to TDD is: “how do I know the tests to write?” It’s an interesting question. The concept of TDD seems to trigger something in people’s minds: a suspicion that the design process perhaps isn’t deterministic. I mean, I never hear the question “how do I know what to program?” although it is exactly the same problem. When I answer something along the lines that design (as well as coding) always involves a certain amount of exploration and that TDD is just another tool for this exploration, I get, probably rightly so, sceptical looks. The immediate follow-up question is: “but what about the requirements?” Yes, what about them? It’s clear that they guide the development, but should the unit tests be traced to requirements?

My answer is a strong no, njet, nein. Requirements describe the “what” of software in the problem domain. And as we move deeper and deeper into the solution domain during design, something dramatic happens. Robert L. Glass identifies requirements explosion as a fundamental fact of software development: “there is an explosion of ‘derived requirements’ [..] caused by the complexity of the solution process” [GLASS]. How dramatic is this explosion? Glass continues: “The list of these design requirements is often 50 times longer than the list of original requirements” [GLASS]. It is requirements explosion that makes it unsuitable to map unit tests to requirements; in fact, many of the unit tests arise due to the “derived requirements” that do not even exist in the problem space!

Further, to capture all these derived requirements in a document or a UML model requires the level of detail of a programming language. Languages for that purpose do exist. The Object Constraint Language (OCL), for example, is a formal language for adding details to modelling artifacts. The problem is that extending the models with that kind of detailed information may actually limit their use. Unless you go for full code-generation à la MDA where the model actually is the program (an approach which has problems of its own), you’ll lose what I believe is the most valuable quality of models: a higher level view than the code. The models will in that case basically turn into a mixture of two different abstraction levels. Detailed design taken to such lengths will also result in a lot of overlap with the code; you’ll get the feeling that you already coded the stuff once before during the modelling. On projects where I worked with such detailed designs I found it terribly hard to keep them up-to-date. Every change results in a necessary update of the models, which isn’t very productive. Sure, there are tools that may help by supporting reverse engineering, but basically they only help cover the symptoms of a real problem.

My advice is to care about designing the details but to do it in the medium most suitable to express that level of detail: unit tests, written in the same programming language as the production code.

The purpose of TDD

To me, TDD is primarily a design technique. Sure, the unit tests developed during TDD do serve a very valuable verification purpose. However, they verify code-correctness. Every software project has to complement them with testing on other levels, such as acceptance- and requirements-testing. The unit tests lay the foundation for the higher level tests and enable them to focus on their true purpose in a more efficient way; as I work with system tests I don’t want to be stopped by coding errors, and this is where TDD helps.

TDD is also a verification tool for the intent of the programmer. Like so many other techniques descending from Extreme Programming, TDD provides an interesting double-check mechanism. With TDD every programmer states his/her intent twice: once in the unit test and once in the production code. Only if they match do we get a green bar.

If you only want verification, you don’t have to do TDD (although you need something else to carry out the low-level design, be it modelling, formal specifications, genius or plain luck). In this case I still recommend writing the unit tests in close conjunction with the code. Writing unit tests only with respect to verification is more straightforward as there are no more design decisions to make. The major disadvantages are of course that you lose an excellent opportunity for design and run the risk of writing un-testable code.

Code coverage

Code coverage is a simple technique for providing feedback on the quality of the unit tests. A technique I’ve found valuable is to build code coverage analysis into the build system. In that way I can run the unit tests and get a report on the coverage with one single command. However, I typically don’t bother with analysing the coverage until I’ve finished the first version of some module, but then it gets interesting. In theory, when using TDD we will always get 100 percent coverage (remember, we’re only supposed to write code when an automated test fails). While I have written fairly large programs with full coverage, I don’t believe it is an end in itself or particularly meaningful as a general recommendation; it’s just a number. Instead the value I get is as feedback on my test writing skills. If I have missed a line or branch during my test-drive I try to analyse the cause; perhaps it is okay to leave it as it is, but more often there was some aspect of the solution that I initially overlooked.
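As a hedged illustration of such an overlooked aspect, consider a variant of the spaceship with a null guard that slipped in without a test. GuardedSpaceship, CoverageGapDemo and the guard itself are assumptions made for this sketch, not part of the earlier example, and plain boolean checks replace JUnit assertions so that the sketch runs standalone (Java 8 or later for the lambda).

```java
interface DrivingMode {
   int topSpeed();
}

class Parked implements DrivingMode {
   public int topSpeed() {
      return 0;
   }
}

// Hypothetical variant of the spaceship with a guard that was added without a test.
class GuardedSpaceship {
   private DrivingMode drivingMode = new Parked();

   int speed() {
      return drivingMode.topSpeed();
   }

   void shiftDrivingModeTo(DrivingMode newMode) {
      if (newMode == null) {
         // A coverage report flags this line as long as no test passes null.
         throw new IllegalArgumentException("driving mode must not be null");
      }
      drivingMode = newMode;
   }
}

class CoverageGapDemo {
   // The path that was test-driven from the start.
   static boolean shiftsToTopSpeed() {
      GuardedSpaceship ship = new GuardedSpaceship();
      ship.shiftDrivingModeTo(() -> 299792458);
      return ship.speed() == 299792458;
   }

   // The check written after the coverage report pointed at the guard.
   static boolean rejectsNullMode() {
      GuardedSpaceship ship = new GuardedSpaceship();
      try {
         ship.shiftDrivingModeTo(null);
         return false; // the guard did not trigger
      } catch (IllegalArgumentException expected) {
         return ship.speed() == 0; // the ship is still parked
      }
   }

   public static void main(String[] args) {
      System.out.println("shifts to top speed: " + shiftsToTopSpeed());
      System.out.println("rejects null mode: " + rejectsNullMode());
   }
}
```

With only the first check in place, the guard's branch is never executed; the second check closes the gap and at the same time documents what shifting to "no mode" is supposed to mean.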

What code coverage analysis actually implies is an implicit code review and that provides a great learning opportunity. But there’s more to it. Another aspect where code coverage really helps is detecting broken windows.

Broken Windows

TDD works best when it’s actually used. Let me elaborate by connecting to the heading. Broken Windows, at least in this context, has absolutely nothing to do with operating systems; it’s a term from social psychology that comes from the following example: “if a window in a building is broken and is left unrepaired, all the rest of the windows will soon be broken” [WILSON].

The analogy to software is apparent; a class without a unit test is a broken window and just makes such an excellent excuse to code yet another class without a unit test. I think it’s something very fundamental in human nature and I’ve been there myself. The original article on the subject puts it this way: “one unrepaired broken window is a signal that no one cares, and so breaking more windows costs nothing” [WILSON]. From there things only get worse; trying to repair a broken window by covering the code with tests afterwards is tough, as the code probably isn’t designed with respect to testing and now much effort has to be put into breaking dependencies in order to make the code testable (in fact it is such a tough problem that Michael Feathers has written a whole book about it).

On the other hand, as I extend or debug an existing program that was developed with full TDD from the very start, I just continue to write tests as I go along. The value I get from unbroken windows is obvious and during maintenance I learn to appreciate the unit tests as a regression test suite. Writing code without covering tests would in such a case be breaking the first window and that’s just too conspicuous.

TDD in a Maintenance Context

Software maintenance will always be hard but TDD may ease the pain. In fact, due to the small and rapid iterations in TDD, the software is put in maintenance mode almost instantly.

The secret to successful modifications of existing programs is to keep one factor constant all the time. In TDD terms this means either changing the unit test or the unit under test, but never both at the same time. After some initial analysis of the necessary changes I turn to the unit tests and write more of them. These may be either complementary tests to try out my understanding of the software to modify or a testcase for the required change. From now on the process is exactly the same as in TDD during greenfield-development.
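The sequence above might be sketched as follows. The required change (a park() operation) and the name MaintenanceDemo are hypothetical, chosen only to show the order of the steps; plain boolean checks stand in for JUnit test methods, and the lambda assumes Java 8 or later.

```java
interface DrivingMode {
   int topSpeed();
}

class Parked implements DrivingMode {
   public int topSpeed() {
      return 0;
   }
}

class Spaceship {
   private DrivingMode drivingMode = new Parked();

   int speed() {
      return drivingMode.topSpeed();
   }

   void shiftDrivingModeTo(DrivingMode newMode) {
      drivingMode = newMode;
   }

   // Step 3: the production change, made only after the check in step 2
   // existed and failed (in Java, initially by not compiling).
   void park() {
      drivingMode = new Parked();
   }
}

class MaintenanceDemo {
   // Step 1: a complementary test that pins down existing behaviour.
   // Written first, while the production code is left untouched.
   static boolean existingShiftStillWorks() {
      Spaceship ship = new Spaceship();
      ship.shiftDrivingModeTo(() -> 299792458);
      return ship.speed() == 299792458;
   }

   // Step 2: the test for the required change.
   // Written while the production code is still held constant.
   static boolean parkingStopsTheShip() {
      Spaceship ship = new Spaceship();
      ship.shiftDrivingModeTo(() -> 299792458);
      ship.park();
      return ship.speed() == 0;
   }

   public static void main(String[] args) {
      System.out.println("existing behaviour intact: " + existingShiftStillWorks());
      System.out.println("required change in place: " + parkingStopsTheShip());
   }
}
```

At each step only one side moves: first the tests grow while the production code stands still, then the production code changes while the tests stand still.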

TDD Recommendations

TDD is a high-discipline methodology. That makes it easy to slip. Below are some recommendations on what I believe are the most important practices to adhere to during TDD.

1. Keep the same quality on unit test code as on the code under test. There’s an apparent danger in mentally and qualitatively differentiating between production code and test code. Remember, the unit tests are your primary interface to the code during development and maintenance, and you do want that interface to stay clean as it evolves over time in order to keep it alive.

2. Write unit tests that are small and independent. Particularly, avoid dependencies upon databases, network communication or files. It is in the vein of good design to keep software loosely coupled. Failure to follow this recommendation may have practical implications very soon, as such unit tests not only require complicated set-up and clean-up code; they also take a long time to run. Unit tests that take a long time to run will probably not be run often enough (the same is true for the build process: if you’re using a compiled language, the unit tests have to be fast to build and run) and there’s a risk that the unit tests get out of sync with the rest of the codebase and turn into heavy baggage that’s finally abandoned.

3. Use consistent naming. My personal convention is to name each unit test after the unit it tests, appending “Test” to the name. Returning to my initial example, where I test-drove a Spaceship.java unit, I named the corresponding unit test SpaceshipTest.java. The rationale is that most IDEs sort files and classes alphabetically, making it easy to navigate between tests and production code.
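The second and third recommendations can be illustrated with a small sketch. Everything in it is hypothetical: StarChart stands for a data source that would be backed by a file or a database in production, Navigator is the unit under test, and the distance used is a made-up value rather than real data. Plain checks are used instead of JUnit so that the sketch runs without extra libraries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction over a slow data source (file, database, network).
interface StarChart {
   long distanceInMetresTo(String destination);
}

// Unit under test: it only knows the interface, never the concrete data source.
class Navigator {
   private final StarChart chart;

   Navigator(StarChart chart) {
      this.chart = chart;
   }

   long travelTimeInSeconds(String destination, int speedInMetresPerSecond) {
      return chart.distanceInMetresTo(destination) / speedInMetresPerSecond;
   }
}

// In-memory stub for the test: no set-up or clean-up code and nothing slow to wait for.
class InMemoryStarChart implements StarChart {
   private final Map<String, Long> distances = new HashMap<>();

   void add(String destination, long metres) {
      distances.put(destination, metres);
   }

   public long distanceInMetresTo(String destination) {
      // Only destinations registered through add() are supported by this stub.
      return distances.get(destination);
   }
}

// Named after the unit it tests, with "Test" appended.
class NavigatorTest {
   static boolean computesTravelTime() {
      InMemoryStarChart chart = new InMemoryStarChart();
      chart.add("Testville", 599584916L); // made-up distance: twice 299792458 m

      Navigator navigator = new Navigator(chart);
      return navigator.travelTimeInSeconds("Testville", 299792458) == 2;
   }

   public static void main(String[] args) {
      System.out.println("travel time computed: " + computesTravelTime());
   }
}
```

Because Navigator depends only on the StarChart interface, the test needs neither files nor a database, and it stays fast enough to run on every build.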

Summary

Test-Driven Development is a design technique that pays off soon and, at the same time, an investment in the future that continues to add value in subsequent versions of the software. TDD is not the long-sought silver bullet of software. It doesn’t really make any of the traditional phases in software development obsolete (possibly with the exception of desperate bug-hunting close to a release, but that rarely turns out to be a pre-defined and planned activity). Instead it inverts the order of coding and testing, thereby providing an excellent medium for detailed design with immediate feedback. TDD requires a lot of discipline and I hope that my recommendations will help you on the quest to great software.

April 2007

 

©2005 Adam Petersen adam@adampetersen.se