The Fallacy Of Fast, Putting Test Execution Speed In Its Place 04 Feb 2012

A colleague recently sent over a link to Corey Haines' video on Fast Rails Tests from GoGaRuCo 2011. Watching it got me thinking about fast tests and a common misconception people have about them. For the record, Corey is not one of those people; his talk simply prompted the topic.

The Problem

People are becoming obsessed with execution speed, almost to the exclusion of everything else. They’ve got umpteen thousand tests that execute in 2 seconds. On the surface this is awesome! Unfortunately, it can be (though is not always) a false achievement: the tests test a whole lot of nothing and end up being quite brittle. Mocking is used, but inappropriately and far too much. The tests end up being incredibly fast but not very valuable.

Speed Kills

I’ve gone through my own Michael Angelo Batio phase in the past, so I’m not simply speculating. I’ve been in those trenches, I’ve experienced it, I’ve done it, and I’ve seen others succumb as well. Faster tests are better: they will be executed more frequently, they will provide faster feedback loops, and they will leave developers with less idle time spent waiting for the test suite to finish. But fast is not the primary goal. It is merely one of the forces at play that we as developers must figure out how to resolve.

Goals of Testing

The primary goal for tests is to help continually drive the development of software over time. The broadness of this statement is intentional because there are many things at play which impact our ability to reach this goal. Things like: behavior validation, reliability, readability, frequency of execution, design/testability, etc.

Behavior Validation

Of these, the most important is behavior validation. Without it the others don’t matter: they are pointless. If you have 100,000 examples that run in 5 seconds, but they’re not testing the actual behavior of the components, then you are providing nothing but empty statistics. Behavior validation is not about speed, it’s about behavior.

Greenhorns have slow, ugly, poorly designed tests. Those who have more experience and are better at testing will see a million ways to improve these tests. It’s tempting to throw at them all of the other things they should be doing, but all in good time. First and foremost, we need to make sure we all have this basic principle of testing under our belt. Otherwise, we’ll be putting the cart before the horse, so to speak.

Reliability

Reliability is the next most important. It may seem like a given, but in many cases it is not, and it often has to be learned through hardship and experience by people new to testing. Tests need to execute reliably all of the time. For unit-level tests and examples this is easiest because there are the fewest things at work in the test. For higher-level tests (functional, integration, etc.) this gets trickier because there are more things at play that need to be considered.

Test Order Independence

There are multiple ways to look at reliability. One way is to run your test suite in backwards or random order (RSpec supports this). This will find tests that depend on the order in which the other tests are run. Tests should reliably and consistently pass regardless of the order you run them in.
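To make the order problem concrete, here is a minimal sketch in plain Ruby. It does not use RSpec; the `Registry` class, the two lambda "tests", and the `run_in_order` helper are all hypothetical names invented for this illustration, and the hand-rolled runner exists only to keep the outcome deterministic.

```ruby
# Hypothetical shared state that tests can leak into.
class Registry
  @items = []

  class << self
    attr_reader :items

    def add(item)
      @items << item
    end

    def reset!
      @items.clear
    end
  end
end

# Two tiny "tests" written as lambdas that return true when they pass.
TESTS = {
  "starts empty"  => -> { Registry.items.empty? },
  "adds one item" => -> { Registry.add(:a); Registry.items.size == 1 }
}.freeze

# Runs the named tests in the given order and returns { name => passed? }.
# If a block is given, it is called before each test, like a setup hook.
def run_in_order(order)
  Registry.reset! # every run starts from a clean slate
  order.map do |name|
    yield if block_given?
    [name, TESTS.fetch(name).call]
  end.to_h
end

forward  = ["starts empty", "adds one item"]
backward = forward.reverse

# Without per-test cleanup the suite only passes in one order.
puts "forward, leaky:     #{run_in_order(forward).values.all?}"
puts "backward, leaky:    #{run_in_order(backward).values.all?}"

# Resetting shared state before each test removes the order dependency.
puts "backward, isolated: #{run_in_order(backward) { Registry.reset! }.values.all?}"
```

Run backwards without cleanup, "starts empty" fails because the earlier test left an item behind. Resetting the shared state before every test makes both orders pass.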

Component Independence (Isolating System Under Test)

Another way to consider reliability is in terms of unexpected dependencies. For example, if we are testing something simple, like the requirement that a User has a name, then our test for that User validation should not fail when we change the Account. There are a number of cases where unexpected changes in other parts of the system impact our test. In some cases this shows we have bugs; in other cases it shows that our tests are not reliable because they are being impacted by things that have nothing to do with the behavior the test focuses on.
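As a rough sketch of what that isolation can look like, the example below uses Minitest (which ships with Ruby) rather than RSpec so it runs on its own. The `User` and `Account` classes here are hypothetical plain-Ruby stand-ins, not Rails models, and the validation rule is an assumption made for illustration.

```ruby
require "minitest/autorun"

# Hypothetical stand-ins for the User and Account discussed above.
Account = Struct.new(:plan)

class User
  attr_reader :name, :account

  def initialize(name:, account: nil)
    @name = name
    @account = account
  end

  # Assumed rule for this sketch: a user is valid when it has a non-blank name.
  def valid?
    !name.to_s.strip.empty?
  end
end

class UserNameValidationTest < Minitest::Test
  # Only the attribute under test is supplied. No Account is built,
  # so a change to Account cannot break these tests.
  def test_user_with_a_name_is_valid
    assert User.new(name: "Ada").valid?
  end

  def test_user_without_a_name_is_invalid
    refute User.new(name: "").valid?
  end
end
```

If the name validation ever does need an Account, that is a design signal worth noticing rather than something to paper over in test setup.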

Reliability is so basic, yet it is often misunderstood. The less reliable the tests are the harder it is to trust them and have confidence in your system and continue to evolve the system swiftly. Unreliable tests put the tests themselves in question.

Readability

Readability is next down the list. As much as it pains me to make this third, I think it is the proper place for it. Being able to understand your tests, and have others understand them, is hugely important. I have seen countless tests which test many things (some indirectly), making it hard to determine what exactly they’re testing. Readability goes a long way toward ensuring that the code can evolve, change, and be maintained with much greater ease, regardless of whether it’s you or someone else making the changes.

However, you can extract out readable tests from those that lack readability but still test actual behavior of the system and its components. It may be tedious and take longer, but it can be done. A good place to start here is to focus your tests around a single behavior of a component in the system and to strive to achieve one assertion per test.
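Here is a small sketch of what "one behavior, one assertion" might look like, again using Minitest so the file runs by itself. The `Cart` class is a hypothetical component made up for the example.

```ruby
require "minitest/autorun"

# Hypothetical component used only for illustration.
class Cart
  def initialize
    @prices = []
  end

  # Returns self so calls can be chained.
  def add(price)
    @prices << price
    self
  end

  def empty?
    @prices.empty?
  end

  def total
    @prices.sum
  end
end

class CartTest < Minitest::Test
  def setup
    @cart = Cart.new # a fresh cart for every test
  end

  # Each test names one behavior and makes one assertion about it.
  def test_new_cart_is_empty
    assert @cart.empty?
  end

  def test_adding_an_item_makes_the_cart_non_empty
    @cart.add(5)
    refute @cart.empty?
  end

  def test_total_is_the_sum_of_item_prices
    @cart.add(5).add(7)
    assert_equal 12, @cart.total
  end
end
```

When one of these fails, the test name alone says which behavior broke, which is most of what readability buys you.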

Frequency of Execution

Here we are now: frequency of execution. This is where having fast tests becomes important. The faster they are, the more they get executed. The more they get executed, the quicker the feedback loop is, which directly impacts all of the above things. But without the above things (except maybe for design) speed doesn’t matter: you just feel good about seeing more dots on the screen.

Having 100,000 tests pass in 5 seconds doesn’t mean you have 100,000 meaningful tests (although you may). People who get bitten by the speed bug early on fall into the trap of over-mocking. They end up testing the mock library in various ways but don’t actually test their software. It’s a rather interesting phenomenon: we are providing countless creative ways to verify that mocking libraries work in many application domains. This results in a meth-like high, feeling great about the number of examples and the speed of execution. At some point, the high ends.
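A contrived sketch of the difference, using Minitest and a hand-rolled stub in place of a real mocking library. `PriceCalculator`, `StubCalculator`, and the discount rule are all hypothetical and exist only for this illustration.

```ruby
require "minitest/autorun"

# Hypothetical component: applies a whole-number percentage discount
# to a price expressed in cents.
class PriceCalculator
  def discounted(cents, percent)
    cents - (cents * percent / 100)
  end
end

# Hand-rolled stub that always returns a canned answer.
class StubCalculator
  def discounted(_cents, _percent)
    900
  end
end

class OverMockedTest < Minitest::Test
  # Fast and green, but it only proves the stub returns what it was told to.
  # It keeps passing even if PriceCalculator is completely broken.
  def test_discount_using_only_the_stub
    assert_equal 900, StubCalculator.new.discounted(1000, 10)
  end
end

class PriceCalculatorTest < Minitest::Test
  # Just as fast, and it fails if the discount math changes.
  def test_ten_percent_off_one_thousand_cents
    assert_equal 900, PriceCalculator.new.discounted(1000, 10)
  end

  def test_zero_percent_leaves_the_price_unchanged
    assert_equal 1000, PriceCalculator.new.discounted(1000, 0)
  end
end
```

Both test classes add dots to the screen at the same speed. Only the second one would notice a bug.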

But there are simple things you can do to increase execution speed without sacrificing value while you are learning how to best utilize things like mocking libraries. And Corey offers a great tip in his video: spec_no_rails/. Have a folder which doesn’t boot up the entire framework, but has tests for things that don’t actually depend on the framework being loaded.
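As a sketch of the idea (not Corey's actual code), a file in such a folder might look like the following. It uses Minitest instead of RSpec so it is runnable as-is; the `Slug` module and the path in the comment are hypothetical.

```ruby
require "minitest/autorun"

# In a real project this file would load only the code under test, e.g.
#   require_relative "../lib/slug"   # hypothetical path
# The module is defined inline here so the example is self-contained.
module Slug
  # Lowercases the title and joins runs of letters and digits with hyphens.
  def self.from(title)
    title.downcase.strip.gsub(/[^a-z0-9]+/, "-").gsub(/\A-|-\z/, "")
  end
end

class SlugTest < Minitest::Test
  # Guard: this file must never pull in the framework.
  def test_rails_is_not_loaded
    assert_nil(defined?(Rails))
  end

  def test_title_becomes_url_friendly
    assert_equal "fast-rails-tests", Slug.from("Fast Rails Tests!")
  end
end
```

Because nothing here requires the framework, the only startup cost is loading Ruby and the one file under test.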

Designing for testability, which is up next, is also a great way to increase the speed of tests because you’ll be able to separate behaviors into their rightful places and test them largely independently of one another.

Another thing that impacts test execution speed is the language or framework itself. For example, Ruby 1.9.3 is much faster at loading and executing code than earlier versions. And Rails 3.2 has a number of performance enhancements.

Design, Testability

Lastly (for this list at least), as we grow and become better at the above four, we will begin to find ourselves exploring new ways of using tests to help drive and shape the design of our code. This often means we get better at applying design practices in our code, like Single Responsibility, Loose Coupling, etc.

Whether we test drive or test after, we begin to use our experiences and newfound abilities to shape the design of our software so the behavior can be tested reliably and in a readable manner. This means that tests become a heavy influence on our design. Our code becomes testable. And before you know it you’ll be telling people you design for testability.
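One common way this shows up is passing collaborators in rather than reaching out for them. The sketch below assumes a hypothetical `TrialPeriod` class with an injectable clock; the names and the 14-day rule are invented for illustration, and Minitest is used so the file runs on its own.

```ruby
require "date"
require "minitest/autorun"

# Hypothetical class: a trial that expires after a fixed number of days.
class TrialPeriod
  LENGTH_IN_DAYS = 14

  # The clock is injected so tests can control "today" without
  # stubbing Date globally. It defaults to the real date.
  def initialize(started_on:, clock: -> { Date.today })
    @started_on = started_on
    @clock = clock
  end

  def expired?
    @clock.call > @started_on + LENGTH_IN_DAYS
  end
end

class TrialPeriodTest < Minitest::Test
  START = Date.new(2012, 2, 4)

  def test_not_expired_on_the_last_day
    trial = TrialPeriod.new(started_on: START, clock: -> { Date.new(2012, 2, 18) })
    refute trial.expired?
  end

  def test_expired_the_day_after_the_last_day
    trial = TrialPeriod.new(started_on: START, clock: -> { Date.new(2012, 2, 19) })
    assert trial.expired?
  end
end
```

The tests are reliable because they never depend on the day they happen to run, and they stay fast because nothing has to be patched or booted to control time.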

The Balance

In terms of balancing, we should err on the side of having good, reliable tests first. It is more valuable to have a good, reliable but slow test than a fast, meaningless one. We can always come back to the slow test and find ways of improving its execution speed, but that should come after we are sure it’s doing its job (testing behavior) and doing it consistently (reliably).

Conclusion

Speed of execution is important, but its importance is relative to the other properties at play. All of the properties impact one another, but they are not all equally important. As we grow in all of these areas we are able to explore more advanced ways of achieving all of them. But speed must come after the ones before it are working to our advantage in ensuring that our software works. Otherwise, we may end up with good conference conversation and not much else.

There are other important things related to testing that were left out of this post, and the above areas could all have been taken to much greater depths. This post was a quick attempt to bring to the surface an issue I see with inexperienced testers latching onto things like fast tests without first understanding how to write a good, reliable test.