At StratEx[i], apart from developing the StratEx application, we naturally care about quality and about testing the app. After all, we are committed to providing quality to our users. StratEx uses code generation[ii] to shorten the development cycle, thereby being able to implement new features and functionality quickly. It also yields a nicely uniform user interface, because we kept the code generation as simple as we dared: it is not easy to create a (generated) screen that looks very different from the others.
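To give a flavor of the idea, here is a minimal sketch of template-based code generation. The template, entity name, and field names are hypothetical illustrations, not the actual StratEx generator:

```python
# Minimal illustration of template-based code generation (hypothetical
# template and entity names, not the actual StratEx generator).

ENTITY_TEMPLATE = '''class {name}:
    """Auto-generated data class for the {name} entity."""
    def __init__(self, {args}):
{assignments}
'''

def generate_class(name, fields):
    """Render a Python class definition from an entity description."""
    args = ", ".join(fields)
    assignments = "\n".join(f"        self.{f} = {f}" for f in fields)
    return ENTITY_TEMPLATE.format(name=name, args=args, assignments=assignments)

# Generate a class from a one-line entity description, then load it.
source = generate_class("Project", ["title", "owner", "deadline"])
namespace = {}
exec(source, namespace)  # compile the generated class into a namespace
project = namespace["Project"]("Pilot", "alice", "2014-01-01")
```

Because every generated class comes from the same template, the generated part behaves uniformly; change the template and regenerate, and every class changes with it.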
The code generation concept gives us reliable code (the generated part, at least), which also reduces testing time. Still, next to coding efficiently, we wanted to test efficiently as well. The obvious solution (from our perspective, at least) was to automate and to generate our tests. While this sounds easy, in reality it is not so obvious. We spent quite some time figuring out what the best testing strategy could be and how to implement it (using test automation[iii]).
At first we settled for user interface (UI) testing, using Selenium[iv]. We hand-recorded some testing scenarios and included them in our Continuous Integration[v] build process (I’ll talk about this in another post). It helped us deploy builds that were smoke-tested[vi]. But it did not help at all when we made changes to the screens. And that was exactly what we were doing all the time, because it was easy to do with our generated code and we needed this flexibility for our users! Of course there is no way we could compromise on this, not even to ensure our code was tested automatically. Yes, you are reading this right: we dare to deploy code that is not fully end-to-end tested. We prefer to deploy often and fast, with the risk that our users find a bug from time to time.
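For illustration, a hand-recorded Selenium scenario of this kind might look roughly as follows. The URL, element ids, and credentials are hypothetical, and running it requires a local WebDriver; this is a sketch of the shape of such a test, not an actual StratEx scenario:

```python
def login_smoke_test():
    """Smoke test sketch: log in and check that the dashboard appears.

    Hypothetical URL and element ids. The fragility discussed above is
    visible here: rename or move any of these elements and the test breaks.
    """
    # Imported inside the function so the sketch loads without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get("https://example.com/login")
        driver.find_element(By.ID, "username").send_keys("demo")
        driver.find_element(By.ID, "password").send_keys("secret")
        driver.find_element(By.ID, "login-button").click()
        assert "Dashboard" in driver.title
    finally:
        driver.quit()

if __name__ == "__main__":
    login_smoke_test()
```

Every locator (`username`, `login-button`, the page title) is coupled to the current screen layout, which is why regenerating or changing screens invalidates such recorded tests so quickly.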
Still, we were not happy about this, so we kept looking for better solutions. A long investigation ensued: looking at better (faster) ways to test, ways to generate tests, ways to improve our hand-written code, and reading many books and articles on testing (see the book and article lists below). Today the search is not over, and we still have not managed to (fully) generate our tests. These days, however, we do run automated tests before deploying a new build.
Meanwhile, we’d like to share some observations on the various test/development approaches we have encountered:
TDD (Test Driven Design/Development)
The basic premise of the TDD[vii] methodology is that your development is essentially driven by a number of (unit) tests[viii]. You first write the unit tests and run them against your (not yet existent) code. Obviously the tests fail, and your next efforts are directed at writing code that makes the tests pass. TDD, with its strong focus on unit testing, is in our opinion an overrated concept. It is easy to explain and therefore quickly attracts enthusiastic followers. The big gap in its approach is that (unit) tests are also code. This code must be written and maintained, and it will contain bugs. So if we end up on a project where developers spend x% of their time writing test code rather than production code, we have to ask what the point is. In our view, you are just producing more lines of code, and therefore more bugs.
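To make the cycle concrete, here is its canonical shape, using Python's `unittest` and an invented `apply_discount` function: the test is written first and fails ("red"), then just enough production code is written to make it pass ("green"):

```python
import unittest

# Step 1 ("red"): the test is written first. At that moment
# apply_discount does not yet exist, so running the test fails.
class DiscountTest(unittest.TestCase):
    def test_ten_percent_discount(self):
        self.assertEqual(apply_discount(100.0, 10), 90.0)

# Step 2 ("green"): write just enough production code to make
# the test pass; refactoring comes afterwards.
def apply_discount(price, percent):
    """Return the price after applying a percentage discount."""
    return price * (1 - percent / 100)
```

Run with `python -m unittest`. Note that the test here is as much code as the function it checks, which is exactly the trade-off discussed above.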
Another weakness of TDD is that there are a lot of bad examples of it around. Many TDD-advocating sites give sample code that essentially goes like this: you have a unit test that needs to test a new class. The unit test typically sets a property on the class under test and tries to read that property back. With any class of reasonable size, this quickly gives a nice set of unit tests. But let’s back up a bit: what exactly are we testing here? Well, all things being equal, in such scenarios we are testing whether the code can accept a value and return it again. This comes down to testing the compiler or the interpreter, because the amount (and complexity) of code you wrote as a developer to achieve this behavior is close to zero. In other words, such unit tests provide only a false sense of security, as in the end you are not testing anything.
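The pattern we mean looks like this (the class and property names are invented for illustration); note that the test exercises no logic of ours, only the language's ability to store and return a value:

```python
class Customer:
    """A plain data holder: the property pair contains no logic of ours."""
    def __init__(self):
        self._name = None

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        self._name = value


def test_customer_name_roundtrip():
    # Set a property and read it back: this only verifies that attribute
    # assignment works, i.e. it tests the interpreter, not our code.
    customer = Customer()
    customer.name = "Alice"
    assert customer.name == "Alice"
```

The test will pass as long as the interpreter works, which is precisely why it proves nothing about the application.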
One argument against this is that over time such getter/setter[ix] code may evolve into something more complicated, and then the unit test becomes useful to avoid regression[x]. Our experience shows, however, that a) this rarely happens at a scale large enough to make the initial effort worthwhile, and b) if you are making such a drastic change to your code, moving from elementary getter/setter pairs to more involved computations, it is very likely that you want your unit test to break.
What is the moral of this story for unit tests? Should we abandon them altogether? No, but we maintain that it requires some thought about what exactly you want to unit-test. For us, 100% unit test coverage means we’re wasting effort.
Let us take another point. TDD as a concept is not bad, in the sense that it forces you to think about what you actually need to build and how you can get it accepted (tested).
We believe, however, that its fundamental approach puts too much “power” in the hands of the developer: a developer under time pressure is tempted to build “something” that merely passes the test, and that is all. So if the test is wrong, the developer is not to blame! We believe this undervalues the strength of a good developer and gives anyone with a code editor the chance to position themselves as one. This cannot be right.
Looking into software testability[xi], we found another issue that we could not accept: the burden that producing “testable software” puts on your architecture. Making a system “testable” is one thing; enforcing rock-hard principles on the architecture (Inversion of Control[xii], Dependency Injection[xiii], or a very strict separation of layers[xiv]) in our opinion only dramatically increases the complexity of the code and, very importantly, serves no purpose for the end objective of the system, namely to solve a business problem. For highly complex systems such approaches are probably defensible and even very good. However, in the fast-paced, ever-changing world we live in, the least of our problems is whether a software architecture can be sustained for ten years if we already know that in two years’ time the entire business system will be obsolete. We may even argue that in two years’ time the insights on what a good architecture should be will have changed dramatically. So why bother? We should concentrate on software that works, and preferably ships in record time.
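For context, this is the kind of indirection such "testable" architectures prescribe (all class names here are hypothetical): the report no longer creates its own database access but receives it from outside, so a test can substitute a fake:

```python
class SqlProjectRepository:
    """Production implementation: would talk to the real database."""
    def count_projects(self):
        raise NotImplementedError("requires a database connection")


class FakeProjectRepository:
    """Test double, injected in place of the real repository."""
    def __init__(self, count):
        self._count = count

    def count_projects(self):
        return self._count


class ProjectReport:
    """Depends on an injected repository instead of instantiating
    SqlProjectRepository itself (constructor injection)."""
    def __init__(self, repository):
        self._repository = repository

    def summary(self):
        return f"{self._repository.count_projects()} projects"


# In a test, the database is replaced by the fake:
report = ProjectReport(FakeProjectRepository(count=3))
```

The test gains isolation from the database, but at the price of an extra abstraction and wiring for every dependency; multiplied across a codebase, this is the complexity the paragraph above objects to.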
So we concluded that any testing methodology requiring extensive re-engineering of what is basically a workable and dependable architecture should be regarded with a certain suspicion. As a consequence, we do not use unit tests that require our code to work without a database. We do not use “mocking”[xv], with all its complexities, and we do not spend time making our classes and objects independent of each other. It does not make for the most “correct” code, we know. However, we don’t mind. If tomorrow we find a better way to do it, we change our code templates and simply re-generate our application code (well, most of it). So we are not overly worried about having the right architecture to begin with; in fact, we’ve already changed it twice, but that is another blog post.
So does this mean the end of test-driven development? Not at all: there are things you can test very well with unit tests. Plus, there is a newer descendant, which we have also investigated and which shows promise for other areas we want to test: Behavior Driven Development[xvi] (BDD).
BDD has the same starting point as TDD in the sense that it begins the development process with the definition of the tests the future application will need to pass to be accepted. But BDD is more appropriate for this task because it focuses more on the functionality of a system than on how it should be built, so it is less prone to the criticism we have of TDD. For one, it provides a way to bridge the gap between users and developers by using a specific language in which to specify tests (or acceptance criteria, if you wish). This language, Gherkin[xvii], is so simple that its learning curve is as flat as it comes, meaning that everyone can be taught to understand it in record time. Writing proper Gherkin requires a bit more time.
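A hypothetical scenario gives the flavor (the feature and step names are invented for illustration, not taken from StratEx):

```gherkin
Feature: Project deletion
  Scenario: A user deletes an empty project
    Given a project named "Pilot" with no tasks
    When the user deletes the project "Pilot"
    Then the project list no longer contains "Pilot"
```

Each `Given`/`When`/`Then` line is readable by a non-developer, while a developer binds it to automation code behind the scenes.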
For us, its main advantage is that Gherkin provides a way to communicate the functionality of a system at a level that is understandable to a developer. Its main downside is that you will end up with A LOT of Gherkin to fully describe a system of reasonable size.
This, in the end, is our main criticism of most of these “methodologies” (UML[xviii] included): if you have a system that goes beyond a simple calculator (the usual example), no modeling language (which they all are, in a way) is powerful enough to describe a full and complete system in such a way that you can understand and describe it more quickly than by looking at the screens and the code that implements them.
So the search goes on…
Test-Driven Developments are Inefficient; Behavior-Driven Developments are a Beacon of Hope? The StratEx Experience (A Public-Private SaaS and On-Premises Application) – Part I
Do you want to buy the books from Amazon?
- “The Cucumber Book” (Wynne and Hellesoy)
- “Application Testing with Capybara” (Robbins)
- “Beautiful Testing” (Robbins and Riley)
- “Experiences of Test Automation” (Graham and Fewster)
- “How Google Tests Software” (Whittaker, Arbon et al.)
- “Selenium Testing Tools Cookbook” (Gundecha)
- “Model-Driven Software Engineering” (Brambilla et al.)
- “Continuous Delivery” (Humble and Farley)
- “Domain-Specific Languages” (Fowler)
- “Domain-Specific Modeling” (Kelly et al.)
- “Language Implementation Patterns” (Parr)
[ii] Code generation: http://en.wikipedia.org/wiki/Automatic_programming
[v] Continuous integration: http://en.wikipedia.org/wiki/Continuous_Integration
[vi] Smoke testing (software): http://en.wikipedia.org/wiki/Smoke_testing_(software)
[vii] Test Driven Development (TDD): http://en.wikipedia.org/wiki/Test-driven_development
[xiv] Separation of concerns: http://en.wikipedia.org/wiki/Separation_of_concerns
[xvi] Behavior-driven development: http://en.wikipedia.org/wiki/Behavior-driven_development
[xviii] Unified Modeling Language: http://en.wikipedia.org/wiki/Unified_Modeling_Language