Code Freeze is a great local one-day conference held at the University of Minnesota. Held last Thursday, this year's topic was Software Testing for the 21st Century. While the lectures were great, the most enjoyable parts for me were the interactive portions of the day. The middle of the day was organized into three testing dojos:
- The Story of the Code
- The Story of the Product
- The Story of the User Experience
I helped to run the Story of the Product testing dojo. This dojo focused on story, functional, and integration testing. The three tools we demoed were Cucumber, JBehave, and FIT. I have been using FIT at work for about five years now, but for this dojo I had the opportunity to learn and demo JBehave. Not having used JBehave before, I downloaded it the Tuesday night before and worked through some of the stories we would cover during the dojo. I was very impressed by how easy it was to pick up.
With 20 minutes for each of the three tools, we created a project, wrote the tests and test fixtures, wired the tests into the test fixtures, and implemented the system under test, all in a TDD format. Cucumber was demoed first. One thing unintentionally demonstrated was the setup time needed to get up and running with Cucumber. An interesting aspect of this tool is that it supports multiple languages, of which Ruby and Java were shown. Second up, I helped to demo JBehave. Setup was a little less complex, and it uses a Given/When/Then syntax similar to Cucumber's.
The third tool shown was FIT. The immediate reaction from the audience was “Why would anyone willingly choose to write HTML or wiki tables to set up tests?” This is something I and others have struggled with at work: no developer wants to write straight-up HTML, and no non-developer wants to write wiki tables. At least we haven’t been able to make it stick. There are options such as creating the tables in a Word document and saving it as HTML, but for some reason this still tends to be a roadblock.
There was a lot of resistance to story TDD and many questions about how to get business people to successfully adopt it. While there were success cases presented, it’s definitely still an open question in many people’s minds.
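For readers who have not seen the Given/When/Then style mentioned above, here is a minimal sketch of the idea in plain Java. It is not JBehave's or Cucumber's actual API; the `StoryRunner`, `Calculator`, and `StoryDemo` classes are hypothetical names made up for illustration. It only shows how plain-text story lines can be matched to fixture code that drives a system under test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical system under test: a trivial calculator.
class Calculator {
    private int total;
    void reset(int start) { total = start; }
    void add(int n) { total += n; }
    int total() { return total; }
}

// Hypothetical mini runner that binds plain-text steps to fixture code.
// Real tools do this with annotations or step definitions; this is only
// an illustration of the concept.
class StoryRunner {
    private final List<Pattern> patterns = new ArrayList<>();
    private final List<Consumer<Matcher>> actions = new ArrayList<>();

    // Register a step: a regex for the story line and the code to run for it.
    void step(String regex, Consumer<Matcher> action) {
        patterns.add(Pattern.compile(regex));
        actions.add(action);
    }

    // Run each non-blank line of the story against the first matching step.
    void run(String story) {
        for (String line : story.split("\n")) {
            String text = line.trim();
            if (text.isEmpty()) {
                continue;
            }
            boolean matched = false;
            for (int i = 0; i < patterns.size(); i++) {
                Matcher m = patterns.get(i).matcher(text);
                if (m.matches()) {
                    actions.get(i).accept(m);
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalStateException("No step matches: " + text);
            }
        }
    }
}

class StoryDemo {
    // The "test fixture": wires story text to calls on the system under test.
    static StoryRunner calculatorRunner(Calculator calc) {
        StoryRunner runner = new StoryRunner();
        runner.step("Given a calculator showing (\\d+)",
                m -> calc.reset(Integer.parseInt(m.group(1))));
        runner.step("When I add (\\d+)",
                m -> calc.add(Integer.parseInt(m.group(1))));
        runner.step("Then the calculator shows (\\d+)", m -> {
            int expected = Integer.parseInt(m.group(1));
            if (calc.total() != expected) {
                throw new AssertionError(
                        "Expected " + expected + " but was " + calc.total());
            }
        });
        return runner;
    }

    public static void main(String[] args) {
        String story = "Given a calculator showing 2\n"
                + "When I add 3\n"
                + "Then the calculator shows 5";
        calculatorRunner(new Calculator()).run(story);
        System.out.println("Story passed");
    }
}
```

The story text is the part a non-developer would read or write; the fixture is the part a developer maintains. In this sketch a failing "Then" step throws an `AssertionError`, and a line with no matching step throws an `IllegalStateException`.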
My Experience
I fell in love with this type of testing in a roundabout way. Several key developers, who were also subject matter experts, left the company shortly after I started. FIT was my way of exploring and quickly identifying issues.
I have found this type of testing exposes the brittleness of your integrations and poor API contracts. One might argue that we shouldn’t be testing at this level, but really, shouldn’t the API model the business or the story? For instance, as a shopper I want to add a pair of jeans to my shopping cart and complete my order. If the system under test has a poorly designed API, my test classes may be forced to invoke several different classes and methods in order to fulfill this test. With a well-designed API, maybe the test simply needs to invoke the ‘add product to cart’ method, prepare some test shopper address and payment data, and then invoke the ‘complete order’ method.
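As a rough sketch of the well-designed case, the jeans story could map onto an API like the one below. Everything here is an assumption for illustration: `ShoppingCart`, `Storefront`, `Order`, `addProductToCart`, and `completeOrder` are hypothetical names, not taken from any real system or from the tools discussed above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical cart with a single story-level operation.
class ShoppingCart {
    private final List<String> products = new ArrayList<>();

    void addProductToCart(String product) {
        products.add(product);
    }

    List<String> products() {
        return products;
    }
}

// Hypothetical result of completing an order.
class Order {
    private final List<String> products;
    private final String address;
    private final String payment;

    Order(List<String> products, String address, String payment) {
        this.products = products;
        this.address = address;
        this.payment = payment;
    }

    List<String> products() { return products; }
    String address() { return address; }
    String payment() { return payment; }
}

// Hypothetical entry point whose methods mirror the steps of the story.
class Storefront {
    Order completeOrder(ShoppingCart cart, String address, String payment) {
        if (cart.products().isEmpty()) {
            throw new IllegalStateException("Cart is empty");
        }
        // Copy the products so later cart changes do not alter the order.
        return new Order(new ArrayList<>(cart.products()), address, payment);
    }
}

class JeansStory {
    // The whole story test needs only two API calls plus test data.
    static Order shopperBuysJeans() {
        // Given a shopper with test address and payment data
        String address = "123 Test Street";
        String payment = "test-card";
        ShoppingCart cart = new ShoppingCart();

        // When the shopper adds jeans to the cart and completes the order
        cart.addProductToCart("jeans");
        Order order = new Storefront().completeOrder(cart, address, payment);

        // Then the order contains the jeans
        if (!order.products().contains("jeans")) {
            throw new AssertionError("Order should contain jeans");
        }
        return order;
    }

    public static void main(String[] args) {
        Order order = shopperBuysJeans();
        System.out.println("Ordered: " + order.products());
    }
}
```

If a fixture for this story had to reach into several unrelated classes to get from "add to cart" to "complete order", that would be the kind of poor API contract this style of testing tends to surface.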
One other unique way I’ve seen this type of testing used is with an application that has a complex deployment environment, even for developers. While this may be a design smell in itself, sometimes there’s just not much you can do. This type of testing lets developers quickly run multiple stories when setting up test data and testing even one story manually might take hours.
If you already have your ‘library’ of test fixtures, this type of testing can also be used to prove or disprove (and subsequently debug) production issues. With a complex deployment environment and an application that is highly dependent on live production data that is not available in a test environment, this type of testing is the quick and dirty way to dig in and reproduce an issue. Oftentimes we find an issue is not otherwise reproducible in test environments; with a FIT test, it’s reproducible in any environment with the click of a button.
But even with these benefits we’ve seen, there are a number of downsides. The largest is the adoption of this form of testing by non-developers: it just never took off. The second is simply the amount of maintenance it took to keep hundreds of tests running across three code branches (dev, test, production) and three databases.
Remaining Questions
The testing dojos resulted in more questions than answers for many: Does testing at the story level promote better design or does it ignore it? Is this type of testing really just “stringing different units together” and not really testing anything? Or does it simply start the conversation that the story is intended to start, ensuring you are building the right product? What about testing the integration of different components or services? How should that be handled? Or should it? And the kicker: Is TDD even useful?