The picture above is of a tarsier. She is a curious little critter with great big eyes for hunting bugs at night. I felt that made for a fitting name for a testing framework - so after writing one, I named it "Tarsy".
Firstly, when I speak of "code testing" here, I am referring to writing code to test code – sometimes referred to as "white box" testing – namely:
Some other types of testing may overlap these types (these certainly overlap each other), such as end-to-end testing, integration testing, system testing, etc. And some types of testing are generally not done with code and fall outside this article, such as load testing, acceptance testing and other "black box" style tests.
TDD reduces production bug density by 40–80%
A project with a well written test suite is easier to document*, easier to maintain, easier to upgrade and easier to refactor. Bugs are found faster, and are therefore much easier (and cheaper) to fix. If you care at all about the quality and success of your projects, you should be writing tests.
Obviously testing is a win. It's a win for the developer, a win for other developers on the project, a win for the project, a win for the developer's company, and a win for the users.
* In a future post I will make the claim that the tests are the documentation.
The concept of a code test is simple: You call a function and check that the return value is what you expect. Do this for a variety of inputs, testing the common path as well as edge cases. And do this for all the functions in your application, library or module that you are testing.
It is that simple. It should really be that simple.
I believe there are two primary reasons that testing is often more complicated:
One is due to the way many of us write code.
It is easy to test a function that is available on the global scope, takes simple arguments, does not depend on the application state, and returns a simple value. But many functions would not qualify for this description. They may be buried deep inside another module and may not be publicly accessible; they may depend on state within the module and/or call out to other modules that also have state, etc.
The more you write tests (particularly if you write your tests before you write your code), the more you will naturally write functions that are easy to test. In fact, proponents of TDD will argue that this is a central benefit of the TDD methodology.
The second reason getting started with code testing is such a challenge is due to the testing framework landscape.
How do you decide which framework to adopt?
And once you decide on a framework, you often then have to select an assert library and/or a report generator.
Supporting TAP output sounds like a feature, but when someone is just getting started, learning how to convert TAP to a comprehensible report is an added hurdle.
No, you don't need one - you could write a small program that loads your library or application, calls various functions, and checks the return values. But a framework helps. It helps by providing the following small but important features:
Many of these features would be pretty easy for a competent developer to code up in 10 to 15 minutes, but collectively it would take some time. And most developers wouldn't take the time to make those reports look so nice, now would they!?
There are many others, but these seem to get the most attention.
And now allow me to introduce the new kid in town: Tarsy.
The reason I wrote yet another testing framework is this: All the other solutions I reviewed did too much of what I didn't need, and were not quite as adept at the key fundamentals that I did need.
Here is the short list of what I really needed a testing framework to offer:
This list effectively disqualifies all the testing frameworks listed above. In most cases, multiple times over.
Tarsy was written with the above list as core requirements. Here is a chart showing the popular frameworks and their handling of these criteria:
| Criterion | | | | | | Tarsy |
| --- | --- | --- | --- | --- | --- | --- |
| Runs asynchronous tests in parallel | No | No | No | Yes | No | Yes |
| Works in both browser and in Node | Browser only | Browser only (plugin for Node) | Yes (separate file) | Node only | Yes | Yes |
| CI integration / test runner | No | No | Yes | Yes | Yes | Yes |
| Built-in assert handling | Yes | Yes | No | Yes | Yes | Yes |
| Asserts use strict comparisons | No | No | No | Yes | No | Yes |
Of course, you may have different criteria for what you look for in a testing framework - but let me describe some of these requirements in more detail:
Installing via npm install <package> is not hard in most cases, but simpler is definitely better. It's a gift that gives every time you transfer files, copy projects, backup projects, check-in, check-out, etc. All else being equal, I prefer a solution with fewer files - and enjoy The Cumulative Benefit of Multiple Small Improvements.
The simplicity is carried further into the usage of the framework. Namely, there is either a Tarsy global variable (in the browser environment) or you pull it in with require (Node/Browserify/AMD). There are only 3 functions to know (section, test and assert) and they work just as you might expect.
Does this simplicity come at a cost with regard to power and/or flexibility? I think not. Eric Elliott argues in this article that when it comes to testing frameworks, "Fewer features is the new feature-rich".
It is best to keep the focus on your tests and not on your testing framework. You will write more tests. You will write better tests. Your code quality will be better for it.
This one baffles me.
Why do so many (all, it seems) testing frameworks (or assert libraries) make loose equality the default?
Here are four asserts (as could be used in a vanilla Node environment) that should not pass, but they do. They pass because the standard assert in Node requires only "truthiness" and assert.equal does a loose comparison:
```javascript
assert("false")                // "false" as a string is truthy
assert(new Error("Bad Value")) // Error objects are truthy
assert.equal(1, "1")           // numeric 1 is loosely equal to string "1" due to type coercion
assert.equal(1, true)          // again, type coercion converts true to 1, making them "equal"
```
These examples should not silently pass. If your test expected a function to return the string "1" but it instead returned the number 1, you would want to know about it. Yet in all the testing frameworks I tested, these errors would not be caught. Baffling.
Update: I've been told that AVA also uses strict asserts. High Five AVA!
Of course, you can generally use assert.strictEqual (or its equivalent) to improve the situation, but you have to remember to do so - and the core assert(condition) often has no strict counterpart.
With Tarsy, all tests are strict. In my opinion, there is no value in a loose test - it should be avoided. If you choose a framework other than Tarsy, I'd recommend always using strictEqual.
Note: It is likely that testing framework authors use this relaxed truthiness test due to "common expectation", i.e. "because everyone else is doing it" - indeed, I see the same pattern in the Ruby and Python communities. That is not a good reason to make your testing less robust. Test framework interoperability is not important – code quality is.
The sooner you catch a bug, the better.
For this reason, it is recommended that you run your tests constantly. And so it is critical that your tests run as quickly as possible. If a test run completes in 7 seconds, it will be run a lot more often than a test run that takes 45 seconds.
Often what takes time within a test suite is asynchronous work, such as file IO, database access, and network operations. Tests of communication code often purposefully impose long delays in order to exercise timeout handling. This increases code coverage and is smart, but it means each such test sits idle for the full timeout (often 10 seconds or more) while you simply wait.
If you follow best practices by independently setting up and tearing down each test, these will quickly multiply and severely increase the time your test suite takes to complete.
Tarsy runs asynchronous tests in parallel, so they will run simultaneously and complete much faster. And don't worry, if necessary you can define a section which runs its tests synchronously.
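The effect of running async tests concurrently can be sketched with plain promises (this is an illustration of the idea, not Tarsy's internals):

```javascript
const delay = (ms) => new Promise((res) => setTimeout(res, ms));

// A simulated async test taking ~100 ms
// (a stand-in for file, network or database work)
async function fakeAsyncTest(name) {
  await delay(100);
  return name + " passed";
}

async function main() {
  const names = ["read config", "query db", "fetch api"];

  // Serial: roughly 300 ms total
  let t0 = Date.now();
  for (const n of names) await fakeAsyncTest(n);
  console.log("serial:  ", Date.now() - t0, "ms");

  // Parallel: roughly 100 ms total -- the tests overlap
  t0 = Date.now();
  await Promise.all(names.map(fakeAsyncTest));
  console.log("parallel:", Date.now() - t0, "ms");
}

main();
```

With dozens of IO-bound tests, the serial wall time grows linearly while the parallel wall time stays close to the slowest single test.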
At the time of this writing, Tarsy is at version 0.4.1. So is Tarsy ready for production use? What does the future hold?
Tarsy has not yet been assigned a 1.0 version. It is very new and has not been vetted by the community the way more established frameworks have.
If you are starting a new ("Greenfield") project, I think Tarsy is worthy of consideration, particularly for more seasoned developers. The codebase is small and easy to understand. It is now a critical part of our workflow, so I will be very responsive and quickly fix any issues that arise. And I will be very receptive to contributions!
I expect it to reach 1.0 status within a couple months.
Tarsy has proven useful for us already - but there are some additional changes on the roadmap. The first is to enable Tarsy to be used as a "library" - in addition to being a framework. You will be able to define sections, attach them to parent sections, and assign tests to them without requiring Tarsy to be in control of programmatic flow. I will explain this further in a future post.
It would also be useful to allow choosing specific sections or tests to be run within a test suite. This is a common feature offered by other testing frameworks and would be easy to add to Tarsy.
The simple download/installation/usage is a fundamental I will not stray from, however. So any ambitious features will likely be in the form of extensions/plugins. Indeed, many framework-agnostic plugins already work just fine with Tarsy, including Istanbul (code coverage), Sinon (spies, stubs, mocks), power-assert, etc.
So grab Tarsy, and start hunting for bugs with "The little test framework with BIG EYES"!