Unit Tests’ Effects on “Testability”

If your code base looks like this attic, I pity you.

Unit tests seem to be generally regarded as critical for quality code bases. Prima facie, the key effect we might expect is correctness. After all, that’s what most unit tests aim to test!

Unit testing may not be sufficient for correctness (no test approach is, in the extreme) but it does seem that having lots of tests should promote correctness.

However, this primary, most direct outcome is not the only outcome we might expect:

  1. Proponents of test driven development argue that good unit tests promote refactoring and other good practices: you can make changes “with confidence” because your tests protect you from unintended effects.
  2. Some units are harder to test than others, i.e. are less testable. Intuitively, long functions or methods, complex ones with lots of code paths, and complex signatures all make a given unit hard to test! So we might expect that writing lots of tests tends to promote testable code. We might expect synergy with 1.

It all sounds plausible (but defeasible). But what does reality say?

We are living in a golden age for empirical study of software engineering in many ways. There’s so much stuff freely accessible on the web (code of all sorts, with revision history, and a vast amount of side matter…issue and mailing lists, documentation, etc). It’s a lot easier to get a survey or experiment going.

That’s what Erik Dietrich did in a very nice blog post. He looked at 100 projects from GitHub, characterized them, then binned them by the percentage of methods that were test methods. If 50% of your methods are test methods, it’s a pretty good bet that the code base is heavily tested.
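To make the binning idea concrete, here is a minimal Python sketch. The cut-offs, the function names, and the sample projects are all made up for illustration; they are not the bins or the data from the blog post.

```python
from collections import Counter


def fraction_of_test_methods(test_methods: int, total_methods: int) -> float:
    """Share of a code base's methods that are test methods."""
    if total_methods <= 0:
        raise ValueError("total_methods must be positive")
    return test_methods / total_methods


def bin_label(fraction: float) -> str:
    """Assign a bin using hypothetical cut-offs (assumed, not from the study)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if fraction == 0:
        return "no tests"
    if fraction <= 0.10:
        return "up to 10%"
    if fraction <= 0.25:
        return "10-25%"
    if fraction <= 0.50:
        return "25-50%"
    return "over 50%"


def bin_codebases(counts: dict[str, tuple[int, int]]) -> Counter:
    """Count how many code bases land in each bin.

    `counts` maps a project name to (test_methods, total_methods).
    """
    return Counter(
        bin_label(fraction_of_test_methods(tests, total))
        for tests, total in counts.values()
    )


# Invented sample data, purely for demonstration.
sample = {
    "project_a": (0, 400),    # no test methods at all
    "project_b": (30, 600),   # 5% test methods
    "project_c": (120, 500),  # 24% test methods
    "project_d": (450, 800),  # about 56% test methods
}
summary = bin_codebases(sample)
```

Once each code base has a bin, any per-method metric (length, complexity, parameter count) can be averaged per bin and compared across bins.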

Right off the bat we have a striking result:

Of the 100 codebases, 70 had unit tests, while 30 did not.

This could be an artifact of his detector or maybe the tests are elsewhere. Still!

Overall, only 5 of his 10 very natural hypotheses were correct. For example, testing anticorrelated with method length and complexity.

For cyclomatic complexity…this may not be surprising. You generally need more tests (to hit all the code paths). Also, as supported by “Beyond Lines of Code: Do We Need More Complexity Metrics?” (from the awesome Making Software, which needs a second edition!!), complexity metrics, including cyclomatic complexity, tend to correlate closely with lines of code. So larger methods and more complex methods are going to march together (and probably nesting too).
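Here is a rough Python sketch of why length and cyclomatic complexity tend to move together. It approximates cyclomatic complexity as one plus the number of decision points found in the syntax tree. This is a simplification written for this post, not the tool used in the study, and the two sample functions are invented.

```python
import ast

# Node types counted as one decision point each in this approximation.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra short-circuit paths.
            complexity += len(node.values) - 1
    return complexity


def line_count(source: str) -> int:
    """Number of non-blank lines."""
    return sum(1 for line in source.splitlines() if line.strip())


SHORT = '''
def sign(x):
    if x < 0:
        return -1
    return 1
'''

LONG = '''
def classify(x, strict):
    if x is None:
        return "missing"
    if x < 0 and strict:
        return "invalid"
    for limit, label in ((10, "small"), (100, "medium")):
        if x < limit:
            return label
    return "large"
'''

for name, src in (("sign", SHORT), ("classify", LONG)):
    print(name, line_count(src), approx_cyclomatic_complexity(src))
# prints:
# sign 4 2
# classify 9 6
```

The longer function has more branches simply because there is more room for them, and each extra path is another case a thorough test suite would want to cover.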

In any case, this is a very nice start.