Notes on Unit Testing and Other Things

People often ask me what constitutes a good amount of unit testing. I think the answer is usually a very high level of coverage but it’s fair to say that there are smart people that disagree. The truth is probably that there is not one answer to this question. Notwithstanding this, I have been able to give some pretty consistent guidance with regard to unit testing which I’d like to share. And not surprisingly it doesn’t prescribe a specific level of testing.

No coverage is never acceptable

While it’s true that some things are harder to test than others I think it’s fair to say that 0% coverage is universally wrong. So the minimum thing I can recommend is everything have at least some coverage. This helps in a number of ways not the least of which is that each subsequent test is much easier to add.

Corollary: “My code can’t be unit tested” is never acceptable

While you may chose to test more lightly (see below) it is always possible to unit test code; there are no exceptions. No matter how crazy the context, it can be mocked. No matter how entangled the global dependencies, they can be faked. The greater the mess, the more valuable it will be to disentangle the problems to create something testable. Refactoring code to enhance its testability is inherently valuable and getting the tests to “somewhere good” is a great way to quantify the value of refactoring that might otherwise go unnoticed.

Unit Tests Cross Check Complex Intent

The very best you can do with a unit test, in fact the only thing you can do, is verify that the code does what you intended it to do. This is not the same as verifying correctness but it goes a long way.

  • you observe the side-effects on your mocks to validate that your code is taking the correct actions
  • you observe the publicly visible state of your system to ensure that it is moving correctly from one valid state to the next valid state
  • you add test-only methods as needed (if needed) to expose properties that need testing but are otherwise not readily visible with the public contract (it’s better if you do this minimally, but for instance test constructors are a common use case)
  • it doesn’t take weeks to unit test logging thoroughly, so great candidate to move fast by testing
  • logged data that is “mostly correct” is likely to become the bane of your existence
  • they protect you against future mistakes
  • well meaning newcomers and others looking to remove dead code and/or refactor can do so with confidence
  • if you’re planning such a refactor, “tests first, refactor second” is a great strategy
  • unit tests can help you trigger those events even if they normally take days, weeks, years
  • important failure cases where we need to purge the cache or so some other cleanup action that are not exactly policy but are essential for correctness can be tested
  • these cases are hard to force in end to end tests and easily overlooked in manual testing
  • in general, situations that are not on the main path but are essential are the most important to verify
  • whatever mistakes developers tend to make, write tests that will find them and stop them
  • even if developer check-ins are 99% right that means in any given week something important is gonna bust, that’s the math of it…
  • weakness: you can only test the interleaves you thought of (but that goes a long way)
  • that’s actually the universal weakness of unit tests

When to Consider Integration Tests

If you can take an entire subsystem and run it largely standalone, or wired out for logging, or anything at all like that really, it creates a great opportunity for what I’ll loosely call “integration testing.” I’m not sure that term is super-well defined actually, but the idea is that more live components can be tested. For instance, maybe you can use real backend servers, or developer servers, and maybe drive your client libraries without actually launching the real UI — that would be a great kind of integration test. Maybe this is done with test users and a stub UI; maybe it’s some light automation; maybe a combination. These test configurations can be used to validate large swaths of code, including communication stacks and server reconfigurations.

When to Consider End to End Tests

Again, it would be a mistake to think that these tests have no place in a testing ecosystem. But like unit tests, it’s important to play to their strengths. Do not use them to validate internal algorithms; they’re horrible at that. They don’t give you instant API level failures near the point of failure, there’s complex logging and what not.

Conclusion

It’s no surprise that a blend of all of these is probably the healthiest thing. Getting to greater than 0% unit-test-coverage universally (i.e. for all classes) is a great goal, if only so that you are then ready to test whatever turns out to be needed.

I’m a software engineer at Facebook; I specialize in software performance engineering and programming tools generally. I survived Microsoft from 1988 to 2017.