Testing and Instrumentation

  • you can easily hit diminishing returns in testing, and people often over-invest
  • there are so many combinations that it’s simply not feasible to prevent all your problems with testing
  • therefore, you should set things up so that you can detect problems in production when they happen, and
  • you want it to be the case that, if bad stuff happens in production, it’s not that bad

Testing has diminishing returns so don’t test

Notwithstanding that it’s easy to go overboard, most systems you will encounter are not remotely in danger of doing so… it’s far more common to encounter systems with single digit % coverage.

High levels of coverage are inherently overkill

You really have to decide for your own system what the appropriate level of coverage is going to be and stick to it. It’s actually not that hard to get to 100% block coverage in any codebase, and in many (most?) cases, for professional software, that’s going to be worth it. I consider it “ante”.

  • for native code, you can run the tests with ASAN, TSAN, LeakSAN etc.
  • you can hook all your logging code and ensure that no PII ever leaks
  • you can enforce other invariants, such as all UTF8 conversions are checked, or all SQLite statements are finalized
  • you can use the correctness tests as a base for fuzzing in various areas

Sanity Check

ASAN/TSAN/LeakSAN etc. are self-evident, I think: you just build the test suite with the appropriate compilation mode and run it as usual. Any test failures give you immediate, actionable signal. This is really great at stopping huge classes of bugs, such as wild pointers and double frees.

PII

If you set up your test infra so that all sources of test PII are readily recognized (e.g. they have a poison pattern) then you can look for that pattern in places it does not belong. For instance, maybe every user name in the test suite includes the text “$user”. You can now shim all the logging APIs, and if any “$user” ever appears you fail the test on the spot. Likewise you can look for “$user” in any stored files and ensure it is not anywhere it’s not supposed to be. Note that many applications are in the business of storing PII, but that doesn’t mean it’s allowed to be everywhere. It’s supposed to go in certain exact places, and you can automatically scan test results to ensure it isn’t anywhere inappropriate.
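Here is a minimal sketch of the poison-pattern idea in Python. It is an illustration under assumptions, not the author’s test infra: the names `POISON`, `PiiLeakError`, `PoisonDetectingHandler`, `install_pii_guard` and `files_containing_poison` are all made up for this example, and it relies only on the standard `logging` and `pathlib` modules.

```python
import logging
from pathlib import Path

# Hypothetical marker: every user name created by the test suite contains it.
POISON = "$user"


class PiiLeakError(AssertionError):
    """Raised when poisoned test data shows up somewhere it should not."""


class PoisonDetectingHandler(logging.Handler):
    """Log handler that fails the test on the spot if the poison is logged."""

    def emit(self, record):
        message = record.getMessage()  # message with its arguments applied
        if POISON in message:
            raise PiiLeakError(f"PII leaked into log: {message!r}")


def install_pii_guard(logger):
    """Attach the guard to a logger; the exception reaches the logging caller."""
    handler = PoisonDetectingHandler()
    logger.addHandler(handler)
    return handler


def files_containing_poison(root, allowed=()):
    """Return files under root that contain the poison but are not allowed to."""
    allowed_paths = {Path(root, name) for name in allowed}
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path not in allowed_paths:
            if POISON.encode() in path.read_bytes():
                hits.append(path)
    return hits
```

A test harness would install the guard once before running the suite, then call `files_containing_poison` on the output directory afterwards, passing the few files that are supposed to hold user data as `allowed`.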

Invariants

The test infra I use has a facility to shim/fake all of SQLite. This is useful for creating whatever error conditions you might need to create, which is very normal. In addition, though, the SQLite shims also track prepared and finalized statements, and the test system will fail your test should these calls not balance. This means you can be confident that statements are getting finalized even on error paths or exotic paths. This isn’t perfect, as we’ll see below, but like ASAN it goes a long way.
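The balance check can be sketched roughly like this. It is a toy stand-in, not the real shim: `TrackingSqliteShim`, its `prepare`/`finalize` methods and the two `count_rows_*` functions are hypothetical, and nothing here talks to SQLite. It only shows the bookkeeping that lets a test fail when a statement is left open.

```python
class StatementLeakError(AssertionError):
    """Raised when prepare and finalize calls do not balance."""


class TrackingSqliteShim:
    """Hypothetical fake that only tracks which statements are still open."""

    def __init__(self):
        self._next_id = 0
        self._open = {}  # statement id -> SQL text

    def prepare(self, sql):
        self._next_id += 1
        self._open[self._next_id] = sql
        return self._next_id

    def finalize(self, stmt_id):
        if stmt_id not in self._open:
            raise StatementLeakError(f"bad finalize of statement {stmt_id}")
        del self._open[stmt_id]

    def assert_balanced(self):
        """Called by the test system at the end of every test."""
        if self._open:
            leaked = sorted(self._open.values())
            raise StatementLeakError(f"unfinalized statements: {leaked}")


def count_rows_leaky(db, fail=False):
    stmt = db.prepare("SELECT COUNT(*) FROM t")
    if fail:
        return None  # bug: the error path skips finalize
    db.finalize(stmt)
    return 0


def count_rows_fixed(db, fail=False):
    stmt = db.prepare("SELECT COUNT(*) FROM t")
    try:
        return None if fail else 0
    finally:
        db.finalize(stmt)  # runs on the error path too
```

With this in place, a test that exercises the error path of `count_rows_leaky` fails in `assert_balanced` even though its own assertions pass.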

  • run the test as usual, but fail the nth SQLite operation using the counts you previously got
  • verify that ASAN etc. still come up clean, and all invariants still are, well, invariant
  • to pass, the test should also report “failed” for each of these variations; otherwise either an error case was ignored or the test did not properly detect a failure case, and either way a fix is required
  • after this you can be sure that every SQLite API call has at least minimal error checks where needed
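The steps above can be sketched as a small driver loop. Everything here is hypothetical (`FaultInjectingShim`, `InjectedFault`, `run_with_every_fault` and the two `save_*` functions), and the assumption is that a test “reports failed” by raising an exception: run once to count the calls, then rerun once per call with that call forced to fail.

```python
class InjectedFault(Exception):
    """The error the shim raises in place of a real SQLite failure."""


class FaultInjectingShim:
    """Hypothetical fake: counts operations and fails the nth one on request."""

    def __init__(self, fail_at=None):
        self.calls = 0
        self.fail_at = fail_at

    def op(self, name):
        self.calls += 1
        if self.calls == self.fail_at:
            raise InjectedFault(f"injected failure in call {self.calls} ({name})")


def run_with_every_fault(test_body):
    """Return the call numbers whose injected failure went unreported."""
    baseline = FaultInjectingShim()
    test_body(baseline)  # must pass cleanly; also gives us the call count
    unreported = []
    for n in range(1, baseline.calls + 1):
        shim = FaultInjectingShim(fail_at=n)
        try:
            test_body(shim)
        except Exception:
            continue  # the test reported "failed", which is what we want
        unreported.append(n)  # failure was swallowed or never noticed
    return unreported


def save_good(db):
    db.op("prepare")
    db.op("step")
    db.op("finalize")


def save_swallows(db):
    db.op("prepare")
    try:
        db.op("step")
    except InjectedFault:
        pass  # bug: the error is ignored
    db.op("finalize")
```

An empty result means every injected failure surfaced. For `save_swallows` the loop flags call 2, the `step` whose error is silently dropped.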

When To Stop

You could keep adding tests literally forever. After 100% block coverage comes 100% branch coverage, which is a lot harder to achieve; few codebases (e.g. SQLite) even aspire to this. If your library is going to be used on a few billion devices and you want a regular cadence, you might also want this much coverage, but it’s likely overkill.

  1. You have to know what kinds of mistakes your devs tend to make, and which ones you simply cannot afford to allow into production (e.g. security, privacy). Focus your tests there.
  2. For everything else you need insight from production. That means high quality actionable instrumentation, and you better have tests for THAT. If I had a dime for every bit of useful logging that went untested and shortly became useless… Telemetry is Oxygen.
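Testing the instrumentation itself can be as simple as capturing what gets logged and asserting on it. The sketch below is one way to do that with Python’s standard `logging` module. The `telemetry` logger name, the `sync_failed` event, its `failed_count` and `total` fields, and the `sync_items` function are all invented for the example.

```python
import logging


class CapturingHandler(logging.Handler):
    """Collects log records in memory so a test can inspect them."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


telemetry = logging.getLogger("telemetry")  # hypothetical telemetry channel


def sync_items(items):
    """Toy code under test: negative items count as failures."""
    failed = [item for item in items if item < 0]
    if failed:
        telemetry.error(
            "sync_failed",
            extra={"failed_count": len(failed), "total": len(items)},
        )
    return len(items) - len(failed)


def test_sync_failure_is_reported():
    handler = CapturingHandler()
    telemetry.addHandler(handler)
    try:
        sync_items([1, -2, 3, -4])
    finally:
        telemetry.removeHandler(handler)

    events = [r for r in handler.records if r.getMessage() == "sync_failed"]
    assert len(events) == 1
    assert events[0].levelno == logging.ERROR
    # Fields passed via extra= become attributes on the record.
    assert events[0].failed_count == 2
    assert events[0].total == 4
```

If someone later renames the event or drops a field, this test fails, so the logging does not quietly rot into something nobody can act on.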

Rico Mariani

I’m a software engineer at Facebook; I specialize in software performance engineering and programming tools generally. I survived Microsoft from 1988 to 2017.