[originally published 12/14/2017]
Threads are tricky. Constrained Threads are even trickier. By Constrained Threads I mean threads that are allowed to do some things but definitely not other things. For example, in some systems when a timer fires a worker thread executes the timer callback and that thread is not supposed to disturb the UI. In other systems, certain threads must avoid certain resources because of latency or starvation that would cripple the system overall. There are other similar kinds of constraints that can happen in callbacks and workers. It’s a common problem.
How do you handle this? There are several strategies, all of which require some kind of enlightenment. What you must not do is resolve that “everyone who works on this code shall know the rules and not break them.” People make mistakes, and if the consequences of getting it wrong are severe (security, privacy, reliability, money, whatever) then you really need a better defense than “We’re gonna try real hard.”
There are three good approaches. I prefer #3, but it’s naturally the most work.
#1. Use a cop/linter.
If you can codify a white list or a black list of key operations that are allowed or not allowed then you can do static analysis on the code (in .NET or Java it’s often practical to analyze the IL or bytecode rather than the source). This kind of analysis can tell you at build time that something bad is about to happen.
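Here is a minimal sketch of what such a cop could look like, written in Python with the standard `ast` module. The deny list and the names in it (`update_ui`, `open_database`) are made up for illustration; a real check would use whatever operations your constrained threads must avoid, and would probably need to resolve aliases and imports, which this does not.

```python
import ast

# Hypothetical deny list: names the constrained thread code must never call.
FORBIDDEN_CALLS = {"update_ui", "open_database"}


def find_forbidden_calls(source):
    """Return sorted (line, name) pairs for each call to a forbidden name."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle plain calls f() and attribute calls obj.f().
        if isinstance(func, ast.Name):
            name = func.id
        elif isinstance(func, ast.Attribute):
            name = func.attr
        else:
            continue
        if name in FORBIDDEN_CALLS:
            violations.append((node.lineno, name))
    return sorted(violations)


# A made-up timer callback that breaks the rule on its last line.
SAMPLE = """
def on_timer(state):
    state.count += 1
    update_ui(state)
"""

if __name__ == "__main__":
    for line, name in find_forbidden_calls(SAMPLE):
        print(f"line {line}: forbidden call to {name}")
```

Run as part of the build, a non-empty result would fail the build before the bad call ever ships.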
#2. Enforce invariants in the code.
If you can identify the systems that are allowed (or disallowed), you can add guards in the code so that a bogus call is detected at runtime, the moment it is made. This spares you the nasty delay between starting a bad pattern and that pattern becoming a serious problem: you can, in principle, fail very quickly (and find the failure), and you don’t have to solve the halting problem to do so.
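As a rough sketch of such a guard, assuming Python and a per-thread flag: the thread is marked as constrained while it runs a callback, and any guarded entry point refuses to run when the flag is set. All the names here (`run_constrained`, `forbidden_on_constrained_threads`, `update_ui`, `ConstraintViolation`) are hypothetical.

```python
import functools
import threading

_context = threading.local()  # per-thread flag; absent means unconstrained


class ConstraintViolation(RuntimeError):
    """Raised when a constrained thread calls something it must not."""


def run_constrained(fn, *args):
    """Run fn with the current thread marked as constrained."""
    previous = getattr(_context, "constrained", False)
    _context.constrained = True
    try:
        return fn(*args)
    finally:
        _context.constrained = previous


def forbidden_on_constrained_threads(fn):
    """Decorator: fail fast if a constrained thread calls fn."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if getattr(_context, "constrained", False):
            raise ConstraintViolation(
                f"{fn.__name__} called from a constrained thread")
        return fn(*args, **kwargs)
    return wrapper


@forbidden_on_constrained_threads
def update_ui(text):
    # Stand-in for a real UI call.
    return f"UI shows {text}"
```

The failure shows up at the offending call site, with a stack trace, instead of as a mysterious hang or glitch some time later.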
#1 and #2 combine nicely.
#3. Constrain the thread with a capability style model so that doing bad things is not possible.
This is really two steps:
#3a: Get rid of every global from the thread code and never allow any global ever again.
You can use a linter or a cop to do this. Once you eradicate any possibility that global methods can be called or global objects accessed you’re left with exactly the objects that are visible to the thread; the ones that were handed to it on a silver platter. If anyone adds a new object it becomes very clear and can be audited at that time. But you can default to “oh hell no.”
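A cop for this could be sketched as follows, again in Python with `ast`. It reports every name a function reads that is neither a parameter, nor assigned inside the function, nor on a small allow list. The allow list and the sample callback are assumptions for illustration, and the check is deliberately simplistic: it only looks at plain positional parameters and ignores imports, nested scopes, and keyword-only arguments.

```python
import ast

# Hypothetical audited allow list; everything else defaults to "no".
ALLOWED_GLOBALS = {"len", "min", "max"}


def global_references(source, func_name):
    """Return sorted names that func_name reads but does not define locally."""
    tree = ast.parse(source)
    func = next(
        node for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and node.name == func_name
    )
    local_names = {arg.arg for arg in func.args.args}
    loads = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                local_names.add(node.id)  # assigned inside the function
            else:
                loads.add(node.id)        # read somewhere in the function
    return sorted(loads - local_names - ALLOWED_GLOBALS)


# A made-up callback that sneaks in a global.
SAMPLE = """
def on_timer(queue, now):
    batch = queue.take(10)
    AUDIT_LOG.append((now, len(batch)))
    return batch
"""

if __name__ == "__main__":
    print(global_references(SAMPLE, "on_timer"))
```

Anything the check reports either gets removed or gets turned into an explicit parameter, which is exactly the point where someone can audit it.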
#3b: Constrain the heck out of the objects you give that thread
Do not give the thread any global handles or any other useful handles, nor any general purpose functionality and certainly not any God Objects. Wrap any resources it needs with simple helpers that allow the thread to do exactly the necessary things and nothing else. Those helper objects can be quite specific and they will literally make it impossible for you to accidentally do something that you are not allowed to do. Any time a new capability is added it can be audited. Those capabilities can be scrubbed by a linter/cop as well but generally a very small set is all that is needed.
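A small sketch of such a helper, assuming a worker that only ever needs to append lines to a shared log. `AppendOnlyLog` and `timer_callback` are invented names. Note that Python cannot truly hide the underlying list (the leading underscore is only a convention), so here the wrapper makes misuse hard to do by accident rather than strictly impossible; a language with real access control, or a linter rule against touching underscore names, closes that gap.

```python
import threading


class AppendOnlyLog:
    """Narrow capability: the holder can append a line and nothing else."""

    __slots__ = ("_lines", "_lock")

    def __init__(self, lines, lock):
        self._lines = lines
        self._lock = lock

    def append(self, line):
        with self._lock:
            self._lines.append(str(line))


def timer_callback(log, tick):
    # The callback sees only what it was handed: no globals, no file
    # handles, no way to read or clear the log through this interface.
    log.append(f"tick {tick}")


def run_demo():
    lines = []              # owned by the creator of the thread
    lock = threading.Lock()
    worker = threading.Thread(
        target=timer_callback, args=(AppendOnlyLog(lines, lock), 1))
    worker.start()
    worker.join()
    return lines


if __name__ == "__main__":
    print(run_demo())
```

Adding a second method to `AppendOnlyLog` is a visible, reviewable change, which is what makes the audit practical.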
Even sub-parts of the thread can be constrained by simply not exposing those parts to some of the overall capabilities the thread has.
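For instance, in this sketch the worker holds two capabilities, a counter and a sender, but its parsing step is handed only the counter. The capabilities are plain callables and every name is hypothetical.

```python
import threading


def parse(raw, count):
    # This sub-part can bump a counter; it has no way to send anything.
    count("parsed")
    return raw.strip().upper()


def worker(raw_items, count, send):
    # The worker as a whole holds both capabilities and passes on only
    # the one that parse() needs.
    for raw in raw_items:
        send(parse(raw, count))


def run_demo():
    counts, sent = {}, []

    def count(name):
        counts[name] = counts.get(name, 0) + 1

    thread = threading.Thread(
        target=worker, args=([" a ", "b"], count, sent.append))
    thread.start()
    thread.join()
    return counts, sent


if __name__ == "__main__":
    print(run_demo())
```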
The combination of 3a and 3b will mean that most of the time you can add code to the thread with relative impunity knowing it’s very hard to get it wrong. Only when new capabilities are added do you have to be super careful and in those cases a more thorough audit can be easily triggered.
None of this is perfect, but it’s much better than “We will try hard.”