Big O-no notation
Like everything else I write, the following will only be approximately correct.
For the most part, software performance engineering has almost nothing to do with the big O of algorithms and in fact big O can easily lead you to poor choices. In practice, the part that the notation tells us to ignore, the constant factor, is often the only part we control, and it is the place where improvements will most likely come.
It does happen from time to time that you find some O(n²) algorithm that can be economically replaced by something that is O(n), or O(n) that can be replaced by O(1), but those are the rare birds. When that happens, it's usually a sign that someone did something really silly rather than that you're especially smart. It's much more likely that you'll find a way to shave 10% off of an already pretty good algorithm, and that might be a major victory. In a lot of contexts 10% is huge.
I can’t resist telling an old story at this point. I was once asked the following puzzle, which probably many of you have heard before. Suppose you have an arbitrarily long linked list. Can you write an algorithm that tells you if it has a cycle in linear time and constant storage?
I barely even thought about it and then said "no". Then I produced a correct proof that it can't be done. The gist of the proof was that constant storage means the algorithm can be modeled as a finite state automaton; I described the sequence of pointers you would see as an input language and showed, trivially via the pumping lemma, that it isn't regular. QED.
At that point Gerd, the fellow who challenged me, was very frustrated; he said I was crazy and showed me the two-pointer solution. I said, this only works if the number of nodes is less than 2³², and you said "arbitrarily long", which means your answer isn't correct, because no fixed-size pointer can address arbitrarily many nodes. If I'm allowed to assume < 2³² nodes, then I can give you an answer that runs in constant time. But you won't like the constant.
In case you are wondering, the constant-time answer is: follow the next pointer in the list at most 2³² times. If you don't find the end in that many steps, then it has a cycle. This is clearly an O(1) algorithm. Woo hoo.
I think we can agree that the O(n) algorithm is much better…
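For the record, here's roughly what the two-pointer trick (Floyd's tortoise and hare) looks like; this is just a sketch, and the node type here is made up for illustration.

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node type, just for illustration. */
typedef struct node {
    struct node *next;
} node;

/* The slow pointer advances one step per iteration, the fast pointer two;
   if the list has a cycle the two must eventually meet inside it. */
bool has_cycle(const node *head)
{
    const node *slow = head;
    const node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return true;   /* the pointers met: there's a cycle */
    }
    return false;          /* fell off the end: no cycle */
}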
This is a silly/extreme example but there is an important lesson there. As a practical matter, some of your input sizes are bounded, often with very small bounds, and so they don't properly belong in the O(...) part of the analysis; they belong more in the "constant you're supposed to ignore but shouldn't" part.
Let me give you a more real example. Suppose you have to implement the find_first_of function. It takes two strings and returns the index, in the first string, of the first occurrence of any character in the second string. It returns -1 if there is no match.
int find_first_of(char *haystack, char *needles);
And I’ve made this simple but even the signature is complicated if you have to consider things like “the size of the string might be bigger than an int”. I’m going to disregard that stuff just to focus on the algorithmic pieces.
Let’s say that the size of the haystack is n, and the size of the needles is k.
The most naive thing you could do is walk the characters of the haystack in order, checking each one against every character of the needles. That's going to be O(nk). Great. There are obvious things you could try to make it better. For instance, you might first put the needles in a set of some kind, maybe something with an O(1) membership test. If you did that, you'd end up with something maybe O(n+k).
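Just to make the naive version concrete, here is a sketch of the O(nk) nested loop, using the signature above and ignoring the size caveats as promised.

/* Naive version: for each haystack character, scan all of the needles. */
int find_first_of(char *haystack, char *needles)
{
    for (int i = 0; haystack[i] != '\0'; i++) {
        for (char *p = needles; *p != '\0'; p++) {
            if (haystack[i] == *p)
                return i;   /* first position matching any needle */
        }
    }
    return -1;              /* no match */
}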
All of this kind of thinking could easily lead you to a bad place.
To actually come up with something that will help, the typical sizes of n and k are decisive. For instance, if k is typically quite small, say less than 5, you could easily find that checking those 5 characters in order is just as fast as some kind of set lookup. And again, if k is small then really this whole O(nk) business is a bit of a farce; it's O(n) in the first place.
What about n, is n going to be very large? If it is, then some set-based optimization of the needles is more likely to be worth it; but if it's very large, then effects like virtual memory are going to come into the picture and the time is likely to be dominated by disk i/o anyway. You might find the needle optimizations to be barely measurable.
Maybe you could go with a bit-vector for the needle set: with an array of 32 bytes (256 bits, one for each possible character value) holding a bit for every character that is a needle, you could create that set pretty fast, with only stack allocation. Even fairly small n is likely to benefit.
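A sketch of what that might look like, assuming 8-bit characters so 32 bytes (256 bits) covers every possible needle value; the name is made up and the details are just one way to do it.

#include <stdint.h>
#include <string.h>

/* Bit-vector variant: 256 bits on the stack record which byte values
   are needles, then it's one bit test per haystack character. */
int find_first_of_bitset(char *haystack, char *needles)
{
    uint8_t set[32];               /* 32 bytes = 256 bits, one per value */
    memset(set, 0, sizeof(set));

    for (unsigned char *p = (unsigned char *)needles; *p != '\0'; p++)
        set[*p >> 3] |= (uint8_t)(1u << (*p & 7));

    for (int i = 0; haystack[i] != '\0'; i++) {
        unsigned char c = (unsigned char)haystack[i];
        if (set[c >> 3] & (1u << (c & 7)))
            return i;              /* first haystack character in the set */
    }
    return -1;                     /* no match */
}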
Is there a clear answer? Well, if this find_first_of business is important to your program then probably some kind of specialized implementation is worth it, and the choices are much less likely to be informed by big O analysis than they are by the typical sizes and constants. And if it isn't important, why are we even talking about it? Use any implementation that happens to be handy and call it good.
I find it ironic that the part of the analysis that the big O notation tells us to ignore ends up being the thing that matters the most. That’s because big O is designed to compare algorithms, not to compare execution cost.
In the real world you mostly see O(1), O(lg(n)), O(n), O(n²), O(n*lg(n)). Anything worse is probably O(forget about it) *groan*.