Array Bounds Checking in 2024

Rico Mariani
3 min read · Jan 2, 2024


I was chatting with a friend of mine about the costs we expect for this kind of thing, and I recalled an experiment I did in the .NET JIT a little over 15 years ago: I hacked out the bounds checking, ran the full test suite for the CLR, and observed about a 2% speedup with the bounds checks literally ripped out.

But, says me, compilers are much better at this now, and processors have better static and dynamic branch prediction; I bet the overhead is even lower than it used to be.

So he asked me if I had any evidence of this, even anecdotal.

At this point I remembered that I had recently published Wrathmark, and it’s basically nothing but array access.

There’s one hack in the inline array version to try to remove some bounds checks. What happens if I take that out?

Let’s hack out that &7

It’s invariant that j is between 0 and 7, so j & 7 is a no-op; it exists only to tell the JIT it doesn’t need to do a bounds check. The data member is an inline array of size 8 (inline array support is new in C# 12, on .NET 8).
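In code the pattern looks roughly like this, as a minimal sketch with hypothetical names (the real code is in Wrathmark); it assumes C# 12 / .NET 8 for the InlineArray attribute:

```csharp
using System.Runtime.CompilerServices;

var board = new Board();
board.Set(3, 42);
System.Console.WriteLine(board.Get(3)); // prints 42

// Hypothetical sketch of the Wrathmark pattern, not the actual code.
[InlineArray(8)]
struct Row8
{
    private ulong _element0; // the compiler expands this to 8 inline ulongs
}

class Board
{
    private Row8 data;

    public void Set(int j, ulong v) => data[j & 7] = v;

    // When j is invariantly 0..7, "& 7" changes nothing at runtime, but it
    // proves to the JIT that the index fits the length-8 inline array, so
    // the bounds check can be elided.
    public ulong Get(int j) => data[j & 7];
}
```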

How do these compare?

With hack:

Evaluated 1,474,559,545 boards in 1:54.689
12,856,914 boards/sec
Total time used: 1:54.689

Without hack (i.e., bounds checks present):

Evaluated 1,474,559,545 boards in 1:53.442
12,998,241 boards/sec
Total time used: 1:53.442

The hack doesn’t help at this point; the difference above is within the run-to-run margin of error.

Keep in mind that the Othello program is basically nothing but arrays, and it is small enough to run hot in cache; it basically never misses the cache.

Now this change isn’t all the array accesses by any means, but the fact is that in basically all the other cases no hack was ever needed, because the checks can be trivially elided in patterns like for (x = 0; x < 8; x++) {...}.
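For instance, the classic shape the JIT handles is a loop whose bound is the array’s own length (or, for a fixed-size array, a matching constant); a minimal sketch:

```csharp
// The comparison i < a.Length proves every a[i] is in bounds, so the JIT
// can hoist the check out of the loop and emit no per-element branch.
// With a fixed-size (inline) array, a constant bound like i < 8 works the
// same way because the length is known at compile time.
static ulong Sum(ulong[] a)
{
    ulong total = 0;
    for (int i = 0; i < a.Length; i++)
        total += a[i];
    return total;
}

System.Console.WriteLine(Sum(new ulong[] { 1, 2, 3 })); // prints 6
```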

As I mentioned, the code above used “inline arrays” — a feature that has been in .NET forever but was recently exposed in C#. The idea is that you can define an array of a fixed length. This is important: if the array length isn’t known at compile time then bounds checking becomes much harder.

If you use standard arrays and change nothing else, you get these numbers:

Evaluated 1,474,559,545 boards in 2:17.487
10,725,005 boards/sec
Total time used: 2:17.487

The hack doesn’t help because even if the index is known to be between 0 and 7 the array length is not!
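A sketch of why, with hypothetical names: even a masked index can’t save a regular array, because the JIT can’t assume its length.

```csharp
Lut.Set(3, 5);
System.Console.WriteLine(Lut.Get(3)); // prints 5

class Lut
{
    // A regular array: even though j & 7 is provably in [0, 7], this field
    // could be reassigned to a shorter array at any time, so the JIT still
    // loads row.Length and compares it on every access.
    static ulong[] row = new ulong[8];

    public static ulong Get(int j) => row[j & 7];
    public static void Set(int j, ulong v) => row[j & 7] = v;
}
```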

The upshot of this is that the overhead associated with bounds checking is much more about the use of good data types than it is about the actual branches. Fixed-size arrays get great elision; variable-sized arrays can pay a huge cost.

Again, this program is basically nothing but array lookups all the time; the board is arrays of bits. So of course YMMV. But in this case the difference is about 18%. Huge!

The major code difference: arrays in the inline array version all look like this (C# 12 / .NET 8 only):

static x256<ulong> bit_values; // an inline ulong array of length 256

vs. this version, which works on every version of C#:

static ulong[] bit_values = new ulong[256];
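For reference, a type like x256&lt;ulong&gt; can be built with the InlineArray attribute; this is a plausible sketch, and the actual Wrathmark definition may differ:

```csharp
using System.Runtime.CompilerServices;

x256<ulong> bit_values = default;
bit_values[5] = 7;
System.Console.WriteLine(bit_values[5]); // prints 7

// 256 elements of T stored inline in the struct itself; the length is part
// of the type, so the JIT knows it at compile time.
[InlineArray(256)]
public struct x256<T>
{
    private T _element0; // the compiler expands this to 256 slots
}
```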

Give the compiler a fighting chance and the overhead basically vanishes.

Of note:

  • Inline arrays have been usable in managed C++ forever, so JIT support for them is not new; what’s new is the C# syntax.
  • This program does not allocate after startup; all arrays are recycled in both versions, so GC cost is zero. It’s a stress test for arrays and other compute-centered features.
  • I said it before, but YMMV: this is a benchmark.


Rico Mariani

I’m an Architect at Microsoft; I specialize in software performance engineering and programming tools.