C++ for Security and Systems Programming — Objections

Rico Mariani
5 min read · Mar 21, 2024

I was privileged to get early access to Herb’s excellent article on C++ safety. I love its pragmatic approach. I paraphrase its thesis like so: “As far as security goes, we should not let perfect be the enemy of better. Certainly not much better — 98% better. And we can do 98% better. Let’s do that.”

There is no question that as soon as a compiler like the one Herb is talking about becomes available, my team will use it. And probably every team with a C++ codebase should also use it.

And so having said I love the idea — and I’m going to embrace it — I will now proceed to be critical of where it leaves us. Sorry, Herb.

Herb writes, “The immediate problem is that it’s Too Easy By Default™ to write security and safety vulnerabilities in C++ [that could reasonably be caught].” He’s right. Put another way: writing in C++ is not a Pit of Success experience from a security perspective. And I will go further. Writing in Modern C++ is not a Pit of Success experience from a systems software perspective generally.

What do I mean by this? Briefly, if I pick up The Rust Programming Language, read it, and Do What It Says (i.e., write idiomatic Rust), I get excellent code. No qualifiers are needed. The usual patterns are performant, secure, and maintainable. You can reasonably expect to be within a very small percentage of optimal C code if you just Do The Usual Rust Stuff. Energy efficiency tells the story clearly, with Rust coming in at only 1.03x the cost of C compared to C++ at 1.34x. “Just Reading the Book” does not suffice for C++.

If we think more broadly, the C++ situation is far worse than just the 1.34x energy cost, and it is getting worse with each subsequent release of the libraries. Flatly, the C++ libraries (std::*) do not favor systems programmers, and no hypothetical systems-friendly sys::* with different tradeoffs is on the horizon. When asked why I chose to use C over C++ in my last major system project at Meta, I said simply, “In C++ you are about two angle brackets away from a 20k size regression at any given instant.”

The situation is perilous. Consider:

  • The best programmers, the ones most familiar with the standard, cannot reasonably predict what code they will get for many idiomatic C++ patterns.
  • The use of combinations of recommended library features makes this situation worse: the outcome of a composition of library features is even less predictable, even to God Tier C++ programmers.
  • The differences between good and bad patterns are material, e.g., use of control-flow structures like std::any_of can easily increase loop overhead by 8x (16 bytes to 128 bytes) compared to a plain range-for loop. There is actually no upper bound on how much worse the lambda choice might be; 8x is just one I happened to see last week. Or, it might be fine, for now. (The two equivalent forms are sketched in code after this list.)
  • Use of analysis tools does not help you predict code quality because the normal transforms are chaotic (in the mathematical sense). That is, you can try out some patterns with Godbolt, but this doesn’t necessarily tell you what will happen in your own code, and if your own code changes in a small way the result can be materially different code generation.
  • Primitive types are not systems-friendly, first because they aren’t predictably economical, and second because they do not facilitate hazard-free borrowing (the most important pattern); indeed the idiomatic patterns can give you the illusion of safety while providing none. E.g., given const std::string& foo, the string behind foo can still be mutated through another alias, and the reference itself can turn out to be null (one of the sketches after this list illustrates both).
  • Primitive types do not clearly advertise their costs or indeed their nature. Most users of C++ are unaware that a std::shared_ptr is twice as big as a normal pointer and easily 5x as expensive in terms of code generation, to say nothing of being as much as 150 to 300 times slower if locality is unfortunate. You may think “yes, but those are only exotic cases”, but if you use std::shared_ptr everywhere, exotic becomes probable, and the size costs are universal in any case (a sizeof sketch after this list makes the size difference easy to check).
  • “If you use that everywhere you’re in trouble” is a recurring theme in Modern C++. E.g., again, const std::string& provides no additional memory protection compared to const char*; it looks like it provides some lifetime and null safety, but it does not, and the code will be materially worse, if only because the single most useful thing you can do with a std::string is call c_str() on it (because no system APIs take a std::string) and c_str() isn’t a no-op. In all the other cases you at least get an extra indirection for your trouble.
  • std::span and std::string_view actually give you something in terms of sharing and safety, but it took almost two decades for the C# ecosystem to get Span<T> into all the places it needs to be for it to be consistently useful, and C++ has a long way to go on that score. It didn’t have to be so; Rust started in the right place. (A small string_view sketch follows this list.)
  • Even the most basic classes like std::string have this bizarre property that if you ask a typical developer a simple question like “how big is a std::string?” they won’t be able to answer correctly, and if you ask them the cost of string assignment, they won’t know that there’s always a full copy and probably a heap allocation. This is not a friendly design if the cost matters (again, see the sketches after this list).
  • Modern C++’s desire to put lambdas everywhere falls down compared to Rust because the compiler doesn’t always seem to be able to determine that the lambda plus <algorithm> combination is just emulating control flow and that it never needs to materialize the lambda at all. The fact that the degree of inlining can’t be accurately predicted contributes to the chaotic nature of the code output. Should functors be involved, things become even more complex; C++ functors are not cheap.
  • Meta-programming is the bane of my existence. Features that were designed for library developers are misapplied by everyday developers with great aspirations and limited experience. The result is that a few lines of code that could have been simple can take an hour to fully understand, and the additional template bloat destroys compilation times. Yet we encourage developers to write templates. The situation is worse still if you are using LTO/LTCG.
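
To make a few of these points concrete, here are some minimal sketches. They are illustrations under stated assumptions, not measurements; exact sizes and costs depend on your compiler, standard library, and optimization settings. First, the std::any_of point: these two functions do the same thing, but whether the lambda version compiles down to the same code as the plain loop is not something you can predict from the source alone.

```cpp
#include <algorithm>
#include <vector>

// Two functionally identical searches. Whether the std::any_of version
// generates the same code as the plain loop depends on the compiler, the
// standard library implementation, and the optimization level.
bool has_negative_any_of(const std::vector<int>& v) {
    return std::any_of(v.begin(), v.end(), [](int x) { return x < 0; });
}

bool has_negative_loop(const std::vector<int>& v) {
    for (int x : v) {
        if (x < 0) return true;
    }
    return false;
}
```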
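
Next, the const std::string& illusion. The function here is made up for illustration; the point is that the const reference promises neither that the underlying string is immutable nor that a string is actually there, and the main thing you end up doing with it is calling c_str() anyway because system APIs take C strings.

```cpp
#include <cstdio>
#include <string>

// Hypothetical function for illustration: the const reference does not make
// the string immutable, and it cannot guarantee there is a string at all.
void report(const std::string& name) {
    // If the caller formed this reference by dereferencing a null pointer,
    // the program is already in undefined-behavior territory here.
    std::printf("%s\n", name.c_str());  // no system API takes a std::string,
                                        // so this is what you end up doing
}

int main() {
    std::string s = "hello";
    const std::string& ref = s;   // a "const" view of s
    s = "a much longer string that forces a reallocation";
    report(ref);                  // the data behind ref changed anyway

    std::string* p = nullptr;
    // report(*p);                // compiles cleanly; undefined behavior
}
```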
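
The std::shared_ptr size claim is easy to check on your own toolchain. On common 64-bit implementations it is two pointers wide because it carries both the object pointer and a pointer to the control block holding the reference counts and the deleter.

```cpp
#include <cstdio>
#include <memory>

int main() {
    // Exact numbers are implementation-specific; on typical 64-bit targets
    // a raw pointer and a unique_ptr are 8 bytes and a shared_ptr is 16.
    std::printf("raw pointer:     %zu bytes\n", sizeof(int*));
    std::printf("std::unique_ptr: %zu bytes\n", sizeof(std::unique_ptr<int>));
    std::printf("std::shared_ptr: %zu bytes\n", sizeof(std::shared_ptr<int>));
}
```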
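
std::string_view does genuinely borrow: it is a pointer and a length, with no copy and no allocation. What it does not do is tie its lifetime to the data it points at, which is exactly the part Rust's borrow checker handles for you.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// A string_view is just a pointer plus a length: no copy, no allocation.
// Nothing, however, tracks the lifetime of the data it points at.
std::size_t count_spaces(std::string_view text) {
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') ++n;
    }
    return n;
}

int main() {
    std::string s = "one two three";
    std::printf("%zu\n", count_spaces(s));                  // borrows from s
    std::printf("%zu\n", count_spaces("a string literal")); // borrows the literal
}
```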
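
Finally, the “how big is a std::string?” question and the cost of assignment. The number printed below is implementation-specific (24 or 32 bytes is typical on 64-bit platforms), which is rather the point.

```cpp
#include <cstdio>
#include <string>

int main() {
    // Typical answers: 32 bytes (libstdc++, MSVC) or 24 bytes (libc++),
    // plus a heap allocation once the text outgrows the small-string buffer.
    std::printf("sizeof(std::string) = %zu\n", sizeof(std::string));

    std::string a(100, 'x');  // long enough to defeat the small-string buffer
    std::string b;
    b = a;                    // copy assignment: copies all 100 characters
                              // and, in this case, allocates a buffer for b
    std::printf("same buffer after assignment? %s\n",
                a.data() == b.data() ? "yes" : "no");  // prints "no"
}
```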

What does this all mean? Simply this: well-meaning, educated, expert-level programmers get this stuff wrong reliably and it materially affects the quality of the generated code. In response to this, it is normal for teams to craft additional documents that describe “How to Use C++ Really” in their universe. Typically, these documents can be briefly summarized as “Just Say No to all that new-fangled modern stuff.”

This is hardly a victory.

When C++ starts making systems-programming-friendly choices (again), or alternatively spontaneously spawns something like sys::, then I might see a long-term future for it. But until then I can only see trying to limit the damage developers can do to themselves and their codebase.


Rico Mariani

I’m an Architect at Microsoft; I specialize in software performance engineering and programming tools.