Debugging Optimized Code
The first time you ever have to look at a crash dump, or breakpad report, or whatever, from heavily optimized code your eyes may bug out. Looking at these things never gets easy but at some point you stop being quite so gobsmacked by what you see. This article is intended to help you get to that better place a little bit less painfully.
There are many debuggers or “breakpadish” things out there and they will all have different properties. So, as is my custom, I’m going to focus on approximately what is going on here. What I’m writing is based on my own memory of these things from back when I was a youngster and I owned the C/C++ debugger in Visual C.
Let’s start with a quick primer on the sorts of data a debugger has at its disposal.
Debugger Information Basics
There are a few essential pieces of information:
The Line Table
This is a mapping between a code address and a source file/line.
- the mapping from address to file/line is many:1 (a function)
- the mapping from file/line to address is 1:many (not a function)
When I say that the mapping from file/line to address is 1:many I don’t just mean that there are typically many machine instructions for any given line of code. For purposes of this table we typically mean the first address associated with that line. But even mapping to just the first address of the line (so you could set a breakpoint there, for instance) leaves you with a 1:many mapping. For instance, any class template can expand into many different copies of the code, all of which have the same line origins. Likewise, any bit of inline code found in a header file might be processed many times in many different files, and therefore those lines of code appear many times in your executable.
It’s typically OK to assume that given an address you can go back to the unique source line that created it using debug information (e.g. to print that source), but as we’ll see that isn’t 100% safe either. Normally, though, address to source is a well-defined function.
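To make the two directions of the mapping concrete, here is a minimal sketch of a line table in C++. The names `SourceLine`, `LineTable`, `lookup` and `addresses_for` are invented for illustration and do not correspond to any real debug format; the sketch also assumes each row simply marks where the code for a line starts, which is cruder than what real formats record.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical file/line pair; not taken from any real debug format.
struct SourceLine {
    std::string file;
    int line;
    bool operator<(const SourceLine& other) const {
        return std::tie(file, line) < std::tie(other.file, other.line);
    }
};

class LineTable {
public:
    // Record that the code starting at `address` was generated from `where`.
    void add(std::uint64_t address, const SourceLine& where) {
        assert(where.line > 0);  // source lines are 1-based
        by_address_[address] = where;
        by_line_[where].push_back(address);
    }

    // address -> file/line is many:1, so there is at most one answer.
    // Simplification: there is no end marker, so any address past the last
    // row is attributed to the last row.
    std::optional<SourceLine> lookup(std::uint64_t address) const {
        auto next = by_address_.upper_bound(address);  // first row past `address`
        if (next == by_address_.begin()) return std::nullopt;  // below all known code
        return std::prev(next)->second;
    }

    // file/line -> address is 1:many, so the answer is a list of start addresses.
    std::vector<std::uint64_t> addresses_for(const SourceLine& where) const {
        auto it = by_line_.find(where);
        if (it == by_line_.end()) return {};
        return it->second;
    }

private:
    std::map<std::uint64_t, SourceLine> by_address_;
    std::map<SourceLine, std::vector<std::uint64_t>> by_line_;
};
```

If a line in a header was compiled into two places, `addresses_for` hands back both start addresses, which is why a breakpoint on one source line can turn into several breakpoints in the executable.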
The most basic symbolic information is a function mapping the name of every symbol to the address of that symbol. Great. Typically a range of lines corresponds to a function so you might do tricks like “the function consists of every byte from its first address to the address of the next function in order.” That’s usually good but there are cases where the code generation of a function might be split into pieces (e.g. hot/cold splitting for better locality). If that’s the case then the range of addresses that correspond to the function might be quite complex indeed. But the most important address is still the entry-point. The place where you will set a breakpoint on that symbol.
It’s important to be able to go from addresses inside a function back to the function name so that you can make good call-stacks. If code is contiguous that works out pretty well.
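The “first address to the next function” trick can be sketched in a few lines. This is an illustration only: `Symbol` and `function_at` are hypothetical names, and the sketch assumes the symbols are already sorted by entry address.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical symbol record: a name and its entry-point address.
struct Symbol {
    std::uint64_t entry;
    std::string name;
};

// Attribute `pc` to the function with the nearest entry point at or below it.
// This is only right when every function's code is contiguous.
std::string function_at(const std::vector<Symbol>& symbols, std::uint64_t pc) {
    assert(std::is_sorted(symbols.begin(), symbols.end(),
                          [](const Symbol& a, const Symbol& b) { return a.entry < b.entry; }));
    auto next = std::upper_bound(
        symbols.begin(), symbols.end(), pc,
        [](std::uint64_t addr, const Symbol& s) { return addr < s.entry; });
    if (next == symbols.begin()) return "<unknown>";  // below the first symbol
    return std::prev(next)->name;
}
```

With symbols `parse` at 0x1000 and `render` at 0x1400, an address of 0x13ff resolves to `parse`. If the cold half of `parse` had been moved to 0x1900, though, this lookup would cheerfully report `render`, which is exactly the kind of wrong frame name that split code can produce in a naive tool.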
The symbolic information might also include information about what the stack frame looks like, or more generally the locals at some level of detail, i.e. this local is stored at this stack offset, and this other local is in this register, and so forth. The information about the stack frame can get quite complicated because functions can have nested frames. Furthermore in the presence of heavy optimization the location of a variable may change (even frequently) in different parts of the function. In many systems there’s only one copy of the variable layout for the function in the debug info that includes its various nested stacks and so the encoding will be … less than perfect. This kind of choice is made pragmatically — the debug information can be enormous and per-instruction variable information would not be small.
The situation can be even worse than a moving local; it could be that a local has been entirely elided, or merged with some other local. It all seems like a long-running soap with too many actor changes: “The part of el_index will be played by R7 in this episode.”
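For contrast with the single-layout approach, here is roughly what per-range variable locations could look like if the debug info did carry them. Everything here is a sketch under assumptions: `LocationRange` and `locate` are invented names, and the location strings are just labels, not a real encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical record: where one variable lives over one range of code.
struct LocationRange {
    std::uint64_t begin;  // first address where this location is valid
    std::uint64_t end;    // one past the last valid address
    std::string where;    // a label such as "R7" or "[SP+16]"
};

// Where is the variable when execution is at `pc`?
// An empty result means "nowhere": elided, or simply not described here.
std::optional<std::string> locate(const std::vector<LocationRange>& ranges,
                                  std::uint64_t pc) {
    for (const LocationRange& r : ranges) {
        assert(r.begin < r.end);
        if (pc >= r.begin && pc < r.end) return r.where;
    }
    return std::nullopt;
}
```

A debugger working from one fixed layout is effectively using a single range that covers the whole function, so it will happily print whatever is at `[SP+16]` even after `el_index` has moved into R7.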
It’s often the case that to get to the bottom of what’s really going on you have to look at the disassembly. So, yeah, fun for the whole family.
And I’ll totally gloss over type information. Turns out we need to know the types of everything to print out values of stuff. But that’s not too relevant here, so moving on swiftly.
What Stuff Goes Wrong?
Problem #1: Inlining
If you’re looking at a call-stack and it seems like maybe stack frames are missing, it’s possible that the missing code has been inlined into some other function. Some debuggers (e.g. windbg) have funky features that notice that a particular address is in the middle of an inlined block, and they inject a fake frame into the call-stack so it looks like there was a call even though there wasn’t. Remember, the whole point of inlining is to remove those calls. But if your debugger (or breakpad or whatever) doesn’t really know about inlining, or the necessary information isn’t in your system’s debug info, then it will look like there was no call (there wasn’t). So the stack frame looks off (but it isn’t).
Here’s the kicker. The line table is likely NOT off. The inlined code was charged to the line that created it. So if you were to dump the source lines in that area they would tell you the truth. The code is elsewhere. But the algorithm for finding “what function am I in” probably isn’t smart enough to figure that all out. The result might be that you’re looking at a crash that involves use of things that don’t seem to appear in the source code at all. Check your inlines.
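The fake-frame trick can be sketched like this, assuming the debug info records which address ranges came from which inlined function. The names `InlineRecord` and `expand_frame` are hypothetical, and this is not how any particular debugger does it; it only shows the shape of the idea.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record: code in [begin, end) was inlined from `callee`.
struct InlineRecord {
    std::uint64_t begin;
    std::uint64_t end;
    std::string callee;
    int depth;  // 1 = inlined directly into the physical function, 2 = nested, ...
};

// Turn one physical frame into the frames a programmer expects to see,
// innermost first, by injecting a fake frame per enclosing inlined block.
std::vector<std::string> expand_frame(const std::string& physical_function,
                                      const std::vector<InlineRecord>& records,
                                      std::uint64_t pc) {
    std::vector<InlineRecord> hits;
    for (const InlineRecord& r : records) {
        assert(r.begin < r.end);
        if (pc >= r.begin && pc < r.end) hits.push_back(r);
    }
    // Deepest inlined block first, so the result reads like a normal stack.
    std::sort(hits.begin(), hits.end(),
              [](const InlineRecord& a, const InlineRecord& b) { return a.depth > b.depth; });

    std::vector<std::string> frames;
    for (const InlineRecord& h : hits) frames.push_back(h.callee + " [inlined]");
    frames.push_back(physical_function);
    return frames;
}
```

A tool without those records behaves as if `records` were empty: it reports only the physical function, and the frames you were expecting never show up.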
Problem #2: Outlining
If you’re looking at a call-stack and it seems like there is a call to some function that you could not possibly have called on the stack, maybe in the middle somewhere, and then things get sane looking again, it’s possible that you are the victim of outlining.
To save space, the compiler/linker can take code that is common to two possibly unrelated functions and automatically create a “mini-function” to do that job thereby sharing some code and making your program smaller. Typically this only happens if the whole program is being code-generated and linked in one step.
The problem is that now there is a function call where there was none before. In most cases, the name of the “function” will relate to one of the places that it was outlined from but there might be dozens of such sites. That means all but one of them is likely to be “wrong” in a stack in which it appears. Remember this is just some random code that happened to appear in several places and the compiler decided “this is a good time for a function” so it made one; sua sponte.
If you are looking at those lines you will find that the source mapping is equally bizarre. The file/line numbers can only go to one of the sources and so the usual assumption that you can sanely go from an address to the line that generated it has gone up in a puff of smoke. If you walk through the functions in the call-stack, the outlined function probably has source code that doesn’t look like it belongs in the call chain. By comparing what you think should be done in the weird spot to what the debugger is telling you, you can get a guess at what was outlined. But this is not easy. Looking at the disassembly can help you a lot.
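Outlining is easier to picture at the source level, even though real outliners work on generated instructions rather than on source. The following is only an illustration with made-up function names, and the shared code is far too small for a real compiler to bother with.

```cpp
#include <cassert>

// As written: two unrelated functions that happen to contain the same work.
int price_with_tax(int cents) { return (cents * 108 + 50) / 100; }
int scaled_score(int points) { return (points * 108 + 50) / 100; }

// Roughly what an outliner might do: pull the shared work into a helper
// that nobody wrote. The helper's name and its line info (both hypothetical
// here) can only point back at ONE of the places it came from.
int outlined_from_price_with_tax(int v) { return (v * 108 + 50) / 100; }
int price_with_tax_outlined(int cents) { return outlined_from_price_with_tax(cents); }
int scaled_score_outlined(int points) { return outlined_from_price_with_tax(points); }

// Behavior is unchanged; only the call-stack and source mapping get stranger.
void check_outlining_preserves_behavior() {
    for (int v = 0; v <= 1000; v += 37) {
        assert(price_with_tax(v) == price_with_tax_outlined(v));
        assert(scaled_score(v) == scaled_score_outlined(v));
    }
}
```

A crash inside the helper while scoring would show a frame that appears to belong to the tax code, sitting in the middle of a stack that has nothing to do with prices.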
Problem #3: Reordering
Reordering is less likely to affect call-stacks directly, but it can make inlined and outlined stacks even more confusing. What it will do is wreak havoc with your ability to reason over the code execution as you single-step. You may find yourself bouncing all over the source file for a single function, and maybe even leaving that function in a haphazard way, as you discover that the compiler has come up with a “much better order” in which to do the work you have specified. This is often great for space savings but hell on your mind.
Problem #4: Relabelling
In addition to instruction reordering, of course, registers and stack locations are being reassigned. You can easily imagine cases where a simple-looking assignment like “y = x;” turns into no code at all; rather, the compiler is thinking “OK the value now in R2 I will henceforth think of as y AND x” and then, “oh bonus, just in time for me to use the same multiply-by-two case as that other branch, so R2 is actually y and z at the same time but not x anymore since I just doubled it.”
This kind of relabelling probably won’t affect callstacks but it can easily affect the output of debugging tools (windbg’s !analyze tries to show you lots of stuff from the frames that can be weird). It can be quite tricky to interpret the current state when relabelling has happened.
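Here is a tiny illustration of relabelling, written as two C++ functions that compute the same thing. The second is a guess at how an optimizer might think about the first, with a single variable `r2` standing in for the R2 register from the text; what a real compiler does will depend on the compiler, the target and the surrounding code.

```cpp
#include <cassert>

// As written: three named variables.
int as_written(int x) {
    int y = x;
    y *= 2;
    int z = y;
    return z + 1;
}

// How an optimizer might see it: one register that different variables
// "play" at different points in the function.
int as_relabelled(int r2) {  // r2 plays x
    // y = x;   no instruction needed: r2 now plays both x and y
    r2 *= 2;    // r2 plays y, and is no longer x
    // z = y;   no instruction needed: r2 now plays both y and z
    return r2 + 1;
}

// Same results either way; only the debugger's view of x, y and z differs.
void check_relabelling_preserves_behavior() {
    for (int x = -3; x <= 3; ++x) {
        assert(as_written(x) == as_relabelled(x));
    }
}
```

Ask a debugger for `x` after the doubling and, depending on what the debug info says, you may get the doubled value, a stale value, or nothing at all.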
With all these problems combining, things can get quite hairy indeed. “I’m looking at a relabelled register on a re-ordered line of an inline function in my outlined block.” That isn’t likely to be confusing…
Conclusion: Weird Is Not Always Broken
When you see a weird-looking call-stack, if the stack seems well formed at all (like if it’s quite long and goes through several regions) it might actually be legit if all that’s wrong-looking is a spotty frame or two. The normal inlining and outlining operations can easily produce a weird-looking frame. But the clues should be there to help you spot the weirdness by consulting the source code and checking the function frames against the reported source lines. You can make a lot of progress this way. But don’t go too crazy with this stuff. Sometimes stack frames really are corrupted…