
GCC gOlogy: studying the impact of optimizations on debugging - ingve
https://www.fsfla.org/~lxoliva/writeups/gOlogy/gOlogy.txt
======
userbinator
More times than I'd like, I've encountered GDB refusing to show the value of a
variable, claiming it's "optimized away".

No, it's sitting right there in a register, and I know that because I just
stepped through, instruction-by-instruction, the code that computed its value.
Fortunately it does not claim the register has been "optimized away" too, but
the experience of debugging optimised code could definitely use a lot of
improvement.

(This isn't exclusive to GCC/GDB either --- I've seen the same with MSVC and
Visual Studio, but in my experience the latter tends to be somewhat better.)
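
A minimal sketch of the situation (hypothetical example; the exact behaviour
depends on compiler version and target), built with something like
"gcc -O2 -g":

    int scale(int a, int b)
    {
        int t = a * b;   /* with -O2, t typically lives only in a register */
        return t + a;    /* breaking here, "print t" may say <optimized out> */
    }

Stopped on the return, the product may well still be sitting in whichever
register the compiler picked; "info registers" (or printing that register
directly) will show it, if you can work out which one it is.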

~~~
DannyBee
(I wrote a bunch of gcc's initial support for optimized code debugging, as
well as the initial associated gdb support for evaluating location lists and
expressions, which is what makes this all work under the covers)

Given the invariant of "debug info does not affect optimization", you end up
with plenty of cases where your _only_ possible choices are "give the user a
possibly wrong value for the variable" or "say the value does not exist
anymore".

Most of the time, the latter is done instead of the former.

You also, unfortunately, cannot just mark such values so that the debugger
tells you "this may be wrong, it may be right, good luck".

Historically, compilers tried doing the former; users complained enough that
they switched to doing the latter.
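
A hedged illustration of why the choice is that stark (hypothetical example;
actual code generation varies):

    void helper(void);            /* some external call */

    int f(int n)
    {
        int x = n + 1;            /* may be folded straight into the return */
        helper();                 /* stopped here, x may have no location yet, */
        return x * 2;             /* because gcc is free to compute 2*n + 2
                                     only after the call returns              */
    }

Stopped inside helper(), the debug info can either point at some register that
never actually held x (possibly wrong) or admit that x has no location at that
point (the "optimized out" answer). Since the generated code must not change
just because -g was passed, those really are the only two options.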

~~~
mehrdadn
I wonder if the error message could just be better. Instead of using the
excuse that the variable has been "optimized away", maybe gdb should just say
it "cannot determine variable value due to optimization".

~~~
saagarjha
Or even: this variable has likely been optimized away, but you might still
find it in $rax.

------
Supersaiyan_IV
Thanks for sharing. I wish there were a similar article covering the quirks of
Link-Time Optimization and its effect on debugging, and how that impact has
changed since GCC 4.x. Also, an explanation of how LTO's link-time
optimization flags are chosen, and what happens when they're absent. And most
importantly, why -O0 has an impact on LTO when it's passed at link time, even
when e.g. -O2/-O1 was used at compile time.

------
caf
What I'd appreciate is if GDB, when stopped during execution, could show _all_
of the source statements that are currently 'in progress' (for the common case
under optimisation where the execution of several source statements is
interleaved). I don't know how feasible that is, though.

~~~
speps
Visual Studio has something called "Parallel Stacks" [1] which shows a graph
of all the stacks of every thread of the application. It's very powerful when
debugging multithreaded code.

[1] https://docs.microsoft.com/en-us/visualstudio/debugger/using-the-parallel-stacks-window

~~~
kccqzy
I believe GP is not talking about thread parallelism, but rather about
compiler instruction reordering. Compilers typically reorder instructions in
the hope of improving performance. For example, if you have v += 2, the
compiler can choose to load v into a register, continue with the rest of the
statements, and ten instructions later perform the actual addition and store
the result back into memory.
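
A rough sketch of what that interleaving can look like (hypothetical schedule,
not actual gcc output):

    void tick(long *v, long a[2])
    {
        *v += 2;       /* the load of *v may be issued first...           */
        a[0] += 1;     /* ...the updates of a[] scheduled in between...   */
        a[1] += 1;     /* ...and the add and store back to *v done last,
                          so single-stepping bounces between these lines  */
    }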

~~~
caf
Yes, that's what I was referring to. When single-stepping such code you'll
often see it skip backwards and forwards between the interleaved source
statements.

------
cjhanks
I kept reading and reading, wondering what the point of this document is. I
made it about a third of the way through.

It seems well informed, but it lacks any coherent structure, and throughout I
couldn't deduce what the point was. Like a well-educated rant.

~~~
tom_
I used to send mails like this at work to people somewhat often, selecting
mailing group according to subject. I'd typically have spent a while trying to
figure something out, met with some success, and figured I might as well send
a summary to anybody that could be remotely concerned. Maybe save somebody
else the bother in future.

Most of the time, nobody cared. (Which is fine, I never did this stuff for no
reason, the write-up was typically beneficial for me personally, and if nobody
else cared, that's OK by me.) But the odd one would prove useful and end up
getting re-forwarded multiple times, and that happened often enough that I
kept doing it.

(I use the past tense because I do contract work these days, and that works
differently.)

Leaving aside the justification for this sort of thing in general, and moving
on to why you might be interested in this particular note: suppose you'd
always just built your code at one of -O0 (shit code, but works well with any
debugger), -Os (slightly less shit code, debugging probably tolerable) or
-O2/-O3 (decent code, but debugging is a pain - and I always knew the assembly
language for every target! The people who only knew C++ had a devil of a
time).

So this works, and you're used to it, and what better argument for anything
could there possibly be? But you probably wondered whether you could do
better, and with this in mind you've probably looked at the gcc manual and
noted that there's this huge pile of other sub-options. What if you tweaked
those, you wonder? Would that make things better? Would it make them worse? If
you only had the time to investigate which would impede debugging, and which
you could just pop into your debug build with impunity, safe in the knowledge
that it wouldn't cause you any difficulty! If only you had the sort of more
in-depth knowledge of compiler internals required to make this sort of
judgement by yourself, rather than just go through the 500,000 combinations of
options to try each one in turn! And/or if only you had somebody on staff
who could just try this stuff out and write you a summary! But you don't, and
you don't, and you don't, and you have a product to finish anyway...

So this document might at least be mildly interesting.

~~~
cjhanks
I appreciate your comment and its perspective.

I spend a lot of my time optimizing code, switching between -O1 and -O3.

I am genuinely very worried about the trade-off between optimization and
pragmatism.

At the same time, I am limited. This very useful analysis would help me a lot
if it provided an executive summary.

Non-linear debug builds are obviously harmful.

------
khitchdee
If you approach debugging in a very minimalist way, you should never use a
debugger, because it only serves to complicate things; using the I/O subsystem
to debug your program is a far more efficient way to debug.

If you do things this way, you orthogonalize the optimization side of your
compiler from the mess created by trying to assist the debugging process via
the rather complex and intrusive (from the compiler's perspective) component
called the debugger.
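
A minimal sketch of the idea (hypothetical macro name): trace output that is
compiled in or out, so the optimizer never has to cooperate with a debugger.

    #include <stdio.h>

    #ifdef DEBUG_TRACE
    #  define TRACE(...) fprintf(stderr, __VA_ARGS__)
    #else
    #  define TRACE(...) ((void)0)
    #endif

    int compute(int x)
    {
        int y = x * x + 1;
        TRACE("compute: x=%d y=%d\n", x, y);
        return y;
    }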

~~~
john_moscow
except it's much faster to put a few breakpoints than to recompile your code
each time you want to add/remove a logging statement.

~~~
jcelerier
> except it's much faster to put a few breakpoints than to recompile your code
> each time you want to add/remove a logging statement.

that reeaaallly depends. I've had cases where adding a printf and recompiling
took mere seconds, while a gdb startup was on the order of 3 to 4 minutes,
because recompiling only rebuilds one shared library whereas GDB has to load
all of them.

