
My Favorite Debug Ever - strzalek
http://ryan.hypnoticocelot.com/post/136617949294/my-favorite-debug-ever
======
mrich
I prefer using AddressSanitizer for finding bugs like these now. In addition
to heap overflows it can also find stack and global overflows.

As a second option I use valgrind, which can find uses of uninitialized memory
(and doesn't require recompiling all libs, as MemorySanitizer does).
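
A minimal sketch of the kind of bug these tools catch, assuming a hypothetical
helper `dup_name` (not from the article). When its `buggy` flag is set it
under-allocates by one byte, so the terminating NUL lands just past the heap
block:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of `src`.
 * With buggy != 0 the block is one byte too small, so the NUL
 * terminator is written one byte past the end of the allocation. */
char *dup_name(const char *src, int buggy)
{
    assert(src != NULL);

    size_t len = strlen(src);
    char *copy = malloc(buggy ? len : len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, src, len);
    copy[len] = '\0'; /* out of bounds when buggy != 0 */
    return copy;
}
```

Built with `-g -fsanitize=address` (gcc or clang), a call such as
`dup_name("abc", 1)` should be reported as a heap-buffer-overflow at the
offending write. Running an ordinary build of the same program under valgrind
should flag the same line as an invalid write, with no rebuild needed.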

------
timclassic
ElectricFence is great, and I recall having a similar feeling to the OP when I
first used it because many of the described issues were also new to me at the
time. It's a great experience :)

If you're on Solaris or a Solaris-derived OS, you also have the libumem and
watchmalloc libraries that can help you out. I have used libumem to great
effect in the past.

It's been a while, but I anecdotally recall that Solaris is more stingy with
its mallocs than Linux. I used to compile and run my C projects on Solaris as
a first pass in my search for memory errors.

------
teddyh
Is using GDB and Electric Fence to find a normal buffer overflow bug really
interesting enough to warrant a blog post these days, let alone a HN story?

I guess it might be interesting for all the people who, like the author, are
Java and PHP programmers and have no experience with C?

~~~
JadeNB
> I guess it might be interesting for all the people who, like the author, are
> Java and PHP programmers and have no experience with C?

Maybe it isn't snark, but that reads like it, and it seems unconstructive. Why
knock down someone who's learning by denigrating his background? What's the
harm in someone sharing something interesting (to him) that he learned,
whether or not you think he should have known it already? It's easy enough not
to read it, especially on a random blog and even on HN, if you don't want to
share his experience.

~~~
teddyh
I’m _mostly_ trying to understand why this is deemed interesting, and not to
snark. I mean, sure, anything voted up is interesting enough, but I thought
that this was below the minimum level of knowledge a C programmer ought to
have. It’s like if there was a story on the front page of HN about debugging
HTTP using “telnet”.

~~~
wink
But there are loads of non-C programmers who might still work on segfaults
from time to time.

For example I am absolutely able to use gdb, but have never heard of Electric
Fence. (But I had heard of AddressSanitizer mentioned above.)

------
vardump
Usually the toughest thing to debug is an overwritten stack. It's worse if
both stack and heap are overwritten. Not my favorite things to debug...

Here's what I usually do in such cases:

The call stack is unreliable at that point. You'll see puzzling things like
function calls with impossible parameters. If things just don't make any
sense, it's better to map all code paths that can lead to the crashing EIP/RIP
(hopefully a valid pointer to the instruction that caused the crash). Check
whether EIP/RIP is in a rep movsd (a potentially inlined memcpy; check the ECX
(RCX) rep counter and the EDI (RDI) rep pointer), or in runtime library code
such as memset or memcpy. The next thing is to make sense of the call stack
manually, in case there are portions that were not overwritten but that the
stack walk couldn't resolve. Of course it also pays to look around the rest of
the stack for signs of an overwrite and for what it was overwritten with.

It's also possible that a pointer to a stack object leaked at some point, so
the crash occurs in a completely different part of the code than where the
memory was actually corrupted. Or some runtime structure, like the heap, was
corrupted. Sometimes you can find those by inspecting the memory near the
stack pointer ESP (RSP) and guessing struct/object shapes and values.
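
A minimal sketch of how a stack pointer can leak, using hypothetical names
(`g_counter`, `register_counter`, `bump_counter`, `buggy_setup`) invented for
illustration. The write that corrupts memory happens in `bump_counter`, far
from the function that leaked the address:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical global that can outlive the object it points to. */
static int *g_counter = NULL;

void register_counter(int *counter)
{
    assert(counter != NULL);
    g_counter = counter;
}

/* Writes through whatever was registered, valid or not. */
void bump_counter(void)
{
    if (g_counter != NULL)
        ++*g_counter;
}

/* BUG: leaks the address of a local. After this returns, g_counter
 * points into a dead stack frame, and the next bump_counter() call
 * scribbles on whichever frame now occupies that slot. */
void buggy_setup(void)
{
    int local = 0;
    register_counter(&local);
}
```

Registering an object that outlives every `bump_counter` call is fine. After
`buggy_setup`, though, the corrupted value shows up in some unrelated function's
locals or return address. AddressSanitizer has a
`detect_stack_use_after_return` runtime option aimed at this class of bug, and
some compilers may warn about the store in `buggy_setup`.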

If the bug can be reproduced, use memory breakpoints, logging (especially if
multithreaded, but watch out for blocking I/O from logging), tools for
debugging memory corruption (valgrind, compiler paranoid mode, etc.), or even
map some pages unreadable and unwritable. It can take a while to find the
actual bug.
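
Mapping pages unreadable and unwritable can be sketched as below, in the same
spirit as Electric Fence's guard pages. This assumes a POSIX system with
`mmap`/`mprotect`; `guarded_alloc`, `guarded_free` and `pages_for` are
hypothetical names, and the returned pointer is not aligned for arbitrary
types:

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Number of whole pages needed to hold `size` bytes (at least one). */
static size_t pages_for(size_t size, size_t page)
{
    size_t n = (size + page - 1) / page;
    return n ? n : 1;
}

/* Hypothetical helper: returns `size` bytes whose last byte sits
 * directly before an inaccessible page, so the first byte written
 * past the end faults immediately. */
void *guarded_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = pages_for(size, page);
    size_t total = (data_pages + 1) * page; /* data + one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return guard - size; /* block ends exactly at the guard page */
}

/* Releases a block from guarded_alloc; `size` must match the request. */
void guarded_free(void *ptr, size_t size)
{
    if (ptr == NULL)
        return;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_pages = pages_for(size, page);
    unsigned char *guard = (unsigned char *)ptr + size;
    munmap(guard - data_pages * page, (data_pages + 1) * page);
}
```

With `char *p = guarded_alloc(16);`, writing `p[15]` is fine, while writing
`p[16]` should raise SIGSEGV on the spot, so a debugger stops at the
overflowing instruction instead of at some later victim. This layout only
catches overruns; underruns would need the guard page placed before the block.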

If reproduction is not possible, good luck. Better spend some quality time
with a memory hex view and a disassembler, trying to locate registers and
stack values that might contain pointers. It might take a while to find the
issue...

------
adricnet
Thanks for sharing this! I could wish for some screenshots of the tools (gdb)
in use, though really the inspiration to learn is valuable enough.

A next read in this vein might be Bug Hunter's Diary by Tobias Klein:
[https://www.nostarch.com/bughunter](https://www.nostarch.com/bughunter)

------
64bitbrain
Great article! I had a similar experience as a Java programmer turned Linux
kernel programmer. The first thing I remember doing was debugging a kernel
panic using crash. It is no doubt challenging for Java/PHP programmers to
debug C core dumps and kernel panics when pointer references are involved. But
at the same time I learned a lot, especially about memory management.

------
mschuster91
I wonder how many of the young so-called "programmers" would be able to solve
this. It's a pity that one can call himself a "developer" without having the
tiniest idea how a computer actually runs, or what a process, a thread or a
pointer is...

~~~
malka
We all stand on the shoulders of giants. I'm pretty sure when the first
languages appeared, some people said things like "It's a pity that one can
call himself a "developer" without having the tiniest idea how a transistor
actually works, what a punched card is..."

~~~
mschuster91
The problem is: what happens when there is no one any more who remembers how
the basics work?

The banking and healthcare industries already have this problem because there
is a huge amount of decades-old legacy stuff still in use and they have to pay
through the nose for expertise.

~~~
malka
I'd say it's not the same kind of basic. We still need people that know how to
manually manage memory. We don't need them as much though. But at some point,
Sun needs people to develop the JVM, and its memory management algorithm. The
"how this works" is well documented in academic papers too.

Obviously, as you mention, the fewer people there are who can do X, the more
they cost. But it's not the same as "totally forgetting something".

