
How debuggers work: Part 1 - yan
http://eli.thegreenplace.net/2011/01/23/how-debuggers-work-part-1/
======
stcredzero
In Smalltalk, you can start writing a debugger that lets you browse a stack
trace in under 5 minutes. The debugger is just an ordinary application working
on (mostly) ordinary objects that happen to be the meta-level of Smalltalk.
(In particular, the contexts.) To complete the debugger, you just need to
implement a Smalltalk VM without GC, which is not all that hard, as it's
little more than a 256-case switch statement. Basically, your debugger is a VM
you control through an app, running the debugged process.

~~~
eliben
I fail to see how this is relevant. Smalltalk is a VM language, which makes
things very different. For instance, I could say that for Python you can
easily write a debugger using the C API of the VM, and in fact you
_don't need_ to write a debugger, since one is part of the standard library.
What point does this make?

~~~
stcredzero
_I fail to see how this is relevant._

Sorry.

 _Smalltalk is a VM language, which makes things very different_

Yet somehow, you have stumbled right on one of the major points.

VM languages are often about eliminating impedance mismatch between ordinary
coding and the metamagical stuff. Python is another good example, but I am
less familiar with Python than Smalltalk. The same goes for Lisp and Ruby. I
have written the "5-minute debugger" in Smalltalk as part of a presentation.
(Really, it's just a stack browser.) I haven't done the same for the other
languages.

------
mahmud
There is a lot of tail-chasing going on here.

When a question like "How X works" is posed, it's best to pause for a minute
and solve for the most general form of X, not specific instances.

Whenever you see C, C++, Unix or assembly in what should be a very
"foundations" type article, you need to pause and ask yourself "why?". These
platform-dependent details don't help us learn anything at all, and their
presence is essay-smell.

For debuggers, no two are ever exactly alike. It's a generic term for a class
of software that's more of a continuum. At a minimum, they help us set
"breakpoints" at specific _locations_ , and allow for _manual_ intervention
when that location is _executed_ , and permit us to _inspect_ or _edit_ the
application _state_ at that _moment_.

That's the most generic description of it, and every italicized word above is
a semantic minefield, if we take into account the breadth of programming
paradigms and their vast differences in machine implementation, program shape
(is "code" a vector? tree? graph?), the subtleties of their execution models,
distribution (where is the program located?), and temporal properties (when is
the program running? Time, what a precious concept that we take for granted!)

For the interruption problem, there are a few common approaches:

+ By instruction editing: for linearly executable programs where the "code"
is a writable vector, it's common to insert specific debugging instructions
(HLL code instrumentation falls under this).

+ By an interrupt table: for programs where the executing machine is virtual
or itself programmable, it's sometimes easier to assign breakpoints to
locations within the program and trigger an interrupt when that location is
reached. The machine maintains an "interrupt table", either a global one or
one per "application" (i.e. process, thread, your granularity du jour, etc.);
a rough sketch of this follows the list.

+ A rule-based approach, where the breakpoint is triggered based on semantic
meanings produced by the program as it executes, or based on the shapes of its
execution and dataflow graphs. This requires the machine to be programmed
almost at the same abstraction level as the application, and perhaps tight
integration. Most debuggers for logic programming languages operate at this
level; they're more equation solvers than blunt shells for machine memory.
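
To make the interrupt-table approach concrete, here's a rough sketch of a toy
bytecode interpreter in C whose dispatch loop consults a breakpoint table
before each instruction. All the names and opcodes are made up for
illustration; this isn't taken from any real VM:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    
    /* Toy opcodes, purely for illustration. */
    enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };
    
    /* The dispatch loop checks a per-program breakpoint table before every
       instruction -- no code gets rewritten. */
    static void run(const uint8_t *code, size_t len, const bool *breakpoints)
    {
        int64_t stack[64];
        size_t sp = 0, pc = 0;
    
        while (pc < len) {
            if (breakpoints[pc]) {
                /* a real debugger would hand control to the user here */
                printf("breakpoint hit at %zu\n", pc);
            }
            switch (code[pc++]) {
            case OP_PUSH:  stack[sp++] = code[pc++];                 break;
            case OP_ADD:   sp--; stack[sp - 1] += stack[sp];         break;
            case OP_PRINT: printf("%lld\n", (long long)stack[--sp]); break;
            case OP_HALT:  return;
            }
        }
    }
    
    int main(void)
    {
        uint8_t prog[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
        bool bp[sizeof prog] = { false };
        bp[4] = true;                  /* break just before the OP_ADD */
        run(prog, sizeof prog, bp);
        return 0;
    }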

And many, many more. There are as many debugger designs as there are
programming paradigms, execution models, or language implementations. Free
yourself from the short-sighted tyranny of antique designs (blech, Unix and
x86!) and discover the wonderful world of formal systems and abstraction:
languages, machines, type systems and semantics, all on a whim, as weird and
wonderful as you want them to be.

~~~
barrkel
It's all very well to understand something in the abstract, but frequently
that doesn't easily translate to the specifics without a lot of running around
mapping concepts. When your machine is abstract, or even better, a concrete
implementation of something virtual, the kinds of manipulations you need for
debugging are pretty trivial. But if that's all you know about debugging -
only understanding it at that high a level - it will limit your practical
usefulness when actually getting work done in many scenarios.

Device drivers generally don't run on virtual machines; neither do most C
programs, most VM implementations, or most interop through foreign function
interfaces in heterogeneous environments. Understanding how these things can
be debugged on any particular architecture is applied knowledge, but it is
knowledge nonetheless, and no less worthy of a blog post written for those who
would learn more about the topic.

Consider a concrete case that people using Delphi for Windows take for
granted: integer divide-by-zero diagnostics, communicated via exceptions on
Win32. If your implementation is a virtual machine, creating a debugger to
handle this kind of situation is ridiculously trivial; it's just a special
case in however you implement division. On x86, it's a little trickier, but
Windows makes it easier for you. I don't remember all the details now (it may
not even be for integer divide, but I think it is), but Windows will handle
the CPU interrupt, look at the faulting code, disassemble it, figure out what
kind of problem it was, and then make sure to dispatch a structured exception
corresponding to an integer divide by zero, rather than a more general
arithmetic exception. It does this work for Win32, but not Win64, as I recall.
Now, the mechanism by which structured exceptions are dispatched is an article
in itself, and very different for Win32 vs Win64; then there are the details
of how debugger events are propagated to the debugger in Windows, etc. All
this is part of "how debuggers work" in the concrete.
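
For the curious, here is a rough sketch of what that looks like from the
program's side with MSVC's structured exception handling (the debugger-side
event loop is a separate story; this only shows the exception code that Win32
synthesizes for the divide fault):

    #include <windows.h>
    #include <stdio.h>
    
    int main(void)
    {
        volatile int zero = 0;
    
        __try {
            /* The CPU raises a divide fault here; Win32 inspects the faulting
               instruction and dispatches EXCEPTION_INT_DIVIDE_BY_ZERO. */
            printf("%d\n", 10 / zero);
        }
        __except (GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO
                      ? EXCEPTION_EXECUTE_HANDLER
                      : EXCEPTION_CONTINUE_SEARCH) {
            printf("caught integer divide by zero\n");
        }
        return 0;
    }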

But if all you have is an abstract understanding of debuggers, it won't help
you much when you're thrown into the deep pool of real-world writing of
debuggers. The devil is in the details; and you get paid for dealing with that
devil, _not_ for floating around with the angels of "clarity of thought". The
fact is, you won't get much done in the large without _also_ having that
clarity of thought at the higher level, and in particular, you won't innovate
much.

~~~
mahmud
When the title of the article is "How debuggers work", I thought it might be
more profitable to actually fulfill that premise, and not get side-tracked by
the quirks of ptrace. (For example, no one could implement a debugger for,
say, Python scripts just from reading this article.)

A better title might have been "how to trace child processes under POSIX".

Solving the general case is not just an academic exercise; it's also a richly
rewarding learning experience. When people learn to think abstractly, none of
these implementation details really matter. You can always specialize and
learn the details of a specific instance; it's best to get a good grasp of the
big picture first.

[Edit: Instead of discussing "debuggers", in general, I hoped to narrow down
and focus on just interruption mechanisms. Third-party execution tracing and
stepping are usually outside language semantics, so there is some "art" to
gaining control from an executing program. It would be nice to catalog the
lore, and I hope others add to the three I have suggested above]

------
xtacy
The fundamental mechanism: setting a breakpoint at a particular instruction
requires overwriting the byte at that address with "int 3" (opcode 0xCC, which
is exactly 1 byte long).
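
Roughly, that patching step looks like this with ptrace on Linux/x86-64. This
is only a sketch with a made-up helper name and no error handling, and it
assumes the tracee is already stopped and the target address is known:

    #include <sys/ptrace.h>
    #include <sys/types.h>
    
    /* Plant an int 3 (0xCC) at 'addr' in a stopped tracee. Returns the
       original word so it can be restored when the breakpoint is hit. */
    long set_breakpoint(pid_t pid, void *addr)
    {
        long orig = ptrace(PTRACE_PEEKTEXT, pid, addr, 0);
        long patched = (orig & ~0xFFL) | 0xCC;  /* replace only the low byte */
        ptrace(PTRACE_POKETEXT, pid, addr, (void *)patched);
        return orig;
    }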

Something nice to think about: why should the breakpoint interrupt instruction
be exactly 1 byte long?

~~~
scottdw2
Actually, it doesn't have to be. There are three forms of the interrupt
instruction: two 1-byte versions, for int 3 (0xCC) and int 4 (INTO, interrupt
on overflow), and a 2-byte version that can specify any interrupt number. It's
possible to use either the 1-byte or the 2-byte encoding of int 3. Also, by
adding arbitrary instruction prefixes (which are meaningless for the "int"
instruction), it's possible to specify a breakpoint using anywhere from 1 to
15 bytes.

However, there really is no reason to use more space than necessary. If you
can encode an instruction in 1 byte, it doesn't make any sense to encode it
using any more bytes than you need. Every byte that gets updated when a
breakpoint is set needs to be backed up and restored once the breakpoint is
hit. Using extra bytes will hurt runtime performance.

This is particularly true given the "ptrace" interface on Linux. It only
supports reading or writing a single machine word of memory in the target
process at a time. Backing up and restoring more than a word would require
extra calls into the kernel, plus extra memory barriers to ensure caches get
updated. Using a scheme that did that more than once would just waste CPU
cycles.
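
For completeness, a rough sketch of the restore side on x86-64 Linux (helper
name made up; it assumes the original word was saved when the breakpoint was
planted): when the tracee stops on the int 3, put the word back, rewind RIP by
one, and resume:

    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    
    /* Restore the original word, step back over the consumed 0xCC, resume. */
    void resume_from_breakpoint(pid_t pid, void *addr, long orig_word)
    {
        struct user_regs_struct regs;
    
        ptrace(PTRACE_POKETEXT, pid, addr, (void *)orig_word);
        ptrace(PTRACE_GETREGS, pid, 0, &regs);
        regs.rip -= 1;            /* RIP already points past the int 3 */
        ptrace(PTRACE_SETREGS, pid, 0, &regs);
        ptrace(PTRACE_CONT, pid, 0, 0);
    }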

~~~
axod
I think you missed the issue.

Consider the following code:

    
    
           jmp foo
           push ax        // 50
      foo: int 21h        // CD 21
    

So now we want to set a breakpoint on the "push ax". If we do it using more
than a single byte, it will also overwrite the start of the int 21h
instruction; since the jmp lands directly on foo without ever passing through
the breakpoint, that corrupted instruction gets executed and the code will
likely crash.

That's why 1 byte is used for breakpoints.

------
mfukar
Nice article, liked it a lot. But I like almost anything EB writes.

