
Self-modifying code for debug tracing in quasi-C - there
http://mainisusuallyafunction.blogspot.com/2011/11/self-modifying-code-for-debug-tracing.html
======
caf

      The code in this article is not production-ready.
    

Why do I get the feeling that this caveat will be roundly ignored?

------
rmc
This is interesting, but as someone who's not very knowledgeable about C &
assembly, I'm not sure what's going on. Could anyone explain what this does?

~~~
shadytrees
I'll take a stab at it!

A very simple model of a computer to think about is one that has two
components: a memory and a CPU. Picture the memory as a big array that
contains both code and data (and there's no way of telling which is which);
picture the CPU as a black box that has a pointer into the array, called a
program counter or a PC.

And here's how the CPU runs:

* Read `mem[PC]`.

* Decode that value into an instruction.

* Run the instruction.

* Increment PC by 1.

* Repeat.

Now that's not a very interesting CPU, since you can't really implement
conditionals, loops, or function calls when the PC just keeps incrementing.
So make sure one of your instructions is a jump, which lets you skip the
increment and assign an arbitrary value to PC.

Now, the original trace implementation was this:

    
    
         if (_point) printf(_args);
    

This translates roughly into this pseudo-assembler code:

    
    
           read mem[address of _point] into register
           branch-if-zero register label
           push _args onto stack
           jump to printf
         label:
           [rest of the function]
    

Lots of new, unexplained stuff here so here's a quick rundown:

* Registers are a small set of variables provided by the CPU that let you store and read values an order of magnitude faster than regular memory. Throwing it over to Wikipedia: <http://en.wikipedia.org/wiki/Processor_register>

* Labels are just a way for us to mark places in assembly code instead of using indices into the code's memory. The assembler will figure out the indices for us and rewrite the labels to be numbers.

* Branch if zero! Think of branch-if-zero taking two arguments: a register and a label. If the register is zero, it jumps to the label. If the register is nonzero, nothing happens.

* The stack is a place chosen by the operating system where arguments to a function are passed. I'm being vague because there's a lot to say: <http://en.wikipedia.org/wiki/Call_stack>

So, back to the code. There are two problems here, as laid out by the post's
author:

* In constrained environments (the example in the post is the Linux kernel), an extra read for tracing is expensive. Real computers have caches: if you read `mem[0x1234]` a hundred times, the CPU will keep that value around so later reads are faster (much faster) than the first. Reading `mem[address of _point]` means one fewer slot in the cache for everything else, which, depending on what you're writing, can be unacceptable. There's so much more to be said about caches: <http://en.wikipedia.org/wiki/CPU_cache>

* Branch if zero! Real CPUs are optimized to decode and run a bunch of instructions in parallel. (This is called pipelining.) Branches are kryptonite because CPUs don't know ahead of time whether the jump will occur or not so they have to guess which instructions to pipeline. More more more to be said: <http://en.wikipedia.org/wiki/Instruction_pipeline> (especially the Complications section) and <http://en.wikipedia.org/wiki/Branch_predictor>

So the rest of the blog post is devoted to writing some assembly code that
gets around those two problems. There's a lot of time spent in the details,
but here's a very high-level overview:

Replace the original trace implementation with one instruction: `nop`, which
is a no-op, an instruction that doesn't do anything. It's a placeholder.

At runtime, if tracing is enabled, rewrite the `nop` to be a jump to a
function. That function contains the actual code that calls `printf`.

This is possible because the CPU doesn't distinguish between code and data.
Just like you can manipulate data at runtime, you can also find and manipulate
code. And just as this is used for good here, it can also be used for evil:
Imagine taking advantage of a bug in a program to make it give you control of
its code. Then you can rewrite the code to email you secret information or to
make 100 HTTP requests a second to a server you don't like.

Everything else is bookkeeping, making sure the compiler, the linker, and the
operating system are all on board with this plan.

------
jhrobert
JavaScript's closest equivalent is XXXX_de&&bug( xxxx)

where XXXX is the "tracepoint" and XXXX_de is a global variable set to "true"
when trace is required.

When trace is not required, thanks to the short-circuiting && operator, bug()
is not called (thus avoiding the evaluation of bug()'s arguments).

de&&bug( faster) :)

