
How does GDB call functions? - alpb
https://jvns.ca/blog/2018/01/04/how-does-gdb-call-functions/
======
simias
I'm surprised the author opted for guesswork and reverse-engineering GDB when
it's of course open source. Admittedly GDB's source code can be a bit daunting
but at least you'll have all the nasty details.

The relevant code appears to be in gdb/infcall.c, in the
`call_function_by_hand` function. It's pretty well commented too, and it shows
that while the author gets the basics right, there are many corner cases to
handle to get it right all the time (threads, signals, exceptions,
architecture and ABI quirks, etc.):
[https://github.com/rofl0r/gdb/blame/master/gdb/infcall.c#L46...](https://github.com/rofl0r/gdb/blame/master/gdb/infcall.c#L467)

~~~
jvns
Thanks for the link to `call_function_by_hand`, that's super helpful! In
retrospect, I could have found this file by grepping for 'When the function is
done executing, GDB will silently stop' (the very useful "grep for UI text"
trick!)
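
The "grep for UI text" trick can be sketched in a few lines of Python. This is only an illustration: `grep_ui_text` is a hypothetical helper name, and it does a plain substring search over every file under a directory. Long messages are often split across several source lines, so a short, distinctive fragment usually works better than the whole sentence.

```python
import os


def grep_ui_text(root, needle):
    """Return (path, line_number) pairs for every line under root containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="replace" keeps binary or oddly encoded files from aborting the walk
                with open(path, encoding="utf-8", errors="replace") as fh:
                    for lineno, line in enumerate(fh, 1):
                        if needle in line:
                            hits.append((path, lineno))
            except OSError:
                continue  # unreadable file: skip it
    return hits
```

Calling `grep_ui_text("gdb", "silently stop")` on a source checkout would list the files worth opening first (the directory name here is an assumption about where the checkout lives).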

The choice between understanding something by running it and by reading its
code is a super interesting one (though if I'm seriously interested in
understanding exactly what some software does, of course I'll do both!).

I think experimentation/stracing a program to understand it has a lot of
advantages:

* It can be helpful to look at the system calls as a way to see the big picture before diving into the code ("ok, it's setting some registers somehow, it's putting in some int3 instructions, then it does PTRACE_CONT, then it undoes all the changes it made").

* It lets me focus on what the program actually _does_ instead of understanding how the code is organized (figuring out how the code in a large project I've never looked at before is organized can be time consuming!)

* It's easy to draw incorrect conclusions when reading the source. This happens to me a lot when reading distributed systems code -- often I'll think "ok, if I do X, then Y will happen" but that turns out not to be true in practice. So I think even when you _are_ reading code, doing experimentation as you go is a key part of making sure that your understanding of the code is actually correct.

* I find exploring programs interactively fun!

In this specific example -- I think it's really interesting that gdb modifies
the `longjmp` function in libc when running a function. The string `longjmp`
doesn't appear anywhere in infcall.c. So even though reading infcall.c seems
like a great overview, it has different information than what I get from using
strace!
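
The sequence seen in strace (set registers, put in an int3, PTRACE_CONT, undo everything) can be modelled with a toy simulation. To be clear about the assumptions: this makes no real ptrace calls, `ToyInferior` and `call_by_hand` are hypothetical names, the addresses are made up, and the real `call_function_by_hand` also has to deal with the dummy frame, ABI-specific argument passing, signals and threads mentioned elsewhere in this thread.

```python
from dataclasses import dataclass, field

INT3 = 0xCC  # x86 one-byte breakpoint instruction


@dataclass
class ToyInferior:
    """Hypothetical stand-in for a traced process. No real ptrace involved."""
    registers: dict
    memory: bytearray
    functions: dict = field(default_factory=dict)  # address -> Python callable

    def cont(self, return_addr):
        """Pretend PTRACE_CONT: run the callee, then 'return' into the breakpoint."""
        callee = self.functions[self.registers["rip"]]
        self.registers["rax"] = callee(self.registers["rdi"])
        self.registers["rip"] = return_addr
        # The debugger only regains control because an int3 sits at the return address.
        assert self.memory[return_addr] == INT3, "callee returned into live code"


def call_by_hand(inferior, func_addr, arg, return_addr):
    # 1. Remember everything we are about to clobber.
    saved_registers = dict(inferior.registers)
    saved_byte = inferior.memory[return_addr]
    # 2. Plant a breakpoint where the callee will return.
    inferior.memory[return_addr] = INT3
    # 3. Point the program counter at the function and pass the argument.
    inferior.registers.update(rip=func_addr, rdi=arg)
    # 4. Resume the inferior until it traps.
    inferior.cont(return_addr)
    # 5. Collect the return value, then undo every change.
    result = inferior.registers["rax"]
    inferior.memory[return_addr] = saved_byte
    inferior.registers = saved_registers
    return result


# Usage: a fake "function" at address 0x20 that doubles its argument.
inferior = ToyInferior(
    registers={"rip": 0x10, "rdi": 0, "rax": 0},
    memory=bytearray(64),
    functions={0x20: lambda x: x * 2},
)
print(call_by_hand(inferior, 0x20, 21, return_addr=0x08))  # prints 42
```

After the call, the toy inferior's registers and memory are byte-for-byte what they were before, which is the "undoes all the changes it made" part.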

~~~
mmjaa
>very useful "grep for UI text" trick

Just one thing: for C/C++ codebases such as gdb, stop using grep for things,
people! Use cscope! 'Tis a brilliant means of navigating files and scoping out
the contents of a project.

[http://cscope.sourceforge.net](http://cscope.sourceforge.net)

$ cd gdb/src; cscope -R

~~~
cryptonector
Upvote for cscope mention.

I prefer cscope -Rqk. I run cscope on the zeroth window of a tmux/screen
session with a $CSCOPE_EDITOR program that's a script that starts the real
$EDITOR in a new window in the same session (with an appropriate window name)
and in the background so that cscope gets control back immediately. This
allows me to use cscope to drive large code investigations very easily.

~~~
mmjaa
Sounds cool. I use it in vim as a 'grok this C/C++/Lua/Textfiles' utility,
pretty much every day. grep too of course, and there are times when you just
wanna ack, but for full grok, cscope is cool however you do it.

~~~
cryptonector
I have a shell function that generates shell functions that do find | xargs
... patterns. Thus I have 'fsg' == find source grep -- that is, find with
-name arguments matching source | xargs grep ..., and fseg (find source egrep)
and fmg (find make stuff grep). These functions take all non-option looking
arguments and use them as directory arguments for find, then all remaining
arguments are passed to grep, so I get to write:

    
    
        $ fsg foo bar -w foo
    

which translates into something like:

    
    
        find foo bar \( -name '*.[chylm]' -o -name ... \) -print0 | xargs -0 grep "$@"
    

Still, cscope is fantastic, and the first tool I reach for.
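
The argument splitting described above (leading non-option arguments become directories for find, everything from the first option onward goes to grep) might look like this in Python. This is a sketch, not the actual shell function: `split_args`, `build_pipeline` and `SOURCE_GLOBS` are hypothetical names, and only the one `-name` pattern shown above is included because the rest is elided.

```python
import shlex

# Only the pattern shown in the comment above; the remaining -name patterns are elided there.
SOURCE_GLOBS = ["*.[chylm]"]


def split_args(args):
    """Split args into (directories, grep_args) at the first option-looking argument."""
    for i, arg in enumerate(args):
        if arg.startswith("-"):
            return args[:i], args[i:]
    return args, []


def build_pipeline(args, globs=SOURCE_GLOBS):
    """Build the find | xargs grep command line as a string (nothing is executed)."""
    dirs, grep_args = split_args(args)
    name_tests = " -o ".join(f"-name {shlex.quote(g)}" for g in globs)
    find_cmd = f"find {shlex.join(dirs)} \\( {name_tests} \\) -print0"
    grep_cmd = f"xargs -0 grep {shlex.join(grep_args)}"
    return f"{find_cmd} | {grep_cmd}"


print(build_pipeline(["foo", "bar", "-w", "foo"]))
# prints: find foo bar \( -name '*.[chylm]' \) -print0 | xargs -0 grep -w foo
```

Building a string instead of running the command keeps the sketch safe to try; `shlex.join` needs Python 3.8 or newer.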

------
DannyBee
Oh god, this brings me back (I used to be the C++ maintainer for GDB eons ago).

It's even more convoluted than the author suspects, because:

1\. GDB has to figure out which function to call (it does overload resolution
for C++, for example, and this varies by language)

2\. GDB has to do type coercion on the arguments if necessary (ditto)

3\. There are significant ABI and other issues, both at the processor level,
and at the language level, in where the arguments go (registers or memory), in
calling the right thunks, etc.

This is one reason LLDB uses clang to do most of the work of telling it what
the expression you are trying to call actually means, so it can just pretty
much execute it.
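
To give a feel for points 1 and 2, here is a deliberately tiny model of overload resolution with type coercion. It is not GDB's algorithm and is far simpler than the real C++ rules: the coercion table, the cost scheme and the names `conversion_cost` and `resolve_overload` are all assumptions made up for illustration.

```python
# Hypothetical table of allowed implicit conversions: (argument type, parameter type).
COERCIONS = {("int", "long"), ("int", "double"), ("float", "double")}


def conversion_cost(arg_type, param_type):
    """0 for an exact match, 1 for an allowed coercion, None if the types don't fit."""
    if arg_type == param_type:
        return 0
    if (arg_type, param_type) in COERCIONS:
        return 1
    return None


def resolve_overload(candidates, arg_types):
    """Pick the candidate parameter list with the lowest total conversion cost.

    Returns None when nothing is viable and raises ValueError on a tie.
    """
    scored = []
    for params in candidates:
        if len(params) != len(arg_types):
            continue  # wrong number of arguments
        costs = [conversion_cost(a, p) for a, p in zip(arg_types, params)]
        if None in costs:
            continue  # at least one argument cannot be converted
        scored.append((sum(costs), params))
    if not scored:
        return None
    scored.sort(key=lambda item: item[0])
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        raise ValueError("ambiguous call")
    return scored[0][1]


overloads = [("int",), ("double",), ("int", "int")]
print(resolve_overload(overloads, ("float",)))  # prints ('double',)
```

Even this toy has to handle arity, viability and ambiguity, and it says nothing yet about point 3, where the chosen arguments actually go.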

~~~
barrkel
The Delphi compiler used the constant expression folder to evaluate debugger
expressions; it had hooks for converting symbol references into values, and
for performing method calls, but a good chunk of the evaluation was
interpreted that way. I thought it was a nifty reuse.

(Obviously, the expression is parsed, typed and bound using the compiler front
end using the same symbols as the compiled code, so that bit is easy.)

~~~
DannyBee
Yeah, that works nicely if you can do it. GDB unfortunately reimplements a
hacked up C/C++ frontend. (For some other languages, it doesn't have to
because they have dynamic expression evaluation interfaces as part of their
runtimes)

------
AdmiralAsshat
Related question for the GDB experts:

My day-job involves GDBing C code. I use a handful of functions regularly when
I'm trying to pin down a bug, mostly just to double-check why the code is
following a given branch. So I'm heavily dependent on memcmp, strncmp, etc.

I've noticed that those functions fail if I start trying to use them too
early, so I end up having to add a "b main" first and wait until I get to the
breakpoint before they become available.

Can someone explain why? I assume it has to do with getting far enough along
in the program's execution that the shared libs have been loaded into memory?

~~~
DannyBee
When you say fail, what happens?

If you mean you can't call them, yes, it's the shared libs: they aren't in the
address space yet. If you mean something else (i.e. the symbol info is wrong),
you can debug what is going on as follows:

The logic for when it builds minimal symbol tables, converts them to partial
symbol tables, and then to full symbol tables is _incredibly_ convoluted.

In theory, needing a full symbol with type info should just cause the right
thing to happen lazily.

In practice, it does not.

You can debug what is going on by doing this:

    
    
      (gdb) set debug symbol-lookup 1
      (gdb) set debug symtab-create 1
      (gdb) file <your executable>
      (gdb) p strcmp("foo", "bar")
      <look at output>
      (gdb) b main
      (gdb) run
      (gdb) p strcmp
      <look at output>
    

This is going to produce a truly ridiculous amount of output on a large
executable (so I'd use a minimal one :P), but comparing the output you get
when trying to use the symbol before and after "b main" should give you some
idea of what is going on.
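
With that much output, comparing by eye gets old fast. Assuming the two chunks of debug output have been saved as text, a small diff helper can show only what changed. `diff_debug_logs` and the labels are hypothetical, and the sample strings below are placeholders, not real GDB output.

```python
import difflib


def diff_debug_logs(before, after):
    """Return unified-diff lines between two captured debug outputs."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before-main",
            tofile="at-main",
            lineterm="",
        )
    )


# Placeholder text standing in for the two captured outputs.
before = "placeholder line A\nplaceholder line B"
after = "placeholder line A\nplaceholder line C"
print("\n".join(diff_debug_logs(before, after)))
```

Lines starting with `-` only appeared before the breakpoint, and lines starting with `+` only appeared once the program had reached it.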

------
sahilbadal
GDB builds a dummy frame for the inferior function call, and the unwinder
cannot look for exception handlers outside of this dummy frame. What happens
in that case is controlled by the `set unwind-on-terminating-exception` command.

