
On libunwind and dynamically generated code on x86-64 - daurnimator
http://www.corsix.org/content/libunwind-dynamic-code-x86-64
======
haberman
I concur with the author, it's an absolute mess. And I would _love_ to see a
single unified API for this. It's been a wish of mine for many years.

Here are even more interfaces than the three the author listed (libunwind,
__register_frame, GDB):

\- oprofile has a JIT interface:
[http://oprofile.sourceforge.net/doc/devel/jit-
interface.html](http://oprofile.sourceforge.net/doc/devel/jit-interface.html)

\- perf has a JIT interface:
[https://github.com/torvalds/linux/blob/master/tools/perf/Doc...](https://github.com/torvalds/linux/blob/master/tools/perf/Documentation/jit-
interface.txt)
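
To give a sense of what one of these interfaces asks of a JIT, here is a minimal sketch of the GDB registration protocol, following the declarations in the GDB manual's "JIT Compilation Interface" chapter. The helper `register_jit_object` and its parameters are hypothetical names for illustration; the `__jit_debug_*` symbols and struct layouts are the ones GDB looks for. It assumes GCC or Clang (for the `noinline` attribute and inline asm).

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    JIT_NOACTION = 0,
    JIT_REGISTER_FN,
    JIT_UNREGISTER_FN
} jit_actions_t;

/* One node per in-memory object file describing JIT-compiled code. */
struct jit_code_entry {
    struct jit_code_entry *next_entry;
    struct jit_code_entry *prev_entry;
    const char *symfile_addr;   /* start of the in-memory object file */
    uint64_t symfile_size;      /* its size in bytes */
};

struct jit_descriptor {
    uint32_t version;           /* interface version, currently 1 */
    uint32_t action_flag;       /* a jit_actions_t value */
    struct jit_code_entry *relevant_entry;
    struct jit_code_entry *first_entry;
};

/* GDB sets a breakpoint on this function, so it must not be inlined.
 * The empty asm keeps the compiler from discarding the call. */
void __attribute__((noinline)) __jit_debug_register_code(void)
{
    __asm__ volatile("" ::: "memory");
}

struct jit_descriptor __jit_debug_descriptor = { 1, JIT_NOACTION, NULL, NULL };

/* Hypothetical helper: link `entry` at the head of the list and notify
 * an attached debugger. `image` must stay valid while registered. */
void register_jit_object(struct jit_code_entry *entry,
                         const char *image, uint64_t size)
{
    entry->symfile_addr = image;
    entry->symfile_size = size;
    entry->prev_entry = NULL;
    entry->next_entry = __jit_debug_descriptor.first_entry;
    if (entry->next_entry != NULL)
        entry->next_entry->prev_entry = entry;
    __jit_debug_descriptor.first_entry = entry;
    __jit_debug_descriptor.relevant_entry = entry;
    __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
    __jit_debug_register_code();
}
```

With no debugger attached the call is a no-op, which is part of why this registration has to be repeated separately for each of the other interfaces.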

I wrote a blog article about the craziness around unwinding a few years back:
[http://blog.reverberate.org/2013/05/deep-wizardry-stack-
unwi...](http://blog.reverberate.org/2013/05/deep-wizardry-stack-
unwinding.html)

One of the questions I answer in my article which isn't addressed as much here
is: why do we care about generating backtraces in the first place? It might
seem obvious, but I identified four separate reasons which are all important:
to support runtime exceptions (a la C++), to offer backtraces in debuggers, to
sample the stack for profilers, and to print a backtrace from the program
itself. So if you have a JIT and you don't offer proper backtraces, your users
are going to be annoyed when they use any of these tools.
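
For the last of those four cases, printing a backtrace from the program itself, a minimal sketch might look like the following. It assumes a libc that ships `<execinfo.h>` (glibc and macOS do); `print_backtrace` and `MAX_FRAMES` are hypothetical names.

```c
#include <execinfo.h>  /* backtrace, backtrace_symbols_fd */
#include <unistd.h>    /* STDERR_FILENO */

#define MAX_FRAMES 64

/* Hypothetical helper: write the current call stack to stderr and
 * return the number of frames that were captured. */
int print_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);
    /* Writes one line per frame directly to the file descriptor. */
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);
    return depth;
}
```

How far the walk gets depends on the unwinder finding information for every frame on the stack, which is exactly where frames belonging to JIT-generated code tend to cause trouble.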

~~~
pcwalton
You probably knew this, but just to add in case others weren't aware: Those
reasons are the reasons you want unwinding on Unix, but on Windows it's even
more important. The Windows equivalent of Unix signals (very roughly speaking)
is Structured Exception Handling (SEH), in which the kernel actually arranges
for your stack to be unwound to deliver you signals such as segfaults [1]. So
unless you maintain proper unwinding info, your app isn't in a position to
receive any signals sent by the kernel (at least, the way you're intended to).

[1]: [http://www.nynaeve.net/?p=201](http://www.nynaeve.net/?p=201)

~~~
pjc50
That blog looks very interesting, thanks.

I found out the hard way that Microsoft has two sets of ARM compilers for
WinCE7, and only one of them generates code that's compatible with the SEH
unwinder.

I don't think SEH actually calls destructors when unwinding C++, unlike C++
exceptions. This makes it leaky to use for anything other than fatal
exceptions.

~~~
wolfgke
> I don't think SEH actually calls destructors when unwinding C++, unlike C++
> exceptions. This makes it leaky to use for anything other than fatal
> exceptions.

How do you come to the conclusion that SEH is the leaky abstraction, rather
than the behavior of C++ destructors?

~~~
jschwartzi
He means it causes resource leaks in any classes that implement RAII.

~~~
wolfgke
SEH was there before C++ was fully standardized (first standardized version
was C++98). In this sense one can argue that the definition of RAII is leaky
since it "forgot" that SEH was already there and not the other way round. A
proper solution would have been that functions where RAII is used must not
call functions that can throw an SEH exception.

~~~
pjc50
All kinds of functions can throw SEH exceptions, because they include hardware
exceptions (divide by zero, reference to an invalid page or null pointer).

I went and looked this up to refresh my memory:
[https://msdn.microsoft.com/en-
us/library/swezty51.aspx](https://msdn.microsoft.com/en-
us/library/swezty51.aspx)

Microsoft recommend you don't mix the use of SEH exceptions with C++, or that
you use the flag which turns SEH into C++ exceptions (and therefore calls
destructors). In our case we didn't want to do that, because we're doing our
own crash reporting and all we want is to log a backtrace while exiting the
program, and _do not_ want to call destructors which may themselves crash
because the program is in an invalid state.

~~~
wolfgke
To me this all looks like stuff that is undefined behavior in C++, so a C++
program for which such a problem occurs is by definition invalid. In other
words: You have a large problem in your code.

~~~
pjc50
Like I said: it's crash logging code, to be used for detecting problems in the
code. Undefined behaviour is not invalid: if C++ compilers rejected all UB it
would be considerably easier to ship (harder to write!) reliable programs.

~~~
wolfgke
No: The C++ standard requires that the programmer not trigger UB.
This is a kind of contract between the programmer and the compiler.

~~~
zvrba
The compiler vendor is free to extend the standard and _define_ UB.

------
zvrba
>It so happens that the Windows API gets this right,

This is my general impression on Win vs Linux: Windows offers more fundamental
services as a core OS component, whereas Linux delegates this to a hodge-podge
of competing and mutually incompatible user-space components. (Examples:
tracing (ETW), transactional database (JET Blue/ESE), etc.).

~~~
userbinator
Dynamic linking is the other big one. Windows had DLLs from the beginning; the
various Unices didn't, and there it's always looked like a bit of an
afterthought to me.

~~~
zvrba
Windows DLLs are not position-independent though. This has its advantages and
disadvantages.

------
KenoFischer
Oh hey, somebody else feels the pain! It's not only the crazy number of
interfaces you have to support (not to mention that some of the interfaces are
not particularly well defined, e.g. GDB's interface), but also what happens
after. Most of these interfaces were not designed with very many JIT functions
in mind and have O(n^2) behavior. One of the Julia core contributors recently
worked with the GDB folks to fix their O(n^2), but libunwind still has it (to
the point where one of the Julia tests spends 15% of its total runtime just
having the unwinder try to find the JIT frame). Library support for JIT code
is surprisingly bad.
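
The comment does not say where the quadratic cost comes from. One common source is scanning a flat list of registered code ranges on every lookup, so n lookups over n JIT functions cost O(n^2). The sketch below assumes that cause (it has not been checked against libunwind or GDB) and shows the usual remedy: keep the ranges sorted by start address and binary-search them. `struct code_range` and `find_code_range` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one JIT-compiled function: [start, end). */
struct code_range {
    uintptr_t start;
    uintptr_t end;
};

/* Binary search over `count` ranges that are sorted by start address
 * and do not overlap. Returns the index of the range containing `pc`,
 * or -1 if no range contains it. Cost is O(log n) per lookup. */
long find_code_range(const struct code_range *ranges, size_t count,
                     uintptr_t pc)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pc < ranges[mid].start)
            hi = mid;           /* pc lies before this range */
        else if (pc >= ranges[mid].end)
            lo = mid + 1;       /* pc lies after this range */
        else
            return (long)mid;   /* start <= pc < end */
    }
    return -1;
}
```

Keeping the table sorted makes registration more expensive than appending to a list, so a real JIT would have to weigh that against lookup frequency.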

------
kaushiks
While RtlInstallFunctionTableCallback was designed specifically for JIT
compilers, RtlAddFunctionTable may also be used. The trick is to give every
function its own function table. I wrote Chakra's P/XData handler and was
surprised to find that they still haven't switched to function table callbacks
[1], like the CLR has.

[1]
[https://github.com/Microsoft/ChakraCore/blob/7587147fac16654...](https://github.com/Microsoft/ChakraCore/blob/7587147fac16654797fca082deeef607c4f3d6a6/lib/Common/Memory/amd64/XDataAllocator.cpp#L220)

------
ziotom78
A few months ago I tried to port the "supposedly portable" libunwind to
OpenBSD, as a part of an effort to port Julia under OpenBSD. I failed
miserably, as I found the task was too complex for my abilities (see this post
of mine: [https://groups.google.com/forum/#!topic/julia-
dev/ndzuU9NixK...](https://groups.google.com/forum/#!topic/julia-
dev/ndzuU9NixKU)).

At that time, I blamed myself for not having been able to understand
libunwind's code. I confess that this post has changed my perspective a bit.

~~~
KenoFischer
If you ever pick this up again, I'd recommend starting with LLVM's libunwind
library, which is an import of Apple's libunwind + support for other operating
systems. As far as I can tell it's the only fully functional unwind library
that's actually maintained at the moment. We discussed porting whatever
functionality it is missing (probably Linux support and the basic assembly
profiling heuristics we added to our fork of Apple's libunwind) and just using
it for Julia on all platforms.

------
kazinator
In the "C ecosystem", there is no such thing as universal unwinding.
Everything that needs unwinding rolls its own.

The "unwinding" that libunwind refers to is merely the restoration of the
procedure state (stack pointer and callee-saved registers). It doesn't know
the _ad hoc_ schemes for cleaning up the resources associated with a frame.

So that is to say, we can't take this library into the implementation of some
language, and gain the ability to unwind through the frames of a foreign
library which we called (and which called back into us) for doing things like
safely performing a non-local return across that library. It's not going to
deal with its resources, even if that library has some sort of unwind-protect
scheme.

Basically, it's hard to understand what libunwind is for; what do I gain by
integrating libunwind, beyond dumping back traces? Maybe I can integrate an
interactive mini-debugger into the program where you can step through frames
and examine state?

------
gumby
To be fair it's not unreasonable that every language has its own approach.

In "the old days" some systems had system-wide calling conventions. For
example VMS had a standard calling convention all languages were supposed to
use. Sounds good -- tools and subroutines can interoperate, right? Actually
not really: object representation differed, for example. Plus new features
(such as exceptions) that hadn't been thought up when the calling convention
was developed couldn't be supported.

SVR4 attempted to make a common set of calling conventions, and to some degree
succeeded, but really didn't help that much, for the same reasons as above.

As with so many problems in computing, the problem can possibly be solved by
adding a layer of indirection: add a special ELF section with special entry
points that support unwinding and examining frames. As long as you compile all
your code with one compiler you should be OK...but what happens when new
language features are added?

------
js2
Unwinding under Android, especially from JNI, is also a mess.

[http://blog.httrack.com/blog/2013/08/23/catching-posix-
signa...](http://blog.httrack.com/blog/2013/08/23/catching-posix-signals-on-
android/)

------
tiles
Is anyone able to add perspective as to how this compares to OS X?

~~~
setpatchaddress
In general, OS X infrastructure looks more like Linux than Windows. If the
overall complaint is that it's not built into the OS, well, welcome to UNIX,
island of isolated, uncooperative, and sometimes broken toys.

I like UNIX, don't get me wrong, but some of the API design is truly
appalling.

~~~
simscitizen
Luckily Apple is sane enough to compile its core libraries with frame
pointers, so unwinding is pretty trivial. See for yourself (this function is
called by backtrace(3)):
[http://opensource.apple.com//source/Libc/Libc-825.40.1/gen/t...](http://opensource.apple.com//source/Libc/Libc-825.40.1/gen/thread_stack_pcs.c)
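
As an illustration of why frame pointers make this easy (a sketch of the general technique, not Apple's code), each frame then starts with a two-word record: the caller's saved frame pointer followed by the return address. Walking the stack is just following that linked list. `struct frame_record` and `walk_frame_chain` are hypothetical names, and the record layout assumes the usual x86-64 convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame record as laid out by code compiled with frame pointers:
 * the caller's saved frame pointer, then the return address. */
struct frame_record {
    struct frame_record *caller;
    void *return_address;
};

/* Follow the chain starting at `fp`, storing at most `max` return
 * addresses into `pcs`. Returns how many were stored. */
size_t walk_frame_chain(const struct frame_record *fp, void **pcs,
                        size_t max)
{
    size_t n = 0;
    while (fp != NULL && n < max) {
        pcs[n++] = fp->return_address;
        const struct frame_record *next = fp->caller;
        /* The stack grows down, so a caller's record must sit at a
         * higher address. Anything else means a corrupt chain. */
        if (next != NULL && (uintptr_t)next <= (uintptr_t)fp)
            break;
        fp = next;
    }
    return n;
}
```

In a real program built with GCC or Clang and `-fno-omit-frame-pointer`, the starting point would be something like `__builtin_frame_address(0)`; the walk is only trustworthy if every frame on the stack keeps a frame pointer.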

~~~
mappu
On the Windows side, Vista was the first client release to begin reincluding
frame pointers, having omitted them since (at least) NT 3.51.

[https://blogs.msdn.microsoft.com/larryosterman/2007/03/12/fp...](https://blogs.msdn.microsoft.com/larryosterman/2007/03/12/fpo/)

