
Selfrando: Securing the Tor Browser Against De-Anonymization Exploits [pdf] - ikeboy
https://people.torproject.org/~gk/misc/Selfrando-Tor-Browser.pdf
======
rinon
Tor devs wrote a short Q&A about this project:
https://blog.torproject.org/blog/selfrando-q-and-georg-koppen

------
nxzero
One thing I haven't seen done is blocking attempts to fingerprint keystroke
inputs from the keyboard. The easiest solution would be to sandbox field inputs
from JS and bulk-submit queries. The best solution would be to inject noise in
real time, based on near-average inputs for the same strings of characters.
Make no mistake, though: this is being done, and it very rapidly marks you as you.

~~~
walterbell
Does this only affect search engine autocomplete fields in the browser, or
also standard input forms? Is disabling JS sufficient for input forms?

------
comex
I've worked on this kind of thing before. This is generally a sane-looking
protection mechanism, which would ideally be combined with a lot of other
protection mechanisms, but the paper makes a few silly claims. To wit:

> For these reasons we assume that in a practical scenario the attacker cannot
> leak information that is not located on the heap, e.g., stack or code pages.

[..]

> Since the attackers can only disclose the virtual table pointer, but not the
> virtual table itself, as it is not on the heap, they cannot disclose gadget
> addresses.

[..]

> We therefore conclude that selfrando can thwart most real-world exploits.
> Attackers can only succeed in rare cases where they can disclose the
> complete heap and data section.

Yeah, no. It's actually not that hard to work around in most cases. You can't
build a ROP chain in one step after leaking a vtable address, but typically
the way you get into a ROP chain in the first place is by overwriting a vtable
pointer, and often (albeit not always) you have a pretty wide selection of
types of object to overwrite the vtable pointer from (because they all go in
the same heap, though separated by size class). All you need is some object
such that you can make a call to a native method from JavaScript and as part
of its implementation, (ideally) exactly one virtual method on the object will
end up being invoked, optimally with predictable parameters and with its
return value carried back to JS. I only have experience with WebKit: in its
case, most DOM objects are wrappers around a single C++ implementation object,
and autogenerated glue code translates JavaScript calls to its methods into
native calls to C++ methods with the same name, taking care of type checking
and such. And many of those methods are virtual, which fits the bill very
well. Firefox may be different, but I doubt it’s hard to find something that
works.

A C++ virtual call, for anyone who doesn't know, works like this at an
assembly level: (1) load the vtable pointer from the object (this is what's
being overwritten); (2) add a fixed offset to get to the right index in the
vtable; (3) load a function pointer from there; and (4) call it, with 'this'
as an implicit first argument.
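
To make those four steps concrete, here is a minimal sketch that models the dispatch by hand with an explicit array of function pointers. The names (`Object`, `Method`, `kVtable`, `virtual_call`) are made up for illustration, and real compilers lay vtables out according to their own ABI, so treat this as a model of the mechanism rather than what any particular compiler emits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical hand-rolled model of a C++ object with a vtable pointer.
struct Object;
using Method = std::uintptr_t (*)(Object* self, std::uintptr_t arg);

struct Object {
    const Method* vptr;    // first word of the object: pointer to its vtable
    std::uintptr_t field;  // some instance data
};

std::uintptr_t add_field(Object* self, std::uintptr_t arg) { return self->field + arg; }
std::uintptr_t mul_field(Object* self, std::uintptr_t arg) { return self->field * arg; }

// The "vtable": a contiguous array of function pointers in static data.
const Method kVtable[] = {add_field, mul_field};

// What a call site for "the virtual method at slot `index`" boils down to.
std::uintptr_t virtual_call(Object* obj, std::size_t index, std::uintptr_t arg) {
    const Method* vtable = obj->vptr;     // (1) load the vtable pointer from the object
    const Method* slot = vtable + index;  // (2) add a fixed offset for the method's index
    Method fn = *slot;                    // (3) load the function pointer from there
    return fn(obj, arg);                  // (4) call it, with 'this' as the first argument
}

// Small self-check: slot 0 adds, slot 1 multiplies.
void demo_dispatch() {
    Object obj{kVtable, 10};
    assert(virtual_call(&obj, 0, 5) == 15);
    assert(virtual_call(&obj, 1, 5) == 50);
}
```

The only thing the call site trusts is the word stored in `vptr`, which is why overwriting that one word redirects every subsequent virtual call on the object.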

If you can leak any object's vtable pointer, and you know its offset in the
original binary from the start of the data segment, you can calculate the
address of anything in that segment. So you look for a function pointer
(usually part of another vtable, but could be anything) to a function that has
a similar signature to the method invoked by the JavaScript call, but does
something more interesting. As a simple example, if the method has the
signature "int calculateSomething(int value)", where you can control 'value'
and get the return value, and you're on a 32-bit platform, you could find some
virtual method that just returns a field from an object passed as a parameter
- on any class anywhere in the codebase. You find the function pointer’s
runtime address based on the leaked vtable pointer, subtract the fixed offset
from the original method, and set that as your overwritten vtable pointer.
Then you call the JavaScript method, passing some integer of your choice as a
parameter... and execution passes to the new method, which dutifully treats
the integer as a pointer, loads a value from it, and returns it to you. Voila,
you can disclose whatever you want. After that, you have the whole Turing-
complete JavaScript language to assist you in generating a ROP chain.
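
The address arithmetic in that paragraph can be sketched with a toy model. Everything here is an assumption made for illustration: the "data segment" is a single array of function pointers, the offsets (`kVictimVtableOffset`, `kGetterOffset`, `kCalcSlot`) are invented, and `std::uintptr_t` stands in for the 32-bit `int` so that an integer can hold a pointer. It is not taken from the paper or from any real browser.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Victim;
using Slot = std::uintptr_t (*)(Victim* self, std::uintptr_t arg);

struct Victim {
    const Slot* vptr;      // the word the attacker can both read and overwrite
    std::uintptr_t state;
};

// The legitimate method, playing the role of "int calculateSomething(int value)".
std::uintptr_t calculate_something(Victim* self, std::uintptr_t value) {
    return self->state + value;
}
// An unrelated getter from some other class: treats its argument as a pointer
// to an object and returns a field loaded from it.
std::uintptr_t get_field(Victim*, std::uintptr_t other) {
    return *reinterpret_cast<const std::uintptr_t*>(other);
}
std::uintptr_t unused(Victim*, std::uintptr_t) { return 0; }

// Hypothetical offsets, known to the attacker from the original binary.
constexpr std::size_t kVictimVtableOffset = 2;  // Victim's vtable within the segment
constexpr std::size_t kGetterOffset = 5;        // entry that points at get_field
constexpr std::size_t kCalcSlot = 1;            // slot index the call site always uses

// Toy data segment: functions may move, but this layout stays fixed.
const Slot kDataSegment[] = {
    unused, unused,               // other static data
    unused, calculate_something,  // Victim's vtable (offsets 2..3)
    unused,                       // other static data
    get_field, unused,            // some other class's vtable (offsets 5..6)
};

// Stand-in for the native method reachable from script.
std::uintptr_t js_call(Victim* obj, std::uintptr_t value) {
    return obj->vptr[kCalcSlot](obj, value);
}

// From one leaked vtable pointer, compute the forged one.
const Slot* forge_vptr(const Slot* leaked_vptr) {
    const Slot* segment_base = leaked_vptr - kVictimVtableOffset;
    const Slot* getter_entry = segment_base + kGetterOffset;
    return getter_entry - kCalcSlot;  // so that forged[kCalcSlot] is get_field
}

// Self-check: after the overwrite, the same call reads memory of our choosing.
void demo_arbitrary_read() {
    Victim v{kDataSegment + kVictimVtableOffset, 100};
    assert(js_call(&v, 1) == 101);
    std::uintptr_t secret = 0xC0FFEE;
    v.vptr = forge_vptr(v.vptr);
    assert(js_call(&v, reinterpret_cast<std::uintptr_t>(&secret)) == 0xC0FFEE);
}
```

Note that nothing in `forge_vptr` depends on where `get_field` itself lives, only on where the pointer to it sits relative to the leaked vtable, which is the part that function-only randomization leaves untouched.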

In practice it's usually somewhat more complicated than that. And this
approach would be significantly mitigated if Selfrando randomized objects
within the data segment rather than just functions. That wouldn't completely
prevent it - for one thing you could try to find a replacement function at a
different index within the original vtable, which must be kept contiguous, and
there are other approaches - but it would help a lot. Considering that
functions and data are pretty similar from the linker's perspective, I'm
pretty curious why Selfrando doesn't do that.

Anyway, what they have implemented definitely won't "thwart most real-world
exploits", unless the attackers are dumb, or reluctant to spend time adapting
their exploit to the Tor Browser specifically. Personally, I usually use some
sort of vtable confusion anyway just to defeat ASLR, so Selfrando probably
wouldn't even affect me much...

Which is not to say it's not worth using.

Moving on:

> Based on these four observations, we examined the main TB library with
> selfrando enabled (libxul.so having a size of 92MB) to find out whether an
> attacker is able to disclose the address of a stack-pivot and a system call
> gadget based on addresses that can be found on the heap. We focus on stack-
> pivot and system call gadgets because they are less common, and therefore,
> harder to disclose compared to gadgets that only load a value into a
> register. In total, we found ten stack-pivot and 76 system call gadgets

If you only found 10 stack pivot gadgets in 92 MB, you _really_ didn’t try
very hard. I bet they didn’t consider jumping into the middle of instructions
on x86. Not that that’s the only way to change %rsp.
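
As a rough illustration of the mid-instruction point, the sketch below searches a byte buffer for a pattern at every byte offset rather than only at instruction boundaries. The helper names are hypothetical, and the six-byte buffer is a hand-assembled example, not code from libxul.so: `B8 94 C3 00 00` is `mov eax, 0xC394`, but decoding from the second byte gives `94` (`xchg eax, esp`) followed by `C3` (`ret`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Return every byte offset at which `pattern` occurs in `code`, regardless of
// whether that offset is the start of an intended instruction.
std::vector<std::size_t> find_pattern(const std::vector<std::uint8_t>& code,
                                      const std::vector<std::uint8_t>& pattern) {
    std::vector<std::size_t> hits;
    if (pattern.empty() || pattern.size() > code.size()) return hits;
    for (std::size_t i = 0; i + pattern.size() <= code.size(); ++i) {
        bool match = true;
        for (std::size_t j = 0; j < pattern.size(); ++j) {
            if (code[i + j] != pattern[j]) { match = false; break; }
        }
        if (match) hits.push_back(i);
    }
    return hits;
}

std::vector<std::size_t> demo_hidden_pivot() {
    // Intended instructions: mov eax, 0x0000C394 (B8 94 C3 00 00); ret (C3).
    const std::vector<std::uint8_t> code = {0xB8, 0x94, 0xC3, 0x00, 0x00, 0xC3};
    // Decoded from offset 1 instead: xchg eax, esp (94); ret (C3).
    const std::vector<std::uint8_t> pivot = {0x94, 0xC3};
    std::vector<std::size_t> hits = find_pattern(code, pivot);
    assert(hits.size() == 1 && hits[0] == 1);  // found inside the mov's immediate
    return hits;
}
```

A scan that only looks at instruction boundaries would report no pivot in this buffer, so a count obtained that way undercounts what is actually reachable.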

> While software protected by selfrando works smoothly with unprotected
> libraries (and protected libraries work smoothly with unprotected programs),
> the security guarantees provided by selfrando are obviously limited to
> software that was re-built with selfrando. The TB includes most needed
> libraries, and hence, is not affected by this.

Does that mean it statically links against _everything_, like GTK and glibc?
I don’t think it does, though I could be wrong. In any case, that’s
fundamentally impossible on OS X and Windows. You have to deal with system
libraries in your address space. Finding their address is another matter,
depending on the OS… I’ve never had to do much along these lines so I don’t
know how hard/unreliable it is on different OSes to find pointers to system
libraries in the heap, potentially guess an offset from the browser image
itself… but anyway, the issue deserves consideration and the Tor Browser is
certainly not “not affected”.

> Further, we found that some assembly instructions are sensitive to
> alignment, e.g., movdqa which is commonly used in the implementation of
> cryptographic functions.

> Moreover, we are working on a static analysis tool that can identify
> functions that contain these instructions, and mark them in the TRaP info so
> RandoLib can take their alignment constrains into account.

Huh? movdqa doesn’t need RIP to be aligned, it needs the operand to be
aligned. The operand could be RIP-relative, but at least on OS X that means it
points to the data segment, so there is a relocation for it and the reference
will be rewritten appropriately. Does GNU sometimes embed constants in the
text segment or something? I’ve never heard of that…

Anyway, you could just stick a s/movdqa/movdqu/ pass in front of your
assembler. On modern processors it’s no slower.
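
A small sketch of the operand-alignment point, using SSE2 intrinsics where they are available. It relies on the usual behaviour that `_mm_loadu_si128` and `_mm_storeu_si128` compile to unaligned moves (typically `movdqu`), while `_mm_load_si128` would typically become `movdqa` and requires a 16-byte-aligned address. The helper names and the `memcpy` fallback for non-x86 targets are my own additions for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__) || defined(_M_X64) || defined(_M_AMD64)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#endif

// The alignment requirement concerns the address of the data operand,
// not the address of the instruction doing the load.
bool is_16_byte_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Copy 16 bytes using the unaligned load/store forms, which accept any address.
void copy16_unaligned(void* dst, const void* src) {
#ifdef SKETCH_HAVE_SSE2
    __m128i v = _mm_loadu_si128(static_cast<const __m128i*>(src));
    _mm_storeu_si128(static_cast<__m128i*>(dst), v);
#else
    std::memcpy(dst, src, 16);  // portable fallback on non-x86 targets
#endif
}

bool demo_unaligned_copy() {
    alignas(16) unsigned char src[32];
    for (int i = 0; i < 32; ++i) src[i] = static_cast<unsigned char>(i);
    unsigned char dst[16];
    const unsigned char* misaligned = src + 1;  // deliberately off by one byte
    assert(is_16_byte_aligned(src));
    assert(!is_16_byte_aligned(misaligned));
    copy16_unaligned(dst, misaligned);
    return std::memcmp(dst, misaligned, 16) == 0;
}
```

Moving the function that contains the load changes nothing here; only the address passed as `src` decides whether an aligned load would have been legal.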

> Future Work

> We are currently working on improving operating specific features, such as
> the support for thread-local storage (TLS). TLS is heavily used in Firefox’s
> default heap allocator jemalloc, however, it is possible to build the TB
> using the default heap allocator provided by libc instead, which does not
> rely on TLS. In fact, the TB developers expressed their desire to use a
> different allocator as well [56].

Any fast allocator must use some form of thread-local storage, so a “desire to
use a different allocator” is fairly irrelevant. In fact, glibc’s allocator
uses it - I guess it must use it in a different way that doesn’t cause
incompatibility…

I don’t have the slightest clue why Selfrando would be incompatible with
thread-local storage, though.

If “operating specific” means “operating system specific”, surely making this
work on anything other than Linux would be a useful such feature.

