
Retpoline: A Branch Target Injection Mitigation [pdf] - ingve
https://software.intel.com/sites/default/files/managed/1d/46/Retpoline-A-Branch-Target-Injection-Mitigation.pdf
======
donaldihunter
I can't shake the feeling that Intel are trying to take credit for retpoline.
They make no effort to credit the folk at Google who came up with retpoline.
You have to follow the reference in section 8 to infer that Google are the
saviours here.

~~~
hansendc
Disclaimer: I put a big chunk of that paper together.

That definitely was not the intent! Getting retpoline to the place it is today
took a ton of work from a ton of people, including the awesome folks at Google
like Paul Turner, and countless people in the Linux community.

------
Scaevolus
There are already compiler patches for retpoline, but section 5.2 is alarming
for Skylake and above:

"When the return stack buffer “stack” is empty on [>= Skylake] processors, a
RET instruction may speculate based on the contents of the indirect branch
predictor, the structure that retpoline is designed to avoid. The RSB may
become empty under the following conditions:

1. Call stacks deeper than the minimum RSB depth (16) may empty the RSB when
executing RET instructions. This includes CALL instructions and RET
instructions within aborting TSX transactions.

[list of ~10 other situations that empty the RSB stack]"

They describe an "RSB stuffing" procedure, but I don't see any realistic way
to guarantee that it happens properly with general code. How many call stacks
do you have that are more than 16 frames deep? How many of those are recursive
or _dynamic_?
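
To make the depth concern concrete, here is a rough toy model of the underflow case from item 1. It is only a sketch: the depth of 16 is the minimum quoted above, while the function name and the "oldest entry is dropped when full" behaviour are my own simplifying assumptions, not something the paper specifies.

```python
from collections import deque

RSB_DEPTH = 16  # minimum RSB depth quoted from the paper


def count_rsb_underflows(call_depth, rsb_depth=RSB_DEPTH):
    """Toy model: nest `call_depth` CALLs, then unwind with as many RETs.

    Hypothetical helper for illustration; real hardware is more complicated.
    """
    # Assumption: a full RSB silently drops its oldest entry on a new CALL.
    rsb = deque(maxlen=rsb_depth)
    for frame in range(call_depth):
        rsb.append(frame)      # CALL pushes a predicted return address

    underflows = 0
    for _ in range(call_depth):
        if rsb:
            rsb.pop()          # RET consumes the newest prediction
        else:
            underflows += 1    # empty RSB: RET may use the indirect predictor
    return underflows


if __name__ == "__main__":
    for depth in (8, 16, 20, 64):
        print(depth, count_rsb_underflows(depth))
```

In this model every frame beyond the 16th produces one RET that executes with an empty RSB, so a chain of 20 nested calls leaves 4 such returns on the way back out.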

~~~
hansendc
Disclaimer: I put a big chunk of that paper together.

You ask how it can be guaranteed with "general code". The first thing to
remember is that retpoline is not for "general code". Linux, for instance,
does not support arbitrary call depth and barely uses recursion.

Also, take a close look at the "Exploit Composition" section of the paper.
Those five conditions are much harder to satisfy at 'RET' than they are during
the demonstrated Spectre variant 2 exploit points. For instance, a long
speculation window (#5) for 'RET' is interesting to generate since it means a
stall while waiting on something to come off the stack.

~~~
jesup
"Linux, for instance, does not support arbitrary call depth and barely uses
recursion." Perhaps there's context missing from that statement? Linux
certainly can have arbitrarily deep call depth, depending on the stack size.
Are you referring to the kernel? That would be odd, since the paper talks
about application code needing to be fully recompiled with retpoline to be
safe.

(Of course, that means that all libraries you link with or dynamically load
have to be compiled with retpoline too.)

~~~
hansendc
Yes, I was referring to the Linux kernel: it barely uses recursion and
has small stack sizes compared to normal applications.

