
Time protection: the missing OS abstraction - walterbell
https://blog.acolyer.org/2019/04/15/time-protection-the-missing-os-abstraction/
======
niftich
This post lifts out the key parts of the paper [1] and is a good summary. I
think the paper is an accessible read as well.

Not too much discussion two weeks ago when the paper was posted on HN [2], so
I will raise a point I've made before [3][4][5] and is consistent with the
recommendations of the paper (and another post in this thread [6]): this is an
opportunity to improve the terminology, mental models, and formalisms of
observable state, and its implications for information hiding, privilege
separation, and computer system design.

This conversation needs to occur not only among (e.g.) computer chip designers
and cryptography experts, but also among higher-stack users of that
technology, so that the information-leakage aspects and trade-offs can be
analyzed together with other performance indicators of the system.

It seems as if the haphazard, ad hoc way that chipmakers and system architects
dealt with this issue has contributed to an environment where Spectre could
occur: such timing attacks were never a secret, but resistance to them at
various levels of mainstream computing appears to have been fitted in a
patchwork of hasty fixes and well-meaning but informal caution. The
conversation around this topic could use an upgrade, and the paper's authors
agree.

[1]
[https://ts.data61.csiro.au/publications/csiroabstracts/Ge_YC...](https://ts.data61.csiro.au/publications/csiroabstracts/Ge_YCH_19.abstract.pml)
[2]
[https://news.ycombinator.com/item?id=19547293](https://news.ycombinator.com/item?id=19547293)
[3]
[https://news.ycombinator.com/item?id=17308014](https://news.ycombinator.com/item?id=17308014)
[4]
[https://news.ycombinator.com/item?id=16165942](https://news.ycombinator.com/item?id=16165942)
[5]
[https://news.ycombinator.com/item?id=19644997](https://news.ycombinator.com/item?id=19644997)
[6]
[https://news.ycombinator.com/item?id=19670296](https://news.ycombinator.com/item?id=19670296)

~~~
ccvannorman
You might even say there's room for a startup to reinvent computers (and
operating systems) from the ground up.

Would you?

~~~
erikpukinskis
Well let's think it through. It would seemingly have to follow the Disruptive
Technology arc. That means it would start off as a crappy computer that had
really great information leakage control. And it would be marketed to a small
but enthusiastic market who really values that, and doesn't mind that most
everything else is a PITA.

Who might that market be?

~~~
ryacko
Major cloud companies and VM providers?

Most of these vulnerabilities come from speculative execution impacting a
shared cache.

~~~
gmueckl
Nah, cloud providers are under enormous pressure to optimize resource usage
(real estate, electricity, labor). A hardware design without competitive
performance has a snowball's chance in hell of being adopted in that space.
The economics of cloud datacenters mean that security only needs to be good
enough, not perfect.

~~~
pferde
In that case, the perception that current hardware security is good enough
needs to change.

Or the cloud vendors could start offering compute running on new, more secure
hardware - e.g. special VM types marketed accordingly.

~~~
gmueckl
I think that this is wishful thinking on your part unless the drastic drops in
performance per watt are negated.

I personally think that these performance drops lead to a horrific
environmental impact, because every performance drop means that more hardware
needs to be provisioned and powered to counter it. So this directly results in
more toxic waste from these electronics and a higher carbon dioxide output into
the atmosphere (cloud data centers are a major consumer of electricity).
Compared to the long-term impact of that, a few extra security breaches sound
like the lesser evil to me personally.

------
akersten
It saddens me that we're collectively going to spend a lot of effort trying to
patch out a problem that we've imposed upon ourselves. We were making such
great progress in terms of processing speed until someone came along and
decided that we need to have multiple tenants share the same hardware, and
they should have no way of knowing anything about each other. The vast
majority of consumer hardware will _never_[0] be exposed to this category of
attack, but will pay the performance penalty regardless.

Fundamentally, the need is for a completely different model of computation to
abstract away time-channel leaks. This cannot be fixed by patching existing
software and hardware, and we're going to go through a lot of pain and anguish
trying. As another comment points out, the well of possible timing attacks is
infinitely deep (attached hardware, network performance measurements, etc.).

The two options are performance or security, pick one. It seems the industry
is trying to pick both, and it's going to take us a long time to realize that
we're going to get neither.

For clarity - my proposal is segmenting hardware and software products into
the two categories of "general purpose, trusted computing" and "safe for
shared hosting." The second category is so small compared to the first that it
seems unfair that its domain-specific problems should hamper the rest of us.

[0] Thanks to a combination of reasonable software mitigations
(lower-resolution timers for unprivileged code) and the fact that most of
these attacks require arbitrary code execution in the first place

~~~
speedplane
> my proposal is segmenting hardware and software products between the two
> categories of "general purpose, trusted computing" and "safe for shared
> hosting."

If a normal user is using a web browser and one tab has their bank
information, and the other has a suspect website, then you have to be
concerned about sharing resources and security.

~~~
xxs
I'd prefer to see only 'trusted' sites being able to use JavaScript. Heck,
even allow JavaScript only after log-in.

Alternatively, run them in an isolated context with no WebSockets, limited
access to timing (second-or-two precision, so a lot of sampling is needed),
limited CPU and memory utilization, no sound, no GPU acceleration [likely
another large side-channel surface], etc. Ah yeah, and delete all their
cookies while we are at it.

JavaScript has turned into total bloat.

~~~
bo1024
Running somebody else's code is putting a ton of trust in them. Maybe someday
we'll have great sandboxing, but for now, it makes no sense to let random
sketchy origins just run whatever code they please on your device.

------
naasking
Very cool ideas. Time protection is indeed a must going forward, particularly
for cloud hosting. Not surprised that seL4 got there first either.

Getting this to work in raw Linux may be hopeless given the breadth of the
kernel data, but they have smart devs, so maybe they'll figure something out.

And Rob Pike said systems research is dead. Ha!

------
dooglius
One issue I see here is that time protection would need to extend to anything
shared, not just CPU micro-architecture. For instance, if a hard drive has a
DRAM-based cache, that could be used as a timing channel, and the complexity
of flash file systems opens up all kinds of potential leaks. In the case of
two processes sharing network access, one process could conceivably estimate
another's network access patterns implicitly by measuring latency through
shared switches or drops due to buffers being filled. Mitigating this would
require some kind of coloring support that goes as far as your ISP's switches,
which seems impractical.

~~~
adrianratnapala
> One issue I see here is that time protection would need to extend to
> anything shared,

We might see this issue as an opportunity. That is, by thinking about a
concept called "time protection" we expose all these things that subsystems
are doing and make them easier to reason about. We can now say "Oh good, XYZ
improves best-case speed, but sadly it also compromises _time protection_ ".

Having such a language means the industry can slowly start improving these
things rather than sweeping them under the rug. It will not stop the
improvement from being slow and difficult.

------
_cs2017_
As an alternative to this approach, I wonder if it's possible to push all
sensitive computations into a few small components, and rewrite those
components carefully to obscure any information that could be obtained from
timing?

~~~
FrozenVoid
Of course. Branchless code equivalents can be written for practically
anything, and you can force-prefetch memory regardless of branch(its an
intrinsic in most C compilers), though this loses performance and branch
prediction benefits.
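For illustration, a minimal sketch of the branchless-selection idiom the comment alludes to (names are my own; assumes `cond` is 0 or 1):

```c
#include <stdint.h>

/* Branchless select: returns (cond ? a : b) with no data-dependent
 * branch. Negating a 0/1 condition yields an all-zeros or all-ones
 * mask, so the running time is independent of cond. */
uint32_t ct_select(uint32_t cond, uint32_t a, uint32_t b) {
    uint32_t mask = (uint32_t)(0u - cond);  /* 0x00000000 or 0xFFFFFFFF */
    return (a & mask) | (b & ~mask);
}
```

The force-prefetch mentioned above is available as `__builtin_prefetch` in GCC and Clang; issuing it on both possible targets makes the memory access pattern independent of the secret condition.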

~~~
_cs2017_
Wouldn't this approach be cheaper, both in performance and human effort, than
fixing the entire OS to hide the timing information?

~~~
FrozenVoid
It would require doing this for any new/existing code exposing timing
information, the current timing fixes/isolation patches are much smaller. It
would make more sense in user code, like browsers/media players/etc to remove
influence of branch prediction regardless of host OS, just like GCC does with
retpoline insertion.

------
myWindoonn
Very dramatic graphs.

I wonder when programming languages will start factoring out the ability to
check system timers. Some capability-aware languages have already done so.

~~~
6keZbCECT2uB
I wonder if, were high-resolution timers privileged, we could get by with
lower-resolution timers. I'm not sure any timing attacks would work with
second or even millisecond resolution timers.

I don't see how handling this at the programming-language level could help,
and I think that whether timing is privileged or not is built into CPUs, so
there's nothing we can do about it, but this seems plausibly acceptable for
dealing with speculation: permit speculation, but make it privileged to detect
whether speculation occurred.

~~~
skybrian
Apparently low resolution timers do work, but you need more data.

[https://news.ycombinator.com/item?id=16068631](https://news.ycombinator.com/item?id=16068631)

------
irq-1
Just a thought, but can't the OS prevent applications from knowing anything
about other applications? Rather than isolating apps by flushing/coloring
everything, couldn't the apps not know what else is running? Two apps can't
communicate, or one app spy on another, if an app doesn't know what else
was/is running.

(Not sure why this is wrong, but confident it must be.)

~~~
psds2
These exploits work on a hardware level, and do not require the malicious app
to know what else is running. For example, a VM does not know what other VMs
are running on the same host in AWS, but Spectre/Meltdown still affected AWS
hosts. They are reading data other apps have written to memory.

------
JoeAltmaier
Or, don't let timing reveal privileged information? This seems like a
sledgehammer for a fly problem.

~~~
dooglius
How is what you're talking about different from what the article describes?

~~~
Spivak
Application vs. system level. I think the parent is saying that, since the
application is in a unique position to know what information is privileged, it
would be better to make available a library of constant-time functions that
are resistant to timing attacks than to constantly pay the performance cost of
blunter system-level boundary enforcement.

However, I'm not sure how much merit this argument has, since very few
applications that need this level of protection even bother with it.
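As a sketch of what such a library provides, the classic example is an equality check that touches every byte and takes no data-dependent branches, unlike `memcmp`, which may return at the first mismatch:

```c
#include <stddef.h>
#include <stdint.h>

/* Constant-time equality check: the loop always runs n iterations and
 * accumulates differences with OR, so the running time does not reveal
 * the position of the first mismatching byte. */
int ct_equal(const uint8_t *a, const uint8_t *b, size_t n) {
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= a[i] ^ b[i];
    return diff == 0;  /* 1 iff every byte matched */
}
```

Real libraries already offer this, e.g. OpenSSL's `CRYPTO_memcmp` and libsodium's `sodium_memcmp`.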

~~~
dooglius
That works fine when an application just wants to protect a private key in
memory. But if you want to build, say, an application where a user enters via
keyboard information that you want to protect from another application, you
have to worry about keystroke timing attacks. That means the application needs
to hide whether it did anything at all in a given time slice, which can be
inferred from the micro-architectural information discussed in the article.

------
simion314
I would like to see how Boeing and the FAA decided to pretend that all was
fine after the first crash; there were enough clues that MCAS had issues, and
I would like to see how it was decided that it was safe to fly while the
software fix was not deployed.

