
Hot or Not: Revealing Hidden Services by their Clock Skew [pdf] - lelf
http://www.cl.cam.ac.uk/~sjm217/papers/ccs06hotornot.pdf
======
awda
Tl;dr: By placing load on suspected hidden service hosts (over normal, non-
anonymous IP), one can then measure the change in clock skew (as a result of
higher CPU / chassis temperature from the higher load) over the anonymous
channel to confirm it is the same host (by comparison with clock skew before
load).

The result holds over many hops and onion layers (this is the usual "the
average of random noise is zero" thing). Very cool.

~~~
anon4
If the machine has no public services running, would that attack still work?
What if it's behind NAT or a hw firewall with ssh exposed only via port
knocking?

~~~
awda
It needs to be connectible over IP (and TCP, I think?) for it to talk to Tor,
which is necessary for running a hidden service. You could imagine an
anonymizing network where that is not necessary, I guess, by having the
service connect to some other internal node over NAT. (Although then you could
just Sybil attack with a bunch of nodes and wait for the target to connect to
you.)

------
tetha
This is a good demonstration why security and privacy are hard. Just think
about it for a second: Load on the CPU affects the clock and you can measure
this clock skew remotely. There are so many possible interactions in a modern
computer (and even more you're unaware of) and it looks like every single one
of them has to be considered a side channel.

------
zaroth
Inducing system load on a Tor hidden service, to generate heat from the CPU,
to increase temperature of the quartz crystal driving the system clock, to
cause system clock skew, which is remotely detectable via the TCP sequence
number generated by rand(), or more directly by TCP timestamps (RFC 1323).
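
Roughly, the measurement works like this sketch (function names and synthetic samples are mine; a real attack would read the TCP timestamp option from packets captured over the circuit): collect (local receive time, remote timestamp) pairs, fit a line, and read the skew off the slope's deviation from the nominal tick rate.

```python
# Estimate clock skew from (local_receive_time_s, remote_tcp_timestamp)
# pairs. Synthetic data below; the remote tick rate is assumed to be 1000 Hz.

def estimate_skew_ppm(samples, hz=1000):
    """Least-squares slope of remote ticks vs. local seconds, as ppm offset."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = num / den                   # remote ticks per local second
    return (slope / hz - 1.0) * 1e6    # deviation from nominal, in ppm

# A host whose clock runs 50 ppm fast, sampled every 10 s for an hour:
samples = [(t, round(t * 1000 * (1 + 50e-6))) for t in range(0, 3600, 10)]
print(round(estimate_skew_ppm(samples), 1))   # ~50.0
```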

This lets you try to check if a given hidden service is running on a known
machine, or if two hidden services are running on the same machine.

Could you skip everything in the middle, since request latency is correlated
with system load? You have to load the server in either case, so both are
active attacks. I think the problem is that latency is so variable due to Tor
itself, it's actually faster to measure server load through clock skew than
through request latency.

How would you find a candidate public server to run this attack against? "Many
hidden servers are also publicly advertised Tor nodes, in order to mask hidden
server traffic with other Tor traffic, so this scenario is plausible." But I
think you would run your public Tor relay on a different machine behind the
same firewall, since you want the absolute minimum amount of processes running
on the machine actually hosting the hidden service.

(My comment on this from yesterday, but it's back on the front page as a new
submission)

~~~
weland
> Could you skip everything in the middle, since request latency is correlated
> with system load? You have to load the server in either case, so both are
> active attacks. I think the problem is that latency is so variable due to
> Tor itself, it's actually faster to measure server load through clock skew
> than through request latency.

I don't think it's only the variable latency due to Tor, but also the fact
that latency depends so heavily on load and on the software and hardware
configuration that it can't be reliably used for fingerprinting.
Clock skew, on the other hand, occurs due to various fabrication parameters
not being constant; it has a random element that can be reliably used to
fingerprint physical machines. To put it another way, two identical machines
-- built out of identical components, running perfectly identical software, on
entirely identical storage media, with exactly the same bits written in
exactly the same location of the hard drive and RAM -- placed behind a NAT
would be impossible, or at least much harder to discern based on latency
alone, while being comparatively easier to discern by clock skew.

------
llama-made
Isn't this trivially defeated by deliberately running the CPU on Tor nodes at
100% all the time, pointlessly burning cycles if there's no traffic to pass?
Obviously there's a heat and power consumption consequence in doing that...

~~~
hrrsn
Kind of impractical. Plus it becomes a problem if you ever want to scale...

~~~
Sanddancer
You could mitigate that by running the heater job at a nice level of 19 or
so. It will keep the CPU under load, but server tasks will run with only a
slight degradation.
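
A minimal sketch of such a heater in Python (function name and the LCG constants are mine): raise the process's nice value to 19 so the busy loop only soaks up otherwise-idle cycles and real server tasks preempt it.

```python
import os

def run_heater(iterations):
    """Burn CPU at the lowest scheduling priority (nice 19).

    Bounded here so the demo terminates; a real heater would loop forever.
    """
    os.nice(19)    # only *raises* niceness, so no privileges required
    x = 0
    for _ in range(iterations):
        x = (x * 1103515245 + 12345) & 0xFFFFFFFF   # cheap integer churn
    return x

run_heater(1_000_000)   # server tasks still preempt this at nice 19
```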

------
wfn
If you liked this, you might like to take a peek at:

[http://freehaven.net/anonbib/topic.html](http://freehaven.net/anonbib/topic.html)
(take special notice of the highlighted papers.)

------
jimmytidey
Presumably you could just add some randomness to the timestamps you transmit?

~~~
mikeash
This means that the attack requires more data, but doesn't make it impossible.
Fundamentally, adding randomness to your timestamps is adding noise to a
signal. By sampling the signal repeatedly, you can average out the noise.
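
A quick illustration with made-up numbers: even ±500 ms of uniform jitter on each timestamp averages away once the attacker collects enough samples.

```python
import random

random.seed(1)
true_offset = 0.137     # host clock ahead by 137 ms
jitter = 0.5            # deliberate +/- 500 ms of randomness per timestamp

def observed_offset(n):
    """Mean measured offset over n jittered samples."""
    return sum(true_offset + random.uniform(-jitter, jitter)
               for _ in range(n)) / n

# The estimate converges toward the true offset as n grows:
for n in (10, 1_000, 100_000):
    print(n, round(observed_offset(n), 3))
```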

~~~
darkmighty
I'd figure simply quantizing timestamps with larger step sizes would work
better. And sure, you can filter those out too, but you can also make the
attack take too long to be useful. You could also perform the attack on
yourself and adjust accordingly, although this is not robust since it depends
on the details of each attack.

~~~
mikeash
I think that might be worse. By polling your system and waiting for the clock
to roll over, an attacker can almost immediately narrow down your clock to an
accuracy equal to their polling interval.
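
A toy simulation of that (names and numbers are mine): the victim reports only whole seconds, yet polling at 1 ms resolution pins down the second boundary, and hence the clock's phase, to within one polling interval.

```python
def quantized(true_time, step=1.0):
    """Victim clock: reports time truncated to coarse steps."""
    return int(true_time // step) * step

def locate_tick(start, poll_interval=0.001, step=1.0):
    """Advance in poll_interval steps until the quantized value changes;
    the true tick then lies within one poll_interval of the return value."""
    t = start
    last = quantized(t, step)
    while quantized(t, step) == last:
        t += poll_interval
    return t

# Start polling mid-second; the rollover is found to millisecond accuracy:
edge = locate_tick(start=3.6173)
print(round(edge, 4))   # ~4.0003
```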

Either way, though, more requests will defeat it one way or another. Whether
you can make such an attack impractical will come down to how many requests
the attacker can make versus how much noise you can tolerate in the
timestamps.

------
beernutz
Would it be practical to use some kind of external clock? This might remove
CPU temperature variations from the clock at least.

~~~
CaveTech
Couldn't you just not allow your box to be accessed outside of the anonymous
network? It's a neat trick, but how likely is it that you can trace two
different services to the same server anyways?

~~~
derefr
No. Sadly, Tor doesn't work like a VPN, presenting itself as a network
interface responsible for a virtual subnet you can restrict your connections
to. Instead, Tor connections are plain-old IP connections from random public
IPs. Which means that your hidden service needs to _accept_ connections from
random public IPs--and, therefore, to be public itself. A hidden service might
not have a _published_ IP address, but it must have one that you _could_,
potentially, connect to over the plain internet, in order for the proximate
Tor node to it (what would be called an "exit node" if it were a plain site)
to be able to talk to it.

_In theory,_ you could hack on the Tor client so that clients and servers did
a SIP negotiation prior to connecting on the desired port. You'd then run
something on the host which would act like a port-knocking daemon, temporarily
allowing new connections on a port only in response to a request from the Tor
client, and only to the SIP peer in that message.

(Or, Tor _could_ just present itself like a network interface, giving each
N-proxied-peer a virtual IP that changes whenever it regenerates its identity.
"Hidden service" connections would be regular IP-to-IP communication. For
"public" connections, exit nodes would need to be running SOCKS proxies, and
then there could be an anycast IP address that picked a proxied-exit-node at
random. Then you'd just set that as your plain-old SOCKS proxy in your
browser.)

~~~
quasque
> Which means that your hidden service needs to _accept_ connections from
> random public IPs--and, therefore, to be public itself. A hidden service
> might not have a _published_ IP address, but it must have one that you
> _could_ , potentially, connect to over the plain internet, in order for the
> proximate Tor node to it (what would be called an "exit node" if it were a
> plain site) to be able to talk to it.

This is not at all correct. The IP address of the hidden service is masked in
exactly the same way that its clients' IP addresses are. That is, the client
and service connect across the Tor network to a client-chosen onion router
known as the 'rendezvous point', through which they set up a shared circuit.

See here for more detail: [https://www.torproject.org/docs/hidden-services.html.en](https://www.torproject.org/docs/hidden-services.html.en)

Or here for the technical specification of the hidden service protocol:
[https://gitweb.torproject.org/torspec.git/blob/HEAD:/rend-sp...](https://gitweb.torproject.org/torspec.git/blob/HEAD:/rend-spec.txt)

~~~
derefr
Er. To interpret what I said the way you did, is to assume I was saying "Tor
does nothing, and clients connect directly to servers", which is kind of...
silly, to say the least.

The point I was making was that _the proximate node to the hidden service_
--the last one in the onion-routing chain--connects to its destination by
using _its_ public IP to talk to _the hidden service's_ public IP. From the
perspective of the hidden service, the node proximate to it in the onion-
routing chain is a regular Internet peer, which is impossible to distinguish
from any other regular Internet peer.

In the end, what Tor gives you is a proxy (to a proxy, to a proxy.) And, from
the server's perspective, there's no difference between a proxy and a regular
client. It can't tell, by the IP, that the client it's speaking to is a proxy.
And _because of that_ , you cannot, at the server-level, block non-proxied
clients from speaking to you. Because you don't know which those are.

~~~
quasque
It's not a proxy in the way you are describing. The node (onion router) most
proximal to the hidden service is always connected to by the hidden service,
not the other way around.

Thus it is absolutely fine to firewall off all inbound connections on the host
running the hidden service, as it will only be making outbound connections -
and even those are to a limited set of IP addresses as defined by the guard
nodes it has chosen for entry into the Tor network.

------
ape4
What about a hosted service, e.g. Amazon Web Services? Would that be vulnerable to this?

~~~
jacquesm
It's not hidden. The whole idea is that you expose the hidden service's
location by doing this; if the location of the service is already known there
is no point.

So vulnerable implies that there is something to be gained.

------
wcummings
What if I don't transmit timestamps?

~~~
zaroth
The TCP sequence number generated by rand() can leak your system clock.

~~~
wcummings
That's what I was wondering, ty

------
mantrax5
- Most services are IO bound, not CPU bound, so pegging the CPU to max might
prove a non-trivial task.

- Well designed services don't overload until they're maxing out on either
CPU or other resources; they just serve up to some capacity (say 80%) and
then start flat-out refusing requests with response semantics like "come back
later".

- Timestamp sources are typically not from clocks originating inside the CPU.

~~~
jacquesm
You won't need to peg the CPU, you only need to get it to warm up a little
bit, enough to create a skew that can be detected. Worst case that means that
you need to wait longer but it will still work.

The crystal can be on the motherboard, it does not really matter, as long as
the total heat inside the case is large enough to create a skew that can be
measured the attack will work.

~~~
mantrax5
If you only warm it a little bit, I think your skew becomes lost in the noise
of other people accessing the service. It's tempting to think of that other
traffic as "perfectly uniform noise", but that's not the pattern real web
services see. They get hit in waves most of the time.

If a service gets hit by a wave while you're measuring some suspect server,
here's your false positive right there.

Nice paper but somehow I think this tactic would neither work out well in
practice, nor work in court as a proof.

~~~
jacquesm
All that means is that you need to sample over a longer period.

And it does not have to work 'in court as a proof' to be a practically viable
attack, and they are _well_ beyond theory:

"Implementing this is non-trivial as QoS must not only be guaranteed by the
host (e.g. CPU resources), but by its network too. Also, the impact on
performance would likely be substantial, as many connections will spend much
of their time idle. Whereas currently the idle time would be given to other
streams, now the host carrying such a stream cannot reallocate any resources,
thus opening a DoS vulnerability. However, there may be some suitable
compromise, for example dynamic limits which change sufficiently slowly that
they leak little information. Even if such a defence were in place, our
temperature attacks would still be effective. While changes in one network
connection will not affect any other connections, clock skew is altered. This
is because the CPU will remain idle during the slot allocated to a connection
without pending data. Unless steps are taken to defend against our attacks,
the reduced CPU load will lower temperature and hence affect clock skew. To
stabilise temperature, computers could be modified to use expensive oven
controlled crystal oscillators (OCXO), or always run at maximum CPU load.
External access to timing information could be restricted or jittered, but
unless all incoming connections were blocked, extensive changes would be
required to hide low level information such as packet emission triggered by
timer interrupts.

_While the above experiments were on Tor_, we stress that our techniques
apply to any system that hides load through maintaining QoS guarantees. Also,
there is no need for the anonymity service to be the cause of the load."

