
Measuring Latency in Linux (2014) - krenel
http://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/
======
cataphract
I don't think his comment about CLOCK_MONOTONIC_RAW being slow to query
applies anymore. It used to be slow because it was not implemented in the vDSO
and so each query paid the overhead of a syscall. But there was a big vDSO
refactoring that landed in 5.3 that I think fixed this problem.

Edit: found the patchset. It includes benchmarks for several architectures as
well: [https://lore.kernel.org/linux-arm-kernel/20190621095252.3230...](https://lore.kernel.org/linux-arm-kernel/20190621095252.32307-1-vincenzo.frascino@arm.com/)

~~~
thedance
It's a fun fact that on some cloud VMs (AWS, etc.) vDSO gettime doesn't exist,
so if you rely on the vDSO to make time measurement nearly free, it isn't.

~~~
jstarks
Maybe this is true for AWS VMs that use Xen. I believe that Linux VMs on Azure
do not have this problem, since they use the Hyper-V reference time page,
which can be queried from the vDSO.

~~~
robbjuly
I believe newer-generation AWS VMs, like C5, use the kvm-clock clocksource now
rather than xen. On the older ones, switching the clocksource to tsc speeds
things up.

------
krenel
OP: Related stuff

How Golang [1] implements monotonic clocks — basically, "time.Now()" always
captures both the wall and monotonic clocks; subtraction operations use the
monotonic reading, while printing uses the wall time. Pretty neat. Details in
the proposal by Russ Cox [2].

[1] [https://golang.org/pkg/time/#hdr-Monotonic_Clocks](https://golang.org/pkg/time/#hdr-Monotonic_Clocks)
[2] [https://go.googlesource.com/proposal/+/master/design/12914-m...](https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md)

~~~
vlovich123
How does scheduling a timer to go off at 3pm this Saturday work?

~~~
coder543
That’s a hard question regardless of programming language. One commonly
referenced YouTube video about time explains why this is hard:
[https://youtu.be/-5wpm-gesOY](https://youtu.be/-5wpm-gesOY)

A reasonable first approximation of a solution would be to just check every
second (or minute, or hour, depending on requirements) whether the current
system time is later than the scheduled time for any pending events. Then you
probably want to make sure events are marked as completed so they don’t fire
again if the clock moves backwards.

Trying to predict how many seconds to sleep between now and 3pm on Saturday is
a difficult task, but you can probably use a time library to do that if it’s
important enough... but what happens when the government suddenly declares a
sudden change to the time zone offset between now and then? The predictive
solution would wake up at the wrong time.

~~~
vlovich123
No, you say "sleep until 3pm on Saturday"; you don't predict anything. The OS
computes an exact wakeup time when you arm the timer. If the clock then jumps
forwards or backwards, or does anything weird, the expiry for that timer is
recomputed. You can't do this yourself in app space, but AFAIK all OSes
provide a facility for it.

[https://developer.apple.com/documentation/dispatch/1420517-d...](https://developer.apple.com/documentation/dispatch/1420517-dispatch_walltime)
[http://man7.org/linux/man-pages/man2/clock_nanosleep.2.html](http://man7.org/linux/man-pages/man2/clock_nanosleep.2.html)

------
BeeOnRope
The thing about needing cpuid isn't true, except perhaps on some older AMD
hardware.

lfence works as an execution barrier and has an explicit cost of only a few
cycles. You can accurately time a region with something like:

    
    
        lfence
        rdtsc        // start timestamp
        lfence
        // timed region
        lfence
        rdtsc        // end timestamp
    

This will give you accurate timing with some offset (i.e. even with an empty
region you get a result on the order of 25-40 cycles), which you can mostly
subtract out.

Carefully done, you can get results down to a nanosecond or so.

rdtscp has few advantages over lfence + rdtsc, and arguably some disadvantages
(with explicit lfence instructions you control where the fence goes; rdtscp's
implied fence is fixed before the read).

~~~
jasonzemos
Specifically, the Intel manual makes the following important points, one
involving an `mfence;lfence` combo:

* If software requires RDTSC to be executed only after all previous instructions have executed and all previous loads are globally visible, it can execute LFENCE immediately before RDTSC.

* If software requires RDTSC to be executed only after all previous instructions have executed and all previous loads and stores are globally visible, it can execute the sequence MFENCE;LFENCE immediately before RDTSC.

* If software requires RDTSC to be executed prior to execution of any subsequent instruction (including any memory accesses), it can execute the sequence LFENCE immediately after RDTSC. This instruction was introduced by the Pentium processor.

rdtscp is usually a bit more disruptive, and cpuid is probably 100 to 1000
times more disruptive.

------
ggm
How does VM affect this?

How does KVM affect this?

How does Docker on KVM affect this?

How does Hypervisor affect this?

Add "... for a given network driver, e2e, measured RTT.."

~~~
pepemon
How can Docker affect this if it doesn't add any overhead?

~~~
ggm
Who said it doesn't add any overhead?

~~~
pepemon
Well, cgroups work like that. No overhead. Your systemd services are sliced
under cgroups. Where have you seen the overhead?

------
birdyrooster
Please put the year in the title (2015)

~~~
jstarks
The URL implies it's from 2014.

~~~
birdyrooster
Thank you for the correction.

------
snvzz
cyclictest, from rt-tests. That's the go-to.

~~~
rdtsc
Agree. I remember using that some years ago. You can draw plots with it as
well: [https://www.osadl.org/Create-a-latency-plot-from-cyclictest-...](https://www.osadl.org/Create-a-latency-plot-from-cyclictest-hi.bash-script-for-latency-plot.0.html)

------
angry_octet
If you want finer resolution or multi-machine measurements then look at PTP.
You need custom hardware but the improvements are significant.

