> When you check the elapsed time, you subtract the old time from the new time. But if the new time is less than the old time, you get an overflow. Because we save time as an unsigned integer, you get a very large number that will for sure be larger than any timeout value you set, including our default 3 minutes.
If they were using unsigned arithmetic then they would've gotten the expected value. For example, 0 - 0xffffffffffffffff is 1 when using 64-bit types, just as one might expect. Unsigned (modulo) arithmetic works exactly how you'd want in a situation like this; it works until the interval exceeds the range of the type. So either they're describing things wrong, or their code wasn't doing what they thought.
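A minimal sketch of the arithmetic (made-up values; assumes the timestamps are plain uint64_t):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* Pretend the counter wrapped: "old" is near the top of the range and
     * "new" has already wrapped past zero. */
    uint64_t old_ts = 0xfffffffffffffffeULL; /* 2^64 - 2 */
    uint64_t new_ts = 3;

    /* Unsigned subtraction is modulo 2^64, so the elapsed value is still
     * correct as long as the real interval fits in 64 bits. */
    uint64_t elapsed = new_ts - old_ts;

    printf("elapsed = %llu\n", (unsigned long long)elapsed); /* prints 5 */
    return 0;
}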
Of course, in any event this wouldn't be the root of their problem: time stamp counter drift across cores. The Linux kernel has a lot of code to decide whether to use the RDTSC-optimized clock_gettime vDSO implementation or fall back to a syscall. And when it does use the RDTSC method, IIRC it has code to train the time stamps to minimize any cross-core drift and mask out any potential difference (reducing granularity). The moral of the story is to just use clock_gettime+CLOCK_MONOTONIC; at least on Linux it'll use RDTSC in most cases where that's practical.
Sorry if something was explained awkwardly or incorrectly; I'm trying to get the hang of writing posts like this: technical posts that are interesting and fun to read.
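Roughly something like this is usually all you need (a sketch, not anyone's actual code; the 3-minute value just mirrors the default mentioned in the post):

#include <stdint.h>
#include <time.h>

/* Monotonic nanoseconds; on Linux this typically goes through the vDSO,
 * so it usually doesn't cost a full syscall. */
static uint64_t mono_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Has more than timeout_ns elapsed since start_ns? */
static int timed_out(uint64_t start_ns, uint64_t timeout_ns) {
    return mono_ns() - start_ns > timeout_ns;
}

int main(void) {
    uint64_t start = mono_ns();
    /* ... do the work being timed ... */
    if (timed_out(start, 180ULL * 1000000000ULL)) /* 3 minutes */
        return 1;
    return 0;
}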
> If they were using unsigned arithmetic then they would've gotten the expected value.
And we did, but we did not expect to get the input values we got, which are explained by your last paragraph: "time stamp counter drift across cores." This was the real problem, and something new we learned.
> use clock_gettime+CLOCK_MONOTONIC
I forgot to include this because I was focused on the solutions we ended up using, but after some research this is 100% correct! We were afraid that too many syscalls during query execution could produce a huge performance hit, so we decided to look for a different solution.
I didn't think about the vDSO, and that it's used for `clock_gettime`, so thank you for the information and correction!
It's really hard for me to get a feel for what is okay to use and what the standard way to do something is without actually implementing everything and seeing how it behaves, so information like this helps me a lot.
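To quantify the overhead concern for ourselves, one option is to just time a batch of calls; a rough sketch (numbers will obviously vary by machine and clocksource):

#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t mono_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

int main(void) {
    const int iters = 10000000;
    uint64_t sink = 0;
    uint64_t start = mono_ns();
    for (int i = 0; i < iters; i++)
        sink += mono_ns();          /* keep the calls from being optimized away */
    uint64_t total = mono_ns() - start;
    printf("avg %.1f ns per clock_gettime (sink=%llu)\n",
           (double)total / iters, (unsigned long long)sink);
    return 0;
}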
It turns out lots of systems are subtly broken and need workarounds. This is where it's handy to have a decent standard library instead of learning all of this the hard way.
__rdtsc drift isn't an AMD bug; it's by design. The only safe way to use the rdtsc instruction is to pin code to the same physical core between calls. They probably just got lucky in that the drift always exceeded the time that elapsed between calls on other CPUs, or the Linux scheduler's behavior on Ryzen 7 differs from other CPUs (I'm not familiar with Ryzen 7, but Threadripper is basically non-NUMA and needed scheduler patches under Windows to fix some pathological conditions).
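To make "pin code to the same physical core" concrete, here's a rough sketch (Linux-specific, assumes x86 and a compiler that provides __rdtsc via x86intrin.h; CPU 0 is an arbitrary choice):

#define _GNU_SOURCE
#include <sched.h>
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

int main(void) {
    /* Pin this thread to CPU 0 so both __rdtsc() readings come from the same
     * core's TSC; without this, comparing the two values can be meaningless
     * on parts whose TSCs drift apart. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_setaffinity");
        return 1;
    }

    uint64_t t0 = __rdtsc();
    /* ... the work being measured ... */
    uint64_t t1 = __rdtsc();
    printf("delta = %llu cycles\n", (unsigned long long)(t1 - t0));
    return 0;
}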
It seems more like a QoI (quality of implementation) issue. Most Intel chips maintain synchronized (same rate of change) TSCs. That knowledge is encoded in the Linux kernel:
/*
 * Intel systems are normally all synchronized.
 * Exceptions must mark TSC as unstable:
 */
if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) {
        /* assume multi socket systems are not synchronized: */
        if (num_possible_cpus() > 1)
                return 1;
}
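If you want to see what the kernel actually decided on a given box, reading the selected clocksource is a quick check (a sketch; the sysfs path below is the standard Linux one):

#include <stdio.h>

int main(void) {
    /* The kernel exposes the clocksource it actually selected here; if it
     * decided the TSC is unstable this will usually say "hpet" or "acpi_pm"
     * instead of "tsc". */
    FILE *f = fopen("/sys/devices/system/clocksource/clocksource0/current_clocksource", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    char buf[64];
    if (fgets(buf, sizeof buf, f))
        printf("current clocksource: %s", buf);
    fclose(f);
    return 0;
}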
> They probably just got lucky in that the drift always exceeded the time that elapsed between calls on other CPUs
And this is also true. :D
But I would say we were lucky to notice this problem at all because it really doesn't happen on that many CPUs, at least on the ones we tried out.