Unlike say, coreutils, ntp is something very far from being a solved problem and...

hi-v-rocknroll · 2024-06-25T06:57:32 1719298652

You might be doing too much work at the wrong level of abstraction. VMs should use host clock synchronization. It requires some work and coordination, but it eliminates the need for ntp in VMs entirely.

Hosts should then be synced using PTP or a proper NTP local stratum (just get a proper GNSS source for each DC if you have the funds).

https://tsn.readthedocs.io/timesync.html

Deploy chrony to bare metal servers wherever possible.

rnijveld · 2024-06-25T07:42:06 1719301326

Our project also includes a PTP implementation, statime (https://github.com/pendulum-project/statime/), that includes a Linux daemon. Our implementation should work as well as or even better than what linuxptp does, but it's still early days. One thing to note though is that NTP can be made to be just as precise (if not more precise), given the right access to hardware (unfortunately most hardware that does timestamping only does so for PTP packets). The reason for this precision is simple: NTP can use multiple sources of time, whereas PTP by design only uses a single source. This gives NTP more information about the current time and thus allows it to more precisely estimate what the current time is. The thing with relying purely on GNSS is that those signals can be (and are in practice) disrupted relatively easily. This is why time synchronization over the internet makes sense, even for large data centers. And doing secure time synchronization over the internet is only practically possible using NTP/NTS at this time. But there is no one-size-fits-all solution for time synchronization in general.
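To illustrate why several sources can beat a single one, here is a minimal sketch that combines independent offset measurements with inverse-variance weighting. This is a generic statistical technique, not the algorithm ntpd-rs or any other daemon actually uses; the `Measurement` type, the `combine` function, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    offset: float       # estimated clock offset in seconds (made-up sample)
    uncertainty: float  # one-sigma uncertainty of that estimate in seconds


def combine(measurements):
    """Inverse-variance weighted mean of independent offset estimates.

    Returns (combined_offset, combined_uncertainty). Assumes the
    measurements are independent and their uncertainties are honest,
    which real network measurements often are not.
    """
    weights = [1.0 / (m.uncertainty ** 2) for m in measurements]
    total = sum(weights)
    offset = sum(w * m.offset for w, m in zip(weights, measurements)) / total
    uncertainty = (1.0 / total) ** 0.5
    return offset, uncertainty


# Three hypothetical sources, each good to about 3 ms on its own.
sources = [
    Measurement(offset=0.010, uncertainty=0.003),
    Measurement(offset=0.012, uncertainty=0.003),
    Measurement(offset=0.011, uncertainty=0.003),
]
combined_offset, combined_uncertainty = combine(sources)
```

With equally good sources the combined uncertainty shrinks by roughly the square root of the number of sources, which is the sense in which more sources give more information. A real implementation would also need to detect and discard sources that disagree with the rest.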

ComputerGuru · 2024-06-25T19:16:50 1719343010

We've had issues relying on ESXi host/guest time synchronization, depending on the guest OS, and found this to be a better solution.

xorcist · 2024-06-25T10:17:55 1719310675

This makes sense. The clock is just another piece of hardware to be virtualized and shared among the guests.

But last time I said that with some pretense of authority, someone shoved a whitepaper from VMware at me that said the opposite. The stated best practice was to sync each guest individually with a completely virtual clock.

I'm not sure I agree, but at least I try to be open to the possibility that there are situations I had not considered. If anyone else knows more about this, please share.

yjftsjthsd-h · 2024-06-25T17:02:46 1719334966

> a whitepaper from VMware that said the opposite

Did it say why?

hcfman · 2024-06-27T18:04:09 1719511449

And by funds, that’s typically less than 100 euros with a Raspberry Pi.

syncsynchalt · 2024-06-24T21:58:39 1719266319

The biggest danger in NTP isn't memory safety (though good on this project for tackling it), it's

(a) the inherent risks in implementing a protocol based on trivially spoofable UDP that can be used to do amplification and reflection

and

(b) emergent resonant behavior from your implementation that will inadvertently DDOS critical infrastructure when all 100m installed copies of your daemon decide to send a packet to NIST in the same microsecond.

I'm happy to see more ntpd implementations but always a little worried.
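One common way to reduce the resonance risk in (b) is to randomize each client's polling schedule so that a large fleet does not fire at the same instant. A minimal sketch, assuming a hypothetical `next_poll_delay` helper and an arbitrary 10% jitter; this is not taken from any particular NTP implementation:

```python
import random


def next_poll_delay(base_interval, jitter_fraction=0.1, rng=random):
    """Return the base interval perturbed by up to +/- jitter_fraction.

    base_interval is in seconds. rng is anything with a uniform(a, b)
    method, so a seeded random.Random can be passed in for testing.
    """
    spread = base_interval * jitter_fraction
    return base_interval + rng.uniform(-spread, spread)


# Each client draws its own delay, so polls spread out over a window
# instead of landing in the same microsecond.
delays = [next_poll_delay(64.0) for _ in range(5)]
```

Jitter alone does not help if every client starts from the same wall-clock boundary (for example, a cron job at the top of the hour), so the initial poll should be randomized as well.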

rnijveld · 2024-06-25T07:15:42 1719299742

I agree that amplification and reflection definitely are worries, which is why we are working towards NTS becoming a default on the internet. NTS would prevent a server from responding to a spoofed packet and at the same time would make sure that NTP clients can finally start trusting their time instead of hoping that there are no malicious actors anywhere near them. You can read about it on our blog as well: https://tweedegolf.nl/en/blog/122/a-safe-internet-requires-s...

One thing to note about amplification: it has always been something that NTP developers have been especially sensitive to. I would say though that protocols like QUIC and DNS have far greater amplification risks. Our server implementation ensures that responses can never be bigger than the requests that initiated them, meaning that no amplification is possible at all. Even if we had allowed bigger responses, I cannot imagine NTP responses being much bigger than two or three times their related request. Meanwhile, I've seen numbers for DNS all the way up to 180 times the request payload.
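The "response never bigger than the request" rule can be sketched as a simple size check. This is an illustration only: the function names are hypothetical, and whether ntpd-rs drops or otherwise handles an oversized response is not stated here, so dropping is an assumption.

```python
def amplification_factor(request_len, response_len):
    """Ratio of bytes sent back to bytes received."""
    return response_len / request_len


def cap_response(request: bytes, response: bytes):
    """Return the response only if it is no larger than the request.

    Returns None (meaning: send nothing) otherwise. Dropping is an
    assumption made for this sketch.
    """
    if len(response) > len(request):
        return None
    return response


# A plain NTP packet without extension fields is 48 bytes in both
# directions, so the factor is 1 and the response passes the check.
request = bytes(48)
response = bytes(48)
factor = amplification_factor(len(request), len(response))
allowed = cap_response(request, response)
```

With this check in place the amplification factor is at most 1 by construction, so a spoofed source address gains the attacker no extra bandwidth.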

As for your worries: I think being a little cautious keeps you alert and can prevent mistakes, but I also feel that we've gone out of our way to not do anything crazy and hopefully we will be a net positive in the end. I hope you do give us a try and let us know if you find anything suspicious. If you have any feedback we'd love to hear it!

dfc · 2024-06-25T14:53:34 1719327214

> I cannot imagine NTP responses being much bigger than two or three times their related request.

I think you must be limiting your imagination to ntp requests related to setting the time. There are a lot of other commands in the protocol used for management and metrics. The `monlist` command was good for 200x amplification. https://blog.cloudflare.com/understanding-and-mitigating-ntp...

rnijveld · 2024-06-25T15:54:41 1719330881

Ah right! I always forget about that since we don’t implement the management protocol in ntpd-rs. I think it’s insane that stuff should go over the same socket as the normal time messages. Something I don’t ever see us implementing.

syncsynchalt · 2024-06-25T18:40:19 1719340819

Thank you for your considered response!

I hadn't heard about NTS and I'm rolling it out to my fleet of timeservers now.

timmytokyo · 2024-06-24T22:50:36 1719269436

I really wish more internet infrastructure would switch to using NTS. It addresses these kinds of issues.

jaas · 2024-06-24T23:02:05 1719270125

ntpd-rs supports NTS. I agree it would be great if more people used it!

1over137 · 2024-06-25T02:01:40 1719280900

Never heard of it. Shockingly little on wikipedia for example.

rnijveld · 2024-06-25T07:49:01 1719301741

I'm afraid this is a pretty common sentiment. NTS has been out for several years already and is supported by several implementations (including our ntpd-rs implementation, and others like chrony and ntpsec). Yet its usage is low and meanwhile the fully unsecured and easily spoofable NTP remains the default, in effect allowing anyone to manipulate your clock almost trivially (see our blog post about this: https://tweedegolf.nl/en/blog/121/hacking-time). Hopefully we can get NTS to the masses more quickly in the coming years and slowly start to decrease our dependency on unsigned NTP traffic, just as we did with unencrypted HTTP traffic.

codetrotter · 2024-06-25T05:13:21 1719292401

Yeah. Seems it doesn’t even have its own article there.

Only a short mention in the main article about NTP itself:

> Network Time Security (NTS) is a secure version of NTPv4 with TLS and AEAD. The main improvement over previous attempts is that a separate "key establishment" server handles the heavy asymmetric cryptography, which needs to be done only once. If the server goes down, previous users would still be able to fetch time without fear of MITM. NTS is currently supported by several time servers, including Cloudflare. It is supported by NTPSec and chrony.

westurner · 2024-06-25T07:23:26 1719300206

"RFC 8915: Network Time Security for the Network Time Protocol" (2020) https://www.rfc-editor.org/rfc/rfc8915.html

"NTS RFC Published: New Standard to Ensure Secure Time on the Internet" (2020) https://www.internetsociety.org/blog/2020/10/nts-rfc-publish... :

> NTS is basically two loosely coupled sub-protocols that together add security to NTP. NTS Key Exchange (NTS-KE) is based on TLS 1.3 and performs the initial authentication of the server and exchanges security tokens with the client. The NTP client then uses these tokens in NTP extension fields for authentication and integrity checking of the NTP protocol messages that exchange time information.

From "Simple Precision Time Protocol at Meta" https://news.ycombinator.com/item?id=39306209 :

> How does SPTP compare to CERN's WhiteRabbit, which is built on PTP [and NTP NTS]?

White Rabbit Project: https://en.wikipedia.org/wiki/White_Rabbit_Project

denton-scratch · 2024-06-25T09:13:36 1719306816

I hadn't heard of NTS until a Debian upgrade quietly installed ntpsec. It seems to now be the Debian default.

rlaager · 2024-06-25T14:41:38 1719326498

ntp has been replaced by ntpsec in Debian. (I am the Debian ntpsec package maintainer.) By default, NTPsec on Debian uses the NTP Pool, so no NTS. But NTPsec does support NTS if you are running your own server and supports it opt-in on the client side.

As far as I know, the Debian “default” is systemd-timesyncd. That is what you get out of the box. (Though, honestly, I automate most of my Linux installs, so I don’t interact with a stock install very often.) AFAIK, systemd-timesyncd does not support NTS at all.

Doing NTS on a pool would be quite complicated. The easy way is to share the same key across the pool. That is obviously not workable when pool servers are run by different people. The other way would be to have another out-of-band protocol where the pool NTP servers share their key with the centralized pool NTS-KE servers. Nobody has built that, and it’s non-trivial.

tialaramex · 2024-06-25T18:24:17 1719339857

Ah not quite, I think pooling would be rather easier than you've thought, there are Let's Encrypt people here, but let me explain what you'd do to have N unrelated machines which are all able to successfully claim they are some-shared-name.example

Each such machine mints (as often as it wants, but at least once) a document called a Certificate Signing Request. This is a signed (thus cannot be forged) document but it's public (so it needn't be confidential) and it basically says "Here's my public key, I claim I am some-shared-name.example, and I've signed this document with my private key so you can tell it was me who made it".

The centralized service collects these public documents for legitimate members of the pool and it asks a CA to issue certificates for them. The CA wants a CSR, that's literally what it asks for -- Let's Encrypt clients actually just make one for you automatically, they still need one. Then the certificates are likewise public documents and can be just provided to anybody who wants them (including the NTP pool servers they're actually for which can collect a current certificate periodically).

So you're only moving two public, signed documents, which isn't hard to get right. You should indeed probably do this out-of-band, but you aren't sharing the valuable private key anywhere. Sharing it is a terrible idea, and as well as being hard to do correctly, it's just unnecessary.

denton-scratch · 2024-06-25T19:31:35 1719343895

Thanks - "ntp has been replaced by ntpsec", but it's not the default. My mistake - my systems are systemd-free.

On my initial encounter with ntpsec, I found ntpsec running, but ntp was also installed. That's an interesting construction of "replace". This would be hard to replicate, because I don't know when ntpsec turned up; otherwise I'd try to make a bug report. If ntpsec was replacing ntp, I'd expect to find no ntp after the update.

rnijveld · 2024-06-25T06:20:15 1719296415

I would encourage you to take a look at some of our testing data and an explanation of our algorithm in our repository (https://github.com/pendulum-project/ntpd-rs/tree/main/docs/a...). I think we are very much within spitting distance of Chrony in terms of synchronization performance, sometimes even beating Chrony. But we’d love for more people to try our algorithm in their infrastructure and report back. The more data the better.

agwa · 2024-06-25T12:02:52 1719316972

What exactly does "time keeping abilities" mean? If I had to choose between 1) an NTP implementation with sub-millisecond accuracy that might allow a remote attacker to execute arbitrary code on my server and 2) an NTP implementation which may be ~100ms off but isn't going to get me pwned, I'm inclined to pick option 2. Is writing an NTP server that maintains ~100ms accuracy not a solved problem?