Unlike say, coreutils, ntp is something very far from being a solved problem and the memory safety of the solution is unfortunately going to play second fiddle to its efficacy.
For example, we only use chrony because it’s so much better than whatever came with your system (especially on virtual machines). ntpd-rs would have to come at least within spitting distance of chrony’s time keeping abilities to even be up for consideration.
(And I say this as a massive rust aficionado using it for both work and pleasure.)
You might be doing too much work at the wrong level of abstraction. VMs should use host clock synchronization. It requires some work and coordination, but it eliminates the need for ntp in VMs entirely.
Hosts should then be synced using PTP or a proper NTP local stratum (just get a proper GNSS source for each DC if you have then funds).
Our project also includes a PTP implementation, statime (https://github.com/pendulum-project/statime/), that includes a Linux daemon. Our implementation should work as well or even better than what linuxptp does, but it's still early days. One thing to note though is that NTP can be made to be just as precise (if not more precise), given the right access to hardware (unfortunately most hardware that does timestamping only does so for PTP packets). The reason for this precision is simple: NTP can use multiple sources of time, whereas PTP by design only uses a single source. This gives NTP more information about the current time and thus allows it to more precisely estimate what the current time is. The thing with relying purely on GNSS is that those signals can be (and are in practice) disrupted relatively easily. This is why time synchronization over the internet makes sense, even for large data centers. And doing secure time synchronization over the internet is only practically possible using NTP/NTS at this time. But there is no one size fits all solution for time synchonization in general.
This makes sense. The clock is just another piece of hardware to be virtualized and shared among the guests.
But last time I said that with some pretense of authority, someone shoved me a whitepaper from VMware that said the opposite. Best practice was stated be to sync each guest individually with a completely virtual clock.
I'm not sure I agree, but at least I try to be open to be possibility that there are situations I had not considered. If anyone else knows more about this, please share.
The biggest danger in NTP isn't memory safety (though good on this project for tackling it), it's
(a) the inherent risks in implementing a protocol based on trivially spoofable UDP that can be used to do amplification and reflection
and
(b) emergent resonant behavior from your implementation that will inadvertently DDOS critical infrastructure when all 100m installed copies of your daemon decide to send a packet to NIST in the same microsecond.
I'm happy to see more ntpd implementations but always a little worried.
I agree that amplification and reflection definitely are worries, which is why we are working towards NTS becoming a default on the internet. NTS would prevent responses by a server from a spoofed packet and at the same time would make sure that NTP clients can finally start trusting their time instead of hoping that there are no malicious actors anywhere near them. You can read about it on our blog as well: https://tweedegolf.nl/en/blog/122/a-safe-internet-requires-s...
One thing to note about amplification: amplification has always been something that NTP developers have been especially sensitive to. I would say though that protocols like QUIC and DNS have far greater amplification risks. Meanwhile, our server implementation forces that responses can never be bigger than the requests that initiated them, meaning that no amplification is possible at all. Even if we would have allowed bigger responses, I cannot imagine NTP responses being much bigger than two or three times their related request. Meanwhile I've seen numbers for DNS all the way up to 180 times the request payload.
As for your worries: I think being a little cautious keeps you alert and can prevent mistakes, but I also feel that we've gone out of our way to not do anything crazy and hopefully we will be a net positive in the end. I hope you do give us a try and let us know if you find anything suspicious. If you have any feedback we'd love to hear it!
> I cannot imagine NTP responses being much bigger than two or three times their related request.
I think you must be limiting your imagination to ntp requests related to setting the time. There are a lot of other commands in the protocol used for management and metrics. The `monlist` command was good for 200x amplification.
https://blog.cloudflare.com/understanding-and-mitigating-ntp...
Ah right! I always forget about that since we don’t implement the management protocol in ntpd-rs. I think it’s insane that stuff should go over the same socket as the normal time messages. Something I don’t ever see us implementing.
I'm afraid this is a pretty common sentiment. NTS has been out for several years already and is implemented in several implementations (including our ntpd-rs implementation, and others like chrony and ntpsec). Yet its usage is low and meanwhile the fully unsecured and easily spoofable NTP remains the default, in effect allowing anyone to manipulate your clock almost trivially (see our blog post about this: https://tweedegolf.nl/en/blog/121/hacking-time). Hopefully we can get NTS to the masses more quickly in the coming years and slowly start to decrease our dependency on unsigned NTP traffic, just as we did with unencrypted HTTP traffic.
Yeah. Seems it doesn’t even have its own article there.
Only a short mention in the main article about NTP itself:
> Network Time Security (NTS) is a secure version of NTPv4 with TLS and AEAD. The main improvement over previous attempts is that a separate "key establishment" server handles the heavy asymmetric cryptography, which needs to be done only once. If the server goes down, previous users would still be able to fetch time without fear of MITM. NTS is currently supported by several time servers, including Cloudflare. It is supported by NTPSec and chrony.
> NTS is basically two loosely coupled sub-protocols that together add security to NTP. NTS Key Exchange (NTS-KE) is based on TLS 1.3 and performs the initial authentication of the server and exchanges security tokens with the client. The NTP client then uses these tokens in NTP extension fields for authentication and integrity checking of the NTP protocol messages that exchange time information.
ntp has been replaced by ntpsec in Debian. (I am the Debian ntpsec package maintainer.) By default, NTPsec on Debian uses the NTP Pool, so no NTS. But NTPsec does support NTS if you are running your own server and supports it opt-in on the client side.
As far as I know, the Debian “default” is systemd-timesyncd. That is what you get out of the box. (Though, honestly, I automate most of my Linux installs, so I don’t interact with a stock install very often.) AFAIK, systemd-timesyncd does not support NTS at all.
Doing NTS on a pool would be quite complicated. The easy way is to share the same key across the pool. That is obviously not workable when pool servers are run by different people. The other way would be to have an another out-of-band protocol where the pool NTP servers share their key with the centralized pool NTS-KE servers. Nobody has built that, and it’s non-trivial.
Ah not quite, I think pooling would be rather easier than you've thought, there are Let's Encrypt people here, but let me explain what you'd do to have N unrelated machines which are all able to successfully claim they are some-shared-name.example
Each such machine mints (as often as it wants, but at least once) a document called a Certificate Signing Request. This is a signed (thus cannot be forged) document but it's public (so it needn't be confidential) and it basically says "Here's my public key, I claim I am some-shared-name.example, and I've signed this document with my private key so you can tell it was me who made it".
The centralized service collects these public documents for legitimate members of the pool and it asks a CA to issue certificates for them. The CA wants a CSR, that's literally what it asks for -- Let's Encrypt clients actually just make one for you automatically, they still need one. Then the certificates are likewise public documents and can be just provided to anybody who wants them (including the NTP pool servers they're actually for which can collect a current certificate periodically).
So you're only moving two public, signed, documents, which isn't hard to get right, you should indeed probably do this out-of-band but you aren't sharing the valuable private key anywhere, that's a terrible idea as well as being hard to do correctly it's just unnecessary.
Thanks - "ntp has been replaced ntpsec", but it's not the default. My mistake - my systems are systemd-free.
On my initial encounter with ntpsec, I found ntpsec running, but ntp was also installed. That's an interesting construction of "replace". This would be hard to replicate, because I don't know when ntpsec turned up; otherwise I'd try to make a bug report. If ntpsec was replacing ntp, I'd expect to find no ntp after the update.
I would encourage you to take a look at some of our testing data and an explanation of our algorithm in our repository (https://github.com/pendulum-project/ntpd-rs/tree/main/docs/a...). I think we are very much in spitting distance of Chrony in terms of synchronization performance, sometimes even beating Chrony. But we’d love for more people to try our algorithm in their infrastructure and report back. The more data the better.
What exactly does "time keeping abilities" mean? If I had to choose between 1) an NTP implementation with sub-millisecond accuracy that might allow a remote attacker to execute arbitrary code on my server and 2) an NTP implementation which may be ~100ms off but isn't going to get me pwned, I'm inclined to pick option 2. Is writing an NTP server that maintains ~100ms accuracy not a solved problem?
For example, we only use chrony because it’s so much better than whatever came with your system (especially on virtual machines). ntpd-rs would have to come at least within spitting distance of chrony’s time keeping abilities to even be up for consideration.
(And I say this as a massive rust aficionado using it for both work and pleasure.)