Part 1 - The Problem with NTP: https://web.archive.org/web/20210627035347/https://libertysy...
Part 2 - How NTP Works:
Part 3 - Installation and Configuration:
Part 4 - Monitoring and Troubleshooting:
Part 5 - Myths, Misconceptions, and Best Practices:
A large number of them were out-of-date and at their end-of-life. HP was charging a super premium for keeping them in support beyond their normal end-of-life period... some reseller pointed this out as a justification for why it would be cheaper to replace them than to keep them in support. It backfired: QLD police just took them out of support without replacing the hardware. State-level critical infrastructure running on obsolete equipment with no vendor support...
I inherited an "enterprise environment" that was meant to talk to on-prem NTP services over a VPN, but that had broken over time. Cybersec had closed the route without notice, and the environment eventually drifted out of sync and was completely unable to get updates. It hadn't had any updates for 3 years. Other elements of the VPN, used between the two big agencies it supported, could still talk to parts of both networks. That system was classified as sensitive. Also, the firewall hadn't had a definitions review in 4 years, and a .NET Core alpha release was in use.
Fortunately I was able to nuke the whole thing because of the low number of users.
Most of the time none of that matters and you can just install chrony and point it to whatever.pool.ntp.org and you’re off to the races. But boy does it suck when you have to know.
This is not wrong, but it's missing a large chunk of information. All non-UTC-based navigation satellites also broadcast both the offset between their internal time and UTC and whether there's an impending leap second.
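To make the offset concrete: GPS time does not apply leap seconds, so a receiver subtracts the broadcast GPS-UTC offset to get UTC. A minimal sketch, assuming an offset of 18 seconds (the value in effect since the start of 2017); the function name and the sample reading are made up for illustration, and a real receiver takes the offset from the navigation message instead of hard-coding it.

```python
from datetime import datetime, timedelta, timezone

# Assumption: GPS-UTC offset of 18 s, in effect since 2017-01-01.
# A real receiver reads this value from the broadcast navigation message.
GPS_UTC_OFFSET_S = 18

def gps_to_utc(gps_time: datetime, offset_s: int = GPS_UTC_OFFSET_S) -> datetime:
    """Convert a naive GPS-timescale timestamp to an aware UTC timestamp."""
    utc_naive = gps_time - timedelta(seconds=offset_s)
    return utc_naive.replace(tzinfo=timezone.utc)

# Hypothetical reading on the GPS timescale (no leap seconds applied).
gps_reading = datetime(2021, 6, 27, 3, 54, 5)
print(gps_to_utc(gps_reading).isoformat())
```

When a leap second is announced, the broadcast offset changes at the scheduled instant, which is why the impending-leap-second flag matters to anything deriving UTC from the satellite timescale.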
Of course on Linux most of the arcane details of the ntp daemon aren't relevant because most distros end up running systemd with timesyncd instead. I discovered this when all of my T1 time sources (GPS receivers) stopped working after an update. As usual you can disable the systemd bit, but it doesn't like it.
which tends to be "good enough" for the average desktop user, but anything even vaguely server-ish should run a full NTP implementation such as chrony and not an SNTP one such as systemd-timesyncd.
a few years ago at $dayjob we had a fleet of CoreOS hosts. CoreOS, at the time, defaulted to systemd-timesyncd using pool.ntp.org addresses.
our CoreOS hosts, obviously, ran Docker containers.
systemd has a neat "feature" where if your network configuration changes, it'll trigger a time synchronization through timesyncd.
when a new Docker container was started, this counted as a "network config change" and caused a time synchronization.
by itself, this isn't too bad. it caused time syncs to happen more often than they need to, strictly speaking, but shouldn't have caused any further problems.
except...enter "falsetickers". hosts in the NTP pool are run by volunteers. an individual host in the pool may have the incorrect time.
the infrastructure for the NTP pool has monitoring for this, and will kick a host out of the DNS rotation if it's wrong. except this won't happen immediately - there'll always be some lag between when the host starts being wrong and when the monitoring system kicks it out.
and if your hosts are synchronizing their time more often than necessary, it increases the chance they'll do a time sync in one of these small windows where a falseticker is being advertised by the pool.
a full NTP implementation is specifically designed to handle this, of course. a client polls multiple servers, and will discard significant outliers.
SNTP? not so much. I haven't looked at timesyncd to see if it's improved since then, but at the time it would pick one of the [0-3].pool.ntp.org hosts at random, send it one NTP packet, and then jump the time to that response.
...and that's the story of how some of my company's production hosts would have their system time autonomously jump to be 5-10 minutes fast, maintain that time for several minutes to an hour, and then jump back to the correct time, all without human intervention.
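For illustration, here is roughly what "poll multiple servers and discard significant outliers" buys you. This is a deliberately simplified median filter, not the actual selection algorithm used by ntpd or chrony; the offsets, the tolerance, and the function name are all hypothetical.

```python
from statistics import median

def reject_outliers(offsets, tolerance_s=0.5):
    """Split {host: offset_seconds} into (kept, rejected) around the median offset."""
    mid = median(offsets.values())
    kept = {h: o for h, o in offsets.items() if abs(o - mid) <= tolerance_s}
    rejected = {h: o for h, o in offsets.items() if h not in kept}
    return kept, rejected

# Hypothetical measured offsets in seconds (server clock minus local clock).
samples = {
    "0.pool.ntp.org": 0.012,
    "1.pool.ntp.org": -0.004,
    "2.pool.ntp.org": 412.7,  # falseticker, roughly 7 minutes fast
    "3.pool.ntp.org": 0.009,
}

kept, rejected = reject_outliers(samples)
correction = median(kept.values())  # what we would steer the clock by
print("rejected:", sorted(rejected))
print("correction:", correction)
```

A client that asks only one of these four servers and trusts the answer has a one-in-four chance of jumping by about 7 minutes; with all four samples the bad one is easy to spot.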
Then at least if I’m off we’re all off together.
Generally if you've made the effort to have internal recursive DNS server(s) for your network, then just enable NTPd or chrony as well and have a single source of Time Truth for your network.
Point to ≥4 NTP servers, even using pool.ntp.org, and you probably don't have to worry about false ticker(s) either.
How big is that unmeasured error?
True in XP (it was a crappy SNTP implementation), but it was overhauled significantly in Windows 10/Server 2016 and above because of Azure requirements. It can now guarantee accuracy within 1 second at all times, and even better when the NTP server is local (https://docs.microsoft.com/en-us/windows-server/networking/w...)
Knowing how accurate the timers within Windows are, this is actually not "rubbish" as you say, at least relative to Windows. Windows is designed for general computing, not for superprecise timings. Use Linux for that use case rather than just shoveling NTPD onto Windows (spoiler: NTPD uses the Media timers in Windows, which are definitely not designed for that use case, and there are too many applications that break if they are forced to run with the high-precision timers).
P.S. Linux can hold precise timings, but there are certain configurations that will break this assumption. Double check if this is important to you.
If you really need high precision time synchronization, for example when triangulating signals on different machines, you should look at ptpd (https://github.com/ptpd/ptpd).
I probably won't read (m)any of the comments below, but if I had to pick the "do not miss" parts, they would be https://d38if4m2in2lkc.cloudfront.net/2016/10/the-school-for... (It really isn't a typical consensus algorithm.) and https://d38if4m2in2lkc.cloudfront.net/2016/12/the-school-for... (Them: "1 NTP peer is better than 2"; me: "Don't make me come down there"), but really, you should just go read https://tools.ietf.org/html/rfc8633