So is it or isn't it?
A few quotes from ESR, from here: http://esr.ibiblio.org/?p=6881. Interesting:
> tossing out as many superannuated features as I could
> full of port shims for big-iron Unixes from the Late Cretaceous
> I do have an advantage because I’m very bright and can hold more complex state in my head than most people [speaks to attitude :)]
> This differs dramatically from the traditional Unix policy of leaving all porting shims back to the year zero in place because you never know when somebody might want to build your code on some remnant dinosaur workstation or minicomputer from the 1980s.
> Yet another important thing to do on an expedition like this is to get permission – or give yourself permission, or fscking take permission – to remove obsolete features in order to reduce code volume, complexity, and attack surface.
> Then ntpdc was deprecated, but not removed – the NTP Classic team had developed a culture of never breaking backward compatibility with anything.
> I shot ntpdc through the head
I don't know the facts in the case at all, but I can imagine if you were the existing maintainer, you'd find this language and attitude incredibly difficult to take. And after all if you're not being paid to work on it, why should you swallow your pride?
> So is it or isn't it?
It's technically correct since they started from the reference implementation then "refactored" it.
They're skirting real close to claiming they are the reference NTP implementation but not quite doing that, just leaving the door open for misunderstandings.
(As an aside, have you ever tried to articulate unambiguously, with words, a complex technical topic, such as the design for a network protocol?)
On the one hand, NTP is defined by an IETF RFC (https://www.ietf.org/rfc/rfc5905.txt, version 4), using words, diagrams, and some code. The traditional RFC standards process requires two interoperating implementations for standards-track work. A reference implementation usually aims to be one of them, implementing the whole protocol, including features that are not needed in all use-cases. The reason for this is that it is easy to specify something that cannot be implemented, well or at all, and for the spec to be ambiguous so that different implementations do not work together.
The reference implementation also acts as a pseudo-formal-specification, in that it is more formal than the English language version and its behavior can be compared with what the spec writers intended.
(Personally I don't think that's the best way to do time synchronisation, but it is fully documented, measured, etc.)
The article and many comments here indicate that they believe NTPd is a "reference implementation". What does that mean other than "specifies behavior"?
If it instead means "Here's an example, it might do what the specification says or not" then it's not really a "reference" - the specification is.
1) It implements the whole thing, so it can be used as-is.
2) It serves as a reference for any questions that are unanswered by the spec. Yes, specs should ideally answer all questions themselves, but there has not been a spec written yet that does not have at least one unanswered question somewhere.
3) It serves as a way to test new implementations, because you can see if your new implementation is compatible with the reference implementation. If not, then you probably screwed something up somewhere.
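As a concrete illustration of point 3, here is a minimal sketch (Python, standard library only; the hostnames are placeholders, not an endorsement of any particular servers) of the crudest possible cross-check: send a bare SNTP client packet to a reference server and to your own implementation, and compare the clock offsets each one implies. A real interoperability test would exercise far more of the protocol; this only shows the shape of the idea.

    import socket, struct, time

    NTP_UNIX_DELTA = 2208988800  # seconds between the NTP epoch (1900) and the Unix epoch (1970)

    def sntp_offset(server, port=123, timeout=5.0):
        """Return a crude estimate of (server clock - local clock) in seconds."""
        packet = bytearray(48)
        packet[0] = 0x23                       # LI=0, VN=4, Mode=3 (client)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            t1 = time.time()
            sock.sendto(bytes(packet), (server, port))
            data, _ = sock.recvfrom(512)
            t4 = time.time()
        secs, frac = struct.unpack("!II", data[40:48])   # transmit timestamp field
        server_time = secs - NTP_UNIX_DELTA + frac / 2**32
        return server_time - (t1 + t4) / 2     # ignores network path asymmetry

    if __name__ == "__main__":
        # Hypothetical hosts: a reference-implementation server and the one under test.
        for host in ("pool.ntp.org", "my-new-ntpd.example.org"):
            try:
                print(f"{host}: offset ~ {sntp_offset(host):+.4f} s")
            except OSError as exc:
                print(f"{host}: query failed ({exc})")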
That it's the implementation that a) everyone uses, and b) actually supports the whole spec.
Unfortunately most people find it "exciting fun" to "just write code" and "boring work" to carefully consider exactly the behavior of their software in all conditions.
This is true but serves as an indictment of that software, not the concept of clear specification.
So I guess it's not the "reference implementation".
I don't know much about NTP, but taking their claims at face value: a "reduction in code of over 2/3 (from 227kLOC to 74kLOC)" is a valuable gain even if it means removing compatibility with some less-used systems, and "NTPsec was immune to over 50% of NTP Classic vulns BEFORE discovery in the last year" is a realistic consequence of a large reduction in code size/attack surface.
If half of people need the 'deleted' functionality (and it seems that the proportion is much smaller, but let's assume half) then obviously they should use NTP Classic, but the other half definitely should not.
It's not clear to me offhand whether the NTP Classic functionality that NTPsec deleted is the type that doesn't affect you if you don't use it, or if it's the type that can't be disabled.
Most of them, [Sons] said, "are older than my father.... [and] are not always up to date on the latest techniques and security issues." [...] Sons suggested that they "should be retired."
That is incredibly asinine and discriminatory to boot.
From the NTPsec project manager's comment:
The main point of contention that caused the fork was BitKeeper vs Git.
I can't believe that arguing over VC tooling is what caused the fork. Why not just compromise on this, use BitKeeper for long enough to get on the maintainer's good side, and helpfully offer to convert to Git later? This is how ESR got GNU Emacs development migrated from Bazaar to Git, and it seemed to have gone pretty well for all involved.
It's amazing* how many people here are willing to roast me over a third-hand account of my opinions, when I've already offered to answer questions directly.
* Not actually amazing, fairly typical of internet commentary, really.
(5:26) [O'Reilly interviewer] Mac Slocum: Related question on this: how can the Internet's infrastructure remain up to date and secure, particularly when it's distributed like this?
(5:33) Susan Sons: So the really terrifying thing about infrastructure software in particular is when you pay your ISP bill, that pays for all the cabling that runs to your home or business. That pays for the people that work at the ISP. That pays for their routing equipment and their power and their billing systems and their marketing and all of these wonderful things. It doesn't pay for the software that makes the Internet work. (5:54) That is maintained almost entirely by volunteers. And those volunteers are aging. [Um.] Most of them are older than my father. And [um,] we're not seeing a new cadre of people stepping up and taking over their projects, (6:10) so what we're seeing is ones and twos of volunteers who are hanging on and either burning out while trying to do this in addition to a full-time job, or are doing it instead of a full-time job, or should be retired, or are retired. [Um.] And it's just not giving the care it needs. (6:27) And in addition to this, these people aren't always up to date on the latest [um] techniques and security concerns of the day. And the next generation isn't coming up. I recently started a mentoring group called the #newguard that takes early and mid-career technologists and we cross-mentor and then we match them up with the old guard who are maintaining and who built this software to try to help solve that problem. But in the meantime there's still not enough funding going in this direction. And there's not enough churning happening. [Um.] And it's a really tough thing because there's a certain amount of what I call "functional arrogance" involved. [Um.] I don't have a certificate of "Susan is good enough to save the Internet" anywhere. I don't know who hands those out.
(7:08) Slocum: Sure.
[The software that makes the Internet work] is maintained almost entirely by volunteers, and those volunteers are aging. Most of them are older than my father, and we're not seeing a new cadre of people stepping up and taking over their projects, so what we're seeing is ones and twos of volunteers who are hanging on and either burning out while trying to do this in addition to a full-time job, or are doing it instead of a full-time job, or should be retired, or are retired, and it's just not getting the care it needs. And in addition to this, these people aren't always up to date on the latest techniques and security concerns of the day, and the next generation isn't coming up.
In context "should be retired" sounds awfully prescriptive, but I can see how that could mean something like "these volunteers want to retire but feel obligated to continue their maintenance duties".
Then at the 7:00 mark, you say:
It's a really tough thing because there's a certain amount of what I call functional arrogance involved... There's a certain point where you just have to say, "I'm going to decide that I'm in charge of this"...
I dunno. I can see where that's going to rub people the wrong way while at the same time seeing the value in having some moxie. I get the impression, though, that Stenn wasn't too happy with this approach.
3:25 It was death by a thousand cuts. "And I was seeing things that were not yet C99 compliant in 2015. The status of the code was over 16 years out of date in terms of C coding standards which means that you can't use modern tools for static analysis..."
4:30 "And in the mean time, security patches were being circulated secretly and then being leaked, and the leaked patches were being turned into exploits which we were seeing in the wild very quickly, when the security patches weren't being seen in the wild for a long time."
6:00 "...but it doesn't pay for the software that makes the Internet run. That's maintained almost entirely by volunteers, and the volunteers are ageing, most of them are older than my father. And we're not seeing new [?] people stepping up..."
They're mostly still on CVS these days, right? Regardless of one's personal feelings about esr, is there any good reason to stay on CVS in this, the second decade of the twenty-first century?
Why? Basically because they are the ones using it and it continues to work just fine for their needs.
Is there any good reason for them to migrate everything to a new VCS, be forced to learn new tools, etc., when what they are using now meets their needs just fine?
This is a perfect example of change for change's sake.
There were other reasons, but the discussion hasn't re-happened in the last 18-24 months.
And why do people fork a project when they want to fix things? I'm sure there were dozens of problems in getting started on fixing NTP, BitKeeper was just the first problem to deal with. What can you do when the current maintainer is so contrary that they have already lost access to free funding to modernize a dying code base? We don't need NTP backported to unix workstations that EOL'd over a decade ago, we need an NTP that doesn't cause more problems than it solves.
Eric S. Raymond, who seems to be behind the "young" fork, is 59.
Stenn said in a phone interview, "Then all of a sudden I heard they have this great plan to rescue NTP. I wasn't happy with their attitude and approach, because there's a difference between rescuing and offering assistance. [Their plan was] to rescue something, quote unquote, fix it up, and turn it over to a maintenance team."
Most of them, she said, "are older than my father.... [and] are not always up to date on the latest techniques and security issues." Many are burning out from trying to maintain critical code while working full time jobs, and Sons suggested that they "should be retired."
Wow. Keep away from the ICEI I suppose.
Several years ago, the project's inadequate funding became known in the media, and Stenn received partial funding from the Linux Foundation's Core Infrastructure Initiative, which was started after the discovery of how the minimal resources of the OpenSSL project left systems vulnerable to the Heartbleed bug.
The work done in NTPsec echoes this, in what seems to be a repeat of OpenSSL/LibreSSL with NTP/NTPsec.
Yes, forking is the "easy way out" in these circumstances, and it's a shame to see efforts split in such projects, but in reality it's often what's needed to get things moving in the right direction.
Both of those problems are easily attributed to a lack of funding. Fulfilling feature requests generates money, fixing bugs does not. When you can only afford to spend 10 hours a week on a project, you're going to end up cutting corners to work around those time limits.
By contrast, the temporary rescue team funded a project-manager-slash-information-security-officer, three developers, and a bright intern who did a great deal of documentation work. Granted, these were not all full-time positions, and I was able to lean on my parent organization for administrative support, so while paperwork was minimal I didn't have to fund it directly.
NTPSec's staff has varied over time, but at the time I stepped down to hand the ISO role off to my successor, they had the following funded positions: a project manager, an ISO, two developers, and a sysadmin-slash-developer. (Note: not all of these were full-time positions, and some were funded by third parties rather than by NTPSec directly.)
Open source infrastructure software projects, when well run, do not spend over half their resources on non-development work. It's just not responsible. It's how you lose donor confidence, and how you fail to maintain good software engineering practice even when you have the resources to do better.
This has been a running theme in open source failures in the last few years: "it will be better if we throw more money at it!". Sometimes this is true: there are developers out there simply burning out for lack of resources and splitting their attention between OSS work and making a living. However, often, there is mismanagement at work, or the project doesn't have a good enough talent pool to pull from to use funds effectively when they do get funds. I love when the problems are purely technical, because it's a clean fix and everyone thanks me and I walk away quietly. When the underlying problems are social, they tend to fester, because nobody really wants to be in the hot seat over the disagreements that happened.
If you take average wages, two part time administrative employees would cost you somewhere around $40,000 a year (2 * 12 * 30 * 52, rounded up). One software developer's value is conservatively $120,000 a year (likely more; $80,000 salary, $40,000 payroll taxes & benefits).
By those estimated numbers (I'm not in a position to look up the actual numbers right now, but they're probably available), that's only around 25% for administrative overhead, for a three person team. Not that bad for a foundation. And certainly nowhere near "half their resources" as you claim.
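For what it's worth, here is the back-of-envelope arithmetic behind those figures as I read them (my interpretation of the parenthetical formula; the rates and hours are the commenter's estimates, not actual NTF financials):

    # Rough check of the overhead estimate above. Assumed reading of "2 * 12 * 30 * 52":
    # 2 people * $12/hour * 30 hours/week * 52 weeks.
    admin = 2 * 12 * 30 * 52          # ~= $37,440, "rounded up" to $40,000
    developer = 80_000 + 40_000       # salary + payroll taxes and benefits
    print(f"admin share ~ {admin / (admin + developer):.0%}")   # ~= 24%, i.e. the quoted ~25%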
I'm glad that NTPSec appears to be relatively well funded, but knocking on NTF for being less well funded and having administrative work that they don't want to pawn off on an already overworked developer is pretty nasty business.
Since Eric and the rest of my team started working on the NTP code base in early 2015, we've eliminated over 50% of its vulnerabilities before they were disclosed simply by applying good software engineering practice where it hadn't been. In the year before my O'Reilly presentation, it was more like 80 or 85 percent. Everything we hadn't eliminated by disclosure or discovery time was fixed promptly.
There are other NTP protocol implementations besides NTP classic or NTPSec that are worth considering for some users. However, we felt that refactoring the reference implementation was necessary due to its use in many less-mainstream, but often highly-critical (in a life-critical or economically-critical or critical-to-scientific-research sense) applications. The non-NTP-related implementations don't always do what high speed trading houses need, or scientific installations built on aging but extremely precise equipment need, or controls system interfaces need, and on and on and on. We just didn't have a drop-in replacement available for all of the things that weren't web servers, workstations, and other commodity applications.
The "rift" article is now subscriber-only, so I can't respond there to its many inaccuracies (I was passed a PDF by someone who cached it, this is the only way I was able to read it). I was never contacted about it by the author, and I don't feel it was a fair treatment of the subject. That's okay. I learned a long time ago that fixing a mess will make some people thank you and some people angry with you. It wouldn't have become a mess by the time I found it if there weren't a cost to fixing it. People who fear controversy will have a hard time making a difference in the world.
I'm at work, but I'll do my best to answer any questions fired at me today on this thread. If there's something you want to know, ask!
Can you provide a breakdown of the vulnerabilities NTPsec HAS and HAS NOT been vulnerable to, along with their severity (low: degrades time service, medium: provides a practical vector for corrupting integrity of time service, high: compromises integrity of the server itself) and whether they're exposed (a) in the default configuration, (b) in a configuration run widely on the Internet, or (c) in no configuration actually known to the project maintainers?
You clearly have the list somewhere, because everyone involved in the project has this statistic ready to quote.
If you don't have the severity and exposure breakdowns, that's OK. Post the list anyways. Maybe it'll be obvious what the severity and exposure is.
This business of counting vulnerabilities and claiming victories has been a problem for software security for two decades now. Ops people don't care about the vulnerability count, if the vulnerabilities left exposed in the codebase are the ones that get their servers popped.
The severity varies (many weren't that big, some were)... the point of claiming the victory is to demonstrate that I'm not just having a fuss about testing code, using static analysis tools, using an accessible code repository, refactoring for lower attack surface and better separation of concerns because they are beautiful in abstract. I like results. NTPSec, and before it the temporary "rescue" team, have been slowly chipping away at the big picture mess, making the code safer and more maintainable, because it's likely to remain in service for another decade or two.
Every time 14 vulns are disclosed and we are already immune to half of them, we get to put twice the effort on the half we do need to deal with, if we even need that much. We aren't just firefighting; NTPSec can develop proactively. That means something for our users.
Lots of personnel overlap here... the main difference being pre- and post-fork and where the funding came from; probably not interesting to most people.
Can you provide that list of vulnerabilities now? You're obviously keeping track of them, that being part of the premise of the project. I know you don't have them broken down, but we can help with that.
One of the coolest after-effects of this whole thing was that, after the fork, when NTP classic began feeling the pressure of competition, their speed in addressing security vulnerabilities increased incredibly. While I was sorry that it didn't happen on its own, I was pleased and impressed to discover what Mr. Stenn was capable of once his competitive hackles were raised.
Many people experience hurt feelings during a fork, and a fork represents a frustrating duplication of effort that I'd usually rather avoid. However, forking is a central tenet of the open source ethos for a reason. Competition can do incredible things. <3
> Stenn denied many of Sons's statements outright. For example, asked about Sons's story about losing the root password, he dismissed it as "a complete fabrication."
Unless either you or Stenn is outright lying (neither of which seems likely, on priors), this seems like a strange misunderstanding to crop up. Do you know what's going on with this?
While the password problem made a good rhetorical flourish--it illustrated how the scaffolding supporting NTP development had been allowed to rot--the fact is that the server was in Mr. Stenn's control and he could have rebooted it to rescue media at any time, fixing the problem in a few minutes. Yet, the server was never properly brought up to good maintenance practice. I suspect that the majority of people reading this know how to reset a root password, so the password doesn't really matter that much in the grand scheme. The server was just another thing being neglected.
As I described in my O'Reilly talk, technical problems of this magnitude stem from social problems. The project didn't have a culture of sound engineering practice. I did what I could to work with Mr. Stenn to offer support and resources to bring that practice to his project. I didn't want to lose the years of institutional knowledge he'd acquired working on NTP. That's costly to replace. However, I wasn't going to forgo sound engineering practice to keep him on board: over time, smart people could learn the ins and outs of even the most tangled code base. The costs of bad engineering practice just keep coming, and I cannot force people to do the right thing, only lay out the costs and benefits then see what they choose.
That, and throw a little storytelling prowess at the problem now and again, in the hope of motivating people.
You can eliminate vulns and improve stability a lot of ways. Total rewrite is definitely not the best way. Even if you're the best programmer in the world, rewrites often run into old bugs as well as new ones, and require a lot of testing and a lot of repeated effort.
And I can't speak for the other infosec nerds, but for me, name-dropping ESR does the opposite of inspiring confidence in a security-focused project. I wouldn't trust him to secure my shoelaces.
If the old codebase was really bad, perhaps eventual rewrite would have been useful. But what would help existing users more is fixing the existing product so they can upgrade in place and be more secure, and not forcing them to go through a whole product migration cycle just for better security.
* Moving NTP development from a private Bitkeeper repo which requires all people accessing it (10 at most without private license purchase, given that Network Time Foundation has only 10 licenses) to agree to a restrictive license that may interfere with their other development work, to a public git repository which is accessible by the public as a whole. Stenn felt that tarball releases were sufficient, and did not agree that giving the public an opportunity to see code prior to release was important.
* Releasing patches to NTP vulnerabilities to everyone at the same time. NTP had a practice (for which Mr. Stenn never explained to me the reason) of releasing vulnerability patches to a closed group months or more ahead of the public release. These patches were typically leaked fairly rapidly and turned into exploits which were then used against NTP deployments in the wild.
There were other disagreements, but these were the big two technical disagreements upon which Stenn walked away. They were not points upon which I was willing to compromise, especially given that neither I nor other people in a position to help NTP could possibly have signed Bitkeeper licenses while maintaining our primary employment. This was a massive roadblock for increasing contribution to NTP, from us or anyone else.
If you look at the slides from my O'Reilly presentation here: http://slides.com/hedgemage/savingtime you will see that even when the rescue proceeded without Stenn, we did not do a major refactor! Slide #20 outlines the original rescue, which had 4 points:
* migration to git
* replacing the build system (when Stenn had been on board, we'd intended to repair the build system in place, but without the mystery scripts residing on his build box, we decided that a from-scratch replacement was more reliable and efficient than reverse-engineering and repairing)
* updating documentation so that new developers could be onboarded
* fixing what vulns we could given limited resources
That is it. Refactors came later when, after this "rescue" work, Mr. Stenn declined to use these work products and the NTPSec fork was born.
We did make every effort to avoid a fork, but in the end, I could only offer help, I could not force anyone to take it. Forking is, in the end, the OSS community's last protection from failing projects.
Honestly, both of those sound like very common issues which do not result in whole new product forks. Large projects maintain patchset and private security lists all the time. To me the ends don't justify the divergence.
- ESR is so productive and involved in so many things that this is actually a normal drama ratio
- ESR only involves himself in critical issues with high potential for drama
- ESR brings the drama
American blacks average a standard deviation lower in IQ than American whites at about 85. And it gets worse: the average IQ of African blacks is lower still, not far above what is considered the threshold of mental retardation in the U.S. And yes, it’s genetic; g seems to be about 85% heritable, and recent studies of effects like regression towards the mean suggest strongly that most of the heritability is DNA rather than nurturance effects.
ESR's famous melodramatic response to having his CML2 ("Eric's configuration markup language for kernel building") patch rejected by Linus is also quite telling about his self-perception, his perception of other people, and his interactions with anyone holding an opposing viewpoint.
I'm not sure how much more needs to be said beyond the fact that ESR has repeatedly claimed, including recently, that the Internet might fail if people don't continue to donate to his Patreon --- a Patreon which, he has previously stated, stands as direct evidence of his fame as a programmer, a fame rivaling that of Donald Knuth or Linus Torvalds.
I think the CML incident and the time where he yelled at some Debian developer about "our tribe" are better examples, because at least there he's going into other people's spaces and actually confronting them.
He's not an academic researcher. Don't you think it's odd for some old white guy to get worked up over minority test scores? Why write about this rather than basically anything else? Either he's a troll or profoundly unaware of how bringing this line of argument up may make people feel threatened. There are a lot of things that simply don't need to be discussed. It's like graphically describing your partner's episiotomy over dinner. Some things just inspire a visceral reaction.
Here's an unflattering guess why he might be interested in this: he has cerebral palsy. Despite that, he is still a fairly smart (even if crazy and/or racist) guy. He might be extremely interested in differences in human intelligence for that reason: to figure out why he still has above-average intelligence while most other people with his birth defect don't. That could explain a lot of his crazy beliefs and his narcissistic "I'm a super unix hacker" routine.
Why does he have to be an "academic researcher" to have opinions about race that he wants to talk about on his blog? Or for that matter, any topic? Do you have an anthropology degree to assert your opinion that his blog may make people feel threatened, or are you just another guy on the internet with an opinion?
Why not? If it's true, and you offered absolutely no refutation to his claim whatsoever, why shouldn't we discuss facts? Do they stop being true if we stop talking about them? Do their ramifications stop existing if we just refuse to accept them?
Er, why do we think there are no black people reading his blog?
Even then, that quote isn't an individual indictment of stupidity; there are many examples of him doing that, more aggressively, in other contexts.
Yes, I know I'm nobody but that doesn't mean I can't call bullshit when I smell it.
As far as I remember, his philosophical contributions primarily involve being in the right place at the right time, when people were searching for the reason that Linux was taking over the world, and having something to do with convincing Netscape to go open source.
Yes, but one could hardly call those great accomplishments that stood the test of time.
It was a puff piece based on the starry-eyed view of OSS development that was popular at the time, and it caught on (then) because it fit the overblown hype for community "bazaar" style software development prevalent in the late 90s.
Plus, the claimed benefits of the bazaar style in ESR's documents have been debunked time and again (as has the whole "Linux on the desktop" dream -- again, meant by those believing in it as Linux-based distros becoming a major desktop OS player overtaking MS, not in the "works for me and I've installed it for my parents too" sense).
One can debate whether these things are all valuable, and how any value they may have may fit into how to judge him as a person along with his flaws, but they are notable.
Which (OSI) is itself notable because...?
I have a much higher trust level for OpenBSD on security issues than I do with either of these projects.
(This is because neither supports leap seconds, so using them as servers will cause clients to desynchronize. Their belief that we don't need leap seconds is quite irrelevant: Right now we have them.)
RFC5905 (NTP 4) notes:
The goal of the NTP algorithms is to minimize both the time difference and frequency difference between UTC and the system clock. When these differences have been reduced below nominal tolerances, the system clock is said to be synchronized to UTC.
They are part of the time standard used by the entire world. BSD doesn't like them, fine, but pretending (in their code) that they don't exist is not the correct approach.
It works more or less fine for clients, but not for servers.
> a server would still be NTP conforming if it smeared the leap second
Not really. Consider the situation if a client is speaking to multiple servers, some normal servers, some buggy BSD servers.
Consider a client that manages to do a mixture of correctly applying a leap second, while also doing time smearing because an upstream source is doing time smearing.
BSD is simply doing the wrong thing here. (My understanding is that BSD time servers are not permitted in the ntp pool because of the problems they cause.)
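To make the smearing problem concrete, here is a minimal sketch of the leap-smear idea (not any particular project's algorithm; the epoch and the 24-hour window are made-up parameters). The point is that during the window a smearing server and a non-smearing one disagree by up to a full second, so a client mixing the two gets inconsistent answers:

    # Illustrative leap-smear: spread the one-second correction linearly over a window
    # centred on the (hypothetical) leap-second instant, instead of stepping at once.
    LEAP_EPOCH = 1483228800.0        # hypothetical leap instant, Unix seconds
    WINDOW = 24 * 3600.0             # assumed 24-hour smear window

    def smear_fraction(t):
        """Fraction of the 1-second leap correction a smearing server has applied at time t."""
        start = LEAP_EPOCH - WINDOW / 2
        return min(max((t - start) / WINDOW, 0.0), 1.0)

    for hours in (-13, -6, 0, 6, 13):
        t = LEAP_EPOCH + hours * 3600
        # A non-smearing server applies none of the correction until the leap, then all of it;
        # the difference between the two is what a mixed-server client has to reconcile.
        print(f"{hours:+3d} h from leap: smearing server has applied {smear_fraction(t):.3f} s")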
It seemed like perfectly decent software except for the fact that it failed to keep the time in sync.
I switched to the usual (ntp.org) ntpd, with the same pool servers, and it worked fine.
I do not claim that OpenNTPD has a bug. Maybe it has requirements on the quality of the local clock that the virtual machine didn't meet. Maybe it has requirements on the remote time sources or the network that the setup didn't meet.
What I do know is that over the years I've deployed ntp-over-the-Internet in a dozen or two different setups (including some user-mode-Linux ones with really poor local clocks), and ntp.org's ntpd has always managed to sync the clock.
The one time I gave OpenNTPD a go it didn't. Looking at the log, I was running it for several months. After a while I got tired of fixing it manually when the clock was a few seconds out.
At a previous job, we pointed our in-office Linux dev systems at our Windows domain controller and never had any issues with time synchronization. Since this was back in the embrace-and-extend days, I was pleasantly surprised by how interoperable MS was when it came to NTP.
That said, I didn't realize that the w32time service got a _lot_ better in Windows Server 2016:
It looks like Microsoft added support for NTPv4, and w32time now boasts 1-ms accuracy, which is at least a factor of 10 better than the previous Windows Server release.
Usually signals the start of much drama.
I wasn't surprised to learn that the project one year later was caught up evaluating Rust vs Go. That's great and all, but it's not saving the Internet from "meltdown".
Anyone who needs a modern ntpd and doesn't need the refclock stuff should probably just go with chrony.
There were several large reflection issues in ntpd in the past few years which led it to become a popular DDoS method. Reflection is the big winner lately.
No need for a botnet, just scan UDP services, find ones which reply with large packets or a series of packets and then spoof your IP (easy from many VPS and dedicated server providers) and flood out requests for those large packets.
You can get amplification of 100x as much traffic with only a small request and it's quite challenging to trace back to the original source due to the spoofed packets.
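Back-of-envelope version of that math (purely illustrative numbers, not measurements of any specific NTP deployment):

    # Why amplification matters: with ~100x gain, a modest spoofed-request stream
    # becomes a large flood at the victim.
    amplification = 100                  # the roughly 100x figure mentioned above
    victim_gbps = 10                     # traffic you want arriving at the spoofed victim
    attacker_gbps = victim_gbps / amplification
    print(f"attacker uplink needed: ~{attacker_gbps * 1000:.0f} Mbps of small spoofed requests")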
You definitely have to hunt around if this is something you want, and the number does seem smaller than I remember it being, so many more are probably implementing it since I last checked, but still not all.
Whoever comes to some big project sees only the small part of it that he immediately understands, and it appears to him that he can do it "much simpler." Yes he can, if he just does that small part. The problem is, the big projects actually do more. If you don't need the big project, you can use the small simple project too. Everybody has fun making something small. Maintaining something big -- that's the hard part, and as we see from the article we're commenting on, it's again what the "saviors" want to avoid:
"[Their plan was] to rescue something, quote unquote, fix it up, and turn it over to a maintenance team."
Of course, I haven't worked on the NTP code. But given the described circumstances, I'm willing to entertain the argument that older features now cost more to maintain than they are worth.
The problem is, none of these "liberators" actually improves NTP, and none can (given their stated goals) provide a "better" NTP, only a small subset, typically even less tested. And every one of them gets some funding and a lot of the attention, instead of the real maintainers of the darned NTP.
Doing big, widely present and very compatible and long-lived projects is hard. Very hard. And the maintainers should be helped, not blamed.
Does "very compatible" mean "wider compatibility than POSIX and C99"? If so, that's a very ambitious goal, but is it worth it in this age?
If somebody complains "the source is not up to the current security standards" that doesn't mean that anything has to be discarded to work towards that goal.
And if somebody wants to make the source compile only with the most recent compilers and only for a small subset of the previous platforms, he obviously has an agenda other than improving security or actually helping the project. It's just enforcing his taste on people who never asked for it.
From my personal experience: at my previous job I was able to "just use" NTP, and I know that on those platforms I still can't use any of the alternatives. I admit I surely have a different perspective than the average Linux user who doesn't really care what keeps his time in sync as long as it "looks right."
And reading this comment:
"in general all the other NTP implementations likewise lack broad support for all the various reference clock hardware and drivers. This is why ntpd is still used so heavily as stratum 1 servers."
And, additionally... are the servers running Windows a small and obscure base?
Aka not supporting Windows.
Those people are toxic, and removing them from your community makes everything better. If you can't do that, ensure that the only topics of discussion are reality based, and focussed on getting things done.
I'm not saying the people here meet those criteria. But it does answer your question.
Stenn just gets things done.
disclaimer: have known Stenn since the 90s, Raymond casually since the 80s.
Why don't any professional servers come with this kind of connection built-in? Why can't I synchronize my Xeon chip's clock to my own reference frequency?
(associated HN discussion - https://news.ycombinator.com/item?id=8066915)
Here's a choice quote from the NTP FAQ (http://www.ntp.org/ntpfaq/NTP-s-sw-clocks-quality.htm):
Unfortunately all the common clock hardware is not very accurate. This is simply because the frequency that makes time increase is never exactly right. Even an error of only 0.001% would make a clock be off by almost one second per day. This is also a reason why discussing clock problems uses very fine measures: One PPM (Part Per Million) is 0.0001% (1E-6).
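That "almost one second per day" figure is easy to verify; a quick worked version (the ppm values are just example inputs):

    # Clock drift accumulated per day for a given frequency error.
    SECONDS_PER_DAY = 86_400

    def drift_per_day(ppm):
        return SECONDS_PER_DAY * ppm * 1e-6

    print(drift_per_day(10))   # 0.001% = 10 ppm  -> 0.864 s/day, "almost one second per day"
    print(drift_per_day(1))    # 1 ppm            -> 0.0864 s/day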
And if you want to delve into the physics, you'll enjoy this tutorial by John Vig (U.S. Army Communications-Electronics Command):
Also, there's another unrelated problem which is "how accurately does a user process running in a linux kernel get the current time?". That's a problem no matter what kind of hardware the kernel is running on (although hardware might assist with that as well).
2. For applications where you need extremely accurate time, there already are other solutions like PTP. NTP is mainly useful for wide-area connections. I want to keep my laptop's time accurate, so that if I modify a file here and rsync it elsewhere, make doesn't get confused about the timestamps my laptop set. I'm not plugging a hardware time server into my laptop.
It absolutely would span racks. You must account for the additional speed-of-light delay the cable itself would add though.
Because large numbers of servers in a mass-scale deployment frequently have only two things: Ethernet connections and power. Introducing coaxial cable so you can feed a 10MHz reference signal (such as you feed a reference clock signal to certain RF Rx and Tx equipment) is a non starter. Introducing a serial cable to every server into an environment that is purely IP/Ethernet is a non starter.
You severely underestimate the cost of rolling out another cable type, one per server. In fact there are entire billion-dollar segments of the industry that specifically exist to remove said cables.
Anything more than power and a few ethernet cables connected to a rack full of 40+ servers gets real messy real fast, and starts to actually impact cooling and thus power usage in a very measurable way.
Even things like screws in server rails vs. tool-less impact the bottom line in most large scale server ops.
For the 3 total servers that actually need very precise timekeeping we effectively do this. For the tens of thousands of others, that's what IP and network protocols were made for.
I'm not sure what you're referring to here. The traditional 10BASE5 cable hasn't been used in literally decades. There's none of it deployed in modern datacenters.
AKA 0.375 inch diameter "frozen yellow garden hose".
So standard fiber optic patch cable is the new "traditional"?
and another example of duplex uniboot inside one cable jacket:
versus zipcord/duplex fiber which has a figure-8 shape, and the total jacket diameter is fatter in the range of 2.2 to 3mm:
Ignore the SC or LC connectors; the difference is the cable diameter itself. Figure-8 shaped duplex two-strand fiber cables are fatter and occupy more space than a single circular jacket that contains two strands and a single boot connector.
This matters for cable management when you're dealing with a switch that might have 4 x 48-port 10Gbe SFP+ linecards in it, all heavily populated. All of the cables coming off the switch in a neatly organized way and going vertically up some fiber cable management to a patch panel.
No, it's not and that's the problem. IP was meant for wide area internetworking, not local low-latency time distribution.
With enough engineering one could potentially include the time and frequency reference signals directly onto existing cabling - for example they could be put onto a fiber at a different wavelength from the ethernet traffic. But it would take quite a bit of money to convince a NIC manufacturer to do this.
Some NICs already come with 1pps reference: http://www.solarflare.com/ptp-adapters
One scenario which we certainly don't have today: access to a tick-accurate global clock across a large number of computers. It would be interesting to see how distributed scheduling and communications between computers would be coordinated if each computer had access to an internal clock timestamp that exactly matched all of its peers.
Then the RF and/or serial cabling from each server has to go somewhere, to some equipment (what, a new special 10 MHz reference 1U top of rack device that is a bespoke creation?), which both consumes rack space and electricity. And is managed over some sort of management network.
It's not a "bespoke creation", such things have been available for decades:
products like the one linked above are priced and intended for providing timing to crucial core network infrastructure. Such as if you're an ISP with a POP that has a 1+1 pair of Juniper MX960 at the main IX point in a major city. Where the ISP would then have its own set of NTP servers serving as internally-accessible-only stratum 2 to all of their other network equipment that is topologically nearby within the same AS.
It depends on what capabilities you want. A good oscillator with GPS discipline capabilities would cost between $100-$10000 depending on specifications. In particular, if you want it to have high stability with time and temperature when GPS fails you must spend much more money.
> and can feed reference signals to how many individual end-point servers?
In theory one of these units could feed a whole datacenter (thousands of endpoints), if you also purchased enough cabling and distribution amplifiers.
I'm not underestimating anything. If people want to have highly accurate time available at a hardware level, they can pay for it. I'm complaining that it isn't even an option from server vendors, who are quite capable of implementing it.
Everything else in my house uses NTP and, in some cases, PTP, however.
I can't help but wonder if there are other options, most notably C++. Don't get me wrong, I have a well developed loathing of C++ based on past experience, but it seems like a category fit and people I respect have said good things about the latest versions. Nim would also seem like a pretty good fit technically, though I could well understand if its relative immaturity and lack of developers (another recent story here on HN) cause it to be deemed inappropriate for this situation.
"Already, where once only Stenn was looking for support, now Raymond is in a somewhat similar position, as NTPsec has lost its Core Infrastructure Initiative funding as of September 2016."
So, a drive-by fork.
BTW, has anyone used any project that ESR is heavily involved in? The only things I know about are NTPsec, fetchmail and his attempt to hack the Linux kernel build system.