For less demanding users, a Raspberry Pi and a cheap GPS module can do the trick[1]. I get less than +/-300us in error according to chronyc on the clients.
Note that if you get one of these GPS modules, some do not expose the PPS pin on the headers, so it might require some board modification (they use it just to drive an LED). I got one like this[2] that has it exposed.
Also note that the small antennas only work outside; the slightly larger square ones work inside, but only very near a window. I got an active antenna[3] as I wanted the receiver further inside the room.
And finally, the NEO-6M module I linked to is quite old; the newer NEO-7M and NEO-8M lock on faster, etc., but for me this was sufficient.
Oh and I had to disable serial echo[4], almost forgot about that.
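For anyone replicating this, a minimal chrony configuration for such a setup might look like the following. This is a sketch: it assumes gpsd is publishing NMEA time on shared-memory segment 0 and that the kernel PPS device is /dev/pps0; the offset compensating for NMEA serial delay is something you calibrate for your own module.

```
# /etc/chrony/chrony.conf (excerpt)
# Coarse time-of-day from the NMEA sentences via gpsd's SHM segment;
# the offset compensates for serial/parsing delay (calibrate this).
refclock SHM 0 refid NMEA offset 0.2 delay 0.5
# Precise second edge from the PPS pin, locked to the NMEA source
# for numbering the whole seconds.
refclock PPS /dev/pps0 lock NMEA refid PPS
```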
That reminded me of Mitxela's GPS clock[0]. Using GPS modules to get an incredibly accurate time is really interesting, I need to find some time to make one of these.
There exists an open source NTP server project by Netnod in Sweden.
The servers have been up and running, providing NTP (including NTP AUTH) services since about 2015. The FPGA-based platform has also been providing experimental NTS services since about last summer.
> Microprocessor based Network Time Protocol (NTP) servers suffer from a large amount of timestamp jitter, due to the hardware and Operating System (OS) being shared among other applications.
So instead of configuring Linux correctly and dedicating a core to your NTP server, you have created custom hardware. Congratulations, now you have two problems.
I have been working on algorithmic trading and it is not that hard to reliably (like 100% of the time) respond within a couple of microseconds.
You will never notice it, though, because your network will introduce way more variability. This is especially the case with NTP, where usually a single box is shared between a large number of servers throughout your network.
> I have been working on algorithmic trading and it is not that hard to reliably (like 100% of the time) respond within a couple of microseconds.
As someone who works in the flight test instrumentation industry (that primarily uses PTP rather than NTP), my first thought was "is a couple of microseconds supposed to be good?". With PTP, I usually achieve RMS offsets of < 50 nanoseconds over gigabit Ethernet.
I sort of agree with your point in that it's probably pointless implementing NTP inside of an FPGA, but I'd like to extend it further and say that if you really care about accurate timing, you should be using PTP rather than NTP anyway. In that case, an FPGA solution makes much more sense. However, in the systems I've worked with, the FPGA only implements a few pieces of core functionality - like the frame time stamping and the numerically-frequency-controlled clock. The actual protocol is usually implemented in software, except in the case of transparent switches.
That said, most modern NICs have PTP hardware, but you still usually need some external logic to actually use these synchronized clocks.
On a "normal" box, the time is usually a combination of various sources, none of which was really designed to provide accurate, sub-microsecond absolute time. (Except possibly for this "PTP hardware" in NICs that I don't know anything about.)
So even if you somehow devise a piece of hardware to pass the accurate time to the operating system, there would be no way to track it accurately.
Now, I think that really accurate absolute timing on machines isn't all that critical. For example, for consistency protocols it is usually enough to guarantee being within 1s of true time.
For algorithmic trading we were mostly interested in responding as quickly as possible to the event that came from the stock exchange. The packets always came with a known delay and the timing wasn't all that important except for debugging. Everything was done on a single box (so no need to synchronize with anything else) and even on that box the events were processed such that there was no need to coordinate with other threads, each thread just consumed, processed and published results.
It's just about inserting the exact time the frame will be sent into a sent frame, and noting the exact time the frame is received. All that's left is a small amount of clock jitter and cable delays.
High end switches support PTP themselves so you don't need to worry about queueing delays, but you don't need a switch that supports PTP to get high resolution, low jitter, low uncertainty time-- at least on reasonably sized networks that are usually not saturated.
There are many algorithmic trading applications that rely upon high quality time derived from a single coherent source, along with telecommunications, control systems, instrumentation, etc. And there are other distributed systems approaches where true ordering is nice: yes, you can provide ordering within a system of dependent events using a Lamport clock, vector clock, etc, but with high quality time you can also correctly sequence causally related events originating outside your tightly coupled system in many use cases.
That said: PTP relies on a tightly coupled master and realistically is intended for a tightly coupled, hierarchical system. NTP is a better "internet" time protocol -- lots of logic for clock precedence, averaging of multiple sources, slower control system, etc.
It could be made to work fairly well with hardware / driver support. Timing information of frames is known rather precisely, and delay spread is <100ns in most environments.
Heh, as soon as I read this I thought "algo trader?". If you don't mind me asking: outside of trading, how often do you see systems seriously manage jitter (specifically I'm thinking about interrupt management)? I'm in HPC, and whilst some such management is present, it seems seldom thought about.
Heh, I thought somebody would point this out.
See, they decided to create custom hardware just to do NTP. How is instead dedicating a single core a worse solution? Setting aside a dedicated core is basically a configuration detail.
There are no gains from custom hardware as network variability would mask them. If you want really good time guarantee because having right time is critical for your application, you put atomic clocks in your servers as one well known company does.
Also, it is customary to have dedicated machines for various functions. You would not normally want to mix some different types of loads that conflict with each other, for example high security with low security, high throughput with real time, etc.
You could just configure a single machine dedicated to NTP (which is customary), but instead of sharing cores between the OS and NTP (and introducing jitter), you can have a separate core for the OS and a separate one for NTP, so that NTP can work undisturbed.
So if you maintain an NTP instance, have a problem with jitter, and read this article, my hint is: don't set up a custom FPGA to fix your problem. Just dedicate a single core on a machine you already have and you will get results about as good as you can get.
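To make the "dedicate a core" suggestion concrete, here is a sketch of what that configuration could look like on Linux. The core number (3) and the chronyd service name are illustrative; adapt them to your machine and daemon.

```
# Kernel command line: keep core 3 away from the general scheduler,
# disable its periodic tick, and move RCU callbacks off it.
isolcpus=3 nohz_full=3 rcu_nocbs=3

# Pin the NTP daemon to the isolated core via a systemd drop-in, e.g.
# /etc/systemd/system/chronyd.service.d/affinity.conf
[Service]
CPUAffinity=3
```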
> How is instead dedicating a single core a worse solution?
It is pretty much guaranteed to perform worse, if that's what you mean. Is it worth the effort and extra complexity? Probably not.
The timestamping jitter with an FPGA would probably be on the order of tens of nanoseconds while we're talking microseconds with a software solution.
> There are no gains from custom hardware as network variability would mask them.
I _think_ multiple cascaded jitters in a system add with the square root of the sum of squares, so there would still be an improvement.
Consider a system where you had 10 microseconds of inherent jitter, then you added another 10 microseconds on top of that. The total is not 20 microseconds - it would be about 14.
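That root-sum-of-squares behaviour is easy to check (a quick sketch, not tied to any particular hardware):

```python
import math

def combined_jitter(*sources_us):
    # Independent jitter sources add in quadrature
    # (root-sum-of-squares), not linearly.
    return math.sqrt(sum(j * j for j in sources_us))

# Two independent 10 us sources combine to ~14.1 us, not 20 us.
print(round(combined_jitter(10, 10), 1))  # 14.1
```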
The precision achievable with FPGAs is well beyond 10s of nanoseconds.
In extreme-accuracy applications, the high-rate transceivers are used with capacitive dividers or internal signal propagation delays defining the reference intervals for time-to-digital conversion; these and similar approaches yield precision at picosecond scale and beyond.
Right, but Ethernet timestamping usually occurs on a symbol or byte level inside the MAC, so for GigE, you're looking at 8 nanoseconds per symbol. This could be improved by oversampling the symbols, but at least with GigE, there's (I think) 4 cycles of uncertainty each time the link comes up due to how the clocking agreement between master/slave is established. It's also somewhat difficult to run large counters faster than a few hundred MHz inside of an FPGA. I've worked with systems that use two different clocks - one small counter for Ethernet timestamping, and another main clock, with a periodic latching between the two to correlate them, but it's uncommon.
There's also the fact that most PHYs operate by having a FIFO with data being clocked in and clocked out on different clock domains. The way I understand it, the PHY FIFO fills up half way, and only then does it start draining into the MAC. This is to allow for oscillator frequency errors between each device. This is also why there's a maximum frame size and max oscillator tolerance specified for Ethernet - so that this FIFO doesn't under or overrun before the entire frame is received.
I think SyncE can improve this quite a bit.
Of course, FPGAs themselves are capable of much tighter timings, but in practice, it's usually implemented this way.
The approaches I am referring to are indeed oversampled and occur in a "bump in the wire" or on a parallel data path.
Moreover, the oversampling occurs at a much higher frequency than the maximum counter frequency of the FPGA -- each clock period is divided into a fractional offset from the rising edge by the capacitive divider or delay line.
It is not uncommon at all for dedicated PTP/NTP hardware to implement synchronization and timestamping in this way, even commodity NICs supporting hardware timestamping often implement it on at least a parallel data path.
> Oh I wasn't meaning specifically in the context of NTP, just generally.
So you decided to just answer with a generalization without regard for the particular case of NTP?
Are you frequently building custom hardware when default OS settings do not suit your needs?
Linux has a lot of configuration potential. It is wise to explore and understand various knobs available before you decide to complicate your life and build something very complex that could easily be replaced with a one line shell script.
I was just asking about your thoughts of how outside of trading jitter is seldom considered. I appreciate it's off-topic from the actual HN post but I was curious given your experience in trading.
The uncontrollable and unobservable SMM interrupts of most modern CPUs add sufficient jitter to the reference clock sampling that there is no correct configuration.
You generally must use an FPGA or appropriate DSP/microcontroller to achieve precision beyond ~10s of nanoseconds, which is entirely achievable even with NTP between hosts that have low contention 10+Gbps per transceiver lane interfaces / properly configured DCB/QoS so the time synchronization packets always egress without delay. PTP can, of course, achieve even better precision.
Round trip time / transmission time ("respond within a couple of microseconds") are irrelevant; it's the rising or falling edge of the packet burst that timestamps the sample, and the interval between these samples that's used to train the clocks. This can be accurate to within femtoseconds at the extreme.
It's going to offer an incremental improvement in accuracy for the server's derived clock when the reference clock is not affected by that and other sources of jitter.
Beyond the accuracy, the hardware is significantly cheaper and more efficient than a server with a general purpose x86 CPU.
For a dedicated network reference clock, an FPGA or ASIC solution is simply better in every measurable way.
It is more complex, to be sure, but the complexity needn't be your concern.
Yes, as we inch towards the mid-21st century, doing things fast is the unremarkable part. Doing things fast and synchronously over the network is the trickier one.
Interesting project, especially to gain insights into the FPGA programming part.
I think it is essentially a mixture of PTP and NTP now.
I guess this will work within the same local network, as the major inaccuracy of NTP comes from the asymmetric path delays at the network layer over the Internet.
PTP solves this by incorporating these hardware timestamps exactly. But this works only within the same LAN.
The main difference between PTP and NTP is that PTP relies on hardware support in switches and routers. Those are not cheap. If they had the same support for NTP, it would perform as well.
A highly accurate stratum-1 NTP server can be built with a common computer NIC. No need to mess with FPGAs (unless that's your thing). The Intel I210 is about $50. It has a PPS input and output. With some calibration, the timestamping can be accurate to a few tens of nanoseconds.
NTP can work very well between directly connected NICs. But without hardware support in the switches/routers, that accuracy degrades quickly in the network. A single switch can easily add hundreds of nanoseconds worth of jitter and tens of nanoseconds worth of asymmetry.
Yes, NICs with support for hardware timestamping are common (it's typically in the MAC, not PHY), but switches that have a good support for PTP, either as a boundary clock, or transparent clock, are not cheap. At least I have not seen one yet. Do you have any examples?
Some switches support NTP as a server and client (equivalent to the PTP boundary clock), but there don't seem to be any using hardware timestamping. It's just the classic ntpd using software timestamps, good to a few tens of microseconds at best.
And yes, NTP could definitely perform as well as PTP if the switches had proper support. In my tests with directly connected NICs, the synchronization is stable to a few nanoseconds, same as with PTP. At the protocol level, they use the same timestamps.
Probably all current Cisco offerings? Ubiquiti industrial switches? A whole crowd of second tier vendors like Lantech or Korenix? These are just those I had direct experience with.
Of these, Cisco definitely does boundary clock on L3, on at least several models of their routers.
> In my tests with directly connected NICs, the synchronization is stable to a few nanoseconds, same as with PTP.
Yes, network sync is a piece of cake if you drop the whole network bit. That said, I am slightly skeptical about ns-level precision with NTP. Did you measure the synchronization between the two devices with a scope?
I personally enjoyed the challenge of setting up PTP at home. Why would a hacker scoff at nanosecond-level timekeeping -- isn’t the entire internet a “telco/enterprise” thing?
To do PTP "right" requires every switch to support it and a NIC with hardware timestamps. Also, I've seen claims that PTP is no more precise than a good implementation of NTP.
PTP gives <1us synchronization. From my testing, NTP is ~20-60us after about 10 minutes of sync, but it intentionally drifts the phase around. On average, NTP is pretty close.
If you look at the white rabbit FPGA PTP updates, it's in the ns range.
Any kind of GPS + most Intel NICs will get you PTP with an accurate clock. If you didn't need to sync too many devices, you could use a single system with a bunch of NICs as your "switch".
This post didn’t sound right to me, but I realized that my raspi4 GPS NTP server has been running ntpd and not chrony. Chrony is better at modeling non-deterministic timing behavior, so I swapped to that.
It’s been ten minutes now and chronyc tracking has been marching the offset down. It’s sub 1 us at this point.
System time : 0.000000123 seconds fast of NTP time
Last offset : +0.000000366 seconds
How to get this precise time out of a non-deterministic OS? Beats me. Once I figure that out I can finish my clock project.
My best lead is to step through the different Python timing and scheduler implementations and see which has the lowest jitter relative to the PPS on an oscilloscope.
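A crude first pass at that comparison, before involving the scope, is just measuring how late time.sleep() wakes up relative to its target (a sketch; absolute numbers vary wildly with kernel config and load):

```python
import time

def measure_sleep_jitter(period_s=0.005, samples=100):
    """Measure how late time.sleep() wakes up relative to its target.

    Returns (mean, worst) lateness in microseconds -- a rough proxy
    for scheduler jitter, not a substitute for checking against the
    PPS on a scope.
    """
    lateness_us = []
    target = time.monotonic() + period_s
    for _ in range(samples):
        time.sleep(max(0.0, target - time.monotonic()))
        lateness_us.append((time.monotonic() - target) * 1e6)
        target += period_s
    return sum(lateness_us) / len(lateness_us), max(lateness_us)

mean_us, worst_us = measure_sleep_jitter()
print(f"mean lateness {mean_us:.1f} us, worst {worst_us:.1f} us")
```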
Assuming you're using a PPS signal and a kernel driver, presumably there's an interrupt handler or perhaps a capture timer peripheral that is capturing a hardware timer when the PPS edge occurs. It doesn't matter too much when the userspace code gets around to adjusting the hardware timer as long as it can compute the difference between when the PPS edge came in and when it should have come in. The Linux API for fine tuning the system time works in deltas rather than absolute timestamps, so it is once again fairly immune to userspace scheduling jitter.
Even good hardware oscillators can have a fair amount of drift, say 50us per second (50 ppm), but they tend to be stable over several minutes outside of extreme thermal environments. Therefore, it's pretty easy to estimate and compensate for drift using a PPS signal as a reference. Presumably, that compensation is part of what takes a while for the time daemon to converge on.
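The drift estimate itself is simple once you have the PPS: time successive PPS edges with the local clock, and any systematic deviation from exactly 1 s is the frequency error. A sketch, with made-up numbers:

```python
def estimate_drift_ppm(pps_intervals_s):
    """Estimate local oscillator frequency error in ppm, given the
    locally measured durations of successive PPS intervals (each
    interval is exactly 1 s of true time by definition of the
    reference)."""
    mean = sum(pps_intervals_s) / len(pps_intervals_s)
    return (mean - 1.0) * 1e6

# An oscillator running 50 us/s fast measures each true second as
# roughly 1.00005 s of local time, i.e. ~+50 ppm:
print(estimate_drift_ppm([1.00005] * 10))
```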
Additionally, the clock sync daemon likely takes a while to converge because it isn't directly controlling the system time. Rather, it is sending hints to the kernel for it to adjust the time. The kernel decides how best to do that, and it does it in a way that attempts to avoid breaking other userspace programs that are running. For example, it tries to keep system time monotonically increasing. This means that there's relatively low gain in the feedback loop, and so it takes a while to cancel out error.
It's possible for a userspace program to instead explicitly set system time, but that really isn't intended to be used in Linux unless time is more than 0.5 seconds off. The API call to do that is inherently vulnerable to userspace scheduling jitter, but it's fine since 0.5 seconds is orders of magnitude longer than the expected jitter. You get the system time within the ballpark, and then incrementally adjust it until it's perfect.
If you're not using a kernel driver to capture the PPS edge's timestamp, then you're going to have a rougher time. Either you're just going to have to accept the fact that you can't do better than the scheduling jitter (other than assume it averages out), or you're going to have to do something clever/terrible. One idea would be to have your userspace process go to sleep until, say, 1ms before you expect the next PPS edge to come in. Then, go into a tight polling loop until the edge occurs. As long as reading the PPS pin from userspace is non-blocking and your process doesn't get preempted, you should be able to get at least within microseconds. You can poll system time in the same tight loop, allowing you to fairly reliably detect whether the process got preempted or not.
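The sleep-then-spin idea above can be sketched like this. Note that read_pin is a caller-supplied stand-in for whatever non-blocking GPIO read your library offers -- it is hypothetical, not a real API:

```python
import time

def wait_for_pps_edge(read_pin, next_edge_ns, spin_window_ns=1_000_000):
    """Sleep until ~1 ms before the expected PPS edge, then busy-poll.

    read_pin is a caller-supplied non-blocking callable returning the
    current PPS line level (hypothetical; wire it to your GPIO
    library of choice). Returns time.monotonic_ns() at the rising edge.
    """
    # Coarse sleep, leaving a margin larger than typical scheduling
    # jitter; being late here only shortens the spin.
    now = time.monotonic_ns()
    if next_edge_ns - now > spin_window_ns:
        time.sleep((next_edge_ns - now - spin_window_ns) / 1e9)
    # Tight poll for the low->high transition.
    prev = read_pin()
    while True:
        cur = read_pin()
        if cur and not prev:
            return time.monotonic_ns()
        prev = cur
```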
Thank you for the detailed response! The PPS is currently driving a hardware interrupt on the raspberry pi that is read in by kernel mode software. My project is to drive an external display. Normally I would bypass the raspberry pi altogether and connect the PPS signal to the strobe input of the SIPO shift register. The problem is that the PPS signal cannot be trusted to always exist. Using a raspberry pi has a few benefits: setting the timezone based on location, handling leap seconds, and smoothing out inconsistent GPS data. So while opting to use system time to drive the start of second adds error, I think the tradeoff for reliability is worth it.
I have considered adding complexity, such as adding a hardware mux to choose whether to use the GPS PPS signal or the raspberry pi's start-of-second. I should walk before I run though.
If you want to precisely generate a PPS edge in software with less jitter than you can schedule, you can use a PWM peripheral. Wake up a few milliseconds before the PPS edge is due, get the system time, and compute the precise time until the PPS is due. Initialize the PWM peripheral to transition that far into the future, then go back to sleep until a bit after the transition should have happened, and disable the PWM peripheral.
This works because a thread of execution generally knows what time it is with higher precision than it can accurately schedule itself.
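As a sketch of that pattern -- where arm_oneshot_ns() is a hypothetical stand-in for whatever your board's PWM/timer peripheral API actually looks like:

```python
import time

def schedule_precise_edge(arm_oneshot_ns, next_edge_ns,
                          wakeup_margin_ns=3_000_000):
    """Produce an output transition at next_edge_ns with less jitter
    than the process can schedule itself.

    arm_oneshot_ns(delay_ns) is a hypothetical stand-in: it arms a
    hardware one-shot that flips the pin delay_ns from now.
    """
    # Coarse sleep: wake a few ms early; scheduling jitter here is
    # harmless as long as we stay inside the margin.
    now = time.monotonic_ns()
    if next_edge_ns - now > wakeup_margin_ns:
        time.sleep((next_edge_ns - now - wakeup_margin_ns) / 1e9)
    # Reading the clock is far more precise than scheduling, so this
    # delta is accurate even though our wakeup instant wasn't.
    delay_ns = next_edge_ns - time.monotonic_ns()
    arm_oneshot_ns(delay_ns)
    # Sleep past the transition before reconfiguring the peripheral.
    time.sleep(max(0.0, delay_ns + 1_000_000) / 1e9)
```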
I'm not sure I understand how you're using a PPS signal to drive a display, though. Is it an LED segment display? I assume you want it to update once a second, precisely on the edge of each second. Displays generally exist for humans, though, and a human isn't going to perceive a few milliseconds of jitter on a 1Hz update.
Nixie tubes driven by a pair of cascaded HV5122 (driver + shift register). The strobe input is what updates the output registers with the recently shifted in contents. The driver takes 500 ns to turn on and the nixie tubes take about 10 us to fire once the voltage is applied.
I know it's absurd to worry about the last few ms, but it's part of what interests me about the project. The goal is to make The Wall Time as accurate as I can. I could go further with a delay-locked loop fed from measuring nixie tube current. There is room to push down to dozens of nanoseconds of error relative to the PPS source, but I am content with the 10s of microseconds. I can't imagine ever having access to a camera that could capture that amount of error.
Thanks for the tip. Hardware timers are best. I'll likely have to take some measurements to calibrate the computation time of getting the system time and performing the subtraction.
Sounds like fun! For what it's worth, ublox GPS modules and their clones should be configurable to always produce a PPS signal regardless of whether or not they have a satellite fix. The module would probably do a better job than software on a pi could during transient periods without a fix (due to how accurate the oscillators need to be in a GPS module). So, as long as you can trust the GPS module to exist and be powered, you should be able to reliably clock your display update with it. The only reason really to generate your own PPS would be if you want it to work without a GPS module at all, perhaps by NTP or something; you're then of course again looking at only a millisecond or so of accuracy.
I'm using an uputronics GPS/RTC hat that has a u-blox M8 engine. I set it to stationary mode for extra accuracy. I'll have to look into other configuration options.
NTP gets worse if you sync more than two devices across a broader network with other switched traffic, more into low 100s of µs. PTP does not degrade similarly and yes, most of PHYs made since middle of the last decade support it.
> If you look at the white rabbit FPGA PTP updates, it's in the ns range
As I recall, I had even better performance than that, around tens of picoseconds. But I guess the advertised 1 ns is a conservative estimate. The precision is incredible but it's not magic: they squeeze the maximum amount of determinism out of custom hardware and fiber optic links. It is a bit of a pain to set up, as you need to calibrate each link individually every time you change the fiber or the SFP.
> To do PTP "right" requires every switch to support it and a NIC with hardware timestamps.
I agree, but PTP will in fact work over regular commercial switches on a LAN. The problem is that it will introduce jitter if there's other traffic on the network, but as long as the paths from master to slave (terms used in the standard) remain symmetric, you can filter this out and achieve performance almost as good as if you were using PTP transparent switches.
The NIC with hardware timestamps part should be pretty easy if you're already implementing it using an FPGA like this project did - in fact, in a sense that seems to be exactly what they're doing with NTP. Finding switches that support it might be a little harder.
Many, many computers on the internet have an RTC with second resolution and a timer tick resolution in the microseconds. Reference time resolution that's that much higher than your timer tick period is useless.
Jitter is the variation in latency of all sources.
The total latency is: A) the delay in the original packet being sent + B) the latency of the network itself + C) the latency of the NTP server receiving the packet + D) the latency of the process answering the packet + E) the latency of the kernel sending the response + F) the latency of the network itself for the return packet + G) the latency of the client receiving the return packet and giving it an accurate local time mark.
On an uncongested LAN, B & F can be very low in absolute terms and low in variation. This server constrains C, D, & E to basically nothing. So only A & G -- the limitations of the client itself -- remain. How long does it take to receive the frame, get it to system RAM, dispatch an interrupt (which may be coalesced), service the interrupt, and context switch to/wake up the right client process? Alternatively, the network driver can capture a timestamp at the moment the packet is received, eliminating a lot of this variation.
NTP clients assume half of the delay is client->server and half is server->client, for the purpose of computing offset. That is, they subtract half the roundtrip delay off the received time. So reducing delay in C, D, & E further reduces the amount that this guess can be wrong by.
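That on-wire calculation, spelled out (this is the standard NTP formula, nothing implementation-specific):

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Standard NTP on-wire calculation.

    t1: client transmit, t2: server receive,
    t3: server transmit, t4: client receive (all in seconds).
    The offset formula bakes in the symmetric-path assumption;
    path asymmetry turns directly into offset error.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# 1 ms each way, 0.5 ms server turnaround, clocks already in agreement:
offset, delay = ntp_offset_delay(0.0, 0.001, 0.0015, 0.0025)
# offset ~0; delay = 2 ms (round trip minus server processing time)
```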
The board there is overkill -- if you really want something with a Zynq including an Arm Cortex A9 that can run Linux, the Arty Z7 from Digilent is another Zynq board that's essentially a one-stop shop for this: https://store.digilentinc.com/arty-z7-zynq-7000-soc-developm...
As someone who doesn't know much about FPGA boards, what does it mean when you write "From there you can instantiate either Xilinx's Microblaze MCU"?
Also, one thing that is confusing to me is that the source code linked on this page is a .h and a .c file. Is the FPGA on these Zynq boards programmable with C code?
“Soft” means the CPU is added to your overall design as a block and is compiled along with your design to the FPGA bitstream. This way, the CPU ends up being implemented (instantiated) on the FPGA. This approach eats into your overall FPGA resource budget and leads to lower CPU performance due to the FPGA overhead.
The alternative to a soft core is a “hard” CPU core, which simply means that the CPU is included in a separate area of silicon (usually on the same die). The Zynq 7000 is a good example.
[1]: https://n4bfr.com/2020/04/raspberry-pi-with-chrony/2/
[2]: https://www.aliexpress.com/item/4001136384325.html?spm=a2g0s...
[3]: https://www.aliexpress.com/item/33059221782.html?spm=a2g0s.9...
[4]: https://raspberrypi.stackexchange.com/a/104296