
Reducing UDP Latency - abondarev
https://medium.com/@deryugin.denis/reducing-udp-latency-ce60d98c7bff
======
dwheeler
I'm glad he solved his problem, and I guess it's good to post "I solved this
problem" information.

But this seems _painfully_ obvious. TL;DR: "I had a demanding real-time
requirement for a Linux process, I fixed it by using the Linux real-time
scheduler for my real-time process."

Let's look at this in more detail. He's trying to implement time-critical
functions with Linux and has a demanding real-time requirement ("Maximum
acceptable latency is 0.1ms while basic Linux solution could only provide
0.5ms."). He's even struggling to get this latency with an RTOS (!). He's not
getting the real-time latency he needs by using the default Linux settings. He
also can't get it by setting the "nice" value, but the "nice" value isn't
relevant for real-time processes... which makes it clear that he's trying to
get real-time performance without using the real-time schedulers.

His solution:

    chrt --rr 99 ./client

That asks Linux to run the client under SCHED_RR (round-robin scheduling), one
of its real-time policies, at priority 99 instead of under the default
(non-real-time) scheduler.

In short: If you're doing real-time work, you need to use a real-time
scheduler for it.
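
If you'd rather request the policy from inside the program than wrap it in chrt, something like the following should work. This is only a rough sketch in Python, assuming Linux; the helper name try_set_round_robin is made up for illustration, and the call normally fails unless the process runs as root or has CAP_SYS_NICE.

```python
import os


def try_set_round_robin(priority=99):
    """Try to move the calling process to SCHED_RR; return True on success.

    Hypothetical helper for illustration. Linux only.
    """
    # Clamp the requested priority to the range the kernel reports for SCHED_RR.
    lowest = os.sched_get_priority_min(os.SCHED_RR)
    highest = os.sched_get_priority_max(os.SCHED_RR)
    priority = max(lowest, min(highest, priority))
    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(priority))
    except OSError:
        # Usually a PermissionError: real-time policies need root,
        # CAP_SYS_NICE, or a suitable RLIMIT_RTPRIO.
        return False
    return True


if __name__ == "__main__":
    if try_set_round_robin(99):
        print("now running under SCHED_RR")
    else:
        print("could not switch to SCHED_RR (insufficient privileges?)")
```

The fallback branch matters in practice: without the right privileges the process silently stays on the default scheduler unless you check.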

Technically this doesn't completely solve his problem; if he were serious
about 0.1 msec, SCHED_RR isn't doing it, as shown by the posted bar chart. But
he seemed happy enough with it, so perhaps his requirements are really more
probabilistic ("X% of the time, UDP must be delivered in 0.1 msec").
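
To make the "X% of the time" idea concrete, here is a small sketch of how such a requirement could be checked against measured latencies. The function name and the sample values are invented for illustration; they are not taken from the article's measurements.

```python
def fraction_within(latencies_us, deadline_us):
    """Return the share of samples (0.0 to 1.0) that met the deadline."""
    if not latencies_us:
        raise ValueError("need at least one latency sample")
    met = sum(1 for value in latencies_us if value <= deadline_us)
    return met / len(latencies_us)


# Made-up samples in microseconds, for illustration only.
samples_us = [80, 95, 90, 130, 85, 99, 100, 210, 92, 88]

# A 0.1 msec deadline is 100 microseconds.
share = fraction_within(samples_us, 100)
print(f"{share:.0%} of samples met the deadline")
```

A hard real-time requirement would need this share to be 100% under all conditions, which is a much stronger claim than any finite set of samples can show.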

Again, I'm glad he solved his problem, but I hope most developers already know
that they need to use real-time schedulers for real-time work.

~~~
zwieback
I interpreted the Linux thing as just making the Linux host a better analysis
tool; the main focus was the embedded system side.

I would probably use Wireshark to find the packet timing; who knows what
happens in an OS as complex as Linux. You can also use something like
Wireshark to extract debug payloads like ISR or DMA timing on the embedded
system, although admittedly a script on the Linux side is easier.

------
zackmorris
This is interesting, but I would have thought that gigabit ethernet would have
higher latency than 100 megabit, because with spread-spectrum communication,
latency tends to go up as bandwidth goes up. So ethernet must use a more
discrete encoding.

I started looking up info related to this, but I'd be curious to hear opinions
first. I'm having trouble finding what the potential bandwidth for ethernet
would be if it went to spread-spectrum, and how much that might increase
latency.

~~~
CodesInChaos
For 512-byte packets, transmission over 100 Mbit/s takes ~50 µs by itself, so
the OP's target latency is difficult to reach on 100 Mbit/s, even under ideal
circumstances.
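
A back-of-the-envelope version of that number, as a sketch. It assumes the 512 bytes are UDP payload carried over IPv4 in an untagged Ethernet frame, and it counts the preamble but not the inter-frame gap; the function name is made up. Under those assumptions the frame is 566 bytes on the wire, which is about 45 µs at 100 Mbit/s and about 4.5 µs at 1 Gbit/s.

```python
def wire_time_us(payload_bytes, link_bits_per_s):
    """Serialization time of one UDP/IPv4 datagram on Ethernet, in microseconds."""
    udp_header = 8
    ipv4_header = 20       # assumes no IP options
    eth_header = 14        # assumes no VLAN tag
    eth_fcs = 4
    preamble_sfd = 8       # preamble plus start-of-frame delimiter
    frame_bytes = (payload_bytes + udp_header + ipv4_header
                   + eth_header + eth_fcs + preamble_sfd)
    return frame_bytes * 8 / link_bits_per_s * 1e6


print(f"100 Mbit/s: {wire_time_us(512, 100e6):.2f} us")
print(f"1 Gbit/s:   {wire_time_us(512, 1e9):.2f} us")
```

This covers only the time to clock the bits onto the wire; driver, stack, and scheduling delays come on top of it.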

Though I don't think most of the difference the OP found is caused by the
link speed difference, but rather by the hardware or driver being better
after the swap.

------
easytiger
Got UDP send down to 800 nanoseconds for blocking calls with small payloads on
Solarflare cards using ef_vi.

~~~
emj
Sending in 0.8 µs is scary fast; I can't even get my clock synced to that with
a (crappy) PPS. But that would be on a 10G board, I guess, which is probably
not going to be available on an embedded system with resource constraints.

~~~
absurdmind
For the right combination of budget, application and team you can get a custom
solution with much lower latency. Back in the day my team beat Solarflare
hands down, but only for a single application. I don't remember the total
number, but each additional 4 bytes of payload added 5-7 ns, IIRC.

~~~
abondarev
Yes, of course. If you have a big budget and a great team, you can optimize
your application a great deal. But in Embox the user applications stay the
same, and customization was very easy!

------
xxpor
I mean, if you really want to reduce latency the classic solution is to use
DPDK: skip the kernel entirely and run in polling mode.

Not great if you care about power consumption, but your latency will be low.
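
To illustrate just the polling half of that trade-off: the sketch below is not DPDK and does not bypass the kernel. It only spins on an ordinary non-blocking UDP socket instead of sleeping in a blocking call, which is the same "burn a core to avoid wake-up latency" idea in miniature. The function name and the loopback demo are made up for illustration.

```python
import socket
import time


def busy_poll_recv(sock, timeout_s=1.0, bufsize=2048):
    """Spin on a non-blocking socket until a datagram arrives or time runs out.

    Returns the datagram bytes, or None on timeout.
    """
    sock.setblocking(False)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            return sock.recv(bufsize)
        except BlockingIOError:
            # Nothing queued yet: keep spinning, never sleep.
            continue
    return None


if __name__ == "__main__":
    # Loopback demo: send one datagram to ourselves and poll for it.
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    tx.sendto(b"ping", rx.getsockname())
    print(busy_poll_recv(rx))
    rx.close()
    tx.close()
```

The loop keeps one core at 100% while it waits, which is exactly the power cost mentioned above.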

~~~
abondarev
I agree, but in Embox you can use the usual applications. The kernel is
rather difficult.

------
farazbabar
Wouldn't you want to take your process (and threads, if appropriate) out of
scheduling entirely using CPU and thread/process pinning? Using the taskset
command, for instance.
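
For reference, the in-process equivalent of taskset looks roughly like this. It is a sketch assuming Linux, and the helper name pin_to_cpu is made up. Note that pinning by itself only restricts where the process may run; other tasks can still be scheduled on that CPU unless it is also isolated (for example with the isolcpus boot parameter).

```python
import os


def pin_to_cpu(cpu=None):
    """Restrict the calling process to a single CPU; return the new affinity set.

    Hypothetical helper for illustration. Linux only.
    """
    allowed = os.sched_getaffinity(0)   # pid 0 means "the calling process"
    if cpu is None:
        cpu = min(allowed)              # default: lowest-numbered allowed CPU
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print("pinned to:", sorted(pin_to_cpu()))
```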

------
mmxmb
Off-topic: code formatting on Medium is terrible.

~~~
abondarev
Yes, I agree. There is no syntax highlighting.

~~~
jsjddbbwj
That's not that bad; what's worse is that code blocks don't have a horizontal
scrollbar, so long lines simply wrap on a mobile browser.

~~~
abondarev
Yes, sure!

