You're a bit pessimistic, but beyond that I feel like you're missing the point.
The purpose of an RTOS on big hardware is to provide bounded-latency guarantees to many things with complex interactions, while keeping system throughput high (though not as high as a non-RTOS).
A small microcontroller can typically only service one interrupt in a guaranteed-fast fashion. If you don't use interrupt priorities, it's a mess; and if you do, the latencies add up, so the lowest-priority interrupt can end up waiting indefinitely.
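To make that concrete, here's a minimal sketch for an ATmega-class AVR (the vector choices and `do_slow_housekeeping` are placeholders of mine, not from anything above). By default the AVR runs every ISR with global interrupts masked, so a slow handler delays every other interrupt by its full run time; re-enabling interrupts inside the ISR lets urgent handlers nest, which is exactly the stacked-latency tradeoff:

```c
#include <avr/io.h>
#include <avr/interrupt.h>

static void do_slow_housekeeping(void)  /* stand-in for many-microsecond work */
{
    for (volatile uint16_t i = 0; i < 10000; ++i)
        ;
}

ISR(TIMER1_COMPA_vect)          /* slow, low-urgency timer work */
{
    sei();                      /* allow more urgent interrupts to nest */
    do_slow_housekeeping();     /* without sei() above, this whole run time is
                                   added to every other interrupt's latency */
}

ISR(INT0_vect)                  /* urgent external-pin handler */
{
    PORTB ^= _BV(PB0);          /* respond as fast as possible */
}

int main(void)
{
    DDRB |= _BV(PB0);           /* pin as output */
    sei();                      /* global interrupt enable */
    for (;;)
        ;                       /* timer/INT0 peripheral setup omitted */
}
```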
So, we tend to move to bigger microcontrollers (or small microprocessors) and run an RTOS on them for timing-critical stuff. You can get latencies of several microseconds with hundreds of nanoseconds of jitter fairly easily.
But bigger RTOSes are kind of annoying; you don't have the option to run all the world's software out there as lower-priority tasks, and their POSIX layers tend to be kind of sharp and inconvenient. With preempt-rt, you can have all the normal Linux userland around, and if you don't have any poorly behaving drivers, you can do nearly as well as a "real" RTOS. So, e.g., I've run a 1.6 kHz flight control loop for a large hexrotor on a Raspberry Pi 3 plus a machine vision stack based on python+opencv.
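For reference, the usual userspace shape of that kind of loop on preempt-rt looks something like the sketch below (the period and priority are illustrative, not the actual hexrotor numbers): lock memory so page faults can't bite mid-cycle, go SCHED_FIFO, and sleep to absolute deadlines so error doesn't accumulate.

```c
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L
#define PERIOD_NS    625000L    /* 1.6 kHz -> 625 us period */

int main(void)
{
    struct sched_param sp = { .sched_priority = 80 };
    struct timespec next;

    /* Lock all pages so a page fault can't add latency mid-loop. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");
    /* Real-time FIFO scheduling; needs CAP_SYS_NICE or root. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        perror("sched_setscheduler");

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (;;) {
        /* control-law step would go here */

        next.tv_nsec += PERIOD_NS;
        if (next.tv_nsec >= NSEC_PER_SEC) {
            next.tv_nsec -= NSEC_PER_SEC;
            next.tv_sec++;
        }
        /* Absolute-deadline sleep: drift doesn't accumulate per cycle. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}
```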
Note that wherever we are, we can still choose to do stuff in high priority interrupt handlers, with the knowledge that it makes latency worse for everything else. Sometimes this is worth it. On modern x86 it's about 300-600 cycles to get into a high priority interrupt handler if the processor isn't in a power saving state, which might be about 100-200ns. It's also not mutually exclusive with using things like PIO: on i.MX8 I've used their rather fancy DMA controller, which is basically a Turing-complete processor, to do fancy things in the background while RT stuff of various priorities runs on the processor itself.
That's a best-case number, based on warm power management, an operating system that isn't disabling interrupts, and the interrupt handler being warm in L2/L3 cache.
Note that things like PCIe MSI can add a couple hundred nanoseconds themselves if this is how the interrupt is arriving. If you need to load the interrupt handler out of SDRAM, add a couple hundred nanoseconds more, potentially.
And if you are using power management and let the system get into "colder" states, add tens of microseconds.
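If you're under Linux rather than bare metal, one common way to keep the system out of those "colder" states is the /dev/cpu_dma_latency PM QoS interface; here's a minimal sketch (the 0 μs bound is just the most aggressive setting):

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Writing a latency bound (microseconds, as a binary int32) to
     * /dev/cpu_dma_latency registers a PM QoS request; the kernel then
     * avoids idle states whose exit latency exceeds it. The request
     * holds only while the fd stays open; 0 pins cores out of deep
     * C-states entirely. */
    int32_t max_latency_us = 0;
    int fd = open("/dev/cpu_dma_latency", O_WRONLY);
    if (fd < 0) { perror("open"); return 1; }
    if (write(fd, &max_latency_us, sizeof max_latency_us)
            != sizeof max_latency_us)
        perror("write");
    pause();   /* keep the fd, and hence the QoS request, alive */
    return 0;
}
```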
hmm, i think what matters for hard-real-time performance is the worst-case number though, the wcet, not the best or average case number. not the worst-case number for some other system that is using power management, of course, but the worst-case number for the actual system that you're using. it sounds like you're saying it's hard to guarantee a number below a microsecond, but that a microsecond is still within reach?
But you make the choices that affect these numbers. You choose whether you use power management; you choose whether you have higher priority interrupts, etc.
> that they couldn't get better than 10μs,
There are multiple things discussed here. In this subthread, we're talking about what happens on amd64 with no real operating system, a high priority interrupt, power management disabled and interrupts left enabled. You can design to consistently get 100ns with these constraints. You can also pay a few hundred nanoseconds more of taxes with slightly different constraints. This is the "apples and apples" comparison with an AVR microcontroller handling an interrupt.
Whereas with rt-preempt, we're generally talking about the interrupt firing, a task getting queued, and then run, in a contended environment. If you do not have poorly behaving drivers enabled, the latency can be a few microseconds and the jitter can be a microsecond or a bit less.
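A quick way to see that distribution for yourself is a cyclictest-style probe like the sketch below: program an absolute wakeup, then measure how late the task actually ran. Run it under SCHED_FIFO (e.g. `chrt -f 80 ./probe`), ideally on a preempt-rt kernel, for meaningful numbers.

```c
#include <stdio.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

int main(void)
{
    struct timespec next, now;
    long worst_ns = 0;

    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < 100000; i++) {
        next.tv_nsec += 1000000;               /* 1 ms period */
        if (next.tv_nsec >= NSEC_PER_SEC) {
            next.tv_nsec -= NSEC_PER_SEC;
            next.tv_sec++;
        }
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        clock_gettime(CLOCK_MONOTONIC, &now);
        /* how late did we actually wake relative to the deadline? */
        long late_ns = (now.tv_sec - next.tv_sec) * NSEC_PER_SEC
                     + (now.tv_nsec - next.tv_nsec);
        if (late_ns > worst_ns)
            worst_ns = late_ns;
    }
    printf("worst wakeup latency: %ld ns\n", worst_ns);
    return 0;
}
```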
That is, we were talking about interrupt latency (absolute time) under various assumptions; osamagirl69 was talking about task jitter (variance in time) under different assumptions.
You can, of course, combine these techniques; you can do stuff in top-half interrupt handlers in Linux, and if you keep the system "warm" you can service those quite fast. But you lose abstraction benefits and you add latency to everything else on the system.
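As a sketch of what that looks like (the IRQ number and names here are placeholders of mine): a top-half handler is just what request_irq installs, and on a preempt-rt kernel you'd pass IRQF_NO_THREAD to keep it in hard interrupt context instead of letting it be pushed into a handler thread.

```c
#include <linux/interrupt.h>
#include <linux/module.h>

#define IRQ_NUM 42   /* placeholder IRQ line */

static irqreturn_t fast_handler(int irq, void *dev_id)
{
    /* Do only the genuinely time-critical work here: every cycle spent
     * in this handler delays all lower-priority work on this core. */
    return IRQ_HANDLED;
}

static int __init fast_init(void)
{
    /* IRQF_NO_THREAD: stay a hard top-half even under PREEMPT_RT. */
    return request_irq(IRQ_NUM, fast_handler, IRQF_NO_THREAD,
                       "fast-top-half", NULL);
}

static void __exit fast_exit(void)
{
    free_irq(IRQ_NUM, NULL);
}

module_init(fast_init);
module_exit(fast_exit);
MODULE_LICENSE("GPL");
```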
i didn't realize you were proposing using amd64 processors without a real operating system; i thought you were talking about doing the rapid-response work in top-half interrupt handlers on linux. i agree that this adds latency to everything else
with respect to latency vs. jitter, i agree that they are not the same thing, because you can have high latency with low jitter, but i don't see how your jitter can be more than your worst-case latency. isn't the jitter just the variance in the latency? if all your latencies are in the range from 0–1μs, how could you have 10μs of jitter, as osamagirl69 was reporting? i guess maybe you're saying that if you move the work into userland tasks instead of interrupts you get tens of microseconds of latency
i'm not sure that the 'apples to apples' comparison between amd64 systems and avr microcontrollers is to use equal numbers of cores on both systems. usually i'd think the relevant comparison would be systems of similar costs, or physical size, or power consumption, or difficulty of programming or setting up or something. that last one might favor a raspberry pi or amd64 rig or something though...
> i thought you were talking about doing the rapid-response work in top-half interrupt handlers on linux.
When we talk about worst-case latency to high priority top-half handlers on linux, it comes down to
A) how much time all interrupts can be disabled for. You can drive this down to near 0 by e.g. not delivering other interrupts to a given core.
B) whether you have any weird power saving features turned on.
That is, you can make choices that let you consistently hit a couple hundred ns.
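Point A is mostly a configuration exercise. For example, a sketch of steering an interrupt's delivery mask via procfs (IRQ 42 and the mask are placeholders): move everything except your critical IRQ onto housekeeping cores, and pair this with `isolcpus=` or cpusets so the reserved core stays quiet.

```c
#include <stdio.h>

int main(void)
{
    /* Restrict IRQ 42 to CPUs 0-2 (hex mask 0x7), keeping CPU 3 free
     * for the RT work; repeat for each unrelated IRQ on the system. */
    FILE *f = fopen("/proc/irq/42/smp_affinity", "w");
    if (!f) { perror("fopen"); return 1; }
    fprintf(f, "7\n");
    fclose(f);
    return 0;
}
```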
> i guess maybe you're saying that if you move the work into userland tasks instead of interrupts you get tens of microseconds of latency
I think "tens" is unfair on most computers. I think "several" is possible on most, and you can get "a couple" with careful system design.
> i'm not sure that the 'apples to apples' comparison between amd64 systems and avr microcontrollers is to use equal numbers of cores on both systems.
I wasn't saying equal numbers of cores. I was saying:
* Compare interrupt handlers with interrupt handlers; not interrupt handlers with tasks. Task latency on FreeRTOS/AVR is not that great.
* Compare latency to latency, or jitter to jitter.
> be systems of similar costs
The price of a microcontroller running an RTOS is trivial, and you can even get to something running preempt_rt for about the cost of a high-end AVR (which is not a cheap microcontroller).
You have to sell a lot of units and have a particularly trivial problem to be ahead doing things the "hard way."