Music, for example. If you're doing audio processing for music, it's far preferable to have something that consistently takes 20ms to transform an input than something that takes 5ms 90% of the time and 50ms 10% of the time (due to, e.g., interrupts). That 50ms gap will show up as a note that's audibly behind the beat, while the improvement from 20ms to 5ms isn't really that beneficial.
EDIT: another example is controlling a UAV. You can design around a system that consistently takes 20ms to process an input (e.g. by limiting the max speed). It's a lot more difficult to design around a component that takes 5ms most of the time, but will randomly take 50ms here and there, because you don't get to control when those lag spikes happen.
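To put rough numbers on that trade-off, here's a minimal sketch comparing the two latency profiles from the examples above. The 25ms deadline is a made-up figure for illustration, and `summarize` is a hypothetical helper, not something from a real library.

```python
def summarize(profile, deadline_ms):
    """Summarize a latency profile given as (latency_ms, probability) pairs.

    Returns (mean latency, worst-case latency, fraction of inputs that
    miss the deadline).
    """
    mean = sum(lat * p for lat, p in profile)
    worst = max(lat for lat, _ in profile)
    miss_rate = sum(p for lat, p in profile if lat > deadline_ms)
    return mean, worst, miss_rate


# The two systems from the discussion above.
steady = [(20.0, 1.0)]              # always 20ms
spiky = [(5.0, 0.9), (50.0, 0.1)]   # 5ms 90% of the time, 50ms 10% of the time

DEADLINE_MS = 25.0  # assumed deadline, purely illustrative

for name, profile in (("steady", steady), ("spiky", spiky)):
    mean, worst, miss = summarize(profile, DEADLINE_MS)
    print(f"{name}: mean={mean:.1f}ms worst={worst:.1f}ms miss_rate={miss:.0%}")
```

The spiky system wins on average latency (about 9.5ms vs. 20ms), but it's the worst case and the miss rate that you have to design around, and there it loses badly.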
For music, it seems like the rt-linux patch ought to be enough. They claim sub-100 microsecond timing jitter for process wakeup (https://rt.wiki.kernel.org/index.php/RT_PREEMPT_HOWTO#Benchm...). That ought to be tiny, if you're talking about audio latency in the range of milliseconds.
For UAVs, you're typically talking to ESCs that want PWM signals with pulse widths in the 1-2ms range, so tens of microseconds of jitter would certainly matter for PWM generation. That's why you'd probably offload PWM to an external chip, and keep on the CPU only the code that can tolerate 20ms processing times with 100us of jitter (like the flight-control loop).
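A back-of-the-envelope sketch of why the same jitter matters in one place and not the other. It assumes the common hobby convention that the 1-2ms pulse width maps linearly onto 0-100% throttle; the function names are made up for illustration.

```python
# Assumed ESC convention: 1000us pulse = 0% throttle, 2000us = 100%, linear in between.
PULSE_MIN_US = 1000.0
PULSE_MAX_US = 2000.0


def throttle_error_fraction(jitter_us):
    """Fraction of the full throttle range that a pulse-width error of jitter_us covers."""
    return jitter_us / (PULSE_MAX_US - PULSE_MIN_US)


def loop_jitter_fraction(jitter_us, period_ms):
    """Jitter as a fraction of a control-loop period given in milliseconds."""
    return jitter_us / (period_ms * 1000.0)


print(f"100us on the PWM pulse: {throttle_error_fraction(100.0):.1%} of throttle range")
print(f"100us on a 20ms loop:   {loop_jitter_fraction(100.0, 20.0):.1%} of the period")
```

Under those assumptions, 100us of jitter on the pulse itself is 10% of the throttle range, while the same 100us is only 0.5% of a 20ms loop period.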
Bare metal can't be beat for tiny (easily auditable) code size and a general lack of "what if the timing goes wrong somehow" situations, of course. Plus, who knows if the rt-linux patches would actually perform to that level on a Pi Zero.
My last experience with rt-linux taught me that the kernel is not the only thing that needs to support real-time. As soon as you have any drivers (DMA!) or, e.g., System Management Interrupts (SMIs) that circumvent the kernel's preemptive scheduler, however smart it is, you're still screwed.
So, what was said is correct: hard real-time is hard, and an OS on top of a "general purpose" hardware platform makes it harder.
That being said, there is stuff like VxWorks, which kind of proves that it is possible, if you have full control over the hardware and the OS.
100μs of jitter is enough to cause major problems if you're bit-banging or doing high-speed closed-loop control. Something like the Bus Pirate is almost trivial to implement on bare metal, but a complete nightmare if there's an OS involved.
RTOSes can be extremely useful, but they're not a perfect substitute for bare metal.
Well, right, that needs to be a multi-purpose serial interface device, and who knows what crazy timing requirements some sensor or interface chip will have. An example: there are some cheap little radios around. They literally just output a carrier with a binary value encoded on it in a super-simple modulation. Even with an RT device, you've got to keep transmissions very short and include a clock-sync section at the beginning.
You can still transmit and receive if your devices have a lot of jitter, but you'll have to kill your transmission rate to keep things working reliably.
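A rough sketch of how jitter caps the transmission rate. The rule of thumb here is an assumption on my part, not a property of any particular radio: edge timing error has to stay under some fraction of the bit period (25% by default below) for the receiver to sample reliably. `max_bit_rate` is a hypothetical helper.

```python
def max_bit_rate(jitter_s, tolerance=0.25):
    """Highest bit rate (bits/s) at which jitter_s stays within
    `tolerance` of one bit period. The tolerance is an assumed rule of thumb."""
    min_bit_period_s = jitter_s / tolerance
    return 1.0 / min_bit_period_s


for jitter_us in (1, 10, 100):
    rate = max_bit_rate(jitter_us * 1e-6)
    print(f"{jitter_us:>3}us jitter -> about {rate:,.0f} bit/s")
```

With that assumed 25% margin, 100us of jitter limits you to something like 2,500 bit/s, and every 10x reduction in jitter buys you 10x the rate.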