
Streaming Video and Audio with Low Delay - mortenvp
http://steinwurf.com/blog/2018-04-25-2022.html
======
urlgrey
Periscope developed a Low-Latency HTTP Live Streaming (LHLS) technique that
relies on HTTP chunked transfer-encoding to stream video bytes as they are
encoded at the origin. This is still subject to TCP packet retransmission
overhead, but the time-to-first-byte is reduced significantly and leads to
less buffering on the client.

Here's a Periscope post about LHLS:
[https://medium.com/@periscopecode/introducing-lhls-media-streaming-eb6212948bef](https://medium.com/@periscopecode/introducing-lhls-media-streaming-eb6212948bef)

Most systems that serve HLS media use fixed content-length segments, which
requires knowledge of the length of a segment before the first byte can be
sent over the wire. So, for a 5 second segment you would need to encode the
entire 5 seconds before the first byte can be sent; this does not apply when
streaming the segments with chunked transfer encoding.
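To make the mechanism concrete, here is a minimal sketch (not code from Periscope or Mux) of the HTTP/1.1 chunked transfer-encoding framing from RFC 7230: each chunk is its length in hex, CRLF, the bytes, CRLF, and a zero-length chunk terminates the stream. A server using this framing can push segment bytes to the client as the encoder emits them, instead of waiting for the full segment.

```python
# Sketch of HTTP/1.1 chunked transfer-encoding framing (RFC 7230,
# section 4.1). The data values here are illustrative placeholders.

def encode_chunk(data: bytes) -> bytes:
    """Frame one chunk: hex size, CRLF, payload, CRLF."""
    return b"%x\r\n" % len(data) + data + b"\r\n"

def stream_segment(chunks):
    """Yield wire bytes for a sequence of encoder outputs as they arrive."""
    for data in chunks:
        if data:
            yield encode_chunk(data)
    yield b"0\r\n\r\n"  # zero-length chunk terminates the response body

wire = b"".join(stream_segment([b"moof", b"mdat...."]))
# wire == b"4\r\nmoof\r\n8\r\nmdat....\r\n0\r\n\r\n"
```

The key point is that no `Content-Length` header is needed, so the first bytes can go on the wire before the segment's total size is known.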

Incidentally, at Mux we also use chunked transfer-encoding to stream video
that is encoded on-the-fly with great performance.

~~~
reggieband
I've heard from colleagues that this won't be possible with DASH due to the
switch to fMP4 format. One of my co-workers tells me that fMP4 requires the
entire segment to be loaded before playback can begin while TS segments don't
require this. We've been looking into very small segments (e.g. 1s duration)
to reduce latency but I've been interested in the LHLS approach since I first
heard of it.

~~~
urlgrey
Very short segment durations are effective only when latency is more important
than quality.

Each TS segment must start with a key-frame, and the GOP size can't exceed the
duration of a segment (e.g. one second). Lowering the segment duration
increases the frequency of key-frames, which has the effect of lowering the
quality you can achieve at a given bitrate.
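A back-of-envelope calculation shows why: with one key-frame per segment at a fixed bitrate, shrinking the segment spends a growing share of the bits on key-frames. The 10:1 I-frame-to-P-frame size ratio below is an illustrative assumption, not a figure from this thread.

```python
# Back-of-envelope: share of a fixed bitrate consumed by key-frames as
# segment (and hence maximum GOP) duration shrinks. The 10:1 I/P size
# ratio is an assumed, illustrative figure.

def keyframe_share(segment_s: float, fps: float = 30.0,
                   i_to_p_ratio: float = 10.0) -> float:
    """Fraction of total bits spent on the key-frame, one per segment."""
    frames_per_segment = segment_s * fps
    p_frames = frames_per_segment - 1     # everything after the key-frame
    return i_to_p_ratio / (i_to_p_ratio + p_frames)

for dur in (5.0, 2.0, 1.0):
    print(f"{dur}s segments: {keyframe_share(dur):.0%} of bitrate on key-frames")
```

Under these assumptions the key-frame share roughly quadruples going from 5-second to 1-second segments, which is bandwidth no longer available for improving the quality of the remaining frames.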

~~~
RBO2
Note that this is an Apple requirement for HLS. Most people don't realize that
the GOP size doesn't impact latency, but it does impact start-up time.

------
kazinator
Low delay is much, much more important for _calls_ than for streaming. One
second of buffering delay may be acceptable in streaming playback (users often
contend with longer delays). That much delay will severely degrade a video
call, especially if the audio stays synced with the delayed video.

~~~
MiniMax42
Also, for real-life auctions where online bidders can participate.

~~~
kazinator
There is justification in regarding that as a form of video call rather than
media playback, even though the video may be only in one direction and the
reverse communication (flow of bids) isn't AV.

------
fenesiistvan
Instead of a separate protocol you can already use a codec with built-in FEC
such as OPUS.

~~~
nh2
Which video codecs have that inbuilt?

~~~
TD-Linux
None, so a separate FEC can still be useful. You can use FEC as described in
the article with WebRTC: [https://tools.ietf.org/html/draft-ietf-rtcweb-fec-08](https://tools.ietf.org/html/draft-ietf-rtcweb-fec-08)

------
jakobegger
Anybody have an idea how to put this into practice? I recently tried streaming
video over wifi from a Raspberry Pi (for a robotics hobby project), and
everything I tried was either unusable or very delayed.

Is there an open source low latency video streaming solution for hobbyists?

~~~
mbrumlow
I use ffmpeg and some custom services for
[http://robot247.io](http://robot247.io)

All the code is on GitHub. I can get very low latency video. Most of the delay
comes in the form of the speed of light being so slow.

On the website I use jsmpeg.

~~~
m3adow
Can you link the repo please? I couldn't find a link in the site and would be
interested.

~~~
mbrumlow
Here is the core of the site. I have not had much time to keep robots online,
but if you want a demo you can ping me at my username at gmail and I can put a
robot online (during work hours, if I am in the office).

I have been working on a better web interface that has on-screen controls,
because most people seem to want to type commands like in Twitch chat, but the
arrow keys are what you actually use.

[https://github.com/mbrumlow/webbot](https://github.com/mbrumlow/webbot)

------
coldsauce
Anyone know what the best streaming solution for browser <-> browser video
calling is? It probably has to be built on top of WebRTC but I'm wondering if
there are codecs and forward error correction algorithms out there already in
JavaScript to use.

~~~
kwindla
For small numbers of people in a call (say, n < 5), the "best" (meaning
lowest latency) solution is direct browser-to-browser WebRTC connections.
Both Chrome's and Firefox's WebRTC implementations have quite good FEC built
in. And sending UDP packets directly between peers will have much lower
latency than routing through a media server.

Of course, sometimes peer-to-peer won't work for you. Maybe you have
requirements that push you towards routing media through a server. (Content
filtering, or compositing video or mixing audio, for example.) Or maybe you
have more than a few people in a call. If so, upstream bandwidth and encoding
become bottlenecks for mesh/peer-to-peer. Finally, some firewalls won't allow
UDP traffic from/to computers behind them, so you'll need to route UDP through
a central server, or (much worse) tunnel over TCP.
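The upstream-bandwidth bottleneck mentioned above is simple arithmetic: in a mesh call each peer uploads one stream per other participant, while routing through an SFU keeps upstream constant. A quick sketch (the 800 kbps per-stream figure is an assumed, illustrative bitrate):

```python
# Why mesh calls stop scaling: per-peer upstream grows linearly with
# call size. The 800 kbps per-stream bitrate is an assumed figure.

def mesh_upstream_kbps(n_peers: int, stream_kbps: int = 800) -> int:
    """Upload bandwidth one peer needs in a full-mesh call."""
    return (n_peers - 1) * stream_kbps

for n in (2, 5, 10):
    print(f"{n} peers: {mesh_upstream_kbps(n)} kbps upstream per peer")
```

At 10 peers that is over 7 Mbps of sustained upload per participant, which is why media servers take over beyond small calls.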

Back on the subject of latency and error correction in WebRTC, here are some
fun links:

Draft spec for FEC in WebRTC: [https://tools.ietf.org/html/draft-ietf-rtcweb-fec-08](https://tools.ietf.org/html/draft-ietf-rtcweb-fec-08)

Mozilla article from when they first turned on Opus FEC. Includes sample audio
for calls with 19% packet loss. (19% packet loss is very, very bad. My startup
makes a browser-to-browser video calling tool, and we try hard to deal well
with packet loss that high, but it's a losing battle.)
[https://blog.mozilla.org/webrtc/audio-fec-experiments/](https://blog.mozilla.org/webrtc/audio-fec-experiments/)

Notes from the very knowledgeable folks at Callstats.io about WebRTC FEC.
Covers some of the same material as this thread's original post:
[https://www.callstats.io/2016/11/09/how-to-recover-lost-media-packets-in-webrtc-with-fec/](https://www.callstats.io/2016/11/09/how-to-recover-lost-media-packets-in-webrtc-with-fec/)

Tsahi Levent-Levi's benchmarks showing how a few different media servers
perform in the context of 10% packet loss: [https://testrtc.com/webrtc-media-server-packet-loss/](https://testrtc.com/webrtc-media-server-packet-loss/)

------
thefourthchime
The title is misleading: by low-delay they really mean over UDP instead of
TCP. Their answer to video over UDP is a marginal efficiency gain from a
forward error correction algorithm called RLNC.

I haven't looked at RLNC, but it does seem to have some gains over more
traditional FEC schemes.

[https://arxiv.org/abs/1802.04873](https://arxiv.org/abs/1802.04873)

~~~
mortenvp
As stated in the article, low delay is very hard to achieve over a reliable
transport based on retransmissions (such as TCP): you have to wait for the
loss to be detected and then for the retransmitted packet to arrive, which is
at least 3x the latency of the link. Therefore, if you really care about
bounded low latency, you need to use some form of erasure-correcting
algorithm (to provide upfront redundancy).
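To illustrate what "upfront redundancy" means, here is the simplest possible erasure code (a single XOR parity packet per block), not the RLNC scheme the article benchmarks: the receiver can repair any one lost packet in the block without waiting a round-trip for a retransmission.

```python
# Minimal erasure-coding sketch: one XOR parity packet per block of k
# equal-length source packets repairs any single loss with no
# retransmission. This is the simplest possible scheme, shown only to
# illustrate upfront redundancy -- it is NOT the article's RLNC code.

def xor_parity(packets: list[bytes]) -> bytes:
    """Bytewise XOR of equal-length packets."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)

def recover(received: dict[int, bytes], parity: bytes, k: int) -> dict[int, bytes]:
    """Rebuild the single missing packet (if any) from the parity."""
    missing = [i for i in range(k) if i not in received]
    if len(missing) == 1:
        received[missing[0]] = xor_parity(list(received.values()) + [parity])
    return received

block = [b"pkt0", b"pkt1", b"pkt2", b"pkt3"]
p = xor_parity(block)
got = recover({0: b"pkt0", 1: b"pkt1", 3: b"pkt3"}, p, k=4)  # packet 2 lost
assert got[2] == b"pkt2"
```

The cost is the bandwidth of the parity packet whether or not a loss occurs; the benefit is a repair latency bounded by one block, not by a retransmission round-trip.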

In the article we simply show that substituting one "old" algorithm for a more
modern one can give you much better efficiency (protection against packet
loss) for the same bandwidth and latency/delay budget.

~~~
thefourthchime
Please help me understand, then. I see that RLNC compared to 2022 Single mode
has the same overhead of 25%. When I compare the two using your tool at the
end of the article, the only change I see is an improvement from 5% random
loss to 25%. The burst loss stays the same. Correct?

~~~
mortenvp
Yes, you get much better random-loss protection with RLNC. So if you know
that all your losses are bursts, you may be able to live with 2022.
Essentially, the difference between the two algorithms is that with 2022 only
a subset of your packets is protected by each redundancy packet (and you have
to choose how to protect them), whereas with RLNC you can protect all
available packets. If we leave the premise of this post (namely that we want
to generate traffic in the same pattern as 2022), RLNC can even protect
against longer bursts compared to 2022.

Did that make sense?
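The distinction above can be sketched in code. In RLNC, each coded packet is a random linear combination of all k source packets (the coefficient vector travels with the payload), so any repair packet can help recover any loss; the receiver decodes once it has k linearly independent packets. This toy version (not Steinwurf's implementation) works over GF(2), where coding is plain XOR; real deployments typically use a larger field such as GF(2^8) to make random coefficient vectors independent with higher probability.

```python
# Toy RLNC over GF(2): every repair packet is the XOR of a random
# subset of ALL k source packets, unlike 2022-style FEC where each
# parity covers a fixed subset. Decoding is Gaussian elimination.
import random

def encode(sources: list[bytes], rng: random.Random) -> tuple[list[int], bytes]:
    """One coded packet: random 0/1 coefficient vector plus XOR payload."""
    k, n = len(sources), len(sources[0])
    coeffs = [rng.randint(0, 1) for _ in range(k)]
    if not any(coeffs):
        coeffs[rng.randrange(k)] = 1          # avoid useless all-zero vector
    payload = bytearray(n)
    for c, pkt in zip(coeffs, sources):
        if c:
            for i, b in enumerate(pkt):
                payload[i] ^= b
    return coeffs, bytes(payload)

def decode(coded: list[tuple[list[int], bytes]], k: int) -> list[bytes]:
    """Gaussian elimination over GF(2); needs k independent packets."""
    rows = [(list(c), bytearray(p)) for c, p in coded]
    for col in range(k):
        pivot = next(r for r in range(col, len(rows)) if rows[r][0][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][0][col]:
                for j in range(k):
                    rows[r][0][j] ^= rows[col][0][j]
                for j in range(len(rows[r][1])):
                    rows[r][1][j] ^= rows[col][1][j]
    return [bytes(rows[i][1]) for i in range(k)]

# Demo: source packet 1 was lost in transit, but two mixed packets
# that cover it (coefficient vectors shown) recover all three sources.
sources = [b"pkt0", b"pkt1", b"pkt2"]
coded = [
    ([1, 0, 0], b"pkt0"),              # source 0 arrived intact
    ([0, 1, 1], b"\x00\x00\x00\x03"),  # pkt1 XOR pkt2
    ([1, 1, 0], b"\x00\x00\x00\x01"),  # pkt0 XOR pkt1
]
assert decode(coded, 3) == sources
```

Because the coefficient vectors are drawn over all k packets rather than a fixed subset, which particular packets were lost matters much less, matching the random-loss advantage discussed above.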

------
rjeli
Anyone know what PSNow/cloud gaming services use? I'm using it to play some
PS3 games and the delay is immediately noticeable when you start playing, but
your brain adjusts and it stops being noticeable. It has to be around
300-400 ms round trip.

