What amazes me is the lack of concern about latency in web streaming.
We use the internet (and IP in general) to stream video. At high bitrates (200 Mbit/s+) we aim for sub-100 ms end to end; for compressed services we're happy with 500 ms, maybe up to a second if it's something like Sydney to London over the internet.
I was in a control room a couple of weeks ago watching some football. There were two displays: one showing the feed from the stadium, the other showing the feed from the web streaming service.
There were cheers and then groans from the live end of the room. Nearly a minute later, on the web-stream display, someone started running up the field to score. Of course I knew by that point that it wouldn't be a goal: not only had the people watching the live feed told me, but Twitter was abuzz.
One minute of end-to-end delivery latency is shocking for this type of program. Heck, 10 seconds is bad enough.
There are two very different things that get called "latency" here:
1. network latency, in milliseconds, which affects stream quality and stability;
2. the delay (lag) between real-time capture and what the end user sees; this one is usually measured in seconds. A stream needs to be ingested, transcoded, and sent from distribution servers to edge servers in each target region - with each step adding to the delay.
Minimizing the lag is very hard, because stripping out all the buffers (to reduce the delay) makes the stream very sensitive to network conditions (which hurts quality). With most commercial CDN providers you will get 5-10 seconds. It can be reduced to 2-3 seconds if you know what you're doing.
edit: in case anyone is interested - in the second scenario, where we achieved a 2-3 second broadcast lag vs real-time, the stream source (ingestion) was in the US and the viewers were in mainland China. Network latency was over 600 ms. Wasn't easy!
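To make the buffer trade-off concrete, here's a back-of-the-envelope latency budget (a sketch in Python; all numbers are illustrative assumptions, not measurements from our setup):

    # Rough glass-to-glass budget for segmented HLS delivery.
    SEGMENT_SECONDS = 6.0          # assumed segment duration
    PLAYER_BUFFER_SEGMENTS = 3     # players commonly buffer ~3 segments
    ENCODE_SECONDS = 1.0           # capture + transcode pipeline (assumed)
    PROPAGATION_SECONDS = 0.5      # origin -> edge -> client transfers (assumed)

    # A segment is only announced once complete, and the player holds a
    # multi-segment buffer, so segment duration dominates the budget:
    lag = ENCODE_SECONDS + PROPAGATION_SECONDS + SEGMENT_SECONDS * PLAYER_BUFFER_SEGMENTS
    print(f"~{lag:.0f}s glass-to-glass")   # ~20s with these numbers

    # Dropping to 2s segments and a 1-segment buffer lands in the 2-3s range
    # mentioned above - and makes every network hiccup visible.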
60 seconds in the case of iPlayer for the FA Cup. 1m20s in the case of the BBC News channel right now. HLS tends to be packetised into something like 15-second chunks at the end of the process.
I know why there's a delay; I'm just amazed that people aren't concerned about it. The BBC used to offer multicast sources of live TV, which is a far more sensible solution: far more bandwidth efficient, and it allows end-to-end latency in the satellite range (or even less).
Wowza did a talk at Demuxed last year about how to do "3 second latency end to end at scale", which I found amusing given that TV people have been doing sub-second latency at scale for nearly 100 years. So at least some people in the industry recognize the problem (which mainly matters for sports events).
Hey, I was also at Demuxed last year! I help maintain Hls.js; it should have some form of LHLS support soon(tm). There's no standard yet, so broad adoption will be tough (current implementations use non-standard EXT-X tags to signal LHLS). But pretty much everyone does the same thing: early signaling of segments, chunked transfers, and, on the client side, progressive demuxing. HTTP-based solutions typically achieve the aforementioned 3 seconds, but the vendors shipping LHLS now (Akamai and Wowza) each have their own protocol; it remains to be seen what the rest of us will get.
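For anyone wondering what "early signaling + chunked transfers + progressive demuxing" looks like from the client side, here's a minimal Python sketch (the URL is hypothetical and feed_to_demuxer is a placeholder; a real player would push the bytes into an MSE SourceBuffer via something like Hls.js):

    import requests  # third-party: pip install requests

    def feed_to_demuxer(data: bytes) -> None:
        # Placeholder for progressive demux/decode of the bytes so far.
        print(f"got {len(data)} bytes")

    # Hypothetical URL of a segment the origin is still writing.
    url = "https://cdn.example.com/live/segment_1234.ts"

    # With chunked transfer coding the server can respond before the segment is
    # complete; stream=True makes iter_content yield bytes as they arrive, so
    # demuxing starts without waiting for the whole segment.
    with requests.get(url, stream=True, timeout=10) as resp:
        resp.raise_for_status()
        for chunk in resp.iter_content(chunk_size=188 * 64):  # TS packets are 188 bytes
            feed_to_demuxer(chunk)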
Twitch has recently implemented LHLS (it looks like a "Periscope-style" implementation) and I was seeing 1.2s glass-to-glass.
I hate that when I stream a game for my friends, they don't see what I did until 10-15s later. I'm trying to have real-time discussions with them on our voice chat server, and any input they give me is inherently dated. Mumble manages extremely low latency, so this sort of thing isn't impossible over the internet; the bandwidth requirements of video probably make it a bit more iffy, but it should still be doable, with the occasional glitch.
Maybe I just need to get away from the public streaming services that use HLS and switch to UDP streams with sub-1s buffer sizes.
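The receiving end of that idea fits in a few lines; a minimal sketch, assuming something (ffmpeg, say) is pushing raw MPEG-TS over UDP to an address I've made up:

    import socket
    from collections import deque

    LISTEN_ADDR = ("0.0.0.0", 5000)   # illustrative; match the sender's target
    DATAGRAM_BYTES = 7 * 188          # common convention: 7 TS packets per datagram
    BUFFER_DATAGRAMS = 200            # a fraction of a second at a few Mbit/s

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN_ADDR)

    # Bounding the playout buffer bounds the latency: old datagrams fall off
    # the front instead of accumulating into an ever-growing delay.
    playout = deque(maxlen=BUFFER_DATAGRAMS)
    while True:
        data, _ = sock.recvfrom(DATAGRAM_BYTES)
        playout.append(data)
        # A real player would drain `playout` at the stream's bitrate and hand
        # the TS packets to a decoder.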
Honestly, I think people are concerned, but everyone's experience to date with big streaming events has been that something invariably goes wrong: people can't connect, login issues, the event won't start, buffering, etc. (e.g. the Mayweather-McGregor PPV). If you can get a high-quality stream at all, perhaps the time shift is secondary!
I usually only deal with the distribution side as an end user, and personally I tend to watch about two live events a year (I'd watch New Year's, but that's clearly pointless, so that leaves Eurovision and maybe an election program), so I don't have much experience with that side.
It did amuse me when we were looking at latency for a program coming over a ropey bit of connectivity that we were using ARQ on. We were discussing whether we could push the latency up from 2 seconds to 6 seconds (the link kept dropping out for 2 or 3 seconds at a time), since it was sport. Then we realised there was a good 30-40 seconds of latency downstream before it even left for the CDN!
I still don't understand half of what Streampunk [1] are trying to do with their NMOS grain workflows, but they are talking about sub-frame HTTP units.
This is not an approach that supports line-synced timing and may not be appropriate for live sports action that requires extremely low latency.
However, for many current SDI workflows that can tolerate a small delay, this approach is sufficient.
With UHD you're talking about 20 MBytes for a single uncompressed frame, so each "grain" (a subdivision of a frame) is on the order of a millisecond, or a megabyte.
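The arithmetic behind those numbers, taking the figures at face value (10-bit 4:2:2 at 50fps is my assumption):

    pixels = 3840 * 2160              # UHD: 8,294,400 pixels per frame
    bits_per_pixel = 20               # 10-bit 4:2:2: 10 luma + 10 shared chroma
    frame_bytes = pixels * bits_per_pixel / 8
    print(frame_bytes / 1e6)          # ~20.7 MB per uncompressed frame

    # At 50fps a frame lasts 20ms, so a grain of ~1/20th of a frame is
    # roughly one millisecond of video and roughly one megabyte of data:
    print(frame_bytes / 20 / 1e6)     # ~1.0 MB per ~1ms grain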
I think I prefer this approach to the SMPTE 2110 approach, to be honest, especially given the timing windows that 2110 requires (it doesn't lend itself well to a COTS virtualised environment when your packets have to be emitted at a specific microsecond).
In the US we've seen a half-dozen over-the-top TV providers launch in the past couple of years. They all seem to understand that live sports is their core feature, yet none of them tries to compete on latency. I regularly see two or more minutes pass between reading tweets about a goal or touchdown (from broadcast or cable viewers) and seeing that TD live on Sling / PSVue / DirecTV NOW / YouTube TV (I've tried them all). Second-screening live sports is impossible with the OTT apps, and that's very likely to drive me back to Comcast.
There's a real opportunity for a sports-oriented OTT company to compete on latency, a DVR that actually works, and expansive rights (I never have to guess if I have access to any sporting event).
For games like American football, there are only about 11 minutes of actual action, but it takes 3+ hours to complete the game[0]. That leaves a lot of time to watch a second screen.
Second screening even in things like drama is very popular.
If you're watching a sports game, I can see the appeal of a second stream (perhaps curated) with easy-to-access stats on that game, or a different angle from the one the director thinks you want, or whatever.
I don't see the appeal of a second stream in drama, but in things like sport, yes.
I stream a lot of sports from home. I would gladly trade delays for stream stability/quality. I frequently have to switch from a legal stream that I pay for to a more robust illegal stream.
It would be annoying if there were multiple devices nearby on different delays, but for me in my living room, I don't care if it's 30 seconds or 3 minutes. I've had results spoiled by Twitter feeds a few times, but it's not the end of the world.
"Live" TV broadcasts are also on a delay, so I guess it's all relative. I had a friend who lived near enough to an NFL stadium that you could hear when something big happened and the TV delay made it impossible to enjoy a home game there.
Very minor delay in the UK -- well under 2 seconds from pitch to TV on DTT in the case of the FA Cup. Sure, enough to cause some grief when you can hear the crowd, but not enough to get notifications on Twitter or whatever first.
Interesting factoid: some press have agreed to intentional latency as a security feature around certain events and dignitaries. The theory being that a sniper or drone operator cannot use the "live TV" footage to know their exact position.
In some cases international viewers may see the "live" footage before local ones.
> The theory being that a sniper or drone operator cannot use the "live TV" footage to know their exact position.
How is this helpful in either of these cases? A sniper needs eyes on, a drone operator probably has live video from the drone. I don't see how a TV delay would have any effect.
For over 20 years now, the German TV channel "RTL" has been about 15 seconds behind all the other channels on live events like Formula 1. But I have no idea why that is.
The BBC is doing UHD, with HDR (HLG), for the World Cup, with the top stream at 36 Mbit/s [0].
There are a limited number of "spaces" available [1] -- I think it's up to 100 Gbit/s of output.
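Taking those figures at face value, the ceiling is easy to estimate (naive arithmetic; the real allocation is presumably more subtle, with most viewers on lower renditions):

    output_capacity_bps = 100e9    # ~100 Gbit/s of output, per the comment above
    top_stream_bps = 36e6          # 36 Mbit/s top UHD rendition
    print(int(output_capacity_bps / top_stream_bps))  # ~2777 concurrent top-rate viewers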
Unlike the FA Cup (where the BBC ran a UHD trial), the World Cup will have a lot of games during the week, when people will be watching from the office (although probably not in UHD). This will mean far higher load on the distribution.
Fortunately, England's three games are all either at the weekend or at 7PM. The second half of the tournament will really stress the UK internet though, with both the World Cup and Wimbledon on during the working week.
Some time ago I started building P2P live adaptive (DASH) video streaming over WebRTC, with a distributed rate-control mechanism, and I'm planning to open-source it. With it you could build your own P2P globally-distributed live adaptive video streaming CDN (or run it on a single server). Adding a new supporting server (for extra bandwidth) is as easy as spawning a VM and launching a binary. It has a distributed signaling server with geo/region-based peer distribution, full real-time statistics on the health of the whole network, and analytics; the network automatically adapts to bandwidth shortages (if for some reason it can't sustain itself) by switching to lower-bandwidth versions of the stream. It's very easy to use: you set up a single config file, launch two binaries, add one JS file to your site's source, and you're ready to go.
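The "switch to lower-bandwidth versions when the network can't sustain itself" part can be sketched in a few lines (the rendition ladder and headroom factor here are invented for illustration):

    # Hypothetical DASH rendition ladder, highest first (bits/sec).
    LADDER = [6_000_000, 3_000_000, 1_500_000, 800_000]
    HEADROOM = 1.2  # require 20% more measured throughput than the rendition needs

    def pick_rendition(measured_bps: float) -> int:
        """Pick the highest rendition the measured throughput can sustain."""
        for bitrate in LADDER:
            if measured_bps >= bitrate * HEADROOM:
                return bitrate
        return LADDER[-1]  # floor: always serve something

    assert pick_rendition(10_000_000) == 6_000_000
    assert pick_rendition(2_000_000) == 1_500_000
    assert pick_rendition(100_000) == 800_000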
I've paid for 4-5 different streaming services over the last few years trying to see matches. They inevitably don't work, which is the only reason I use streams I find online.
If some random dude with pr0n popups can make it work but Fox/NBC/ESPN/etc can't, tough shit.
Hey, I work at Peer5 (another Peer5 employee wrote this article). We typically measure video user experience in several ways:
- The amount of rebuffering the user experiences (basically, the less the user sees the loading wheel, the better).
- The bitrate the user gets (is the user seeing the video at the highest possible quality?).
- Whether or not there are any media errors.
- The amount of time the video takes to load after a seek.
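The first two are straightforward to compute from player events; a minimal sketch (the event log format is invented for illustration):

    # Toy playback log: (state, seconds spent in it).
    events = [("playing", 120.0), ("rebuffering", 2.5),
              ("playing", 300.0), ("rebuffering", 1.0)]

    played = sum(t for state, t in events if state == "playing")
    stalled = sum(t for state, t in events if state == "rebuffering")

    # Rebuffering ratio: share of the session spent on the loading wheel.
    print(f"rebuffer ratio: {stalled / (played + stalled):.2%}")  # 0.83%

    # Average delivered bitrate is weighted the same way, per rendition played.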
As a user you don't really have control over these things - it's up to the broadcaster to set up a good service.
We've found (unsurprisingly) that services that are either paid or run by national broadcasters offer a better user experience than ones that are free and easily found online, so my recommendation would be to spend a few dollars and get a solid provider.
(Also, the bandwidth you get right now doesn't mean much when the network is very congested - so it's worth checking how fast your connection is during big peaks, and considering a different ISP or a streaming provider that utilizes P2P.)
The P2P approach is interesting. Do you have data on how much bandwidth is saved at peak usage?
It might be especially interesting when many users share the same connection, effectively achieving broadcasting - the CDN pushes the data to one client, and it broadcasts it to the local network.
I watched the UCL final via YouTube this year, and the Bet365 web app kept telling me about goals half a minute before they happened on my screen. To be honest, if you're isolated, the delay doesn't really matter.
It still doesn't scale efficiently. HTTP-based streaming is the only reliable way to get economies of scale. The CPU cost per WebRTC socket is high compared to a cache hit for a static resource. Not to mention that WebRTC is far more CPU-intensive on the client side than HLS/DASH, which benefit from kernel/hardware offloading.
WebRTC is stateful, whereas DASH/HLS are HTTP-based and stateless - caching with HTTP is easy, and CDNs have a ton of infrastructure on the internet to support it.
Caching with WebRTC is very hard - even for a single second - since every connection is stateful.
Software for scaling up live-streaming CDN points of presence (POPs) is a pretty crazy domain. For on-demand video, you can think of a CDN as a cache, serving known-ahead-of-time chunks. But what about live streaming? It's not feasible to stream frame-by-frame directly from your encoding backend to all the viewers of the World Cup over something like RTMP - you'd want to use a CDN. So typically, you distribute meaty (multi-second) HLS segments as individual video files, or collections of files, to your CDN; once available, they then need to be requested by browsers/mobile clients as whole segments, over HTTP(S). This works well with existing CDN infrastructure (provided they can handle the write volume and have big enough inbound pipes)... but the huge issue is that the segment length plus round-trips is a lower bound on effective latency. And when interactivity is required, multi-second delays can be horrible.
> In HLS live streaming, for instance, the succession of media frames arriving from the broadcaster is normally aggregated into TS segments that are each a few seconds long. Only when a segment is complete can a URL for the segment be added to a live media playlist. The latency issue is that by the time a segment is completed, the first frame in the segment is as old as the segment duration... By using chunked transfer coding, on the other hand, the client can request the yet-to-be completed segment and begin receiving the segment’s frames as soon as the server receives them from the broadcaster.
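To make that concrete, here's a toy Python origin that flushes a "segment" chunk by chunk as it's produced (the payload and timing are fake; a real origin would write TS packets as the encoder emits them):

    import time
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class ChunkedSegmentHandler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # chunked transfer coding needs HTTP/1.1

        def do_GET(self):
            self.send_response(200)
            self.send_header("Content-Type", "video/mp2t")
            self.send_header("Transfer-Encoding", "chunked")
            self.end_headers()
            for i in range(10):  # stand-in for frames arriving from the encoder
                payload = f"frame {i}\n".encode()
                self.wfile.write(f"{len(payload):X}\r\n".encode() + payload + b"\r\n")
                self.wfile.flush()  # the client sees this chunk immediately
                time.sleep(0.2)
            self.wfile.write(b"0\r\n\r\n")  # terminating zero-length chunk

    HTTPServer(("127.0.0.1", 8080), ChunkedSegmentHandler).serve_forever()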
> This Grand Challenge is to call for signal-processing/machine-learning algorithms that can effectively estimate download bandwidth based on the noisy samples of chunked-based download throughput.
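The baseline for that estimation problem is a simple smoothing filter over the noisy per-chunk samples; a sketch with invented numbers (the Grand Challenge is precisely about beating this, e.g. by weighting samples by chunk size or using learned models):

    def ewma(samples_bps, alpha=0.2):
        """Exponentially weighted moving average: alpha near 0 smooths heavily,
        alpha near 1 trusts the latest sample."""
        estimate = samples_bps[0]
        for s in samples_bps[1:]:
            estimate = alpha * s + (1 - alpha) * estimate
        return estimate

    # Chunked downloads are bursty: small chunks often report absurd rates.
    noisy = [4e6, 25e6, 3e6, 5e6, 30e6, 4e6]   # invented bits/sec samples
    print(f"{ewma(noisy) / 1e6:.1f} Mbit/s")    # ~9.9 with these numbers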
(IMO) If you're thinking that this is all rather silly, and that live video streaming is not something that should be done over HTTP in the first place... there are a lot of reasons why this is the case. All the CDN POPs are optimized for HTTP GET requests rather than stateful sessions, and Apple's smiting of Flash removed a lot of incentive for innovation on RTMP servers. The ironic thing is that Internet connectivity is fast/reliable enough nowadays that RTMP might have been able to escape its association with "buffering" spinners, and would provide a much lower-latency experience. Hopefully there's better standardization in the future as live video becomes more mainstream.