
I was in the IETF meeting. It was actually a very civil discussion, and I think almost everyone in the room could see both sides of the issue. On the one hand, we've seen increasing problems with middleboxes making assumptions about end-to-end protocols, which then makes it very hard to deploy new end-to-end functionality. The problem is real; we measured this effect in this paper:


Indeed, one of the changes the IETF group has made since taking on QUIC is to also encrypt the QUIC sequence numbers, so middleboxes can't play silly games by observing them (they were already integrity protected).

However, the flip side is that network operators do use observations of network round trip time gained from passive observation of traffic so as to discover if traffic is seeing excessive queuing somewhere. An inability to do this with QUIC may either lead to worse network behaviour, or to them using other methods to gain insight. If you're in a location where you can observe both directions of a flow, you can easily do this by, for example, delaying a bunch of packets by 200ms, then letting them go. When you see a burst of (encrypted) ack packets return, you can deduce the RTT from your observation point to the destination. I'd really like to avoid operators thinking they need to do such hacks just to measure RTT, and the spin bit lets a passive observer see the RTT.
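To make that concrete, here is a rough sketch of how a passive observer could use the spin bit instead of such delay-injection hacks. The client inverts the bit it last saw from the server, so viewed in one direction the bit flips roughly once per round trip, and an on-path observer just times the flips. (The class and the simulated flow below are hypothetical, purely for illustration.)

```python
class SpinBitObserver:
    """Hypothetical passive RTT estimator watching one direction of a
    QUIC flow: the time between spin-bit transitions approximates the
    end-to-end round-trip time."""

    def __init__(self):
        self.last_spin = None        # last spin-bit value seen
        self.last_flip_time = None   # timestamp of the last transition
        self.rtt_samples = []

    def observe(self, spin_bit, timestamp):
        if self.last_spin is not None and spin_bit != self.last_spin:
            if self.last_flip_time is not None:
                # One full spin-bit period ~ one round trip.
                self.rtt_samples.append(timestamp - self.last_flip_time)
            self.last_flip_time = timestamp
        self.last_spin = spin_bit

# Simulate a 50 ms RTT flow: five packets per round trip, and the
# spin bit inverts at the start of each new round trip.
obs = SpinBitObserver()
t = 0.0
for rtt_round in range(10):
    for _ in range(5):
        obs.observe(rtt_round % 2, t)
        t += 0.010
print(obs.rtt_samples)   # samples of ~0.05 s, i.e. the 50 ms RTT
```

No packet payload, sequence number, or decryption is needed; this is exactly the sort of low-cost measurement the spin bit is meant to enable.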

In the end, I was not convinced entirely by either argument, and neither was the consensus in the room. It's not a clear-cut decision; there are reasonable arguments either way. Such is engineering.

You may have done this already, but I encourage you to copy/paste this as a comment on the article at apnic.net so it has more visibility. The opening line about "It was actually a very civil discussion, and I think almost everyone in the room could see both sides on the issue." needs to be shared.

The thing I don't understand is why these network operators care about RTT outside their network. Shouldn't they only care about getting packets through their own network as quickly as possible? You don't need the spin bit for that.

The global optimum might not be the combination of locally optimal routes. The shortest route through a network segment to a peer may end up being globally slower than a locally slower route to another peer which can handle the traffic faster.

That's fine. If global optimization turns out to be significant, we can do it at the edge with overlay networks. No need for network operators to be involved.

> we can do it at the edge with overlay networks

Is this really easy to do, with negligible overhead, and no need to seriously reconfigure what you already have?

Realistically, are network operators going to add artificial delay to packets going through their network? If they don't get their bit, and they start slowing down communication, it would seem like market correction would solve it.

Assume there are a minimum of three network operators involved in every flow: source, destination, and transport. It is highly likely that either the source or destination operator will have the opportunity and motive to do said inspection/detection.

Sarcastically: who will even notice a 200ms hiccup, nowadays, given the enormous waste of time that everybody shoves into their Javascript ad networks.

Personally, I don't want them to expose anything. The RTT on my packets is yet another piece of metadata capable of being abused.

In addition, I hope that the whole QUIC thing is a nice reboot of actually getting end-to-end connectivity back on the Internet so we can get some protocol experimentation moving again.

That's 200ms that could be used loading more tracking scripts.

Sounds like AMP. Improve performance to fit more bloat instead of not adding superfluous elements in the first place. What else would you expect from an advertising company?

Aren't network operators already adding artificial delay? I pay for X Mbps download speed, and my ISP artificially limits my download speed to that, even though their network is perfectly capable of providing me with more.

No. Bandwidth limits and delay aren't the same thing.

CPE (the WiFi router or similar in your home) often adds delay because a few megabytes of RAM is cheap and the people who built it don't understand what they're doing. This is called "buffer bloat". But inside the network core this rarely comes up.

You need buffers to maintain high bandwidth. There is no way around that. The problem is lack of AQM in end user routers - where a single high bandwidth flow can fill up and hog those buffers, disrupting latency sensitive flows. We're slowly seeing some adoption of things like fq_codel, but it's not perfect since the user has to go and manually enter their upload/download speeds (which again are not easy to determine, especially for ISPs with "boost" limiting). Ideally home routers would dynamically adjust based on observed latencies.

Codel is knobless, there literally aren't parameters you _can_ set let alone ones you need to set according to your "upload/download speeds". So, I have no idea what you're tinkering with that you think needs to know "upload/download speeds" or why, but it's nothing to do with Codel.
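For reference, a simplified sketch of CoDel's drop decision (per RFC 8289). The only constants are a 5 ms target sojourn time and a 100 ms interval, and neither is a user-facing knob; the algorithm keys off per-packet queue delay, not a configured bandwidth. (This is a hypothetical simplification, not the real fq_codel code.)

```python
from math import sqrt

TARGET = 0.005     # 5 ms target sojourn time: a constant, not a user knob
INTERVAL = 0.100   # 100 ms initial interval: also a constant

class CoDelSketch:
    """Simplified sketch of CoDel's drop decision. Real implementations
    track more state; the point is that the inputs are per-packet
    sojourn times, not configured link speeds."""

    def __init__(self):
        self.first_above = 0.0   # when sojourn first exceeded TARGET
        self.drop_next = 0.0     # scheduled time of the next drop
        self.count = 0
        self.dropping = False

    def should_drop(self, sojourn, now):
        if sojourn < TARGET:
            # Queue delay is acceptable: leave the dropping state.
            self.first_above = 0.0
            self.dropping = False
            return False
        if not self.dropping:
            if self.first_above == 0.0:
                # Delay just crossed TARGET: give it one INTERVAL to clear.
                self.first_above = now + INTERVAL
            elif now >= self.first_above:
                # Persistently above TARGET: start dropping.
                self.dropping = True
                self.count = 1
                self.drop_next = now + INTERVAL / sqrt(self.count)
                return True
            return False
        if now >= self.drop_next:
            # Still above TARGET: drop again, sooner each time
            # (the INTERVAL / sqrt(count) control law).
            self.count += 1
            self.drop_next = now + INTERVAL / sqrt(self.count)
            return True
        return False

codel = CoDelSketch()
# A standing queue: a packet arrives every 10 ms, each 20 ms over TARGET.
decisions = [codel.should_drop(0.020, i * 0.01) for i in range(30)]
print(decisions.count(True))   # drops begin once the delay persists ~100 ms
```

The upload/download figures that home routers ask for configure the rate limiter/shaper sitting in front of the AQM, not CoDel itself.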

fq_codel alone doesn't solve buffer bloat. There are still huge buffers upstream. You need to combine it with local rate limiting so you can control the buffer in your local home router. fq_codel/cake work by looking at the time a packet has spent in a local queue. If you don't rate limit locally, then your local queue is always empty and everything queues upstream. Modern routers with things like "Dynamic QoS" that use fq_codel all require providing downstream/upstream values. This is far simpler than traditional QoS, but it's still a barrier to widespread adoption.

Isn't the only way to implement a bandwidth limit to delay or drop packets during specific periods?

The only _sensible_ thing you can do if the transmitter won't stop is to ignore them, and thus drop packets, yes.

Queueing them up instead makes some artificial benchmark numbers look good but is a horrible end user experience, so you should never do this, but lots of crap home WiFi type gear does.

So, as I said, bandwidth limits and delay are different. The canonical "station wagon full of tapes" is illustrative, it has _tremendous_ bandwidth but _enormous_ delays. In contrast a mid-century POTS telephone call from London to Glasgow has almost no delay (barely worse than the speed of light) but bandwidth is tightly constrained.
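Back-of-the-envelope, with made-up but plausible numbers (tape capacity, tape count, drive time, and cable length are all assumptions for illustration):

```python
# The station wagon: huge bandwidth, enormous delay.
tape_bytes = 200e9            # assume 200 GB per tape cartridge
tapes = 1000                  # assume the wagon carries 1000 tapes
drive_time_s = 10 * 3600      # assume a 10-hour drive

wagon_bps = tapes * tape_bytes * 8 / drive_time_s
print(f"wagon: {wagon_bps / 1e9:.0f} Gbit/s, delay: {drive_time_s / 3600:.0f} h")

# The POTS call: tiny bandwidth, almost no delay.
# Assume ~550 km London-Glasgow at roughly 2/3 the speed of light in copper.
pots_delay_s = 550e3 / 2e8
print(f"POTS: ~3 kHz voice bandwidth, delay: {pots_delay_s * 1e3:.1f} ms")
```

Tens of gigabits per second versus hours, against kilohertz versus milliseconds: the two metrics are completely independent.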

I would far rather my udp packet is delayed by a millisecond than dropped.

I have an RTP stream running from India to Europe at the moment, 3000 packets per second, so 0.33ms between packets. Typical maximum interpacket delay is under 1.5ms. Looking at the last 400,000 seconds of logs, 200k are 1-2ms, 170k are 0-1ms, and about 4-5k each in the 2-3ms, 3-4ms, etc. buckets. In less than 1% of cases does the interpacket delay exceed 10ms.

Then you should have forward/backward error correction in RTP so you can completely ignore the packet drop/delay.

Standard SMPTE FEC doesn't allow more than 20 columns of error correction; at 30mbit that's about 7ms of drop protection at the best of times (assuming the required FEC packets aren't lost as well). At low bitrates it works better, and the majority of our international vision circuits rely on single streams with FEC. We had one ISP in Ukraine with an intermittent fault where, regardless of the bitrate, they would occasionally drop 170ms of traffic.
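Rough arithmetic behind that protection window, assuming ~1500-byte media packets (the packet size is my assumption, not from the spec):

```python
# Longest recoverable burst for column-only FEC ~ one packet per column.
bitrate_bps = 30e6               # the 30 Mbit stream mentioned above
packet_bits = 1500 * 8           # assumed ~1500-byte media packets
fec_columns = 20                 # SMPTE 2022-1 caps the column count

packet_time_s = packet_bits / bitrate_bps      # 0.4 ms per packet
burst_window_s = fec_columns * packet_time_s   # longest recoverable burst
print(f"{burst_window_s * 1e3:.0f} ms")        # ~8 ms, in line with the
                                               # "about 7 ms" figure above
```

A 170ms outage is more than twenty times that window, which is why column FEC alone can't save you from faults like that.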

I currently have a difficult provision that for a variety of reasons I can't use ARQ on. To keep the service working I have 4 streams going, over two routes, with timeshifting on the streams to cope with route-change outages that tend to sit in the 20ms range. FEC is meaningless at these bitrates.

RTP is fine for delay and re-orders, but it doesn't cope with drops. I was at a manufacturer's earlier this week and said that I've experienced dual-streaming skew of over 250ms (we had one circuit presumably reroute via the US), and I laugh at their 150ms buffer. Dual streaming can still fail when you have both streams running on the same submarine cable, though. Trust me, intercontinental low-latency interoperable broadcast IP on a budget isn't trivial.

If you are running over UDP, that is so that packets can be dropped while preserving the overall "stream". That is actually the point of the design of RTP over UDP.

RTP runs over UDP

That's exactly my point.

You realise that streams can't cope if packets are lost. How badly they are affected depends on how many packets are lost and which packets they are; in some cases a single lost packet can cause actual on-air glitches.

Surely you realize the difference between an A/V stream as humans perceive it and what RTP actually does?

You would rather all your UDP packets are delayed by a millisecond rather than have one dropped? Why?

Because that UDP packet is important, it can lead to millions of people watching a TV program having a 3 second outage in their audio.

The need to protect against this type of problem is the reason Forward Error Correction is included in almost all the codecs.

Moreover, since your specific use case is not interactive conferencing but IPTV, there would be no problem in increasing the FEC ratio even further, at the cost of a small decoding latency.

Standard SMPTE 2022-1 FEC does not cope with real world network problems, that's why we have things like 2022-7, but even then that struggles in the real world.

And yes, this use case is interactive conferencing, where we aim to keep round trip delay from Europe to Australia down below 1 second.

Does anyone know how much buffer bloat or delay is added in the router, CPE, or modem? I assume it's less than 1ms?

I have been trying to figure this out for a while, but the information isn't really available.

Bufferbloat is potentially almost unlimited. Assuming the people who built it are idiots (a safe assumption for most consumer gear), it basically just depends on how much they were willing to spend on RAM.

For example, let's say we can move 10Mbps, and we've decided to use 10 megabytes of buffers to make our new WiFi router super-duper fast. Do a big download and the buffer fills with ten megabytes of data; that's eight whole seconds of transmission, so now the latency of packets is eight seconds, 8000 times larger than your "I assume less than 1ms".
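The arithmetic is just buffer size divided by drain rate:

```python
# Worst-case queuing delay once a buffer fills = buffer size / drain rate.
link_rate_bps = 10e6        # the 10 Mbps link from the example above
buffer_bytes = 10e6         # 10 MB of buffer

delay_s = buffer_bytes * 8 / link_rate_bps
print(f"{delay_s:.0f} s")   # 8 s of latency under a sustained download
```

That's why "more buffer" reads as "faster" on a throughput benchmark while being catastrophic for latency.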

{Edited to correct numbers}

Back around 2006, I had a RAZR that I figured out how to use to tether my computer onto Verizon’s 1xRTT service. I’m not sure which part of the system was to blame, but something had massive buffers and a strong aversion to dropping packets. If the connection got saturated I could easily see ping times of two minutes or more.

To deal with the lossy RF environment, there are re-transmits at the PHY layer, effectively creating a TCP-on-top-of-TCP situation.

Why would a single big download fill the buffers? Router → LAN is typically an order of magnitude more bandwidth than internet → router, so shouldn't the buffer be emptied faster than filled?

The bottleneck is at your ISP's CMTS or DSLAM and your modem. e.g. the DSLAM has 1 Gbps in and only 40 Mbps down the line to your VDSL modem. Or your cable modem has access to 600 Mbps of capacity but your plan is only 100 Mbps, so the modem limits it. So there's a quick stepdown: 1 Gbps, 600 Mbps, 100 Mbps.

Yeah, the stepdown is on the ISP side, how would it affect buffers of my consumer router?

For downloads it's buffers in your ISP's hardware that matter. For uploads it's your router's egress buffer.

e.g. You are syncing gigabytes to Dropbox. A poorly designed router will continue to accept packets far past upstream capacity. Now that there's 2000 ms of bulk traffic in the router's queue, any real-time traffic has to wait a minimum of 2 seconds before getting out.

Yeah, exactly. I mean, in your original comment you talked about bufferbloat on crappy consumer hardware and the example was a big download :)

Depends on the router, and the load. If you have a 100mbit uplink and 4x1 gig ports on the downlink side, you could easily have ~3mbit of packets arrive in 3ms (70 packets per millisecond), which would take 30ms to send on the uplink. You can either:

1) Drop -- despite total traffic being only 10mbit a second on average, or

2) Queue -- introducing a delay of up to 30ms.

In reality you'd put latency-critical applications (VoIP etc.) at the top of the queue so those packets get transmitted without the delay, and your Facebook packets get delayed by 31ms rather than 30ms.

As much as 2000 ms when saturating the downlink with a long file transfer. Web browsing stops working and nothing loads in as every little resource request takes two seconds.

I've no idea what techniques will actually be applied by operators - I never expected middleboxes to go to the extremes they currently do either. But the technique I mentioned, of briefly delaying a short burst of packets, if done only occasionally, isn't really going to be noticed during a large transfer. I'm not claiming it's a good idea, though!

Large transfers wouldn't be where the effect would be noticeable; it would be JSON payloads that fit in a few packets. If operators did this seldom enough not to be noticed, why do they need the bit at all? It feels like a blank (albeit single-digit) check.

Yes. Market correction isn't applicable in most US markets, which have either a monopoly or a duopoly for broadband service.

To me, excessive queueing seems indicative of network operators making assumptions about traffic and their solutions ultimately lead to worse outcomes, buffer bloat etc. Isn't QUIC meant to prevent the network from making these assumptions, forcing a more agnostic network, and letting QUIC manage congestion, latency, etc?

I think the point is that network operators want to use QUIC to measure network segments and make routing decisions. They are not necessarily trying to improve QUIC performance.
