Hacker News new | past | comments | ask | show | jobs | submit login

From https://en.wikipedia.org/wiki/Bufferbloat

_Some communications equipment manufacturers designed unnecessarily large buffers into some of their network products. In such equipment, bufferbloat occurs when a network link becomes congested, causing packets to become queued for long periods in these oversized buffers. In a first-in first-out queuing system, overly large buffers result in longer queues and higher latency, and do not improve network throughput._

I hope I get this right, please correct if needed: So basically Intels chipsets were creating what looked like a fat network pipe that accepted packets from the host OS really fast but in fact was just a big buffer with a garden hose connecting it to the network. The result is your applications can write these fast bursts, misjudge transmission timing causing timing problems in media streams like an ip call leading to choppy audio and delay. The packets flow in fast, quickly backup and the ip stack along with your application now have to wait (edit: I believe the proper thing to say is the packets should be dropped but the big buffer just holds them keeping them "alive in the mind" of the ip stack. The proper thing to do is reject them and not hoard them?). The buffer empties erratically as the network bandwidth varies and might not ask for more packets until n packets have been transmitted. Then the process repeats as the ip stack rams a load more into the buffer and again, log jam.

A small buffer fills fast and will allow the software to "track" the sporadic bandwidth availability of crowded wireless networks. At that point the transmission rate becomes more even and predictable leading to accurate timing. That's important for judging the bitrate needed for that particular connection so packets arrive at the destination fast enough.

Bottom line is don't fool upstream connections into thinking that your able to transmit data faster than you actually can.

It's also a problem because a few protocols in you may use from time to time (like, say, TCP) rely on packet drops to discover and detect network throughput. TCP's basic logic is to push more and more traffic until packets start to drop, and then back off until they stop dropping. And then it keeps on doing this in a continuous cycle so that it can effectively detect changes in available throughput. If the feedback is delayed, then this detection strategy results in wild swings in the amount of traffic the TCP stack tries to push through, usually with little relation to the actual network throughput.

Buffering is layer 4's job. Do it on layer 2[a] and the whole stack gets wonky.

[a] Except on very small scales in certain pathological(ly unreliable) cases like Wi-Fi and cellular.

Is there no way to limit how much of the buffer is used via some config?

Usually not. There may be an undocumented switch somewhere in the firmware that a good driver could tweak? Depending on the exact hardware. But end-user termination boxes, whether delivered by the ISP or purchased by the end-user, are built for as cheap as possible and ship with whatever under-the-hood software configuration the manufacturer thought was a good idea. Margins are just too narrow to pay good engineers to do the testing and digging to fix performance issues. (Used to work at a company that sold high-margin enterprise edge equipment, and even there we were hard-strapped to get the buggy drivers and firmware working in even-slightly-non-standard configurations. Though 802.11 was most of the problem there.)

And in the case of telco equipment, that's an tradition-minded and misguided conscious policy decision.

Your analysis is correct.

Smaller buffers are in general better. However advanced AQM algorithms and fair queueing make for an even better network experience. Being one of the authors of fq_codel (RFC8290), it has generally been my hope to see that widely deployed. It is, actually - it's essentially the default qdisc in Linux now, and it is widely used in quite a few QoS (SQM) systems today. The hard part nowadays (since it's now in almost every home router) is convincing people (and ISPs) to do the right measurement and turn it on.


Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact