Could you provide a link for that blog post? I am not seeing it.
And just for reference, AWS does provide enhanced networking capabilities on VPC:
(Though I've been optimising my tiny low-traffic site on a solar-powered RPi 2 to apparently outperform a major CDN, at least on POHTTP...)
I'm basically measuring time to first byte (TTFB) (and indeed visually complete etc.) as a reasonable correlate of perceived performance, for a simple site served over HTTP from my RPi vs. via Cloudflare, fully cached, over HTTP and HTTPS. The most direct comparison is of a separate fossil site, but other measuring tools show that (generally) my RPi over HTTP beats or matches the CDN over HTTP, which in turn beats the CDN over HTTPS (with HTTP/2 etc.), for a UK visitor.
Tests are run mainly from WebpageTest and StatusCake, both using their UK test points in various data centers. These sites of mine are UK-focussed, not global, so I believe the UK test points are representative. They should also favour the CDN, since it terminates connections closer to the client, and on faster links, than I can from my kitchen cupboard!
Though still, the CDN folks have faster machines not running in power-saving mode, and lower latency into the core UK Internet peering points / LINX, and my RPi is doing other things too, such as running a substantial Java-based server.
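For anyone wanting to reproduce this kind of TTFB comparison from a single vantage point, here is a minimal sketch. It is not what WebpageTest or StatusCake do internally; `measure_ttfb` and `median_ttfb` are made-up helper names, it covers plain HTTP only, and the timer includes the TCP connect.

```python
import http.client
import time


def measure_ttfb(host, port=80, path="/", timeout=10.0):
    """Return seconds from starting the request (including TCP connect)
    until the status line and headers have arrived: an approximation of TTFB."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    start = time.perf_counter()
    try:
        conn.request("GET", path)      # connects lazily, then sends the request
        response = conn.getresponse()  # returns once status line and headers are parsed
        elapsed = time.perf_counter() - start
        response.read()                # drain the body so the connection closes cleanly
    finally:
        conn.close()
    return elapsed


def median_ttfb(host, port=80, path="/", runs=5):
    """Median of several runs, since single samples are noisy."""
    samples = sorted(measure_ttfb(host, port, path) for _ in range(runs))
    return samples[len(samples) // 2]
```

Running `median_ttfb` against the origin and against the CDN hostname from the same machine gives a like-for-like number, though a single client location says little about what other visitors see.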
One fabless SoC maker called Baikal claims that they can saturate a 10 Gb/s link with an 8-core ARM chip running under 5 W (PHY power not counted), as long as users rely on userspace-driven networking and hardware offloading.
I observe, though, that if you are tuning a system to this level of detail you likely have a number of web servers behind a load balancer. To be complete, the discussion should include optimization of interactions with the load balancer, e.g. where to terminate https, etc.
Not necessarily. There are other ways of spreading traffic, such as DNS round robin, the DNS delegation trick, or even client-side load balancing (which is perfectly feasible if you control the client, as with your own app).
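A minimal sketch of the client-side variant, assuming the client can resolve every address behind a name and rotate through them itself. `resolve_all` and `RoundRobinBalancer` are hypothetical names for illustration, not part of any library, and real clients would add health checks and retries.

```python
import itertools
import socket


def resolve_all(hostname, port=443):
    """Return the distinct IP addresses the resolver hands back for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in seen:
            seen.append(ip)
    return seen


class RoundRobinBalancer:
    """Rotate through backends in order, skipping any marked as down."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("need at least one backend")
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)
        self._down = set()

    def mark_down(self, backend):
        self._down.add(backend)

    def mark_up(self, backend):
        self._down.discard(backend)

    def next_backend(self):
        # Try each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate not in self._down:
                return candidate
        raise RuntimeError("all backends are marked down")
```

A client would build the balancer from `resolve_all("example.com")` (a placeholder name) and call `mark_down` when a connection attempt fails.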
The article itself mentions that what they're talking about is the Dropbox Edge network, an nginx proxy tier, which sounds like load balancers to me.
What is "the DNS delegation trick"?
I'm curious - is there a case where you want to terminate HTTPS on the end device instead of (only) on the load balancer?
Let's say I want to prepare a server to respond quickly to HTTP requests, from all over the world.
How do I optimize where I put it?
Generally there are three ways I can tackle this:
1. I configure/install my own server somewhere
2. I rent a preconfigured dedicated server that I can only do so much with
3. I rent Xen/KVM on a hopefully not-overcrowded/oversold host
Obviously the 1st is the most expensive (I must own my own hardware; failures mean a trip to the DC or smart hands), the 2nd will remove some flexibility, and the 3rd will impose the most restrictions but be the cheapest.
For reference, knowing how to pick a good network (#1) would be interesting to learn about. I've always been curious about that, although I don't exactly have anything to rack right now. Are there any physical locations in the world that will offer the lowest latency to the highest number of users? Do some providers have connections to better backbones? Etc.
#2 is not impossible - https://cc.delimiter.com/cart/dedicated-servers/&step=0 currently lists an HP SL170s with dual L5360s, 24GB, 2TB and 20TB bandwidth @ 1Gbit for $50/mo. It's cool to know this kind of thing exists. But I don't know how good Delimiter's network(s) is/are (this is in Atlanta FWIW).
#3 is what I'm the most interested in at this point, although this option does present the biggest challenge. Overselling is a tricky proposition.
Hosting seems to be typically sold on the basis of how fast `dd` finishes (which is an atrocious and utterly wrong benchmark - most tests dd /dev/zero to a disk file, which will go through the disk cache). Not many people seem to set up a tuned Web server and then run ab or httperf on it from a remote machine with known-excellent networking. That's incredibly sad!
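The disk-cache effect is easy to see with a rough sketch that times the same write with and without waiting for the data to reach the device. `timed_write` and `compare` are illustrative names, and the 64 MiB default is an arbitrary choice.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, sync, block_size=1024 * 1024):
    """Write total_bytes of zeros to path and return elapsed seconds.

    With sync=False the timer stops once the kernel has accepted the data
    (often just into the page cache). With sync=True it waits for fsync().
    """
    block = bytes(block_size)
    remaining = total_bytes
    start = time.perf_counter()
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = block[:min(block_size, remaining)]
            f.write(chunk)
            remaining -= len(chunk)
        if sync:
            f.flush()
            os.fsync(f.fileno())
    return time.perf_counter() - start


def compare(total_bytes=64 * 1024 * 1024, directory=None):
    """Time a buffered write and an fsync'd write of the same size."""
    results = {}
    for label, sync in (("buffered", False), ("fsync", True)):
        fd, path = tempfile.mkstemp(dir=directory)
        os.close(fd)
        try:
            results[label] = timed_write(path, total_bytes, sync)
        finally:
            os.remove(path)
    return results
```

On a host with plenty of free RAM the buffered figure mostly measures memory speed. With GNU dd, `conv=fdatasync` or `oflag=direct` are the usual ways to avoid that, though neither says anything about network or request latency.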
Handling gaming or voice traffic is probably a good idea for the target I'd like to be able to hit - I don't want to do precisely that, but if my server's latency is good enough to handle that I'd be very happy.
For what it's worth, I know that dd'ing /dev/zero makes df show a smaller value. AFAIK, df (for ext4 in my case) isn't reading a logical abstraction of capacity as arbitrarily decided upon by a bunch of layers, but is straightforwardly reporting the free blocks on disk.
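The counters in question can be inspected directly. This is a small sketch assuming a Unix-like system, where `os.statvfs` exposes the filesystem's free-block counts; `free_space` is a made-up helper name, and how closely a given `df` build maps onto these fields is an assumption here.

```python
import os


def free_space(path="/"):
    """Return the filesystem's free-block counters for the mount holding path."""
    st = os.statvfs(path)
    return {
        "block_size": st.f_frsize,         # fragment size, the unit for the counts below
        "free_blocks": st.f_bfree,         # free blocks, including root-reserved ones
        "available_blocks": st.f_bavail,   # free blocks usable by unprivileged users
        "free_bytes": st.f_bfree * st.f_frsize,
        "available_bytes": st.f_bavail * st.f_frsize,
    }
```

Calling it before and after writing a large file should show the block counts dropping, regardless of whether the written data was all zeros.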
Following the link will show "Open Positions" with, well, nothing to follow. They optimized not only their servers for throughput but also HR!
So many bells & whistles and I don't even know where to begin.
If someone can point me to a thorough article like this on the lua module, I will thank her/him forever.
When I've previously tuned a server I have used both of those to my advantage... Another comment on here talked about this ignoring an existing load balancer so maybe those sysctls are more appropriate on an LB?
For downloads, high RTT can be mitigated by a congestion control algorithm that ignores constant packet-loss rates (which are common on high-RTT paths). Other tricks that you can try: fq+pacing, and newer kernels with more sophisticated recovery heuristics.
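To see why constant loss hurts so much on long paths, here is a back-of-the-envelope sketch using the well-known Mathis et al. approximation for Reno-style, loss-based congestion control: throughput is at most (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2). The function name and the example numbers are illustrative assumptions; CUBIC, BBR and friends behave differently, which is the point of switching.

```python
import math


def loss_limited_throughput_bps(mss_bytes, rtt_seconds, loss_rate):
    """Approximate ceiling in bits/s for a Reno-style loss-based TCP flow.

    Mathis et al. approximation: (MSS / RTT) * (C / sqrt(p)), C = sqrt(3/2).
    """
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_seconds) * (c / math.sqrt(loss_rate))


if __name__ == "__main__":
    # Same 0.01% loss rate, 1460-byte MSS, two different round-trip times.
    for rtt_ms in (20, 200):
        bps = loss_limited_throughput_bps(1460, rtt_ms / 1000, 0.0001)
        print(f"RTT {rtt_ms} ms: about {bps / 1e6:.1f} Mbit/s")
```

With the loss rate held constant, ten times the RTT means one tenth of the ceiling, so a congestion control that does not back off on every lost packet has the most to gain on exactly these paths.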
> and I’m not even talking about IW10 (which is so 2010)
I wonder if this is good advice. I would have said the opposite: do not mess around with any of that stuff unless there's a security advisory or a problem points to a specific piece of hardware. It's not like updating this stuff is without risk.
Same goes for kernel, libc, all other libraries, and pretty much anything else. This is not an argument for not upgrading them though.
Also, sadly, almost all firmware updates fix a "problem [that] points to a specific piece of hardware" that you use.
If you have a budget to responsibly keep all that stuff up to date in your project and test to make sure you haven't broken anything, more power to you. It's a total waste most of the time, but why not? Especially if you're a consultant and getting paid by the hour.
Firmware is different. It's not always possible to back out a change and you risk bricking hardware every time you apply an update, especially in the realm of PC-based servers. If something like a NIC or motherboard is performing as expected, updating the firmware for no reason is generally a stupid thing to do.
Edit: oh, you're the author. Please stop telling people to do this, or at least explain the risk. At a minimum, if the vendor provides a defect list with each update, it's not necessary to blindly apply updates. Take the changes if they're needed. It's possible you're accustomed to working on high-grade hardware that exhibits fewer of these firmware related blowups, but that doesn't mean it does not happen to people...
Does this imply that Dropbox has started testing out EPYC metal?
It would be nice if someone made a Docker image with all the tuning set (except the hardware).
It would have been nicer if the author had shown what the end result of this optimization looks like, with numbers, comparing against a standard run-of-the-mill nginx setup.
Have you not read the article?
>In this post we’ll be discussing lots of ways to tune web servers and proxies. Please do not cargo-cult them. For the sake of the scientific method, apply them one-by-one, measure their effect, and decide whether they are indeed useful in your environment.
This degree of optimization is something that really depends on a specific use case and the precise configuration of the entire hardware+software stack, and not a general-purpose best practices list that can be put into a container.
The author says as much at the top in the disclaimer:
> Please do not cargo-cult them. For the sake of the scientific method, apply them one-by-one, measure their effect, and decide whether they are indeed useful in your environment.
Better to take away the profiling and optimization methodologies from the article rather than specific settings.
> In this post we’ll be discussing lots of ways to tune web servers and proxies. Please do not cargo-cult them. For the sake of the scientific method, apply them one-by-one, measure their effect, and decide whether they are indeed useful in your environment.
Far too often I see people apply ideas from posts they've read or talks they've seen without stopping to think whether or not it makes sense in the context they're applying it. Always think about context, and measure to make sure it actually works!
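"Measure" can be as simple as collecting latency samples before and after a single change and comparing the median and the tail. The sketch below assumes the samples are already gathered by whatever load tool is in use; `percentile` and `compare_runs` are illustrative helpers, and nearest-rank is just one of several percentile definitions.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def compare_runs(baseline, candidate):
    """Summarize median and p99 for two sets of latency samples."""
    metrics = (
        ("median", statistics.median),
        ("p99", lambda s: percentile(s, 99)),
    )
    summary = {}
    for name, fn in metrics:
        before, after = fn(baseline), fn(candidate)
        summary[name] = {
            "before": before,
            "after": after,
            "change_pct": (after - before) / before * 100,
        }
    return summary
```

Changing one setting per run keeps the comparison meaningful. If the median improves while p99 gets worse, that is worth knowing before rolling the change out.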
At Netflix, we can serve over 90Gb/s of 100% TLS encrypted traffic using a single-socket E5-2697A and Mellanox (or Chelsio) 100GbE NICs using software crypto (Intel ISA-L). This is distributed across tens of thousands of connections, and all is done in-kernel, using our "ssl sendfile". E.g., no DPDK, no crypto accelerators.
I'm working on a tech blog about the changes we've needed to make to the FreeBSD kernel to get this kind of performance (and working on getting some of them in shape to upstream).
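For readers unfamiliar with the sendfile approach, here is a minimal sketch of the plain, unencrypted version from userspace. It does not reproduce the in-kernel TLS work described above; it only shows the idea of handing a file to the kernel for transmission rather than copying it through an application buffer. `send_file_zero_copy` and `demo` are made-up names.

```python
import os
import socket
import tempfile


def send_file_zero_copy(sock, path):
    """Send the file at path over a connected stream socket.

    socket.sendfile() uses os.sendfile() where the platform supports it,
    so the payload is not copied through a userspace buffer, and falls
    back to plain send() otherwise. Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo(payload=b"x" * 4096):
    """Round-trip a small payload through a local socket pair."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        sender, receiver = socket.socketpair()
        with sender, receiver:
            sent = send_file_zero_copy(sender, path)
            sender.shutdown(socket.SHUT_WR)  # signal EOF to the reader
            received = bytearray()
            while True:
                chunk = receiver.recv(65536)
                if not chunk:
                    break
                received.extend(chunk)
        return sent, bytes(received)
    finally:
        os.remove(path)
```

With ordinary userspace TLS the data has to pass through the application to be encrypted, which is why doing the crypto in the kernel matters for this design.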
But is this traffic ongoing connections, new connections, a mix? They have different penalties, and result in different numbers: 90Gbps of ongoing connections might be, like, 100,000hps, but 90Gbps of new connections during primetime might only net you 50,000hps. And are you using Google's UDP TLS stuff?
Google also hacked on the kernel a lot to improve their performance, I don't know if any of that's upstream currently though. Maybe Cloudflare can answer you, as they seem to support the most HTTPS wizardry of the big CDNs.
The traffic is mostly long-ish lived connections. Eg, the duration of a TV show or movie. So there is some churn, but not a lot.
This is all TCP. By "UDP TLS", I assume you mean QUIC?
So the library feed can serve tens of thousands of streams, an aggregate 100Gbps, on a single node. And then... how many nodes to support the front-end UI operations to get to that point?
It's funny how amazingly efficient we can be moving encrypted bits, but to support the APIs for login, browsing titles, updating account info, and setting up a stream; I'm going to guess ~100 of those nodes for every one of your stream-tanks?
I believe the Open Connect team at Netflix chose FreeBSD because a lot of them used to work at Yahoo and had lots of FreeBSD experience, not so much because of a performance difference between the two. As of now, the two network stacks are pretty equal when it comes to performance: some workloads are better on FreeBSD, some are better on Linux.
I'll edit this post as I find the other articles, videos and benchmarks about this subject.
Edit: I don't really care for Phoronix benchmarks, but here are some showing Linux winning some and FreeBSD winning others.
How about you post the benchmarks you used to form your answer? Otherwise your post is nothing but speculation and probably just favoritism.
1) give a bcc/perf example to check the need for tuning and verify its effect.
2) give code/docs/paper reference as an embedded link.
3) give a generic monitoring guideline at the start of a "chapter".
Seems like I've (at least partially) failed. I'll do better next time.