I haven't been hugged by HN yet, but I've survived being featured on a couple other front pages long ago.
It's easy if it's a lightweight page and you have something faster than a 486. If it's heavy (lots of images, videos, downloads, etc) you can run out of bandwidth pretty quickly. However, that doesn't necessarily sink the ship!
Traffic shaping makes a big difference. Without it, the modem buffer quickly fills, causing multi-second ping times. This easily leads to congestion collapse. TCP's congestion window grows to large values, and the congestion avoidance algorithm doesn't do well when the feedback is delayed by all that latency. TCP starts sawtoothing, and you end up with occasional moments where a few pages get served, interspersed with so much loss that connections get dropped.
We used htb [1], which tightly controls the outbound traffic rate, preferring to drop packets instead of delaying them. That prevents the modem's buffer from filling up, which keeps the latency down. This creates a smoother flow. Pages load slowly, but at least they keep loading, instead of wasting bandwidth on retries. It also prioritizes traffic: other services are guaranteed their minimum rates. We were using ssh through the same line and didn't even realize we were getting hit until someone asked why the site was running slowly.
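In case it's useful, here's roughly what that kind of htb setup looks like with tc. This is only a sketch: the interface name, the rates, and the ssh-only filter are illustrative placeholders, not our exact config.

  # Shape egress on eth0 to just under the modem's uplink rate, so queuing
  # happens here (where we control it) instead of in the modem's buffer.
  tc qdisc add dev eth0 root handle 1: htb default 20

  # Parent class capped slightly below the real uplink (e.g. a 1.5 Mbit line).
  tc class add dev eth0 parent 1: classid 1:1 htb rate 1400kbit ceil 1400kbit

  # Interactive traffic (ssh) gets a guaranteed slice and higher priority...
  tc class add dev eth0 parent 1:1 classid 1:10 htb rate 200kbit ceil 1400kbit prio 1
  # ...and bulk web traffic gets the rest.
  tc class add dev eth0 parent 1:1 classid 1:20 htb rate 1200kbit ceil 1400kbit prio 2

  # Classify ssh into the priority class; everything else falls into 1:20.
  tc filter add dev eth0 parent 1: protocol ip u32 match ip sport 22 0xffff flowid 1:10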
Newer algorithms are even better, especially for keeping jitter down, but htb is simple and vastly better than letting your router blast packets at full speed. OpenWRT and all the other open source router firmwares can do it these days. It improves things even for normal end-user use, but if you're running a server at home, you really want to set up some QoS before someone hugs you to death.

[1] https://linux.die.net/man/8/tc-htb
I used tc, CAKE and ifb for a small ISP router-on-a-stick implementation (around 2000 end-users) and it did wonders with the limited bandwidth I had to work with (around 3 Gbps). Simpler to configure than HTB, with no more playing around with lots of knobs. Even during peak traffic, the latency would stay low.
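For anyone wanting to try it, the whole thing is only a handful of commands. A sketch along these lines (interface names and the bandwidth figure are placeholders, not my actual config):

  # Egress: replace the default qdisc with CAKE, told the real uplink rate.
  tc qdisc replace dev eth0 root cake bandwidth 2900mbit

  # Ingress: redirect incoming traffic through an ifb device so it can be shaped too.
  modprobe ifb
  ip link set ifb0 up
  tc qdisc add dev eth0 handle ffff: ingress
  tc filter add dev eth0 parent ffff: protocol all u32 match u32 0 0 \
    action mirred egress redirect dev ifb0
  tc qdisc add dev ifb0 root cake bandwidth 2900mbit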
Back when I was hosting servers in my home, I used an old Linux box (maybe a K6?) with several NICs. It didn't take much to keep up with DSL.
Since then, I've moved all my services to the cloud. My home router is a TP-Link Archer C7 v2, which I chose mainly because it was well-supported by OpenWRT.
The last few years, I've been using routers from GL.iNet [1] when setting things up for friends and family. The factory firmware is OpenWRT, lightly modded with a simplified UI in front. The full OpenWRT LuCI UI is still available behind an "Advanced" link.
I've been hugged twice on 100 mbit domestic broadband, and it's honestly not been all that bad.
In general I've seen somewhere between 1-5 visitors per second, a couple of million over a day. I think keeping the number of requests per visit down is a big key to surviving this traffic with low resources.
The first hug was search.marginalia.nu, which used to load fonts and had a payload of about 50 KB/visit, but still only generated maybe 4-5 GETs per visit. I measured about 2 Mbps.
The second time was search.marginalia.nu/browse/random, which loads 75 images at a resolution of 512x348 (mostly WebP, so anywhere between 2 and 30 KB each). These were randomly picked from a very large pool, so they can't be meaningfully cached. I presciently set them to lazy loading and I think that made a big difference, especially for mobile visitors, as they'd only load a few at a time. I didn't measure the bandwidth usage at this point, but the site stayed up and stayed responsive throughout the day.
I think something that possibly also makes a difference is that I don't do any sort of session cookies or mutable non-ephemeral state in my backend. I can only imagine having to validate sessions and manage access/roles for every request would make for a lot more work.
I don't think the bandwidth is necessarily going to be the bottleneck; more likely it's something like contended locks in the database, resulting in back pressure that in turn causes a traffic jam of lingering connections. Reducing mutable server state helps a lot with that.
Sometimes I wonder what kind of scaling startups are scared of. I keep reading about popular websites running on a single server, and how cheap bare metal is compared to an equivalent AWS setup.
I don't think most startups really understand how their systems scale, so they assume they'll scale like a social media platform (which does not scale well at all).
I attribute a lot of my performance to initially targeting very low powered hardware. I ran it on a raspberry pi cluster until the index was around 1 million documents (I think). Having those limitations early on excluded a lot of design mistakes that would have been a lot harder to correct later.
Serving search requests is an embarrassingly parallel operation that could very easily be load balanced across different instances. If, say, Max Schrems hectored every other search engine out of Europe, and Ursula von der Leyen, taking pity on my living room search operation, sent me a bunch of money, I'd probably be able to deal with 1000x the traffic about as fast as I'd be able to lease space in a datacenter.
I am a bit choked in terms of how big I can grow my index, though. I could probably double it, maybe even a couple of times, but not too many times unless I can find serious optimizations.
Startups don't always want the most cost-effective solution.
A lot of startups' intention is to look "serious". If everyone has an engineering playground full of microservices with 5 different programming languages, all hosted on AWS, you have to be the same if you want to join the "big boys" club.
Sometimes startups don't even care about solving the business problem. The objective is instead to get the "startup CEO" lifestyle for the connections and reputation regardless of whether you end up solving your business problem in the end. Same for a lot of engineers - they'll happily join and build the engineering playground while enjoying their salary regardless of whether it's the right option from an engineering point of view. If anything, an engineering playground gives you the opportunity to manage lots of people and solve (self-inflicted) problems that will look good on your resume in a way that a boring solution with a handful of engineers won't.
On the other hand, if you live and die by solving the business problem the most cost-effective way, all the "scalability" goes out the window and a simple stack on top of bare-metal makes much more sense, because the only thing that pays you is solving the business problem (in the cheapest way possible) as opposed to bragging about your self-inflicted technical complexity at AWS conferences.
There seem to be many people who simply fall for the marketing and think that kubernetes and hundreds of microservices are somehow necessary to run a small to medium sized web service. They can't comprehend how a technology stack with a 20 year track record could be even nearly as stable and reliable as a grab bag of 2-week-old JavaScript microframeworks. It's absolutely inconceivable how something that performed well a decade ago on hardware that was a hundred times slower could keep up on modern servers.
And unfortunately the effect flows both ways. I'm having a heck of a time finding a simple, modern log aggregator not designed for kubernetes and infinite scale.
I have 4 load balanced VMs and literally just need to pull the logs into a central location sorted by time. There's a shocking amount of complexity around all the popular tools in this space.
You can use logstash, without elasticsearch, without kibana. Example: use the logstash file output plugin and have it collect logs (from filebeat, syslog, or whatever you prefer) and dump them into a directory named /logs/<date>/service ... then just use grep or whatever.
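A rough sketch of such a pipeline, assuming filebeat as the shipper (the port, paths, and field names are examples and depend on what your shipper actually sends):

  # logstash pipeline (sketch): collect from filebeat, write to dated files,
  # no elasticsearch or kibana anywhere. Default file codec writes JSON lines.
  input {
    beats {
      port => 5044
    }
  }
  output {
    file {
      path => "/logs/%{+YYYY-MM-dd}/%{[host][name]}.log"
    }
  }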
I interviewed with a company that was like this. Their Engineering Manager took me through a systems design quiz. The problem was simple:
* 100k users updating their gas / electricity meter readings every second
My first question was what unit the meter reading was in; he said it was the current total (called a totalizer in the biz). I'd worked for a flow metering company before, so I said:
"There's no need to upload the total every second reduce the frequency and the problem becomes easier"
He didn't accept that, so I said OK, a few optimized servers behind an HAProxy load balancer would do the trick, as the processing is very simple here. The database is the harder part, as you can't take a simple approach there: 100k requests/second is going to cause contention for most DBs. You'd need to partition the database in a clever way to reduce locking/contention.
This answer was unacceptable; he ignored it and asked me to start building some insane processing system where you use Azure Functions behind an API gateway to push every request into a queue. Then you have a scheduled job in Azure (whose name has changed by now). This job looks at the queue in batches of, say, up to 1000 records at a time and writes them in bulk into the DB. Once it's finished, it immediately reinvokes itself to create a "job loop".
You would create multiple of these "job loops" that run in parallel. Then you would need to scale up these jobs to drain the queue at a rate that meant we could process 100k requests per second.
You would also need something to handle errors, for example requests that are broken somehow. These would go into a broken-request queue and could be retried by the system automatically a certain number of times; if they went over the retry limit, they would go into a dead letter queue to be looked at manually.
I'm pretty sure this was the actual process he would take in designing this type of system. Use every option the cloud gives him to make some overly complex thing which is probably quite unreliable. I also suspect that's the reason I wasn't hired as I was just looking at the thing sensibly rather than in the "cloud native" way.
He doesn't work for that company anymore and is now back to a normal (non-senior) software developer.
He was very upset when I questioned the requirement.
To give him the benefit of the doubt maybe it was an artificial question and the real question is how do you make a large scale "cloud native" system say something like Netflix / Facebook.
Of course the problem with those systems is that it's very difficult to explain every part of them in detail in an interview, so his example was just a small part of that system. Also, most companies do not require that level of scalability, but I concede those systems are fun to build, just not very fun to maintain/operate.
I have no doubt that the architecture I described was something he had personally built as part of what he was doing at the company. Me questioning that was probably taken as an attack.
I never got the job or even a single word of feedback.
In his hypothetical system with queues and tasks, I wonder how much of the resources would be wasted on management overhead (receiving tasks from the network, scheduling functions, the cold start of the Azure function, etc.) compared to a boring solution run from cron on a high-performance bare-metal machine costing a fraction as much?
Serving static HTTP GETs is easy. The moment you start trying to make money off the generated traffic, it's no longer a read-only affair. Your system is not stateless anymore and tracking costs spiral up quickly.
Sorry, I lost context and was not thinking of the search engine but of a statically generated blog with small pages and little media. Also, thanks for confirming my hunch about the bigger factor being statefulness rather than dynamicity.
CPU, RAM, and storage are incredibly cheap nowadays. But so much can be going on at a time that _certainty_ is now what's pricey.
> And engineering teams don't notice that their solutions are suboptimal until their product hits scale.
A lot of engineering teams scale prematurely. Maybe their complicated engineering playground will indeed scale to 100M users, at the expense of being extremely complex and costly from an infrastructure point of view, whereas a more "boring" stack may not scale to 100M but will handle up to 10M in a much simpler & cheaper way.
The more complex solution makes sense if you're sure you'll hit 100M, but megalomaniac founders' dreams aside, this rarely happens and they end up dealing with the unnecessary complexity & expense of a stack whose scalability isn't actually required.
Yup. Not even server hardware. If it wasn't for the unusual number of hard drives, 128 GB of RAM, $20 graphics card, and UPS, you could almost confuse it for a gaming rig.
I made a very pretty retro control panel for Kerbal Space Program a few years back, and made a nice writeup on my personal website about it, hosted from a low power Intel baytrail based server I keep in my garage. It got picked up by hackaday, then popular mechanics and a few other similar sites, and hacker news. I vaguely recall my upload rate was about 5 Mbps at the time, and my connection was quite saturated for a few days. The page has half a dozen jpg photos, so I suspect that was the bulk of the data.
People see servers with multi-Gbps connections and think it’s a requirement for high-traffic websites. Unless you’re serving videos, bandwidth is almost never the bottleneck.
High bandwidth protects you against basic DoS attacks that just fill up the pipe with garbage traffic.
The other problem is that bandwidth is usually harder to scale than let's say CPU or memory, so having a large buffer there is helpful.
Let's say the aforementioned VPS is suddenly under a lot of CPU load and legitimately can't keep up - Nginx can be trivially installed on it to load-balance between many downstream VPSes.
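For example, a minimal sketch of that kind of nginx fan-out (the upstream addresses are placeholders for the downstream VPSes):

  # nginx sketch: the small front VPS just spreads requests across backends.
  upstream app_backends {
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080;
  }

  server {
    listen 80;
    location / {
      proxy_pass http://app_backends;
    }
  }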
Now let's say you are running out of bandwidth. You may need to switch to a different connection with a different IP, change DNS records and wait for them to propagate, etc.
It's a good article, please don't get me wrong here. I was just a little confused after doing the math. I'd imagine a very image-heavy post might be more of a bandwidth problem, but yeah, definition of "website" and all.
My only nitpick (and sorry if I missed it, skimming) was the focus on total requests, which I find kinda useless; you need to look at per-minute resolution, or per-second for peaks.
This is what it boils down to: Serving web pages in 2022 is easy. A raspberry pi is seriously overpowered for the task, even for dynamic web pages. 15 years ago, well into the WWW revolution, it would be considered not only server-class performance, but serious server-class performance. (It wouldn't be server-grade hardware... missing redundancy, etc., although if they could have picked them up for ~$50 instead of ~$5000 they would have just bought more and dealt with it.) Massive websites have been run on far less.
You have to work really hard to take even a low-end laptop's worth of hardware and make a website that is slow to serve. You have to run Apache in a bad configuration that holds connections open too long. You have to make queries to a database that aren't correctly indexed so they're all table scans, and you have to make a lot of them. You have to serve huge amounts of crap to the user that is poorly optimized for their browser. You have to serve incorrectly-compressed images and videos. You have to design for a microservice architecture internally where everything's synchronous calls and some of the calls are overloaded and you don't notice, or there are message buses you're using that are backed up, or something like that.
You don't have to do all of those at once, but you need to do a few of them because modern hardware will still blast through serving if you just have a couple of those problems.
However, there are programmers, design teams, managers, and frameworks up to the task. Frameworks that enable query flurries. Programmers that don't understand the concept of query flurries and don't realize they just wrote a renderer for their home page that runs seven hundred queries just to load the username of the currently logged-in user and their most recent activity. Slow backend languages that are less "IO blocked" than people realize and really are slow, especially by the time you're done stacking 3 or 4 "nifty" language features on top of each other. Leadership that mandates breaking fast monoliths into slow microservices. Marketers demanding that you put the video they provided up, as is, with no modifications, or you're fired. Front-end frameworks that are only 'blazingly fast' if you use them carefully and correctly but make it very easy to end up using dozens of components that don't work together and end up with websites that take 3 seconds just for the browser to lay them out, despite the page itself being very simple visually in the end. It's hard work to slow a web page down that much, but it turns out that the work ethic is alive and well in this segment of the economy and there's a lot of people up for it!
If you're used to all this and it's just the way things are for you, and you're watching a 2022-spec server groan under the load of getting 3 pages a second, the idea that a single home computer and home connection could easily serve 100 or 1000 pages a second, or that a super-low-end cloud instance could easily serve 10,000/second, sounds like someone lying to you.
I'm in the same situation (50 Mbps uplink at one place, 100 Mbps at another) and that's enough to do all hosting for hobby projects at home, which I really love.
Instead of exposing my home directly using DynDNS, I got a really cheap low-end server (currently a VM with 1 CPU, 512 MB RAM and a 400 Mbps flat-rate connection) for 1€/month that proxies all traffic to the (hidden) servers hosted at home.
The spec is enough to reliably run an HAProxy instance that can max out the available bandwidth without breaking a sweat, and it allows me to do failover between servers in my two "home datacenters" + possibly cache assets.
I don't use it for home services (I don't have any), but I've used this setup in production for a couple of businesses for a decade.
Another advantage of this solution is that you can have ANY Internet connection, even 3G/4G/99G (and switch between your connections), and your clients would still have the same IP to connect to.
With a proper provider and configuration you can even host MS Exchange there.
No specific reason against Cloudflare, just a DIY attitude in this case.
HAProxy is fun, and I also run it as a TCP proxy, so HTTPS is terminated on my (hidden) home server and I don't need to trust my proxy server. I guess that's not possible with Cloudflare.
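The TCP pass-through part of an haproxy.cfg is only a few lines; something along these lines (the addresses are placeholders, not my actual config):

  # haproxy.cfg sketch: plain TCP pass-through, TLS terminates at home.
  defaults
    mode tcp
    timeout connect 5s
    timeout client  60s
    timeout server  60s

  frontend https_in
    bind :443
    default_backend home_servers

  backend home_servers
    balance roundrobin
    server home1 203.0.113.10:443 check
    server home2 198.51.100.20:443 check backup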
1&1/Ionos is one of the largest and best-connected ISPs and hosting providers in Germany, not some small shady shop. I see they doubled the price to 2€ though :)
I made a WordPress site for a friend through Ionos's managed offering. After a couple of months of the site being up, they decided to detach the database for literally no reason. I had to go through support to get it back up. I'm guessing it got erroneously flagged as over its quota, but they couldn't explain why it happened. Just my experience.
To be fair, whenever you pay 1-2€/month for something, you cannot really expect human/timely interaction. Paying a single support employee to look into your case for 10 minutes already erases all the revenue they've made from you over the past years.
Seems to be a common theme with all the "big" hosting providers. Especially if they offer cheap services. They have an abuse problem so they set up automation to deal with it and they don't care about collateral damage; you're just a number, and not worth their time. Actual proper support costs more than race to the bottom allows for. Same deal as with Google et al, one day you might just get fucked.
Turns out finding a good, reliable hosting provider who isn't a tyrant and gives you the benefit of the doubt & treats you like a human when something's up is really really hard. And probably not very cheap. It's kinda sad that one could host all kinds of interesting things on a $5 device like raspberry pi zero, but then you have to pay through the nose -- twice, and every month -- to get proper internet access.
Having correct cache headers helps as well, as a single visitor probably comes back to the page fairly often, or visits another page with shared assets. It's surprising how often some things that should be cached come back with headers that make it uncacheable by the browser. KeyCDN has a pretty comprehensive guide: https://www.keycdn.com/blog/http-cache-headers
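If it helps, here's a hedged nginx sketch of the kind of header I mean (the file-extension match and the one-year max-age are just examples; `immutable` is only safe if file names change when their content changes):

  # Long-lived caching for static assets with fingerprinted names.
  location ~* \.(css|js|png|jpg|webp|woff2)$ {
    add_header Cache-Control "public, max-age=31536000, immutable";
  }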
I've had the experience of being on the opposite end of the spectrum when I made a joke site[0] that kept the connection open to every connected browser. Needless to say, it got hugged to death in the very first ten minutes.
What got it back up was switching from my homebrew webserver to Nginx and increasing the file descriptor limits. Switching to a battle-tested webserver made all the difference. I'm still not sure how many connections it can keep open now, but I don't have to worry about the static parts of the site going down.
Firefox user here. Right click on image -> Open image in new tab. Then kill the connection and could "Save Image as" just fine. Or print screen it and get it in PBrush.
> Firefox user here. Right click on image -> Open image in new tab. Then kill the connection and could "Save Image as" just fine.
That doesn't work for me... Firefox just starts a new download for the image and it never finishes. Sometimes it seems to flush about 32 KB to the disk, giving you half of the image, but the rest must still be waiting in buffers.
Maybe because in my case Firefox is augmented with uBlock Origin + NoScript + Privacy Badger? I don't know, maybe because I'm on desktop and you're on mobile? No idea, but that's how I can do it.
I wish I had a 50 Mbps uplink. My web server (https://timshome.page) runs on a ~15 Mbps uplink. The hardware is still overpowered for its usage, though, with a Ryzen 9 3900X (the server gets my gaming rig hand-me-downs) and 16 GB of RAM.
In my case, the limited bandwidth probably protects my server in some respect.
You can get the web fonts below 20 KB each if you subset them. You can also check if they are present on the system with `local()` before `url()` in the font-face declaration. You could also give the Apple system fonts a higher priority than Inter, because Inter and San Francisco are pretty similar.
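As a rough illustration (the file name is a placeholder; the subset file itself can be produced with fonttools' pyftsubset), the font-face declaration would look something like this:

  /* local() lets the browser skip the download entirely if a matching
     font is already installed; otherwise it falls back to the subset WOFF2. */
  @font-face {
    font-family: "Inter";
    font-style: normal;
    font-weight: 400;
    font-display: swap;
    src: local("Inter"),
         url("/fonts/inter-latin-400.woff2") format("woff2");
  }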
I wonder how much bandwidth an HN hug normally is. 50 Mbps sounds like a lot to me unless you have heavy images/movies/non-text content, but I'm out of my area of expertise.
Some of my blog posts have been #1 on HN for a couple hours. I have lightweight pages, about 100 kB including images and all resources. IIRC traffic peaked at 30 page hits per second. So about 3 MB/s or 24 Mbit/s
A million hits from 75k views, seems pretty reasonable.
Taking as an example page http://canonical.org/~kragen/memory-models/, the last thing of mine that hit the front page: that's 248.7 KiB, including all the images and other liabilities, so for 75k views distributed over a single day, that would be about 18 GiB, or 1.8 megabits per second, maybe with a peak of a few times that. The numbers provided here seem pretty comparable, maybe a factor of 2 or 3 bigger.
Things can vary by an order of magnitude or more from this. The HTML of that page is 17 KiB, so bandwidth could be an order of magnitude lower, and as Õunapuu points out, it's also easy to drop in a few multi-megabyte photos or videos that bulk up a page to 2.5 or 25 megabytes instead of 0.25. http://canonical.org/~kragen/larvacut.webm is 640×480 × 13 seconds and encodes to 1.8 megabytes.
I'm using the exact same setup as yourself. I have a script that runs every 10 minutes that checks if my external IP has changed and then updates my DNS records via the GoDaddy API.
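Roughly the shape of that script, run from cron (the GoDaddy endpoint and sso-key auth format are from memory, so verify them against their API docs; the domain and paths are placeholders):

  #!/bin/sh
  # Only call the DNS API when the external IP has actually changed.
  DOMAIN="example.com"
  CACHE="/var/tmp/current_ip"

  NEW_IP=$(curl -s https://ifconfig.me)
  OLD_IP=$(cat "$CACHE" 2>/dev/null)

  if [ -n "$NEW_IP" ] && [ "$NEW_IP" != "$OLD_IP" ]; then
    curl -s -X PUT "https://api.godaddy.com/v1/domains/$DOMAIN/records/A/@" \
      -H "Authorization: sso-key $GODADDY_KEY:$GODADDY_SECRET" \
      -H "Content-Type: application/json" \
      -d "[{\"data\": \"$NEW_IP\", \"ttl\": 600}]"
    echo "$NEW_IP" > "$CACHE"
  fi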
I also decided to use Hugo to generate my blog (https://tsak.dev)
Quick Tip for your <figure> inconvenience with markdown:
Just write a script that copies whatever is in your clipboard to a ./images/ directory and replaces the content of your clipboard with the markdown you usually write pointing to that image. See my hacky version, here [1]. I have that on a keyboard shortcut but you can also just alias/source it and run it when needed.
[1]: https://gitlab.com/jfkimmes/dotfiles/-/blob/master/scripts/m...
(this requires an `activate` command that symlinks my current project directory to a path in my home directory so that all my scripts can work there. You can probably make this more elegant but it works for me.)
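For reference, a stripped-down sketch of the same idea, assuming X11 with xclip (Wayland users would swap in wl-paste/wl-copy):

  #!/bin/sh
  # Save the clipboard image into ./images/ and put matching markdown back
  # on the clipboard, ready to paste into the post.
  NAME="img-$(date +%Y%m%d-%H%M%S).png"
  mkdir -p ./images
  xclip -selection clipboard -t image/png -o > "./images/$NAME"
  printf '![](images/%s)' "$NAME" | xclip -selection clipboard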
Sidenote: You pretty much generated traffic analytics anyway, and nginx logs have referrer information if you want to see where the traffic is coming from.
Regarding image compression, have you considered using jpegrescan on top of mogrify resizing? In my experience, it can shave off a few percent losslessly.
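Something like this is what I have in mind, assuming ImageMagick's mogrify and the jpegrescan script are installed (the size is just an example):

  # Shrink anything wider than 1200px in place, then losslessly re-optimize.
  mogrify -resize '1200x>' images/*.jpg
  for f in images/*.jpg; do
    jpegrescan "$f" "$f.tmp" && mv "$f.tmp" "$f"
  done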
Actually, it looks to me like the HTML response is the only thing that's gzip'd. It doesn't make sense to waste time gzip'ing the PNGs, but the CSS and JS files should definitely be compressed. (And maybe the fonts? Not sure.) The default `gzip_types` setting in the nginx config shipped by Debian/Ubuntu does include `text/css` and `application/javascript`, but not `text/html`. Could be that OP is using a non-distro config that only enables gzip for `text/html`, or that OP accidentally disabled gzip'ing for the other MIME types while enabling it for HTML.
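For comparison, a typical snippet for the http block might look like this (the MIME list is only an example; note that nginx always compresses text/html when gzip is on, which may explain the confusion):

  gzip on;
  gzip_comp_level 5;
  # text/html is compressed by default and never needs to be listed here.
  gzip_types text/css application/javascript application/json image/svg+xml;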
Yep, that's why I am planning on getting rid of them completely. I did a quick test this morning, seems like removing references to the fonts from the theme and adjusting some font weights should do the trick.
I did run a few semi-private sites from my home, but I soon found the incoming traffic was causing my outbound connection to fall over. Lots of dropped video calls and spotty VPN connections. I assume because my ISP-provided router was not handling the incoming requests very well. I wonder if anyone here has any good advice to get this sort of setup to run well?
Do you use your own router behind the box provided by the ISP? And is it DOCSIS?
If so, using the ISP-provided box in bridge mode could help. This was the cause of many woes in my setup: whenever there were a lot of connections open, the internet connectivity would drop off. But once I changed the ISP-provided box to be a dumb bridge between the internet and my own router, this stopped being an issue.
Yeah, always bridge the ISP box if possible. Get an EdgeRouter or run a decent router with OpenWRT or run pfSense/OPNSense on a spare x86 machine, those should be plenty fast for moderate usage.
If you're into tinkering, then anything that has good-enough specs and great OpenWRT support is a safe bet. Personally I use a TP-Link Archer C7 v5, but there might be better options out there by now.
I am asking because you did not even mention it in the "Residential connections and DNS" section. Depending on the ISP you might get a static IPv6 prefix.
This is a joke, right? We colo at a location where the standard is 100mbit and the websites hosted there are doing just fine. Just slim down your pages and stop generating 25MiB landing page bundles.
There didn't seem to be any data in the article analyzing how much bandwidth was being used under load. I'm not sure if, for example, compressing those images is what made it possible to survive.
I’ve had a few visits on my solar-powered Pi 3B+ on a 50 Mbit uplink, and because it’s a static site, both the Pi and my internet connection had no issues whatsoever (looking at Grafana).
Not entirely sure, since I rely on the defaults provided by LinuxServer.io SWAG Docker image. The backing store for my site runs on ZFS though and it’s highly likely that web assets are cached in memory.
When I began hosting websites around 2002, we started with a 1.5 Mbit leased line. It's amazing how many requests we handled with what is now considered so little bandwidth. Of course, the media attached to websites has grown larger since then.
If availability is your goal, then yes, a CDN is a good idea.
My setup is optimized more towards self-hosting and just doing it as a hobby. The web is centralized enough already, I will do my best to balance that out.