Hacker News

Meh, I'm always super unimpressed when simple text based websites have trouble scaling. Everything that's highly requested should be available in memory, and it should be trivial to spit it out instantly.

I'm not a scaling wizard, but I'd guess that 99 times out of 100, CRUD apps have problems scaling because they are over-engineered, and there is a tendency to solve scaling issues by adding another layer of complexity instead of optimizing the root application.




4chan is an image board. None of the content is there long enough to be considered "highly requested." And like half the posts have JPEGs and PNGs attached.

Besides, how does caching solve the "My bandwidth bills are killing my wallet!" issue?


Here's how I would scale 4chan. All static items served from s3/cloudfront. Posted images pushed to s3/cloudfront. (Or wherever filehosting is cheapest). These are all the high-bandwidth items; all that's left is the text/html, which isn't that much work.

I'd argue that all of the text could be served from 1 nice box if you wanted to (multiple boxes make it more complicated, but not that much more). Send the post to each box, add it to a table/indexes in memory, and write the post to disk and backups for recovery purposes. Then either update and cache every page the new post affects, or mark the pages dirty and update and cache them the next time they are requested.
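
A minimal sketch of the "mark the pages dirty" variant, assuming one page per thread and a plain append-only log file for recovery. The `Board` class, its method names, and the log format are all made up for illustration; this is not how 4chan actually works.

```python
import html
from collections import defaultdict


class Board:
    """In-memory posts plus a per-thread page cache with dirty flags (hypothetical)."""

    def __init__(self, log_path):
        self.posts = defaultdict(list)  # thread_id -> list of post texts
        self.cache = {}                 # thread_id -> rendered HTML
        self.dirty = set()              # thread_ids whose cached HTML is stale
        self.log_path = log_path        # assumed append-only recovery log

    def add_post(self, thread_id, text):
        # Write to disk first so the post survives a restart.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(f"{thread_id}\t{text!r}\n")
        self.posts[thread_id].append(text)
        # Lazy variant: only flag the page, re-render on the next request.
        self.dirty.add(thread_id)

    def get_page(self, thread_id):
        if thread_id in self.dirty or thread_id not in self.cache:
            self.cache[thread_id] = self._render(thread_id)
            self.dirty.discard(thread_id)
        return self.cache[thread_id]

    def _render(self, thread_id):
        items = "".join(f"<li>{html.escape(p)}</li>" for p in self.posts[thread_id])
        return f"<ul>{items}</ul>"
```

Repeated calls to `get_page` for an unchanged thread return the cached string without re-rendering; a new post only costs one disk append plus a set insert until someone asks for the page again.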

Done, all pages served out of memory super fast, what am I missing?

As far as bandwidth bills go, most browsers observe cache settings and won't re-download what they have already downloaded. His complaint is about getting hit too hard serving the html/text not the images.
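
To make "observe cache settings" concrete, here is a rough sketch of a conditional GET: the server tags a response with an `ETag`, and if the browser sends the same tag back in `If-None-Match`, the server answers `304 Not Modified` with an empty body. The function names and the one-day `max-age` are assumptions for illustration, and real `If-None-Match` handling (lists, weak validators) is more involved than this exact-match check.

```python
import hashlib
from typing import Dict, Optional, Tuple


def etag_for(body: bytes) -> str:
    # Content-derived tag; the 16-character truncation is arbitrary.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of `body`."""
    tag = etag_for(body)
    headers = {"ETag": tag, "Cache-Control": "public, max-age=86400"}
    if if_none_match == tag:
        # Client already has this exact content: send headers only.
        return 304, headers, b""
    return 200, headers, body
```

Within `max-age` the browser does not even ask; after that, a matching tag costs a few hundred bytes of headers instead of the full file.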


> All static items served from s3/cloudfront. Posted images pushed to s3/cloudfront. (Or wherever filehosting is cheapest).

Congratulations, you just massively blew out your bandwidth bill. The cheapest option is to host your static content yourself, especially if you're serving over a petabyte of it per month.
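
For a sense of scale, a petabyte per month works out to roughly 3 Gbit/s sustained. A quick back-of-the-envelope check, assuming a decimal petabyte (10^15 bytes), a 30-day month, and perfectly even traffic (real traffic peaks well above the average):

```python
BYTES_PER_MONTH = 10**15               # assumed: 1 decimal petabyte
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumed: 30-day month

# Average sustained rate in gigabits per second.
gbps = BYTES_PER_MONTH * 8 / SECONDS_PER_MONTH / 1e9
print(f"{gbps:.2f} Gbit/s sustained")
```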


It really doesn't matter where you host the static/image files, as long as it's completely separate from the application server and doesn't eat into its resources.


> We've also seen our image server's RAID array go from being relatively idle to getting absolutely slammed

Sounds like it is?


  > His complaint is about getting hit too hard serving
  > the html/text not the images.
Then why is he complaining about things like image pre-loading and 'expand-all' extensions/userscripts?


Yeah, I see that now. He's complaining about both. Still, asking 4chan add-on developers to be courteous seems pretty silly. Gotta detect/ban/rate-limit them on the back-end.
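
One common way to do the rate-limit part is a token bucket per client. A minimal sketch, assuming clients are keyed by IP address and that the rate and burst size below are placeholder values; the names `TokenBucket` and `allow_request` are made up for this example.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill in proportion to the time elapsed, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def allow_request(buckets, client_ip, now=None):
    # One bucket per client; 2 requests/second with a burst of 10 is a placeholder.
    bucket = buckets.get(client_ip)
    if bucket is None:
        bucket = buckets[client_ip] = TokenBucket(rate=2.0, capacity=10, now=now)
    return bucket.allow(now=now)
```

An "expand-all" script that fires dozens of requests at once would burn through its burst allowance and start getting refused, while normal browsing stays under the limit.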


Are you new to the Internet? Detect/ban... against 4chan users. facepalm


They already use CloudFlare.

> In the past 30 days, CloudFlare has proxied 1,331,004,996 page views from 4chan


> Notice that I say proxied and not cached. CloudFlare does not cache HTML—every connection/request for HTML is passed on to our servers, and our server must send a response.

CloudFlare is not caching his HTML, so every request hits his backend, which is probably dog-slow. Hence the performance problems.


500 requests per second on average (far higher during peak hours) on a single box?
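
The 500 figure follows from the CloudFlare number quoted above, assuming each proxied page view is at least one HTML request and the 30 days are exactly 30 × 24 hours:

```python
PAGE_VIEWS = 1_331_004_996            # proxied page views in the past 30 days
SECONDS = 30 * 24 * 60 * 60           # 2,592,000 seconds

avg_rps = PAGE_VIEWS / SECONDS
print(f"{avg_rps:.1f} requests/second on average")  # prints 513.5 requests/second on average
```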


All you are doing is spitting out strings from memory. With a beefy box it could probably be higher than that.


Apparently your "advice" is so naïve as to have spawned an entire other post http://news.ycombinator.com/item?id=4208134



