I haven't built anything that requires load balancing (no users). My thought was to use something like a public cloud where it's 0.01 cents/GB or something.
Not sure how I could easily "make an image" copy of my stack/code; possibly this is where serverless is nice, since you just throw code/traffic at something.
load balancing at the application level with nginx is pretty easy and well documented. The bigger problem is writing your application in a way that enables clustering.
in order to utilize load balancers, you'll need to have multiple instances of your application running simultaneously, after all. This ultimately means that you'll most likely need to outsource your jobs into a queue tool such as RabbitMQ and have your cluster nodes fetch them.
lots of refactoring if you didn't plan for such a scenario from the start.
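for reference, a bare-bones nginx load-balancing config looks roughly like this (the upstream hostnames/ports are just placeholders, not anything from your setup):

```nginx
# minimal sketch: nginx proxying to two identical app instances
# (these blocks go inside the http {} context of nginx.conf)
upstream app_backend {
    server app1.internal:8080;
    server app2.internal:8080;
}

server {
    listen 80;

    location / {
        proxy_pass http://app_backend;
        proxy_set_header Host $host;
    }
}
```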
Yeah, I've primarily been developing for myself, not with a lot of users/concurrency in mind, so I have yet to figure that out. I mean, I have built, say, templating "engines" that produce pages for different people using the same db, but I couldn't, say, update a database (like importing) and keep the live one active/unaffected, since I only have one instance running.
I have yet to use nginx (only using Apache at this time) but have heard of/looked into it.
Thanks for the tip on RabbitMQ. Is this different from using, say, cron (for scheduled tasks), or are you talking about just processing a large amount of real-time requests, which at this time for me is just first-come-first-served as far as I'm aware?
I suppose it depends on what your application does. I have not used web sockets much yet personally, but have some experience with them; currently figuring out how to use web workers. Yeah, lots to learn still.
No particular reason to bring up web sockets, but some applications may have real-time updates or push notifications.
edit: I want to get to the point where I have backend statuses telling me how much my applications are using (resources), that would be cool, and then do something when hitting thresholds.
> Thanks for the tip on RabbitMQ. Is this different from using, say, cron (for scheduled tasks), or are you talking about just processing a large amount of real-time requests, which at this time for me is just first-come-first-served as far as I'm aware?
oh, I never realized that they're both "task schedulers" with different meanings (I'm not a native speaker...^^)
cron is meant to fire off a command at a specific interval or date. RabbitMQ creates a schedule of what should be done in what order.
You generally only use cron for tasks outside of your application context, however (e.g. creating backups, flushing old temporary files, maybe fetching updated files/data from a third party).
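for example, a crontab line for a nightly backup would look something like this (the script path is just a placeholder):

```
# run a backup script every night at 03:00
0 3 * * * /usr/local/bin/backup.sh
```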
RabbitMQ's use case is much closer to your application: say one of your users wants to export some data. Your frontend application would just add this task to the task schedule, and your next worker/cluster node would start processing it as soon as it has resources available.
this decouples the actual web application from what's being done on the server and lets you scale indefinitely, depending on how far you go on that front.
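as a rough sketch of the export example in Python with the pika client (the queue name and job fields are made up, and it assumes a RabbitMQ broker on localhost):

```python
# minimal sketch: web app enqueues a job, a worker picks it up later
import json
import pika

conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
ch = conn.channel()
ch.queue_declare(queue="exports", durable=True)

# --- web app side: enqueue the export job and return to the user right away ---
ch.basic_publish(
    exchange="",
    routing_key="exports",
    body=json.dumps({"user_id": 42, "format": "csv"}),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)

# --- worker/cluster node side: process jobs whenever resources are free ---
def handle(ch, method, properties, body):
    job = json.loads(body)
    # ... do the actual export work here ...
    ch.basic_ack(delivery_tag=method.delivery_tag)

ch.basic_qos(prefetch_count=1)   # hand each worker one job at a time
ch.basic_consume(queue="exports", on_message_callback=handle)
ch.start_consuming()
```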
this setup, however, creates a lot of administrative overhead and is IMO not necessary until your project is already quite large. It might also be possible to scale in another way - it depends entirely on the application you're building.
a shared calendar, for example, wouldn't need this, because there are almost no resource-heavy tasks that have to be done.
Typical setups that I've seen break the app up into (micro)services. Each service specializes in one type of "work". You may have API nodes that receive/validate incoming HTTP requests and some worker nodes that do more involved work (like long-running tasks, for example). Then you can add more worker nodes as load increases (this is "horizontal" scaling). Your API nodes would be sitting behind a load balancer.
Your nodes need to communicate with each other. This could happen over a queue (like RabbitMQ). Some nodes will put tasks onto a queue for other nodes to run. Or each node could provide its own API. There are a lot of strategies here.
All of these services would still connect to a "single" database. For resiliency, you'll want more than one database node. There are plenty of hosted database solutions (both for SQL and NoSQL databases). I've seen lots of multi-master SQL setups, but use whatever serves your app best.
If your application supports user sessions, you'll have to ensure the node that handles the request can honor the session token. The "best" way of doing this is having a common cache (redis/memcached) where session tokens are stored (if a node gets a session token, it just looks it up in the common cache). There are other (imo, less attractive) strategies where the load balancer is made to be aware of session tokens (it can direct requests with the same session to the same API node, which has a node-local cache of session tokens).
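A rough sketch of the shared-cache idea in Python with redis-py (the host name and key format are just made up for illustration):

```python
# any API node can write a session, and any other node can read it back,
# so it doesn't matter which node the load balancer picks for a request
import redis

cache = redis.Redis(host="cache.internal", port=6379, decode_responses=True)

def store_session(token, user_id):
    cache.setex(f"session:{token}", 3600, user_id)  # expire after 1 hour

def get_session(token):
    return cache.get(f"session:{token}")  # None if unknown/expired
```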
Of course, now your deployment is very complicated. You have to configure multiple nodes, potentially of different service types, which may need specific config. This is where config management tools like ansible/puppet/chef/terraform/etc come up. You can write some code that deploys and configures your application in a repeatable, consistent way. One thing you can do is create an "image" - this could be a server/VM image or a docker image - and deploy multiple identical servers/containers based on that image, and even have an autoscaler monitor your application nodes and redeploy when it sees a node go down. As you mention, serverless ("Functions as a Service") options are available as alternatives to many of these pieces.
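As a rough illustration, building an image can be as small as a Dockerfile like this (the file names are placeholders for whatever your app uses); every container you start from it is identical, which is what makes autoscaling and redeploying practical:

```dockerfile
# minimal sketch of an image for a small Python app
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```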
But these are some of the techniques I've seen to deal with high load. You pay a price with all the added complexity, but you get scalability in return. There are lots of tools on public cloud providers (AWS, Google Cloud, Azure, etc). If you have google/facebook/netflix level load, that's a little bit of another story.
Wow, thank you so much for this. I'll be referring to this for things to learn (a lot). I started listening to this AWS podcast, and the amount of technology they have that's applicable here is crazy.
I also listen to a Go podcast, and the "throughput" they talk about, like 25,000 requests per second, seems insane to me.
And regarding the big companies handling petabytes/hour, man, insane, and probably redundantly backed up too.
My concern is cost (as you mentioned), in particular if you have no users yet: what justifies the cost of setting all of this up, except for the future-minded goal of growth (where some companies have failed due to an inability to scale)?
Right now I've just been cheap, using 1 VPS, but it's good to know the proper deployment design for growth.
I'm still not sure how I could easily "make an image" copy of my stack/code; possibly this is where serverless is nice, since you just throw code/traffic at something.
Again, a bridge I'll have to cross at some point.