Your nodes need to communicate with each other. This could happen over a queue (like RabbitMQ). Some nodes will put tasks onto a queue for other nodes to run. Or each node could provide its own API. There are a lot of strategies here.
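To make the queue idea concrete, here is a minimal sketch of the producer/worker pattern. It assumes Python's in-process `queue.Queue` and threads as a stand-in for a real broker like RabbitMQ; the names `task_queue`, `worker`, and `run_demo` are made up for illustration, and a real setup would use a broker client library with separate processes or machines.

```python
import queue
import threading

# In-process stand-in for a message broker such as RabbitMQ (assumption:
# a real deployment would talk to the broker over the network instead).
task_queue = queue.Queue()
results = []
results_lock = threading.Lock()


def worker(name):
    """Pull tasks off the shared queue until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            # Sentinel: this worker is done.
            task_queue.task_done()
            break
        # The "work" here is just doubling a number.
        with results_lock:
            results.append((name, task["job"], task["n"] * 2))
        task_queue.task_done()


def run_demo(num_workers=2, num_tasks=5):
    """Start workers, enqueue tasks, and return the sorted results."""
    results.clear()
    workers = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(num_workers)
    ]
    for w in workers:
        w.start()
    # Producer side: put tasks on the queue for whichever worker is free.
    for n in range(num_tasks):
        task_queue.put({"job": "double", "n": n})
    # One sentinel per worker so each of them shuts down.
    for _ in workers:
        task_queue.put(None)
    task_queue.join()
    for w in workers:
        w.join()
    return sorted(r[2] for r in results)
```

The producer never needs to know which worker picks up a task, which is what lets you add more worker nodes later without touching the producer.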
All of these services would still connect to a "single" database. For resiliency, you'll want more than one database node. There are plenty of hosted database solutions (both for SQL and NoSQL databases). I've seen lots of multi-master SQL setups, but use whatever serves your app best.
If your application supports user sessions, you'll have to ensure the node that handles the request can honor the session token. The "best" way of doing this is having a common cache (redis/memcached) where session tokens are stored (if a node gets a session token, just look it up in the common cache). There are other (imo, less attractive) strategies where the loadbalancer is made to be aware of session tokens (it can direct requests with the same session to the same api node, which has a node-local cache of session tokens).
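A rough sketch of the common-cache approach, assuming a plain dict as a stand-in for Redis/memcached. `SharedSessionStore` and `ApiNode` are hypothetical names, not any library's API; with a real cache the store would be a network client that every node connects to.

```python
import secrets
import time


class SharedSessionStore:
    """Dict-backed stand-in for a shared cache such as Redis or memcached."""

    def __init__(self):
        self._data = {}

    def create(self, user_id, ttl_seconds=3600):
        # Random, unguessable token mapped to the user and an expiry time.
        token = secrets.token_urlsafe(32)
        self._data[token] = (user_id, time.monotonic() + ttl_seconds)
        return token

    def lookup(self, token):
        entry = self._data.get(token)
        if entry is None:
            return None
        user_id, expires_at = entry
        if time.monotonic() >= expires_at:
            # Expired sessions are removed on first access.
            del self._data[token]
            return None
        return user_id


class ApiNode:
    """One API node; every node is handed the same shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user_id):
        return self.store.create(user_id)

    def handle_request(self, token):
        # Any node can honor the token because the lookup hits the shared store.
        user_id = self.store.lookup(token)
        if user_id is None:
            return (401, None)
        return (200, user_id)
```

For example, a token issued by `ApiNode("a", store).login("alice")` is accepted by `ApiNode("b", store).handle_request(token)`, so the load balancer can send each request to any node.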
Of course, now your deployment is very complicated. You have to configure multiple nodes, potentially of different service types, which may need specific config. This is where config management and provisioning tools like Ansible/Puppet/Chef/Terraform/etc. come in. You can write some code that deploys and configures your application in a repeatable, consistent way. One thing you can do is create an "image" - this could be a server/VM image or a Docker image - and deploy multiple identical servers/containers based on that image, and even have an autoscaler monitor your application nodes and redeploy when it sees a node go down. As you mention, serverless ("Functions as a Service") options are available as alternatives to many of these pieces.
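The autoscaler part boils down to a reconcile loop: compare what is running against what you want, and launch replacements from the same image. Here is a toy sketch of that logic. `Node`, `Autoscaler`, and the image name are all hypothetical, and nothing here talks to a real cloud provider or container runtime.

```python
import itertools


class Node:
    """A running server/container created from a given image."""

    def __init__(self, node_id, image):
        self.node_id = node_id
        self.image = image
        self.healthy = True


class Autoscaler:
    """Keeps `desired_count` healthy nodes running from one image."""

    def __init__(self, image, desired_count):
        self.image = image
        self.desired_count = desired_count
        self.nodes = []
        self._ids = itertools.count(1)

    def _launch(self):
        # Every node comes from the same image, so they are identical.
        node = Node(f"node-{next(self._ids)}", self.image)
        self.nodes.append(node)
        return node

    def reconcile(self):
        """Drop unhealthy nodes, then launch replacements up to desired_count."""
        self.nodes = [n for n in self.nodes if n.healthy]
        launched = []
        while len(self.nodes) < self.desired_count:
            launched.append(self._launch())
        return launched
```

Real autoscalers run a check like `reconcile()` on a timer and use health checks (HTTP pings, heartbeats) to decide whether a node is healthy; the sketch just flips a flag.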
But these are some of the techniques I've seen to deal with high load. You pay a price with all the added complexity, but you get scalability in return. There are lots of tools on public cloud providers (AWS, Google Cloud, Azure, etc). If you have google/facebook/netflix level load, that's a little bit of another story.
I also listen to a Go podcast, and the throughput they talk about, like 25,000 requests per second, seems insane to me.
And the big companies handling petabytes per hour, man, that's insane, and it's probably all redundantly backed up too.
My concern is cost (as you mentioned), in particular when you have no users yet: what justifies the cost of setting all of this up, other than planning for future growth (some companies have failed because they couldn't scale)?
Right now I've just been cheap and used 1 VPS, but it's good to know the proper deployment design for growth.
Dang, so much to learn haha. Thanks.