- Static frontend hosted on Netlify (free unlimited scale)
- Backend server on Google App Engine (connecting to Gcloud storage and managed DB via magic)
I realize I'm opening myself up to vendor lock-in and increased costs down the road (if I even get that far), but I've wrangled enough Docker/k8s/Ingress setups in the past to know it's just not worth the time and effort for a non-master.
In my experience, the issue isn't that Google will jack up the costs but that they'll deprecate their infrastructure and push the migration work onto you, often forcing you to reimplement major features.[0]
One notable example is how their NDB client library used to automatically handle memcache for you, but they got rid of that in the Cloud NDB Library and now clients have to wire up their own caching (see the sketch after the API list below).
The sequence of datastore APIs I've seen during my experience with AppEngine is:
* Python DB Client Library for Datastore[1], deprecated in favor of...
* Python NDB Client Library[2], deprecated in favor of...
* Cloud NDB Library[3], still supported, but they ominously warn new apps to use...
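To make that caching change concrete, here's a minimal sketch assuming the google-cloud-ndb library and a Redis instance you run yourself (the model and the Redis setup are hypothetical, not from any real app):

```python
# Hypothetical sketch: with Cloud NDB, the global cache is opt-in and points at
# infrastructure you manage, instead of the free memcache handling in legacy NDB.
from google.cloud import ndb

class Article(ndb.Model):  # made-up model, for illustration only
    title = ndb.StringProperty()
    body = ndb.TextProperty()

client = ndb.Client()

# Reads REDIS_CACHE_URL from the environment; the Redis instance is yours to run.
global_cache = ndb.RedisCache.from_environment()

def get_article(article_id):
    # Entities read inside this context are cached in Redis you operate,
    # rather than in App Engine's old built-in memcache.
    with client.context(global_cache=global_cache):
        return Article.get_by_id(article_id)
```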
If you're using the App Engine Flexible editions, it's really easy not to worry about vendor lock-in, or really even deprecation, much at all. For example, it's easy to run a basic Node, Python, or Java backend in App Engine Flexible against a MySQL or Postgres DB in Cloud SQL, so you don't have to manage servers at all and you get all the benefit of automatic scaling without the semi-nightmare of running your own Kubernetes cluster. Then even if App Engine went away entirely, you'd just have a normal Node, Python, or Java app running against a MySQL or Postgres DB, which is pretty trivial to migrate to another platform.
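For what it's worth, here's roughly what that looks like in practice; a hedged sketch assuming a Python app on App Engine Flexible talking to Cloud SQL Postgres over the unix socket App Engine exposes (the env var names are made up):

```python
import os
import psycopg2  # plain Postgres driver; nothing App Engine specific

def get_connection():
    # e.g. "my-project:us-central1:my-db" (hypothetical value set in app.yaml)
    instance = os.environ.get("INSTANCE_CONNECTION_NAME")
    if instance:
        # On App Engine, Cloud SQL is exposed as a unix socket under /cloudsql/.
        return psycopg2.connect(
            host=f"/cloudsql/{instance}",
            dbname=os.environ["DB_NAME"],
            user=os.environ["DB_USER"],
            password=os.environ["DB_PASS"],
        )
    # Anywhere else (local dev, another host), it's an ordinary TCP connection.
    return psycopg2.connect(
        host=os.environ.get("DB_HOST", "127.0.0.1"),
        dbname=os.environ["DB_NAME"],
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASS"],
    )
```

Since it's just psycopg2 against Postgres, moving off App Engine means changing connection details, not application code.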
I still use GCP, but I avoid locking myself into their proprietary infrastructure when I'm writing new stuff. I feel like Google is far too cavalier about deprecating services and forcing their customers to do migration work.
It is hard to replace GCP's managed datastores because I really don't want to maintain my own database server (even if it's a managed service that someone else upgrades for me). So I've stuck to Google Cloud Datastore / Firestore, but I've been experimenting a lot with Litestream[0], and I think that might be my go-to choice in the future instead of proprietary managed datastores.
Litestream continuously streams data from a SQLite database to an S3 backend. It means that you can design your app to use SQLite and then sync the database to any S3 provider. I designed a simple pastebin clone on top of Litestream, and I use it in production for my open source KVM over IP. It's worked great so far, though I'm admittedly putting a pretty gentle workload on it (a handful of requests per day).
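As a rough sketch of that shape (my assumptions, not the author's actual pastebin code): the app only ever touches a local SQLite file, while a separate `litestream replicate` process tails its WAL and streams changes to an S3 bucket, and a replacement VM would run `litestream restore` before starting the app.

```python
import sqlite3

DB_PATH = "/data/pastes.db"  # hypothetical path; Litestream is pointed at this file

def init_db():
    conn = sqlite3.connect(DB_PATH)
    # WAL mode is what lets Litestream tail changes continuously.
    conn.execute("PRAGMA journal_mode=WAL;")
    conn.execute("CREATE TABLE IF NOT EXISTS pastes (id TEXT PRIMARY KEY, body TEXT)")
    conn.commit()
    return conn

def save_paste(conn, paste_id, body):
    conn.execute("INSERT INTO pastes (id, body) VALUES (?, ?)", (paste_id, body))
    conn.commit()

def load_paste(conn, paste_id):
    row = conn.execute("SELECT body FROM pastes WHERE id = ?", (paste_id,)).fetchone()
    return row[0] if row else None
```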
You don’t want to maintain your own database server, even one managed by GCP, but with SQLite you have to maintain state on GCP Persistent Disks and manage backups to S3 using Litestream. Why do you think this is easier?
I don't have to maintain state on GCP persistent disks. I can blow away a server without warning, and I'll only lose a few seconds of data.
True, I have to maintain state on S3, but there's not much work involved in that.
If I were maintaining my own database server, I'd have to manage upgrades, backups, and the complexity of running an additional server. With Litestream, I don't have to manage upgrades because nothing bad happens if I don't upgrade, whereas there are security risks to running an unpatched MySQL/Postgres server in production. Litestream has built-in snapshots and can replicate to multiple S3 backends, so I'm not too worried about backups. And there's no server to maintain.
What operational complexity do you see in Litestream?
SQLite is really great. By using it, you don't have to install and maintain another service, and you don't have to think about things like network security. From that point of view, that's clearly simpler.
But it also introduces a few challenges. It's not as easy to connect to your database remotely to inspect it with something like Sequel Pro for MySQL. It's not possible to create an index or drop a column without blocking all writes, which can be annoying if your database is large. Database migrations in general are harder with SQLite because ALTER TABLE is limited. [1]
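To illustrate that limitation (my own example, with a made-up table), dropping a column on older SQLite versions typically means rebuilding the table, which holds a write lock for the duration of the copy:

```python
import sqlite3

def drop_column_migration(db_path):
    # Rebuild-and-rename pattern: recreate the table without the unwanted column,
    # copy the rows across, then swap the new table in. Writers are blocked
    # until the whole script finishes.
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript("""
            BEGIN;
            CREATE TABLE pastes_new (id TEXT PRIMARY KEY, body TEXT);
            INSERT INTO pastes_new (id, body) SELECT id, body FROM pastes;
            DROP TABLE pastes;
            ALTER TABLE pastes_new RENAME TO pastes;
            COMMIT;
        """)
    finally:
        conn.close()
```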
One last thing regarding losing those few seconds of data: if you use something like Google Cloud Regional Persistent Disk, your data is replicated synchronously across two different data centers, which means you can lose your server, start another one, and not lose any data. It can still be combined with Litestream for backups to S3 with point-in-time restores.
Yeah, this is the saner approach. Just use Google's replication/durability, and export to S3 when you want/need to change vendors. In that case you wouldn't even need Litestream, just SQLite.
If you can lose the last few seconds then yes that's fine. But for most applications I've been working on, we didn't have that flexibility (committed means durable).
I don't see any operational complexity with Litestream.io. I think it's an awesome tool. But it's not that different from managing PostgreSQL backups with something like WAL-E.
The complexity of managing your own database server only exists if you don't use a managed service. With a managed service there is no server to maintain, and they do all the things you mentioned for you.
I agree with you in terms of using what you already know best.
> If you're not already familiar with these tools consider using a managed platform first, for example Render or DigitalOcean's App Platform (not affiliated, just heard great things about both). They will help you focus on your product, and still gain many of the benefits I talk about here.
And:
> I use Kubernetes on AWS, but don’t fall into the trap of thinking you need this. I learned these tools over several years mentored by a very patient team. I'm productive because this is what I know best, and I can focus on shipping stuff instead. Your mileage may vary.
I actually spend very little time on infrastructure after the initial setup (a week of part time work, since then a couple of hours per month tops).
For comparison, this post describing what I did took nearly a month of on-and-off work. But I might just be slow at writing :)
Makes sense, didn't mean my comment as a criticism of your setup Anthony. The product and infra look very cool! Just highlighting that things can be a lot simpler for those of us with more mundane requirements.
Cloud vendor lock-in fears are overblown. Pricing and features will always be competitive between the big vendors. I suspect people waste a lot of time/money trying to be cloud agnostic.
Real vendor lock-in is when you have decades of code written against an Oracle DB, you're getting charged outrageous Oracle rates, and it would also cost a fortune to migrate.
Real cloud vendor lock-in is when you have decades of code written against a [cloud vendor], you're getting charged outrageous [cloud] rates, and it would also cost a fortune to migrate.
A decade has to pass first. Most startups don't last 5 years. Statistically speaking he's right, and if he's not, well, a project that lasted 10 years ought to be profitable, so pay up. Not profitable? Then who cares that cloud lock-in broke the camel's back. If it wasn't profitable enough to justify the investment needed to switch to another vendor, it wasn't profitable enough to begin with.
The thing I've learned is that a lot of people have both a vested interest in and a sort of Stockholm syndrome with vendors (cloud or otherwise). If you spent tons of time learning AWS's special tooling, you are going to see everything as a nail, if you catch my drift. I've seen a few particular users here spend many threads defending their choices despite the often very logical criticisms levied against the "cloud everything" approach.
One thing I like to talk about with C-level execs is their strategy on capex vs. opex, because honestly that determines quite a lot, but it's often something engineers don't think about.
The ultimate “vendor independence” is racking your own servers in your own on-prem data centre with multiple internet connections. Very high capex, potentially low opex depending on scale. In the middle would be racking your own servers at multiple DCs. Less capex (you’re still buying servers, but not air handlers and power distribution), higher monthly opex. On the other end are things like GCP and AWS, where you have virtually no capex but relatively high opex.
And in the end, it really depends on how much you trust different vendors and how you want to manage cash flows. Racking your own servers reduces some risks (Google deciding to terminate your account on a whim, Azure pushing wild updates, Amazon jacking prices wildly) while increasing other risks (only your own staff are watching your hardware).
You are painting an incomplete picture. Between the high (racking your own servers at multiple DCs) and very high (your own DCs) capex options and the low-capex options (IaaS and PaaS), there is a middle ground: bare-metal cloud providers, unless you need specific managed services, the larger PaaS ecosystem, and/or extreme scalability. This approach combines multiple benefits, including bare metal's maximum performance, full isolation with no "noisy neighbors", pretty much total control of the equipment you rent, cloud-like elasticity, a flexible and usually globally distributed network architecture, and reasonable pricing.
Yes. This becomes clear when the cloud costs rise to be the largest burn in your budget and the runway keeps getting shorter and you can't migrate away because your code has tendrils deep into every AWS crevice...
Any company after a decade is going to have growing pains.
Spend your early time working on your core business. If your core business isn't cloud agnosticism then you shouldn't be investing your resources there.
Vendor lock-in depends heavily on exactly what vendor you're using, and especially on whether it's an OSS API hosted by the vendor or a proprietary vendor API.
If you use something like AppEngine to run a Flask or Django app, you will not be locked in much because those are open source libraries with well known runtime options elsewhere.
Same to some extent with any sort of managed OSS database.
If you use something like Cloud Datastore or Firestore or DynamoDB, you are using a proprietary API and will have to rewrite all your client calls, or write an extensive shim, and probably significantly re-architect to port.
Even in the “hosted OSS” option there is usually some vendor-specific stuff, but it can vary a lot. Something like AppEngine specifically used to involve an absurd amount of API lock-in, but it has evolved over the years into more of a general container runtime.
The cost involved really depends on how you did it and the differences between what you're migrating to/from.
If all database access is compartmentalized and the two datastores are fairly similar, it can be pretty cheap. If you didn't compartmentalize, it will be expensive. If their characteristics are different enough, your compartmentalization will probably fall down in some cases and it will probably be expensive, although not as expensive as if it weren't compartmentalized.
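A minimal sketch of what "compartmentalized" can mean here (my illustration, with hypothetical names): all persistence goes through one small interface, so porting means writing a new backend rather than touching every call site.

```python
from abc import ABC, abstractmethod
from typing import Optional

class RecordStore(ABC):  # hypothetical interface, for illustration only
    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def put(self, key: str, value: str) -> None: ...

class InMemoryRecordStore(RecordStore):
    """Stand-in backend; a DatastoreRecordStore or PostgresRecordStore
    would implement the same two methods behind the same interface."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        self._data[key] = value
```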
I love this post. I'm a big believer that one and two man startups will continue to build more and more impressive products. My one man startup 42papers.com (A community for top trending papers in CS/DL/ML) has the following stack.
1. Firebase Hosting for the React frontend
2. GraphJin (Automatic GraphQL to SQL Engine) on App Engine for the backend
3. Cloud SQL Postgres for DB
Another way to do something similar would be to use Cloud Run (https://cloud.google.com/run). That way you can avoid vendor lock-in, since you can move your manifests to another Knative hosting provider, or spin up your own K8s cluster and deploy Knative.
Interesting, thanks. I used to use Google AppEngine a lot and very much liked it, but haven’t touched it for years. Now, I like the idea of using Heroku better, and just pay a little more.
Heroku, imo, feels cheaper when you think about how long you can punt on hiring proper ops person(s) and how much time you save by not rolling your own everything.
My experience of Heroku has mostly been the pain of migrating to a different platform once you grow to the point that their pricing (and abstraction) starts to act against your growth.
Heroku is great for general applications, but if you're trying to do something that isn't a standard CRUD app, it can really start to bite you in the arse.
Their DB pricing in particular is incredibly inflexible compared to AWS RDS. Among other issues we had with Heroku at my old job was a DB that was hitting its storage limits but was miles away from hitting its memory or connection limits. There was no option but to upgrade to the next tier, with additional memory etc., even though all we needed was additional disk.
That's not to say that Heroku is bad, but like any tool, you need to be aware of the long-term costs that often come with short-term convenience.
I used them both in the same time period. I liked GAE because it was basically free to use for low use web apps, but has scalability built in. I liked Heroku because it was just so easy to develop and deploy with.
If you haven't checked out App Engine in a while, you really should. Especially check out the App Engine "Flexible" editions, which make it really easy to run on App Engine withOUT getting locked in.
I run a NodeJS GraphQL server in App Engine Flexible, and it is basically just like running it in a Docker container. It's also pretty trivial to run in Google Cloud Run if I so desired, there is even a tool to assist: https://github.com/GoogleCloudPlatform/app-engine-cloud-run-...
If you're just now looking into GAE, you should likely be using Cloud Run instead. My company is busily migrating everything there and reaping the benefits.
Converting (it's more of a conversion than a migration) from flexible GAE to cloud run is super easy, check out the conversion tool I posted in my previous comment.
Basically, your code shouldn't really need to change at all, it's really just your deployment scripts and configs that need to be updated. At their heart flexible GAE and cloud run are both just running Docker containers.
GAE Flex is super old at this point and I've never personally met someone who migrated between them (they're pretty different offerings imho). Moving between either GAE to Run has been pretty seamless though.
Agreed, would have gone with their managed app platform if I was using one of the supported techs. For search I use a $5/mo meilisearch DO droplet that took almost no time to set up and I never have to pay attention to.
Price and functionality. It’s incredibly easy to use, unlike AWS and Google Cloud. The downfall is that you have a bit less control, but that’s never been an issue for me. Their servers have been incredibly reliable, they offer managed databases now, load balancer, S3 compatible Spaces. Everything I’ve needed so far, predictable and affordable pricing, and none of the complexity.
App engine (and Google's cloud in general) is pretty fantastic. I find it much easier to navigate and use than AWS (as someone whose day job isn't running infra on clouds), and I would have gladly put my side projects in there and recommended it to my clients... if only it wasn't Google and its history of randomly locking people out of their Google account, thus the entire Google ecosystem, without appeal.
Some would argue that identity management is the real lock-in anyway, and while a business may be mostly abstracted from their cloud via Kubes, their internal IT systems may be such a kludge that moving away is a nightmare.
First of all nothing important, mostly stuff that's a distraction unless it becomes a need.
That said, using a static frontend cached on a CDN in general improves initial pageload and cuts down on traffic to your server by a lot. Netlify makes this easy if you want to use React on the client (with NextJS).
With AppEngine you get direct access in one console to all the bells and whistles of Google Cloud, basically the same as the other infra giants. AWS has even more bells and whistles but I find its console more annoying.
You can always add Cloudflare to the mix to cache static assets. This change is additive meaning you can start with a single Heroku deployment and if static asset traffic becomes an issue, you can create a Cloudflare account, configure DNS and be done.
Well if you're deploying a static site they are the same, but that's still not the whole picture. They have support for lambda style "serverless" functions and Fauna DB[1], and can bundle functions with apps automatically for some tools like Next.js to do server side rendering for dynamic routes[2]. So while they don't support quite the same level of custom stacks, backends and DBs, they do provide tools that enable full stack applications.
That's right, I'm exaggerating. At current rates I'll hit that limit at 7.5MM pageviews/month.
I've also paid for extra builds once or twice in the past (automatically charges a few dollars when you cross the build time limit), and I pay them $9/mo for analytics.
Are you happy with their analytics? I have no experience with website analytics but I find their offering a bit too minimalistic. I wish for the following features:
- Break down page views into unique visitors for all views (per site, per country, etc.), or some other comparison between those.
Agreed, they're extremely mediocre, but worth $9 to me. Seems like they have better analytics available at a "custom" price, which I assume would be quite expensive. For my use case, minimal analytics at a minimal price works fine.
It would be if it were interpreted as an actual Roman numeral. But in this case it's treated as M x M.
The actual Roman version of a million is an M with a bar over it, where the bar means x1000. But that's not an ordinary character, so wouldn't work for this purpose.
I oversimplified a bit. I have a low-traffic "admin" interface that's rendered server-side. The people using that are my direct customers and are the only authenticated users (they auth in a traditional in-app way).
I also have a high(er)-traffic frontend on a CDN which is used by their customers. User writes there are purchases/payments handled by third(fourth?)-party SaaS.
Many sites have low write:read ratios and don’t leverage that fact in their architectural choices. Availability for maintainers is often less critical than for consumers, and your life is better if you build that in.
My current employers still haven’t learned this lesson and think caching fixes everything.
Tons of reasons, but the main one is that cache is shared mutable state, pretending not to be. It has all of the ugly attributes of global variables, especially where knowledge transfer and reliability are concerned.
In a read-mostly environment you can often more easily afford to update the state all at once. It’s clear what the effects are because they happen sequentially. The cost of an update isn’t fanned out and obscured across the codebase, where you or your team can delude yourselves about the true system cost of a suspect feature.
I agree that caching is mostly a bandaid fix. But IMO if it's used judiciously -- namely in response to a demand for a quick fix of a performance problem -- it can be OK mid-term.
As for shared mutable state, yes, that's true, but what are the alternatives? Whether it's memcached or Redis or an in-process cache (like Erlang/Elixir have), the tradeoffs seem mostly the same.
> namely in response to a demand for a quick fix of a performance problem
Caches are addictive. The first one is 'free' (easy) and people start wanting to use that solution for all their problems, especially social problems (we can't convince team A to get their average response time to match our SLA, so we'll just cache them to 'fix' it)
They defer thinking about architectural problems until later, when they are so opaque that "nobody could blame you" for having trouble sorting them out. But I do. Blame them, that is.
I'm almost inclined to believe that the relationship is inverted from what many assume.
Amazon will bend over backwards to accommodate a company spending $500 mil a year on hosting (apparently what Snap spends). Sure, it's only a fraction of Amazon's roughly $386 billion in revenue, but half a billion is half a billion.
Google Cloud SQL. I say magic because locally I need a service key to connect to the proxy, but the production app doesn't seem to need anything but the internal google address.
The service credentials are supplied via an env variable that points to their location. Locally, you can provide the location directly or set the env variable yourself. When deployed, most GCP service environments already provide credentials automatically, so you don't have to think about it, and it feels a bit like magic. Same thing under the hood, though.
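A small sketch of that mechanism (the bucket name is hypothetical): the client libraries use Application Default Credentials, so locally you point GOOGLE_APPLICATION_CREDENTIALS at a key file, while on GCP runtimes the credentials come from the platform and the same code runs unchanged.

```python
import os
from google.cloud import storage

# Locally, you'd typically run something like this first:
#   export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# On App Engine / Cloud Run etc. this is usually unset; credentials come from the platform.
print(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS"))

client = storage.Client()  # picks up Application Default Credentials either way
bucket = client.bucket("my-example-bucket")  # hypothetical bucket name
```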