
My one-man-SaaS setup:

- Static frontend hosted on Netlify (free unlimited scale)

- Backend server on Google App Engine (connecting to Gcloud storage and managed DB via magic)

I realize I'm opening myself up to vendor lock-in and increased costs down the road (if I even get that far), but I've wrangled enough Docker/k8s/Ingress setups in the past to know it's just not worth the time and effort for anyone who isn't an expert.




In my experience, the issue isn't that Google will jack up the costs but that they'll deprecate their infrastructure and push the migration work onto you, often forcing you to reimplement major features.[0]

One notable example is how their NDB client library used to handle memcache for you automatically, but they got rid of that with the Cloud NDB library and forced clients to implement their own caching.
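
For a concrete flavor of the change, here's a minimal sketch of opting in to caching yourself with Cloud NDB (assuming the google-cloud-ndb and redis packages; the model and Redis host are made up):

  # Old App Engine NDB cached datastore reads in memcache automatically.
  # With Cloud NDB you have to opt in to a "global cache" yourself,
  # e.g. one backed by Redis.
  import redis
  from google.cloud import ndb

  class Account(ndb.Model):  # hypothetical model
      email = ndb.StringProperty()

  client = ndb.Client()
  cache = ndb.RedisCache(redis.StrictRedis(host="10.0.0.3"))  # made-up host

  with client.context(global_cache=cache):
      account = Account.get_by_id(42)  # reads now go through your Redis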

The sequence of datastore APIs I've seen during my experience with AppEngine is:

* Python DB Client Library for Datastore[1], deprecated in favor of...

* Python NDB Client Library[2], deprecated in favor of...

* Cloud NDB Library[3], still supported, but they ominously warn new apps to use...

* Datastore mode client library[4]

[0] https://steve-yegge.medium.com/dear-google-cloud-your-deprec...

[1] https://cloud.google.com/appengine/docs/standard/python/data...

[2] https://cloud.google.com/appengine/docs/standard/python/ndb

[3] https://cloud.google.com/appengine/docs/standard/python/migr...

[4] https://cloud.google.com/datastore/docs/reference/libraries


If you're using the App Engine Flexible environment, it's really easy not to worry about vendor lock-in, or really even deprecation, much at all. E.g. it's easy to run a basic Node, Python, or Java backend in App Engine Flexible, using a MySQL or Postgres DB in Cloud SQL, so you don't have to worry about managing servers at all and you get all the benefits of automatic scaling without the semi-nightmare of running your own Kubernetes cluster. Then even if App Engine totally went away, you'd just have a normal Node, Python, or Java app running against a MySQL or Postgres DB that is pretty trivial to migrate to another platform.
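
As a sketch of why that's portable (hypothetical snippet; DATABASE_URL is an assumed env var that would point at Cloud SQL on App Engine and at any other Postgres elsewhere):

  # Nothing below is App Engine specific, so the same app runs on
  # Flexible, Cloud Run, Heroku, or a plain VM without code changes.
  import os

  import sqlalchemy
  from flask import Flask, jsonify

  app = Flask(__name__)
  engine = sqlalchemy.create_engine(os.environ["DATABASE_URL"])

  @app.route("/users/count")
  def user_count():
      with engine.connect() as conn:
          n = conn.execute(sqlalchemy.text("SELECT count(*) FROM users")).scalar()
      return jsonify(count=n)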


Are you still with them? If yes, would love to hear why. Otherwise, what made you jump?


I still use GCP, but I avoid locking myself into their proprietary infrastructure when I'm writing new stuff. I feel like Google is far too cavalier about deprecating services and forcing their customers to do migration work.

It is hard to replace GCP's managed datastores because I really don't want to maintain my own database server (even if it's a managed service that someone else upgrades for me). So I've stuck to Google Cloud Datastore / Firestore, but I've been experimenting a lot with Litestream[0], and I think that might be my go-to choice in the future instead of proprietary managed datastores.

Litestream continuously streams data from a SQLite database to an S3 backend. It means that you can design your app to use SQLite and then sync the database to any S3 provider. I designed a simple pastebin clone[1] on top of Litestream, and I use it in production for my open source KVM over IP. It's worked great so far, though I'm admittedly putting a pretty gentle workload on it (a handful of requests per day).
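
The application side is just plain SQLite; a rough sketch (file path and schema are made up, and Litestream runs as a separate process pointed at the same file):

  # The app only ever talks to a local SQLite file. Litestream watches
  # that file and replicates it to S3; no S3 code appears in the app.
  import sqlite3

  conn = sqlite3.connect("/data/app.db")  # hypothetical path
  conn.execute("PRAGMA journal_mode=WAL")  # Litestream relies on WAL mode
  conn.execute("CREATE TABLE IF NOT EXISTS pastes (id TEXT PRIMARY KEY, body TEXT)")
  conn.execute("INSERT OR REPLACE INTO pastes VALUES (?, ?)", ("abc123", "hello"))
  conn.commit()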

[0] https://litestream.io/

[1] https://github.com/mtlynch/logpaste


>I feel like Google is far too cavalier about deprecating services and forcing their customers to do migration work.

Having worked with quite a few ex-Googlers, I can say this is a pretty standard Google engineering pattern.


You don’t want to maintain your own database server, even one managed by GCP, but with SQLite you have to maintain state on GCP Persistent Disks and handle backups to S3 using Litestream. Why do you think this is easier?


I don't have to maintain state on GCP persistent disks. I can blow away a server without warning, and I'll only lose a few seconds of data.

True, I have to maintain state on S3, but there's not much work involved in that.

If I were maintaining my own database server, I'd have to manage upgrades, backups, and the complexity of running an additional server. With Litestream, I don't have to manage upgrades because nothing bad happens if I don't upgrade, whereas there are security risks to running an unpatched MySQL/Postgres server in production. Litestream has built-in snapshots and can replicate to multiple S3 backends, so I'm not too worried about backups. And there's no server to maintain.

What operational complexity do you see in Litestream?


SQLite is really great. By using it, you don't have to install and maintain another service, and you don't have to think about things like network security. From that point of view, that's clearly simpler.

But it also introduces a few challenges. It's not as easy to connect to your database remotely to inspect it with something like SequelPro for MySQL. It's not possible to create an index or drop a column without blocking all writes, which can be annoying if your database is large. Database migrations in general are harder with SQLite because ALTER TABLE is limited.[1]
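
For example, the standard workaround for unsupported ALTER TABLE operations is the copy-and-rename dance, roughly like this (table and columns here are hypothetical):

  import sqlite3

  conn = sqlite3.connect("app.db")
  conn.executescript("""
      -- rebuild the table to "drop" a column, then swap it in
      CREATE TABLE users_new (id INTEGER PRIMARY KEY, email TEXT);
      INSERT INTO users_new (id, email) SELECT id, email FROM users;
      DROP TABLE users;
      ALTER TABLE users_new RENAME TO users;
  """)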

One last thing regarding losing a few seconds of data: if you use something like Google Cloud Regional Persistent Disk, your data is replicated synchronously in two different data centers, which means you can lose your server, start another one, and not lose any data. It can still be combined with Litestream for backup to S3 with point-in-time restores.

[1] https://sqlite.org/lang_altertable.html


Yeah, this is the saner approach. Just use Google's replication/durability, and export to S3 when you want/need to change vendors. In that case you wouldn't even need Litestream. Just SQLite.


If you can lose the last few seconds then yes that's fine. But for most applications I've been working on, we didn't have that flexibility (committed means durable).

I don't see any operational complexity with Litestream.io. I think it's an awesome tool. But it's not that different from managing PostgreSQL backups with something like WAL-E.

The complexity of managing your own database server only exists if you don't use a managed service. With a managed service there is no server to maintain, and the provider does all the things you mentioned for you.


I agree with you in terms of using what you already know best.

> If you're not already familiar with these tools consider using a managed platform first, for example Render or DigitalOcean's App Platform (not affiliated, just heard great things about both). They will help you focus on your product, and still gain many of the benefits I talk about here.

And:

> I use Kubernetes on AWS, but don’t fall into the trap of thinking you need this. I learned these tools over several years mentored by a very patient team. I'm productive because this is what I know best, and I can focus on shipping stuff instead. Your mileage may vary.

I actually spend very little time on infrastructure after the initial setup (a week of part-time work; since then, a couple of hours per month tops).

For comparison, this post describing what I did took nearly a month of on-and-off work. But I might just be slow at writing :)


Makes sense, I didn't mean my comment as a criticism of your setup, Anthony. The product and infra look very cool! Just highlighting that things can be a lot simpler for those of us with more mundane requirements.


Hey no worries :) I think my reply came off differently than I meant it.

I just wanted to complement your sentiment.


Cloud vendor lock-in fears are overblown. Pricing and features will always be competitive between the big vendors. I suspect people waste a lot of time/money trying to be cloud agnostic.

Real vendor lock-in is when you have decades of code written against an Oracle DB and you're getting charged outrageous Oracle rates and it would also cost a fortune to migrate.


... a decade later:

Real cloud vendor lock-in is when you have decades of code written against a [cloud vendor] and you're getting charged outrageous [cloud] rates and it would also cost a fortune to migrate.


A decade has to pass first. Most startups don't last 5 years. Statistically speaking, he's right, and if he's not, well, a project that lasted 10 years ought to be profitable, so pay up. Not profitable? Then who cares that cloud lock-in broke the camel's back. If it wasn't profitable enough to justify the investment needed to switch to another vendor, it wasn't profitable enough to begin with.


If anything, vendor lock-in is consistently underblown.


The thing I've learned is that a lot of people have both a vested interest in and a sort of Stockholm syndrome with vendors (cloud or otherwise). If you spent tons of time learning AWS's special tooling, you're going to see everything as a nail, if you catch my drift. I've seen a few particular users here spend many threads defending their choices despite the often very logical criticisms levied against the "cloud everything" approach.

One thing I like to talk about with C-levels is their strategy on capex vs. opex, because honestly that determines quite a lot, but it's often something engineers don't think about.


> The thing I've learned is that a lot of people have both a vested interest in and a sort of Stockholm syndrome with vendors.

Not to mention “resume driven development”. Recruiters love cloud experience.


> One thing I like to talk about with C-levels is their strategy on capex vs. opex, because honestly that determines quite a lot

For example?


The ultimate “vendor independence” is racking your own servers in your own on-prem data centre with multiple internet connections. Very high capex, potentially low opex depending on scale. In the middle would be racking your own servers at multiple DCs. Less capex (you’re still buying servers, but not air handlers and power distribution), higher monthly opex. On the other end are things like GCP and AWS, where you have virtually no capex but relatively high opex.

And in the end, it really depends on how much you trust different vendors and how you want to manage cash flows. Racking your own servers reduces some risks (Google deciding to terminate your account on a whim, Azure pushing wild updates, Amazon jacking prices wildly) while increasing other risks (only your own staff are watching your hardware).
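
As a toy illustration of the kind of math involved (all numbers below are invented):

  # Back-of-envelope breakeven between owning and renting, with made-up
  # figures; a real comparison needs staff time, refresh cycles, etc.
  capex_server = 10_000       # hypothetical cost to buy and rack a server
  opex_colo_month = 300       # hypothetical colo space, power, bandwidth
  opex_cloud_month = 900      # hypothetical comparable cloud instance

  months = capex_server / (opex_cloud_month - opex_colo_month)
  print(f"Owning breaks even after ~{months:.0f} months")  # ~17 months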


You are painting an incomplete picture. Between the high-capex (racking your own servers at multiple DCs) and very-high-capex (your own DCs) options and the low-capex options (IaaS and PaaS), there is a middle ground: bare-metal cloud providers, assuming you don't need specific managed services, the larger PaaS ecosystem, and/or extreme scalability. This approach combines multiple benefits: bare metal's maximum performance, full isolation with no "noisy neighbors", pretty much total control of the equipment you rent, cloud-like elasticity, a flexible and usually globally distributed network architecture, and reasonable pricing.


Totally true :). It's a spectrum and there's a ton of options in the middle. I was mostly pointing out the extremes.


Yes. This becomes clear when the cloud costs rise to be the largest burn in your budget and the runway keeps getting shorter and you can't migrate away because your code has tendrils deep into every AWS crevice...


Any company after a decade is going to have growing pains.

Spend your early time working on your core business. If your core business isn't cloud agnosticism then you shouldn't be investing your resources there.


Vendor lock-in depends heavily on exactly what vendor you're using, and especially on whether it's an OSS API hosted by the vendor or a vendor-proprietary API.

If you use something like AppEngine to run a Flask or Django app, you will not be locked in much because those are open source libraries with well known runtime options elsewhere.

Same to some extent with any sort of managed OSS database.

If you use something like Cloud Datastore or Firestore or DynamoDB, you are using a proprietary API and will have to rewrite all your client calls, or write an extensive shim, and probably significantly re-architect to port.

Even in the “hosted OSS” option there is usually some vendor-specific stuff, but it can vary a lot. Something like App Engine specifically used to involve an absurd amount of API lock-in, but it has evolved over the years into more of a general container runtime.


The cost involved really depends on how you did it and the differences between what you're migrating to/from.

If all database access is compartmentalized and the two datastores are fairly similar, then it can be pretty cheap. If you didn't compartmentalize, it will be expensive. If their characteristics are different enough, then your compartmentalization will probably fall down in some cases and it will probably be expensive, although not as expensive as if it weren't compartmentalized.


If you have a DAO layer in your code, it shouldn't be too significant a refactor to switch between NoSQL vendors for simple tables.

The real heavy lifting comes when you've optimized your tables for that specific architecture. You might need to re-design a lot of your schemas.
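
A minimal sketch of the DAO idea (assuming the google-cloud-firestore package; all names here are illustrative):

  # The app codes against a tiny interface; each vendor gets an adapter,
  # so a migration means writing one new class, not touching call sites.
  from abc import ABC, abstractmethod

  class UserStore(ABC):
      @abstractmethod
      def get(self, user_id: str) -> dict: ...

      @abstractmethod
      def put(self, user_id: str, data: dict) -> None: ...

  class FirestoreUserStore(UserStore):
      def __init__(self, client):  # a google.cloud.firestore.Client
          self._users = client.collection("users")

      def get(self, user_id):
          return self._users.document(user_id).get().to_dict()

      def put(self, user_id, data):
          self._users.document(user_id).set(data)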


My worry with these providers is I get locked out of my accounts for some arbitrary reason or bug.


Yeah, and in the case of Google, good luck finding support.


I love this post. I'm a big believer that one- and two-person startups will continue to build more and more impressive products. My one-man startup 42papers.com (a community for top trending papers in CS/DL/ML) has the following stack:

  1. Firebase Hosting for the React frontend
  2. GraphJin (Automatic GraphQL to SQL Engine) on App Engine for the backend
  3. Cloud SQL Postgres for DB
https://github.com/dosco/graphjin


Another way to do something similar would be to use Cloud Run: https://cloud.google.com/run That way you can avoid vendor lock-in, since you can move your manifests to another Knative hosting provider, or spin up your own K8s cluster and deploy Knative there.


Interesting, thanks. I used to use Google App Engine a lot and very much liked it, but I haven't touched it for years. Now I like the idea of using Heroku better, and just paying a little more.


Heroku feels cheaper to me when you think about how long you can punt on hiring a proper ops person (or persons) and how much time you save by not rolling your own everything.


My experience of Heroku has mostly been the pain of migrating to a different platform once you grow to the point that their pricing (and abstraction) starts to act against your growth.

Heroku is great for general applications, but if you're trying to do something that isn't a standard CRUD app, it can really start to bite you in the arse.

Their DB pricing in particular is incredibly inflexible compared to AWS RDS. Among other issues we had with Heroku at my old job, was having a DB that was hitting its storage limits, but was miles away from hitting its memory or connection limits. There was no option but to upgrade to the next tier, with additional memory etc., even though all we needed was additional disk.

That's not to say that Heroku is bad, but like any tool, you need to be aware of the long-term costs that are often associated with short-term convenience.


If you don't mind my asking, can you say why you moved from GAE to Heroku and/or why you prefer Heroku over GAE?


I used them both in the same time period. I liked GAE because it was basically free to use for low use web apps, but has scalability built in. I liked Heroku because it was just so easy to develop and deploy with.


People here underestimate Google App Engine a lot. But I doubt there's a better service out there for a one-person SaaS.


If you haven't checked out App Engine in a while, you really should. Especially check out the App Engine "Flexible" environment, which makes it really easy to run on App Engine withOUT getting locked in.

I run a NodeJS GraphQL server in App Engine Flexible, and it is basically just like running it in a Docker container. It would also be pretty trivial to run in Google Cloud Run if I so desired; there is even a tool to assist: https://github.com/GoogleCloudPlatform/app-engine-cloud-run-...


If you're just now looking into GAE, you should likely be using Cloud Run instead. My company is busily migrating everything there and reaping the benefits.


So, this is what scares me: in 5 years, someone using GAE would be busy with two migrations: classic GAE -> flexible GAE -> and now Cloud Run?


Converting (it's more of a conversion than a migration) from flexible GAE to cloud run is super easy, check out the conversion tool I posted in my previous comment.

Basically, your code shouldn't really need to change at all; it's really just your deployment scripts and configs that need to be updated. At their heart, flexible GAE and Cloud Run are both just running Docker containers.


GAE Flex is super old at this point, and I've never personally met someone who migrated between them (they're pretty different offerings IMHO). Moving from either GAE to Run has been pretty seamless, though.


If you want to upgrade your architecture.


Digital Ocean is fantastic.


Agreed, I would have gone with their managed app platform if I were using one of the supported techs. For search, I use a $5/mo Meilisearch DO droplet that took almost no time to set up and that I never have to pay attention to.


I'd actually use CapRover and then manage my services using that... it's as close to a self-managed platform as you can get for one-click deploys.


Would you mind being more specific? Is it the price / functionality balance that makes DO fantastic?


Price and functionality. It's incredibly easy to use, unlike AWS and Google Cloud. The downside is that you have a bit less control, but that's never been an issue for me. Their servers have been incredibly reliable; they offer managed databases now, load balancers, and S3-compatible Spaces. Everything I've needed so far, with predictable and affordable pricing and none of the complexity.


Digital Ocean + Cloud66


I'm not a big fan of Google, but I have to say that GAE solves a lot of problems I don't want to deal with.


App Engine (and Google's cloud in general) is pretty fantastic. I find it much easier to navigate and use than AWS (as someone whose day job isn't running infra on clouds), and I would have gladly put my side projects on there and recommended it to my clients... if only it weren't for Google's history of randomly locking people out of their Google account, and thus the entire Google ecosystem, without appeal.


Re: Google App Engine, is it possible to set a cap on how many resources are used?


Some would argue that identity management is the real lock-in anyway, and while a business may be mostly abstracted from their cloud via Kubernetes, any internal IT systems may be such a kludge that moving away is a nightmarish hell.


I'm curious, as someone who probably knows only enough about this stuff to get myself into trouble: what am I missing out on by just pushing to Heroku?


First of all, nothing important; mostly stuff that's a distraction unless it becomes a need.

That said, using a static frontend cached on a CDN in general improves initial pageload and cuts down on traffic to your server by a lot. Netlify makes this easy if you want to use React on the client (with NextJS).

With App Engine you get direct access in one console to all the bells and whistles of Google Cloud, basically the same as the other infra giants. AWS has even more bells and whistles, but I find its console more annoying.


You can always add Cloudflare to the mix to cache static assets. This change is additive, meaning you can start with a single Heroku deployment, and if static asset traffic becomes an issue, you can create a Cloudflare account, configure DNS, and be done.


Netlify's free tier is pretty great: it gives you custom DNS, and support for NextJS builds was so simple.

Heroku is similar, but I found Netlify at least equally simple. Maybe someone else can shed more light on the differences.


Aren't they different products though? Netlify is frontend-only while Heroku is full stack.


Well, if you're deploying a static site they are the same, but that's still not the whole picture. Netlify has support for lambda-style "serverless" functions and FaunaDB[1], and can bundle functions with apps automatically for some tools like Next.js to do server-side rendering for dynamic routes[2]. So while they don't support quite the same level of custom stacks, backends, and DBs, they do provide tools that enable full-stack applications.

[1] https://www.netlify.com/tags/database/

[2] https://www.npmjs.com/package/@netlify/plugin-nextjs


I wouldn't say Netlify has free unlimited scale. There are some limitations, especially the data transfer limit of 100GB.


That's right, I'm exaggerating. At current rates I'll hit that limit at 7.5MM pageviews/month.
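
The implied arithmetic, assuming an average transfer of roughly 13 KB per pageview:

  limit_bytes = 100e9            # Netlify free-tier transfer cap
  per_view = 13.3e3              # assumed average bytes per pageview
  print(limit_bytes / per_view)  # ≈ 7.5 million pageviews/month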

I've also paid for extra builds once or twice in the past (it automatically charges a few dollars when you cross the build-time limit), and I pay them $9/mo for analytics.


Are you happy with their analytics? I have no experience with website analytics, but I find their offering a bit too minimalistic. I wish for the following features:

- Break down page views into unique visitors for all views (per site, per country, etc.), or some other comparison between those.

- Don't lose the history after 30 days.

- Export to .xlsx


Agreed, they're extremely mediocre, but worth $9 to me. Seems like they have better analytics available at a "custom" price, which I assume would be quite expensive. For my use case, minimal analytics at a minimal price works fine.


Free unlimited scale does exist though. Cloudflare has no bandwidth limit for free plans, and it pairs well with App Engine.


How much is 1MM? I'm unfamiliar with that suffix.


1 million


M == thousand


M does happen to be the Roman numeral for “thousand” — as seen in the credits of old movies. But let’s not go there.


k = thousand, M = million, G = billion

There is actually a standard for this. I'm fine with MM; it's confusing to me but not ambiguous. Just please don't reassign existing prefixes...


This is a false cognate.

MM means millions, the plural of M, which means million.


Source? Every source I've seen says that MM is derived from the Roman numeral M, meaning thousand.


Wouldn't that make MM 2,000, the same way II is 2 and XX is 20?


It would be if it were interpreted as an actual Roman numeral. But in this case it's treated as M x M.

The actual Roman version of a million is an M with a bar over it, where the bar means x1000. But that's not an ordinary character, so it wouldn't work for this purpose.


If your frontend is on CDN, how are you handling auth? Do you use Firebase for that?


I oversimplified a bit. I have a low-traffic "admin" interface that's rendered server-side. The people using that are my direct customers and are the only authenticated users (they auth in a traditional in-app way).

I also have a high(er)-traffic frontend on a CDN, which is used by their customers. The only user writes there are purchases/payments, handled by a third(fourth?)-party SaaS.


Many sites have low write:read ratios and don’t leverage that fact in their architectural choices. Availability for maintainers is often less critical than for consumers, and your life is better if you build that in.

My current employers still haven’t learned this lesson and think caching fixes everything.


Not sure I'm following you: why shouldn't we use caching?


Tons of reasons, but the main one is that cache is shared mutable state, pretending not to be. It has all of the ugly attributes of global variables, especially where knowledge transfer and reliability are concerned.

In a read-mostly environment you can often more easily afford to update the state all at once. It's clear what the effects are because they happen sequentially. The cost of an update isn't fanned out and obscured across the codebase, where you or your team can delude yourselves about the true system cost of a suspect feature.


I agree that caching is mostly a bandaid fix. But IMO if it's used judiciously -- namely in response to a demand for a quick fix of a performance problem -- it can be OK mid-term.

As for shared mutable state, yes, that's true, but what are the alternatives? Whether it's memcached or Redis or an in-process cache (like Erlang/Elixir have), the tradeoffs seem mostly the same.


> namely in response to a demand for a quick fix of a performance problem

Caches are addictive. The first one is 'free' (easy) and people start wanting to use that solution for all their problems, especially social problems (we can't convince team A to get their average response time to match our SLA, so we'll just cache them to 'fix' it)

They defer thinking about architectural problems until later, when they are so opaque that "nobody could blame you" for having trouble sorting them out. But I do. Blame them, that is.


I work at a unicorn.

We're all in on AWS and don't care about lock-in.

The vendor lock-in argument isn't worth considering for most businesses.


I'm almost inclined to believe that the relationship is inverted from what many assume.

Amazon will bend over backwards to accommodate a company spending $500 million a year on hosting (apparently what Snap spends). Sure, it's only a fraction of Amazon's overall revenue ($386 billion), but half a billion is half a billion.


>managed DB via magic

What product is this?


Google Cloud SQL. I say magic because locally I need a service key to connect to the proxy, but the production app doesn't seem to need anything but the internal Google address.


The service credentials are supplied via an env variable that points to their location. Locally, you can provide the location directly or set the env variable yourself. When deployed, most GCP service environments just have that variable set up already and you don't have to think about it, so it feels a bit like magic. Same thing under the hood though.
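
A tiny sketch of that mechanism, Application Default Credentials (the env var name is real; the rest is illustrative):

  # Locally: export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
  # On GCP: the runtime already provides ambient credentials, so the
  # same call works with no explicit configuration.
  import google.auth

  credentials, project = google.auth.default()
  print(project)  # the project inferred from the environment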


Oh cool, that makes perfect sense, thanks for the explanation!




