
Moving from Heroku to Google Kubernetes Engine - shosti
https://www.rainforestqa.com/blog/2019-04-02-why-we-moved-from-heroku-to-google-kubernetes-engine/
======
markbnj
We've been running in production on GKE for a little over two years and it's
been a solid platform since day one. It's nice to read articles like this and
see others coming to the same conclusions we've come to. If your practices and
workflow are oriented to containers and you outgrow your PaaS then k8s is the
logical place to land.

With respect to the choice of helm: we started out rolling our own pipeline
with sed and awk and the usual suspects. When that became too complex helm was
just taking off and we moved to that. We still use it to install various
charts from stable that implement infrastructure things. For our own
applications we found that there was just too much cognitive dissonance
between the helm charts and the resulting resources.

Essentially the charts and the values we plugged into them became a second API
to kubernetes, obfuscating the actual API below. The conventions around the
"package manager" role that the tool has taken for itself also contribute to
lessen readability due to scads of boilerplate and name mangling. We recently
started deploying things in a new pipeline based on kustomize. We keep base
yaml resources in the application repo and apply patches from a config repo to
finalize them for a given environment. So far it's working out quite well and
the application engineers like it much better. Now with kubectl 1.14,
kustomize's features have been pulled into that tool, something I have mixed
feelings about, but at least the more declarative approach does seem to be the
way the wind is blowing.
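
For a sense of what that looks like, here's a minimal sketch of the base-plus-
patches layout described above (file names and paths are illustrative, not our
actual repos):

    # base/kustomization.yaml (lives in the application repo)
    resources:
      - deployment.yaml
      - service.yaml

    # overlays/staging/kustomization.yaml (lives in the config repo)
    bases:
      - ../../base            # or a git URL pointing back at the app repo
    patchesStrategicMerge:
      - replicas.yaml         # e.g. bump the replica count for this environment

    # finalize and apply for one environment:
    #   kustomize build overlays/staging | kubectl apply -f -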

~~~
atombender
We use Helm, but we really only use it for two things: Templating and atomic
deploys/deletes.

Helm templating is pretty terrible. Whoever thought generating YAML as text
was a good idea deserves a solid wedgie. But it gets us where we need to be.
During our prototyping of our GKE environment, we had lots of individual YAML
files, which was not tenable.
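
To give a flavour of the text templating (this is roughly the shape of what the
standard chart scaffolding generates; names are illustrative):

    # templates/deployment.yaml -- YAML produced by string templating, with
    # the indentation managed by hand via nindent:
    metadata:
      name: {{ include "myapp.fullname" . }}
      labels:
        {{- include "myapp.labels" . | nindent 4 }}
    spec:
      replicas: {{ .Values.replicaCount }}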

Atomic deploys/rollbacks are essential. What Helm brings to the table is a
high-level way of tying multiple resources together into a group, allowing you
to both identify everything that belongs together, and to then atomically
apply the next version (which will delete anything that's not supposed to be
there anymore). Labels would be sufficient to track that, in principle, but
you still need a tool to ensure that the label schema is enforced.
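
For comparison, the label-based equivalent without Helm would be something along
the lines of (selector illustrative):

    # apply the new version and delete anything carrying the app's label
    # that is no longer present in the manifests:
    kubectl apply --prune -l app.kubernetes.io/instance=myapp -f manifests/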

We don't use any of the other features of Helm -- they're just in the way. We
don't use the package repo; we keep the chart for every app in the app's Git
repo, so that it's versioned along with the code. We've written a nice wrapper
around Helm so people just do "tool app deploy myapp -e staging", and it knows
where to look for the chart, the per-environment values, etc., and invokes the
right commands. (It also does nice things like check the CI status, lint the
Kubernetes resources for errors, show a diff of what commits this will deploy,
etc.)

I've looked at Kustomize, and I don't think it's sufficient. For one, as far
as I can see, it's not atomic.

I'm hoping a clear winner will emerge soon, but nothing stands out. My
favourite so far is Kubecfg, which is similar to the unnecessarily complex
Ksonnet project, which has apparently been abandoned. Kubecfg is a very simple
wrapper that only does Jsonnet templating for you.

I'd be interested in how Google does these things with Borg. My suspicion is
that they're using BCL (which Jsonnet is based on, last I checked) to describe
their resources.

~~~
renaudg
Kapitan ([https://kapitan.dev](https://kapitan.dev)) is on my radar as a
possible sweet spot between Kustomize and Helm.

Until now I've used Jinja2 templates for our Kubernetes definitions with a
variables file for each environment, but this is awfully manual.
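
Concretely, it's roughly this kind of thing (paths and variable names just for
illustration):

    # templates/deployment.yaml.j2 (fragment)
    spec:
      replicas: {{ replicas }}
      template:
        spec:
          containers:
            - name: app
              image: gcr.io/{{ project_id }}/app:{{ image_tag }}

    # rendered per environment with something like:
    #   jinja2 templates/deployment.yaml.j2 envs/staging.yaml | kubectl apply -f -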

I'd love Kustomize to be sufficient for us as it's poised to become a standard
thanks to now being part of kubectl.

Unfortunately, in some ways its YAML patching philosophy is too limited, and
coming from a templating system it would be a step back even for relatively
simple use cases: for example, you're very likely to need a few variables
defined once and reused across k8s definitions (a host or domain name, project
ID, etc.). You can't really do that in a DRY way with Kustomize.

AFAIK, it also currently doesn't have a good story for managing special
resources like encrypted secrets: it used to be able to run arbitrary helper
tools for handling custom types (I use Sealed Secrets), but this has been
removed recently for security reasons, prior to the kubectl merge.

Kapitan seems to cover this ground, and it doesn't carry the weight of those
Helm features which are useless for releasing internal software, but I'm still
a bit worried about the complexity and learning curve for dev teams.

Is there anything else out there that goes a little further than Kustomize, is
simpler than Kapitan and Helm, and fits well into a GitOps workflow?

~~~
atombender
I've only looked briefly at Kapitan. It looks interesting, but I think what
Helm gets right, and these other tools don't, is to have a real deployment
story that developers can like. Helm doesn't excel here, but it's better than
kubectl.

In short, I think the winning tool has to be as easy to use as Heroku. That
means: The ability to deploy an app from Git with a single command.

It doesn't need to be by pushing to git. I built a small in-house tool that
allows devs to deploy apps using a single command. Given some command line
flags, it:

* Checks out a cached copy of the app from Git

* Finds the Git diff between what's deployed and current HEAD and pretty-prints it

* Checks the CI server for status

* Lints the Kubernetes config by building it with "helm template" plus a "kubectl apply --dry-run"

* Builds the Helm values from a set of YAML files (values.yml, values-production.yml etc.), some of which can be encrypted with GPG (secrets.yml.gpg) and which will be decrypted to build the final values.

* Calls "helm upgrade --install --chart <dir>" with the values to do the actual deploy.

The upshot is that a command such as "deploytool app deploy --red mybranch"
does everything a developer would want in one go. That's what we need.
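
Stitched together, the underlying commands look roughly like this (release,
branch and file names are illustrative, not the real tool):

    # fetch the cached checkout and show what would ship
    git -C ~/.cache/myapp fetch origin && git -C ~/.cache/myapp checkout origin/mybranch
    git -C ~/.cache/myapp log --oneline <deployed-sha>..HEAD

    # lint: render the chart, then dry-run it against the cluster
    helm template ./chart -f values.yml -f values-production.yml \
      | kubectl apply --dry-run -f -

    # build the final values, decrypting secrets along the way
    gpg --decrypt secrets.yml.gpg > /tmp/secrets.yml

    # the actual deploy
    helm upgrade --install myapp ./chart \
      -f values.yml -f values-production.yml -f /tmp/secrets.yml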

The tool also supports deploying from your own local tree, in which case it
has to bypass the CI and build and push the app's Docker image itself.

Our tool also has useful things like status and diff commands. They all rely
on Helm to find the resources belonging to an app, and we did this because
Helm looked like a good solution back when we first started. But we now see
that we could just rely on kubectl behind the scenes, because Helm's release
system just makes things more complicated. We only need the YAML templating
part.

I hate YAML templating, though, so I think something like Kubecfg is the better
choice there.

~~~
markbnj
> The upshot is that a command such as "deploytool app deploy --red mybranch"
> does everything a developer would want in one go. That's what we need.

That tool for us is a gitlab pipeline, and I guess the logic in your tool is
in our case split between the pipeline and some scaffolding in a build repo.
The pipelines run on commit: the image is built, tested, and audited, then the
yaml is patched and linted as you describe before being cached in a build
artifact. The deploy step is manual; it tags and pushes the image and
kubectl-applies the yaml resources as a single doc so we can make one call. We
recently added a step to check for a minimal set of ready pods and fail the
pipe after x secs if they don't come up, but haven't actually started using it
yet.
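
Roughly, the shape in .gitlab-ci.yml terms looks like this (job names and
variables are illustrative; the build/test jobs are omitted):

    stages: [build, test, package, deploy]

    package:
      stage: package
      script:
        - kustomize build overlays/$ENV > all.yaml
        - kubectl apply --dry-run -f all.yaml      # lint the patched yaml
      artifacts:
        paths: [all.yaml]                          # cached as a build artifact

    deploy:
      stage: deploy
      when: manual                                 # the manual deploy step
      script:
        - docker tag app:$CI_COMMIT_SHA $REGISTRY/app:$CI_COMMIT_SHA
        - docker push $REGISTRY/app:$CI_COMMIT_SHA
        - kubectl apply -f all.yaml                # single doc, one call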

~~~
atombender
That sounds similar, except you prepare some of the steps in the pipeline.
Sounds like you still need some client-side tool to support the manual deploy,
though. That's my point -- no matter what you do, it's not practical to reduce
the client story to a single command without a wrapper around kubectl.

Interesting idea to pre-bake the YAML manifest. Our tool allows deploying
directly from a local repo, which makes small incremental/experimental tweaks
to the YAML very fast and easy. Moving that to the pipeline will make that
harder.

Also, you still have to do the YAML magic in the CI. We have lots of small
apps that follow exactly the same system in terms of deploying. That's why a
single tool with all the scaffolding built in is nice. I don't know if Gitlab
pipelines can be shared among many identical apps? If not, maybe you can
"docker run" a shared tool inside the CI pipeline to do common things like
linting?
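
Something like this in each app's pipeline, I imagine (image and command names
purely illustrative):

    lint:
      script:
        - docker run --rm -v "$PWD:/src" registry.example.com/ci/deploy-tool lint /src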

------
_0nac
If you can't wait for their teaser of " _In a future post, I’ll cover the
migration process itself_ ", the GCP site has a hands-on tutorial of migrating
an app which may prove interesting:

[https://cloud.google.com/solutions/migrating-ruby-on-rails-a...](https://cloud.google.com/solutions/migrating-ruby-on-rails-apps-on-heroku-to-gke)

Disclaimer: I work for GCP and wrote most of that :D

At the end of the day, Heroku and GKE are rather different beasts with
different philosophies, so migrations are never going to be 1:1. I expect this
to become simpler over time as tooling matures though, e.g. using
[https://buildpacks.io](https://buildpacks.io) to build Docker images instead
of having to craft them by hand seems promising.
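
As a taste of that route (image and builder names purely illustrative):

    # build an image from source with a buildpacks builder instead of a
    # hand-written Dockerfile, then push it:
    pack build gcr.io/my-project/my-app --builder heroku/buildpacks:18
    docker push gcr.io/my-project/my-app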

~~~
ZeroCool2u
I've been playing around with GCP, mostly app engine, the past couple weeks
making a toy application to get a little more comfortable with building and
designing serverless apps. Just wanted to say that you, and whoever else is
writing those docs, do an amazing job. Seriously, I've tried all 3 platforms
because of work, and GCP is by far the easiest to become productive with; the
main reason is the glorious documentation. Keep up the great work!

~~~
devlance
Great to hear about your experience with the docs and appreciate the call out.
You can use the "Send Feedback" button on any of the doc pages to let us know
about any feedback you have. Real people read it.

~~~
ZeroCool2u
I'll keep that in mind!

------
wrs
We are happy with GKE, but have gone from Cloud SQL PostgreSQL back to self-
managed PostgreSQL VMs. Cloud SQL is still stuck on version 9.6, and still has
no point-in-time recovery ability. It's disappointing because the rest of the
GCP offering is pretty well thought out and making rapid progress but Cloud
SQL seems not to be getting much love.

~~~
pm90
They mentioned that there will be some interesting updates next week for
Google Next; do you think your feelings might change if they announced support
for newer versions and point-in-time recovery?

On a related note, if you have tried it, what do you think of their Cloud SQL
MySQL offering?

~~~
Spartan-S63
Not the original commenter, but my team uses CloudSQL MySQL. It's not too bad.
It's pretty performant, but we've run into some weird issues surrounding
replicas.

As far as I know, MySQL 5.7 is as far as they go, and like PostgreSQL, they
don't support point-in-time recovery. Also, perplexingly (at least as of a
year ago?), deleting the instance deletes all backups associated with it, so
there's an opportunity to accidentally blow away all your data. I'm sure
Google can recover it, but you'll have to submit a support ticket for that.

~~~
takeda
Yes, when deleting an instance you can't reuse the same name for a week, so
I'm quite sure the data is still there for at least that amount of time.

------
shay_ker
> At one point we attempted to migrate to Heroku Shield to address some of
> these issues, but we found that it wasn’t a good fit for our application.

This part seems very hand-wavy, given that Heroku Shield would've solved many
(all?) of their problems.

> We were also running into limitations with Heroku on the compute side: some
> of our newer automation-based features involve running a large number of
> short-lived batch jobs, which doesn’t work well on Heroku (due to the
> relatively high cost of computing resources).

How much memory did their batch jobs actually need? If they're using Rails,
then I'm assuming they're just running a bunch of Sidekiq jobs that are
querying PG. I'm surprised that they'd need that much in terms of compute
resources. They should be able to get very, very far by making PG do a lot of
the work, or by streaming data from PG and not holding a lot of data in
memory.

Even if they did need all this, the following two options seem WAY easier to
manage:

1) Use dokku to run your super-intense Sidekiq batch jobs on beefy EC2
instances. You can still schedule them in your Rails app in Heroku, no big
deal. Many engineering teams have to do this type of split-up anyway when it
comes to Application Engineers and Data Engineers, this is just a simpler way
to do it.

2) Similar to 1), use a different language runtime for the batch jobs. If you
really need to run CPU-intensive jobs, why are you using Ruby? If the jobs
aren't intense enough to mandate maintaining two languages (fwiw, not that hard),
why will moving to k8s solve the issue?

Personally, I'm not sold on their decision to move to Kubernetes, and I use
Kubernetes for my job.

~~~
shosti
> This part seems very hand wavy, given that Heroku Shield would've solved
> many (all?) of their problems.

Author here; I don’t want to go into too much detail, but we tried Shield
early on and had a negative experience that made us wary about using the
platform (it seems to use a different tech stack under the hood from “normal”
Heroku and lacks a lot of the things that make Heroku great). Also it’s very
expensive compared to VPC-based solutions on AWS and GCP.

W.R.T. the batch jobs, I think I didn’t explain super well—we are using a
different language and runtime from our “normal” background processing jobs
(which use worker queues in Rails), it’s just that Heroku isn’t very well
suited for the use case (which is basically FaaS-like but with long-lived
jobs).

The “split” workflow you described is basically what we were doing (but with
AWS Batch instead of Dokku); it’s just that it’s more cost-efficient to
consolidate everything into one cluster (especially with preemptible GKE
nodes) and also better to have a common set of tooling for the Ops team.
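
For what it's worth, a preemptible pool for the batch workload is basically one
flag when creating it (names and sizes illustrative):

    gcloud container node-pools create batch-pool \
      --cluster=my-cluster --preemptible --machine-type=n1-highcpu-8 \
      --enable-autoscaling --min-nodes=0 --max-nodes=20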

To be fair, we haven’t yet completed the move from Batch to k8s so it’s
possible that part of the plan won’t pan out as expected.

~~~
holografix
Disclaimer: I work for Salesforce, Heroku’s parent company.

Heroku Shield is a service added on top of Heroku Private spaces.

You usually don’t need Shield unless you want to be compliant with things like
HIPAA, etc.

Which of course could be your case here.

~~~
ukd1
It is, and we needed HIPAA. For me, it's priced aggressively (~600%, compared
to zero for GCP) and wasn't ready when we looked - i.e. it caused a few SEVs.

~~~
shay_ker
> ~600%, compared to zero for GCP

I've always been curious. What do you need to do to be HIPAA compliant, from a
technology standpoint? I figured it's similar to PCI compliance, but I'm not
sure.

From what I've heard, though, the cost isn't quite zero; it's just that you
have to own & implement all the work to be HIPAA compliant. But perhaps it's
not that bad?

~~~
holografix
I’m not in product or legal so take this with a grain of salt:

I know that for a customer I spoke to, keystroke logging on running dynos was
something they were really interested in, from a compliance point of view.

I think being able to spin up Postgres DBs with rollbacks, fork and follow, HA
etc etc (don’t want to sound like a sales rep) in this highly compliant
environment also involves some serious infra wrangling.

------
NightlyDev
I've always wondered: Is kubernetes hard to host on your own on a couple of
servers for production?

I've never tried it, but I've heard a lot of people say it's very hard - then
again, people often complain about the most basic stuff being hard.. Sooo..?

~~~
flurdy
You would have to follow this the whole way:
[https://github.com/kelseyhightower/kubernetes-the-hard-way](https://github.com/kelseyhightower/kubernetes-the-hard-way)

~~~
barbecue_sauce
There are plenty of tools that do a lot of "the hard way" heavy lifting for
personal deployments of Kubernetes, like kubeadm, though doing "the hard way"
at least once is good so you know what actually is going on.
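
For a rough idea of the kubeadm route (addresses, token and hash are
placeholders):

    # on the control-plane node (the pod CIDR depends on the chosen CNI):
    kubeadm init --pod-network-cidr=10.244.0.0/16
    # then install a pod network add-on, and on each worker node run the
    # join command that init prints, roughly:
    kubeadm join 10.0.0.10:6443 --token <token> \
      --discovery-token-ca-cert-hash sha256:<hash>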

------
autotune
Basically moved to my dream tech stack; jealous of the SREs over there.

~~~
maciejgryka
We are hiring :)
[https://rainforestqa.com/careers](https://rainforestqa.com/careers)

~~~
autotune
Ha, would love to as a remote candidate out of NYC, don't see anything for SRE
though.

------
neya
I'm interested to know why they didn't move to Google AppEngine instead, which
offers a better experience and more advanced features overall, especially
considering that AppEngine is a more direct competitor to Heroku than
Kubernetes Engine is.

~~~
alasdair_
>I'm interested to know why they didn't move to Google AppEngine instead,
which offers a better experience and more advanced features overall, especially
considering that AppEngine is a more direct competitor to Heroku than
Kubernetes Engine is.

AppEngine has an enormous number of limitations that you only hit once you
scale up, and it gets very expensive very quickly.

~~~
neya
> AppEngine has an enormous number of limitations

I have used AppEngine in production for 50+ clients and I am genuinely
really curious to know what these limitations are. Maybe it is dependent on
the programming language/framework? I run Phoenix/Elixir with Vue.JS +
PostgreSQL as standard for most of my clients; it's really a breeze to work
with.

In addition, these are the advantages of working with AppEngine:

[https://news.ycombinator.com/item?id=17516530](https://news.ycombinator.com/item?id=17516530)

~~~
clhodapp
The AppEngine Standard Java 8 API is severely limited because it is coupled to
the Servlet API, which badly screwed up the design of its async API. As
a Scala developer, this limitation pretty much sinks the platform for me,
since the flexible environment is really expensive and has a worse value
proposition vs managed Kubernetes.

~~~
haimez
App Engine flex is a “bring your own container” setup.

~~~
clhodapp
Right, but at that point, you might almost as well move up to "full"
Kubernetes if you know it. I'm not saying that there's _no_ use case for App
Engine flex, just that it is kind of an uncomfortable middle ground between
standard and k8s.

~~~
devlance
Worth pointing out that Cloud Run was just released and may better meet
your needs? [https://cloud.google.com/run/](https://cloud.google.com/run/)

------
rdsubhas
I work at a large European startup. We are also heavily invested in and love
kubernetes, but this is a totally apples-to-oranges comparison. Heroku is a
PaaS and Kubernetes is a CaaS. K8s is great at what it does, but making it act
like Heroku takes a huge amount of effort and needs a team to manage the tooling
around it. Assuming that such a team is not needed, and that k8s can simply be
used like heroku with a bunch of extra CLI tools, usually leads to "Wild Wild West"
clusters.

------
mcguireio
Now if only they could quit spamming every inbox of every company I've worked
at, I'd be impressed. Their sales automation is out of control. Honestly... I
write them off before ever looking at their services.

------
zenlot
For some reason those blogs have become boring. Whoever moves to Kubernetes, or
chooses to use some service from a major cloud provider, feels a need to write a
blog about it with the same theme as everyone else.

Bonus points if it explains what containers are and differences between EKS,
ECS and others.

~~~
techslave
well you are missing the point.

the tech details are irrelevant in posts like this. no one, but no one, is
throwing down knowledge and insight that anyone and their mama doesn’t know or
can’t easily acquire. people just don’t give away their secret sauce that
easily.

the 2 points are

a. a recruiting signal. it lets candidates know “we are like you”. we care
about the tech stack and what we do. we care about elegant and “good”
solutions.

b. it gives company devs a public sounding board. a chance to have a bigger
voice than in-house obscurity.

if you think these blogs are really about the tech, well then yes they are
quite boring.

~~~
nitinreddy88
I guess it's becoming a trend: "we moved from here to there", and let's make it
to the top of HN.

Unless the current provider has horrible support/issues which you want to
describe to help others, these articles are useless.

The HN community pushing these articles to the front page, when they describe
neither architecture nor scalability challenges, is pointless.

~~~
staticassertion
I found the section on their evaluation of other potential systems like ECS to
be interesting - it's something I'm currently considering myself.

~~~
ukd1
glad it was useful!

------
auslander
> ..agile without hiring a large Ops team, .. we were beginning to outgrow
> Heroku. We ended up .. running on Google Kubernetes Engine

Or you could hire a few cloud infra devs and stick to autoscaled VMs (EC2), not
spending time on k8s? I bet k8s took you a while to set up and maintain.

------
kerng
Crossing my fingers for you that GCP won't shut down your account because some
mysterious Google AI decides so.

~~~
ec109685
One of the reasons for choosing GKE was reducing vendor lock-in.

