
Why We Chose Kubernetes - sciurus
http://code.haleby.se/2016/02/12/why-we-chose-kubernetes/
======
jakejake
I can't seem to get into the game of fully automated deployments to
production. It definitely interests me but a few things always hold me back.

The first issue is that I've probably set up 10 or 15 app development and
deployment "systems", if you will. I've found that it's very beneficial to
automate the simple stuff, but it quickly reaches a point of diminishing
returns. A super-custom system always works great for a while, until some big
change or library upgrade or refactor or whatever comes down the pipe. Then we
spend a ton of time setting everything up again. We have to keep the build
system components up to date so it doesn't turn into an ancient mystery box.
Sometimes an upgrade breaks the whole thing, and then we're on Stack Exchange
all day debugging a parser library or some other thing we don't care about.
Basically, we spend hours and days and weeks on the build system so we can
have that sweet one-click (or fully automated) deploy.

The other thing is that we release frequently, but we tend to double-check
everything before it goes to production. Our staging server is auto-deployed,
except for DB changes, which we do manually. Right now it's about 2-3 clicks
for us to deploy to production, and it works fine. We still do DB changes
manually, though. It takes a minute or two to deploy. I feel like the process
encourages that final check that everything is cool.

I guess I'm nervous to set up something that deploys to production simply by
adding a tag to a Slack message or the git commit message. Should I get over
myself? If I change my thinking, is it possible that deployment to prod could
be a non-event?

~~~
dekz
It can still be tested and smoke-tested before deploying directly to
production. If you can't trust your automated tests and smoke tests, then
you're setting yourself up to fail.

~~~
officialchicken
Very true, but what's the ROI for full automation?

If a typical 2-3 click deploy generously takes an hour, and they do 40 per
year, then it would take a year to break even, presuming that a fully
automated system could be built and deployed in one man-week. If the deploy
takes 10 minutes, can it be built in less than a day?
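As a rough sketch of that break-even arithmetic (all figures hypothetical, taken from the numbers above, and assuming the automated deploy itself takes negligible time):

```python
# Rough break-even estimate for automating deploys.
# All numbers are hypothetical, matching the figures in the comment above.

def breakeven_years(manual_hours_per_deploy, deploys_per_year, build_hours):
    """Years until the time saved by automation pays back the build cost."""
    hours_saved_per_year = manual_hours_per_deploy * deploys_per_year
    return build_hours / hours_saved_per_year

# 1-hour deploys, 40 per year, one man-week (40 h) to build:
print(breakeven_years(1.0, 40, 40))    # → 1.0 (break even in a year)

# 10-minute deploys with the same one-week build cost:
print(breakeven_years(1 / 6, 40, 40))  # → 6.0 (no longer pays back in a year)
```

Which is why, at 10-minute deploys, the build would have to fit in roughly 40 × (1/6) ≈ 6.7 hours, i.e. under a day, to break even on the same timescale.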

Ignoring the development time for a fully automated system, I think the real
question is, "How do rollbacks and unscheduled downtime from unforeseen
problems impact the ROI?" Because they will happen, eventually.

~~~
brianwawok
Well, how many mistakes do they make with those 4 clicks? That's part of the
point of automation: removing chances for humans to flub.

~~~
reddit_clone
Why downvote this? This is a valid point.

Automation is not just for saving time. It also prevents human errors.

People make errors due to fatigue, inattention, etc., even when performing
simple tasks.

------
jasonjei
Amazon ECS/Elastic Beanstalk was painful to use. Their Docker image registry
requires you to have rotating keys. You try to set up Docker registry
authentication with their container service, only to find out they ignore the
authentication settings for Amazon-hosted registries. Deployment errors give
unhelpful messages (and after much googling you realize it's due to some IAM
policy you were supposed to add, with that bit of info hidden in some
marketing-page FAQ, meanwhile losing 3 hours of sleep). Redeployments take a
long time.

Google Cloud, on the other hand, was truly easier to use. Redeployments didn't
take forever, and it wouldn't try to fail over and over for minutes before
returning an error like AWS.

~~~
boulos
Glad to hear GCR (Google Container Registry -
[https://cloud.google.com/container-
registry/](https://cloud.google.com/container-registry/)) is working great for
you! Its main purpose today is certainly for Kubernetes users on GKE, but we
also have people using it directly and for App Engine Managed VMs. Having a
fast, secure, and cheap (just storage and networking) place to push and pull
containers is _really_ helpful.

Disclaimer: I work on Compute Engine, but didn't work on GCR.

~~~
wstrange
I'd love to see GCR expanded to support more Docker Hub like functionality (a
nice GUI, being able to search for public images, etc.).

And for public images, it would be nice if Google would foot the bandwidth tab
:-).

------
Galaxeblaffer
We've been using GKE (managed Kubernetes on Google Cloud, aka Container
Engine) in production for a year now, and I've been super happy! I've only had
one hiccup, where the load balancer for some unknown reason stopped routing
traffic to my cluster. I must say, Google Cloud has become really, really
good, and there are new features and improvements every month. It's obvious
that Google is channeling many resources into its cloud services. I have full
logging from all my pods, I have rolling updates, I have HTTPS load balancing,
they just released a CDN, there are automated health checks, and there's a
super nice CLI tool (gcloud) that can control every service on the platform
and makes it really easy to script it all, plus advanced monitoring; the list
just goes on. The only third-party ops tool we use is Opbeat. And no, I don't
work for Google; I just really love their cloud service, and I think it
deserves more attention instead of everyone just defaulting to Amazon. I'm the
only one I know who uses it.

~~~
wstrange
I agree - Google needs to promote GCE more. Maybe they should buy some AdWords
:-)

Having used both AWS and GCE, I find GCE is just a better experience.

I love the cloud console and the cloud shell. The CLI tools work well. VMs
start quickly.

For development, preemptible VMs are an incredible bargain.

------
ptrincr
I may have misunderstood, but it appears Kubernetes as a service was chosen
(via GCE), while the alternatives it was compared against have to be installed
by yourself.

This is slightly unfair, as the setup and configuration of Kubernetes on your
own kit is fairly difficult, at least it was the last time I looked,
especially the networking side of things.

~~~
SEJeff
Distributed systems aren't easy, highly available properly built ones even
less so.

------
daurnimator
Looks like an unfinished draft article? I'm only seeing headers for each
section. Screenshot:
[https://i.imgur.com/ZjBchfo.png](https://i.imgur.com/ZjBchfo.png)

(chromium, linux)

~~~
sfilipov
If you are using Arch Linux: pacman -S otf-fira-sans

~~~
wyc
Thanks! Helped me solve this in Gentoo: emerge -av fira-sans

------
jacques_chester
Since we're calling out OSS alternatives that the author missed, I'll point
out Cloud Foundry. It solved the "rolling upgrade" problem years ago.

Currently you can run it on AWS, OpenStack, or vSphere; Azure and GCE support
are being worked on in concert with Microsoft and Google respectively.

Disclaimer: I work for Pivotal, the company which donates the largest chunk of
engineering effort to Cloud Foundry.

~~~
ec109685
Why describe your contributions as a donation? There are paid Cloud Foundry
hosting options, so it is in your company's best interest to make the system
awesome.

~~~
helloiamaperson
There's a foundation that's a separate entity that owns the IP:
[https://www.cloudfoundry.org/membership/members/](https://www.cloudfoundry.org/membership/members/)

------
reitanqild
Does anyone have any good resources (blog series, books, videos, or courses,
preferred in roughly that order) for getting started with Kubernetes on
CoreOS?

Recently installed a three-VM cluster in Vagrant (actually super simple; some
things really have improved a LOT in 2016), but it seems I still need to
understand a lot to get rolling upgrades etc.

~~~
davidopp_
[https://coreos.com/kubernetes/docs/latest/getting-
started.ht...](https://coreos.com/kubernetes/docs/latest/getting-started.html)

Also [http://kubernetes.io/v1.1/docs/getting-started-
guides/coreos...](http://kubernetes.io/v1.1/docs/getting-started-
guides/coreos.html)

------
SEJeff
Excellent post, this is really well written.

Regarding the bug in Mesos found by Aphyr, that has been fixed:
[https://issues.apache.org/jira/browse/MESOS-3280](https://issues.apache.org/jira/browse/MESOS-3280)

The internet should thank people like him for finding these issues.

------
webo
I see the benefit of ECS/Kubernetes for managing many services.

Say I already use ECS or Kubernetes for orchestration of some of my services.
If I'm working on a new Rails/Node/Python app that doesn't talk to the rest of
my services, would it make sense to stick the app into my existing cluster? If
not, what would be an easy way to launch, deploy, and manage these kinds of
stand-alone (non-PaaS) services?

~~~
cpitman
I would deploy all of your applications to the same Kubernetes instance. If
you need to worry about isolating specific applications on different hosts,
that can be accomplished by labeling those hosts and setting up affinity
rules. In other words, hosts can be split into different zones and pods
deployed for specific zones, but still have one larger Kubernetes instance.
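A minimal sketch of the labeling approach described above (the node name, label key/value, pod name, and image are all hypothetical):

```yaml
# First, label the host: kubectl label nodes node-1 zone=isolated
# Then pin pods to the labeled hosts via nodeSelector:
apiVersion: v1
kind: Pod
metadata:
  name: isolated-app
spec:
  nodeSelector:
    zone: isolated      # only schedules onto nodes carrying this label
  containers:
  - name: app
    image: example/app:latest
```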

------
idiocratic
A relative newcomer in the space is Nomad by HashiCorp: very promising, and
with a simpler architecture. Of course, it makes even more sense if used in
conjunction with Consul and Atlas from the same company. It's still young and
buggy, though.

------
bacheson
Rancher is the perfect blend of Tutum and Kubernetes.

I highly recommend you check it out before going down the long, winding road
that is Kubernetes.

~~~
lukebennett
Agreed. We've been trying out Rancher since its early days and are currently
in their beta program, and we've really enjoyed using it - it hits that sweet
spot between power and simplicity. The biggest frustration is the lack of a
CLI, which is preventing us from proceeding to production, as we don't want to
hand-crank API calls manually.

You'll actually be able to use Rancher to manage Kubernetes clusters[0] in the
next few weeks, if you want the best of both worlds.

[0] [http://rancher.com/introducing-kubernetes-environments-in-
ra...](http://rancher.com/introducing-kubernetes-environments-in-rancher-
recorded-online-meetup-february-2016/)

~~~
stuff4ben
This is awesome; I'd never heard of Rancher before! I like the simplicity, but
I'm curious why you'd want to stand up Kubernetes on top of Rancher. What are
the benefits of doing so versus running Rancher by itself?

~~~
lukebennett
I'm equally curious to be honest :) It's not something we do ourselves, I just
mentioned it given the context of this thread.

~~~
stuff4ben
Watching the video, it looks like it's a replacement for "cattle", which I
guess is how nodes are scaled out. Interestingly, they've made deployment of
K8s extremely simple. I'm looking forward to playing with Rancher in some test
environments.

------
anentropic
Anyone got an opinion on [http://armada.sh/](http://armada.sh/) (vs Rancher,
Tutum, Deis...)?

------
kordless
> It was the most robust solution we tried (we only tried it on Google
> Container Engine)

So they didn't actually "choose" Kubernetes. They just chose to use something
that is run by someone else.

~~~
wstrange
How is that different than AWS ECS? In both cases, someone else runs it.

~~~
kordless
I'm not sure what your question means, but I'm simply pointing out that this
didn't actually decide anything useful in terms of rolling your own solution
and running it yourself. Choosing GCE over ECS is not the same as choosing
Docker over Kubernetes if you are self-hosting.

