
Ask HN: Docker, Kubernetes, Openshift, etc – how do you deploy your products? - BloodKnight9923
I use Docker extensively with Python-backed Ansible scripts to manage my product deployments (with a Jenkins CI/CD pipeline). That has been a lot of fun, but I have also played with both Kubernetes and OpenShift.

I love what OpenShift Origin can do, but the learning curve is like a brick wall (see Dwarf Fortress's "Fun" for an example) and the costs are far from minimal.

Kubernetes is easier to learn, but comes with its own gotchas.

What do you do to maintain stable deployments that allow for easy CI/CD? How do you minimize costs with your solution?
======
zalmoxes
I recently (in the past 6 months) joined a new startup as the operations person, and
we standardized on Kubernetes for deployment. In the past I've worked with
Puppet/Chef/Ansible/Heroku/AWS/App Engine/VMware, you name it, and Kubernetes is
the nicest and most flexible platform to build on top of.

There's a learning curve, and new features are being added, but at this point
I would not hesitate to recommend Kubernetes to just about anyone.

CI: We standardized on CircleCI and it gets the job done, but it has some serious
shortcomings. I've also come close to building my own on top of the k8s
cluster; it's not the right time investment for me right now, but I'd
consider building my own in the future. I've yet to find a CI framework I
really like.

~~~
Svenstaro
Have you had a proper look at GitLab CI? I know, it's hard to make that sound
credible with all that's going on with GitLab right now, but I found their CI
to be the best around, and you can even use your own machines to run stuff on.

~~~
andrewstuart2
To be fair, GitLab's current issues are purely operational on the GitLab.com
side: their backup strategy and testing need some work, but the product is
really solid and most of the features are already there in GitLab CE.

For me, getting GitLab up and running on Kubernetes was a breeze (using the
popular Docker image [1]), and my `pg_dump; duplicity;` backups are chugging
right along. I haven't played with their CI yet, but I'm pretty excited to see
how much it can do for me in automatically managing my cluster.

[1] [https://github.com/sameersbn/docker-gitlab](https://github.com/sameersbn/docker-gitlab)

------
eicnix
OpenShift is essentially Kubernetes + Red Hat extensions + Red Hat support.

I use GitLab CI and Helm [1] for deploying. The last step of the CI process
checks out the Helm chart (which is just another Git repo) and executes a helm
install/upgrade CHART-NAME. Making things accessible is done through
Kubernetes ingress with nginx [2] (which includes getting Let's Encrypt
certificates automatically for all external endpoints), so when I want to
deploy a new staging version of the app I can do helm install app --set
host=my-stage.domain.com.

There are still a few gotchas, like pods not updating when a ConfigMap
changes, which matters because I keep the container configuration in
ConfigMaps. A crude workaround for this is [3], which triggers a configuration
reload of the application running inside the container.

This solution has no licensing cost, unlike OpenShift (Tectonic [4] is another
enterprise Kubernetes distribution, which is free for up to 10 nodes), and the
costs are based on the amount of time it takes to set up. But once you've
gotten into Helm and more complex Kubernetes deployments, it should be easy.
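
For illustration, that last CI step boils down to something like this (the
chart repo URL and release/value names here are hypothetical):

    git clone https://git.example.com/charts/my-app.git charts
    helm upgrade --install my-stage ./charts/my-app --set host=my-stage.domain.com

Using helm upgrade --install covers both the first install and later upgrades
with a single command.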

[1] [https://github.com/kubernetes/helm](https://github.com/kubernetes/helm)

[2] [https://github.com/jetstack/kube-lego](https://github.com/jetstack/kube-lego)

[3] [https://github.com/jimmidyson/configmap-reload](https://github.com/jimmidyson/configmap-reload)

[4] [https://coreos.com/tectonic/](https://coreos.com/tectonic/)

~~~
rollulus
Hi, nice, this is very similar to our initial approach. Glad to see that! To
make this flow a bit easier, we've created a tool [1] (apologies for the
plug) that sits on top of Helm. What it does is take a diff between a set
of chart references with values (the desired state) and what's currently in
Kubernetes (the actual state), then perform the necessary creates/updates/deletes.
So, for instance, you don't have to add an explicit helm delete if you remove a
component; instead you just remove it from the desired state. In our day-to-day
work we add and update tons of components through this system, only updating
.yaml files, checking them into Git, and having the CI/CD do all the work. Our
tool also has a dry-run mode, which acts as a test stage for pull requests.

[1]:
[https://github.com/Eneco/landscaper](https://github.com/Eneco/landscaper)

~~~
eicnix
Hi rollulus,

thanks for the info. As far as I understand it, you configure the state of the
Helm release plus its values in a file and apply it to a Kubernetes cluster.
Depending on your cluster setup this is really helpful. Do you have a solution
for triggering a rolling update of pods when a ConfigMap has changed? I didn't
see any ConfigMaps in your examples.

~~~
rollulus
Good point, eicnix. No, we don't have a solution for that, but we haven't had
the need so far either. Sorry.

------
riceo100
I used to use Marathon on Mesos for deploying Docker containers, and
orchestrated it via a hacked-together Jenkins cluster, which worked well but
took a lot of configuration and was somewhat brittle.

I moved to Kubernetes about 6 months ago and have been really enjoying it. My
first production cluster was hand-rolled on AWS, where I found the
cloud-provider load balancer integrations extremely helpful
([https://kubernetes.io/docs/user-guide/load-balancer/](https://kubernetes.io/docs/user-guide/load-balancer/)).
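
With that integration, asking for an external load balancer is just a field on
the Service object; a minimal sketch (names and ports hypothetical):

    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Service
    metadata:
      name: web
    spec:
      type: LoadBalancer   # the AWS cloud provider provisions an ELB for this
      selector:
        app: web
      ports:
      - port: 80
        targetPort: 8080
    EOF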

I'm now using Google Container Engine, which is effectively just a hosted
Kubernetes cluster on GCP and has really been a zero-effort setup, and I have
been deploying to it with Wercker
([http://www.wercker.com](http://www.wercker.com)). [Disclaimer: I currently
work at Wercker as of the last few months, but was a fan/user for many years
before joining.]

One thing I noticed across OpenShift, Mesos, and Kubernetes: none of them
handle the Docker daemon on a node hanging particularly well, which in my
experience happens fairly often.

------
jhspaybar
I use Convox ([http://www.convox.com](http://www.convox.com)). It is backed by
ECS, which gets me out of the infrastructure game for the most part, and the CLI
interactions in Convox are similar to Heroku-style commands, so the learning
curve is much gentler than deploying and learning my own Kubernetes or
OpenStack or ECS configurations. They've also thought of the other things you
need, like environment-based secrets (using DynamoDB and KMS behind the scenes),
as well as external load balancing, TLS, RDS integrations and more, with single
simple commands.

They also have CI/CD out of the box, and builds can be triggered in your
existing cluster with a 'convox build' or triggered on pushes to your private
GitHub repos.

Overall, unless you have a team that actually sees a benefit in managing your
own containers and cluster manager (you'd better be big), I'd recommend embracing
Convox or something like it. The complexity exposed by Kubernetes,
OpenStack, or ECS is still significant.

~~~
wstrange
If you don't want to manage Kubernetes, you can use Google Container Engine
(GKE), which is Kubernetes-as-a-service.

A nice bonus is that they only charge you for the minion nodes. The Kube
master is free.

~~~
thesandlord
The master is free for small clusters (0-5 nodes). After that, you pay for it.

[https://cloud.google.com/container-engine/pricing](https://cloud.google.com/container-engine/pricing)

(I work for Google Cloud)

------
ams6110
I deploy on bare metal. Docker, Kubernetes, et al. add layers of complexity
that I don't need. I'm not saying that they don't have benefits at a certain
scale, but for the types of single-server deployments I do, I have not been
convinced.

~~~
sandGorgon
Please join #sig-onprem on kubernetes.slack.com. We are discussing a lot of
stuff to make this easier.

The biggest challenge right now is the ingress/load balancer abstraction.
Hopefully, that should get resolved over the next few months.

------
webo
Our team is <15 engineers. The setup is roughly as below. We have around 40
services. Ping me if you wanna talk more.

[https://cloudcraft.co/view/5582ddd4-c6f8-4354-8f5b-9fb0a3744...](https://cloudcraft.co/view/5582ddd4-c6f8-4354-8f5b-9fb0a374412a?key=NkuLpYphuk30fWbXYgIWwQ)

* Development: docker + docker-compose. Ideally, we would want to get rid of docker-compose for development.

* CI: Travis (planning on switching to something that is more on the CD side)

* Infrastructure management: terraform

* Prod: AWS, CoreOS, Kubernetes with 1 master node and 5-6 worker nodes (m4.large) in an autoscaling group.

Infrastructure deployments and updates are done by Terraform; blue/green
deployments happen thanks to the autoscaling group.

Kubernetes deployments and updates are done by kubectl.

There are still problems with each piece, but for the most part they work great
without much trouble.
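
In outline, a change rolls through the two layers like this (file names
hypothetical):

    terraform plan                         # review the infrastructure diff
    terraform apply                        # roll the ASG/launch config (blue/green)
    kubectl apply -f k8s/my-service.yaml   # update the Kubernetes objects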

~~~
sandGorgon
Docker 1.13 Swarm is fully compatible with the Compose file format for launching a
cluster. You should try that.
[https://www.infoq.com/news/2017/01/docker-1.13](https://www.infoq.com/news/2017/01/docker-1.13)
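
For reference, the 1.13 flow is roughly this (stack and file names
hypothetical):

    docker swarm init                                        # enable swarm mode
    docker stack deploy --compose-file docker-compose.yml myapp
    docker stack services myapp                              # see what's running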

~~~
webo
In my experience, Docker itself is not very stable, and Swarm is nowhere close
to what Kubernetes offers.

------
ownagefool
We mainly use Drone and have built a templating tool that wraps around
Kubernetes deployments to give us feedback on whether they were successful or
not.

Example kube-deploy files: [https://github.com/UKHomeOffice/kube-piwik](https://github.com/UKHomeOffice/kube-piwik)

Example app / drone files: [https://github.com/UKHomeOffice/docker-piwik](https://github.com/UKHomeOffice/docker-piwik)

Platform Documentation: [https://github.com/UKHomeOffice/hosting-platform](https://github.com/UKHomeOffice/hosting-platform)

KD - our deployment tool
[https://github.com/UKHomeOffice/kd](https://github.com/UKHomeOffice/kd)

I can't really comment on whether or not this specific pipeline actually works,
as I've just picked a random open source example, but the workflow is there.

We also have a legacy tool and use jenkins sometimes, but mostly that won't be
open sourced.

Legacy deployment tool - don't use this.
[https://github.com/UKHomeOffice/kb8or](https://github.com/UKHomeOffice/kb8or)

------
jcahill84
At Schezzle ([https://schezzle.com](https://schezzle.com)) we use docker swarm
on AWS.

The build jobs create images that are published to ECR repositories, and
there are auto scaling groups that add and remove engine hosts to and from ALB
target groups for each deployed service. It makes service discovery, scaling,
etc. really easy.

Definitely try swarm out if you haven't already. 1.12 was good, 1.13 is
amazing (secrets, health-based VIP membership, etc).
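
For anyone curious, the 1.13 secrets flow looks roughly like this (names
hypothetical):

    echo "s3cret" | docker secret create db_password -
    docker service create --name api --secret db_password myorg/api
    # the secret is mounted inside the container at /run/secrets/db_password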

~~~
sandGorgon
Could you talk about Docker Swarm? Any pitfalls that you are seeing, etc.?

We have been considering playing with Swarm. How many images and instances do
you have, etc.?

And especially, how have you leveraged 1.13?

------
Svenstaro
I'm currently on an all-Docker pipeline, but I resent it. It's slow and tedious,
and everybody's trying to use Docker against its design (everybody tries to
make images with as few layers as possible; I think Docker should just do away
with the layers altogether). It also makes it harder than it should be to make
an image that works both for local development and for deployment at the same
time. Also, docker-compose is riddled with fairly old but important bugs (for
instance, .dockerignore files are ignored by docker-compose's build).

I'd much prefer doing simple bare-metal deployments again.

~~~
andrewstuart2
> I think docker should just do away with the layers altogether

I'd take the opposite stance, really. As far as the image format goes, it's the
major differentiation Docker has, and IMO a really clean way of keeping image
pulls DRY. Once your hosts have pulled a single image, _given that you don't
actively undermine it_, subsequent pulls, even for different images, only need
to retrieve the absolute minimum, since they have hopefully already pulled the
majority of the file system.
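
You can see this from the command line: layers shared with an already-pulled
image are skipped on subsequent pulls (image names hypothetical):

    docker pull python:3.6     # pulls the full base image
    docker pull myorg/app-a    # FROM python:3.6 -> only app-a's own layers download
    docker pull myorg/app-b    # shared base layers report "Already exists"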

------
MexicanMonkey
[https://cloud.docker.com/](https://cloud.docker.com/)

Surprised it wasn't already in the long list of suggestions. I have been using
the tool since it was called Tutum. The folks behind Docker bought Tutum and
renamed it Docker Cloud. It's currently set up to redeploy services when I
push an image to my repositories. Really loving the simplicity, even though
it's got some quirks.

You can now link your Bitbucket or GitHub repositories, let it build your
containers, and have it deploy them to production. This way you can build an
easy CI/CD pipeline.

~~~
molszanski
+1 on everything.

With some personal flavor:

- I use auto-redeploy only in a test environment

- We have a locally running old server with drone.io that is web-hooked to GitHub / BitBucket

------
logn
[http://rancher.com/](http://rancher.com/) works with minimal fuss. The tools
in this space are so much in flux that I just care about something working
easily and reliably in the short/medium term.

------
oelmekki
I use dokku for deployment and on-premise gitlab for CI.

Dokku's main advantage is that it's a no-brainer: if you're used to deploying
Heroku apps, it's very similar. It also automates the creation of data
containers for database services, for example. On top of that, while I can use
Heroku's buildpacks for small side projects, I can also take full control of
the build using a Dockerfile (which is what I do for bigger projects). The
main drawback is that it can't manage multi-host container deployments
like Docker Swarm or Kubernetes can (I don't need that, so no need to
compromise on simplicity).
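
The day-to-day flow really is Heroku-like; a sketch (host and app names
hypothetical):

    git remote add dokku dokku@my-server.example.com:my-app
    git push dokku master   # builds via buildpack or Dockerfile, then deploys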

GitLab's pipelines offer both CI and CD, with a lot of cool features around them,
like being able to tell on a commit page when it has been deployed to
production, at no configuration cost.

Regarding costs: well, it's the cost of a dedicated server.

~~~
sheraz
Hear, hear. I've taken the last couple of months to refactor all of my side
projects into 12-factor apps for deployment on a nice Dokku instance. Absolutely
effortless.

The next step is getting these projects deployed through CI/CD. I evaluated
a few options, and it looks like I'm down to drone.io or Jenkins.

This will bring some much-needed sanity to keeping all these side /
personal projects in order. I can go weeks or months without touching them,
but will then know EXACTLY how they will get tested and deployed.

------
zie
We use Nomad [0]; we pretty much use HashiCorp's entire stack (Consul, Vault
and Nomad). Vault has been fabulous for secrets, authentication, etc., Consul
for service discovery, and Nomad for job running/deployment. We have a mix of
static binaries that we run and Docker containers; most of our new stuff is
all Docker containers. We use Jenkins as our CI/CD, which just runs Nomad jobs
and confirms their successful deployment.

Cost management is easy: all the projects are open source, and since we can
spin Nomad up against any cloud provider or internal machine hosts, we go with
whatever is cheapest at the time. It's pretty easy to wrap your head around
Nomad and make it do what you need.

0: [https://www.nomadproject.io/](https://www.nomadproject.io/)

------
timeu
We are evaluating OpenShift Origin on an existing OpenStack on-premises cloud.
So far I have been playing around with the oc cluster up deployment on a local
workstation and it works fine, but I haven't played around with the CI/CD
options (they support Jenkins deployments, etc.). From the docs I see that there
is a bit of complexity regarding the security constraints and the integration of
volumes that I need to wrap my head around.
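
For anyone wanting to try the same local route, the happy path is roughly this
(the repo URL is hypothetical):

    oc cluster up                                      # all-in-one cluster in Docker
    oc login -u developer
    oc new-app https://github.com/example/my-app.git   # build + deploy from source
    oc logs -f bc/my-app                               # follow the build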

I also attended DevConf.cz and saw a lot of presentations about OpenShift.
Most of the talks are on YouTube
([https://www.youtube.com/channel/UCmYAQDZIQGm_kPvemBc_qwg](https://www.youtube.com/channel/UCmYAQDZIQGm_kPvemBc_qwg))
in case somebody is interested.

~~~
BloodKnight9923
My big issue was getting multi-node deployments working well on AWS. I hit
walls of configuration issues, DNS issues, and poor documentation on fields, and
generally could not make much forward progress. Running locally or on a single
node, OpenShift was fantastic: the HAProxies for ingress were easy to configure
and launching new services was impressively easy.

I was leveraging EFS as NFS mounts for my persistent volumes and had good
results.
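
For the record, the EFS side comes down to an NFS-backed PersistentVolume; a
sketch (server address and size hypothetical):

    oc create -f - <<'EOF'
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: efs-pv
    spec:
      capacity:
        storage: 10Gi
      accessModes: ["ReadWriteMany"]
      nfs:
        server: fs-12345678.efs.us-east-1.amazonaws.com
        path: /
    EOF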

You might check out fabric8, if only for its visualizations of what is going
on in your OpenShift / Kubernetes environment.

Thanks for the YouTube link! I'll be sure to check it out.

~~~
timeu
Yeah, I can imagine that setting up a multi-node deployment on AWS might be an
issue. Fortunately, for OpenStack there is a Red Hat-maintained Heat template
that should hopefully make the installation quite straightforward (I haven't
tried it yet, though).

[https://github.com/redhat-openstack/openshift-on-openstack/](https://github.com/redhat-openstack/openshift-on-openstack/)

~~~
jeremyeder
Yep, have used these several times. They work very well.

------
trolla
I can recommend Rancher. I’ve used Openshift, Kubernetes and Rancher - so far,
Rancher has been the best experience.

[http://rancher.com/](http://rancher.com/)

~~~
ecliptik
You can also test Rancher easily now with zero setup:
[https://try.rancher.com](https://try.rancher.com)

We deploy all our containerized applications to Rancher (using Cattle for
orchestration) via Jenkins jobs with a standardized Makefile for build, test,
and deploy, making things consistent.

We looked at running straight k8s, but it was like using a chainsaw to sharpen a
pencil for our use case.

In addition, their devs are extremely helpful and also have a hobby of getting
things to run on ARM.

------
backordr
At the company I work at we use Docker with Kubernetes. The deployment process
involves Ansible and Jenkins CI.

I personally prefer the bare-metal deploys via automated scripts. I usually
just spin up a VM and write a bash script to "prep" it the way I want. After
that, I just run "./deploy" and it pushes where I want. I like this because I
feel like I have more control, and it actually feels easier. Plus, I've run
into weird issues with Docker that take so long to debug that they completely
cancel out the benefit of using it for me.

The bash script I have works for every side project I create, and is simply
copied from project to project. :)

~~~
justinsaccount
> bare-metal deploys of automated scripts. I usually just spin up a VM

You're mixing terms here. A VM is not bare-metal.

~~~
backordr
Yes, I'm aware. I was using the term in a different sense, but I can see how
that's confusing in this context.

I meant just an old-school deploy, without containers and the sort.

------
ksri
We use AWS Elastic Beanstalk. It's simple to set up a high-availability
environment, and if needed, you can always access the underlying EC2 instances
or the Elastic Load Balancer.

Jenkins has a plugin that integrates with Elastic Beanstalk. This makes CI/CD
straightforward.

There's no extra cost for Elastic Beanstalk other than what you'd pay for
EC2, S3 and the Elastic Load Balancer.

We have a starter template with a bunch of .ebextensions scripts that simplify
common installation tasks.
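
For those who haven't seen them, an .ebextensions script is just a YAML file in
the .ebextensions/ directory of your app bundle; a sketch of a typical one
(file name and contents illustrative):

    cat > .ebextensions/01-tools.config <<'EOF'
    packages:
      yum:
        jq: []
    commands:
      01_set_timezone:
        command: ln -sf /usr/share/zoneinfo/UTC /etc/localtime
    EOF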

If your application is a run-of-the-mill web app talking to a database,
Elastic Beanstalk is pretty much all you need.

------
aruggirello
Nobody has mentioned it, but I'm using Vagrant and the digital_ocean plugin to
manage local VMs and droplets for my small projects. It's a simple, quick and
convenient way to bring up fully replicable apps/services. I'm using small
scripts to provision my machines with Caddy, PHP 7, MySQL, and a few other
goodies. Given the available droplet sizes, I'm not hard-pressed to scale beyond
a single machine per app/service, and this keeps everything simple; otherwise
I'd probably go with Kubernetes.
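
Getting the droplet side going is just a couple of commands (assuming the
vagrant-digitalocean plugin; provider name per its docs):

    vagrant plugin install vagrant-digitalocean
    vagrant up --provider=digital_ocean   # brings up the droplet
    vagrant provision                     # re-runs the provisioning scripts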

------
thinkdevcode
We use Rancher with Cattle and do CI/CD via our self-hosted GitLab CI. Pretty
easy to set up & maintain. Would definitely recommend taking a look at Rancher
if you haven't yet.

~~~
Snappy
I haven't played much with Rancher, but I'm really curious about it. I'd like
to hear more about your setup. Perhaps you could even contribute a post on
setting it up with GitLab CI?

------
spudfkc
At my last job, we started off using Mesos and Marathon, but eventually ended
up dropping that in favor of a homemade solution using SaltStack (the manager
demanded we drop Mesos/Marathon and use Salt - it was pretty shitty).

At my current place, we are using Teamcity to run tests and build images, and
Rancher for the orchestration part. I built a simple tool to handle auto-
deployments to our different environments.

I cannot recommend Rancher enough. Especially for small teams, it's just a
breeze to set up and use.

------
rickr
I've been working on creating a platform for a non-profit that gets veterans
coding ([http://operationcode.org/](http://operationcode.org/)). We're a
Slack-based community and have been rolling out some homegrown Slack bots, and
we currently have a Rails app hosted on Heroku. Managing and keeping track of
the different apps was getting unwieldy, so in an effort to consolidate our
apps and reduce costs I evaluated a few different options. I ended up going
with Rancher, and after working with it a bit I'm pretty happy.

I have GitHub hooked up to Travis. When a new PR (or commit) is pushed, Travis
shoves the app into its container and runs the test suite inside the
container.

If that passes AND the branch is master, we push the image to Docker Hub. As of
now we manually update the app inside of Rancher, but I think automating that
will be a simple API call. Once we get more stable I'll be investigating that.

I still haven't quite figured out secret management but outside of that and a
tiny learning curve it's been pretty smooth sailing.

An example travis config:
[https://github.com/OperationCode/operationcode_bot/blob/mast...](https://github.com/OperationCode/operationcode_bot/blob/master/.travis.yml)

~~~
spudfkc
I do something very similar, but with GitHub and TeamCity.

Automating the upgrades (i.e. redeploys) in Rancher is pretty straightforward
- their API is super easy to use. I ended up writing a simple tool, mostly in
Bash, to handle it, and threw it in a Docker container to run on TeamCity.

------
prgk
Since most of the solutions mentioned here are container-based, I will provide
something different.

We started using Juju [1]. Basically, Juju handles bootstrapping/creating the
instances you need in the public clouds, and you can use Juju charms to specify
how to deploy your services. So our deployment looks like this:

      - juju bootstrap google # Get instance in GCE
      - juju deploy my-app    # My app is deployed to GCE

You can actually try this with already publicly available apps. For example,
you can deploy MediaWiki [2] by just doing:

      juju deploy wiki-simple

This will install MediaWiki and MySQL and create the relation needed between
the wiki and the database.

In our case, we have production and development environments. Both are
actually running in clouds in different regions:

      - juju bootstrap google/us-east1-a production
      - juju deploy my-app

      - juju bootstrap google/europe-west1-c development
      - juju deploy my-app

In addition to running in a different region, development tracks any changes
to the development branch in our GitHub repo.

We don't use any containers. Juju allows us to deploy our services to any
cloud (AWS, GCE, Azure, MAAS, ...) including locally using LXD.

[1] [https://www.ubuntu.com/cloud/juju](https://www.ubuntu.com/cloud/juju)

[2] [https://jujucharms.com/wiki-simple](https://jujucharms.com/wiki-simple)

------
throwawaytoday1
We use Dokku
([https://github.com/dokku/dokku](https://github.com/dokku/dokku)) in
production, using its tags:deploy feature to manage all of our containers (apps
in Dokku). We've automated it to the point that we no longer interact directly
with the individual instances. Pushes to master kick off builds that create
Docker Hub images, then a deployment is triggered on the production machines.
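
For context, the tags flow deploys a pre-built image by tag, roughly like this
(app and image names hypothetical):

    docker pull myorg/my-app:v1.2.0
    docker tag myorg/my-app:v1.2.0 dokku/my-app:v1.2.0
    dokku tags:deploy my-app v1.2.0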

------
nawitus
We use Kontena at the moment.

[https://kontena.io/](https://kontena.io/)

~~~
jazoom
How is it?

~~~
nawitus
We've had quite a few issues due to using RHEL with the devicemapper storage
driver. But overall I like the concept behind it.

------
errordeveloper
At Weaveworks, we have built a tool called Flux [1]. It is able to relate
manifests in a Git repo to images in a container registry. It has a CLI client
(for use in CI scripts or from a developer's workstation), and it also has an
API server and an in-cluster component, as well as a GUI (part of Weave Cloud [2]).

Flux is OSS [3], and we use it to deploy our commercial product, Weave Cloud
itself, which runs on Kubernetes.

1: [https://www.weave.works/continuous-delivery-weave-flux](https://www.weave.works/continuous-delivery-weave-flux)

2: [https://cloud.weave.works](https://cloud.weave.works)

3: [https://github.com/weaveworks/flux](https://github.com/weaveworks/flux)

------
rdli
We have a relatively simple cloud app: a couple of (micro)services, but we also
use Postgres and Elasticsearch. We started using Docker + Spinnaker + k8s, but
then we ran into the problem of setting up the app for local dev (where we
wanted to use local PG) and prod (where we wanted to use RDS).

<plug>We've been working a bit on an open source tool, pib, that supports
setting up multiple environments, because we ran into this problem (behind the
scenes it uses Terraform, k8s, and minikube). Would love to hear if anyone
here has seen anything similar or has thoughts!
[https://github.com/datawire/pib](https://github.com/datawire/pib)</plug>

------
morgante
I've spent a fair amount of time evaluating different solutions through my
startup[0] and have found Kubernetes, by far, to come with the least pain.
It's not hard to get started with, but also works well as you grow and mature.
It makes most of the decisions right from the start and kubectl gives you most
of the functionality you need to manage deployments easily.

Also, while I have a vested interest in saying this, you don't always want to
solve this yourself. Look at hosted solutions like GCP and CircleCI to make
things even more painless.

[0] [http://getgandalf.com/](http://getgandalf.com/)

~~~
Svenstaro
Offtopic: how do you implement the product of your startup, precisely? It seems
way too good of a promise to be true, and it seems that people might become
very disappointed.

~~~
morgante
At its core, Gandalf is just a large collection of scripts and playbooks which
are written to work very generically. They're slotted into an overall
framework which I wrote that can "learn" the architecture of a particular
company/app and customize things intelligently. NLP works on top of this to
map user input to playbooks.

There's still a decent amount of human intelligence involved though, since
obviously we want to give customers a good experience. This mainly comes in
upfront (where we tune the implementation for each customer) and for any tasks
which Gandalf hasn't learned to do yet. I've also invested in making it easier
to train Gandalf to do things—for example, I can say "watch me" and then do a
bunch of things with the AWS console/API and they get turned into a
parametrizable playbook.

Any good DevOps engineer invests heavily in automation. Gandalf is just one
level up: it automates the process of automation.

If you have any other questions, feel free to email morgante@getgandalf.com.

------
jkemp
We selected Kubernetes on AWS, but there are a lot of details in going from
source code all the way through to automated k8s deployments. We are currently
using our own framework
([https://github.com/closeio/devops/tree/master/scripts/k8s-ci...](https://github.com/closeio/devops/tree/master/scripts/k8s-cicd)),
but I'm keeping an eye on Helm charts to see if it makes sense to incorporate
them at some point. Pykube
([https://github.com/kelproject/pykube](https://github.com/kelproject/pykube))
has made it easy to automate the k8s deployment details. We needed a process
that would take Python code from our GitHub repos, build and test on CircleCI,
and then deploy to our k8s clusters.

A single commit to our master branch on GitHub can result in multiple service
accounts, config maps, services, deployments, etc. being created/updated.
Making all of that work is complicated enough, but then we also need to deal
with things like canary deployments and letting us build and deploy to k8s
from our local workstations. And then there are details like automatically
deleting old images from ECR so your CI/CD process doesn't fill that up without
you knowing. Incorporating CI/CD processes with Kubernetes is kind of new, so
there are a lot of different projects and services starting to address this
area.

------
luckystartup
I've worked with a lot of tools. I've decided that I like things that are
simple and don't cost much money to get started with. For new projects I always
start with Heroku, or Parse (on a free Back4App plan now).

I love Ansible. Chef is alright. I've been using AWS OpsWorks recently, and
it's not bad. Elastic beanstalk is ok, too.

I've spun up some Kubernetes clusters, and it's nice, although I have no need
for it yet. I remember the database situation was difficult when I was trying
it last year. Something about persistent storage being difficult, so you had
to run Postgres on a separate server.

I still like Capistrano. You can automate it with any CI pipeline. For one
client, I used the "elbas" [1] gem for autoscaling on AWS. It automatically
created new AMIs after deployment. Not super elegant, but it worked fine.

I don't see much of a middle ground between Heroku and Kubernetes. Just start
with the one free dyno. Maybe ramp it up to 3 or 4 with hirefire.io. Once
you're spending a few hundred per month on Heroku, that's probably the time to
spin up a small kubernetes cluster and deploy stuff in containers.

[1] [https://github.com/lserman/capistrano-elbas](https://github.com/lserman/capistrano-elbas)

------
marcc
This is a great question and something we've been trying to figure out
ourselves. Historically, we were using Ansible to deploy Docker containers to
EC2 instances, but we have moved some services over to Kubernetes, Swarm and
Lambda/Serverless. All of these create the same deployment challenges --
the current products out there don't fit perfectly. The more we want to deploy
at a higher level than "just Docker", the less Ansible provides today. But we
wanted to stick to the core concepts of automation, continuous delivery (at
least to staging), and chatops-style management of production.

Our current approach is using an Operable
([https://operable.io](https://operable.io)) Cog we wrote, which takes the
Kubernetes YAML and applies it to a running cluster. It's not perfect, but I'm
pretty happy with the direction it's going. We built this Cog in a public repo
([https://github.com/retracedhq/k8s-cog](https://github.com/retracedhq/k8s-cog)),
so you are welcome to use any of it if it's useful. Then we have our CI
service send a message (using SQS) after a build is done to deploy to staging.

~~~
AlexB138
Would you mind expanding on why you're using both K8s and Swarm?

~~~
marcc
We don't use both in the same product. Currently we have a product deployed on
k8s and a different one on swarm (well, 1.12 swarm mode, not the original
swarm). We won't keep it this way forever, but we've definitely learned a lot
about managing each in a production environment while running this way.

------
kt9
For Kubernetes, check out [https://www.distelli.com](https://www.distelli.com)

It's a SaaS (and enterprise) platform for automated pipelines and deployments
to Kubernetes clusters anywhere.

Previous discussion:
[https://news.ycombinator.com/item?id=13160218](https://news.ycombinator.com/item?id=13160218)

disclaimer: I'm the founder at distelli.

------
peu4000
I work at an established company and most of our apps are still deployed with
RPM and puppet.

For our dockerized services we use Nomad internally and for a different
product we've built in AWS we're using Elastic Beanstalk with all of the
resources defined in terraform.

We use jenkins to manage the CI/CD for each method.

------
bert2002
[https://mesosphere.com/](https://mesosphere.com/)

------
hiphipjorge
We currently use Docker for all our services in AWS, and we deploy them with
Ansible scripts. Services with a single container are fairly straightforward,
but for services with multiple containers running we use the DR CoN pattern,
which works fairly well. Our Ansible scripts handle everything from deploying
the container, to deploying Registrator, to updating the nginx templates, so
it's fairly automated.
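
(DR CoN being Docker + Registrator + Consul + Nginx.) The wiring looks roughly
like this; a rough sketch only, with the image names and template path
illustrative:

    # registrator watches the Docker socket and registers containers in Consul
    docker run -d -v /var/run/docker.sock:/tmp/docker.sock \
        gliderlabs/registrator consul://consul:8500
    # consul-template rewrites the nginx config and reloads nginx whenever
    # the set of registered services changes
    consul-template -consul consul:8500 \
        -template "nginx.ctmpl:/etc/nginx/conf.d/app.conf:nginx -s reload"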

For CI, we use our own product (Runnable [0]), which allows us to test our
branches with their own full-stack environments, which is great for solid
integration tests. We often use it for e2e tests too. We're planning on adding
more CD features in the near future, though.

[0] [http://runnable.com](http://runnable.com)

------
olalonde
We use Deis ([https://deis.com/workflow/](https://deis.com/workflow/)), which
is a sort of Heroku on top of Kubernetes. For CI, we use CircleCI and
automatically deploy when tests pass on the master branch.

------
sandGorgon
I have been working with the Kubernetes teams on Slack. Kubernetes is definitely
building a lot of the right things from the ground up, but it's like HBase vs
Cassandra - the former needs a full-time dedicated team to get stuff working.

Docker Swarm (especially 1.13:
[https://www.infoq.com/news/2017/01/docker-1.13](https://www.infoq.com/news/2017/01/docker-1.13))
is like Cassandra for me. Yes, it has a few shortcomings, but it allows you to
have a fairly reasonable cluster using a stupid compose.yml file, and very
quickly.

------
backmail
I use Cloud 66 for my side project, [https://backmail.io](https://backmail.io).
All the components are dockerized and deployed/managed through a Cloud 66 stack.
For smaller projects/teams, Cloud 66 provides an easier way to get everything
working, with single-click SSL and easy scaling, and it provides both vertical
and horizontal scaling using either cloud VMs or your own dedicated machines. It
also supports a CI pipeline to build Docker images, though I use my own Jenkins
setup for that.

------
EngineerBetter
We work with folks (very large banks, automotives, governments, manufacturers,
retailers) who use Cloud Foundry, often combined with Concourse to deploy both
apps and the platform itself.

It's surprising how many people want to build a homebrew Kubes PaaS.
When I first started working in development, every company was building its
own CMS, until they invariably realised that it was hard and that they were
better off using a commercial or open source solution. It seems that
container-based platforms are history repeating itself.

------
suhith
I've been using Docker, I love it. Hope to weigh the pros and cons of Swarm
and Kubernetes and try those out too, but for most of my applications
networked Docker containers are sufficient.

------
jordz
We're heavily invested in Azure and their ARM system (Azure Resource Manager).
Our entire infrastructure is code, as ARM templates, which we deploy to dev /
test / production. There are no discrepancies between environments. Our entire
application is then deployed on top. Everything is done through VSTS (Visual
Studio Team Services). We're very happy with it; it's very flexible and we have
a very stable platform because of it.

------
old-gregg
We do this for a living: [http://gravitational.com/managed-kubernetes/](http://gravitational.com/managed-kubernetes/)

This is Kubernetes, plus monitoring of your choice, running on your
infrastructure, remotely managed by our team. The side benefit is that the
same setup works across different infrastructure options, so you can deploy and
run the same stack on AWS and also on-premises / bare metal.

------
energybar
Has anyone tried Docker Swarm or Docker Datacenter? We've been looking at them
but are on the fence versus Kubernetes...

------
ryanbertrand
I have been using Convox to deploy our Docker containers. It has been great
for the past year and is improving daily.

------
usgroup
I really liked fleetd, so it's sad that it's being wound down. It felt Unixy
and was small enough to understand. Now I'm looking toward serverless and total
abstraction of the infrastructure. I kind of see the space in between, filled
by Mesos, Kube and others, as a bit ephemeral.

------
falcolas
Custom wrapper around Amazon ECS. We need more fine-grained control over the
instances (to support encryption, secret injection, log aggregation, and so
forth) than other frameworks provide.

> How do you minimize costs with your solution?

Autoscaling groups triggered off of "cluster capacity".

------
deepnotderp
Docker and nvidia-docker, since it allows pcie passthrough for novideo GPUs.

------
hosh
I'm working for a startup right now. We're using Kubernetes via GKE on Google
Cloud.

Back in 2015, I implemented a Kubernetes cluster by hand on AWS. I'm not going
to do something like that again. GKE is fairly painless and it has most of the
sensible defaults that I want. Networking just works -- pods can talk to each
other as well as to any VM instances from any availability zone and region.
Integrating with GCP service accounts just works. Spinning up experimental
clusters is easy, as is horizontally scaling the clusters. One gotcha is that
Google has not made K8S 1.5 generally available in all regions or availability
zones. Otherwise, upgrades are pretty easy.

I have deployed with Docker Compose (not doing that again -- it is easier to
use shell scripts). I have deployed with the AWS ECS service (not doing that
again; it does not have the concept of pods, which severely constrains how you
deploy). I used to deploy with Chef. I've heard of Chef's Habitat, but have
not played with it.

For the 2015 project, I wrote Matsuri as a framework to manage the
different Kubernetes templates. It's useful if you know Ruby. It uses
idiomatic Ruby to generate and manage K8S specifications and to run kubectl
commands. I wanted a single tool that could work with all the different
environments (production, staging, etc.) as well as manage the dev
environment. For example, if I want to diff my version-controlled spec on dev
against what the Kubernetes master currently has, I would use `bin/dev diff pod
myapp`. If I want to diff the deployment resource of the same name, I would
use `bin/production diff deployment myapp`. I can write hooks specific to the
app. For example, `bin/production console mongodb` uses hooks to query
Kubernetes to find a pod to attach to, determine the current MongoDB master,
and invoke the command to go directly into the MongoDB shell. But I could have
invoked `bin/staging console mongodb` or `bin/dev console mongodb`. I could do
this because I have been developing software for a long time and I have enough
ops experience to be able to put it all together. YMMV.

We're using Go.cd for the CD. I could have used Jenkins, but decided to give
Go.cd a try. Go.cd has some advantages (such as much better topologies and
tracking of value streams), though there are also things it does not do as well
as Jenkins (Go.cd's auth mechanisms blow, and I had to write my own custom
proxy to get GitHub hooks working more securely and reliably). Setting up GCP
service accounts so that Go.cd agents can deploy was a lot easier than I
thought once I read through the GCP docs. (Much easier than AWS.)

Docker containers are still difficult to make. You want to vet things before
using them. Handling this stuff is still going to be a full-time job for
someone, both in terms of designing the infrastructure as well as the
development tools. There are a lot of issues that come up because dev might
throw things over the wall that impact the overall reliability and
performance of the system.

~~~
hiphipjorge
> I have deployed with AWS ECS service (not doing that again; it does not
> have the concept of pods which severely constrains how you deploy)

What have you found are the biggest advantages of pods over containers? How
does ECS constrain how you deploy? Are you simply referring to
rollout/rollback, scale up/down?

~~~
hosh
The last time I used ECS for a production deploy, you could group containers
together (just as you can in Compose). However, there was no easy way to do
service discovery. This made wiring containers together difficult. If I wanted
one container to talk to another, I had to group them and deploy them as one
unit.

That meant I could not horizontally scale one container more than the other. I
could scale the whole group, but there was a lot of wasted resource at that
point.

Kubernetes pods group containers together under a single IP address.
Containers in one pod (one IP address) can talk to any other pod. Docker did
not even have this functionality until 1.12, and that is too little, too late.
(And I am not sure this is something ECS supports right now.) Combined with
label selectors and long-running Service objects (which bind a DNS name to the
set selected by the label selectors), I can horizontally scale pods and still
maintain service discoverability. Using DNS makes service discovery
stupid-easy. This means I can scale Kubernetes pods independently of each other.

Another consequence of using Service objects to select a set based on label
selectors is that routing can now be dynamic. Pods that need to talk to another
pod go through the service. I can then scale the dependency up and down, and it
doesn't really affect the pod that requires that service. I can do rolling
upgrades of the dependency, and it works because the Service abstracts it away
through label selectors.
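
Concretely, the whole mechanism fits in a few lines; a minimal sketch (names
hypothetical):

    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Service
    metadata:
      name: redis
    spec:
      selector:
        app: redis    # any pod carrying this label backs the service
      ports:
      - port: 6379
    EOF
    # clients just connect to the DNS name "redis"; scaling or rolling the
    # pods behind the selector doesn't change that name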

There are still some warts with this setup. StatefulSets still need a
lot of work. I've also found that many applications cache IP addresses (Redis
Sentinel being a notorious example). To work well with Kubernetes, it's better
to always query DNS when making a connection. The Ruby drivers for MongoDB and
Redis, for example, will cache the DNS lookup, making failover fragile (if you
are running MongoDB and Redis inside Kubernetes; if you're not, you won't have
this problem).

I was choosing between Kubernetes and Mesos after ECS, but hadn't looked into
either deeply. It was random chance that took me to Kubernetes instead of
Mesos. Kubernetes solved many of the pain points of Docker Compose and ECS.

------
phillmv
A related question: how often are the people here scaling their applications
up and down?

Do you have large workload spikes, or traffic spikes?

------
jacques_chester
At Pivotal we use BOSH [0] almost exclusively for deploying distributed
systems. The motivating use case was Cloud Foundry [1], but it can be used for
pretty much anything. Our founding role in both of these is why BOSH is our
first choice for such occasions.

It has a plugin model (CPIs) for hosting substrates, so right now it can
deploy and upgrade systems on AWS, GCP, Azure, vSphere, OpenStack and there
are others I forget right now.

It's proved itself in large production systems for years. Every week or two we
entirely upgrade our public Cloud Foundry, PWS, and nobody ever notices.

OK, that's a lie. You get an email from CloudOps: "We're going to deploy
v251". Then a few hours later: "v251 is deployed". Or occasionally: "Canaries
failed, v251 was rolled back".

There's nice integration with Concourse[2,3]. You simply "put" your deployment
and it just gets deployed for you. Our CloudOps team do this now, which makes
their lives that much easier.

Versioning is trivial, especially if you're working in a commit-deploy model
via Concourse.

The downside is that BOSH is BOSH.

We're doing lots of work to make it friendlier and more approachable, but
right now it's powerful and very opinionated. It does not have a smooth
onramp, because the basis of its power and reliability is that it insists on
certain minimum conditions first.

It's really meant for operators, not developers, but at Pivotal the main
consumers by volume are developers. Usually to deploy Cloud Foundry and
Concourse; though my current assignment is actually going to be shipped purely
as a BOSH release.

Disclosure: I work for Pivotal on Cloud Foundry.

[0] [http://bosh.io/](http://bosh.io/)

[1]
[https://docs.cloudfoundry.org/deploying/common/deploy.html](https://docs.cloudfoundry.org/deploying/common/deploy.html)

[2] [http://concourse.ci/](http://concourse.ci/)

[3] [https://github.com/concourse/bosh-deployment-resource](https://github.com/concourse/bosh-deployment-resource)

------
AznHisoka
I just do a "cap production deploy", and it does everything for me (I use
bluepill + god for running background processes too).

I don't need Docker, and I think it's too complex. I deploy to over 50 servers,
so don't tell me it's because I run a simple setup :P

------
zaargy
If you're on AWS, then you should be using ECS first of all.

------
mohanmcgeek
Openshift is a wrapper on top of k8s.

You should just use helm.

------
sslalready
In a team where Node and Golang were the languages of choice, we used GitHub
private repos for code, TeamCity as the driver for CI/CD, and Salt to deploy
the Docker images to our different environments running on AWS EC2 instances.
I must say I really liked TeamCity and its different integrations with GitHub,
build processes (Node/NPM, frontend tooling, ...), and how variables could be
shared down to projects and releases.

To deploy code with Salt, we had an SSH account on the Salt server configured
with a bunch of deploy keys. Each of those had a forced command that would
read _$SSH_ORIGINAL_COMMAND_ and forward this information to an agent (running
as root) that would execute Salt with the correct arguments, based on the
information in _$SSH_ORIGINAL_COMMAND_. This let us use a build step in
TeamCity that basically did _ssh deploy@mgmt-gateway [env] [project]
[version]_. Deployments were logged to New Relic and Slack.
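
For readers unfamiliar with forced commands, the shape of the setup is
something like this; a rough sketch, with every path and script name
hypothetical:

    # in ~deploy/.ssh/authorized_keys: ignore what the client asked to run
    # and always invoke the dispatcher
    command="/usr/local/bin/deploy-dispatch",no-pty,no-port-forwarding ssh-rsa AAAA... teamcity

    # /usr/local/bin/deploy-dispatch
    #!/bin/sh
    # $SSH_ORIGINAL_COMMAND holds e.g. "staging myproject 1.2.3"
    set -- $SSH_ORIGINAL_COMMAND
    echo "$1 $2 $3" >> /var/run/deploy-agent.fifo   # hand off to the root agent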

In a different team that is fond of PHP, we use a private GitLab CE for code
management, GitLab CI Multi-runner as the build agent for CI/CD, and Ansible
for configuration management and code deploys to different environments running
on AWS EC2. As in the previous team, we have configured our .gitlab-ci.yml to
pass some arguments in _$SSH_ORIGINAL_COMMAND_ over SSH to a management node
that in turn talks to Ansible.

Something I like about having a private GitLab CE instance is that development
doesn't stop because your public Git host is DDoSed or has other problems
(like the recently discussed one here on HN).

Test and staging servers are shut down/destroyed off-hours and
restarted/recreated by cron jobs that execute Ansible plays, which identify
eligible EC2 instances via EC2 tags. Production environments with multiple
servers are similarly scaled down during off-hours. By simply
modifying/removing the "shutdown" tag on the AWS resources, teams are able
to exclude their test/staging environments from the scheduled shutdowns,
something which is useful for upcoming releases. ;)
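
In outline, the scheduler side is no more than a cron entry driving a play;
everything named here is hypothetical:

    # crontab on the management node: scale things down on weekday evenings
    0 20 * * 1-5  ansible-playbook /opt/plays/shutdown.yml
    # the play selects eligible instances by tag, e.g. an EC2 filter like
    #   filters: { "tag:shutdown": "true" }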

In the Node/Golang shop I loved how simple the Docker images were and how good
it felt to deploy them to isolated containers. Unfortunately, I don't see how
that's possible (in a clean way, preferably without using two images) when
both an Nginx process (static file serving, e.g. frontend resources) and a
PHP-FPM process need access to the same code release.

(If you have experience with Nginx/PHP-FPM apps and Docker, feel free to
enlighten me!)

Things I'm not entirely fond of about GitLab CI:

- each branch in each repo must have a _.gitlab-ci.yml_ that is up to date (an administrative challenge!)

- it's entirely driven from a _git push_ (though the web GUI provides buttons for _existing_ builds to retry/manually execute steps, e.g. to deploy code)

GitLab has no support for a centrally managed _.gitlab-ci.yml_ file at a
project group and/or project level. There's no way to define variables at a
project group and/or project level. There's no way to schedule jobs so that
you can execute daily/weekly tests, or to manage jobs (in a user-friendly way
via the web GUI) that perform cron-like tasks, which would let you avoid
putting these tasks on the servers themselves in /etc/cron.d (something that
becomes a problem when you restore backups / bake AMIs / do auto-scaling).

I'd love to look more into K8s and Google's cloud offerings, especially since I
believe this might be the future and because I believe Google is lightyears
ahead of the competition when it comes to security and protecting the privacy
of its customers. Unfortunately, I'm afraid it's not viable given my team's
current investment in Nginx/PHP-FPM apps and various AWS services.

~~~
josephjacks
As long as your apps run well in containers (Docker or rkt or others even),
they can run and work well on K8s, which also runs well on AWS. You should
consider K8s to replace imperative CM-based (Salt/Ansible) deployment
mechanisms. The native pod abstraction in K8s can also nicely address the
multi-container composition issue you mentioned.

------
BuuQu9hu
Matador Cloud ([https://matador.cloud/](https://matador.cloud/)) uses nixops
to manage NixOS machines.

