
I Want to Run Stateful Containers, Too - kevindeasis
http://techcrunch.com/2015/11/21/i-want-to-run-stateful-containers-too/
======
kennu
I think the author is implicitly describing the paradigm shift from software
and hardware abstractions to service abstractions. We used to make computers
by soldering electronics together, then by assembling cases and components,
then buying premade servers, then renting cloud capacity, and now we're
starting to rent everything as services.

The cloud is about going from capex to opex and the "capex" now is the initial
work needed to define your own architectures and stacks for every project
(before you get to work on the actual project, i.e. the differentiating part).
Amazon is eliminating most of this by offering building block services that
fit together with little hassle.

So the challenge for open source is how to move on to this era of service
abstraction. It's no longer enough to just provide an NPM package or a
configure script.

I think Docker is in a good position to bring us there, but it's currently
stuck at the stateless container level. Something needs to evolve so that
launching a scalable and auto-maintainable database cluster along with a
connected web application cluster is as easy with Docker as it is by renting a
few Amazon services.

~~~
theseatoms
How does OpenStack stack up in this regard? A step in the right direction?

[https://en.wikipedia.org/wiki/OpenStack](https://en.wikipedia.org/wiki/OpenStack)

~~~
takee
Personally I think it might make sense to run your stateful apps in OpenStack
VMs and the stateless portion in containers.

~~~
alrs
Openstack is an equally terrible place to put stateful apps.

------
bonobo3000
What exactly does containerizing everything give you anyways? I really don't
get it.

Before: use chef/puppet to manage dependencies, distribute config files, run
processes, and maybe use something like upstart to restart on failure.

After: use Dockerfiles to manage dependencies (same thing, a bunch of install
commands). Now you have a container for the web app, one for another service,
etc., so everything is isolated. Great. But what do you gain over running 2
separate processes? That's pretty damn isolated too: apart from sharing a disk,
they each have their own virtual memory, state, config files, etc.

I'm not a (modern) ops expert at all, but I know my way around the command
line. What do you gain from Docker, or, say, launching a mongo instance in the
cloud instead of just renting a server and launching the process? I really
want to know. At least on a small scale, say if you're managing 10-20 servers,
I don't see the point.

~~~
Scarblac
I see a Docker image as sort of a compiled binary of your app and the
environment it runs in. You can install your app, all its dependencies, get all
the config files correct, run tests on the created image, and then you have a
Docker image and can start up any number of containers from it in production.

If any of those steps fails in the middle of some script, you haven't put any
server into some halfway-there state.

If a rollback is needed, you can switch back to the previous image, and changed
requirements won't trip you up.

We used to do it with zipped chroot environments and some startup/shutdown
scripts; Docker is more or less that.

Of course you need to store data outside them, as otherwise you lose it once
you switch to a newer version of the image, but that's easy enough.
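
For concreteness, that flow with plain docker commands looks roughly like this
(image tags, container names and the host data path are made up):

    # build an image per release and run the tests against it
    docker build -t myapp:1.4.0 .
    docker run --rm myapp:1.4.0 ./run_tests.sh

    # start any number of identical containers; state lives outside the image
    docker run -d --name web1 -v /srv/myapp/data:/data myapp:1.4.0

    # rollback: throw the new container away and start the previous image
    docker stop web1 && docker rm web1
    docker run -d --name web1 -v /srv/myapp/data:/data myapp:1.3.2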

All that said, still not really a fan of it.

~~~
TheIronYuppie
Disclaimer: I work at Google on Kubernetes

I'd love to hear more - anything in particular that doesn't feel like a fit?

~~~
Scarblac
Nothing specific, they do what they do quite well.

I have a generic feeling that the rabbit hole is getting too deep though.

In our case we run large servers with virtualisation software (I think they
run on Windows, I never touch this layer). Then we have virtual Linux servers
running on them. They have package managers that are also a way to solve this
sort of problem. We run Docker instances on those servers. They are Linux
again, so have their own package manager. Then we run Python (with virtualenv
or Buildout) or Node (with npm), that again have their own package managers
that try to provide isolated environments. And of course they run bytecode in
the Python or Node...

And that ridiculous stack is used to run some relatively mundane web app that
can't even mutate its data directly, and is used to send some set of
Javascript and HTML and JSON to the user's browser. Which is where the app
actually _runs_...

I wish we had some Web framework as nice to use as Django but compiling to
static binaries that are immediately sort of equivalent to a 12-factor app,
and a kernel that was made for running them directly. Or so.

But this is the internet, right? It literally evolves, just like nature -- we
can only build on top of existing layers, not remove a few and start over.

------
jacques_chester
I agree with the author that writing your own snowflake PaaS is a mistake. I
work at Pivotal on the fringes of Cloud Foundry; OpenShift Origin is a
competing system. Either way, you should be using a full PaaS instead of
rolling your own.

But this is the bit that surprises me:

        They get you hooked for free and the next level
        is $1,496 per month… wtf! MongoLabs is little
        better.

$1500 per month, versus weeks of engineering time spent tinkering with and
upgrading and bug-fixing and trouble-shooting and security-patching a hand-
made solution is a _fantastic bargain_.

Suppose the author was running on Pivotal Web Services: it would've taken less
time to add a MongoLab service (about 2 minutes, 3 if you include a re-stage)
than to perform the CloudFormation calculation (say, half an hour, resulting in
no running software).

PaaSes like Heroku, Cloud Foundry and OpenShift are feature-complete for the
cases that application engineers and operators care about. If you roll your
own you're directing effort to something that doesn't provide user value.

If I walked in on someone at a regular dev company rolling their own operating
system for a web service, I'd be surprised. Writing their own programming
language? I'd be skeptical. Oh you built a new HTTP server? _Why?_ Outside of
a research environment, why are you doing that?

And so it is with PaaSes. The marker was passed years ago. We don't need to go
on these spiritual quests any more.

Disclaimer: I work for Pivotal, which donates the majority of engineering
effort on Cloud Foundry. I'm actually in Pivotal Labs, the agile consulting
division, which is where I morphed into a lay preacher for just-using-a-PaaS-
dammit.

~~~
FooBarWidget
> $1500 per month, versus weeks of engineering time spent tinkering with and
> upgrading and bug-fixing and trouble-shooting and security-patching a hand-
> made solution is a fantastic bargain.

That's only true if you look at it from the point of view of an enterprise, a
startup with VC funding, or generally a "rich" company. If you're a cash-
strapped startup/small business with only a few thousand dollars profit per
month, $1500/mo is a huge deal.

~~~
curun1r
This is especially true when you consider what that same $1500/mo will buy you
using something like DynamoDB or Aurora. Both those solutions will give you
more storage, are managed for you and will mostly scale up with you, meaning
you don't have to start anywhere near $1500/mo.

I know the article said tying yourself to Amazon feels wrong. But focusing
your time and energy on managing infrastructure that could be managed by
Amazon instead of focusing your time and energy on your product feels more
wrong to me. There are exactly zero startups that have succeeded because they
had a more reliable and performant MongoDB installation.

~~~
nzoschke
A big +1 for DynamoDB + S3 for a cash strapped startup.

If you are writing an app from scratch, and the cost of data services is a big
concern, then don't pick a data store with complex and expensive replication
properties.

DynamoDB is admittedly harder to understand and use than Postgres or Mongo,
but once you figure it out it's an HTTP data API with no setup or maintenance
costs.

$8/mo of DynamoDB can easily cover your users and other CRUD.

------
ownagefool
The author is correct. We should be running stateful containers, but until the
software is ready the default advice quite rightly should be not to, unless
you understand the implications.

For example, we run Ceph and use it in Kubernetes. We obviously only run
replica sets (the Kubernetes feature, for those uninitiated) of 1 for these
services. There are also some rough edges with locking for dead nodes, but it's
definitely good enough for running the likes of Jira or GitLab, which have
single points of failure pretty much whatever way you scale them (unless you
build an HA NFS server first).

Before that, we were running test clusters with just fleet, using custom bash
and etcd for service discovery. Works fine, but you're probably better off
with frameworks.

Now, where things can go horribly wrong is if you decide to use a clustered
database and run it in Kubernetes, thinking it's fine since, even though you
can't mount external storage for replicas, you have 3 nodes over 3 zones and
you have backups. Kubernetes will quite happily schedule all of your replicas
on the same host, meaning a single reboot and you're dead. There are hacks
around this, such as defining a hostPort, but of course if someone updates the
RC you could happily be restoring from backup. You could run multiple
replication controllers to get around this, and that'll allow you to mount
storage again, but that takes away a lot of the elegance of the framework.

Point is, you can do it, but you'll need a bunch of ugliness and you'll need
to be careful. If that doesn't sound great, wait until the software properly
supports it. :)

~~~
rco8786
Does kubernetes not have a mechanism for defining host/rack diversity
constraints?

~~~
cpitman
I know that OpenShift 3, built on kubernetes, supports both affinity and anti-
affinity policies ([https://access.redhat.com/documentation/en/openshift-
enterpr...](https://access.redhat.com/documentation/en/openshift-
enterprise/version-3.0/openshift-enterprise-30-administrator-
guide/chapter-9-scheduler)). I don't know if these are part of kubernetes or
part of the extensions custom to OpenShift.

~~~
davidopp_
All of those scheduling policies are also available in native Kubernetes. The
Kubernetes version of that documentation is here:
[https://github.com/kubernetes/kubernetes/blob/master/docs/de...](https://github.com/kubernetes/kubernetes/blob/master/docs/devel/scheduler_algorithm.md)
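
In later Kubernetes releases this is also exposed directly on the pod spec as
podAntiAffinity. A rough sketch of the shape (the fields are from those newer
versions; name, label and image are illustrative):

    cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: db-0
      labels:
        app: db
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: db
            topologyKey: kubernetes.io/hostname    # spread across hosts
      containers:
      - name: db
        image: mongo:3.0
    EOF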

------
mschuster91
Why not run good old classic bare-metal servers (or in VMs, if you like to
decouple hardware from software) with the services being configured either by
hand (using good documentation, something that is a lost art these days) or
with a system like puppet?

The article, unfortunately, stinks of overengineering and overcomplexity for
setting up a SIMPLE FUCKING SERVER ENVIRONMENT.

Or just stick to plain old Apache/Lighttpd with PHP and a standard MySQL/pgSQL
database?

Don't reinvent the wheel just because it's "cool".

edit: also, I don't get why anyone would spend literally weeks reading docs,
learning DSLs etc. just to get a deployment done. My personal maximum time for
getting a "hello world" is two hours, or a day if it's radically new. If the
project's docs (or their quality) are insufficient, DO NOT RELEASE IT. Don't
let your users do your work for you.

------
HorizonXP
So I fought with this over the last two years, and went from running Postgres
in a Docker container, to Amazon RDS after getting frustrated with maintaining
Docker volumes, and now back to using Docker volumes via Kubernetes' volume
attachment.

I think Kubernetes has done a great job at tackling this, at least as a first
pass. Right now, I can attach an EBS volume as my Postgres data store, and not
worry about where the container is running, since Kubernetes handles mounting
the volume. Presumably, I can run an NFS server and have it use that instead
of an EBS volume.
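
For anyone curious, the relevant bit is just a volume of type
awsElasticBlockStore in the pod spec. A minimal sketch (volume ID, image tag
and paths are placeholders; the EBS volume has to exist and be formatted
already):

    cat <<EOF | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:9.4
        volumeMounts:
        - name: pgdata
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: pgdata
        awsElasticBlockStore:
          volumeID: vol-1234abcd    # pre-provisioned EBS volume
          fsType: ext4
    EOF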

Now, I can run backups and slaves however I like. It's not as easy as RDS, but
I have more control now, and it's marginally cheaper.

Anyway, I see where the author is coming from, and we're not totally there
yet, but the problem is being solved.

------
kordless
> Back to my point (I think I have one).

Customers rarely understand the feature set and deliverables they need for a
given use case when there is a major shift in the way the system delivers that
use case. I've been working with a very large company's operations team on a
containerized PoC of a specific developer team's use case, which builds and
tests their SaaS software stack. The ops team can't wrap their heads around
how the software needs to be changed to enable their move to a containerized
solution. They just thought they could "containerize" it for the developers
and move on. Guess we should have been talking to the devs instead.

One point here about all this is that operations teams and developer teams
must work closely together on objectives but their end goals are in direct
conflict when it comes to providing infrastructure for software. Ops doesn't
want the infrastructure to change much because it becomes difficult to scale
and prevents reliable, repeatable root cause analysis when things break.
Nobody wants to wake up in the night and troubleshoot stuff one doesn't
understand when there are bears at the door.

Developers want the infrastructure to be flexible and support doing crazy shit
with their software so they can satisfy customer's demands with use cases and
requirements, and do it faster than the competitor does. The desire for sales
and growth drives the need for it to remain extraordinarily reliable and
scalable.

Immutability of the infrastructure provides a means by which both devs and ops
folks can come together and achieve common goals, while keeping their
objectives met: moving fast (devs) and keeping things reliable (ops).

Wanting stateful containers as a feature is simply a misunderstanding of what
is needed from the infrastructure based on the developer's standpoint of
needing "reliability" from both ops and devs. Reliability of the underlying
infrastructure is brought about by making the container's _deployments_
immutable. Reliability of the software is brought about by keeping state for a
given _configuration_ once it has been proven to work through many iterations
of a given use-case.

------
mrerrormessage
How much of this could be avoided if the application didn't use mongo? Needing
to run a three-node cluster out of the gate seems like a big part of the
problem. Sure, you want backups and redundancy for any database, but there are
situations where a MySQL or pg slave that can be switched on makes more sense
financially, especially if load doesn't require a three-node cluster.

~~~
giaour
That was my first thought, too. If the author didn't insist on using mongo, he
could have opted for an inexpensive RDS setup.

------
enisoc
=== Shameful Plug ===

If you're interested in running MySQL inside Kubernetes, check out Vitess:

[http://vitess.io/resources/presentations.html](http://vitess.io/resources/presentations.html)

We still have work to do, but we're not stopping at the easy part. We show you
how to do replication, sharding, and even live re-sharding of MySQL inside
Kubernetes. We're working on integrating with Outbrain's Orchestrator for
automated failover, and our VTGate query routing service means those failovers
will be transparent to your app.

We are admittedly running into the same limits in Kubernetes around using
replication controllers for datastores. But Kubernetes is improving very
quickly, and there's a reason for that. They have a cheatsheet of one way all
of this can come together successfully in the form of Borg:

[http://research.google.com/pubs/pub43438.html](http://research.google.com/pubs/pub43438.html)

At YouTube, we run our main MySQL databases (the ones with tables for users,
videos, views, etc) inside containers with local storage. Of course, Borg has
a much more mature scheduler, which gives stronger safety guarantees for
replicas. My point is that we've proven this approach can work at scale for
datastores in a container cluster, and through Vitess we're trying to bring
the same capabilities to Kubernetes.

------
skMed
I believe part of what the author is looking for is currently being pushed by
the Deis guys with Helm[1]. Think of it as a package manager for creating
Kubernetes configurations. I think we'll see some stable and reusable
templates for both stateful and stateless services there.

[1] [https://helm.sh/](https://helm.sh/)

~~~
TheIronYuppie
Disclaimer: I work at Google on Kubernetes

We _LOVE_ what Helm is doing, and are actively helping. Please dive in! :)

------
arbitrarytech
Not sure where those estimated costs for running MongoDB on AWS came from. It
jumps from a single t2.micro instance (which you can get for free) straight to
3x m3.2xlarge instances at $1500/month. That's a pretty big jump; there are at
least 6 instance types between the two. For example, 3x m3.medium instances
with 500GB gp2 EBS volumes would cost about $300/month. That's on-demand
pricing; you could probably save some more with reservations given the stateful
nature of MongoDB.
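
Rough back-of-the-envelope math behind that figure, assuming on-demand list
prices of roughly $0.067/hr for m3.medium and $0.10/GB-month for gp2
(approximate, us-east-1):

    # 3 x m3.medium : 3 x $0.067/hr x ~730 hr/month  ~= $147/month
    # 3 x 500GB gp2 : 3 x 500 GB x $0.10/GB-month    ~= $150/month
    #                                          total ~= $297/month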

~~~
Akkifokkusu
The chart comes from this AWS-provided document, actually:
[https://s3.amazonaws.com/quickstart-
reference/mongodb/latest...](https://s3.amazonaws.com/quickstart-
reference/mongodb/latest/doc/MongoDB_on_the_AWS_Cloud.pdf)

------
Lennie
Not sure why the author is complaining about VMs & cloud?

You can get real machines through APIs: not from Amazon, but you can from other
providers like Rackspace and IBM/SoftLayer (and others). He even links to
Bryan Cantrill, so I'm pretty sure Joyent can deliver containers (even Docker?)
on bare metal if you want them.

------
asaikali
Right now VMs are the better choice for stateful use cases like running a
conventional database that expects a real filesystem. In a few years I think
containers and container schedulers will get good at doing persistent volumes.

In the meantime, a lesser-known but great solution for reliably creating VMs
is open-source BOSH [http://bosh.io/](http://bosh.io/), an excellent tool that
will allow you to deploy VMs on most major IaaS platforms, including AWS,
Azure, OpenStack, vSphere and others.

BOSH is hard to learn and it has a different philosophy than typical
configuration management tools such as Chef/Puppet/Ansible etc., but it is
totally worth it: once you have it, you have an amazing power tool at your
disposal.

There are a lot of BOSH releases for popular tools on github.com; for example,
here is one for MongoDB:
[https://github.com/Altoros/mongo-bosh](https://github.com/Altoros/mongo-bosh)

------
stephenitis
One of the huge challenges in solving stateful containers is being agnostic to
the orchestration tool (Docker Swarm, Kubernetes, Mesosphere, etc.), all of
which have different opinions of how clusters of containers should be
orchestrated, while also accounting for variety in what hosting a cluster of
stateful containers means to the user.

I work at ClusterHQ; our team believes the tools we are building, like Flocker,
are going to get the community there.

It's pluggable into orchestration tools and has a model for creating backend
plugins.

Storage backend provider plugins that work with Flocker. [http://doc-
dev.clusterhq.com/config/configuring-nodes-storag...](http://doc-
dev.clusterhq.com/config/configuring-nodes-storage.html#list-of-supported-
backends)

------
siliconc0w
If you're using AWS you can use a pre-task (i.e. a fleet unit file) to run
something like [https://github.com/leg100/docker-ebs-attach](https://github.com/leg100/docker-ebs-attach)
to attach a volume before running your container. You can also do this with
Flocker ([https://docs.clusterhq.com/en/1.7.2/config/aws-configuration...](https://docs.clusterhq.com/en/1.7.2/config/aws-configuration.html))
if you want something fancier.
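
The pre-task boils down to something like this (a hand-rolled sketch using the
plain aws CLI rather than that tool's own interface; volume ID, device and
paths are placeholders), run on the node before the container starts:

    # attach and mount a pre-provisioned EBS volume, then start the container
    INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    aws ec2 attach-volume --volume-id vol-1234abcd \
        --instance-id "$INSTANCE_ID" --device /dev/xvdf
    aws ec2 wait volume-in-use --volume-ids vol-1234abcd
    mount /dev/xvdf /mnt/data
    docker run -d --name mongo -v /mnt/data:/data/db mongo:3.0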

I'd still rather have AWS manage the data. Unless you're a really biggie-sized
company, RDS/ElastiCache are really good ideas. Managing data and databases
are headaches I'll gladly outsource.

~~~
stephenitis
I'm curious: would you be OK with being locked in with provider X's
storage-specific solution (say, an ECS-only way of doing things)? Is the
headache in the setup, the config, or the risk of being at the helm of
orchestrating your own data?

~~~
nullspace
My experience is that you are going to be "locked in" in some way no matter
what. Current infrastructure systems are a mess of vendor specific solutions
and configurations. Migrating from one open source system to another is going
to be just as hard as migrating from a proprietary thing like vanilla ECS to
say kubernetes on bare metal.

There are other concerns like vendor pricing and stuff, but I have not had bad
luck with that.

~~~
TheIronYuppie
Disclaimer: I work at Google on Kubernetes.

I should mention Kubernetes is 100% open source, runs on AWS, Google Cloud,
Azure, Digital Ocean, Vagrant, bare metal, VMWare, Rackspace and lots more I'm
probably forgetting. Then you can pick the cloud you like, and lock-in be
gone.

------
markbnj
The author seems to be mixing some valid observations regarding the difficulty
of stateful containers with some paranoia about the growth of cloud
deployments and ownership of data. Or maybe it's not paranoia. I don't know.
But it's different.

First off, containers don't absolutely have to be stateless. The first and
foremost benefit of containers is dependency isolation and configuration
management. Once you use them for any length of time this becomes clear. You
make them, put them on a machine with a compatible kernel and network access
to the right stuff and they just run.

It's a pretty short leap from containers that just run to the idea of
container orchestration systems like kubernetes. We just deployed a new
staging environment built on Google Container Engine, an implementation of
kubernetes, and it's pretty damn amazing what you can do at the services
layer, and yes, even at the gateway and persistence layers. But you have to
treat the needs of these layers differently.

Statelessness is important at the services layer because ultimately you want
to scale up and down seamlessly and automatically, and kubernetes allows you
to do just that.

In the persistence layer it's the opposite: state is all that's important and
scaling is a more complicated affair. That doesn't mean containers aren't
useful in that layer. They still provide the above-mentioned benefits. The
aforementioned staging environment uses elasticsearch running in a dedicated
kubernetes cluster, where each pod is bolted to a persistent disk at cluster
creation. It also uses a mongo replicaset that is just deployed on instances
in the old manner, but we have a prototype containerized install and will be
moving toward that. Lastly it uses mysql via Google's cloudsql managed
offering.

So you have a lot of differences in the persistence layer, and a lot of
choices for how to manage those differences. Things are a lot simpler and
cleaner at the services layer, but that doesn't mean the benefits of
containers in one layer are somehow less of a win than in the other. After
three years of using and deploying them my feeling is it's pretty much all
win.

~~~
TheIronYuppie
Disclaimer: I work at Google on Kubernetes.

This is a really good point - I've said it before and I'll say it again -
containers neither add to nor subtract from whatever you're doing today. If
you have a single VM and no shared storage, you're exactly as vulnerable as if
you were doing things in a container. And, in the majority of cases, the exact
same techniques you'd use in a VM work in a container or Kubernetes too.

------
bbr
[http://engineeringblog.yelp.com/2015/11/introducing-
paasta-a...](http://engineeringblog.yelp.com/2015/11/introducing-paasta-an-
open-platform-as-a-service.html)

------
NathanKP
While I agree with the author of this article from the perspective of the
engineer who likes to tinker with things and roll my own stuff, I must disagree
from the business perspective.

From the business perspective, a PaaS solution costs one dollar amount and a
custom, hand-rolled, stateful solution built and maintained by engineers costs
another dollar amount.

With offerings from AWS the former will almost always beat the latter. It's not
until you reach Facebook or Google level scale that the monthly savings from
the latter can outweigh the benefits of just using an AWS solution like
DynamoDB or ECS, etc.

~~~
Lennie
I agree about this right now.

But I think enough open source code will be made to fill this space.

It's just a matter of time.

------
phantom_oracle
The complexity of this setup screams 1 major thing to me: security issues

Seeing as we are on this topic, I would like to pose 1 simple and theoretical
question for all those who only need 1 decently-sized 4GB server to get their
small projects running:

=====

What type of setup can be used to get a simple Rails/Sinatra/Flask/Django
webapp + Postgres-DB on a single 4GB server that has to be maintained by a
single individual where time and complexity are highly-valued commodities?

The least-complex setup will be preferred, as the hundreds of 1-man side-
projects will not be able to maintain their 43 container-clusters using
x-software on top of y-software that is managed by z-software.

=====

A good answer here will probably help hundreds of individuals here avoid the
situation of "I should probably containerize my app because everyone else does
it" scenarios.

~~~
TheIronYuppie
Disclaimer: I work at Google on Kubernetes.

I put that disclaimer at the top of all my posts, but this one truly is HIGHLY
biased.

The absolute easiest way to do what you describe is to use GKE (Google's
hosted Kubernetes). For $0.15/hr, we'll manage everything for you, and you can
build out teeny tiny clusters that do everything you need. It's even free for
clusters <5 nodes. Start it up, use a sample app from this directory
([https://github.com/kubernetes/kubernetes/tree/master/example...](https://github.com/kubernetes/kubernetes/tree/master/examples))
and you're done.
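
For reference, the minimal version of that is only a couple of commands
(cluster name, machine type and the sample app chosen are placeholders):

    # spin up a small managed cluster and deploy one of the sample apps
    gcloud container clusters create tiny --num-nodes=1 --machine-type=g1-small
    gcloud container clusters get-credentials tiny
    git clone https://github.com/kubernetes/kubernetes
    kubectl create -f kubernetes/examples/guestbook/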

~~~
IanCal
Is that really the easiest? I couldn't see a sample app there that matched the
description.

One other issue is that's also over 100 dollars per month. I can rent 32GB SSD
backed xeon E3 servers for half that. Or a stack of 10 4G VPS.

------
petar
Have you looked at gocircuit.org and the accompanying language for connecting
templates, Escher.io? They are still not production ready, but they aim to
solve your problem in a general way. The circuit simply says you should be able
to write the logic that builds out your software as programs against a simple
live Cluster API, provided by the circuit. Escher helps mix and match such
functional logic. But the bottom line is this: every framework is a language.
Adding frameworks adds complexity. This is why the circuit reuses the Go
language for its concurrency and abstracts your cluster into a programmable
dynamic data structure.

------
perlgeek
Much of the distributed system management tooling seems to be in its infancy.

For example, if I have one host, I can manage dependencies by adding them to
my debian/control files, and apt-get/dpkg will figure them out for me. If the
services are installed on separate hosts, I'm on my own. I haven't found a
proper solution for managing service dependencies distributed on several
hosts. (Compare
[https://news.ycombinator.com/item?id=10487126](https://news.ycombinator.com/item?id=10487126)).

So when not even the most basic management tasks are solved for distributed
systems, why does it surprise anybody that more advanced state management is
still in the "you're on your own" stage?

I'm sorry, but I can't help my cynicism here.

------
jodok
[https://crate.io](https://crate.io) Co-Founder here. We're participating in
this game by working hard to build a fully distributed, shared-nothing SQL
database.

In our vision, an app in a container is able to access the persistence layer
like SQLite: just import it. Another cluster of containers, preferably having
one instance node-local, takes care of the database needs.

The database is distributed and makes sure enough replicas exist on different
nodes. It's easy to scale up and down, and local resources are utilized
whenever possible (no NAS/SAN-like storage).

------
opennode
OpenNode founder here. This article nicely brings out the reasons why we
started to develop the NodeFabric prototype design: mixing Docker app
containers with highly-available stateful backends. Homogeneous stateful
prebuilt micro-clusters versus large-scale Docker orchestration with
configuration management. Aggressively co-located and highly-available by
design. Simplicity.
[http://nodefabric.readthedocs.org/en/latest/](http://nodefabric.readthedocs.org/en/latest/)

------
erikpukinskis
At the end of the day you want to be robust to failures, right? Be able to
quickly restore a database from backup after a hardware failure?

If so, why not just boot that way every time? It'll keep your backup system
well exercised and it means fewer code paths since you don't have a separate
hot boot.

------
kimi
We run a large stateful service with hundreds of images based on whaleware and
a thin layer of orchestration on top, and it works. We just don't expect
everything to happen magically through an orchestration tool.

------
itomato
Looks like MongoDB is the problem here.

------
itomato
MongoDB is the hub of this problem.

------
sigmonsays
Why not just run lxc and lxd?

~~~
Lennie
That is exactly what we do at work with stateful services. For now.

------
tinco
We run a sharded+replicated mongo cluster and a redundant postgresql array in
docker containers. We had to write our own orchestrator on top of etcd to get
the functionality we wanted. Like the article author, I feel there's some
cultural divide happening. How is it possible that the 100,000-line Java
project that is Kubernetes lacks even basic resource management that our
orchestration tool, written in a few hundred lines of bash and Ruby, does have?

~~~
paukiatwee
* Kubernetes is not written in Java

* There is no one-size-fits-all, just like the never-ending Ruby vs JavaScript vs Java.

* Kubernetes just hit v1.0 not long ago.

* Seems like you already wrote the functionality you want; now what is the problem?

~~~
tinco
* Alright, sorry, Kubernetes is in Go; I must've gotten it confused with some other project.

* Well, alright, but they market it as a general purpose project. I'm just saying that to us it's strange that none of these frameworks out there deal with persistent storage, as the author of the article also observed.

* Ok.

* Well, the problem is that it'd be much nicer if we could just use Kubernetes. Obviously having a homegrown orchestrator is not very nice. There are bound to be lots of edge cases that our ops will run into, which will continuously cause us to perform maintenance on it, and maintaining an orchestration framework is not our core business.

Anyway, this is not necessarily a critique of Kubernetes, it's great software.
It's just an affirmation of the point the article is making, that it's curious
that there's no interest in stateful containers from the maintainers of these
frameworks.

~~~
TheIronYuppie
Disclaimer: I work at Google on Kubernetes.

We do care enormously about stateful solutions; I'll say what I've said
elsewhere - how do you handle stateful services in your VMs?

~~~
tinco
We don't run anything in VMs at the moment. I guess if we would we'd do it
like we're doing with the containers, mount the drives into them. Do you guys
run the search engine in VMs?

