
Introducing Empire: A Self-Hosted PaaS Built on Docker and Amazon ECS - streeter
http://engineering.remind.com/introducing-empire/
======
justinsb
This looks great: simple yet powerful. I'm working a lot with Kubernetes, and
you don't actually need to run an overlay network on AWS (or GCE). On AWS,
there's some VPC magic that surprised me when I first saw it! But I believe
that's beside the point; it's not about ECS vs Kubernetes, it is about what we
can build on top.

In particular, I think the idea of embedding a Procfile in a Docker image is
really clever; it neatly solves the problem of how to distribute the metadata
about how to run an image.
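
For anyone who hasn't seen the pattern: a Procfile is the small Heroku-style file mapping process types to commands, and baking it into the image could look roughly like this (image name and paths are illustrative, not Empire's actual convention):

    # Dockerfile sketch: the Procfile ships inside the image next to the code,
    # so whatever pulls the image also gets the metadata for how to run it
    FROM golang:1.4
    # the copied source tree includes a Procfile, e.g. "web: ./bin/acme-web"
    COPY . /go/src/github.com/acme/web
    WORKDIR /go/src/github.com/acme/web
    RUN go install ./...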

~~~
ejholmes
Exactly! One of our goals was also to make the scheduling backend pluggable,
so we're hoping that the community will implement a Kubernetes backend in the
future. There are a lot of similar concepts between the two, but we ultimately
chose ECS for its ease of operation and the integration with existing AWS
services like ELB.

------
whalesalad
I'm interested in hearing more about how you use this in terms of development
lifecycle. Does a container image get created for every release of your app?
I've always wondered what the right approach to this is.

This is how I currently use Docker:

1) Custom base image with all the things my company needs like supervisord,
libpq, etc.

2) Custom per-service base images like ones with Java for our Clojure services
or Python for our research services which are built off of the base.

3) A release consists of pulling the latest version of the base image (e.g.
acme-python) and then injecting the latest project code into it (roughly the
Dockerfile sketch below).
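
The names here are made up, but the shape is:

    # Per-release Dockerfile sketch: language base image plus the latest app code
    FROM acme/acme-python:latest
    # inject the project code; the base image already ships supervisord, libpq, etc.
    COPY . /app
    WORKDIR /app
    CMD ["supervisord", "-n"]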

My concern here essentially boils down to the image repo. GitHub needs to add
container storage, because while I admire Docker Hub's efforts, I don't trust
it.

~~~
ejholmes
We have a setup that has been working out well for us:

1. We build docker images on every commit, in CI, and tag each one with the
git commit sha and branch (we don't actually use the branch tag anywhere, but
we still tag it); there's a rough sketch of this below. This is essentially
our "build" phase in the 12factor build/release/run. Every git commit has an
associated docker image.

2. Our tooling for deploying is heavily based around the GitHub Deployments
API. We have a project called Tugboat
([https://github.com/remind101/tugboat](https://github.com/remind101/tugboat))
that receives deployment requests and fulfills them using the "/deploys" API
of Empire. Tugboat simply deploys a docker image matching the GitHub repo,
tagged with the git commit sha that is being requested for deployment (e.g.
"remind101/acme-inc:<git sha>").

We originally started maintaining our own base images based on Alpine, but it
ended up not being worth the effort. Now we just use the official base images
for each language we use (mostly Go, Ruby, and Node here). We only run a single
process inside each container. We treat our docker images much like portable
Go binaries.

------
bgentry
Really cool stuff. Seems like you found a good way to hand off most of the
hard stuff to AWS and only do a few key things yourselves to make the
experience better. As such I think Empire has the potential to be a viable
option for many companies, which is something I rarely say about a PaaS
project :)

~~~
ejholmes
Thanks Blake! I think somebody mentioned that we were standing on the
shoulders of giants. I think most of your contributions around this domain
qualify for that :)

------
fosk
This is neat. You might want to check out KONG
([https://github.com/Mashape/kong](https://github.com/Mashape/kong)) instead
of putting a plain nginx in front of the containers/microservices. It is built
on top of nginx too, but it provides all the extra functionality like rate-
limiting and authentication via plugins.

~~~
moatra
KONG definitely looks interesting, and I'd love to know more about it.
However, there's not a lot written about it yet.

For example: I've gone searching through the blog posts, github readme, and
KONG documentation, but I still have no idea _why_ it needs Cassandra. What
does it store in there?

~~~
jkarneges
Kong uses Cassandra for storing config. This makes it easy to run a Kong
cluster. Just add more instances that share the same Cassandra cluster.

~~~
moatra
Is rate limiting state stored in Cassandra?

One of the main graphics on the KONG docs shows a Caching plugin
([http://getkong.org/assets/images/homepage/diagram-right.png](http://getkong.org/assets/images/homepage/diagram-right.png)), but
the list of available plugins doesn't include such an entry. Is that because
caching is built in? Is the cache state stored in Cassandra? Or is the plugin
yet to be built?

~~~
fosk
All the data that Kong stores (including rate-limiting data, consumers, etc.)
is saved in Cassandra.

nginx has a simple in-memory cache, but it can only be shared across workers
on the same instance. So in order to scale Kong horizontally by adding more
servers, there has to be a third-party datastore (in this case Cassandra) that
stores and serves the data to the cluster.

Kong supports a simple caching mechanism that's basically the one that nginx
supports. We are planning to add a more complex Caching plugin that will store
data into Cassandra as well, and will make the cached items available across
the cluster.

------
nickpsecurity
There's plenty about this work that's interesting. The best part to me was
their answer to "why not feature X?" They said they prefer to build on the
most mature and stable technologies, and named a few. Too many teams end up
losing competitiveness by wasting precious hours debugging the latest and
greatest thing that isn't quite reliable yet. Their choice is wiser and might
get the attention of more risk-conscious users.

------
rymohr
Thank you, this looks awesome! As someone who still hasn't embraced docker due
to all the orchestration / discovery madness I really appreciate such an
elegant solution. I love and run everything on AWS so building on top of ECS
is just another selling point.

------
stephenr
Does this really classify as "self hosted" if it's heavily dependent on AWS?

~~~
nickpsecurity
I don't think so. It's their hardware, infrastructure, and engineers hosting
it. They control those things. You get the rest. Sounds like an AWS-hosted
solution with some advertised advantages over other solutions. Definitely not
self-hosted.

Note: I think the only 3rd party thing I'd call self-hosted is colocation
where I delivered the server, they plugged it in, and the most they do is
reboot it for me.

~~~
stephenr
From the point of view of software, I generally consider something self-
host{ed,able} if I can run it on a machine I choose, without enforced
network/environment requirements.

~~~
nickpsecurity
It's a fair viewpoint. I guess my critical point is control: control over the
hardware, its software, legal rights to it, and so on. If they're in control,
it's theirs. How can it be myself if outsiders control or own it?

I guess a combo of philosophical and legal.

~~~
stephenr
I understand your point about control completely!

My highest priority for businesses making these choices is usually _slightly_
more pragmatic, and focuses on avoiding provider lock-in: something you can
install and run on your own local hardware, you can also (in most cases)
install and run on co-located hardware, on rented hardware, on traditional
rented virtual hardware (i.e. VPS), or on "flexible" rented virtual hardware
(i.e. AWS, Azure, etc).

~~~
nickpsecurity
That's a very pragmatic philosophy. I'm especially impressed by your unusual
focus on vendor neutrality, as a lack of it costs many companies millions in
the long run (see IBM & COBOL). Since I mostly avoid clouds, I'm not up to
date on that end. I'd like to attempt your style of things as an experiment in
the future, though.

Do you have a resource or resources for what components, strategies, or
platforms are best for the deployment you describe? Something useful for
production apps, reliable, and easy to move from dedicated hardware all the
way to AWS (or back if necessary). I'm sure there's other readers on my end of
things that might be interested as well.

~~~
stephenr
Thanks. It doesn't always work out - clients/managers often seem to have an
"all the cool kids are using it" and/or "but it's the cloud, everyone uses the
cloud now" mentality, but I try.

I should also emphasise that I'm mostly talking about infrastructure level
"lock in" here - e.g. the artificial lock-in AWS creates for their Load
Balancer and/or Elastic IP service by giving VMs new IPs on reboot, etc.

I definitely prefer Open Source solutions, but I'm one step more pragmatic in
that space too - if a piece of locally-installable but proprietary software
does the job and works with open standards (e.g. if you want to use self-
hosted atmail) I'm _less_ worried/vocal about that than if you say you want to
use Gmail or whatever, but I'd still try to suggest a more open option.

In terms of resources, no sorry I don't have any single resource to go on,
besides a basic rule/test:

Can I demonstrate the full stack being implemented, using one or more laptops
(e.g. using VMs), on a plane or cruise ship? You could equally say "can I test
the full stack while the WAN is disconnected" but that doesn't sound as fun!

I'm actually building my new business around this basic idea - giving smaller
companies a better option to keep more control of their tech without the need
for a full-time sysadmin (which is often financially impossible even if they
_wanted_ one). I genuinely believe the vast majority of things most businesses
want/need to achieve can be done with existing Open Source software; it's just
_usually_ not particularly easy to set up the various pieces and make them
work together.

~~~
nickpsecurity
That makes sense. It has been done before to a degree. For inspiration, look
at Net Integrators' Nitix appliance [1]: a UNIX system that was easier to
configure, self-managing, largely auto-configured, partly self-healing, with
automatic backups, HA support, most applications, and a UI to integrate their
configuration. It was selling well despite being priced above most SOHO
servers. As often happens with good tech, a big firm (IBM) gobbled it up and
rolled the tech into their own stack (Lotus).

A stack like you describe, with the good traits of Nitix-like solutions, could
be great for businesses not wanting much IT overhead. It might spread like
wildfire so long as you don't sell out or balk over patent suits.

[1]
[http://www.pcmag.com/article2/0,2817,1734766,00.asp](http://www.pcmag.com/article2/0,2817,1734766,00.asp)

~~~
stephenr
Thanks for the vote of confidence and the reference!

------
jordanthoms
How do you handle running one-off tasks (consoles, migrations etc) on this
setup? This is something most of these systems seem to ignore...

~~~
ejholmes
We actually have a relay
([https://github.com/remind101/empire/tree/master/relay](https://github.com/remind101/empire/tree/master/relay))
service that can be run alongside Empire that acts as a proxy to interactive
Docker sessions. It's a bit of an experiment right now and something we'd like
to solve better in the future, but it allows you to run containers with `emp
run <command> -a <app>`.
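
For example, to get an interactive shell in a one-off container (app name
illustrative):

    # drop into an interactive shell in a one-off container for the app
    emp run bash -a acme-inc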

~~~
jordanthoms
Interesting! We are in a similar situation to where you were, with a biggish
app on Heroku that we're keen to move over to EC2 to join the rest of our
infrastructure. Definitely interested to see how Empire develops.

------
mixmastamyk
Congrats, not everyone can create a simple elegant platform and write about it
in such an accessible manner. I suppose you're standing on the shoulders of
giants, but still.

This is the level of engineering/communication I always shoot for, and which
(somewhat disappointingly) is rare where I've worked.

~~~
ejholmes
Thanks for the kind words. As a shameless plug, we are hiring:
[https://www.remind.com/careers](https://www.remind.com/careers) :)

------
sagivo
Personally I use dokku
([https://github.com/progrium/dokku](https://github.com/progrium/dokku)). I
would be happy to see one standard "Heroku-like" PaaS, since I feel too many
people are trying to tackle the same problem.

~~~
ejholmes
Dokku is definitely an awesome project (pretty much anything from Jeff Lindsay
is pretty good)! The primary problem is that dokku is meant for just 1 service
and we have quite a few.

We'd love to see one standard too. Personally, I think it's good to have a lot
of competing solutions right now (ECS vs Kubernetes, Docker vs Rocket, etc)
and we'll see things settle in the next couple of years as containerization
becomes more common.

~~~
Zaheer
Not exactly sure what you mean by it being meant for just one service? Do you
mean just one box? I run multiple services/apps on my dokku instance.

~~~
mateuszf
He means multiple instances - for scaling horizontally when traffic increases.

~~~
showkhill
[http://progrium.viewdocs.io/dokku/process-management](http://progrium.viewdocs.io/dokku/process-management)

~~~
dalyons
That's still all on one machine; Dokku can't automatically schedule &
distribute containers/processes across a cluster of hosts like Empire & others
can. If you're doing anything non-trivial you're going to outgrow one machine
pretty soon :)

------
scanr
Instead of nginx, we've had a pretty good experience using vulcand
([https://github.com/mailgun/vulcand](https://github.com/mailgun/vulcand)) as
the front-end router for our micro-services.

~~~
LunaSea
Any reason(s) for switching to vulcand rather than nginx?

~~~
scanr
First off, nginx is awesome, and you can do all the things we did with vulcand
in nginx too, so it was just a question of friction.

The reason we went with vulcand is that it natively supports what we wanted to
do, i.e. route to micro-services based on dynamic etcd-driven configuration.
To do the same thing in nginx (at the time), we would have had to use either
confd or custom Lua.

~~~
loki77
We actually looked at vulcand in an older version of Empire. When we decided
to use a routing layer with this version of Empire, rather than just letting
Empire/ELB expose each service (mostly because it is a lot easier for us to
later shut off public access to each service), we threw together nginx because
it was so simple.

I think at this point every time we move a service we add like 5 lines to an
nginx config, re-deploy the router in Empire, and the service is exposed.

The internal 'service discovery' makes this a lot easier, since we just have
to tell nginx to route to [http://<app_name>](http://<app_name>) - no domain,
no port, nothing more than the app_name, thanks to the DNS/resolv.conf search
path & ELB stuff.
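
Concretely, exposing a new service is roughly this kind of addition to the
relevant server block (app name illustrative, and assuming path-based routing;
the real config may differ):

    # the upstream host is just the app name; the resolv.conf search path plus
    # the internal ELB CNAME take care of resolving it
    location /acme-inc/ {
        proxy_pass http://acme-inc;
    }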

------
smanuel
> We tried Deis briefly but ultimately decided that it was more complicated
> than we felt it needed to be.

That kind of reminds me of [https://xkcd.com/927/](https://xkcd.com/927/)

Sorry if that's not the case. I've also played briefly with Flynn and Deis and
I haven't found anything _that_ complicated that would need a whole rewrite
and a change of the entire approach. Moreover, with Deis I can easily change
providers (DO, AWS, Azure, etc.), whereas with Empire I'm bound to ECS. At
least that was my first impression; I have to read more.

~~~
athrun
IMHO, it's best to see Empire/Deis/Dokku/etc. as a means to an end, not the
end itself.

While _Empire_ itself may be tied to AWS, your app is still a portable,
12-factor, Heroku-compatible app. You can run it elsewhere.

------
UserRights
How to autoscale with this?

~~~
loki77
So at this point Empire itself doesn't deal with things like autoscaling. That
said, the demo CloudFormation template (and the bootstrap script that kicks it
off easily for you) makes use of Auto Scaling groups for the instances that
containers are being run on.

So, in theory you could autoscale just like you always would: monitor stats
for your hosts, and if a bunch of them start to run low on resources, kick off
an autoscaling event.
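
In the simplest case that's just nudging the Auto Scaling group from whatever
watches your metrics, e.g. (group name illustrative):

    # add capacity to the container-instance group when hosts run low on room
    aws autoscaling set-desired-capacity \
        --auto-scaling-group-name empire-ecs-instances \
        --desired-capacity 6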

That said, there's been quite a bit of talk about integrating Empire with
Autoscaling, so that when, say, ECS couldn't find any instances with resources
free for a task, Empire could kick off the autoscaling events for you. Could
be pretty awesome :)

------
phantom_oracle
[http://www.openshift.org/](http://www.openshift.org/)

Just putting this out there in case anyone is looking for an alternate open-
source PaaS.

I've never personally used it before (self-hosted), but it may be something
that someone out there is looking for.

------
floridaguy01
aws is silly expensive. Why didnt you build this on top of digitalocean?
Digitalocean is so awesome right now. They dont even charge for bandwidth
overages.

~~~
serferfish
Digitalocean is silly expensive. Why don't you look at Atlantic.net they are
so awesome and charge way less than overpriced digitalocean. Why not run it on
your laptop which you've already paid for, that would be even cheaper than
overpriced atlantic.net!

~~~
dubcanada
Unless I am missing something, Atlantic.net is 1-10 cents cheaper than
DigitalOcean?

[https://www.vultr.com/pricing/](https://www.vultr.com/pricing/) is 20%
cheaper right now at least.

~~~
serferfish
My old laptop offers much better price/performance, especially when I use free
wifi in coffee shops for bandwidth. The laptop is already paid for so I only
pay for the electricity (on days when I don't plug it into the wall at a
friend's place to save _even_ more). Those prices are just too expensive.
There's always a place willing to do a job cheaper...

