PAAS Comparison – Dokku vs. Flynn vs. Deis vs. Kubernetes vs. Docker Swarm (jancarloviray.com)
128 points by wilsonfiifi on June 11, 2017 | 72 comments

Why is this upvoted so much? It just scratches the surface. Although I don't work much with PaaS, I know a good deal more about three of the tools listed. The article oversimplifies so many things that it ends up getting them wrong.

> WHEN TO CHOOSE DOKKU: for hobby and side projects that do not require high availability

Surely not correct. Don't base your decision on this line. Dokku isn't built for fault tolerance, but anyone unsure which tool to pick is very likely running in a single region of a single cloud provider anyway. I think most companies (even large ones) do that.

> Big caveat (for Kubernetes): it is very hard to setup

I really don't think he tried that. It's really easy on AWS at least. See https://kubernetes.io/docs/getting-started-guides/aws/

> WHEN TO CHOOSE DOCKER SWARM: in my opinion, this has a very strong potential to be a contender against current solutions mentioned above but as of right now, it is still evolving.

AFAIK, it has been extensively tested and is being used in production. I am not sure, but given how much presence it has, it's hard to believe that it is still just evolving and not ready to be used.

Swarm is used, but I have heard of many companies migrating from Swarm to k8s and OpenShift. Swarm is just unstable and its network connectivity is flaky; I'm not sure how you can run it in prod (maybe you can't?).

As for k8s, I would recommend it for anything from hobby projects to large prod deployments. It's easy to set up anywhere.

This article is indeed very shallow, and the author clearly hasn't bothered to do any research.

> This article is indeed very shallow, and the author clearly hasn't bothered to do any research.

Well, yeah, but the concerning thing here is that HN upvoted it to the top. I have nothing against the author.

> As for k8s, I would recommend it for anything from hobby projects to large prod deployments.

That's what I am saying. IMO, its reputation as something to use only if you want to manage a very large number of servers is a little misleading. I think Kubernetes can feel very intimidating at the start, but it's not that hard.

Just today I spun up a single node Kubernetes production cluster. Obviously not its sweet spot, but the ecosystem, flexibility and tooling made it an appropriate choice, given that I already know Kubernetes and that I'll be deploying the same software elsewhere on "real" clusters.

So much is wrong here it's hard to know where to begin. First, PaaS is not container orchestration. Second, this is extremely superficial. Third, some things are flat out wrong. For example, Docker Swarm has been deprecated. Swarm's capability is baked into Docker as of 1.12. For a better overview see: https://insights.hpe.com/articles/the-basics-explaining-kube...

So, the way I deploy code is to SSH into my server, clone my repo from GitHub, set up the db and start the relevant systemd services. If I need to install updates it's a simple git pull and restart of the services. I can use something like Ansible for configuration management to reduce the manual SSH part.
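The routine described above is short enough to script. A minimal sketch in Python, assuming a hypothetical checkout path `/srv/myapp` and hypothetical systemd units `myapp-web` and `myapp-worker` (none of these names appear in the thread). It only builds and prints the commands, so nothing runs until you hand them to `subprocess.run` yourself:

```python
import shlex

# Hypothetical values for illustration; the thread does not name any of these.
REPO_DIR = "/srv/myapp"
SERVICES = ["myapp-web", "myapp-worker"]


def deploy_commands(repo_dir, services):
    """Return the commands for a 'git pull and restart' deploy, in order."""
    commands = [["git", "-C", repo_dir, "pull", "--ff-only"]]
    for service in services:
        commands.append(["sudo", "systemctl", "restart", service])
    return commands


def render(commands):
    """Render each command as a copy-pasteable shell line."""
    return [shlex.join(cmd) for cmd in commands]


for line in render(deploy_commands(REPO_DIR, SERVICES)):
    print(line)
```

Keeping the command list separate from its execution makes it easy to review or dry-run the deploy before touching a server.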

For someone like me, what benefit do these PaaS offerings provide?

Automated repeatability, automated rollbacks, self-documenting infrastructure and the ability to deploy with a single command or automatically on a successful test build.

That means you can merge to master and know the build will just get released when the tests pass without having to do any of that work. Further, as the needs of your infrastructure or DB develop (you are hoping for growth, right?), your deployment stays automated, you can add more complexity to your setup with little additional workload, and you can hire other developers who can safely ship to production without having to be shown any of your method.
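The "released when the tests pass" idea boils down to gating the deploy step on the test command's exit status. A rough sketch, not tied to any particular CI product; `release_if_green` and the stand-in test commands are made up for illustration:

```python
import subprocess
import sys


def release_if_green(test_cmd, deploy):
    """Run the test command; call deploy() only when it exits with status 0."""
    result = subprocess.run(test_cmd)
    if result.returncode != 0:
        return False
    deploy()
    return True


# Stand-ins for a real test suite: one command that passes, one that fails.
shipped = []
passing = [sys.executable, "-c", "raise SystemExit(0)"]
failing = [sys.executable, "-c", "raise SystemExit(1)"]

print(release_if_green(passing, lambda: shipped.append("build-1")))
print(release_if_green(failing, lambda: shipped.append("build-2")))
print(shipped)
```

In a real pipeline `deploy` would be whatever single command ships the build; the point is only that a failed test run never reaches it.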

The question always seems to be how much extra overhead you incur setting these things up in the first place for a relatively small business or side project. I've been curious about these kinds of automation tools for as long as they've been around, but they always look like more trouble than they're worth for any of the projects and deployment strategies I'm involved with at that smaller end of the scale.

Maybe they work better if you're doing a relatively standard DB + API + front-end kind of setup, which as it happens nothing I work on actually is. Or maybe they just don't really pay for themselves until you're working at a scale where a simple copy/clone step and everyday scripting tools aren't sufficient to run everything routine anyway?

I swear by Flynn because it can mount Web Apps, Docker Containers and just anything that runs inside of Docker under Linux.

I often find new sexy stuff on the Hacker News frontpage and just mount it right on my Flynn cluster to fuck with it.

I suppose my question would be why a lot of small systems would even need something like Docker. The theoretical benefits certainly would apply to various projects I work on, but the reality is that businesses at this scale often set up a standard system image and deploy it on a static set of real servers. Once that's done, the foundation might not change for months or years at a time.

There are usually two significant exceptions. One is security updates, which are typically monitored and tested/deployed via a separate process anyway. The rest is the day-to-day development work and deploying new assets, which typically needs no more than some sort of copy/clone job from the VCS to get the data to the servers and then running a single script to deploy/activate everything.

If you're working with hundreds of servers or dynamically reconfiguring things all the time in a Cloud environment, obviously the situation may be different. However, for the simpler cases -- and you can get a long way with them if you're just running a normal business and not trying to be the next unicorn -- the whole process already seems reasonably reliable and efficient anyway.

I'm a software guy rather than an operations specialist, as are the people I work with in almost all cases, so I always have some nagging doubt that we just have a total collective blind spot here. But so far, speaking only about the smaller scale and more static deployments I tend to work with, these kinds of tools seem like solutions to problems we don't often have.

I'm a software guy too (https://github.com/WriteCodeEveryday), but I have spent an entire 3 years of my life building prototypes and software for small businesses, that just aren't technically savvy and don't have large IT departments.

Once I deployed / configured my fifth Postgres database with Rails, I decided enough was enough, I needed to move faster in my Operations setup since I'm the "ops guy". If you're in a company that has dedicated capable ops guys, you don't need a platform, but if you don't have those guys, you will need one desperately.

Flynn's decent for my use cases because those very things you say you need done -> "deploying new assets", "copy/clone job from the VCS", and "running a single script to deploy/activate everything" are wicked simple and supported right out of the box securely, so instead of handing over the SSH keys to a developer, I can give them the Flynn key and they don't get access to the underlying infrastructure, just a basic API for doing development tasks.

Additionally, the backup functionality allows me to back up a single app or a full cluster and upgrade our infrastructure as it's running.

If you've never looked into Docker, or Flynn, take this post, set aside a $60 budget for DigitalOcean and try it out, I can't guarantee it will be perfect for you, but it's perfect for me. All hail /u/_lmars and /u/titanous.

I've looked into Docker, and various other devops-style tools, in recent years. I guess I just don't see what they usefully achieve in the context I'm talking about.

I mean, for all of the projects I'm thinking of here, we already have full version control of all of our assets. We can deploy an update in about upload time plus a few seconds with a single command, and roll it back with a single command as well. Security updates are typically deployed directly from their own upstream repos, again normally with a single command after any due diligence we feel is necessary. Installing the OS and other foundation packages was more substantial in some cases, but mostly because of the various non-standard things any given project might be using in addition to the routine OS + web + DB stack, and we'd surely have to set that up regardless of whether we were making some sort of master filesystem image to clone directly or some sort of image for use in a container system. And we typically run test infrastructure identical to production in both hardware and software, with deployment to that rather than a live system requiring just a couple of simple changes to configuration.

So given the above, how would introducing an extra layer of infrastructure and a bunch more tools really help us? I'd be happy to look into these things in more detail if someone can tell me what I should be looking for and what problem it solves that we don't even realise we have, but none of the (quite a few) resources I've read/watched about them over the past several years has suggested a compelling advantage for smaller, statically hosted infrastructure. Again, the advantages seem pretty clear for people managing, say, hundreds of systems of various types that are spun up on demand and discarded when no longer useful on some sort of Cloud platform, but eliq's original question and my follow-up relate to a different sort of environment.

Flynn's key selling point is easy setup and load-balancing.

You should try it out yourself, and see if it eases some of your pain points.

It also includes rollback of code (right from the UI, and every ENV variable change / worker change is a new release) and continuous deployment options (ensuring your code isn't broken when deploying a new version and deploying while ensuring your service never goes offline)
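To make the "every change is a new release" idea concrete, here is a toy model in Python. It is an illustration of the concept only, not Flynn's actual data model or API; the `Release` and `App` classes are invented for this sketch, and worker scaling is left out for brevity:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Release:
    """Immutable snapshot: code version, environment and worker count."""
    code_version: str
    env: dict
    workers: int


@dataclass
class App:
    """Append-only release history plus a pointer to the active release."""
    releases: list = field(default_factory=list)
    current: int = -1

    def _push(self, release):
        self.releases.append(release)
        self.current = len(self.releases) - 1

    def active(self):
        return self.releases[self.current]

    def deploy(self, code_version):
        # New code inherits env and worker count from the active release.
        if self.releases:
            base = self.active()
            self._push(Release(code_version, dict(base.env), base.workers))
        else:
            self._push(Release(code_version, {}, 1))

    def set_env(self, key, value):
        # A config change is a release too, so it can be rolled back.
        base = self.active()
        self._push(Release(base.code_version, {**base.env, key: value}, base.workers))

    def rollback(self):
        # Rolling back only moves the pointer; history is never rewritten.
        if self.current <= 0:
            raise ValueError("no earlier release to roll back to")
        self.current -= 1


app = App()
app.deploy("v1")
app.set_env("DEBUG", "1")
app.deploy("v2")
app.rollback()
print(app.active())
```

Because code and configuration are versioned together, undoing a bad deploy and undoing a bad ENV change are the same operation.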


Could you suggest any good places to start, please? I haven't looked at Flynn specifically before, but I did take a look at its website on your recommendation. Unfortunately, the videos and explanatory text I've found so far don't seem to be any better than others I've seen before in this area: plenty of buzzwords, but little real substance or context to help someone understand what the tool offers or how it works. Are there some better sources I could look into?

Go through the basics, and then install the flynn-cli and create a cluster.

You may additionally want to purchase a cheap domain ($0.99 special is enough, doesn't need to be fancy).

Docs: https://flynn.io/docs/basics https://flynn.io/docs/installation

Thanks, but those are exactly the pages I was looking at before and didn't find very helpful.

Some examples of what got in the way for me, as someone totally new to Flynn:

I don't know what a "Flynn cluster" is, but it's obviously a central concept and mentioned almost immediately on both of those pages.

There are lots of impressive-looking commands doing impressive-looking things quickly in the videos, but there's not a single mention of where you're actually running them, where any other systems they are affecting came from, or what they're actually doing for that matter.

In fact, nothing on those pages tells me what Flynn does. I spent probably an hour or so exploring their site after we swapped posts the other day, but it was an exercise in frustration and I'm still none the wiser about what its scope or feature set actually is. (For comparison, that's longer than the total time it took me to write all the scripts we use to do the automation for one of those projects I mentioned before!)

Anyway, I don't want to take any more of your time on this, so I'll stop there. Thanks for trying to help, even if it didn't work out in the end.

Flynn's actually great for this kind of stuff. You can deploy a git repo directly and then provision your DBs (Redis, Postgres, MySQL) right from the dashboard.

Doesn't Flynn require a minimum of 3 servers? What if I'm only running one?

Recommended is 3 servers, 2 CPU cores, 2 GB of RAM.

However, single server configurations are perfectly acceptable for personal clusters.

I would strongly recommend taking a cluster backup when you get it up and running though, since single-server configurations do not deal well with unexpected power failures.

Is there a reason for this? When I see something like that, it makes me question what it'll be like for larger clusters.

Because in larger clusters, the cluster can fix any issues if one host goes offline (load balancing) or a database gets corrupted.

In a single-server configuration, anything going wrong can destroy your cluster's capability. Think of it as RAID: the more drives, the more you can do for high availability.
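The RAID analogy can be put into numbers under one assumption the thread does not state: that the cluster's coordination layer needs a strict majority of hosts to stay reachable, as majority-quorum consensus systems do. Under that assumption, a quick Python sketch shows why three hosts is the smallest size that survives a failure:

```python
def quorum(hosts):
    """Smallest number of hosts that forms a strict majority."""
    return hosts // 2 + 1


def tolerated_failures(hosts):
    """How many hosts can fail while a majority is still reachable."""
    return hosts - quorum(hosts)


for n in range(1, 6):
    print(f"{n} host(s): quorum {quorum(n)}, survives {tolerated_failures(n)} failure(s)")
```

One and two hosts both tolerate zero failures, which is why a single server works fine for personal use but gives you nothing on the availability side.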

Running a minimum of 3 servers is just a recommendation, you can run Flynn on a single server (many people do), but then you obviously don't benefit from the HA properties.

Found you, Lewis Marshall. You'll never guess who I am!

Ignore the reply below; what you do has no equivalent, unless a public cloud provider automates it.

Edit: I may have seen only the one comment describing the benefits. Those don't bring a net gain for OP's use case. Managing any of the mentioned platforms is hugely more complicated than ssh + git clone.

If you have something that works and you don't need to manage hundreds or thousands of applications in a uniform way, I'd stay with what you trust.

Edit: Though as PaulRobinson points out, before thinking about a PaaS you ought to think about CI/CD.

I mean I think Cloud Foundry is the bee's dancing knees, but for a single app it's probably massive overkill.

Seriously why is this anywhere near the top of the page? It's basically two to three sentences about each and barely scratches the surface of each technology.

We used Flynn over the last 12 months (tech agency, lots of fairly small projects) but gave it up in favour of Dokku. Every single one of our nodes eventually died, and we were unable to recover them. They chewed up disk space continuously and eventually ran out of disk and then were not recoverable. Other times they just died and would not restart all the required services. This was not even as part of a cluster; just single nodes. We raised issues and messaged on IRC but nothing happened. We really wanted to like it, but that was our experience. There's a lot of magic going on behind the scenes, which makes it really hard to even start to try and resuscitate a dead node. Sorry Flynn!

We moved to Dokku and are really happy with it. It's not H/A but it's mega simple and it works really well. It's basically just Docker, buildpacks and the networking to wire everything together. The plugins also mean it's more useful than Flynn - if you need something like ElasticSearch you just install the plugin. Ditto LetsEncrypt etc etc.

I am personally using Rancher with Docker to deploy my apps. I like the web-ui and 1.6.1 is stable to use in production.

Yeah, Rancher is great with Kubernetes too - It sets it up for you on your favourite infrastructure provider and launches instances for you.

Is it? 1.1.x looked nice when I evaluated it (especially the Web UI), but was a real nest of bugs in the long run. I had issues with almost every component. I usually found matching bugs that were already filed and had workarounds described, but it left a bad aftertaste.

Finally, the DNS component nearly broke down and I switched to Compose with some WeaveNet and semi-manual (re)scheduling. I lost some useful functionality (and the UI), but at least the switch fixed things and it works. I wanted to use Swarm for fully automated scheduling without a DIY external watchdog system, but its networking is extremely limited at the moment (and I haven't grokked how to use it with Weave).

Kubernetes looks nice, but the amount of YAML I'll have to write scares me away. And I'm not sure I trust all that "we do magic" stuff anymore. The more complicated a system is, the harder it is to figure out what to do when it starts to fail.

There's now http://ksonnet.heptio.com to make it much easier to get started with k8s and eliminate the need to write YAML

Somewhat-related question:

I have a cluster of ~6 machines that I'm using to play with containers and container orchestration tools and so on. I specifically want to be able to

1. Have a non-cli interface for management, and especially not have to write a bunch of YML files myself, and

2. Be able to specify exactly which machine in the cluster a specific container ends up on.

Something like Openshift Origin seems to be the way to go, but is there another option worth thinking about?

Flynn is the best option for #1. It's a full CLI written for humans, with a web dashboard. It can mount anything that runs under Linux: Dockerfiles or simple web apps (through Heroku buildpacks).

In regards to #2, this is a bit tricky, the way it works is that you point a domain to it and the apps are given a subdomain and load balanced.

OpenShift uses Kubernetes, which allows that degree of precision. I'm not familiar with its GUI though.

I believe DCOS has a really nice interface.

The interface of DCOS really is excellent; the default project for Kubernetes doesn't even come close, especially since the 1.9 release of DCOS polished everything up. It does have its quirks though, which do force you to use the JSON editor sometimes.

But the openshift UI is also very nice, I've been using it through the CDK with minishift and it's an excellent interface for kubernetes/openshift.

Even the OpenShift UI forces you to drop into a yaml editor every now and then. Kubernetes is too powerful to really represent in any UI without a ton of work - which is really the central decision when adopting Kube. Do you want the power available if you need it? Or do you want something simpler that might constrain a future choice?

Of note, there's a structured web editor for any API resource in Kube, OpenShift, or extension being prototyped now that I'm hoping helps bridge that gap.

Also, check out the latest nightlies or 3.6.0-alpha.2 for a bunch of massive improvements to the overview - I think it's easily the biggest enhancement over the last few years.

There are some nice features of the UI in OpenShift that I'm missing in the kube dashboard, such as the terminal.

Also the ability to split up access to workspaces in OpenShift could be useful, though I don't know if that's accomplished with actual API objects or whether it would map onto kube.

I love this JSON Editor https://jsonformatter.org/json-editor

#2 would probably be a combination of "stateful sets" and "label selectors" in kubernetes.

Generally, though, you're fighting the high level idea of these tools by trying to tie to specific hardware.
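For the simplest form of pinning, a label selector on the node's hostname label is enough. A sketch in Python that generates the manifest as JSON (which `kubectl apply -f` accepts alongside YAML), so no YAML has to be written by hand. The pod name, image and node name below are hypothetical; `kubernetes.io/hostname` is the standard label Kubernetes puts on each node:

```python
import json


def pinned_pod(name, image, node_hostname):
    """Build a Pod manifest that may only be scheduled on one named node."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # The scheduler only considers nodes whose labels match this selector.
            "nodeSelector": {"kubernetes.io/hostname": node_hostname},
            "containers": [{"name": name, "image": image}],
        },
    }


# Hypothetical names for illustration only.
print(json.dumps(pinned_pod("legacy-db", "postgres:9.6", "node-3"), indent=2))
```

If the named node is down, the pod simply stays pending, which is the trade-off of tying a workload to specific hardware.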

I guess it should've occurred to me to ask why notamy wants that. HA? Hardware-specific differences maybe? Those are the uses I know of.

It's an issue of hardware differences + some of the software not being designed to be clustered Kubernetes/etc.-style (and it's not mine so I can't just rewrite it).

OK, that makes sense. But why use a platform that abstracts physical layout when you already have a physical layout in mind?

This isn't really something that I have experience with, hence asking; my issue is mainly not really knowing the tools available, therefore I'm looking in the wrong place.

Gotcha! I hope I was helpful.

Normally in these circumstances I'd recommend BOSH but, off the top of my head, I don't know what kind of affinity can be defined. So it might be another rabbit hole.

> Generally, though, you're fighting the high level idea of these tools by trying to tie to specific hardware.

I figured as much, but I'm trying to avoid inventing my own solution if one already exists.

However, you can run Flynn in single machine mode, which guarantees your hardware will be set, and you get the best of both worlds (CLI + GUI + Specific Hardware)

Docker Swarm is the simplest setup for a fairly sophisticated PaaS. It works off a Docker Compose file that carries port bindings, network configuration, etc. You just run "docker stack deploy". Adding multiple nodes to a cluster is a one-line command.
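As a rough illustration of how little is involved, the Python sketch below writes out a minimal version 3 Compose file and the matching deploy command. The service name, image, ports and stack name are all hypothetical, and the snippet only prints text; it does not call Docker:

```python
def compose_file(service, image, published_port, target_port, replicas):
    """Return a minimal version 3 Compose file for a single replicated service."""
    return (
        'version: "3"\n'
        "services:\n"
        f"  {service}:\n"
        f"    image: {image}\n"
        "    ports:\n"
        f'      - "{published_port}:{target_port}"\n'
        "    deploy:\n"
        f"      replicas: {replicas}\n"
    )


def deploy_command(path, stack):
    """Return the command that deploys the Compose file as a Swarm stack."""
    return ["docker", "stack", "deploy", "--compose-file", path, stack]


# Hypothetical service and stack names for illustration only.
print(compose_file("web", "nginx:alpine", 8080, 80, 3))
print(" ".join(deploy_command("docker-compose.yml", "demo")))
```

The cluster itself is created with `docker swarm init` on the first machine, and other machines join with the `docker swarm join` command that init prints.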

Do you mind sharing more from the standpoint of security? What can a user expect to maintain, do, or set up to be as secure as possible?

Docker Swarm has TLS overlay network support built in, so it is secure by construction. Docker Secrets is built in (https://docs.docker.com/engine/swarm/secrets/) and is pretty cool.

Not sure what else you are looking for.

Thanks. That's a solid start. :)

i think the article is correct in its presentation though.

while the docker swarm service looks very promising and will probably prove itself to be the best for small to medium and maybe even big corps, it's most definitely not battle tested yet, as it's just too new.

you actually need to add unstable repos on most distros to use the 'stack deploy' option. quite a few are still <1.12

> you actually need to add unstable repos on most distros to use the 'stack deploy' option. quite a few are still <1.12

Same with k8s - package revisions are at the mercy of distro maintainers. However, I have found that Docker packages are far more up to date than k8s ones.

BTW Docker Swarm is fairly well tested - but I think it is called Docker Enterprise Edition in its well tested form.

There might be edge cases where k8s is better than Docker Swarm. However, I personally believe that in 90% of startup-style deployment use cases it's worth not having to fight with ingresses.

One of the engineers from my company, Nanobox (a micro-PaaS), compares a few of these tools and others in context of how well they cover the development to production workflow: https://content.nanobox.io/development-to-production-workflo...

Tools included on this chart include: - Heroku - Kubernetes - Flynn - Nanobox - Openshift - Dokku - Rancher - Convox - Mesosphere - Hyper.sh

Surprised not to see an ECS shoutout yet. I'm curious what the rap is on that that makes people walk right on past the AWS default option.

I've been really happy with ECS. Tons works out of the box. Last piece for me was figuring out a good way to do cron, but I just figured that out. http://blog.ratelim.it/blog/cron-jobs-on-amazon-ecs-with-dat...

Glad you figured it out, but they just announced native scheduled tasks: http://docs.aws.amazon.com/AmazonECS/latest/developerguide/s...

Hah! Le sigh. That's hilarious. Thanks for the pointer, can't believe I missed that.

All the options in the article can be self-hosted or run anywhere you like, ECS is proprietary and Amazon only, which is likely why it was excluded. Heroku wasn't in the comparison either.

Couple additional PaaS that could be on the list as well: cloudfoundry, deus, tsuru, openshift (although that's mostly covered by kubernetes).

Flynn looks very interesting nowadays (it's been a while since I last checked it). I especially like the fact that they are trying to handle stateful (databases for instance) environments too. But after reading through the website and the Go code, I'm still a bit lost regarding what kind of guarantees/architecture this implies. Has anyone had any experience deploying database appliances with this?

A first note, this is not like-for-like. Dokku, Flynn and Deis fit the PaaS spot more meaningfully. Kubernetes and Swarm are sometimes called "Container as a Service" -- you still need a bunch of work to select, install and configure features that PaaS users take for granted.

Missing from the list: Cloud Foundry. And OpenShift! Give Red Hat some respect here for making Kubernetes into a platform. And DCOS, which I think got its thunder stolen by Kubernetes.

But I mostly know Cloud Foundry, because it's my day job to keep making it better.

It's the most mature out of all of these, has been soak tested to 250,000 running applications[0], can be deployed to any major IaaS or bare metal, comes with routing, logging, service injection, healing, no-downtime upgrading and I forget what other headlines I usually pick out of the several hundred features it now includes.

The Cloud Foundry Foundation includes Pivotal (my employer), IBM, SAP, Google, DellEMC, VMWare, Cisco, Suse and those are just the fancy tech names.

The reason you don't hear about it is that we and our partners have focused on competing for enterprise sales. It's not how you get publicity on HN, but it does mean that PivotalCF -- our commercial distribution -- has the fastest-growing sales of an opensource-based product in history. And sales are still zooming up and to the right.

The nice thing is that we and other partners get very broad, specific feedback from customers who are already at massive scale and who expect utter, non-negotiable reliability. We run a public service ourselves (Pivotal Web Services), SAP have HANA Cloud, IBM runs BlueMix.

We and other Cloud Foundry contributors have had the benefit of that dogfooding and feedback for longer than any other container-based platform in existence. And it turns out, there are so many things that can go wrong. So many. It's crazy.

Lastly, Cloud Foundry teams have massive investments in project automation. This gives us two capabilities: one is that we can roll out complete rebuilds of the whole platform within hours of an upstream CVE patch. Users can then apply these fixes to their platform live, in-place, without any running apps noticing that it has occurred. BOSH[1] will roll out the deployment with canaries, and Diego[2] will relaunch containers as they are reaped during upgrade.

The second capability is that we are confident in making very deep, very aggressive changes if it proves necessary, because we have tests upon tests upon mountains of more tests. And again: nobody notices, except that their platform gets faster or becomes reliable under even more extreme circumstances.

If I sound like an utterly biased fan, it's because I am an utterly biased fan. I've watched this thing evolve up-close for years. It is amazing.

We build installable superpowers.

Disclosure: If it's not obvious, I work for Pivotal on Cloud Foundry.

[0] https://content.pivotal.io/blog/250k-containers-in-productio...

[1] http://bosh.io/

[2] https://github.com/cloudfoundry/diego-design-notes

Minor edits for style.

Is it still the case that CF requires a ton of overhead to get running? Last I checked there was no easy way to set up CF without:

- Running on a cloud

- Spinning up ~8 different servers, each with ~2GB of memory iirc

Kinda a lot of overhead if you're just running a few dev environments like I was.

I get this question a lot.

PCFDev[0] if you just want a dev environment to tinker with. You can also tinker with cflocal[1] (unsupported) or BOSH-lite[2] (somewhat-supported) if you're looking to practice ops as well.

A lot of Cloud Foundry teams do integration testing with BOSH-lite on large single VMs, because it essentially presents BOSH with a "VM" interface that spins up nested containers instead.

You can, in theory, deploy however few or however many real VMs, or bare metal servers, as you wish, by editing a BOSH manifest.

By default the cf-deployment[3] manifest uses something like 23 VMs of varying sizes. That's because the default manifest is intended for production use with multi-AZ HA as the minimum level of reliability. If you don't need that, one of the alternatives I noted will get you what you want.

[0] https://pivotal.io/pcf-dev

[1] https://github.com/sclevine/cflocal

[2] https://github.com/cloudfoundry/bosh-lite

[3] https://github.com/cloudfoundry/cf-deployment

But to be fair, Pivotal Cloud Foundry is so expensive it's not even in the same league as the others and I can totally understand why the article just ignored it altogether.

PivotalCF is expensive, but that's due to the clientele we target and the various added extras we can provide.

OSS Cloud Foundry is free, and so is BOSH. A lot of companies use those and are perfectly happy with them.

Red Hat OpenShift undercuts us on price, I think largely because they figure they'll get to do the razors/razorblade thing with RHEL riding in behind it.

Mesos should probably be on the list as well.

Absolutely. I made a side reference through DCOS, which is the PaaS-packaged Mesos.

Some of those aren't PaaS, but container orchestration frameworks.

If you're reading this list of comparisons out of genuine interest, I would suggest also looking into Nanobox: https://nanobox.io

help me imagine using flynn in production - these seem like deal breakers

upgrading flynn requires a backup and restore

backups are full backups only, no incremental

recommended filesystem is s3, azure or google since postgres is not reliable

>postgres is not reliable

That will require a pretty strong argument.

You want to invest in what will win, and that includes your own skills. K8s is the winner, IMO. The writing is already on the wall.


I think Flynn should rename, just saying.

