> WHEN TO CHOOSE DOKKU: for hobby and side projects that do not require high availability
Surely that's not correct. Don't base your decision on this line. Dokku isn't made for fault tolerance, but anyone who has doubts about which tool to use is very likely to be running in a single region of a single cloud provider anyway. I think most companies (even large ones) do that.
> Big caveat (for Kubernetes): it is very hard to setup
I really don't think he did that. It's really easy for AWS at least. See https://kubernetes.io/docs/getting-started-guides/aws/
> WHEN TO CHOOSE DOCKER SWARM: in my opinion, this has a very strong potential to be a contender against current solutions mentioned above but as of right now, it is still evolving.
AFAIK, it has been extensively tested and is being used live. I am not sure, but given the amount of presence it has, it's hard to believe that it is still just evolving and not ready to be used.
As for k8s - would recommend it for anything from hobby projects to large prod deployment. Easy to set up anywhere.
This article is indeed very shallow, and clearly the author hasn't bothered to do any research.
Well, yeah, but the concerning thing here is that HN upvoted it to the top. I have nothing against the author.
> As for k8s - would recommend it for anything from hobby projects to large prod deployment
That's what I am saying. IMO, its reputation as something you only use if you want to manage a very large number of servers is a little misleading. I think Kubernetes can feel very intimidating at the start, but it's not that hard.
For someone like me, what benefit do these PaaS offerings provide?
That means you can merge to master and know the build will just get released when the tests pass without having to do any of that work. Further, as the needs of your infrastructure or DB develop (you are hoping for growth, right?), your deployment stays automated, you can add more complexity to your setup with little additional workload, and you can hire other developers who can safely ship to production without having to be shown any of your method.
Maybe they work better if you're doing a relatively standard DB + API + front-end kind of setup, which as it happens nothing I work on actually is. Or maybe they just don't really pay for themselves until you're working at a scale where a simple copy/clone step and everyday scripting tools aren't sufficient to run everything routine anyway?
I often find new sexy stuff on the Hacker News frontpage and just mount it right on my Flynn cluster to fuck with it.
There are usually two significant exceptions. One is security updates, which are typically monitored and tested/deployed via a separate process anyway. The other is the day-to-day development work and deploying new assets, which typically needs no more than some sort of copy/clone job from the VCS to get the data to the servers and then running a single script to deploy/activate everything.
If you're working with hundreds of servers or dynamically reconfiguring things all the time in a Cloud environment, obviously the situation may be different. However, for the simpler cases -- and you can get a long way with them if you're just running a normal business and not trying to be the next unicorn -- the whole process already seems reasonably reliable and efficient anyway.
I'm a software guy rather than an operations specialist, as are the people I work with in almost all cases, so I always have some nagging doubt that we just have a total collective blind spot here. But so far, speaking only about the smaller scale and more static deployments I tend to work with, these kinds of tools seem like solutions to problems we don't often have.
Once I deployed / configured my fifth Postgres database with Rails, I decided enough was enough, I needed to move faster in my Operations setup since I'm the "ops guy". If you're in a company that has dedicated capable ops guys, you don't need a platform, but if you don't have those guys, you will need one desperately.
Flynn's decent for my use cases because those very things you say you need done -> "deploying new assets", "copy/clone job from the VCS", and "running a single script to deploy/activate everything" are wicked simple and supported right out of the box securely, so instead of handing over the SSH keys to a developer, I can give them the Flynn key and they don't get access to the underlying infrastructure, just a basic API for doing development tasks.
Additionally, the backup functionality allows me to back up a single app or a full cluster and upgrade our infrastructure as it's running.
If you've never looked into Docker or Flynn, take this post, set aside a $60 budget for DigitalOcean and try it out. I can't guarantee it will be perfect for you, but it's perfect for me. All hail /u/_lmars and /u/titanous.
I mean, for all of the projects I'm thinking of here, we already have full version control of all of our assets. We can deploy an update in about upload time plus a few seconds with a single command, and roll it back with a single command as well. Security updates are typically deployed directly from their own upstream repos, again normally with a single command after any due diligence we feel is necessary. Installing the OS and other foundation packages was more substantial in some cases, but mostly because of the various non-standard things any given project might be using in addition to the routine OS + web + DB stack, and we'd surely have to set that up regardless of whether we were making some sort of master filesystem image to clone directly or some sort of image for use in a container system. And we typically run test infrastructure identical to production in both hardware and software, with deployment to that rather than a live system requiring just a couple of simple changes to configuration.
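For readers who haven't seen this style of setup, here is a minimal sketch of what a single-command deploy and rollback can look like. It assumes a POSIX system and a layout where each release is copied into its own directory under `releases/` and a `current` symlink points at the live one. The layout and the names `deploy` and `rollback` are illustrative assumptions, not taken from any particular tool or from the projects described above.

```python
import os
import shutil
from pathlib import Path


def _point_current_at(root: Path, release: Path) -> None:
    """Repoint root/current at a release by renaming a fresh symlink over it."""
    tmp_link = root / "current.tmp"
    tmp_link.unlink(missing_ok=True)  # clear a leftover link from a failed run
    os.symlink(release.resolve(), tmp_link)
    os.replace(tmp_link, root / "current")  # rename replaces the old symlink


def deploy(checkout: Path, root: Path, release_id: str) -> Path:
    """Copy a VCS checkout into root/releases/<release_id> and activate it.

    release_id is assumed to sort chronologically (e.g. a timestamp).
    """
    release = root / "releases" / release_id
    shutil.copytree(checkout, release)
    _point_current_at(root, release)
    return release


def rollback(root: Path) -> Path:
    """Repoint root/current at the release just before the live one."""
    releases = sorted((root / "releases").iterdir())
    live = (root / "current").resolve()
    index = [r.resolve() for r in releases].index(live)
    if index == 0:
        raise RuntimeError("no earlier release to roll back to")
    _point_current_at(root, releases[index - 1])
    return releases[index - 1]
```

Because old releases stay on disk, rolling back is just another symlink switch, which is why it can be as fast as the deploy itself.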
So given the above, how would introducing an extra layer of infrastructure and a bunch more tools really help us? I'd be happy to look into these things in more detail if someone can tell me what I should be looking for and what problem it solves that we don't even realise we have, but none of the (quite a few) resources I've read/watched about them over the past several years has suggested a compelling advantage for smaller, statically hosted infrastructure. Again, the advantages seem pretty clear for people managing, say, hundreds of systems of various types that are spun up on demand and discarded when no longer useful on some sort of Cloud platform, but eliq's original question and my follow-up relate to a different sort of environment.
You should try it out yourself, and see if it eases some of your pain points.
It also includes rollback of code (right from the UI, and every ENV variable change / worker change is a new release) and continuous deployment options (ensuring your code isn't broken when deploying a new version and deploying while ensuring your service never goes offline)
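To make the "every change is a new release" idea concrete, here is a rough sketch of an append-only release history. It is modeled loosely on how Heroku-style platforms generally behave; the `App` and `Release` classes and their methods are hypothetical names for illustration and are not Flynn's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Release:
    version: int
    code_ref: str        # e.g. a git commit hash
    env: Dict[str, str]  # environment variables for this release


@dataclass
class App:
    releases: List[Release] = field(default_factory=list)
    current: Optional[Release] = None

    def _publish(self, code_ref: str, env: Dict[str, str]) -> Release:
        # History is append-only: nothing is ever edited in place.
        release = Release(len(self.releases) + 1, code_ref, dict(env))
        self.releases.append(release)
        self.current = release
        return release

    def deploy(self, code_ref: str) -> Release:
        """New code, same environment as the current release."""
        env = self.current.env if self.current else {}
        return self._publish(code_ref, env)

    def set_env(self, key: str, value: str) -> Release:
        """Same code, one environment variable changed."""
        if self.current is None:
            raise RuntimeError("nothing deployed yet")
        return self._publish(self.current.code_ref, {**self.current.env, key: value})

    def rollback(self, version: int) -> Release:
        """Re-publish an old release's code and env as a new release."""
        old = self.releases[version - 1]
        return self._publish(old.code_ref, old.env)
```

In this sketch a rollback is itself recorded as a new release, so the history always shows what was live and when.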
Could you suggest any good places to start, please? I haven't looked at Flynn specifically before, but I did take a look at its website on your recommendation. Unfortunately, the videos and explanatory text I've found so far don't seem to be any better than others I've seen before in this area: plenty of buzzwords, but little real substance or context to help someone understand what the tool offers or how it works. Are there some better sources I could look into?
You may additionally want to purchase a cheap domain ($0.99 special is enough, doesn't need to be fancy).
Some examples of what got in the way for me, as someone totally new to Flynn:
I don't know what a "Flynn cluster" is, but it's obviously a central concept and mentioned almost immediately on both of those pages.
There are lots of impressive-looking commands doing impressive-looking things quickly in the videos, but there's not a single mention of where you're actually running them, where any other systems they are affecting came from, or what they're actually doing for that matter.
In fact, nothing on those pages tells me what Flynn does. I spent probably an hour or so exploring their site after we swapped posts the other day, but it was an exercise in frustration and I'm still none the wiser about what its scope or feature set actually is. (For comparison, that's longer than the total time it took me to write all the scripts we use to do the automation for one of those projects I mentioned before!)
Anyway, I don't want to take any more of your time on this, so I'll stop there. Thanks for trying to help, even if it didn't work out in the end.
However, single server configurations are perfectly acceptable for personal clusters.
I would strongly recommend taking a cluster backup when you get it up and running though, since single-server configurations do not deal well with unexpected power failures.
In a single-server configuration, anything going wrong can destroy your cluster's capability. Think of it as RAID: the more drives you have, the more you can do for high availability.
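To put rough numbers on the RAID analogy, here is a small sketch. It assumes the cluster coordinates through a majority-quorum consensus store, which is common for this kind of tool but is an assumption here, not something stated in this thread. The function names are illustrative.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return nodes - quorum_size(nodes)


for n in (1, 2, 3, 5):
    print(f"{n} node(s): quorum {quorum_size(n)}, "
          f"survives {tolerated_failures(n)} failure(s)")
```

Under that assumption, one or two nodes tolerate no failures at all, three nodes tolerate one, and five tolerate two, which is why a single-server setup needs backups rather than redundancy.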
Edit: I only see the one comment stating a benefit, and that one does not bring a net gain for OP's use case. Managing any of the mentioned platforms is hugely more complicated than ssh + git clone.
Edit: Though as PaulRobinson points out, before thinking about a PaaS you ought to think about CI/CD.
I mean I think Cloud Foundry is the bee's dancing knees, but for a single app it's probably massive overkill.
We moved to Dokku and are really happy with it. It's not H/A but it's mega simple and it works really well. It's basically just Docker, buildpacks and the networking to wire everything together. The plugins also mean it's more useful than Flynn - if you need something like ElasticSearch you just install the plugin. Ditto LetsEncrypt etc etc.
I have a cluster of ~6 machines that I'm using to play with containers and container orchestration tools and so on. I specifically want to be able to
1. Have a non-cli interface for management, and especially not have to write a bunch of YML files myself, and
2. Be able to specify exactly which machine in the cluster a specific container ends up on.
Something like Openshift Origin seems to be the way to go, but is there another option worth thinking about?
With regard to #2, this is a bit tricky: the way it works is that you point a domain to it, and the apps are given a subdomain and load balanced.
I believe DCOS has a really nice interface.
But the openshift UI is also very nice, I've been using it through the CDK with minishift and it's an excellent interface for kubernetes/openshift.
Of note, there's a structured web editor for any API resource in Kube, OpenShift, or extension being prototyped now that I'm hoping helps bridge that gap.
Also, check out the latest nightlies or 3.6.0-alpha.2 for a bunch of massive improvements to the overview - I think it's easily the biggest enhancement over the last few years.
Also the ability to split up access to workspaces in OpenShift could be useful, though I don't know if that's accomplished with actual API objects or whether it would map onto kube.
Generally, though, you're fighting the high level idea of these tools by trying to tie to specific hardware.
Normally in these circumstances I'd recommend BOSH but, off the top of my head, I don't know what kind of affinity can be defined. So it might be another rabbit hole.
I figured as much, but I'm trying to avoid inventing my own solution if one already exists.
Not sure what else you are looking at.
While the Docker Swarm service looks very promising and will probably prove itself to be the best for small to medium and maybe even big corps, it's most definitely not battle-tested yet, as it's just too new.
You actually need to add unstable repos on most distros to use the 'stack deploy' option. Quite a few are still <12.
Same with k8s - package revisions are at the mercy of distro maintainers. However, I have found that Docker is kept far more up to date than k8s.
BTW Docker Swarm is fairly well tested - but I think it is called Docker Enterprise Edition in its well tested form.
There might be edge cases where k8s is better than Docker Swarm. However, I personally believe that 90% of startup-style deployment use cases aren't worth fighting with ingresses for.
Tools included on this chart include:
I've been really happy with ECS. Tons works out of the box. Last piece for me was figuring out a good way to do cron, but I just figured that out. http://blog.ratelim.it/blog/cron-jobs-on-amazon-ecs-with-dat...
Missing from the list: Cloud Foundry. And OpenShift! Give Red Hat some respect here for making Kubernetes into a platform. And DCOS, which I think got its thunder stolen by Kubernetes.
But I mostly know Cloud Foundry, because it's my day job to keep making it better.
It's the most mature out of all of these, has been soak tested to 250,000 running applications, can be deployed to any major IaaS or bare metal, comes with routing, logging, service injection, healing, no-downtime upgrading and I forget what other headlines I usually pick out of the several hundred features it now includes.
The Cloud Foundry Foundation includes Pivotal (my employer), IBM, SAP, Google, DellEMC, VMWare, Cisco, Suse and those are just the fancy tech names.
The reason you don't hear about it is that we and our partners have focused on competing for enterprise sales. It's not how you get publicity on HN, but it does mean that PivotalCF -- our commercial distribution -- has the fastest-growing sales of an opensource-based product in history. And sales are still zooming up and to the right.
The nice thing is that we and other partners get very broad, specific feedback from customers who are already at massive scale and who expect utter, non-negotiable reliability. We run a public service ourselves (Pivotal Web Services), SAP have HANA Cloud, IBM runs BlueMix.
We and other Cloud Foundry contributors have had the benefit of that dogfooding and feedback for longer than any other container-based platform in existence. And it turns out, there are so many things that can go wrong. So many. It's crazy.
Lastly, Cloud Foundry teams have massive investments in project automation. This gives us two capabilities: one is that we can roll out complete rebuilds of the whole platform within hours of an upstream CVE patch. Users can then apply these fixes to their platform live, in-place, without any running apps noticing that it has occurred. BOSH will roll out the deployment with canaries, and Diego will relaunch containers as they are reaped during upgrade.
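As a rough illustration of the canary idea mentioned above, here is a minimal sketch of a rolling update that touches a small canary group first and halts as soon as an instance fails its health check. This is not BOSH's implementation; the `rolling_update` function, its parameters, and the `update` and `healthy` callbacks are hypothetical names for illustration. Instances within a batch are handled one at a time here for simplicity, where a real tool would typically update a batch in parallel.

```python
from typing import Callable, List, Sequence


def rolling_update(
    instances: Sequence[str],
    update: Callable[[str], None],
    healthy: Callable[[str], bool],
    canaries: int = 1,
    batch_size: int = 2,
) -> List[str]:
    """Update canaries first, then the rest in batches.

    Raises RuntimeError on the first unhealthy instance, leaving every
    instance that has not been reached yet on the old version.
    """
    updated: List[str] = []
    pending = list(instances)
    first_batch = True
    while pending:
        size = canaries if first_batch else batch_size
        batch, pending = pending[:size], pending[size:]
        for name in batch:
            update(name)
            if not healthy(name):
                raise RuntimeError(f"{name} is unhealthy after update; halting rollout")
            updated.append(name)
        first_batch = False
    return updated
```

The point of the canary group is that a bad release only ever reaches that first instance or two before the rollout stops.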
The second capability is that we are confident in making very deep, very aggressive changes if it proves necessary, because we have tests upon tests upon mountains of more tests. And again: nobody notices, except that their platform gets faster or becomes reliable under even more extreme circumstances.
If I sound like an utterly biased fan, it's because I am an utterly biased fan. I've watched this thing evolve up-close for years. It is amazing.
We build installable superpowers.
Disclosure: If it's not obvious, I work for Pivotal on Cloud Foundry.
Minor edits for style.
- Running on a cloud
- Spinning up ~8 different servers, each with ~2GB of memory iirc
Kinda a lot of overhead if you're just running a few dev environments like I was.
PCFDev if you just want a dev environment to tinker with. You can also tinker with cflocal (unsupported) or BOSH-lite (somewhat-supported) if you're looking to practice ops as well.
A lot of Cloud Foundry teams do integration testing with BOSH-lite on large single VMs, because it essentially presents BOSH with a "VM" interface that spins up nested containers instead.
You can, in theory, deploy however few or however many real VMs, or bare metal servers, as you wish, by editing a BOSH manifest.
By default the cf-deployment manifest uses something like 23 VMs of varying sizes. That's because the default manifest is intended for production use with multi-AZ HA as the minimum level of reliability. If you don't need that, one of the alternatives I noted will get you what you want.
OSS Cloud Foundry is free, and so is BOSH. A lot of companies use those and are perfectly happy with them.
Red Hat OpenShift undercuts us on price, I think largely because they figure they'll get to do the razors/razorblade thing with RHEL riding in behind it.
If you're reading this list of comparisons out of genuine interest, I would suggest also looking into Nanobox: https://nanobox.io
Upgrading Flynn requires a backup and restore.
Backups are full backups only, no incremental.
The recommended filesystem is S3, Azure or Google, since Postgres is not reliable.
That will require a pretty strong argument.
Finally, the DNS component had nearly malfunctioned, so I switched to Compose with some WeaveNet and semi-manual (re)scheduling. Lost some useful functionality (and the UI) but at least the switch fixed things and it works. Wanted to use Swarm for fully automated scheduling without a DIY external watchdog system, but its networking is extremely limited at the moment (and I haven't grokked how to use it with Weave).
Kubernetes looks nice, but the amount of YAML I'll have to write scares me away. And I'm not sure I trust all that "we do magic" stuff anymore. The more complicated a system is, the harder it is to figure out what to do when it starts to fail.