I recently spoke with a company that had no testing whatsoever for a large production app. When I asked about it, they proudly said "Oh, we do CI. We have Jenkins!" Any tests? "We're going to add them after we move to microservices. Moving away from our monolith is top priority because monoliths are difficult to debug."
I see a ton of companies shitting all over best practices and then chanting buzzwords to pretend they're all about them. That, or a gross misunderstanding of the concepts behind the buzzwords.
X company uses Docker. We should use Docker. "Um. This code runs on an FPGA." "Does it run Docker?"
Containers are just a better tool for writing OS configuration scripts. (If your team is full of Chef experts then it's not "better" for your team, but for a lot of teams it is).
What you're saying applies a lot more to microservices, which are a fundamental architecture choice. Containers aren't, they're just better than a tangle of bash scripts which create stateful VMs. And the problems you're describing apply no matter which tools a team uses.
Remember that you can use containers without complicated orchestration or microservices. I think a better argument would be to untangle these three things and describe how each one can solve certain problems or make the problem worse, and under which conditions.
Agreed, containers are just one way to package software, they aren't the be-all-end-all when it comes to making software modular.
One example of a modular design abstraction that does not rely on containers is the data access layer ( https://en.wikipedia.org/wiki/Data_access_layer ). The idea being you can design a service that sits on top of a data store (whether that's a RDBMS or otherwise), that encapsulates the core business logic that you want the applications in your business to adhere to. This data access layer can potentially be shared by differing applications. The implementation of this does not rely on containers. Also, just in case "data access layer" (DAL for short) seems like a corporate IT term, I'd say the best tool I've ever seen that's used to build DALs is GraphQL.
No, not really. You could argue that Dockerfiles are part image-provisioning script and part process-environment specification, but I think you'd still be missing the main advantage. Dependency isolation is the thing that usually gets touted, but that's only part of the picture; after all, you can isolate dependencies now by baking images. That works great, it's well proven and reliable. But a VM running a single boot image can potentially run dozens of different containers, and with an orchestration platform you can easily and quickly shift those loads around, scale up and down, reconfigure and redeploy, all with far less overhead than deploying an image to a VM requires. Containers didn't become a popular tool because they don't add value. The use case for them has been clear for over four years now.
Well, yes, they did become popular despite not providing anything substantially new. The main value of containers is that a programmer who works with the network no longer needs (initially) to understand how to configure it, which is a dumb idea by itself. All the other things added by containers boil down to distributing a tarball with a whole operating system, so you can run it in a chroot.
From where I stand it seems that programmers didn't want to learn how to build, distribute, and configure software with OS packages, so they invented their own binary package system.
Edit: the extreme portability and "free" concurrency were just a bonus.
Chef experts likely deploy your underlying hosts and set up other required services (e.g. load balancers, state stores),
while you provide a generic "container" that will run whatever you want, with consistent ingress/egress points, so the "Chef experts" can run it without caring what it is they're running.
I'm a C# guy so I'm going to speak in C# terms. In a monolithic solution, I don't see any reason you shouldn't always have C# projects where all of the internal classes are marked "internal", with a "public API" that is either a single class or an interface.
That gives you the optionality of either extracting the project into an in-process NuGet package or breaking it out into a microservice later when it makes sense. It also makes merge conflicts less likely, and it lends itself to easier testability. In the last year and a half, I've been combining and separating projects from one application to another between microservices, NuGet packages, and even Git subtrees as it made sense.
As far as microservices, one benefit is that it makes blurring the lines between domains much harder. A mediocre developer can easily go into a well designed code base and make a mess of it quickly. In a microservice setup, their mess is mostly contained to one domain at a time.
I am a fan of "refactor from zero".
Yes, you're right. People do often mean rewrite. I suspect that using "refactor" instead comes from working in environments where rewriting is seen as akin to proposing sacrificing babies to Satan, but refactoring is a daily event.
But I made sure we had an easy-to-use CI/CD solution, orchestration, and service discovery.
The reason is that a deep understanding of the problem is hard and expensive. A proposed solution, particularly one that's being widely adopted in other places, has the surface appearance of a potential solution, but it is difficult to tell in advance whether or not this is true.
Hence: software, programming, technical, and management fads.
This also appears in clothing, music, diet, arts, and language (most especially dialects and/or slang).
The foundations are in information theory.
Once we've solved a problem once, we 'understand' it. But we don't want to solve the same problem again. And even if we do show up for another one of these we have to somehow explain it all to people who won't believe it til they see it anyway. It's boring, thankless work.
For instance I've done things that look a lot like CMSes many, many times. I can predict pretty accurately what the bosses will be pissed about in 6 months. I'm only surprised by a production issue if it's actually dumber than I thought we could possibly be. Yeah, of course that broke. I've been warning you for months.
But if I had a nickel for every time I said "You really don't want to do it that way, do it this way", and people actually listened, I wouldn't be able to afford a cup of coffee. Only the Jr devs listen. The rest think they're special and will avoid the problems that everybody runs into.
At this point, I should probably have my head examined for showing up again. I have resolved that next time I will work on something where I can make all new (to me) mistakes and have a chance to learn. But here's the rub: that's probably exactly what 90% of my coworkers were thinking when they joined this project.
From my personal experiences and those of my peers, I don't think you can trick people into discipline by rewriting the application and then letting them in after you've "fixed everything".
First, you are most assuredly deluding yourself about your own mastery of the problem space. The problems you don't see can kill you just as badly as all the ones you do. Second, the bad patterns will sneak back in the first time you are distracted. Which will be almost immediately, because you just made assurances about when major pieces of functionality will be ready to use.
If the team has enough discipline already, you can start refactoring the code to look more like what you wanted. By the time your rewrite would objectively ship you'll be a long way toward it already (and maybe discover some even cooler features along the way.) Refactoring is the Ship of Theseus scenario. You get a new ship but you still call it by the old name.
Culture > Strategy > Process
CI/containers/microservices come under process. Without a culture that fosters a solid engineering strategy to support and enrich them, those processes will die on the vine.
My favourite example is when our AWS TAMs offer a solution, knowing we have ZERO pipeline/infrastructure set up for supporting containers. They always push containers. We don't use containers, stop forcing them down our throats. We've tried, we've been burned, VMs work for us. Stop!
When did containers become perceived as the be-all-end-all solution? I see their value and their uses, but they don't meet our needs, so why have we started ignoring the right solution for the job? I see this everywhere I go.
But you don't need k8s or containers for orchestration.
I chose HashiCorp's Nomad (I'm the dev lead for our company) precisely because I didn't want to commit to Docker from day one, but I did want to leave that option open. Nomad works with everything - Docker containers, jar files, shell scripts, raw executables, etc. - and is dead simple to set up: one self-contained executable under 20 MB that works as a client, a server, and a member of a cluster. Configuration is dead simple if you use Consul.
"Docker, docker, docker!"
Edit: I'm not knocking either of these products - I actively use MongoDB in production. I like Docker/containers/Kubernetes and have used them for various projects. I just take offence at how people have started ignoring common sense, like: we don't have the tooling in place to support this product, or: it doesn't meet our business needs.
(honestly couldn't tell if you were being sarcastic, so assumed you weren't)
This is mostly a critique of microservice architectures, not containers. If that were the main point I'd have little disagreement.
> Someone in security is weeping for the unpatched CVEs...
> ...the heavyweight app containers shipping a full operating system aren't being maintained at all...
This is just wrong; it's the opposite of that. Never have I had more up-to-date operating systems, programming languages, and frameworks than since I started using containers. It's just so damn easy, especially if you use `FROM python:3` instead of `FROM python:3.6.2`. It auto-updates every time you deploy.
> There is no substitute for experimentation in your real production environment; containers are orthogonal to that...
They're not orthogonal to it, they're a really useful way to get very, very close to production. The maxim isn't untrue, but again, I sleep better than I ever have in my life because I know that these problems are now rare for me. The difference between my local, staging, and production is tiny. I haven't encountered such an issue in over a year.
All of the problems in the article are true no matter what tools you use to build and deploy. The author focuses a lot on developers' desire to go off in a corner and build their own little world. That's still a risk if you're using Ansible or Chef.
Bottom line: writing a Dockerfile is the most powerful way I've ever found to define your OS's configuration in code. Stop discouraging people from trying it just so you can make grand arguments about the types of problems every engineering team faces.
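To make the "OS configuration in code" point concrete, here's a minimal sketch of a Dockerfile for a small Python service. The package names, file names, and entrypoint are illustrative, not from any real project:

```dockerfile
# Hypothetical example: the whole OS setup for a small service,
# captured in a handful of declarative, versionable lines.
FROM python:3

# System packages the app needs (illustrative); clean the apt cache
# afterwards to keep the image small.
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

CMD ["python", "app.py"]
```

The point isn't that this is sophisticated - it's that the entire machine configuration lives in one reviewable file instead of being scattered across shell history and tribal knowledge.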
I still think containers are the best answer to a ton of operational needs but it’s absolutely true that better tooling is needed for a bunch of problems, and this is repeating the classic hype cycle where it’s being billed as a cure-all when in fact it’s still going to require time, staffing, and a commitment to do the job well.
As an example, start with the easiest problem: say you want to prove that you’ve installed the latest OpenSSL patch. On traditional servers, this is a well solved problem. If you’re using Docker, your options are to buy a commercial offering with a support contract or, if your purchasing process is dysfunctional, build something around Clair, which has a bunch of usable but not great tools. If you’re the ops person looking at that, you’re probably thinking this just made your life worse even if there’s the promise that in some indefinite future it could get better. I’m hoping that the OSS community starts rounding out rough edges like that because it’s definitely an enterprise adoption barrier.
I like the idea of being able to precisely control both my code and all of my dependencies, so that I know for certain that I'm deploying exactly the same overall system that I tested. Containers are much better for that than the old way, because you could never be certain that your OS and system software were exactly the same in production as they were local and staging. But to achieve that precision, you need to use precise version numbers, and you need to install your dependencies from a local repository to be really sure.
Local may test with something that's a point release ahead, but in that case you'd find the issue during testing and pin Python to a specific 3.6.x until you resolve the problem and start rolling again.
There's also nothing preventing you from building images with precise control over the versions of your dependencies. You can do it in the image the same way you'd do it anywhere else: specify your own repos and use lock files, or whatever your language allows.
What you're missing is that the beginning of the "deploy" process is building the image on your local (or on CI). That's when the update happens. Then you test it on staging, and if all is well you deploy to production.
If there's a problem it's easy to change your Dockerfile from "python:3" to "python:3.6.2" in order to go back to what you had. Or stick with "python:3.6" if you only want security patches. Or, if you want to miss out on those security patches in order to guarantee more stability, go with "python:3.6.2" and decide when to test and deploy an upgrade.
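The trade-off described above can be sketched in one place (the tags follow the official `python` image's real tagging scheme; pick exactly one FROM line):

```dockerfile
# Pick one, depending on how much drift you're willing to accept:

FROM python:3        # floating major: every rebuild pulls the newest 3.x
# FROM python:3.6    # minor pin: patch/security releases only
# FROM python:3.6.2  # exact pin: nothing changes until you bump it yourself
```

The further down that list you go, the more stability you get and the more responsibility you take on for scheduling your own security upgrades.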
I'm saying that has little to do with containers, and if anything containers make it a lot easier to get security updates.
It builds once, and then that very container with those dependencies gets pushed through testing, staging, and production. No more changes happen after the build.
Way back in the day, I helped push my company towards a process where build artifacts were tied to specific commit ids (SVN, back then) so that everything that reached production could be traced back through QA and Development. So, basically the process you described. No containers back then, of course, and no VMs either. We had real servers in our server farm.
I was lucky enough to get to slog through this stuff hard and have been building containerized development and operations systems as a huge amount of my work ever since. I get the author's position and I think it's likely the case for most people if they are asked to "Just use containers." It takes planning, knowledge, and a pretty multiclassed skill set to put together great container operations but even from day one when I realized I could isolate node (way before nvm) it was a godsend.
I barely run anything on my base system anymore. Everything I put together is now a cascading series of helm charts that easily deconstruct into bare metal deploy. The developers on my team are able to move fast with it because I stay ahead of it and make sure the tools are usable and documented before they are even thinking about them. I can take really obtuse customer integrations and quickly come up with solutions that don't create friction because of how fast I can break them and "infra as code" their way into our stack so no one has to deal with the fact that the API is garbage. I deploy things with health and liveness checks, I get reporting across the board of usage. Anything I want to flight to the world or internally is authenticated through our LDAP/Directory/GHA and I don't need a server troll to administer it.
I fully understand people not wanting to use them and just stick to what they know, but containers are amazing and I use them at micro to macro scale. Like you, my code has never been so up to date.
It's fun to write a glib article about how you don't like things that are happening. Great if you don't wanna learn them soup to nuts, but to dismiss their value so absolutely really misses a ton of opportunity. Even if it's many, many pain-in-the-ass weekends to get fluid with it.
There's an issue with that. You're trusting whoever builds the python:3 image to actually update it and be secure.
There are a couple of high CVEs in the python:3 image, including a 10:
Then there are a bunch of other medium and low CVEs, mostly from ImageMagick, which is kind of a shame to include if you really don't need it. The same goes for that 10 from Mercurial, if that's useless to your project too.
You are best off receiving a base image from a trusted source, e.g. a set your organization maintains, or a distribution you trust that provides just the OS. Grab the most minimal set, then add your application on top of that. Make sure you go through a check to ensure you're not adding any vulnerabilities yourself.
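As a sketch of that "minimal trusted base" approach (the image tag, file paths, and entrypoint here are assumptions for illustration):

```dockerfile
# Start from a slim image your org has vetted, and add only what
# the app actually needs.
FROM python:3-slim

# No ImageMagick, no Mercurial, no build toolchain - a smaller
# attack surface and fewer CVEs to track.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY app/ /app/

# Drop root before running the service.
USER nobody
CMD ["python", "/app/main.py"]
```

Anything that isn't installed can't show up in your CVE scan.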
Of course, later on the consultants came along and scoffed at the simplicity of our process - our deployment process is basically one step in our continuous delivery pipeline: copy the bin/release directory to the destination folder.
I tried to get them to articulate a business case for us to use containers. They couldn't come up with one.
Then my manager, someone I really respect for his technical acumen, finally gave me one: if we go to containers, we don't have to provision servers on AWS - we can use AWS Fargate. Lambda isn't an option; we have long-running processes that make more sense as apps.
I wanted to do Docker anyway eventually just to add a bullet point on my resume and I could have as the dev lead but it felt unethical to make a choice that wasn't best for the business. Now I think it's the right way to go.
Yeah, Fargate may be an inflection point: a lot of discussion of Docker ignores the fact that you need orchestration to make it work in prod, and orchestration is overhead.
If they really cared about containers they would have helped you continue using Nomad/Consul, as that's a fine combination.
How do you debug/trace the flow of execution between them easily?
You can't even get a good stack trace inter-executable, can you?
And in that regard Containers are successful as hell. That's why we have a religion around them now. You can hate it but you can't really ignore it if you need to pay rent and work in the industry.
That’s a shame because the rest of your comment is right: that space is on the red hot point of the hype curve and there’s a ton of money being spent so someone can brag that they’re doing the same thing as the cool kids at Google or Netflix when in reality that’s like saying you’re an Olympic marathoner because you bought the same shoes. It’s profitable now but I wonder whether we’re due for some backlash.
> common baseline
"Here is the standard image with standard packages; if you need something else, add it to Chef/Ansible/whatever-we-use"
> responsibility boundary
"Ah, this binary is in /opt/$COMPANY - go ask the devs why it's broken"
> which is identical
So use configuration management (Chef, Ansible, whatever) for the system and tarballs/packages for your stuff, rather than shipping whole system images per-project?
Here are some complications for the traditional approach you mentioned:
1. Containers provide a standard interface for managing things in any language: that means you don't have to know that this Java team used Tomcat, someone else used Jetty, the Python apps use mod_wsgi or gunicorn depending on when they were written, etc. Yes, Chef/Ansible/etc. can coordinate that too but you have to maintain conventions for editing shared configuration, permissions, storage, firewall ports, dealing with religious arguments about systemd, etc. That's especially true if you're using vendor or open-source apps where it's really nice not to have to spend time repackaging something where someone at Oracle, Atlassian, etc. use the LSB docs as rolling paper rather than reading material.
Again, having done it for years I'm not saying this cannot be done, but it's refreshing not to ever have a talk about user IDs, using a sane conf.d layout on RHEL, etc. again. That brings me to:
2. Coordinating changes or updates: again, yes, it's always possible any way you choose to do it but it's really nice not to have to maintain changes in multiple branches for teams at various stages of upgrading, deal with special cases, etc. Shoveling everything into a container means that the only team which deals with those questions is the one best equipped to answer them. That's especially nice when the problem is something like upgrading a common distribution package and it'd take a non-trivial amount of time versus no time to answer the question of whether your backport will break something else used somewhere in your company.
3. This is similar: “Ah, this binary is in /opt/$COMPANY - go ask the devs why it's broken”. In simple cases, yes, that's easy but once it gets more complicated — “this gets slow every Tuesday night”, “we're getting sporadic disk full errors”, etc. — it's really nice to have an inside/outside division which is consistent across every system and every project. Following the LSB rules gets you a lot but that's not universally followed so you're going to have to spend time on exceptions, arguing with vendors, or getting upstream open source projects patched. Again, that's all valid work but some times you don't have the time to spare for.
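To make point 1's "standard interface" concrete: here are two hypothetical Dockerfiles for services in different stacks (image names, paths, and the app module are assumptions). From the operator's side, both are just `docker run` targets exposing port 8080 - nobody outside the team needs to know Tomcat from gunicorn:

```dockerfile
# Dockerfile for the Java team's service
FROM tomcat:9
COPY target/app.war /usr/local/tomcat/webapps/ROOT.war
EXPOSE 8080

# --- separate Dockerfile, the Python team's service ---
# FROM python:3
# RUN pip install gunicorn flask
# COPY . /app
# WORKDIR /app
# EXPOSE 8080
# CMD ["gunicorn", "-b", "0.0.0.0:8080", "app:application"]
```

The container boundary replaces the per-stack conventions (init scripts, config paths, user IDs) that Chef/Ansible would otherwise have to encode for each team.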
Until you have to maintain/debug/etc. the N^N combinations/branches of common code used within the different self-produced containers to accommodate all of this 'freedom'...
(This is similar to the argument for containers with developers who aren't great sysadmins: they're usually making changes anyway and this way they don't have root on the host, etc.)
Pretty similar to the article's point of view, except that what's being hidden under a tech layer isn't cultural problems but technical ones.
And yes, for the pure developer it might be better, because they can now focus more narrowly on their software. But someone needs to maintain the infrastructure the software runs on, and that job just got harder.
I put containers and "orchestration" like Kubernetes right up there with "Big Data", Kafka, and a bunch of other technology that is the current fad. All of these have a legitimate use case. Odds are the use case of anyone reading this comment isn't one of them. But because of the terribly broken interview process and bandwagon effects, engineers feel compelled to force them into the development process in order to bypass filters (human and automated) on their resumes and keep a sort of social cachet among their peers.
The basic idea of Kubernetes is really great. But it's not really finished yet, and it's already bloated as hell. Add that most of the stuff also runs on OpenStack, which has the exact same problem, plus developers who still don't understand that this is not a silo build system, and you end up in an environment where 100+ people work only on getting your stuff installed, with the same control a single developer would have with ssh and init scripts.
PS: my comment isn't even in the negative anymore. It often happens that the first 1-2 viewers downvote and then it gets upvoted quite far. Not sure why. From my perspective I write as objectively as possible. But I don't really want to think about it too much. For some people my thoughts seem valuable enough, so it's fine.
I'd put "cloud computing" in that list. The only explanations I can think of as to why it's taken off so quickly are:
1. Not all companies are large enough to hire their own sysadmin(s).
2. Mid to large sized companies can have byzantine budget bureaucracies, making it easier to invest in short term fixes rather than considering long term savings.
However, there are also large tech-savvy companies using cloud computing, and paying through the nose for the privilege - that's the part I really can't understand.
My argument is that they don't solve a new problem. Where solutions already exist (e.g. multiprocessing) they're fine, but where solutions don't exist yet (e.g. low-bandwidth cluster storage, or simply networking) they don't add any value. And instead of solving any of the hard problems, they add a layer of abstraction on top that makes those problems harder for the everyday engineer to solve.
This was magnified when I attended Dockercon in Austin last year. As a very small team that has used Docker (and ECS) to solve some problems, I was excited to learn more. It became quickly apparent that I was at the wrong conference for that: it felt like an enterprise vendor party where everyone was passing around cups of kool-aid.
In the past we might have had to decide to forgo employing one or both. Now we kind of have the option of giving each dev their own "playground" via containers, and not actually expecting either to improve their teamwork skills. Again, as long as the cost of supporting the container infrastructure is lower than the business value each dev can deliver, it's a net win.
In practice I'm not sure if containers can really deliver on this promise, but it's a very seductive idea.
I think the issues being pointed out here are more about the whole 'Linux container ecosystem' - which has a notion of statically built containers, automated system orchestration, etc. - and primarily in an operations context.
Stop trying to reduce costs. Work out your cost of delay, and focus on getting things done more quickly and more effectively, not more cheaply and more efficiently.
Old-school software engineering is very much about reports, forms, documents. Each tries to gather to itself every instance of some kind of information. Here is the Customer Requirements Document. Here is the Software Requirements Document. Here is the Software Design Document. Here are the Software Test Plans.
These days most places work out of an issue management system. JIRA tickets, Github issues, stories in Pivotal Tracker, whatever. The work is broken into small chunks with their own lifetime. We don't wait while all the requirements pile up before opening the dam and letting them flow downstream in a batch. Each goes when it's ready to go.
This is not possible without the right tools.
That's a lie. It's totally possible. With 1980s word processing and spreadsheets you could absolutely do everything Tracker or JIRA do. You could use stacks of 3x5 cards to track thousands of items across dozens of teams.
But you probably don't.
The tooling lowers the threshold of the possible, in a social and economic sense.
No, containers aren't miraculous. Of themselves, they do nothing to fix other problems. But they make it possible to achieve improvements that are more expensive and difficult in other ways. They lower the barrier of possibility. The landscape of alternatives shifts, mountains become hills.
I've been on both sides of that divide now. As a consulting engineer I saw projects rapidly iterating but not being able to deploy ("ops are too busy right now"), leading to dozens of handsomely-billed hours being squandered in meetings and workarounds and emails and chats and phone calls trying to get the code into any kind of production. I've also seen projects where deployment took an hour or two to set up and that was that. People got on with the job. And a major difference was the platform.
One more thing.
> Development teams love the idea of shipping their dependencies bundled with their apps, imagining limitless portability. Someone in security is weeping for the unpatched CVEs, but feature velocity is so desirable that security's pleas go unheard. Platform operators are happy (well, less surly) knowing they can upgrade the underlying infrastructure without affecting the dependencies for any applications, until they realize the heavyweight app containers shipping a full operating system aren't being maintained at all.
This is a problem buildpacks have solved for well over a decade on multiple independent PaaSes.
Disclosure: I work for Pivotal, we do some stuff with containers, but we sorta focus on the parts on top of and before them.
Scientific computing, for instance - you may have a weird set of dependencies for some code that you want to deploy a copy of on 200 slightly heterogeneous nodes, once, and hopefully never again; but it's vitally important that people in the future have the possibility of replicating your computation.
Containers are the perfect solution for this. In fact, in my experience they essentially do fix a broken culture around the reproducibility of computational experiments.
Not exactly. For example, running your core infrastructure via Docker containers makes "ordinary" maintenance/disaster recovery really easy.
We run e.g. JIRA and Confluence in Docker... and it's a breeze to make a consistent backup and restore: stop the application and DB container, rsync delta to Netapp, restart the containers and the Netapp automatically does a snapshot. Daily, fully consistent backups with only 5 minutes of downtime - and because a restoration is as easy as "rsync the data out of the Netapp/LTO and do a (documented) docker run command with appropriate mounts" it's easy to verify that the backup actually works.
Prior to using Docker, testing the backup involved setting up a server, copying data around and manually importing it. Needless to say it was not done very often. But now we can apply this policy to all our services, plus we can test version upgrades with minimal effort compared to setting up the software from scratch.
Oh, and upgrading the software versions themselves is also easy: rsync delta to backup, stop the container, quick rsync again to ensure consistency (e.g. lock files, unflushed DB writes), stop & remove the old container, run a new container with the new version, check if the upgrade went OK - which it always has so far, but in case it did not, it's a matter of minutes to do a full rollback.
The downside, of course, is that we depend on the container author(s) to provide regular new builds which incorporate not only new app versions but also updates for the packages of the "OS" inside the container. For example, if the threat vector is something like "PHP/Java does an RCE when passed a certain HTTP header", on an "old school" system I'd do an apt-get update && apt-get upgrade and that's it - with Docker I either depend on the container vendor or have to roll my own Docker images on top of theirs, e.g. a Dockerfile with `FROM <vendorimage>` followed by `RUN apt-get update && apt-get dist-upgrade -y && apt-get clean`...