>Docker solves a simple problem that everyone has.
Docker provides an (IMHO pretty buggy) isolation layer that lies between "keeping things that need to be kept separate in separate folders" and "keeping things that need to be kept separate in separate virtual machines".
I actually don't have the need for the level of isolation below VM and above folder very often. IMHO this level only really makes sense when containing and deploying somewhat badly written applications that have weirdly specific, non-standard system level dependencies (e.g. oracle) that you don't want polluting other applications' dependencies.
I've compiled and installed postgres in separate folders lots of times (super easy) and I've lost count of the number of times people have said "why don't you just dockerize that?" as if that was simpler and/or necessary in some way. That's the effect of "docker hype" talking.
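For the curious, the whole folder-level install is roughly this (a sketch; the version, URL and paths are illustrative):

    # Build postgres into its own prefix folder; version, URL and paths are placeholders.
    curl -O https://ftp.postgresql.org/pub/source/v15.4/postgresql-15.4.tar.gz
    tar xzf postgresql-15.4.tar.gz && cd postgresql-15.4
    ./configure --prefix=$HOME/pg/15.4
    make && make install
    # Each install gets its own data directory and port, so several can coexist.
    $HOME/pg/15.4/bin/initdb -D $HOME/pg/15.4/data
    $HOME/pg/15.4/bin/pg_ctl -D $HOME/pg/15.4/data -o "-p 5433" start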
The difference is standardization. Postgres can be installed in subdirectories nicely, as you say -- if you know how. Same is true for JBoss, CI build agent, whatever. Now if you have dozens of these apps then e.g. onboarding people in a remote team suddenly becomes non-trivial (a nightmare, to be precise). With Docker, they can get a complex system running in an hour.
The primary use case for Docker, as far as I can see, is simplifying deployment across varying environments. Variations can happen for many reasons. Sometimes you have clusters of various sizes in production. Sometimes the environment is a developer laptop. And so on.
I've very rarely seen setup scripts that don't have implicit dependencies on the details of the host environment, whether it's assumptions about the OS or file system or what related software may be installed or whatever. Often that's because the original developer can't predict every permutation of the possible interactions, thanks to a combinatorial explosion of possible system setups; sometimes because the script is poorly written; and very often because the language or tooling itself is poorly isolated (e.g. pip and npm).

Docker is a lightweight way of guaranteeing deterministic setup script execution. Obviously you can still screw it up by pulling from unversioned base images etc., but the number of failure modes is limited, typically easy to locate (i.e. within a single Dockerfile rather than across an entire OS), and relatively easy to prevent with some best practices. And best of all, that's true across languages.

I can achieve the same effects with pipenv, yarn, or other tooling-specific techniques like folder-level postgres installs. But then as a developer I have to know the thousands of idiosyncratic pitfalls that occur across the plethora of tools I have to deal with every day. And realistically, I have to deal with a ton of poorly written applications and scripts that I want to execute with some degree of isolation without a ton of overhead -- more than I could possibly ever fix.
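To make the "pin your inputs" part concrete, a reproducible build looks roughly like this (a sketch only; the digest and file names below are placeholders, not from any real project):

    # Sketch of a pinned, reproducible build; digest and file names are placeholders.
    FROM python:3.11-slim@sha256:0000000000000000000000000000000000000000000000000000000000000000
    # Dependency versions are pinned inside requirements.txt as well.
    COPY requirements.txt /app/requirements.txt
    RUN pip install --no-cache-dir -r /app/requirements.txt
    COPY . /app
    CMD ["python", "/app/main.py"]

Every machine that builds this gets the same inputs, so the failure modes stay inside this one file.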
I've very rarely seen scripts like this make implicit dependencies beyond assuming what kind of package manager is installed. Moreover, everywhere I've worked the package manager was either under our control (in which case no problem) or was mandated from above (we're a red hat shop: use yum - again, not really a problem).
I've spent more of my life and torn out more hair dealing with obscure docker bugs than I have converting scripts from one flavor of linux to another.
It's not whether you need it, it's about becoming a standard. Ubiquity is a strength. It's much easier to learn a few docker commands that will run a container the same way everywhere instead of worrying about distros, folders, config files, volumes, etc. The container registry also makes software distribution much nicer than installing and configuring repos.
It's great that you compile postgres but I just want to run it in a clean and portable way, along with several other programs, and without learning new workflows for each one. Docker containers give people more options to package and run software in a simple standardized process while offloading the tedious system details that don't matter. That's progress.
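Concretely, something like this behaves the same on any box running the Docker daemon (the tag, password and volume name are just illustrative):

    # Two commands, same result on any host with Docker installed.
    docker pull postgres:16
    docker run -d --name pg \
        -e POSTGRES_PASSWORD=example \
        -p 5432:5432 \
        -v pgdata:/var/lib/postgresql/data \
        postgres:16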
But this level of work requires an operations guy who knows how to do all this right. Most (not all) developers can get to virtualenv or similar tools, but have issues keeping the rest of the system working and stable, or with firewalls, or with system patching.
As an ops person myself, Docker has saved me lots of time: developers can run their containers locally, then hand them over to me to stand up. As we move to hosted services, I don't even need to maintain a server. My role is shifting from spending lots of time on ansible and monitoring servers to helping look at code and investigating the weird bugs that are outside the developers' capacity.
I was an early Hadoop adopter as well... and I agree with people's sentiment here -- it was a tool looking for a problem (outside its specific use case). I used it for its intended purpose, and I also tried to turn it into a web crawler. It actually kinda worked in that regard, but it wasn't the right usage. It might be able to expand into new use cases though.
Docker solves (again) a real problem in the industry that has existed for decades... and the solution keeps going back and forth. Nowadays we train developers, not systems engineers (I've been trying to hire a systems engineer for almost a year with nearly no bites... our developer positions get 3 good candidates worth interviewing in 2 weeks or less). This means we have lots of available developers and not enough ops people. Containers help us work in this dynamic -- they simplify the process of getting a dev's application to work in isolation. This means 1 ops guy can support a dozen developers and 30 apps on one server relatively easily compared to before. It shifts the burden of the developer's runtime environment to the developer... We can still step in to help, but now that environment is codified in git.
I've been an ops guy for a decade and, unlike many of my colleagues in the same position, I love Docker; it's let me focus on more important things.
I sort of agree with this, but the advantage I see of Docker is in providing a "standard" (ymmv) interface that's less heavyweight than running an entire simulated virtual machine, but less dependent on the idiosyncrasies of a specific environment.
I can give my coworker a docker image and it mostly "just works" without failing because she happens to be running a slightly different version of Ubuntu with different system libraries present.
> I can give my coworker a docker image and it mostly "just works" without failing because she happens to be running a slightly different version of Ubuntu with different system libraries present.
"I can give my coworker a VM image and it mostly "just works" without failing because she happens to be running a slightly different version of Ubuntu with different system libraries present."
and also:
"I can give my coworker a full system container image and it mostly "just works" without failing because she happens to be running a slightly different version of Ubuntu with different system libraries present."
That's not all that docker or any containerization gives you. For most people I've seen it's about infrastructure as code and quickly deploying Dev/staging environments. Yes you can do the same with ansible or terraform but what if you only want to test locally? Do you really want to wait for a VM to be carved out? Or just run a command and have things come up?
Also, not needing to support log extraction or deal with unit scripts or SSH or bin packing applications onto VMs or mucking with ansible/packer/etc. Personally I don’t want to spend my time on needlessly tedious, uninteresting problems.
I just filed a bug report. Under "Steps to reproduce" I simply put the docker command to start the container and a curl call that reproduces the error. Without Docker, how can I be sure that the maintainers have the same environment as I do? This is very useful as far as I can tell.
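Roughly like this (the image, port and endpoint here are made-up placeholders, not the actual ones from the report):

    # Hypothetical "Steps to reproduce"; image name, port and path are placeholders.
    docker run -d -p 8080:8080 example/service:1.2.3
    curl -v http://localhost:8080/api/items    # returns HTTP 500 instead of the expected 200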
90% of the time Docker is used to solve the problem of "how do I upload this bucket of Python crud to a production server?" (Replace 'Python' with any other language to taste.)
A slightly smarter .tar.gz would have solved the problem just as well.
No it wouldn't have. What's unzipping and running that code? What's monitoring it and restarting it? How do you mount volumes and set env variables? How do you open ports and maintain isolation?
A container is vastly more powerful for running an application than a tar file.
No offense, but you're aware you make it sound as if we used to use punch cards until the arrival of docker?
You can often run daemons as different users and set appropriate file permissions. You can add ENV variables to your start up scripts or configuration files. Volumes are mounted by the system (and you set appropriate access rights again). Monitoring and restarting services is managed by your init system (and probably some external monitoring, because sometimes physical hosts go nuts). Depending on your environment you can just produce debs, rpms, or some custom format for packaging/distribution.
Yes, sometimes you still want docker or even a real VM, and there are good reasons for that - I totally agree. But often it is not necessary. I'm often under the impression that some people forget that the currently hyped and cool tech is not always and under every circumstance the right solution to a given type of problem. But that's not an issue with docker alone...
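For reference, the init-system half of that is a short unit file, e.g. /etc/systemd/system/myapp.service (the name, user and paths are placeholders):

    # Placeholder sketch of a unit file for the daemon described above.
    [Unit]
    Description=My application
    After=network.target

    [Service]
    User=myapp
    # Env vars live in a file on the host, not baked into an image.
    EnvironmentFile=/etc/myapp/env
    ExecStart=/opt/myapp/bin/myapp
    # The init system does the monitoring and restarting.
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

Ship the binary as a deb/rpm, run "systemctl enable --now myapp", and that covers most of what the parent asked about (monitoring, restarts, env, permissions).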
>You can often run daemons as different users and set appropriate file permissions. You can add ENV variables to your start up scripts or configuration files. Volumes are mounted by the system (and you set appropriate access rights again).
That sounds exactly like creating a Dockerfile. The difference is that your script has to work any number of times on an endless number of system configurations. The Dockerfile has to work once on one system which is a much easier target to hit. The "any number of times on an endless number of system configurations" is a problem taken care of by the Docker team.
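And to be fair, the Dockerfile version of roughly the same setup is about as short (everything below is a placeholder sketch, not a real app):

    # Placeholder sketch; built once, in one controlled environment.
    FROM debian:12-slim
    RUN useradd --system myapp
    COPY myapp /opt/myapp/bin/myapp
    # Defaults baked in; override at run time with -e if needed.
    ENV MYAPP_CONFIG=/etc/myapp/config
    # Data lives in a volume mounted by the runtime.
    VOLUME /var/lib/myapp
    EXPOSE 8080
    USER myapp
    CMD ["/opt/myapp/bin/myapp"]

Restart policy, ports and volumes are then handled by the runtime (e.g. "docker run -d --restart=on-failure -p 8080:8080 -v myapp-data:/var/lib/myapp myapp:1.0") instead of an init system.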
You seem to be unaware of the problems Docker solves out of the box, and that about 10-15 years ago those problems were solved with in-house developed toolsets.
The difference is that with VMs, you have to configure the things that you get for free with container runtimes. Specifically, Amazon can take care of a ton of the most mundane security and compliance burden that our org would otherwise have to own. Those differences means that developers can cost effectively be trained to do much of their own ops and I can solve more interesting problems.
Well, not necessarily by hand. I never gave up deploying our Java application with Ansible. We could have used Docker but the team decided to use fat jars and Ansible instead. Nowadays with Java 11 you can make those fat jars even slimmer. There was no value proposition for us to change.
I did not work much with deployments, but one thing I liked about Docker over Ansible is that testing the configurations locally is really easy and independent of the host platform.
My point was that Docker purports to solve the sandboxing and security problems.
In reality, this is something that 90% of people who use Docker don't give a shit about. For the vast majority Docker is just a nice and easy-to-use packaging format.
The sad part is that
a) Docker failed at security.
b) In trying to solve the security problem Docker ended up with a pretty crufty (from a technical point of view) packaging format.
Maybe we need to start from scratch, listen to the devs this time and build something they actually want.
Docker solves 2 problems. First, you have no control over your devs and allow them to install any software from anywhere. And second, you want to sell CPU time from the cloud in a way that's efficient (for the seller).
1) is not a containerisation problem. It's a team problem. I can jam a load of npm and pip installs into a shell install script. Maybe even delete /usr/ for the hell of it. Because the script isn't isolated from the OS, I can cause more damage.
This problem is actually solved by doing code reviews properly and team discussions.
2) errr no. Containers != infrastructure. If you want to deploy on bare metal, you can.
The cost includes making development impossible without internet access, given that devs are not going to be carrying a cluster of servers around with them.