In all seriousness he has half a point, but for the wrong reasons. Docker doesn't let you take your "old" languages and put them "in the cloud" over more superior approaches - it's a shipping container. That's it. It's a known quantity that lets you slot your 10,000,000 LOC Java monolith alongside your 25 line Python microservice and orchestrate them with the same tools.
That's the advantage. The side effect is, sure, you can take your old, shitty code and run it "in the cloud", and it's nothing "we should get rid of".
To put it another way: "Shipping containers protect a transportation method that we should get rid of". Sure, drones could fly every item individually to you from the factory, but while you're fiddling around with that, Maersk is shipping 20% of the world's GDP every year by dealing with what it knows best: how to move containers, not caring about what's in them.
"Shipping container standardisation" is not a metaphor for what docker is, it's marketing. Docker, of course wants to be a "standard" product. Every product wants to be that.
In my experience, it's been more like a buggy kludge to deal with applications that have isolation issues with their dependencies and unnecessary overhead for applications that don't. All with a sprinkling of marketing hype.
Refusing to support old and new side by side is going to leave you with the old way of doing things indefinitely.
Nothing of what he complains about is Docker's fault or even Python the language's fault.
Funnily enough, I have spent the better part of Friday trying to wrap my head around Go modules and getting a project to a state where I could run any go command in it. Have I blamed Go, the language? Nope. The tooling? A little bit, but mostly my own ignorance of how it works.
It would do the author good to do some soul searching and perhaps understand that not all problems are nails, and the best tool for the job is not always a hammer.
Python is being fixed at its foundations and tools like poetry will approach parity with Maven.
He goes on to suggest that somehow these languages put the programmer closer to the OS than C, which is completely ludicrous since most reference implementations of them are written in C - making the language effectively one layer above it.
I get the argument about concurrency but that's pretty much the only thing he has to offer there. Everything else is just someone who doesn't understand the tools blaming the tools themselves and wishing he was using something else.
The fact that he thinks "Docker is supposed to protect us" from dependency management issues is ludicrous.
EDIT: nope. author has 10 years of java. this is the worst kind of sr dev. they think their java expertise applies to some other ecosystem (python) because well it’s all computers amirite? that python has a dependency problem doesn’t mean docker is broken! he couldn’t diagnose a problem therefore the tooling is bad. get over yourself.
Also, not all experience is equal. Someone that's spent 10 years working on 4 or 5 different systems in totally different problem domains, written in totally different languages, and operating in totally different ecosystems is going to have a very different view of development from someone that's spent 10 years doing essentially the same thing over and over again.
This guy seems to have a very focused view of the correct approach to problems. He's familiar with the tools that linux offers (which I agree are great), but he doesn't seem to respect the scale of specialization it takes to use and maintain those tools effectively on a large scale. Also, there is no mention of the cost to rebuild existing systems in terms of developer time, the mental cost to re-train all of the developers, as well as the time to migrate and train the users.
Ironically, I remember getting into debates like this back in the mid-2000s when I was first starting to think I had it all figured out. The points I made back then were more or less the same things I see now in the article above. It's quite nostalgic, though it definitely makes me feel older than I like.
I argue docker-based deploys should not be looked at from an "application programming" perspective, but from a "dev ops" perspective. It is docker vs puppet/ansible, not docker vs akka.
To me docker is a big step forward from provisioning OSes with puppet/ansible and deploying apps on top. I have a deployable unit that is accepted in many envs (k8s, aws ecs, roll-your-own), I keep more of the app-specific code in the app's source code repo (instead of having some of it in puppet/ansible scripts), and we can use the same system locally (on our laptops) for development. And finally, the configuration-by-env-vars convention creates a lot of clarity.
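A minimal sketch of that configuration-by-env-vars convention (the variable names here are illustrative, not from any real deployment):

```shell
# Configuration-by-env-vars: the process reads its settings from the
# environment instead of a baked-in config file (names are hypothetical)
export DATABASE_URL="postgres://db:5432/app"
export LOG_LEVEL="debug"
python3 -c 'import os; print("log level:", os.environ["LOG_LEVEL"])'
# prints: log level: debug
```

The same image then runs unchanged in every environment; only the environment variables differ.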
I'm not familiar with akka, but I do see BEAM (the Erlang VM utilized by Elixir as well) as an alternative to docker swarm/kubernetes.
It requires stack homogeneity across BEAM languages, but you can run distributed, concurrent, parallelized code with live debugging tools and "hot code reloading". I feel like the author's point is that docker and friends enable tools not meant for distributed/concurrent/parallel work to be deployed in such a way. I could be mistaken and would be curious what you think of that argument.
I'm inclined to agree with the point in as much as docker can permit forcing a square peg into a round hole. On the other hand, being able to develop on *nix and deploy to weird editions of Windows 10 has made me deeply appreciate docker.
That's exactly the point I think the author fails on. Docker is not a tool to make your app distr/conc/parallel: and thus I point out that docker is merely a way to ship code. Not as a binary. Not with load of ansible/puppet scripts. But as a container (container spec + config as env vars).
It just happens to be employed by other frameworks that make it easier to make that mistake.
Presumably the one that doesn't have to rewrite everything that they just wrote from scratch to avoid Docker.
I think that's too simplistic. Larry is right that a company who manages to survive 5 non-productive years rewriting everything in a "better" ecosystem will likely be in a better situation than the company trundling along with ever increasing technical debt.
The problem is that the conservative company without the major rewrite is more likely to make it through (at least the first few of) those 5 years. So Larry's not wrong --- he just might be making some unwarranted assumptions.
So you believe that code with no-to-minimal history running successfully will be better than dated-but-proven code that has been running for over a decade?
You mentioned hidden bugs, but what about hidden "features" that may be a critical part of existing business processes for core parts of the company? Developers really like to believe they are at the center of the wheel due to the complex work they do, but a lot of the time they are not the ones that actually create the cash-flow.
I've been part of rewrites that have succeeded tremendously, but I've also been privy to utter failures that have cost millions, and led to entire teams getting sacked.
After using languages with lots of dangling parts such as Java and Ruby, moving to Go was amazing on the deploy end.
With the dangling parts languages, you have a VM of some sort, and packages. All that stuff needs to be managed when you're building the app, and when you deploy the app, and when you maintain the app.
With a fat binary you only have to manage that once, and deployment becomes super simple. We actually put off using containers for quite a while because what's the point of putting a single binary into a container? There are reasons, but it's much less compelling than if you have all of those dangling parts to manage.
I'm working with Elixir now and I only have to apt-get Erlang and Elixir. Everything else is included in the build.
To be fair, if we developed some C extension we might need to install a few more apt packages. That would go into the ansible server setup scripts.
Deployment is not as simple as scp of a file to the server but deployment with distillery is still a single command operation once the deployment configuration file is OK. We're not using docker, which in our case would provide only sandboxing.
And this is way, way too much.
I will need a reverse proxy (nginx/traefik) anyway for non-Java services. Just give me an application that simply listens on a TCP socket (or better yet on a Unix socket) instead of that redundant intermediary that is a configuration/deployment nightmare.
With a go binary, you just have the binary. To deploy as a unit, just copy it to a machine. To run it, just run it - it's an executable. You can just give the executable to someone and they can run it. Great for sharing tools and utilities. Great for microservices because you can replace each microservice without impacting the others.
With java, you have the JRE, at least one JAR (which likely will contain lots of other jars for any non-trivial application), and likely some sort of script to orchestrate startup. You need a build system to put all that together. If you want to install it, you need something that will package it all up, which will require some thought and tooling. In the end, though, you still will have these separate parts installed on the system.
This greatly complicates maintenance, because now that you have a separate JRE, someone might want to reuse it for other things as well. They might argue it doesn't make sense to install the JRE separately for each service, or maybe their install system doesn't allow for that. If this happens, and you want to move to a newer version of your service that requires a newer JRE, you might get stuck, because the people who control the deployment will tend to control the JRE that is installed and used. You end up stuck with an old JRE, and upgrading it becomes this big coordination issue.
Any non-trivial java app will require testing on whatever platform you're targeting. In addition, it will be handicapped by the least common denominator approach of the jvm.
With a go binary, on the other hand, you may have to build a separate one for Linux and Mac, but that also tends to free up people's thinking a bit and allow for tailoring the app for each platform. The testing issue is the same as for java apps.
However, I think this viewpoint misses some of the fundamental features of Docker that have nothing to do with platform independence and more to do with isolation, configuration management and orchestration. I would use docker and docker compose even if it could only run Java, which is already cross platform, because I love how it lets me map volumes and ports, orchestrate startup and shutdown with compose, etc.
On Py2 I think it's compiled C, and that's always awkward - usually I forget to add the Python dev headers, unless I use proper systems with "ports". Start from the beginning, or don't use Py2 unless you really have a good reason.
On Py3 there is no mysqldb - they have not made the leap. So it sounds like the author got trapped in the 2to3 chasm. It happens to us all.
Problem is not docker. Problem is not python. Problem is that devops is hard, and your personal ecosystem is only comfortable because you know where all the sharp bits are.
I think such absolute thinking is misguided and is low on fundamentals.
Same thing with people arguing over what is or is not a 'systems' programming language.
In the case of this article however, I think the author just dislikes python and docker.
Great generalization with no mention of asyncio?
> Myself and a friend just spent an hour trying to get a short Python script running on an EC2 instance. We got stuck dealing with this error: ModuleNotFoundError
Outdated dependencies, dealing with your distro's package manager, and especially understanding your package manager - that's vital, isn't it? But I guess it's easy to dismiss an ecosystem without fully understanding the problem.
> the dependency management in the Python community is so badly broken
You can choose to use pip as the package manager for system libraries.
python -m ensurepip --upgrade # you won't use your distro's pip version
pip install --user <your-package>
For apps, a virtualenv is sufficient?
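For what it's worth, the virtualenv route is only a couple of commands (a sketch assuming `python3` is on the PATH; the `requests` install is just an example and needs network access):

```shell
# Create an isolated environment; its interpreter and site-packages are
# independent of the distro's Python
python3 -m venv .venv
# Dependencies would go into the venv, not the system, e.g.:
#   .venv/bin/pip install requests
# The venv's interpreter resolves imports from its own prefix:
.venv/bin/python -c 'import sys; print(sys.prefix)'
```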
I see no mention of C extensions, which are one pain point for packages. If the package doesn't have a precompiled version, you might see a big red blob of text directly from the compiler when trying to link dependencies. This kinda sucks for a newbie, but I guess a quick google search will tell you to do apt install python3-dev or something if you're using some Debian-based system like the author seems to be using.
This PyCon talk is a nice summary of the state of wheels and C extensions.
The gist is that new binary wheels are being added, and it's a community effort.
There's no reference to isolation with namespaces and cgroups.
> There are older, mature options, such as Java and C# and Erlang, and there are many newer options, such as Go or Elixir or Clojure.
The JVM, BEAM, or Go process still needs to run on a machine and interact with an OS. They still need to be scheduled across thousands of machines that are constantly breaking. They still ought to be isolated from each other when running on the same machine. There is nothing magical about these platforms that solves these problems.
Sure, and you can solve those problems without docker or dockerfile hell.
There are more stripped-down ways to achieve that if you're willing to solve a less general problem. Something like gVisor gets pretty far. In the limit, I'd guess the difference between an executable and a containerized app goes away?
The idea that Kubernetes is somehow a symptom of bad dependency management seems to miss the point of Kubernetes - to allow multiple systems to be modeled as a pool of computing resources. One could imagine a Kubernetes that runs ‘fat binaries’ instead of containers and still be useful. In fact, that may be possible to implement within Kubernetes with a simple extension, given how flexible it is.
> Developers who write Scala are in love with Akka, which appears to be very good.
Nitpick: Akka is very far from universally loved in the Scala community. Most Scala devs I know would rather use Kafka for cross-system message/event processing, and something lightweight like Monix for parallelism and concurrency within a process (you can also simulate Actor-like behavior with Monix if you want)
EDIT: there’s some kind of addendum at the end about base images where it looks like they are A) pulling straight from Dockerhub (a huge no-no from a security viewpoint) and B) not using tags right. Irrespective of how much you like or dislike a certain tool, not taking the time to learn the best way to use it is going to cost you more time in the end.
Docker is three things at once. It's a container image format, a way to build images that follow this format, and a way to run those images as containers. Therefore docker images are just giant fat binaries.
What the author seems to miss is the fact that a lot of languages need not only an archive that contains all dependencies but also an environment to run those archives, like a JVM or a Python interpreter, maybe even both at once. And this is the part where docker comes in. You split your Dockerfile into two stages. First you run the build stage, which includes all the development tools needed to build your software and generate your language-specific archive, then copy it over to the next stage, which only installs the bare minimum to run the software.
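The two-stage pattern described above looks roughly like this (image names, paths, and the entry point are illustrative, not from the article):

```dockerfile
# Stage 1: build with the full toolchain
FROM python:3.11 AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --target /app/libs -r requirements.txt
COPY . .

# Stage 2: runtime with only the bare minimum
FROM python:3.11-slim
WORKDIR /app
COPY --from=build /app /app
ENV PYTHONPATH=/app/libs
CMD ["python", "main.py"]
```

Only the second stage ships; the build tools never make it into the final image.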
There is one fatal flaw here. If fat binaries don't install their JVMs or Python interpreters, then docker containers don't install their container runtime either. So you will again need to go up one layer and use an automated tool to install docker or kubernetes.
Oh, I forgot to mention the solution to the JVM+Python problem in the same Docker container. Well, it's pretty easy. In the build stage you just tell pip to install the dependencies into the libs folder of your application instead of the global folder, then you create a bash script that sets the PYTHONPATH to your libs folder and passes CLI args to the Python script. That way you don't need pip in the final stage of the Dockerfile.
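That trick can be sketched like so; since this sketch can't hit the network, a stand-in module plays the role of a pip-installed dependency (a real build would run `pip install --target ./libs -r requirements.txt`):

```shell
# Stand-in for "pip install --target ./libs <deps>"
mkdir -p libs
printf 'GREETING = "hello from libs"\n' > libs/mylib.py
# The wrapper script's only job: point the interpreter at the vendored libs
PYTHONPATH=./libs python3 -c 'import mylib; print(mylib.GREETING)'
# prints: hello from libs
```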
I do think this article has some interesting points.
But, he misses the main point of Docker (or LXC for that matter), which is to scale/downscale and maintain a massive number of VMs.
I worked at a time when we had 16 or 64 servers (not VMs), running several instances of Weblogic/JBoss. Each deployment took hours, and we didn't have any elasticity in the number of servers/instances (each server could run X instances of JBoss) to scale up/down. If we hit the max, management bought another server, and that's it. Oh, and every deployment meant downtime, because we couldn't do the crazy stuff Docker allows.
To me, Docker is a tradeoff: you're trading disk space and a little bit of efficiency for removing a TON of variables. Given that I have zero interest in debugging servers, that's a great win for me. YMMV. But if you have to potentially deal with thousands of servers, it seems like an obvious choice (excluding otherwise using something like a VM).
The JVM may have promised "write once run anywhere", but that's never really been particularly true. You've only ever really gotten that with an actual VM, and those are much heavier than Docker is
You're going to have to debug something, even with Docker. I still have nightmares about the Docker 1.12 rollout.
> The JVM may have promised "write once run anywhere", but that's never really been particularly true. You've only ever really gotten that with an actual VM, and those are much heavier than Docker is
Which is to say that you don't get "write once run anywhere" with Docker either. In x86 land you'll still have differences (some more serious than others) between Docker on Linux, Mac (that still runs under a Linux VM, with all the networking hell that entails, right?), and Windows.
I suppose the immutability of the layers amortizes the size cost as one starts to run ludicrous numbers of service copies.
Frankly Python is still catching up to many improvements that have happened in the 2000s. That said, if you want to avoid using Docker as a packaging kludge for Python, it's possible to do so in many cases.
First, yes, Docker is a hack to make a fat binary. But that's not the only reason people use it. People use it because it's easier than packaging and running their app "the right way". Running 10 fat binaries all listening on the same TCP port and IP won't work. Even if your 10 binaries all co-exist on the filesystem just fine, you still need to run them in a way that won't conflict. You can do that without Docker, but Docker makes your life easier, so people use it.
Second, concurrency frameworks don't make your life easier, they make it harder. It's a language-specific abstraction that you have to design your app for. You should be able to just call fork() and join(), or listen() and accept(), and have some OS-level, generic component handle the concurrency for your app. This is not a failing of the language, but a failing of the OS.
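To be fair, the shell already exposes exactly that kind of OS-level primitive: `&` forks a child process and `wait` joins it - a rough sketch of the generic, language-agnostic concurrency the comment is asking for:

```shell
# Two independent tasks run concurrently with no language framework at all:
# '&' forks a child process, 'wait' joins it
sleep 0.2 & pid1=$!
sleep 0.2 & pid2=$!
wait "$pid1" "$pid2"
echo "both children finished"
# prints: both children finished
```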
And if you're going to rag on the industry, target the lack of standards. "Integration" is the opposite of standardization. The fact that I have to re-write 10 layers of glue to make some shitty web app run on AWS versus Azure is frankly one of the stupidest things I've seen in my career, but it makes sense when you realize all this tech was written by small departments in companies who don't give a shit about standards. Why spend time on interoperability for a product that only needs to work with one thing, right now?
You should be able to just call one function call in any app, and your runtime should communicate with a language-agnostic ABI for anything needed to route communications between a service broker and the code being executed, including changing the location the computation is happening on. And executing any application should allow you to include some isolation guarantees, because any application can have interoperability problems with other components running on the system. Containerisation is, I believe, the only thing besides Jails that has attempted to do this.
And if we're serious about microservices, why do specific clusters need to live behind a dozen levels of security groups, load balancers, VPCs, etc that all have to be custom crafted before we can even run an app in a specific region? That's a shit load of state surrounding a supposedly stateless service.
I think k8s is in some ways a response to all this, but it's still a poorly designed hack that enslaves itself to traditional OS models. What we really need are distributed operating systems. But just like we will never retrofit our highways with electric tracks for self-driving electric cars, we will never begin to upgrade the OS to be more standard, automated, and distributed. Instead we'll just push more shit into the app layer. Browsers now implement their own TCP/IP stack, and nearly all modern apps support encryption without ever allowing the OS to handle it. We're just too damn lazy to tackle anything but short term goals.
A lot of what you are asking for was and is in VMS.
That's the difference between IaaS and PaaS. If you don't want to deal with custom infra, go with heroku or platform.sh rather than AWS.
What do people actually disagree with here?
If you don't want to create something from scratch you can even use ready solutions like Flynn.
My point is more like, a lot of the time we don't need a PaaS, we just need to do less "integrating", if cloud providers all implemented the equivalent of Terraform providers. But going further, that we shouldn't need to use K8s if the operating system came with similar functionality and exposed it as an ABI.
All these things would let you "build from scratch" distributed parallelized network services, with a lot less layers of technology, in a more standard way. Rather than building a spaceship just to run errands, we could be cruising around on city rental bikes, or golf carts, or something. This comparison might have gone off the rails.
I am mostly a python dev though. I may have just ignored the docker option when presented with a python project, but used it for those other languages.
For the reasons why, see the article (you can safely skip the first two paragraphs). The biggest gripes are concurrency and package management.
I don't see why the two are necessarily related. Docker lets me package a reusable and repeatable deployment of OS+application ... if you want to use that as a crutch to solve environment problems, why not?
But most of all, I found the definition of “conservative” so apropos to politics (accidentally, surely) that I both laughed and cried.