As the reason for using Docker, the author writes "The benefits of matching your development environment to your production one cannot be overstated". Again in my personal experience, this simply isn't an issue. Just use the same Python version and a roughly similar PostgreSQL version in development and production, and you're good to go.
In short, I feel OP would be better off without Docker. Once you know the rest of the stack well enough, you can set it up in dev and prod in such a way that you can trust that what works in dev will also work on prod. I've been doing this without problems for years.
As I said before, you must have your reasons for your view, but my own experience is the polar opposite.
For platforms which support more self-contained binaries, I'm going to keep trying to avoid docker as much as possible.
One of the most important things docker provides is an abstraction between the application and the OS. If you don't need the abstraction, then it's only going to look like cost.
On the other hand, being able to write some code in python 3.9 and deploy it anywhere across a heterogeneous production environment without fighting with the OS about how many versions of python it has installed is a useful thing.
And if you're using Java/Node/C#/etc? Where all dependencies are included in the compiled output?
But it ends up being a matter of preference. I could still do it without it, but I'm used to it now, and for some reason I feel that it makes my systems more "reproducible" in a simpler way.
Docker is a huge security hole.
Oh, Crypto miners ....
Docker uses shared resources like the kernel. The Linux kernel is a big, ugly C mess (compared to IncludeOS), and one can probably find a good enough kernel exploit and then escape Docker.
That's why a VM provides much better security. VM escape exploits exist, but they are at least much harder than, say, a Docker-level escape.
I don't find docker that useful for developing on a day-to-day basis, however; it just gets in the way, and generally you're not going to need a webserver, SSL, celery, rabbitmq in development anyway. It is useful to debug things locally though.
The other reason to use docker is state: while using ansible we got bitten by old dependencies hanging around because ansible doesn't exactly mirror your config, it can really only add stuff. Whereas docker actually wipes the slate clean every single deploy so your server is an exact mirror of your docker-compose file. This is really important.
Still important to keep in mind those Docker images must be maintained too.
There's a lot of emphasis on your own perspective: "*I* don't get", "*I've* been doing that", "Not once have *I* missed Docker", "making *me* less productive", "*my* personal experience"
And an assumption that it applies to others: "Just use the same [...]", "I feel OP would be better off [...]".
The comment could provoke a more productive, thoughtful discussion by focusing on understanding rather than on pushing particular views and practices.
- "I don't get why people use Docker" -> "Could anybody share their reasoning for using Docker?"
- "You can run everything without Docker to remove an extra level of indirection" -> "What benefits do you see in exchange for this extra level of indirection?"
- "Not once have I missed Docker." -> "I haven't found much use for Docker and would like to understand those who have"
- "I have taken over [...] making me less productive" -> "I've only encountered downsides in my own experience with Docker, could anybody share what the upsides may have been for others involved?"
- "Again in my personal experience, this simply isn't an issue" -> "I'd be interested to hear more about the benefits as I haven't encountered issues before"
- "In short, I feel OP would be better off without Docker" -> "I wonder if the OP really needs Docker, perhaps they could be more productive without it"
If we all wrote like this (and I'm certainly not perfect myself), HN could be a much friendlier, more welcoming place.
It strikes me as exceedingly dishonest to pretend that you're genuinely interested in someone's opinion if you already know you strongly disagree with it, and it's by no means a requirement to lack an opinion to have a respectful debate. Telling someone that you believe they're wrong is not an insult.
People have subjective opinions and experiences; this does not necessarily make their statements true. This should be obvious and I don't see why it's the sender's responsibility to preface everything they say with that fact to avoid stepping on the toes of whoever reads it. Opinions don't kill debate, they further it.
That's really not what I'm trying to say at all. If I had to condense it down to one sentence it would be: "if there's a difference in opinion between two reasonable people, it's probably because you've had different experiences; you'll have a more productive discussion by finding out what they are than simply stating your opinions".
> It strikes me as exceedingly dishonest to pretend that you're genuinely interested in someone's opinion if you already know you strongly disagree with it, and it's by no means a requirement to lack an opinion to have a respectful debate. Telling someone that you believe they're wrong is not an insult.
If you're not interested in somebody's opinion, what value could there possibly be in having a discussion with them?
Not every comment on the Internet is an invitation to discussion, and the discussion is not exclusively between two people. A lot of comments are just responses about why someone disagrees or agrees with what was just said, which is allowed.
OP’s post was clear and to the point. It was entirely focused on the content of their frustrating experience with Docker. And it wasn’t at all “adversarial”. They had a bad experience with software and are sharing it.
I’m glad they didn’t try to wrap it in softening language just in case someone might find their communication style “violent.” I hate this trend. I see so many comments like this in tech and honestly it’s mostly from people who have nothing to say or who become indignant if you say anything they disagree with.
And being direct isn't a bad thing either, my point here wasn't "don't be direct or concise, you'll offend people" but "assuming everyone here is reasonable, disagreements are likely due to differences in experience and you'll have a more productive discussion by figuring out the differences than simply stating opinions".
Ironically I find the first comment non-adversarial and this comment overly-adversarial, but maybe that’s just me.
People are allowed to disagree and share their own experiences, and OP didn’t say anything nasty about anyone. I would prefer a space where people share their honest experiences with reasoning over one where everything has to be coated in sugar to make it digestible. And I’m sure the author of the post can handle hearing someone else’s differing opinion, it’s a discussion forum after all - not a blind agreement forum.
In its current form it is very clear that it is their opinion.
This is a well-intentioned thought but I'm not sure it actually makes sense. The things that make or detract from HN being a friendly and welcoming place have a lot to do with its size and heterogeneity -- ideology included. You're trying to neuter the disagreement and rephrase it in terms of a question. That's not always a bad thing, but in some cases you really do just disagree. If that's going to be the case, why not be upfront about it, so long as you can be civil? (GP's point does seem to be phrased in a pretty civil manner IMO)
In my ideal world, everybody would just use what comes with vanilla Python:
python3 -m venv venv
(Or `call` on Win)
pip install -Ur requirements.txt
Docker does it for you.
apt install python3-venv
# Build the image if it does not exist
if [[ $(podman images --filter "reference=$IMAGE_NAME" -q) == "" ]]; then
podman build -t "$IMAGE_NAME" -<<EOF >&2
RUN python3 -m pip --no-cache-dir install youtube_dl
podman run --net host -i --rm -v "$PWD:/app" -w /app "$IMAGE_NAME" "$@"
Dealing with BC breaks for versions you aren't intending to run in production seems like unnecessary overhead.
My team insists on using Docker and against my better personal judgement I let it, but I set up the code to run locally without it. If I called the shots, I absolutely would not use docker.
1. Open a port on the container for debugging
2. Tell the debugger in the container what's the port it should use (if you're not using the default one)
3. Tell the IDE what port it can use to connect to the debugger (if you're not using the default).
4. Debug it!
For things like Typescript you might need some extra trickery because it's all transpiled, so you'll need to make sure your sourcemaps are set up correctly, but that's not overly difficult.
It did take me several hours to work everything out and write a README so that every time we get a new hire / someone sets up a new PC they can just follow the instructions. I'd say it was time well spent.
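For a Python service, the four steps above might look roughly like the sketch below. This is an illustration rather than the setup described in the comment: the image name `myapp-dev`, the entry point `app.py` and port 5678 are assumptions, and it presumes `debugpy` is installed in the image. Setting `DOCKER=echo` prints the command instead of running it.

```shell
# Assumed names: myapp-dev (image), app.py (entry point), 5678 (debug port).
DOCKER="${DOCKER:-docker}"        # DOCKER=echo gives a dry run
IMAGE_NAME="${IMAGE_NAME:-myapp-dev}"
DEBUG_PORT="${DEBUG_PORT:-5678}"

start_debug_container() {
  # Step 1: publish the debug port to the host.
  # Step 2: make debugpy listen on that port and wait for the IDE to attach.
  # Steps 3 and 4 (pointing the IDE at the port, debugging) happen in the IDE.
  "$DOCKER" run --rm -it \
    -p "$DEBUG_PORT:$DEBUG_PORT" \
    -v "$PWD:/app" -w /app \
    "$IMAGE_NAME" \
    python3 -m debugpy --listen "0.0.0.0:$DEBUG_PORT" --wait-for-client app.py
}
```

Calling `start_debug_container` and then attaching the IDE to `localhost:5678` covers the whole list; the function is the kind of thing that can go straight into the README.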
Inside containers and without Docker alike.
The team would no longer have to make these deployment decisions and argue about which tool is a better fit. They'd make the decision once, hopefully follow best Docker practices (unprivileged user, multi-stage builds, etc.), and have documentation available for how to integrate the setup with IDEs, work with volumes, etc.
Once this initial adoption hurdle is overcome, IME the productivity gains are greater than the issues of dealing with Docker. It becomes trivial to setup CI/CD, onboard new developers and integrate the app into other workflows.
Docker and containers in general have become mature enough to prove their use case and benefits, so the cargo cult argument doesn't hold weight for me.
But true, you weren't supposed to need it. But managing projects with interlocked dependencies gets old fast.
... and add questions like: are all paths correct for this machine, are binary lib dependencies the same as in production, are the OS versions compatible between dev and prod, is my local runtime version compiled with same options?
The great example where the indirection adds value is: every developer has the same environment and the CI is the same, regardless of personal preferences in systems.
A lot of the problems docker solves show up as soon as you need to fit out a team with a mix of environments and split ops from dev
I still use it for individual projects because I never plan on maintaining everything forever - makes it easier to grow to a 1+1 team or hand the project to someone else or solicit contributions
Also builds your experience for when you do work in teams
Some languages bundle all of their dependencies so you can be relatively sure they will run the same on prod. For others (Python, Ruby) that use many system libraries, containers may add value
Yes, which is where you can reach for firecracker for example. But docker gets you 95% there. If kernel makes a difference you can make the decision about the other 5%.
> Some languages bundle all of their dependencies so you can be relatively sure they will run the same on prod.
As long as they're static, vendored dependencies. Otherwise you're still likely to run into pulling something very common like openssl, zlib or libuv from the system.
A Dockerfile is almost exactly like your requirements.txt file, only it works for everything. Need Imagemagick installed on your server? Just add it to the Dockerfile. Need to run a pre-start tool that was written in Ruby? A line or two in the Dockerfile can add that. And if it turns out that you don't want it after all, you just remove the lines and feel confident that no trace is left behind.
1) Having to "docker exec" to get a shell in a running container vs just running the command (in dev) or sshing into the server (in prod).
2) Tests taking 3:30 min to run with docker vs 30 secs without. And then of course, you don't just run tests once but many times, leading to tens of minutes on a given day where I'm just twisting my thumbs.
3) Having to even spend time learning how to debug Python code inside a locally running docker container.
4) Having to deal with disk space issues caused by old docker images on my 500gb hdd.
All while not deriving any value from it in my projects.
This is not a problem with docker, but your environment. Tests should run at the same speed and initial time with or without docker (with an exception for changes that update dependency list - that will take the time for the initial build).
Things to check: Are you installing dependencies before adding the app? For development are you mounting the app instead of building a new image each time?
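Both checks can be sketched in a few lines of shell. Everything named here is an assumption for illustration: the `python:3.9-slim` base image, the tag `myapp-test`, and `pytest` being listed in `requirements.txt`. Setting `DOCKER=echo` prints the commands instead of running them.

```shell
DOCKER="${DOCKER:-docker}"              # DOCKER=echo gives a dry run
IMAGE_NAME="${IMAGE_NAME:-myapp-test}"  # hypothetical image tag

print_dockerfile() {
  cat <<'EOF'
FROM python:3.9-slim
WORKDIR /app
# Dependencies first: this layer is rebuilt only when requirements.txt changes.
COPY requirements.txt .
RUN python3 -m pip --no-cache-dir install -r requirements.txt
# Application code last, so code edits do not invalidate the layer above.
COPY . .
EOF
}

build_image() {
  # Feed the Dockerfile on stdin and use the current directory as context.
  print_dockerfile | "$DOCKER" build -t "$IMAGE_NAME" -f - .
}

run_tests_mounted() {
  # Development: mount the working tree instead of rebuilding the image.
  "$DOCKER" run --rm -v "$PWD:/app" -w /app "$IMAGE_NAME" python3 -m pytest
}
```

With this layout, `build_image` is only slow when the dependency list changes, and `run_tests_mounted` picks up code edits without any rebuild.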
1. If you're doing this, then you're likely using docker incorrectly. Your container should be run from images that are automatically deployed. If there's an issue, fix local and deploy image.
2. I agree, that sounds dreadful. We use pyenv and poetry for local development and docker for deployments. That would address your issue. We of course do not use pyenv or poetry inside the docker image. Hopefully that helps clarify potential real world use.
3. You should not be debugging a live deployment. Debugging a local container is straightforward in VSCode.
4. This shouldn't be an issue. Prune weekly, and aim for smallest images where possible, e.g., 60MB for python microservice.
The value is in the simplicity of deployment. Need to host a well-known piece of software? It can be up or down in no time.
1. First, the infra to automatically deploy again introduces complexity. And waiting for the Docker push to complete and then until the new container is started takes away from my time.
2. Yes, it is a pain.
3. I do what's necessary to fix problems. Sometimes that means looking at production. I use PyCharm, not VS Code, and because I don't find value in Docker for my projects I have no incentive to look into how to set up local debugging.
4. "Prune weekly" you say. But it's just another complication I have to deal with when using Docker. What for?
> The value is the simplicity of deployment
I would argue my deployments are simpler than yours. Give me git and ssh and I'm good to go. No need for Docker, pushing to some registry, looking at a dashboard or waiting for the image to be deployed. And my setup is much easier to debug.
But it sounds to me like you are coming from a more enterprise-y environment. There, the things you say probably make sense. In my case, I'm a sole developer. Any unnecessary process or tool slows me down and incurs a risk of bugs due to added complexity.
> Need to host a well-known software?
I don't have this need.
FYI: the wait time is marginal at best to deploy images and containers. My workflow is the same as yours most of the time for solo projects: SSH into server, git pull (from a "stack" branch that has docker-compose files containing DB and microservices), docker-compose up -d, and because of cached images it takes minimal time to deploy.
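As a rough sketch, that workflow fits in one function. The host `deploy@example.com` and the directory `/srv/myapp` are placeholders, and the server's checkout is assumed to already be on the branch that holds the compose files. Setting `SSH=echo` prints the remote command instead of running it.

```shell
SSH="${SSH:-ssh}"                                  # SSH=echo gives a dry run
DEPLOY_HOST="${DEPLOY_HOST:-deploy@example.com}"   # placeholder host
APP_DIR="${APP_DIR:-/srv/myapp}"                   # placeholder checkout path

deploy() {
  # Pull the compose files, then let docker-compose recreate only the
  # services whose image or configuration changed.
  "$SSH" "$DEPLOY_HOST" \
    "cd '$APP_DIR' && git pull --ff-only && docker-compose up -d"
}
```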
I agree that as a sole developer it can add initial complexity. As a sole developer myself on prior projects and now on personal projects, I use Docker primarily to streamline my deployment practices.
Another excellent use case of Docker not mentioned elsewhere in this thread is the simplicity of running databases locally, e.g., mongodb, postgres, etc.
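A minimal sketch of a throwaway local Postgres, assuming the official `postgres` image; the container name `dev-postgres`, the version tag and the password are made up for the example. Setting `DOCKER=echo` prints the commands instead of running them.

```shell
DOCKER="${DOCKER:-docker}"        # DOCKER=echo gives a dry run
PG_VERSION="${PG_VERSION:-13}"    # pick the major version you run in production

start_dev_postgres() {
  # POSTGRES_PASSWORD is read by the official image on first start.
  "$DOCKER" run -d --rm --name dev-postgres \
    -e POSTGRES_PASSWORD=devonly \
    -p 5432:5432 \
    "postgres:$PG_VERSION"
}

stop_dev_postgres() {
  # Because of --rm above, the container and its data are discarded on stop.
  "$DOCKER" stop dev-postgres
}
```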
If this bothers you, take a look at a 3rd party docker management UI such as portainer (vanilla docker) or k9s (kubernetes). These tools let you navigate and launch a shell in your container quickly. Very useful if you have tons of apps running on your node.
> Tests taking 3:30 min to run with docker vs 30 secs without. And then of course, you don't just run tests once but many times, leading to tens of minutes on a given day where I'm just twisting my thumbs.
In my case, I don't use docker in development phase. I test in local environment and only build the images when it's ready for deployment.
> Having to even spend time learning how to debug Python code inside a locally running docker container.
I never have to do this anymore. I just hook up sentry or newrelic and they'll log exception stack traces that I can use to figure out the issue without live-debugging the app.
> Having to deal with disk space issues caused by old docker images on my 500gb hdd.
Yes, disk usage is one of the drawbacks of using docker. Pruning images especially sucks on busy servers with spinning rust. On SSD, pruning is not as slow though.
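For what it's worth, the pruning itself can be a two-line function run from cron or by hand. The one-week cutoff is an assumption taken from the "prune weekly" advice earlier in the thread; setting `DOCKER=echo` prints the commands instead of running them.

```shell
DOCKER="${DOCKER:-docker}"     # DOCKER=echo gives a dry run
MAX_AGE="${MAX_AGE:-168h}"     # one week

prune_old_images() {
  # Show what is using disk space, then remove images that are older than
  # MAX_AGE and not used by any container.
  "$DOCKER" system df
  "$DOCKER" image prune --all --force --filter "until=$MAX_AGE"
}
```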
VSCode lets you debug inside a Docker container.
> or in production, do things such as inspecting log files
Logs should be sent out of the container, either directly (mount a /logs folder, push them to Sentry / ELK / etc.) or by simply logging to stdout and having Docker send the logs where you want.
> monitoring system resources
Docker processes are still processes. They show up in `top` and friends just fine.
> running necessary commands if there is something urgent...?
`docker exec my_container <insert command here>`.
Although I've never even considered doing that. For the past few years the resolution to a production bug has always been "rollback to the previous image, then if any data got borked fix it in the database".
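The three answers above map onto three short wrappers. This is only a sketch: `my_container` is the placeholder name used above, and `DOCKER=echo` prints the commands instead of running them.

```shell
DOCKER="${DOCKER:-docker}"              # DOCKER=echo gives a dry run
CONTAINER="${CONTAINER:-my_container}"  # placeholder container name

# Inspect logs: whatever the app wrote to stdout/stderr.
tail_logs()      { "$DOCKER" logs --tail 100 --follow "$CONTAINER"; }

# Monitor resources: one snapshot of CPU, memory and I/O for the container.
show_resources() { "$DOCKER" stats --no-stream "$CONTAINER"; }

# Run an urgent command inside the running container.
run_in()         { "$DOCKER" exec "$CONTAINER" "$@"; }
```

For example, `run_in ls -l /app` lists the application directory inside the container.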
But it also sounds like you are operating within a larger organization. So your requirements may be different from mine.
This insistent push of "the old way was good, so why did we expend all this effort to make a new thing whose five new invocations I don't want to bother learning" just doesn't line up with the needs of today.
Docker pull, docker exec, docker ps, docker logs and you've pretty much got what you need for ninety percent of your job.
This stuff is not hard. You make it hard for yourself by digging in.
"New is good" is also no general justification. It always depends on the context. Several replies in this thread mention very good use cases where Docker makes sense. From what you wrote, it sounds to me like it also makes sense in your environment.
Yes I mean I don't want to have to learn things that don't bring me value. But it's not about learning the commands. It's about having to repeat them over and over again in my daily work. About the associated mental burden "am I in the container now? Is it running?" And about everything taking longer, be it running tests in Docker or pushing to a registry and waiting for the new container to be spawned. As a single dev, and I feel this is where your and my requirements differ, it simply is not worth it.
I would not want to go back to the old ways of doing things.
Can you connect to a remote host that runs the container?
Docker exec won't work if the container spins into start/fail loop.
Git pull, python3 setup_prod.py, pip install -r requirements.txt
Is docker worth not writing the 50 (admittedly unfun) lines of setup?
Docker is useful for reasons besides scale. It allows you to (kind of) declaratively define your environment and spin it up in any number of different scenarios.
It allows you to ship all your dependencies with your app and not worry about getting the host machine properly configured beyond setting up Docker and/or Kubernetes.
Have other apps you want to host on the same set of machines? No problem.
When you pair that with something like alpine Linux, you’re getting a whole lot almost for free.
For my Go/Haskell binaries, I usually do need to make changes in order to get them working. The smaller image size is essential in my use case though so I pay that penalty.
I achieve the same (defining the environment) by pinning Python dependencies via requirements.txt and virtual environments. And I have an installation script, like a Dockerfile but just an executable bash script, that installs PostgreSQL etc, pulls the code from GitHub and starts up the server. I can upload this to a clean Debian installation to set up the server, with a well-defined stack, in minutes.
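An install script of that kind might look roughly like this. It is a guess at the shape, not the commenter's actual script: the repository URL and `/srv/myapp` are hypothetical, and starting the server is left out because it depends on how the service is defined. `DRY_RUN=1` prints each command instead of executing it.

```shell
# Hypothetical values; replace with your own.
REPO_URL="${REPO_URL:-https://github.com/example/myapp.git}"
APP_DIR="${APP_DIR:-/srv/myapp}"

run() {
  # DRY_RUN=1 prints the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

provision() {
  # Meant for a clean Debian box, run as root. Stops at the first failure.
  run apt-get update &&
  run apt-get install -y postgresql python3-venv git &&
  run git clone "$REPO_URL" "$APP_DIR" &&
  run python3 -m venv "$APP_DIR/venv" &&
  run "$APP_DIR/venv/bin/pip" install -r "$APP_DIR/requirements.txt"
}
```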
> Have other apps you want to host on the same set of machines
I don't. I just use one $5 per month Linode for each SaaS. (Or bigger Linodes as the projects get more users. My biggest single box currently serves ~100,000 users.)
Well then you're basically using VMs as your containerization mechanism, with install scripts replacing Dockerfiles.
So from your perspective, don't think of Docker as a really fat binary. Think of it as a stripped-down VM that doesn't cost a minimum of $5 per instance, and comes in a standardized format with a standardized ecosystem around it (package registries, monitoring dashboards, etc.)
On the contrary, it's easier to debug with Docker - it eliminates dangling system level libraries / old dependencies / cache, and everything is self-contained. If you have the problem in dev, you'll have it in prod as well.
Docker is not a VM, it's basically a big wrapper around chroot and cgroups, the performance hit is minimal. The advantage is that, again, it's self-contained, so there's little risk some OS / dangling library muddies the waters (especially in Python that's a great risk: OS-level Python library installations are a thing, and many a Python library depends on C libraries on the system level, which you can't manage through Python tooling). It's also idempotent (thus making rollbacks easier) and declarative.
Easy fine grain control of individual application memory, namespace isolation (container to container communication on a need to know basis), A-B testing, ease of on-boarding, multi-os development, CD pipelines, ease of container restarting etc.
I much prefer building a Docker image and pushing it somewhere compared to tarballs of the repo or repo access. I would rather build rpms or debs than just push the repo around. Container tooling makes that kind of stuff nice in my opinion.
I also don't see why you'd use a CI pipeline for simple projects where you are typically the only developer. Run tests locally, if everything is fine push to prod.
I use git with a small script to roll out to production. When I execute the script on the server, it stops all services, pulls the latest code from Git, applies Django migrations, and starts everything up again. I use desk to make this as simple as a single `release` command in my local shell.
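A sketch of what such a server-side script could look like, assuming a systemd unit named `myapp` and a virtualenv under `/srv/myapp/venv`; both names are invented for the example. `DRY_RUN=1` prints each command instead of executing it.

```shell
APP_DIR="${APP_DIR:-/srv/myapp}"   # hypothetical checkout path
SERVICE="${SERVICE:-myapp}"        # hypothetical systemd unit

run() {
  # DRY_RUN=1 prints the command instead of executing it.
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

release() {
  # Stop, update the code, migrate, start again.
  run systemctl stop "$SERVICE" &&
  run git -C "$APP_DIR" pull --ff-only &&
  run "$APP_DIR/venv/bin/python" "$APP_DIR/manage.py" migrate --noinput &&
  run systemctl start "$SERVICE"
}
```

Because the steps are chained with `&&`, a failed pull or migration leaves the service stopped rather than starting half-updated code; whether that is the right trade-off depends on the project.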
You may say that it can be mitigated with some wrapper scripts with limited commands, but then you have to maintain them and we can all agree that homebrew security is very hard to do correctly.
And it won't break the host machine because someone ran sudo pip install -r requirements.txt by mistake.
As I always say, know what you are doing and know how to debug everything. If somebody is more comfortable with Docker, I think there's no problem there either :). Sometimes it solves a real pain, sometimes it doesn't (having an Elixir release or a Java jar means I don't really need containers).
If you are on Docker, I recommend looking into Podman (maybe even together with runc -> today trending on HN). Don't run your containers under root and run your app inside the container under a specific user as well.
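The "specific user inside the container" part can be sketched as a Dockerfile, printed here from a shell function so it can be piped into a build. The base image, the user name `appuser` and the entry point `app.py` are all assumptions made for the example.

```shell
print_nonroot_dockerfile() {
  cat <<'EOF'
FROM python:3.9-slim
# Create an unprivileged account (the name "appuser" is arbitrary).
RUN useradd --create-home appuser
WORKDIR /home/appuser/app
COPY --chown=appuser:appuser . .
# Everything from here on, including the running app, is no longer root.
USER appuser
CMD ["python3", "app.py"]
EOF
}
```

Something like `print_nonroot_dockerfile | podman build -t myapp -f - .` as a regular user would then cover both halves of the advice, assuming a rootless Podman setup.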
Shameless plug: I am now writing a book on application deployment and will show both approaches, Docker and Docker-less.
Sandboxing/isolation is a very good mitigation against software supply chain worries IMO
I work with products based on the official Python Docker image, but the base image containing the version of Python they want was released in April and hasn't received updates since. Developers forget that releasing software as Docker images means that they are now responsible for patch management.
I can't imagine a world where running python manage.py runserver is less productive than spinning up two containers.
But once you need stuff like Celery workers, message brokers, and multiple databases for sharding, Docker is a life-saver.
Sure, you can do similar with ansible or puppet, but use something.
SSHing into boxes to fix things wasn't good enough in the 90's, one reason I looked into building .debs back before the turn of the century. cough
A Dockerfile may be complex for any piece of the stack. But now this complexity is not part of deploying the application. Each component has its own build process which is decoupled from the others. And even a really complex installation process results in a deployment process that works the same way.
If you can't see these benefits then maybe you haven't had to deal with complex software installation. Or maybe you haven't had to run the same stack on your laptop that you run in production. But if you ever get to the point that you want to deploy something in more than one place, or more than once to a single place, it really makes things much easier.
It depends on how you're using it.
Docker itself isn't going to increase your hosting costs. If you rent a server for $20 a month, it's still $20 a month with or without Docker being installed.
I've been the lead developer on teams where I introduced Docker to solve consistency/reproducibility issues in AWS and Azure.
I've also done smaller applications in DotNet Core, Go, Node, Python, and Ruby. In those cases I've used other alternatives, including:
- Known Linux version, with git push-to-deploy (my favourite)
- Packer (the server), with embedded codebase (still quite simple)
- Docker (for most non-trivial deployments)
- Known Linux version, with chef or ansible (as an alternative to Docker)
- Terraform the machine, upload the codebase, run scripts on the server (ugh)
Every method had its place, time, and reason. If possible, for simplicity, I'd go with the first option every time and then the others in that order.
The thing is, though, I may have an order of preference but that is totally overridden by the requirements of the project and whether or not the codebase is ever to be shared.
For solo projects and small sites, I've not benefited from Docker as I have never had any server/OS issues (and I've been doing dev stuff for decades).
However the moment there was a need for collaborators or for pulling in extra dependencies (and here is the crunch point for me) such as headless browsers or other such 'larger' packages, then I would move on to either Packer/Terraform for fairly slow-changing deployment targets or Docker for fast-changing targets, as otherwise I inevitably started to find subtle issues creeping in over time.
In other words keep it simple while (a) you can and (b) you don't need to share code, but complexity inevitably changes things.
1) By installing packages on the system directly, you can't be sure that a future update won't break it. For example, when using Tomcat or a similar application server, new updates sometimes deprecate or disable functionality in the older versions, or add new configuration parameters, that the configuration from the older versions won't have, thus leading to weird behaviour. This will be especially noticeable, if you set up development, test or production environments some time apart, or in different locations, where the available packages may not be 100% consistent (i've had situations where package mirrors are out of date for a while).
2) Furthermore, if you maintain the software long term, it's likely that your environments will have extremely long lists of configuration changes, managing which will be pretty difficult. This results in the risk of either losing some of this configuration in new environments, or even losing some of the knowledge over time, if your approach to change management isn't entirely automated (e.g. Ansible with the config in a Git repo and read-only access to the servers), or you don't explain why each and every change is done.
3) Also, it's likely that if you install system packages (e.g. Tomcat from standard repositories instead of unzipping it after downloading it manually), it'll be pretty difficult for you to tell where the system software ends and where the stuff needed by your app begins. This will probably make migrating to newer releases of the OS harder, as well as will complicate making backups or even moving environments over to other servers.
4) If your application needs to scale horizontally, that means that you'll need multiple parallel instances of it, which will once again necessitate all of the configuration that your application needs to be present and equal for all of them. You can of course do this with Ansible, but if you don't invest the time necessary for it, then it's likely that inconsistencies will crop up. In the case of Knight Capital, this caused them to lose more than 400 million dollars in less than an hour: https://dealbook.nytimes.com/2012/08/02/knight-capital-says-...
Edit: It was stated in this case that "scale is not an issue" and therefore this point could be ignored. But usually scale isn't an issue, until it suddenly is.
5) Also, you'll probably find that it'll be somewhat difficult to have multiple similar applications deployed on the server at the same time, should they have any inconsistencies amidst them, such as needing a specific port, or needing a specific runtime on the system. For example, if you've tested application FOO against Python 3.9.1, then you'll probably need to run it against it in production, whereas if you have application BAR that's only tested with Python 3.1.5 and hasn't been updated due to a variety of complex socioeconomical factors, then you'll probably need to run it against said older runtime.
6) Then there's the question of installing dependencies which are not in the OS's package repositories, but rather are available only in npm (for Node.js), pip (for Python) or elsewhere, where you're also dealing with different mechanisms for ensuring consistency, and making sure that you're running with exactly the version that you have tested the application on can be a bit of a pain. For an example of this going really wrong, see the left-pad incident: https://www.theregister.com/2016/03/23/npm_left_pad_chaos/
Essentially, it's definitely possible to live without containers and still have your environments be mostly consistent (for example, by using Ansible), but in my experience it's just generally harder than to tell an orchestrator (preferably Docker Swarm/Hashicorp Nomad, because Kubernetes is a can of worms for simple deployments) that you'd like to run application FOO on servers A, B and C with a specific piece of configuration, resource limits, storage options and some exposed ports.
I feel like this is largely an issue of tooling and our development approaches. NixOS attempts to solve this, but i can't comment on how successful it is: https://nixos.org/
I kind of agree with you - build with whatever tools you have. But, we also want to be curious and open minded of other tools that exist.
Sorry for my harsh reply.
I found htmx easier to understand and handle.
Here are the two implementations (backend is in Quarkus/Java):
In addition to the founder, there are at least 3 active contributors with significant contributions to HTMX. It may have started as a one man show, however, the knowledge is now spread over quite a few people who are exceptionally helpful.
My go to stack for small projects where scale is not an issue is Laravel, Laravel Forge for deployment, Vue or jQuery for interactivity, SQLite for database, Redis for cache/queue and...that’s it. No Docker (because Laravel has a super simple dev environment setup with Valet), not a single YAML configuration file to be found anywhere, and this kind of setup on a single $20 DigitalOcean server can literally serve 100k users without a hitch. How many apps have more than 100k users?
I'd guess sqlite (for example) is fine for very simple apps with one user at a time, but that bar is pretty low. Yaml itself is not an issue, and intercooler/htmx is simpler than your JS picks. shrug
1. If you're not going for scalability why Postgres and not SQLite?
2. What does your monitoring look like? I've read something about Grafana and Hetzner dashboards in the comments but what exactly do you use and where do you run that? Also, do you have anything for intrusion detection specifically? (I'd be extremely paranoid about that.)
1. I chose Postgres because I have the need to remotely manage as well as pull metrics into grafana and metabase. That is simply not possible with SQLite.
I realize this post should be a series of posts where I go deep on all parts of the SimpleCTO stack, so to speak.
2. My monitoring is quite limited. Generally I use netdata to get a quick overview on health, and I export that into grafana. I rely on Hetzner for actual healthcheck and uptime monitoring. Sentry tells me if my code starts throwing too many exceptions or errors.
As for backups, SQLite also supports backups while another process is using the database.
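For Python users, here is a minimal sketch of such an online backup using the standard library's `sqlite3.Connection.backup` (Python 3.7+). The file names and the `notes` table are made up for illustration, and the open `writer` connection merely stands in for a running application.

```python
import os
import sqlite3
import tempfile


def backup_database(src_path: str, dest_path: str) -> None:
    """Copy a live SQLite database to dest_path using the online backup API."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # pages=64 copies in small batches so other connections can keep
        # working on the source database between batches.
        src.backup(dest, pages=64)
    finally:
        dest.close()
        src.close()


# Hypothetical demo data: file names and the "notes" table are illustrative.
workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "app.db")
copy_path = os.path.join(workdir, "app-backup.db")

writer = sqlite3.connect(live_path)  # stands in for the running application
writer.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
writer.execute("INSERT INTO notes (body) VALUES ('hello')")
writer.commit()

backup_database(live_path, copy_path)  # the writer connection is still open

check = sqlite3.connect(copy_path)
copied_rows = check.execute("SELECT body FROM notes").fetchall()
check.close()
writer.close()
```

The same operation is available from the `sqlite3` command-line shell via its `.backup` command, which may be simpler to run from cron.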
Of course, this only really works when you're not worried about multiple workers for the same "queue", etc.
Still, simplicity is key -- don't build for scale just because you think you might be there in X years. I like your style!
And implemented in Django:
I like the simplicity!
I used to combine RabbitMQ and Celery for async tasks. Mostly because it's what I learned to do since it was already in use at my first job. But Celery is such a pain -- or, at least, it was at that job. So many configuration options, different places where they were stored in different versions. Weird errors. Poor documentation (at least at the time?)... I just started going for something simpler: rq and rq-scheduler with redis. For most of my use cases it's more than enough.
I've got to say that your approach has gotten me thinking of maybe simplifying everything even more. We'll see where I end up. In about 4 weeks I'll have to introduce asynchronous tasks in our current project, and though I was thinking of going the rq-way, your article has given me food for thought.
Other than that, your backend stack is mostly like what we use for our projects. We also use plain old docker + docker-compose, with the small difference that we have a somewhat hacked-together system I built with bash several years ago to extend docker-compose's functionality a bit and make every component somewhat more reusable between projects and easier to fine-tune on a "per-environment" (development, staging, production, etc) basis. We also use nginx, but your article has convinced me to look into alternatives.
Once again, thank you for your articles, they're a joy to read and think about!
 To be fair, that job had several aging Django codebases and I know most of them are still stuck with Python 2 and outdated Celery and django-channels versions. I constantly kept pushing for us to get rid of technical debt, but we never got to it...it's part of the reason why me and a mate left it for our own endeavors together.
 - https://github.com/simplecto/makefile-taskrunner-template
https://pypi.org/project/django-viewflow/ is from the same developer; it is a workflow/rules engine built on top of FSM.
It’s fairly new, all feedback greatly appreciated.
The thing I like from spark for B2B products is the Team type roles & team billing
(looks like spark is getting an overhaul)
I like to try new JS technologies and those alone sometimes need different node versions - I would hate to spend time on these issues.
My deployment is 2 lines:
Did not have to touch it for years
However it aligns well with my own sentiments - for anything below large scale complexity or scalability needs, docker-compose hits a beautiful simplicity vs power tradeoff.
"curl -s https://get.docker.com | sudo bash"
The daemon itself is just plain docker.
It's also significantly easier to dockerize a generated binary from golang or rust, etc. Some of my worst docker headaches have been from trying to debug something like mayan-edms, which uses python and celery workers.
I just don't get it unless you specifically need some libraries that no other language has.
Many have imagined and built valuable stuff in interpreted, dynamically typed languages before you, by being more focused on overall structure and making sure it's strict and resilient. One really doesn't have to search far for successful applications that are certainly way beyond 1000 LoC and still iterate pretty quickly for their size.
> significantly easier to dockerize a generated binary from golang or rust
Not sure the verboseness of either golang or rust would be worth it if you compare it with python. If you're just launching, your focus should not be on performance or how easy it is to dockerize but to figure out who your user really is. Scaling and deployment issues will happen much further down the road, and iteration speed is more important in the beginning as you have many changes. Your architecture needs to reflect this too, and dynamically typed languages (arguably) makes it easier to change things, as long as you know what you're doing.
But together with that, you are eventually gonna have to start leveraging more languages, as you have programmers working on different levels of the overall architecture. Some fit for some tasks, and when it comes to quickly launching and iterating on SaaS businesses, dynamically typed languages are a pretty good fit.
In the end, I don't really think dynamic/static typing is the most important consideration, that's such a small part of what you have to think about overall.
Compilers catch a lot of bugs before your software is in production.
In the meantime Django provides productivity like you can’t get in many other places. It’s just a pleasure to work with and, aside from a couple of setting-related places, its code is a pleasure to read.
If not, this is the first time I've heard someone say it's a pleasure to read/write.
I've done lots of Python over the years (Zope3, Django, flask, aiohttp, custom stuff...), and if there is one thing I wouldn't like to come back to, it's Django.
Sure, it's a do-everything toolset, but the syntax is an abomination (imho, at least, though I hope that doesn't need to be stressed every time).
To contrast ORM syntax, look at old SQLObject, Canonical's Storm or both SQLAlchemy's declarative and core implementations.
About the only place where I dislike Django’s ORM is stuff around aggregation and annotation. The fact that the order in which you declare your annotations matters is annoying af and while I understand the need for it, I do wish there was a better way. Having said that, I have simply started structuring my models in a way that doesn’t require complex aggregation and that made my life a lot easier. If I have an instance of a Book model I know what fields to expect it to have and don’t worry about whether this particular time it has some specific annotation attached to it or not. That has made the rest of the code a lot cleaner.
Basically, I think to each their own, but I don’t see the Django ORM as a negative. Give it a try and see how it works for you especially if you can avoid going against the grain with it.
I once tried to get into a large Python project, and even the IDE (PyCharm) had trouble "guessing" the function parameter types. It's absolutely not scalable. I don't want to be reading a function and guessing "what the hell is the type of this?".
def myfunc(foo: Dict[str, MyObject], bar: int, baz: str) -> List[MyObject]:
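A runnable version of that signature, as a rough sketch: only the signature comes from the thread, while the `MyObject` fields and the function body are invented here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class MyObject:
    # Hypothetical stand-in for the MyObject named in the signature above.
    name: str
    score: int


def myfunc(foo: Dict[str, MyObject], bar: int, baz: str) -> List[MyObject]:
    """Return objects scoring at least `bar` whose key starts with `baz`.

    The body is invented for illustration; only the signature comes from
    the thread.
    """
    return [obj for key, obj in foo.items()
            if key.startswith(baz) and obj.score >= bar]


objects = {
    "a-one": MyObject("one", 5),
    "a-two": MyObject("two", 1),
    "b-three": MyObject("three", 9),
}
result = myfunc(objects, bar=3, baz="a-")
```

With the annotations in place, a static checker such as mypy should flag a call like `myfunc(objects, "3", "a-")` before the code runs, and an IDE no longer has to guess the parameter types.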
Apache airflow? Doesn't use it. Jupyterhub? Doesn't use it.
I was happy to see projects like Zulip using this, but if it's optional then I can't rely on people actually using it. It's the same thing with ruby/sorbet.
It's an objective fact that Python 3 was not adopted quickly (given various EOL deadline extensions) and that plenty of libraries are stuck on 2, so your point that types have been around for 10 years is not really an argument that I can actually expect to be able to use types when working within the ecosystem.
I don't know why you think dismissing legacy code is valid! More code is old than new in the world, and the idea that you're only going to run into codebases with the most up to date versions of software seems pretty naive to me.
Serverless runtimes have historically been behind on Python versions, Airflow had a hell of a time with dependency issues and Python 2, etc etc.
I also don't really care whose fault it is - the question isn't "is Python morally wrong", the question is "can I rely on the technology that you offer as the solution to a problem as being actually used by the community and therefore actually a viable solution to the problem." Python type annotations are not popular.
Because everyone everywhere has legacy code. In that sense, Python isn't split between versions any more than any other language, making that statement moot.
If you want to use type annotations, you can, for years now. If you want to use macros to generate macros in Rust, you can, it's not Rust's fault if you're stuck on 1.20 and don't want to upgrade.
> If you want to use type annotations, you can, for years now.
Yes! And it's a cool feature. Can I expect most libraries to provide types so that I know I'm calling them correctly? Is it going to be an uphill battle for my team to use them, because the community at large doesn't find them necessary? The default for a majority of Python codebases is to not use types, and that's totally fine, but that also means that rather than type annotations solving the super-grand-OP's concern, they're going to potentially add friction.
It's likely, and that's FINE, that you're going to be in dynamic-land when you're working in Python, due to the preferences of the community, and it's cool that you can use annotations if you yourself want to get some nice signature checking.
1. In my own code, they are super helpful
2. I still get really frustrated when I'm calling other libraries (I had this experience when the various typed JSes were fighting it out as well) - in fact for a lot of web-style work that I do MOST of my code is calling a library, which is where I'd get most of the value.
And yet, somehow YouTube muddled along through and made it work.
It's some weird combination of solving an actual problem decently enough (iow, it does need to work most of the time :)), good timing and some arbitrary pick up by the masses.
Like Google succeeded because they provided a no-frills search that had no ads, and geeks like us started pushing it down the throats of our non-geeky friends as The One True Search.
I ain't saying that Python leads to bad code (because you can have bad code in any language, just like you can have good untyped Python), but that there's this talk of scalability like it's some magic thing that's impossible if all the stars are not aligned.
I agree here but I should note that my main focus is spending less time debugging errors at runtime and in production, and avoiding errors in the first place. This is primarily why I use rust.
"Not sure the verboseness of either golang or rust"
Verbosity has nothing at all to do with implementing good, robust code. You could do it in python, rust, or another language. With certain languages you don't have to spend time writing guard code because it's handled natively by the compiler, type system, or both.
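To make "guard code" concrete, here is a hedged Python sketch (both function names are hypothetical). The first version checks its inputs by hand at runtime; the second leans on annotations that a static checker such as mypy can verify before the code runs, which is roughly the work a compiler does for free in Rust or Go.

```python
from typing import List


def average_guarded(values):
    # Hand-written guard code: every check here runs on each call.
    if not isinstance(values, list):
        raise TypeError("values must be a list")
    if not values:
        raise ValueError("values must not be empty")
    for v in values:
        if not isinstance(v, (int, float)):
            raise TypeError("values must contain only numbers")
    return sum(values) / len(values)


def average_typed(values: List[float]) -> float:
    # The type checks move into the annotation; a static checker can reject
    # average_typed("abc") before the program runs. The emptiness check is
    # about values, not types, so it stays.
    if not values:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)
```

Note that Python itself does not enforce the annotations at runtime, so the second version is only as safe as the checker run in the editor or CI.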
" dynamically typed languages (arguably) makes it easier to change things,"
They make it easier to change things, not necessarily correctly. With modern IDEs refactoring is not an issue.
"Many have imagined and built valuable stuff in interpreted, dynamically typed languages before you, by being more focused on overall structure and making sure it's strict and resilient. One really doesn't have to search far for successful applications that are certainly way beyond 1000 LoC and still iterate pretty quickly for their size."
I am aware of this, but I can't help but wonder if they would be able to iterate much more quickly knowing that the compiler and type system eliminate entire classes of bugs. Sure, you might be able to get to running code a little bit faster, but if you have to spend time guarding it and debugging runtime errors... is it really faster?
There seems to be this mindset that being a fledgling startup means no time for silly things like types or compilation, etc., only iteration. If one actually does their market research and figures out who their users are and what they want, one should be able to iterate just as quickly in, e.g., golang or another compiled language with a healthy ecosystem.
I have not worked at a FAANG company or a company even remotely the size of those companies, but I have created and maintained somewhat large ruby on rails codebases. Debugging runtime errors became my bane, especially when I could not guarantee that the persons writing the ruby code were following modern practices (or any practices at all). Not that I think compilers are perfect, but they -do- catch so many errors, some of which may have made it into runtime.
That's the key thing. You don't know until you put people in front of what you're building whether you're actually solving the problem, or even whether the problem you're aiming for is the right one. Hence you want a non-verbose and simple program that is easy to adapt to the changes you want to make. Verbose languages that force you to be very strict don't lend themselves to change as much as non-strict ones. You're right that it's more error-prone, as you don't have the same guarantees. But when you're first launching something new to the world, you need to focus more on what needs to change than on how correct something is.
Everything looks great at the beginning. They have features, customers and a growing codebase full of technical debt. Debt that I think would be greatly reduced if the code was compiled.
Maybe it is the price you need to pay to be successful but I doubt it.
I don't see dockerizing as a problem. Once copied out from my template I never think about it. I do admit that familiarity with the underlying image OS (Debian in this case) makes debugging / fussing about a non-issue.
This is one of the benefits of golang. You compile it into a binary and copy just the binary into your scratch container. Maybe it is 20MB in size.
apt-get update && apt-get upgrade
Secondly, all this cargo-culting around small containers is fine if you are cramming resources. I am not.
The reality is that my projects and many others are really, really over-provisioned. Looking at my graphs right now, I am on the front page of HN and my server load is still only 20%.
You can do this exactly the same in alpine, with "apk update" and "apk add".
"Secondly, all this cargo-culting around small containers is fine if you are cramming resources. I am not."
The default docker python image is 885MB. python:buster-slim is 114MB, which is much more reasonable. Even if you're not "cramming resources", pushing 885MB vs 114MB vs 20MB across the wire does add up over time.
In my case, we do quickly iterate and pull docker images so it does make a difference for my colleagues if the image is 30MB vs 500MB. Even if we're all on gigabit internet.
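As a back-of-the-envelope illustration of how that adds up, using the image sizes quoted in this thread. The count of 50 pulls is a hypothetical number, and the calculation is a worst case that ignores Docker's layer caching and compression, both of which reduce real transfers.

```python
# Image sizes in MB as quoted in the thread.
IMAGE_SIZES_MB = {
    "python (default)": 885,
    "python:buster-slim": 114,
    "golang binary on scratch": 20,
}


def total_transfer_gb(image_mb: int, pulls: int) -> float:
    """Worst-case data moved, ignoring layer caching and compression."""
    return image_mb * pulls / 1000


PULLS = 50  # hypothetical number of full image pulls
totals = {name: total_transfer_gb(size, PULLS)
          for name, size in IMAGE_SIZES_MB.items()}

for name, gigabytes in totals.items():
    print(f"{name}: {gigabytes} GB over {PULLS} pulls")
```

Under those assumptions the default image moves 44.25 GB, the slim image 5.7 GB, and the 20MB binary 1 GB.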
"Cargo culting" implies doing it without knowing why we're doing it. The benefits of reducing the size of docker images are apparent to everyone, although there are diminishing returns. With dive it is very easy to figure out where the waste is.
"The reality is that my projects and many others are really, really over-provisioned. Looking at my graphs right now, I am on the front page of HN and my server load is still only 20%. "
Optimizing docker images has nothing to do with being over-provisioned.
I will concede that it's very different as a solo developer vs even a small team.
Push new versions of one of those to 50 machines, versus the other.
You don't have to watch security lists for vulnerabilities.
You don't have to scan your docker containers because the dockerfile is 4 lines long.
You don't have to worry about a coworker adding a bad tool to your production container.
That frees up cognitive load to write your code.
Note: I am an infrastructure engineer for a small SaaS that builds and runs production.
> You don't have to watch security lists for vulnerabilities.
These two statements are incompatible. You have fewer things to watch but you’re definitely still going to track your dependencies. Static linking still means you have to do that, and nobody else can do it for you.
I am more concerned about pulling in Linux binaries that are full of vulnerabilities.
Alpine is subtly different for python builds:
* It uses musl libc, not glibc. It's different, not necessarily better or worse.
* You can't use manylinux1 wheels with alpine, so if you've got python/c extensions, you're going to be building them instead of installing upstream binaries. So cue the need for a dev toolchain, and a much longer install time.
And once you have all that, you're running a different version in dev/prod, which is one of those things that docker is supposed to be good at fixing.
The other experience point which a lot of people forget the first few times is maintenance: if using Go means that you have notably more code to write and integration to support, you’re shipping fewer features and have less time to optimize the architecture. That Go code I mentioned earlier was much larger and required time to identify, integrate, and debug third-party libraries for things which are in the Python standard library.
This isn’t to say that Go is a bad choice but again a cue to make sure you’re solving the problems you actually have rather than someone else’s situation. If you’re still exploring the business case you should hesitate before copying a decision made by someone with a well-understood problem, large scale, and more people working on it.
In this thread’s context, I would highly recommend focusing on keeping the architecture easy to support and replace components when the business expands to the point where you really need to hammer specific optimizations. This is especially true in the 2020s where a large fraction of problems are either never exceeding single server scale or can be handled by trivially autoscaling containers on-demand for less than it’d cost in developer time to optimize them beyond the level you can hit in a language like Python or Node.
Again, that doesn’t mean there’s anything wrong with using Go – it’s a perfectly fine choice and has some great libraries — but it usually won’t be transformative the way some advocates claim. When you hear about huge benefits from rewrites look for how often they mention rearchitecting based on what they’d learned from the first system.
Once you internalize that, your priorities change from all the tech hype (k8s, react, whatever) to actually working on the business problems, not solving technical ones.
That would include Ruby, Python, PHP, and JS. GitHub, Shopify, Wordpress, Instagram, Pinterest, Reddit, significant parts of Netflix (Node), Facebook, just off the top of my head.
And you can’t imagine writing > 1000 LOC of a startup SaaS product in any of them? Really?
If you had asked me in 2010, 2015, or even 2018 I would've had different answers most likely.
"And you can’t imagine writing > 1000 LOC of a startup SaaS product in any of them? Really? "
After working on large rails codebases I get nauseous even thinking about having to safely maintain that level of code in dynamically typed languages.
I can’t speak to Rails but this is easy in Python - especially because the addition of typing years back now means that a few annotations will cover most of the common mistakes before leaving your editor. In the Django community there’s a healthy distrust for the heavy levels of magic which are harder to test, which helps a lot.
Serializers and hygienic procedures seem to take care of whatever might have been or maybe I'm just really used to the setup.
Fully-featured frameworks such as Django, Rails, etc give you lots of functionality out of the box which is very valuable especially at the early stages of the project where performance isn't a concern yet.
The standard library makes it very easy to build web services and there’s a couple of popular packages for making routes and dealing with HTTP requests/responses simple.
There’s projects like Micro that try to do a little more batteries included but I wouldn’t consider them to be pervasive in the way Django or Rails are.
Beyond that, it’s just app logic and the typical Golang boilerplate.
Source: I’ve built a few big Golang services for a couple companies.
Which is, IMHO, the biggest thing holding Go back. The standard library does a lot but a lot is also missing. Authentication, db migration, admin panels, ORM, Asset pipeline and probably more. It's a lot of custom work to do or pulling in various libraries.
For example, once an application becomes sufficiently complex, I've found ORMs to be more of a hindrance. The abstraction gets leaky and I end up having to tune custom queries to get around some pathological edge case.
That said - most of my experience is with Rails and a lot of the implicit nature of it is problematic. On the other hand, I've not had this experience with Laravel.
Which one is right? I dunno. Lately though, writing sizable web applications in Golang is working out well so I guess I'll keep doing that.
I’ll disagree and agree on the ORM front.
A good ORM has a clean escape hatch for tuning hot spots and otherwise saves you time with boilerplate queries.
Add a new field? Update the class / structs, generate a migration and move on. Without the ORM you write the migration yourself, you have to find and update all the queries and if you’re centralizing all your query building then you’re just using your own ORM-light without the battle hardened trials of other users using something open source and popular.
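For comparison, a minimal sketch of the hand-rolled route in Python with the standard `sqlite3` module, tracking the schema version in SQLite's `PRAGMA user_version`. The `books` table and the `MIGRATIONS` list are hypothetical, and this is one possible approach rather than a recommended design.

```python
import sqlite3

# Hypothetical hand-written migrations, applied in order.
MIGRATIONS = [
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL)",
    "ALTER TABLE books ADD COLUMN author TEXT",  # the "new field"
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply pending migrations; return the resulting schema version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    pending = enumerate(MIGRATIONS[current:], start=current + 1)
    for version, statement in pending:
        conn.execute(statement)
        # PRAGMA does not accept bound parameters; version is an int we control.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
schema_version = migrate(conn)

# Every query touching the table has to be updated by hand as well.
conn.execute("INSERT INTO books (title, author) VALUES (?, ?)",
             ("Dune", "Frank Herbert"))
columns = [row[1] for row in conn.execute("PRAGMA table_info(books)")]
```

Even this small version has to handle ordering and idempotency by itself, which is the bookkeeping that tools like Django migrations or Alembic already provide.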
I insisted we use SQLAlchemy and Alembic in our latest Go project, and I’m eyeballing Buffalo’s Soda and Fizz as a replacement.
It's generally just nerdcore navel gazing to insist on raw SQL across a project, in my opinion. Especially for web apps / SaaS where you churn tables a lot. Simple migrations from simple models is what I love about Django and Flask/FastAPI with SQLAlchemy.
Perhaps, but most applications don't start off complex. Most of them start off pedal to the metal develop as fast as you can.
It's also a gentler introduction to Go. I've wanted to use it on past side projects but the upfront cost of learning the language + figuring out how to get everything else was too steep of a curve.
An alternative explanation for why there is no Django for Go is that Go is not expressive enough to support such levels of abstraction.
There's room for a tool that will scaffold a Go backend from an Htmx-ified web page IMHO.
Gin-gonic and a few others try to do that job (and provide the flexibility to plug the missing parts), but a lot of people say the stdlib is sufficient. Depending on the case, either can work great.
I find the "ease" is overstated compared to time spent debugging issues in production, but that is a separate discussion.
See my comment in the sister thread about the django/rails equivalent.
The biggest problem for Django and Python in general to me is more in the line of the scalability (performance) question. You are simply forced to put one or even two layers in front of it with a multi-process architecture just to get it to a baseline deployable state (in this case, traefik + gunicorn). But the good news is there are so many people doing that it really isn't a problem. And it turns out you probably want that architecture anyway (probably best if your application server isn't also doing your TLS etc).
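As an illustration of that multi-process layer, here is a minimal `gunicorn.conf.py` sketch. Gunicorn configuration files are plain Python. The port, the worker count (the rule of thumb from the Gunicorn documentation), and the assumption that the proxy runs on the same host are all illustrative, not the article author's actual settings.

```python
# gunicorn.conf.py -- hypothetical values, adjust for your own deployment.
import multiprocessing

# Listen only on localhost; the reverse proxy (e.g. traefik) terminates TLS
# and forwards plain HTTP to this address. Inside a container you would
# typically bind to 0.0.0.0 instead.
bind = "127.0.0.1:8000"

# Common rule of thumb: (2 x CPU cores) + 1 worker processes.
workers = multiprocessing.cpu_count() * 2 + 1

# Kill and restart a worker that is silent for this many seconds.
timeout = 30

# Trust X-Forwarded-* headers only from a proxy on this host (assumption).
forwarded_allow_ips = "127.0.0.1"
```

It would be started with something like `gunicorn -c gunicorn.conf.py myproject.wsgi`, where `myproject` is a placeholder for the Django project name.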
I'm saying this in complete self contradiction: my current stack is Next.js with Go/gRPC Web/sqlboiler
It's also very unlikely you are going to write >1000 LoC (or honestly even >100 LoC) for any view function/API endpoint.
No thanks, I'll stick with python, for me it is the perfect balance between expressiveness and being easy to read.
Also Django itself and the python ecosystem is hard to beat. It gets out of my way so that I can focus on the business logic, instead of reinventing wheels.
Airbnb, Stripe, WhatsApp and Discord were able to win in gigantic markets despite fierce competition.
Does Go (genuinely) have something to match?
Suggesting Go, given where it is right now, is betraying a lack of experience with at least one critical part of the software development lifecycle.
To the extent of Django's or Rails' maturity and features, likely not, although I don't know what features you use Django for. I guess this answers my question to an extent.
Last time I worked with golang I used gin and xorm, but there were some things I had to manually do. That being said, I find rails to be an enormous beast that provides a lot of things that one may or may not need.
In my experience, working with golang and rust, which are compiled and statically/strongly typed, has saved me headaches and time even with having to hand-roll some features.