Benchmarking Debian vs. Alpine as a base Docker image (2018) (nickjanetakis.com)
70 points by aloukissas on Feb 6, 2020 | 21 comments



If you're actually sensitive to performance, you surely would never use upstream Debian binary packages. At least for the x86_64 distro, they are built to run on Opteron, and just rebuilding them for a less-old CPU like Intel Sandy Bridge, or for whatever CPU you intend to use, can make a huge difference. I get a full 100% improvement in sysbench oltp just from rebuilding the libc and database server with -march=skylake.


Benchmarks, please. I also do not think that they are actually tuned specifically for Opteron.


Why would you care what my benchmarks look like? If you are performance-sensitive, run yours. If you don’t have any, you’re not sensitive to performance.


I think the only one being sensitive here is you - the other commenter is just looking for some way to quantify the huge performance gain that you claim, as it seems unlikely (although not impossible) that just recompiling the packages would provide such a major performance increase.


Using the oltp workload that comes packaged with sysbench, the median latency falls from 50us to 25us on a Skylake Xeon with NVMe storage after recompiling MySQL and libc with -march=skylake. Give it a whirl.
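
For anyone trying to square "100% improvement" with "50us to 25us": a rough sketch of the arithmetic, assuming a single serial client whose throughput is simply the inverse of the median latency. That is an assumption made for illustration, not how sysbench reports its numbers, and the function names are made up here.

```python
def throughput_gain(old_latency_us, new_latency_us):
    """Relative throughput gain (as a fraction) for one serial client."""
    if old_latency_us <= 0 or new_latency_us <= 0:
        raise ValueError("latencies must be positive")
    # A serial client completes 1/latency requests per unit of time, so the
    # throughput ratio is old/new; subtracting 1 gives the relative gain.
    return old_latency_us / new_latency_us - 1.0


def latency_reduction(old_latency_us, new_latency_us):
    """Fraction by which the latency itself dropped."""
    if old_latency_us <= 0 or new_latency_us <= 0:
        raise ValueError("latencies must be positive")
    return 1.0 - new_latency_us / old_latency_us


if __name__ == "__main__":
    # Figures quoted in the comment above: 50us before, 25us after.
    print(f"throughput gain:   {throughput_gain(50, 25):.0%}")
    print(f"latency reduction: {latency_reduction(50, 25):.0%}")
```

Under that assumption, halving the latency is a 50% latency reduction and a 100% throughput gain, so the two figures in this subthread describe the same measurement.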


Maybe I'm performance sensitive, haven't noticed anything like the claimed improvement, and am wondering what I did wrong and how I can fix things?


Seriously? You are claiming a 100% improvement from a recompile when the difference between gcc -O0 and -O2 isn't that, and wondering why people are asking to see the benchmarks?


Could one use -mtune to optimize for newer hardware while making sure the binary still runs on older hardware?


Maybe, but unless you actually own the old hardware, why would you? By the way, -mtune is much less effective than -march. It might change the way a few instructions are scheduled, but -march will use the actual instructions and registers on your machine.
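
The flip side of -march is that the binary may use instructions an older CPU does not have. A minimal sketch of a pre-flight check on Linux, assuming /proc/cpuinfo exposes an x86-style "flags" line; the `SKYLAKE_SUBSET` set is only an illustrative subset of what -march=skylake enables, not the full list, and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed, illustrative subset of the extensions that -march=skylake lets the
# compiler use, spelled the way Linux lists them in /proc/cpuinfo.
SKYLAKE_SUBSET = {"avx2", "bmi1", "bmi2", "fma"}


def cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    # No 'flags' line (for example on non-x86 kernels): report nothing.
    return set()


def missing_flags(cpuinfo_text, required=SKYLAKE_SUBSET):
    """Return the required flags that the CPU does not advertise."""
    return set(required) - cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        missing = missing_flags(path.read_text())
        print("missing:", sorted(missing) or "none")
    else:
        print("no /proc/cpuinfo on this system")
```

If anything shows up as missing, a build targeting that -march value would likely crash with an illegal-instruction error on the host, which is the case -mtune avoids.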


If you read the comments, it looks like the meaningful difference between the two distros is (was?) the libc implementation. Alpine uses the newer, smaller musl, while Debian used the battle-hardened glibc.


If you're interested in Docker best practices for Python specifically, I highly recommend the following site: https://pythonspeed.com/docker/

I listened to the author on The Python Podcast and learned lots from the discussion, in particular the differences between Alpine and Debian for Python images.


We used to use Alpine because it was smaller (ferrying images in and out of AWS adds up) and the package management UX often felt better (the community and testing repos tend to make many things easily available, which led to simpler and easier-to-follow Dockerfiles).

Eventually we switched to Debian-based images because of intermittent DNS-related woes that plagued our Alpine containers. We struggled to isolate the cause, but it seemed to be a combination of Node + Consul DNS. Suddenly those saved gigabytes and smaller Dockerfiles didn’t seem worth the hassle; after jumping to Debian we never saw that issue, nor anything similar.
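
For anyone chasing similar intermittent failures, a rough sketch of a probe that resolves one name repeatedly inside the container and counts failures. This is not the parent's setup: `probe_dns` and its parameters are hypothetical, and it goes through Python's `socket.getaddrinfo` rather than Node's resolver, so it only exercises the container's libc lookup path.

```python
import socket
import time


def probe_dns(host, attempts=20, resolver=socket.getaddrinfo, pause_s=0.0):
    """Resolve `host` repeatedly; return (failure_count, latencies_in_seconds)."""
    failures = 0
    latencies = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            resolver(host, None)
        except OSError:
            # socket.gaierror is a subclass of OSError, so lookup failures
            # land here.
            failures += 1
        else:
            latencies.append(time.perf_counter() - start)
        if pause_s:
            time.sleep(pause_s)
    return failures, latencies


if __name__ == "__main__":
    # "localhost" is only a placeholder; substitute the service name that
    # misbehaves in your environment.
    failed, timings = probe_dns("localhost", attempts=5)
    print(f"{failed} failures, {len(timings)} successful lookups")
```

Running the same probe in an Alpine-based and a Debian-based container against the same name gives a crude way to see whether the failures follow the base image.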


The only things I regularly use Docker for require wacky cmake configuration and heavy reliance on specific compiler toolchains and GPU stuff, not to mention scientific-stack Python wheels that I don't want to rebuild for musl.

I don't see a switch to Alpine for my normal use case happening any time soon, but I'm glad it's available as a lightweight option for the times when my normal workflow isn't a requirement.


Interesting comments about the same topic recently:

https://news.ycombinator.com/item?id=22182226


Python image build times were actually the main driver for us switching away from Alpine. For example, pip install for numpy on Alpine requires a full build of the C code (which takes a very long time), versus installing from a binary wheel on slim. We saw a 10x speedup in our CI/CD pipeline.
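
The binary-vs-source difference comes down to which libc the interpreter is linked against, since manylinux wheels are built for glibc. A small sketch for checking that from inside an image, using the standard library's `platform.libc_ver()`; the `libc_family` helper is hypothetical, and treating an empty result as "not glibc" is an assumption, because the call can also come back empty when detection simply fails.

```python
import platform


def libc_family():
    """Describe the libc the running interpreter appears to be linked against."""
    name, version = platform.libc_ver()
    if name == "glibc":
        return "glibc " + version
    # Assumption: musl-based images and non-Linux systems show up as an
    # empty or non-glibc name here.
    return "not glibc (or undetected)"


if __name__ == "__main__":
    print(libc_family())
```

If this does not report glibc, expect pip to fall back to building packages like numpy from source unless a wheel for that platform exists.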


That was mostly about image build time (and it's pretty Python-specific); this article is more about runtime performance.


Actually, I have a few notes on image performance in there:

https://news.ycombinator.com/item?id=22185760

(I maintain both Ubuntu and Alpine images and never noticed significant performance differences)


Why Python 2.7? He's dead, Jim.


Article is from 2018.


The article even states it's only 1 select statement.


Though honestly a web endpoint performing 1 select statement per request is a lot closer to reality than 10,000.

Does make it a little inconclusive though.



