
It’s not common to need that many pulls nor is it hard to build your own images.

If you're deploying to a cluster of 200 machines and pulling from the public registry, you could easily hit this. If you're managing a cluster that size you can probably afford the fee, but more importantly, you should be pulling each image once into a local registry and deploying to the cluster from there anyway.
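A minimal sketch of the client side of that, assuming a hypothetical internal mirror at registry-mirror.internal:5000 (the mirror itself would be a pull-through cache of Docker Hub):

    # /etc/docker/daemon.json on every node: route Docker Hub pulls through the local mirror
    # (the mirror hostname is a placeholder)
    cat <<'EOF' | sudo tee /etc/docker/daemon.json
    {
      "registry-mirrors": ["https://registry-mirror.internal:5000"]
    }
    EOF
    sudo systemctl restart docker   # restart the daemon so the mirror setting takes effect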




Do you run a local registry? Any high-quality articles/YouTube talks to share? I'm about to set one up for our own little cluster (~5 machines, ~75 containers). I know tons about the Docker engine, and a fair bit about the registry, but it's always nice to watch a "lessons learned from actually doing this in production" talk to know which mistakes to avoid.


I run tens of thousands of docker images in production, or rather, tens of thousands of copies of a few hundred images.

If you do something like this, you absolutely MUST have a local registry.

Harbor [1], JFrog [2], and Quay [3] would be the first ones that I look at.

Harbor is open source, free, and a member of the CNCF. You will need to do a little bit of work to set it up to scale properly. JFrog offers a SaaS registry, but you will pay big $$ based on pull traffic. Their commercial site license is about $3k/year. Quay is older than either of them, stable, and high quality. I'd start with Harbor these days.

[1] https://goharbor.io/ [2] https://www.jfrog.com/confluence/display/JFROG/JFrog+Artifac... [3] https://quay.io/
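If it helps, a minimal sketch of standing up Harbor with its Helm chart; the hostname and sizes below are placeholders, and a real install also needs TLS and properly sized persistent storage:

    # add the official Harbor chart and install it into its own namespace
    # (harbor.example.internal and the 200Gi size are placeholders)
    helm repo add harbor https://helm.goharbor.io
    helm repo update
    helm install harbor harbor/harbor \
      --namespace harbor --create-namespace \
      --set expose.ingress.hosts.core=harbor.example.internal \
      --set externalURL=https://harbor.example.internal \
      --set persistence.persistentVolumeClaim.registry.size=200Gi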


Just to add: all the major cloud providers offer registries (ACR, ECR, GCR, etc.). If you run a managed k8s service with one of them, in my experience it is best to use the corresponding registry.

I have pulled and run a 1 GB image 20k times in 10-15 minutes without breaking a sweat.

Finally, GitHub Packages offers a registry out of the box. It is great for CI and for devs to access. For production I generally mirror the tags from GitHub to ACR.
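A sketch of that mirroring step with placeholder registry, org, and image names; az acr import copies a tag server-side, so a plain pull/tag/push works too:

    # copy one tag from the GitHub registry into ACR (all names are placeholders;
    # add --username/--password if the source image is private)
    az acr import \
      --name myregistry \
      --source ghcr.io/myorg/myapp:1.2.3 \
      --image myapp:1.2.3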


Github Docker Registry is a mess and should be avoided at all costs.

1) It is broken and unusable on Kubernetes and Docker Swarm.

2) It is flaky, often returning 500-type errors.

3) It is expensive, as the amount of pull bandwidth is very limited.


GitHub Packages works with GitHub CI out of the box, which makes development a lot easier. Like I mentioned, for the best networking in prod you should always use the registry from your k8s provider; mirroring the GitHub registry to ECR/GCR/ACR is fairly straightforward. Bandwidth costs are eliminated and the network is a lot more reliable intra-DC.


> It is broken and unusable on Kubernetes and Docker Swarm.

Hmm, I've been using it on several Kubernetes clusters for the past few months and haven't seen any issues yet.


FYI, using ECR with Docker Swarm is something we did try. It was hellish. We never nailed down the exact problems, but we spent about a month with 2-3 experienced engineers trying to fix the edge-case issues.

The main issue was that ECR has a slightly different authentication model than Docker Swarm expects. The whole '--with-registry-auth' mechanism only partially works when you are using ECR. Unfortunately, it works just enough that you think it's working, until all your tokens time out and a worker suddenly can no longer pull an image.

Our common failure case was a task becoming unhealthy or a node being drained. When the task was restarted on a different worker, if that worker did not have the image it would try to get it from the registry, and if the tokens had expired the pull would fail.

The only "fix" we ever found was to set up a cron job that forcibly deployed a new version of a "replicated globally" service every X minutes (where X was based on the ECR token expiration). It kind of worked, but we still had occasional failures we could not identify.

I wish it worked better, because ECR was otherwise nice to use. Frankly, token expiration sounds much more secure too, but without direct support for token refresh inside the Docker engine it's just hard to get everything to work.
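For anyone hitting the same wall, a rough sketch of what that kind of refresh job can look like; the account ID, region, and service name are placeholders, and the interval depends on your token lifetime:

    #!/bin/bash
    # run from cron on a manager node, more often than the ECR token expires (~12h)
    # account ID, region, and service name below are placeholders
    set -euo pipefail
    aws ecr get-login-password --region us-east-1 \
      | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
    # force a no-op redeploy so the fresh credentials are sent to the workers
    docker service update --with-registry-auth --force mystack_myservice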


Looking at doing GitHub Packages for direct-to-dev and mirroring into ECR over here. Seems sound. But also considering other options as ECR is a pain to work with.

That said, word of warning for anyone looking at GitHub Packages for docker registry: it's broken with containerd and some other similar tools. They (GitHub) are currently working on a fix: https://github.com/containerd/containerd/issues/3291


I got set up with ECR without any difficulties whatsoever. You do have to authenticate before pulls and pushes, but that can be scripted very easily.


I was a happy user of JFrog's registries via site license at my last 2 places. Seemed to just work as expected. Didn't have visibility into the cost though (other teams set it up) so I had no idea it was $3k/year.


We have not had good luck with Quay. They are not stable, especially as of late. There was a period last month where for two weeks pulling images was a crapshoot.


Thank you very much. This is exactly the type of info I needed.


If you want to run a local registry to stay below the 100-pulls-per-6-hours limit, please consider GitLab. The Dependency Proxy (https://docs.gitlab.com/ee/user/packages/dependency_proxy/) will cache Docker images. This way you stay within the limits Docker set, and subsequent pulls should be faster as well.
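Roughly what pulling through it looks like, with a placeholder GitLab host and group; the proxy sits in front of Docker Hub, so the path after containers/ is the usual Hub image name:

    # authenticate against your GitLab instance, then pull Hub images via the group-level proxy
    # (gitlab.example.com and mygroup are placeholders)
    docker login gitlab.example.com
    docker pull gitlab.example.com/mygroup/dependency_proxy/containers/alpine:3.12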


Personally, I wish generic caching proxies were still a thing, and easier to set up. I've tried setting up Squid several times in the past and failed miserably every single time. All I want is to use it as a gateway (i.e., make the proxy invisible to the application) for things like apt packages, so I just ended up using apt-cache or whatever other appropriate software, but I'd far rather use something generic that just works for 90% of the software I use at home, whether it's reading webcomics, repeatedly installing the same software in a dozen VMs with slightly different configurations, or just browsing remote filesystems via WebDAV.


I use nginx to transparently proxy-cache the Arch Linux package repository. It's fairly easy to set up, and it enables nice features: it contacts a secondary mirror if the first one is down, and concurrent requests for the same resource are blocked behind a single merged download, so the proxy won't fetch the same package multiple times when I run pacman -Syu on my 18 machines in parallel. It's all just 20-30 lines of nginx config.
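In case it's useful, roughly the shape of that config; the mirror hosts, cache path, and server name are placeholders, not my exact setup:

    # write /etc/nginx/conf.d/pacman-cache.conf (mirror hosts and paths are placeholders)
    cat <<'EOF' | sudo tee /etc/nginx/conf.d/pacman-cache.conf
    proxy_cache_path /var/cache/nginx/pacman levels=1:2 keys_zone=pacman:10m
                     max_size=50g inactive=30d use_temp_path=off;

    upstream mirrors {
        server mirror1.example.org;
        server mirror2.example.org backup;   # only used if the primary fails
    }

    server {
        listen 80;
        server_name pkgcache.lan;

        location / {
            proxy_pass http://mirrors;
            proxy_cache pacman;
            proxy_cache_valid 200 30d;
            proxy_cache_lock on;             # merge concurrent requests for the same package
            proxy_next_upstream error timeout http_502 http_503;
        }
    }
    EOF
    sudo nginx -s reload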

It's not transparent though.



I just use ECR [1], which in many cases costs less and is fully locked down behind my AWS VPC.

With ECR you pay for image storage: $0.09 per GB after the first 1 GB which is free

[1] https://aws.amazon.com/ecr/


are you gonna rebuild all the images that you use and push to ECR?


Nope, you don't have to rebuild images to push them to a different registry.

Pull from Docker Hub once, push to ECR, then pull from ECR as much as you wish.
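A minimal sketch with placeholder account, region, and image names:

    # assumes you've already run docker login against your ECR registry
    # (the account ID, region, and nginx:1.19 below are placeholders)
    aws ecr create-repository --repository-name nginx   # one-time, if the repo doesn't exist yet
    docker pull nginx:1.19
    docker tag nginx:1.19 123456789012.dkr.ecr.eu-west-1.amazonaws.com/nginx:1.19
    docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/nginx:1.19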


Just drop a Sonatype Nexus instance in a Docker container somewhere on your network. Alternatively, use Squid if you don't push to the public Docker registry, although you might need to mess around with an internal CA for SSL...
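Something like this is enough to try Nexus out; the Docker connector port is a placeholder you'd configure in the UI, and the named volume keeps the blob store outside the container filesystem:

    # run Nexus 3 with its data on a named volume; 8081 is the UI,
    # 8082 is a placeholder port for a Docker repository connector
    docker volume create nexus-data
    docker run -d --name nexus \
      -p 8081:8081 -p 8082:8082 \
      -v nexus-data:/nexus-data \
      sonatype/nexus3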


Docker's registry can also run as a proxy (the docs call it a "pull through cache"), so you don't have to go as generic as an HTTP proxy.
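A minimal sketch of that with the stock registry image; the host port and storage path are placeholders:

    # run the upstream registry image as a read-through cache of Docker Hub
    # (the published port and host path are placeholders)
    docker run -d --name hub-mirror -p 5000:5000 \
      -v /srv/registry-cache:/var/lib/registry \
      -e REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io \
      registry:2
    # then point each daemon's "registry-mirrors" setting at this host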


I would stay away from Nexus. It has problems with latest tags.


Nexus in a container... because storage in containers is such a good idea? Any VPS with a disk is probably a better idea.


You can still bind mount a directory into a container...


Storage in containers has been a long-solved issue. The defaults are unfortunate, but they make sense for ease of use. Your container root should be read-only, ephemeral storage should live in a tmpfs or dynamic volumes depending on performance and size needs, and persistent storage should live in volumes.
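Roughly what that looks like for a hypothetical service image:

    # read-only root, scratch space on a tmpfs, durable state on a named volume
    # (myorg/app and the paths are placeholders)
    docker run -d --name app \
      --read-only \
      --tmpfs /tmp:rw,size=64m \
      -v app-data:/var/lib/app \
      myorg/app:1.0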


https://docs.docker.com/registry/

You can set it up in less than 10 minutes; the only thing required is to add the registry to 'insecure-registries' in the daemon config (or pass '--insecure-registry' to the daemon) on each machine that pulls from it. That's not an issue if all your machines are on a private network.
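A minimal sketch of that quick setup; the registry hostname and storage path are placeholders:

    # on the registry host: run the stock registry with local disk storage
    docker run -d --name registry -p 5000:5000 \
      -v /srv/registry:/var/lib/registry registry:2

    # on every machine that pulls from it: mark it as insecure in the daemon config
    # (registry.lan is a placeholder hostname)
    cat <<'EOF' | sudo tee /etc/docker/daemon.json
    { "insecure-registries": ["registry.lan:5000"] }
    EOF
    sudo systemctl restart docker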


Isn't there no authentication on that registry? I guess that's fine if you don't believe in zero-trust architecture.


You're right. That's what you can get in minutes.


If you cannot get a TLS cert for internal infrastructure in a few minutes, I'd recommend you start looking into why.


There's no good documentation on it, and it's not very important for me (I run it in a homelab).

I still wonder how to do it in minutes.


I use this (in a docker image) to generate certificates automatically: https://github.com/adferrand/dnsrobocert

Expect to spend 1-2 hours the first time you try it, until you get the DNS records, API keys, and configuration set up correctly.

Afterwards it's pretty hands-off: every three months you'll receive an email from Let's Encrypt and you'll have to rerun it to regenerate your certificates. That takes 2-3 minutes max (but of course you still need to distribute your certificates to all relevant services...).


If you run traefik it's even easier: https://docs.traefik.io/https/acme/
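The relevant part of a Traefik v2 static config is small; the resolver name, email, and entrypoints below are placeholders:

    # write a minimal traefik.yml with a Let's Encrypt certificate resolver
    # (the email address and resolver name are placeholders)
    cat <<'EOF' > traefik.yml
    entryPoints:
      web:
        address: ":80"
      websecure:
        address: ":443"

    certificatesResolvers:
      letsencrypt:
        acme:
          email: admin@example.com
          storage: /letsencrypt/acme.json
          httpChallenge:
            entryPoint: web
    EOF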


If you run on Kubernetes, the image is (or can be) cached at the node level after the first pull.



