It’s not common to need that many pulls, nor is it hard to build your own images.
If you’re deploying to a cluster with 200 machines, though, you could easily hit this if you use the public registry. If you’re managing a cluster that size you can probably afford the fee, but more importantly, you should probably pull once to a local registry and deploy to your cluster from that anyway.
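As a rough sketch of that pull-once approach (the internal registry hostname and the image are illustrative, not anything specific to the setup described above):

    # Pull from Docker Hub once, then retag and push to the internal registry
    # that the cluster nodes actually pull from.
    docker pull nginx:1.25
    docker tag nginx:1.25 registry.internal:5000/mirror/nginx:1.25
    docker push registry.internal:5000/mirror/nginx:1.25
    # Deployments then reference registry.internal:5000/mirror/nginx:1.25
    # instead of the Docker Hub image.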
Do you run a local registry? Any high-quality articles/youtube talks to share? I'm about to set one up for our own little cluster (~5 machines, ~75 containers). I know tons about docker engine, and a fair bit about the registry, but it's always nice to watch a "lessons learned from actually doing this in production" talk to know what mistakes to avoid.
I run tens of thousands of docker images in production, or rather, tens of thousands of copies of a few hundred images.
If you do something like this, you absolutely MUST have a local registry.
Harbor [1], JFrog [2], and Quay [3] would be the first ones that I look at.
Harbor is open source, free, and a member of the CNCF. You will need to do a little bit of work to set it up to scale properly. JFrog offers a SaaS registry, but you will pay big $$ based on pull traffic. Their commercial site license is about $3k/year. Quay is older than either of them, stable, and high quality. I'd start with Harbor these days.
Just to add: all the major cloud service providers offer registries as well (ACR/ECR/GCR, etc.).
If you run a k8s service with one of them, in my experience it is best to use the corresponding registry.
I have pulled and run a 1GB image 20k times in less than 10-15 minutes without breaking a sweat.
Finally, GitHub Packages offers a registry out of the box. It is great for CI and for devs to access. For production I generally have the tags mirrored from GitHub to ACR.
GitHub Packages works with GitHub CI out of the box, which makes development a lot easier. Like I mentioned, for the best networking in prod you should always use the registry from your k8s provider; mirroring the GitHub registry to ECR/GCR/ACR is fairly straightforward. Bandwidth costs are eliminated, and the network is a lot more reliable intra-DC.
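Roughly what that mirroring step looks like as a sketch (registry names, org, and tag are illustrative; shown against ghcr.io, while older GitHub Packages pulls went through docker.pkg.github.com; 'az acr import' can also copy the image server-side without a local pull):

    # Pull the CI-built image from GitHub's registry, then push a copy into
    # ACR for production pulls.
    docker login ghcr.io -u USERNAME --password-stdin < github_token.txt
    az acr login --name myregistry
    docker pull ghcr.io/myorg/myapp:v1.2.3
    docker tag ghcr.io/myorg/myapp:v1.2.3 myregistry.azurecr.io/myapp:v1.2.3
    docker push myregistry.azurecr.io/myapp:v1.2.3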
FYI, using ECR with Docker Swarm is something we did try. It was hellish. We never nailed down the exact problems, but we spent about a month with 2-3 experienced engineers trying to fix the edge-case issues.
The main issue was that ECR has a slightly different authentication model than Docker Swarm expects. The whole '--with-registry-auth' flow only partially works when you are using ECR. Unfortunately, it works just enough that you think it's working, until all your tokens time out and a worker suddenly can no longer pull an image.
Our common failure case was an image becoming unhealthy or a node being drained. When the container was rescheduled on a different worker, a worker that did not already have the image would try to pull it from the registry, and if the tokens had expired the pull would fail.
The only "fix" we ever found was to set up a cron job that forcibly deployed a new version of a "replicated globally" image every X minutes (where X was based on the ECR token expiration). It kind of worked, but we still had occasional failures we could not identify.
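For reference, that kind of workaround boils down to something like the script below on a cron schedule shorter than the 12-hour ECR token lifetime (region, account ID, and service name are illustrative, not the original setup):

    #!/bin/sh
    # Refresh the ECR login on a manager node, then force a redeploy so
    # workers receive fresh registry credentials along with the service spec.
    aws ecr get-login-password --region us-east-1 |
      docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

    docker service update --force --with-registry-auth mystack_myservice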
I wish it worked better, because it was nice to use ECR. Frankly, token expiration sounds much more secure too, but without direct support for token refresh inside the Docker engine it's just hard to get everything to work.
Looking at doing GitHub Packages for direct-to-dev and mirroring into ECR over here. Seems sound. But also considering other options as ECR is a pain to work with.
That said, word of warning for anyone looking at GitHub Packages for docker registry: it's broken with containerd and some other similar tools. They (GitHub) are currently working on a fix: https://github.com/containerd/containerd/issues/3291
I was a happy user of JFrog's registries via site license at my last 2 places. Seemed to just work as expected. Didn't have visibility into the cost though (other teams set it up) so I had no idea it was $3k/year.
We have not had good luck with Quay. They are not stable, especially as of late. There was a period last month where for two weeks pulling images was a crapshoot.
If you want to run a local registry to stay below the 100 pulls per 6 hours limit please consider GitLab. The Dependency Proxy https://docs.gitlab.com/ee/user/packages/dependency_proxy/ will cache docker images. This way you stay within the limits Docker set and subsequent pulls should be faster as well.
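Usage is just a prefixed pull path once the proxy is enabled for a group (hostname and group name are illustrative; private groups need a docker login with a personal access token first):

    # Pull an upstream Docker Hub image through the GitLab Dependency Proxy;
    # cached layers are served locally on subsequent pulls.
    docker pull gitlab.example.com:443/mygroup/dependency_proxy/containers/alpine:latest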
Personally, I wish generic caching proxies were still a thing, and easier to set up. I've tried setting up Squid several times in the past and failed miserably every single time. All I want is to use it as a gateway (i.e., make the proxy invisible to the application) for e.g. apt packages, so I just ended up using apt-cache or whatever other appropriate software, but I'd far rather use something generic that just works for 90% of the software I use at home, whether it's reading webcomics, repeatedly installing the same software in a dozen VMs with slightly different configurations, or just browsing remote filesystems via WebDAV.
I use nginx to proxy-cache the Arch Linux package repository transparently. It's fairly easy to set up and enables nice features: it can contact a secondary mirror if the first one is down, and when multiple requests hit the same resource they all block waiting on a single merged download, so the proxy will not fetch the same package multiple times if I run pacman -Syu on my 18 machines in parallel. And it's all just 20-30 lines of nginx config.
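A minimal sketch of that kind of config, assuming illustrative mirror hostnames, cache sizes, and server_name (some mirrors may also want an explicit proxy_set_header Host):

    # /etc/nginx/conf.d/pacman-cache.conf -- illustrative sketch
    proxy_cache_path /var/cache/pacman levels=1:2 keys_zone=pacman:10m
                     max_size=50g inactive=30d use_temp_path=off;

    upstream mirrors {
        server mirror1.example.org;          # primary mirror
        server mirror2.example.org backup;   # only used if the primary fails
    }

    server {
        listen 80;
        server_name mirror.lan;              # point /etc/pacman.d/mirrorlist here

        location / {
            proxy_pass http://mirrors;
            proxy_cache pacman;
            proxy_cache_valid 200 30d;       # package files are immutable
            proxy_cache_lock on;             # merge concurrent downloads of one file
            proxy_cache_lock_timeout 15m;
            # try the next mirror on errors or missing files
            proxy_next_upstream error timeout http_404 http_502 http_503;
        }
    }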
Just drop a Sonatype Nexus instance in a Docker container somewhere on your network. Alternatively, use Squid if you don't push to the public Docker registry, although you might need to mess around with an internal CA for SSL...
Storage in containers has been a long-solved issue. The defaults are unfortunate but make sense for ease of use. Your container root should be read-only, ephemeral storage should live in a tmpfs or dynamic volumes depending on performance and size needs, and persistent storage should live in volumes.
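A minimal sketch of that layout with plain docker run flags (image name, paths, and sizes are illustrative):

    # Root filesystem read-only, ephemeral scratch space in a tmpfs,
    # persistent state in a named volume.
    docker run -d \
      --read-only \
      --tmpfs /tmp:rw,size=256m \
      -v appdata:/var/lib/app \
      example/app:latest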
You can set it up in less than 10 minutes, and the only thing required is to add '--insecure-registry' to the Docker daemon config on your clients. It's not an issue if all your machines are on a private network.
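A minimal sketch, assuming the registry host resolves as registry.local (hostname is illustrative, and the daemon.json line should be merged into any existing config rather than overwriting it):

    # On the registry host: run the stock registry image.
    docker run -d -p 5000:5000 --restart=always --name registry registry:2

    # On each client: trust the plain-HTTP registry, then restart the daemon.
    echo '{ "insecure-registries": ["registry.local:5000"] }' | sudo tee /etc/docker/daemon.json
    sudo systemctl restart docker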
Expect to spend 1-2 hours the first time you try it, until you get the correct DNS records, API keys, and configuration set up.
Afterwards it's pretty hands-off: every three months you'll receive an email from Let's Encrypt and you'll have to rerun this script to regenerate your certificates. That takes 2-3 minutes max (but of course you still need to distribute your certificates to all the relevant services...)