Show HN: Accelerated Docker builds on your local machine with Depot (YC W23)
63 points by jacobwg 4 days ago | 29 comments
Hello HN! We just launched a new feature we built at Depot that accelerates Docker image builds on your local machine in a team environment, and we wanted to share some of the details with you all.

The launch blog post: https://depot.dev/blog/local-builds

Depot is a hosted container build service - we run fully managed Intel and Arm remote build machines in AWS, with large instance sizes and SSD cache disks. The machines run BuildKit, the build engine that powers Docker, so generally anything you can `docker build`, you can also `depot build`.

Most people use Depot in CI, but you could already run `depot build` from your local machine too. That would perform the build using the remote builder, with its fast hardware and extra fast datacenter network speeds.

But then to download the container back to your local machine, BuildKit would transfer the entire container back for every build, including base image layers, since BuildKit wasn’t aware of what layers already existed on your device.

The new release fixes this! To make it work, we replaced the BuildKit `--load` by making the Depot CLI itself serve the Docker registry API on a local port, then asking Docker to pull the image from that localhost registry. The CLI in turn intercepts the requests for layers and fetches them directly using BuildKit’s content API.

This means Docker only asks for the layers it needs! This speeds up both local builds, where you only need to download changed layers, and CI, where it can skip building an expensive tarball of the whole image every time!
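As a rough illustration of why pulling through a registry API avoids re-downloading base layers, here is a minimal Python sketch. It is not Depot's code: `layers_to_pull`, the flat list-of-digests manifest shape, and the shortened digests are all hypothetical. It only models the decision a pull makes, which is to skip any layer whose digest is already present locally.

```python
from typing import Iterable, List, Set


def layers_to_pull(manifest_layers: Iterable[str], local_layers: Set[str]) -> List[str]:
    """Return the manifest digests that are missing locally, in manifest order."""
    return [digest for digest in manifest_layers if digest not in local_layers]


# Hypothetical digests, shortened for readability.
manifest = ["sha256:base", "sha256:deps", "sha256:app-v2"]
already_present = {"sha256:base", "sha256:deps", "sha256:app-v1"}

# Only the layer that changed since the last build needs to be fetched.
print(layers_to_pull(manifest, already_present))
```

With a tarball-style load there is no such negotiation, so every layer travels on every build.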

We ran into one major obstacle when first testing: the machine running the Docker daemon might not be the same machine running the `depot build` command. Notably, CircleCI has a remote Docker daemon, where asking it to pull from localhost does not reach the CLI’s temporary registry.

For this, we built a "helper" container that the CLI launches to run the HTTP server portion of the temporary registry - since it’s launched as a container, it does run on the same machine as the Docker daemon, and localhost is reachable. The Depot CLI then communicates with the helper container over stdio, receiving requests for layers and sending their contents back using a custom simple transport protocol.
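The post does not describe the transport protocol itself, so the following is only one plausible shape for a simple protocol over stdio: length-prefixed frames. The function names, the 4-byte big-endian header, and the `GET ...` request text are assumptions made for this sketch, not Depot's wire format.

```python
import io
import struct
from typing import BinaryIO, Optional


def write_frame(stream: BinaryIO, payload: bytes) -> None:
    """Write one frame: a 4-byte big-endian length followed by the payload."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)
    stream.flush()


def _read_exact(stream: BinaryIO, n: int) -> bytes:
    """Read up to n bytes, looping over short reads; returns fewer only at EOF."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(stream: BinaryIO) -> Optional[bytes]:
    """Read one frame, or return None if the stream ended cleanly between frames."""
    header = _read_exact(stream, 4)
    if not header:
        return None
    if len(header) < 4:
        raise EOFError("truncated frame header")
    (length,) = struct.unpack(">I", header)
    payload = _read_exact(stream, length)
    if len(payload) < length:
        raise EOFError("truncated frame payload")
    return payload


# In-memory stand-in for the stdio pipe between the CLI and the helper container.
pipe = io.BytesIO()
write_frame(pipe, b"GET sha256:app-v2")  # hypothetical request from the helper
write_frame(pipe, b"<layer bytes>")      # hypothetical response from the CLI
pipe.seek(0)
print(read_frame(pipe))
print(read_frame(pipe))
print(read_frame(pipe))
```

In a real process the same functions would be called on `sys.stdin.buffer` and `sys.stdout.buffer` rather than on a `BytesIO`.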

This makes everything very efficient! One cool part about the remote build machines: you can share cache with anyone on your team who has access to the same project. If your teammate already built all or part of the container, your build just reuses the result. So in addition to using the fast remote builders instead of your local device, you can actually get cache hits on code you haven’t personally built yet.
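To see how a teammate's build can produce cache hits for you, here is a toy model in Python. It is a simplification and not how BuildKit or Depot actually compute cache keys: `step_key`, `SharedCache`, and the input digests are hypothetical. The idea it shows is that a step's key depends only on its parent step, its instruction, and its inputs, so identical steps map to the same key no matter who ran them first.

```python
import hashlib
from typing import Dict, List, Tuple


def step_key(parent_key: str, instruction: str, input_digest: str) -> str:
    """Derive a key from the parent step, the instruction, and its inputs."""
    material = "\n".join([parent_key, instruction, input_digest]).encode()
    return hashlib.sha256(material).hexdigest()


class SharedCache:
    """A project-wide cache: every build on the project reads and writes it."""

    def __init__(self) -> None:
        self._results: Dict[str, str] = {}

    def build(self, steps: List[Tuple[str, str]]) -> List[str]:
        """Run the steps in order and report 'hit' or 'miss' for each."""
        outcome = []
        parent = ""
        for instruction, input_digest in steps:
            key = step_key(parent, instruction, input_digest)
            if key in self._results:
                outcome.append("hit")
            else:
                self._results[key] = "layer-" + key[:12]  # pretend we built it
                outcome.append("miss")
            parent = key
        return outcome


cache = SharedCache()
teammate = [
    ("FROM python:3.11", ""),
    ("RUN pip install -r requirements.txt", "reqs-v1"),
    ("COPY . /app", "src-a"),
]
# Same base image and dependencies, different source tree.
you = teammate[:2] + [("COPY . /app", "src-b")]

print(cache.build(teammate))  # first build on the project: nothing cached yet
print(cache.build(you))       # shared steps are reused, only the changed step misses
```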

We’d love for you to check it out, and are happy to answer any questions you have about technical details!

https://depot.dev/docs/guides/local-development






We have been happily using Depot for months now to build https://plane.dev. Prior to finding Depot, we basically gave up on building an M1 image from a GitHub action.

(btw, I always get suspicious when a Show HN post has a lot of praise in the comments, but I swear the Depot folks did not ask me to post anything and I only saw the post because I was checking HN)


Ah, I read through the stuff you guys were working on a while back. We use docker but haven’t made the leap to k8s and friends yet. One reason is that we too have dedicated backends per user and it doesn’t seem like an out of the box fit (never used k8s, so might be wrong). Your solution looks to fit our problem better (need a persistent dedicated process per user on the backend). Will take another look.

Glad to hear it! Feel free to reach out if you have any questions. We went the kubernetes route ourselves before coming to the conclusion that it was not the right fit for the problem.

Just to echo the other comments: really impressed with both Depot and the team. I decided to kick the tyres on it last week and suddenly found myself replacing all our production docker builds with it by the end of the day. Felt like my Tailscale experience in terms of onboarding.

Totally seamless integration and it solves a very real issue that I’ve had with docker caching across our environments. We tried with the docker s3 cache originally but it didn’t really work in practice. Depot is the answer.

When I ran into an issue last week, the guys responded and scheduled a call within minutes.

Depot are a team I’m happy to back with a product I’m very happy to pay for.


Hey Depot peeps, I like the idea of faster builds, but that's not what I really need. I need easier builds.

Making a simple container with a simple app is easy. The devil's in the details. What if you want to pull from a private registry using temporary credentials tied to a specific user, then use different temporary credentials during the build to pull packages from a different private package repository, persist the package cache across different container builds, then push the images to a different remote registry with more temporary credentials, with multi-stage builds, without capturing secrets in the container or persisting environment variables or files specific to the credentials?

Now what if you wanted to do all that in a random K8s pod?

Yes, of course there are ways to do this, I've pretty much done it all. But I've spent a huge amount of time to figure out each and every step. I've seen dozens of people take the same amount of time to figure it all out, often taking years to truly gather up all the knowledge. You know what would be great? If I didn't have to do that. If somebody just said "Here, Product X does everything you will ever want. The docs explain everything. Now you have 600 hours of your life back.", I would say Take. My. Money. I don't even necessarily need a product, if someone could just write down how to do it, so I don't have to bang my shins for days to get one thing to work.

Fast builds are nice because I can run more builds, but easier builds are nicer because more people can work on containers faster.


Sounds like you have some very complex, niche requirements. Perhaps paying an experienced human is the best solution (sleepless nights included), instead of layering yet more technical complexity on top.

A happy paying customer. Depot is great and their docker drop-in replacement GitHub Actions are working perfectly, highly recommended. Thanks folks!

Super cool - we've been using Depot in our CI pipelines since Feb and it's allowed us to focus on shipping / keeping our CI infrastructure simpler. Kyle and Jacob have been super-responsive whenever we've encountered issues.

I've been using Depot for a while and I'm a fan. It has made our (GitHub Actions) builds a lot faster because the cache is always warm. It also helps a lot when building amd64 images from my arm64 Macbook - that's excruciatingly slow if I just run `docker build`.

My only complaint is more about GHA - I wish there was an easier way to build multiple unrelated images at the same time in a single GHA job. Running `depot build &` to background things is a bit fiddly when it comes to interleaved console output, exit codes, etc.
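One way to sidestep the fiddly parts of `&` in a shell step is a small wrapper script that runs the builds concurrently, buffers each build's output, and reports a single exit code. The sketch below is an assumption about how one might do this, not a Depot feature. `run_one` and `run_all` are hypothetical helpers, and the demo uses throwaway Python commands so it runs anywhere.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def run_one(name: str, command: List[str]) -> Tuple[str, int, str]:
    """Run one command, capturing stdout and stderr together as text."""
    proc = subprocess.run(
        command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return name, proc.returncode, proc.stdout


def run_all(commands: Dict[str, List[str]]) -> int:
    """Run all commands concurrently; return 0 only if every one succeeded."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        results = list(pool.map(lambda item: run_one(*item), commands.items()))
    failed = []
    for name, code, output in results:
        # Output is printed per command after completion, so it never interleaves.
        print(f"===== {name} (exit {code}) =====")
        print(output, end="")
        if code != 0:
            failed.append(name)
    if failed:
        print("failed:", ", ".join(failed))
    return 1 if failed else 0


# Stand-in commands; in CI each entry would be a real build command instead.
demo = {
    "image-a": [sys.executable, "-c", "print('built image a')"],
    "image-b": [sys.executable, "-c", "import sys; print('boom'); sys.exit(3)"],
}
print("overall exit code:", run_all(demo))
```

In a workflow, each entry would become something like `["depot", "build", "./api"]`, and the script would end with `sys.exit(run_all(...))` so the job fails if any build fails. The trade-off is that output appears only once a build finishes rather than streaming live.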


Nice!

Our Docker builds are getting slow despite using Kaniko. Does Depot have better caching than Kaniko?

How so?


It should, yeah. Our builders are based on BuildKit rather than Kaniko, and BuildKit optimizes for building container images in parallel and caching as much as possible. BuildKit also supports some more advanced types of caches, such as cache mounts: https://github.com/moby/buildkit/blob/master/frontend/docker...

Both Kaniko and BuildKit can be run in rootless mode - we are not doing this, instead we give every builder access to an isolated VM, so builds are a bit quicker as well by avoiding some of the security tricks that rootless needs to work.


Where does this isolated VM run?

In AWS - we launch either Intel or Arm EC2 instances depending on the requested build platform (or both for multi-platform builds). When a project's builds are running, they have sole control of that instance, which is terminated when the builds are done.

To make this performant we keep a certain number of spare "warm" machines ready for build requests so that you don't have to pay the instance launch time penalty yourself.
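The warm-machine idea can be sketched as a small pool that is kept topped up to a target size. This is a toy model under stated assumptions, not Depot's scheduler: `WarmPool`, its `launch` callback, and the `vm-N` names are hypothetical, and the refill here happens synchronously, whereas a real system would launch replacements in the background.

```python
from collections import deque
from itertools import count
from typing import Callable, Deque


class WarmPool:
    """Keep `target_size` pre-launched machines ready for incoming builds."""

    def __init__(self, target_size: int, launch: Callable[[], str]) -> None:
        self._target = target_size
        self._launch = launch
        self._ready: Deque[str] = deque(launch() for _ in range(target_size))

    def acquire(self) -> str:
        """Hand out a ready machine if there is one, else launch on demand."""
        machine = self._ready.popleft() if self._ready else self._launch()
        self._refill()
        return machine

    def _refill(self) -> None:
        """Launch replacements until the pool is back at its target size."""
        while len(self._ready) < self._target:
            self._ready.append(self._launch())

    def ready_count(self) -> int:
        return len(self._ready)


ids = count(1)
pool = WarmPool(target_size=2, launch=lambda: f"vm-{next(ids)}")

# The build gets a machine that was launched ahead of time.
print(pool.acquire())
# The pool has already been topped up for the next request.
print(pool.ready_count())
```

Machines are never returned to the pool in this sketch, matching the description above that an instance is terminated when its builds are done.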


Just to clarify, when you run `depot build`, does the build run locally or remotely on an EC2 instance? Also, it sounds like the instances are on your side, not on the customer's infrastructure. Compounding build time is a problem, but I think we solved it with BuildKit cache. The setup you are describing, if I understand correctly, might be a no-go for enterprise customers. Maybe you are going after mid-market companies, in which case it might work. Just an opinion from my side.

I think Kyle answered this below, about the option for enterprises to run the data plane of Depot in their own cloud account. In that model, the Depot CLI connects directly to that data plane without passing through any infrastructure on our side.

> I think we solved it with buildkit cache

One big thing we're doing here, if you're familiar with BuildKit cache, is providing builds a stable cache SSD that's reused between builds. This means we support all of BuildKit's caching features, including things like cache mounts that aren't directly supported in ephemeral CI environments. Plus, Depot doesn't need to save the cache to, or load it from, a remote store like S3 or the GitHub Actions cache; instead, the previous cache is immediately available on build start.

This may not be any better or different than what you're doing, I just wanted to mention the detail for anyone familiar with trying to make BuildKit more performant.


It’s a client/server model with remote BuildKit, I believe.

> the setup you are describing, if I understand correctly might be a no-go for enterprise customers

Fwiw, I know at least buildbuddy operates in a similar space (hosted bazel cache/builds, client/server style).


I haven’t tried but I believe you can do either. You can use their api and tooling and it will connect to agents in your infra.

Hi there! Kyle here, the other half of Depot. This is correct, we have a self-hosted data plane model that larger enterprises can use if they want full control over the builders + build cache.

In that deployment model, the Depot control plane passes changes to make in the customer's environment via a small agent they run in their account. Here are some docs we put together for anyone who wants to go into a bit more of the details: https://depot.dev/docs/self-hosted/architecture


So helpful to have a persistent Docker cache across builds. We sped up our Docker builds by 40-50% on average which directly contributes to speeding up our iteration speed. Excited to try the new local option - probably will be awesome for local testing of Docker builds!

Depot is freaking awesome. Sped up two of our Docker image builds from 11 minutes to 1-1.5 minutes and the drop-in Docker build replacement in GitHub Actions was super easy. Can't imagine our CI/CD system without it.

We're using Depot at Hathora, and it's enabled us to focus on our platform development instead of worrying about CI/build pipelines. We're very happy with the speed improvements we're noticing.

I was expecting a blog post that explains in depth what a "Docker layer" is and what the best practices are for optimizing a Dockerfile.

Have "Show HN" posts ever been just a blog?

Why not just use the Nix build system for that? It has been able to create Docker images for years now, and quickly.

People will often go to any lengths possible to avoid using nix.

Happy customer here too at windmill.dev :)

We’ve also been happily using Depot.

Awesome work!

Cool post, guys.


