What you're looking at here is informally the "v3" effort, which extends and consolidates the "v2a" and "v2b" designs evolved by Heroku and Cloud Foundry respectively from the original Heroku design.
In v2 both Heroku and Cloud Foundry provide supported PHP buildpacks, as well as Java, Ruby, Python, .NET Core and I forget the rest right now. There are hundreds of community buildpacks.
It's not tied to the packaging format. Detect is the step that decides which buildpack or buildpacks will be responsible for constructing the image from the source code.
Typically this means that buildpacks look for files that correspond to the relevant ecosystem. Maven buildpacks look for pom.xml. PHP buildpacks look for composer.json. Etc.
Nothing in this creates a hard binding. Detect steps may use whatever logic they need to decide on whether to signal they can work on a codebase.
Edit: in the v3 design the detect script can also provide dependency information that later steps can pick up. So, for example, a JDK buildpack can say "yes, I can interpret this codebase, and I can contribute a JDK". A later buildpack can then look for this contribution as a condition, e.g. the Maven buildpack can say "I will proceed if I see a pom.xml and if there is a JDK available".
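To make that concrete, a v2-style detect is usually nothing more than a tiny script. Here's a minimal sketch of a Maven-flavoured bin/detect (the paths and reported name are illustrative; in v3 this step additionally reports what it can contribute and what it requires, rather than just a name):

    #!/usr/bin/env bash
    # bin/detect: the platform passes the build directory as the first argument.
    # Exit 0 to claim the app (and print a name), non-zero to pass.
    BUILD_DIR="$1"

    if [ -f "$BUILD_DIR/pom.xml" ]; then
      echo "Maven"
      exit 0
    fi

    exit 1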
> As one of the maintainers of Herokuish - a tool that manages the application of buildpacks to repositories and is used in quite a few Buildpack implementations - I am super happy that CNCF reached out to us and included us in the process. Oh wait, that didn't happen...
All jokes aside, this looks great. Super-early of course - seems like there are quite a few issues in the `pack` repository to be implemented - but I'm excited to see where this lands. Buildpack detection and application is not a straightforward problem.
Dockerfiles require you to rebuild lower layers when any upper layers change, even though the OCI image format doesn't care about this. Cloud Native Buildpacks can intelligently choose which layers to rebuild and replace. Additionally, certain layers can be updated en masse for many images on a registry (using cross-repo blob mounting, with no real data transfer!), which is useful for patching CVEs quickly at scale.
The samples take advantage of this (as well as a separate transparent cache) in order to demonstrate the different aspects of the formal spec. A simple buildpack is not necessarily much more complicated than a simple Dockerfile.
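For anyone curious what "no real data transfer" means mechanically: it's the registry API's cross-repo blob mount. A rough sketch with curl (the registry, repository names, digest and token are all placeholders):

    # Docker Registry HTTP API v2: ask the registry to mount an existing blob
    # from another repository instead of re-uploading it.
    curl -i -X POST \
      -H "Authorization: Bearer $TOKEN" \
      "https://registry.example.com/v2/team/app-b/blobs/uploads/?mount=sha256:<layer-digest>&from=team/app-a"
    # A "201 Created" response means team/app-b now references the layer
    # without any blob bytes moving over the network.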
Yes, if the underlying layer’s hash changes then it has to be rebuilt. But if you just change index.html it caches the other layers and builds are very quick.
My issue with Buildpacks is that it looks like a glorified bash script (a skill I am not bashing, pun not intended), whereas a Dockerfile is much more human-readable, and the idea of layers, for a guy coming from a systems background, is much more intuitive to me. The analogy of a very lightweight VM makes perfect sense to me, which means I'm much more productive with it.
1. Developers.

What you need to know is: it just works. You don't even need to write a Dockerfile any more. The buildpack turns your code into a well-structured, efficient, runnable image with no additional effort (there's a rough sketch of the workflow below, after the list).
2. Operators.
What you need to know is: oh thank god no more mystery meat in production. Patching the OS is a ho-hum affair instead of a stone grinding nightmare that turns your developers into a white hot bucket of rage because you have to nag them or block them from deploying or both.
3. Platform vendors, buildpack authors and curious passers-by
What you need to know is: All the other stuff about detect, analyse, build or export.
Unless you are in group 3, the basic thing is that Buildpacks require less effort than Dockerfiles with more safety and faster builds.
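For group 1, the day-to-day workflow looks roughly like this. It's a sketch only, since the exact flags and builder image depend on which tooling you're using:

    # Build a runnable OCI image straight from source with the pack CLI,
    # then run it like any other container image. Builder name is illustrative.
    pack build myorg/myapp --path . --builder example.com/some/builder
    docker run -p 8080:8080 myorg/myapp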
I don't think you're missing anything. It's vendors that don't do Docker trying to stay relevant in a Docker/Kubernetes world. By making Docker more complex, said vendors can continue to charge $500/pod for OSS k8s.
It's open source and you can run it yourself, which has incidentally been true since before either Docker or Kubernetes existed. We've got an open PR to add Cloud Native Buildpack support to Knative Build. You can take these containers and run them on whatever Kubernetes you like.
I just tried this on my rails project and it is detecting a nodejs project. Is there a way to have it use ruby instead while still taking care of my yarn dependencies?
> This can be a problem with buildpacks, since you'd need one buildpack that has both the ruby and nodejs runtimes.
This isn't strictly true of v2 designs (both Heroku and Cloud Foundry have schemes for multi-buildpack support) and in the v3 design explicit consideration is given to making mix-and-match a triviality. Buildpacks can cooperate quite easily and in multiple groups.
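For the Rails-detected-as-Node case above, the v2-era escape hatch looks something like the following (buildpack names are the common defaults; adjust for your platform):

    # Heroku: stack buildpacks in order; Node.js first for the yarn/asset work,
    # Ruby last so it remains the primary buildpack.
    heroku buildpacks:add --index 1 heroku/nodejs
    heroku buildpacks:add heroku/ruby

    # Cloud Foundry: pass multiple buildpacks to cf push, final buildpack last.
    cf push my-rails-app -b nodejs_buildpack -b ruby_buildpack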
Where this shines is in updates. If I have an OS base layer, a JDK buildpack and a Maven buildpack, the layers they generate can be independently updated without needing a complete rebuild. So far as I am aware, this is not currently possible with a Dockerfile, multi-stage or not. If you invalidate the OS layer, everything else gets rebuilt whether it needed to be or not.
That's cool, I'll need to read up on the v3 design.
I think you are right about the rebuilds, but maybe not in all cases. If you had:

    # Build the JS assets in one stage...
    FROM node:whatever AS js
    RUN npm run build

    # ...and the Go binary in another...
    FROM golang:whatever AS go
    RUN go build

    # ...then assemble the runtime image from a shared base.
    FROM base
    COPY --from=js app.js .
    COPY --from=go app .

If only 'base' was updated, rebuilding the image would just need to re-run the COPY commands. The only way everything would get rebuilt is if you also updated the go and js images.
I agree, the Nix approach makes lots of problems go away, but you have to be bought in first. I wrote something distinct a few months back which overlaps on the "we can be smarter about layers" thing[0].
Google folks worked on "FTL", which is a technique for determining layering by reasoning about packaging system information. Jib[1] is one such system. There is a view that it will be possible to use FTL implementations or derivatives as buildpack components in the future.
Isn’t that the problem that Google’s image-rebase tool [0] is supposed to solve? Given a well constructed image with an OS, JDK, and app layers, it could rewrite any (or all) layers.
That's one of the technologies being used in buildpacks v3. It's the key to very fast updates. But it's not the whole picture: having a standard way to assemble the images means that the rebasing operation can be done safely.
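As a sketch of what a rebase looks like operationally, go-containerregistry's crane tool exposes the same idea directly. The flag spellings below are from memory and may differ between versions, so treat this as illustrative and check `crane rebase --help`:

    # Swap the base layers under an app image without rebuilding the app layers.
    # Image names are placeholders; flags may vary by crane version.
    crane rebase registry.example.com/team/app:latest \
      --old_base registry.example.com/base/run:2018-09-01 \
      --new_base registry.example.com/base/run:2018-10-01 \
      -t registry.example.com/team/app:patched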
We've also met to talk about buildpacks and have pre-existing working relationships with all the relevant folks in Google Cloud Builder and Google Container Tools teams through our work on Knative.
I currently use Bazel in combination with rules_docker to construct container images of applications, often written in Go. As I don't depend on cgo, my container images may be bare (i.e., only containing a Go executable and some static artifacts).
Though I understand that there are likely only dozens of us, my question is: what would the use of Buildpacks buy us for this specific use case?
One case is that you could have a Bazel buildpack that contains Bazel, Java, and everything else required to run the build.

Unfortunately the resulting image would be larger than just a single binary, but building it would be a lot easier and more repeatable for other people working on the project. The base layers would be cached, though, so in practice it might not be that much larger on disk.
I looked into using Bazel for doing similar things, and the biggest stumbling block is that Bazel itself is a PITA to install on all platforms. I may end up trying https://please.build/ at some point.
Unfortunately that's a non-starter for things like Debian, but that's good to know; I'll give it another look.

One thing I really like about Bazel is the pkg rules. I currently use fpm/goreleaser (nfpm) to build RPMs for things, and it's nice having a single build tool that can build the app and spit out an RPM.
Mostly the advantage would be that 1) you wouldn't have to maintain the buildpack yourself, 2) operators can upgrade your software without needing to get you to do it, 3) upgrades might actually be faster due to layer rebasing and 4) if you add other stacks to your architecture, you still have the same development, deployment and update experience.
Hi, in my role as professional gadfly, I have been involved with this tangentially for a few months and directly for a few weeks (on behalf of Pivotal).
I used the Onsi Haiku over on the equivalent HN thread about Heroku's blog post:
Here is my source code.
Run it on the cloud for me.
I do not care how.
The gist is that Docker containers are awesome for the Day 1 experience. I write a Dockerfile and I'm off to the races.
But then Day 2 rolls around and I have a production system with 12,000 containers[0].
1. What the hell is in those containers, anyhow?
2. A new CVE landed and I want to upgrade all of them in a few minutes without anyone being interrupted (or even having to know). How?
3. I have a distributed system with many moving parts. I build a giant fragile hierarchy of Dockerfiles to efficiently contain the right dependencies, making development slower. Then I snap and turn it into a giant kitchen-sink Dockerfile with the union of all the dependencies in it. Now production is slow as hell.
4. Operations become upset about points 1-3. Now I can only use curated Dockerfiles, every build has to come through our elaborate Jenkins farm, rules, rules, rules. Wasn't the purpose of Dockerfiles to make this all just ... go away?
Buildpacks solve all of these. I know what's in the container because buildpacks control the build. I can update CVE flaws in potentially seconds. Each container can have what it needs - no more, no less.
And most important: the buildpack runs locally, or in the cluster, exactly the same. It's all the developer benefits of Dockerfiles/docker build, minus most of the suck.
It sounds like a buildpack identifies a common case of what someone might be doing in a container / Dockerfile, and standardizes it to remove the arbitrary variation that inevitably occurs when lots of people are independently solving exactly the same problem. For example, the patch level of the OS is directly or indirectly specified in the Dockerfile, but generally doesn’t actually matter to the application.
If your use case fits into a scenario which is handled by an existing buildpack, then the claim is that you’ll be better off using the buildpack because the infrastructure can make optimizations that can’t be made with arbitrary containers.
If your use case isn’t covered by a buildpack, then you can either (1) make a buildpack or (2) revert to raw containers.
Cloud Native Buildpacks are based on a very well-proven model. Heroku does this at massive scale. So does Pivotal, and so do many of our customers and the customers of other buildpack-using systems like Deis.
Step 1: replace your Dockerfile with one that consists of the single line "FROM cloud-gov-registry.app.cloud.gov/python-buildpack". The buildpack contains magic that knows how to turn a standard python program into a usable docker image. (AKA, it inspects requirements.txt, et cetera).
Step 2: Now that your Dockerfiles no longer contain any real information, retool your orchestration system to use source tarballs rather than docker images.
That sounds un-ideal to folks who are used to and comfortable with the Docker image/Dockerfile based workflow. What advantages does this provide over plain Dockerfiles?
In a situation where there are many running containers, a security patch could be applied across the board with confidence that running apps are not affected. This would be a case-by-case situation if they were all Dockerfiles.
The cool part is that when the patch arrives, devs don't have to do anything to be patched. Cloud Foundry already does this with buildpacks and so does Heroku.
A big part of what's new is that layer rebasing could make this really really fast.
Could you elaborate, though? It helps us to understand what seems magical so that we can either explain it better or make the process more transparent.
For companies with compliance requirements, you're required to change systems explicitly; this is one of the reasons why Docker images work so well: you can target specific tags for deployment and nothing changes with the same tag.

Aside from that, I think it's a bad idea to have things update automatically. What if the upstream fix breaks things? It reduces trust in the build system.
> For companies with compliance requirements, you're required to change systems explicitly; this is one of the reasons why Docker images work so well: you can target specific tags for deployment and nothing changes with the same tag.
This isn't really true, though. Tags are floating targets, only the digest is stable. Taking Kubernetes as an example, suppose I push an updated Pod definition where I've changed an image tag from v1 to v2. If the tag is not properly locked, then I can be running multiple versions of the software without even realising it.
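If the compliance concern is immutability, pinning by digest is the way to get it. Something like this (all names are illustrative):

    # Resolve a tag to its content-addressed digest, then pin the workload to
    # it so the tag can't silently move underneath you.
    DIGEST=$(docker inspect --format='{{index .RepoDigests 0}}' registry.example.com/team/app:v2)
    kubectl set image deployment/myapp app="$DIGEST"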
Speaking of regulation, we find a lot of people like buildpacks for that exact reason. Operators know exactly what OS is running in every container, exactly what JDK is running in every container, everything up to the runtime (and as FTL matures, up to the package dependencies as well). The platform doesn't have to accept any old container, they can all enter through a trusted pathway.
You can do this with Docker builds, of course: you build CI/CD, you have centrally controlled images, you prevent non-conforming images from reaching production and so forth. But then you've pretty much recreated all the stuff buildpacks gave you, except you're the one having to maintain it.
> Aside from that, I think it's a bad idea to have things update automatically. What if the upstream fix breaks things? It reduces trust in the build system.
Rebasing layers is close to instant. You can roll back the change as soon as it looks bad. More to the point, if the OS vendor or runtime has broken ABI compatibility, rebuilding a Docker image won't necessarily help you notice that before runtime.
Gotcha. Not my intention to jump down your throat with hiking boots, but the tag-vs-digest thing has been a big part of what I've worked on over the past few months. I agree heartily that keeping the books using digests is the only sane option and buildpacks retain that property. They're just producing and updating OCI images, at the end of the day.
LinuxKit (as I understand it) is for building the OS itself. Buildpacks are a higher-level abstraction for building a complete application image (with the emphasis on application). They take app source code as input and output a Docker image that's ready for prod.
Presentation to CNCF TOC: https://docs.google.com/presentation/d/1RkygwZw7ILVgGhBpKnFN...
Formal specification: https://github.com/buildpack/spec
Sample buildpacks: https://github.com/buildpack/samples