
Truly immutable builds - kiyanwang
https://blog.frankel.ch/truly-immutable-builds/#gsc.tab=0
======
jermo
I believe "reproducible build" is a more appropriate term.

Unfortunately it's quite difficult to get a fully reproducible build with Maven,
as the post points out. Even if you manage to replicate the build environment
and settings, Maven-produced binaries will be slightly different on each run:
file timestamps are saved in jar files, so they change on every build.
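(For what it's worth, newer versions of the Maven archiver plugins can zero this out: setting the `project.build.outputTimestamp` property makes every archive entry use one fixed timestamp. A sketch; it requires sufficiently recent maven-jar-plugin et al.:

```xml
<properties>
  <!-- All archive entries (jar, war, ...) get this constant timestamp,
       so two builds of the same commit can produce byte-identical archives.
       Needs recent enough maven-jar-plugin / maven-source-plugin versions. -->
  <project.build.outputTimestamp>2020-01-01T00:00:00Z</project.build.outputTimestamp>
</properties>
```

Timestamps are only one source of non-determinism, but they're the biggest one in jars.)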

------
paulddraper
FYI, the typical phrase here is "hermetic" (sealed, air-tight).

Several build systems stress this aspect, including Google's Bazel, currently in beta.

------
jonhohle
Missing: all of the other software on the host (including maven) and the OS
itself.

~~~
pmontra
Yes. Maybe it doesn't apply to Java, but I've been unable to set up projects on
a new machine because library versions changed, external programs became hard
to build themselves, some software is impossible to find anymore (I remember a
customer with a prehistoric MySQL; we had to find the oldest available
bug-compatible version and hope, since they had no tests), etc.

A Vagrantfile or a Dockerfile is not enough. You need the actual images.

~~~
rtpg
Or in your Dockerfile you specify the versions of things to install, pin the
repos to use, etc. etc.

It's versions all the way down
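As a sketch of what that pinning looks like (the digest placeholder and package version below are illustrative, not real values to copy):

```dockerfile
# Pin the base image by content digest, not by a mutable tag,
# so "debian:bullseye" can't silently change under you.
FROM debian:bullseye@sha256:<digest-of-the-image-you-tested>

# Pin exact package versions too (this version string is made up)
RUN apt-get update && apt-get install -y --no-install-recommends \
      openjdk-11-jdk=11.0.16+8-1~deb11u1 \
 && rm -rf /var/lib/apt/lists/*
```

Of course, this only moves the problem: the pinned versions must still exist somewhere when you rebuild.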

~~~
ris
You're assuming that all those repos are still around 5 years later.

~~~
manyxcxi
They are if you mirror the Docker registry. We use Sonatype Nexus and get the
added benefit of having our own private Docker, NPM, Maven, etc. repositories
on top of automatically caching dependencies we use.

So now we don’t have to worry about them disappearing AND we don’t have to
leave the network to get them... assuming we don’t screw up our server.
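For Maven, routing everything through such a mirror is a one-stanza change in settings.xml (a sketch; the Nexus host name is obviously ours to invent):

```xml
<settings>
  <mirrors>
    <mirror>
      <id>nexus</id>
      <!-- Route every repository request through the internal proxy,
           which caches each artifact the first time anyone fetches it. -->
      <mirrorOf>*</mirrorOf>
      <url>https://nexus.example.com/repository/maven-public/</url>
    </mirror>
  </mirrors>
</settings>
```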

~~~
pmontra
We own our registry. That's why we have the images for one project. We install
from there.

What we don't do yet is store a pristine VirtualBox image for another
project. We should start doing that, because sometimes building those images
requires a surprising amount of work. Bitrot, etc.

------
gumby
"Truly immutable" story: at Cygnus we had a phone switch company (DSC) as a
customer. They paid us a LOT of money to support a frozen version of the tools
(gcc etc). When they found a compiler bug and asked for a fix they would then
diff the binaries and use the debugging data to make sure that every delta in
the binary was solely a result of the lines of code in the bug fix.

They were so hard core about immutability: they had an old code base (Z800? I
can no longer remember) that they didn't want to migrate away from even though
the hardware was no longer available, so they built an emulator and ran the
old code on that on their newer, PPC-based hardware. Newer code ran natively
on the PPC.

Their concern: they had SLAs with their customers (not SAAS customers,
customers who bought the hardware) of no more than two minutes of downtime per
decade.

Never heard of DSC? Yes, famously their hardware brought part of AT&T's
network down for 10 hours or so in the late 90s or early 00s. That was the end
of DSC.

------
xchaotic
I'm not too convinced by this trend of hiding complexity. Shouldn't we, in this
case, expose the complexity and KISS instead? What good is an
immutable/hermetic build going to do if it's broken and you don't know where or
how?

------
saosebastiao
Just out of curiosity, as I've never personally experienced any bugs of the
sort: what is an example of a really nasty bug that someone has experienced
that this sort of thing prevents?

~~~
joshribakoff
Inherited a project running in production whose sources in git did not build.
The project depended on packages which were present when it was built, but
never added to package.json. The code in production was minified. I had to
track down a ton of missing npm dependencies.

Another example is someone depends on a library, then that library's author
deletes it.

This is why it may be worth committing your dependency's sources, despite
"best practices" dictating otherwise. When someone needs to add a feature
10yrs from now and they only have an incomplete package.json & a copy of
minified code, they're in for a world of hell.
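A lockfile at least pins what was actually installed: package-lock.json (npm 5+) records the exact version, resolved URL, and an integrity hash for every dependency, so "present at build time but never recorded" can't happen silently. An illustrative entry (hash elided):

```json
{
  "dependencies": {
    "left-pad": {
      "version": "1.3.0",
      "resolved": "https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz",
      "integrity": "sha512-..."
    }
  }
}
```

It doesn't protect you against the author deleting the package, though; for that you still need your own copy.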

I even had to recover a project which built non-deterministically. Sometimes
it would build, sometimes it would error; running the build several times, it
would eventually succeed. The project's author had written their build system
against unstable versions of Webpack & TypeScript, before they were popular
(both of which are now perfectly stable).

~~~
viraptor
> This is why it may be worth committing your dependency's sources, despite
> "best practices" dictating otherwise

I think you may have misunderstood the best practice here. That best practice
is that you shouldn't just copy the dep code into your repo. The
maintenance/release best practice, however, is that you should always keep a
copy of all your dependencies, stored locally and permanently.
They're two completely separate rules.

~~~
joshribakoff
So are you saying you store them, but not in your version control? What do you
mean, like backups of production? What about when production only contains
minified / compiled artifacts from the build? Where should I be storing my
dependencies? Got any articles to share that elaborate? Or links to open
source projects on GitHub that employ this "best practice"? Everyone seems to
just commit bower.json, package.json, composer.json, etc. and call it a
day.

~~~
viraptor
Basically, don't rely on external sources. Keep your own repository with
the packages you need. Whether that's controlled with git, backups, a prepared
S3 bucket, or some other way is less relevant.

The reason stuffing the dependencies in with the normal source is bad is that
you end up with a huge repo after a while, where most of the size comes from
old dep versions you're unlikely to ever use (but have to clone anyway).
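Sticking with the npm example, one minimal version of this is to route all installs through a registry you control via .npmrc (the host name here is invented):

```ini
; .npmrc — every npm install resolves against the internal registry,
; which proxies and permanently caches the public one
registry=https://npm-mirror.example.com/
```

Verdaccio or Nexus can play that role; the point is that the cache outlives the upstream package.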

~~~
joshribakoff
I guess it's a tradeoff. A bloated history sucks for sure, but so does
maintaining brittle legacy code that has to change often. It's a tradeoff
between a bloated repo and being able to easily bisect issues from arbitrary
points in the project's history. I was hoping you knew of some tool that
maintains the dependencies separately while correlating them with specific
commits in the main repo. It's nice being able to just check out arbitrary
commits and have it just work.

In one project there were dozens of package.json files. I committed the
node_modules as a short-term hack to get a one-step build. It makes git very
slow, but it was worth the tradeoff for this particular project. Another issue
is node-gyp: since it's platform-specific, you don't want to be committing
Mac-specific stuff when some team members use Linux. Managing dependencies can
really suck sometimes.

Another consideration is deployment. In my experience, deploying using
Capistrano and git is often faster than rsyncing a bunch of files or copying
from S3 or wherever. My conclusion for this project is: yes, it sucks that our
repo is now bloated. But maybe that should suck. Maybe it makes us more
cognizant of our dependency issues. Maybe having tons of deps should cause you
pain, as motivation to more carefully consider your dependencies.

