
Speeding Up Our Build Pipelines - pasxizeis
https://engineering.skroutz.gr/blog/speeding-up-build-pipelines-with-mistry/
======
plicense
To summarize:

1\. Your build pipeline has a lot of hermetic actions.

2\. To speed it up, you execute these actions remotely on isolated
environments, cache the results and reuse when possible.

Pretty neat.

You might want to look into [https://goo.gl/TB49ED](https://goo.gl/TB49ED) and
[https://console.cloud.google.com/marketplace/details/google/...](https://console.cloud.google.com/marketplace/details/google/remotebuildexecution.googleapis.com)
if you need a managed service to do just that.

------
peterwwillis
I suggest a documentation cleanup. The initial README should have blurbs about
who should use it, what it's for, how it does it, and links to example use
cases. A quick start guide steps a user through accomplishing a simple task,
and links to extended documentation. Extended documentation is the reference
guide to the latest code, and should be generated from the code. I would not
suggest splitting documentation up into multiple places (a readme here, a
lengthy blog post there, plus a discombobulated wiki); all documentation
should be accessible from a single portal, with filtering capabilities
(search is incredibly difficult to make accurate, whereas filtering is easy
and effective).

This whole solution seems like a very custom way to use Docker. You can
already create custom Docker images with specific content, use multi-stage
builds to cache layers, split pipelines into sections that generate static
assets and pull the latest ones based on a checksum of their inputs, etc. I
think the cost of maintaining this solution is going to far outweigh that of
just using existing tooling differently.

~~~
pasxizeis
Thanks for the suggestions, we'll improve the documentation soon.

------
rossmohax
In Docker that would be something like (assuming some Ruby base image):

      FROM ruby:2.5
      WORKDIR /src
      COPY Gemfile Gemfile.lock ./
      RUN bundle install

Even if they don't use Docker to run the application in prod, it can be
[ab]used to get efficient build-step (layer) caching and distribution.

~~~
quickthrower2
The problem with Docker is that if something early in the Dockerfile changes,
everything after it needs to be rebuilt.

However, in a build you might have A and B feed into C. If A changes and B
hasn't, you want to rebuild just A and get B from the cache.
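
To make the problem concrete, here's a minimal single-stage sketch (the
directory names and `make` targets are hypothetical):

    # any toolchain image with make
    FROM gcc
    COPY a/ ./a
    RUN make -C a    # a change in a/ invalidates this layer...
    COPY b/ ./b
    RUN make -C b    # ...and every layer below, so B rebuilds even though b/ didn't change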

~~~
caleblloyd
One solution in Docker is multistage builds where A and B can be separated
into intermediate images with results copied into C.

A pattern that I like to use is two multistage build Dockerfiles: `dev` and
`prod`.

The `dev` image has 2 stages. The first stage copies only the files required
by the package manager, such as `package.json`, into another directory using
a combination of `find` and `cp --parents`, then restores dependencies. The
second stage copies the dependencies from the first stage and overlays the
source code. The `dev` image is then instantiated to run all tests.

The `prod` image also has 2 stages. The first stage starts with the `dev`
image and publishes a production bundle to a directory. The second stage
starts with a clean image and copies the production bundle from the first
stage.
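
A rough sketch of that pattern for a Node project (base images, stage names,
and output paths are my assumptions, and the `find`/`cp --parents` trick is
elided for brevity):

    # dev Dockerfile
    # Stage 1: restore dependencies from the package manifests only, so this
    # layer is reused as long as package.json / package-lock.json don't change
    FROM node:10 AS deps
    WORKDIR /app
    COPY package.json package-lock.json ./
    RUN npm ci

    # Stage 2: overlay the full source on top of the cached dependencies
    FROM node:10 AS dev
    WORKDIR /app
    COPY --from=deps /app/node_modules ./node_modules
    COPY . .
    CMD ["npm", "test"]

    # prod Dockerfile
    # Stage 1: start from the (hypothetical) dev image and publish a bundle
    FROM myorg/app-dev AS build
    RUN npm run build

    # Stage 2: start clean and copy only the production bundle
    FROM node:10-alpine
    WORKDIR /app
    COPY --from=build /app/dist ./dist
    CMD ["node", "dist/server.js"]

A source-only change then invalidates just the second stage of `dev`; the
dependency layer comes straight from the cache.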

------
nstart
Curious what the HN community feels is a "slow deploy". I scanned the article
for time reductions first and still couldn't tell how much time the build
actually took in the end.

11 minutes is a great time reduction. (11*30 builds a day = 5.5 hours saved in
total).

But I'm still not sure what constitutes a slow build. I assume at some point
there's an asymptotic curve of diminishing returns, where in order to shave
off a minute the complexity of the pipeline increases dramatically (caching
being a tricky example). So do y'all have any opinions on what makes a build
slow for you?

~~~
pasxizeis
For us, when assets had to be compiled, a production deploy would take 15-17
minutes. After mistry (when asset compilation time was shaved off), deployment
takes ~5 mins.

------
arenaninja
The first point isn't so much a change to the build pipeline as it is
avoiding the build pipeline altogether and deploying prebuilt artifacts; I
can't think of a reason to re-run your build for prod if you have already run
it for another environment. In other words, it's recognizing that the build
and deployment stages are different.
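
In Docker terms, for instance, that just means promoting the image that
already ran in the other environment instead of rebuilding it (registry and
tags below are illustrative):

    # re-tag and push the exact artifact that staging already exercised
    docker pull registry.example.com/app:abc1234
    docker tag registry.example.com/app:abc1234 registry.example.com/app:prod
    docker push registry.example.com/app:prod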

~~~
pasxizeis
Correct. It might sound obvious to some, but if you're an organization from
the Rails 0.x era (~2011), you have a lot of legacy code and infrastructure
that, albeit not fancy new tech, works.

------
siscia
It actually touches a point very close to my work.

We definitely need much more speed in running our pipeline.

The software is mostly C/C++ with a lot of internal dependencies.

Do you guys have any experience in that?

What is worth the complexity and what is not?

~~~
Joky
The main difficulty with C/C++ is making the build steps "hermetic". The
build system is frequently unaware of which files the preprocessor will
touch/look up beforehand.

Build systems like Bazel provide ways to enforce a level of isolation that
gives correctness guarantees. Bazel also provides an easy way to statically
compute a superset of the files needed for each step of the build, allowing
integration with a distributed execution service. Some related
documentation/discussions can be found starting from
[https://docs.bazel.build/versions/master/remote-execution.html](https://docs.bazel.build/versions/master/remote-execution.html)
Another good source of information is the online videos from BazelCon:
[https://conf.bazel.build](https://conf.bazel.build)
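
For reference, hooking a Bazel build up to such a service is mostly a matter
of a couple of flags, e.g. in `.bazelrc` (the endpoints are placeholders, and
the flag names are as in recent Bazel versions):

    # .bazelrc: cache and/or execute build actions remotely
    build --remote_cache=grpcs://cache.example.com
    build --remote_executor=grpcs://executor.example.com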

------
danielparks
This is basically parallel-make as a service.

This has been an increasingly difficult problem as more and more pipelines
move to containers for testing and building. What other solutions have folks
come up with?

~~~
jillesvangurp
I'm usually really obsessed with build speeds because I know how badly long
build times can suck the life out of a team. Slow builds cause a lot of
negative behavior and frustration. People sit on their hands waiting for
builds to finish, many times per day. It breaks their flow and leads to
procrastination. If your build takes half an hour, it's a blocker for doing
CI or CD, because it's not really continuous if you need to take a 30-minute
break every time you commit something.

Here are a few tricks I use.

\- Use a fast build server. This sounds obvious, but people try to cut costs
for the wrong reasons. CPU matters when you are running a build. This is the
reason I never liked Travis CI: you could not pay them to give you faster
servers, only more servers, and they used quite slow instances. When your
laptop outperforms your CI server, something is deeply wrong.

\- Run your CI/CD tooling in the same data center as your production and
staging environments, and avoid long network delays from moving e.g. Docker
images or other dependencies around the planet. Amazon is great for this, as
it has local mirrors for a lot of things you probably need (e.g. Ubuntu and
Red Hat mirrors).

\- Use build tools that do things concurrently. If you have multiple CPU cores
and all but one of them are idling, that's lost time.

\- Run tests in parallel. If you do this right, you can max out most of your
CPU while your tests are running.

\- Learn to test asynchronously and avoid using sleep or other stopgap
solutions where your test is basically waiting for something else to catch
up, blocking a thread for many seconds during which it does absolutely
nothing useful. People set timeouts conservatively, so most of that time is
wasted. Consider polling instead.

\- Avoid expensive cleanups in your integration tests. I've seen completely
trivial database applications take twenty minutes to run a few integration
tests because somebody decided it was a good idea to rebuild the database
schema in between tests. If your tests are dropping and recreating tables,
you are going to add many seconds to your build time for every test you add.

\- Randomize test data to avoid tests interacting with each other. Never
re-use the same database ids or other identifiers, and avoid magical names.
This lets you skip deleting data in between tests, which can save a lot of
time. Also, your real-world system is likely to have more than one user, and
part of the point of integration tests is to find issues caused by broken
assumptions about people doing things at the same time.

\- Dockerize your builds and use Docker layers to your advantage. E.g.
dependency resolution is only needed if the file that lists the dependencies
actually changed (see the sketch after this list). If you are merging pull
requests, you can avoid double work, because right after the merge the
branches are identical and Docker can take advantage of that.
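
A minimal sketch of that layering trick for a Gradle project (base image and
wrapper layout are assumptions on my part):

    FROM openjdk:8-jdk
    WORKDIR /build
    # Dependency layer: invalidated only when the dependency list changes
    COPY gradlew build.gradle settings.gradle ./
    COPY gradle ./gradle
    RUN ./gradlew --no-daemon dependencies
    # Source layer: edits here reuse the cached dependency layer above
    COPY src ./src
    RUN ./gradlew --no-daemon build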

For reference, I have a Kotlin project that builds and compiles in about 3
minutes on my laptop. This includes running over 500 API integration tests
against Elasticsearch (running as an ephemeral Docker container). None of the
tests delete data (unless that is what we are testing). Our schema is
initialized just once.

A cold Docker build for this project on our CI server can take 15 minutes,
because it just takes that long to download remote Docker layers, bootstrap
all the stuff we need, download dependencies, etc. However, most of our
builds don't run cold: typically, from commit to finished deploy takes around
6 minutes, jumping straight into compiling and running tests. Our master
branch deploys to a staging environment. When we merge master into our
production branch to update production, the Docker images start deploying
almost immediately, because most of the layers were already built for the
master branch and the two branches are identical at that point. So a typical
warm production push jumps straight to pushing out artifacts and is done in 2
minutes.

~~~
deboflo
15 minutes to pull an image is crazy. Run a `docker pull {image}` followed by
a `docker build --cache-from {image} ...` to speed up your pulls by 10X.
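
That is, something along these lines in the CI script (the image name is a
placeholder):

    # seed the local layer cache, then reuse it during the build
    docker pull registry.example.com/app:latest || true
    docker build --cache-from registry.example.com/app:latest \
        -t registry.example.com/app:latest .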

~~~
jillesvangurp
Not pull an image, build an image from scratch.

~~~
deboflo
Ah, that makes more sense then.

------
deboflo
JavaScript bundles are often a bottleneck in web builds. I wish there were
better ways to speed this up.

~~~
swsieber
Yarn Plug'n'Play (PnP) is a mechanism developed to speed up `yarn install`
(up to 70% faster).

IIUC, Angular is considering (working on?) using Bazel under the hood to
parallelize Angular builds.
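
For what it's worth, on Yarn 1.12+ PnP can reportedly be switched on per
project in `package.json` (from memory; check the Yarn docs for the current
mechanism):

    {
      "installConfig": {
        "pnp": true
      }
    }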

~~~
deboflo
Thanks, I'll look into those.

