
Using Monorepo? Do not rebuild unchanged components in CI - taleodor
https://medium.com/@taleodor/using-monorepo-do-not-rebuild-unchanged-components-in-ci-c386e7c03426
======
verdverm
Or use git diff to calc the changeset and match filename globs to builds.

No need for another SaaS expense to solve this problem

~~~
taleodor
Thanks for your comment!

I would discourage the use of git diffs, because in many CI setups builds
happen per git push rather than per commit.

This means that you need to maintain a separate mapping from each component
to the git commit where it last changed. So you need an extra commit from CI
to maintain that, and more scripting to parse the map.

If you don't, and say you push or merge 2 commits at once, you would lose any
component that is changed only in the 1st commit.

Finally, the bash command in the article is universal and not dependent on
Git.

Regarding what you mention about SaaS expense - this is a secondary use-case
for Reliza Hub, and I also suggested a plain-Git, no-SaaS way in the article
with an extra commit from CI.

However, I know some orgs that do not allow using extra commits from CI -
therefore Reliza Hub would be a nice workaround for those constraints.

~~~
verdverm
You don't need extra commits to enable what I'm talking about, just some extra
files. The historical commit hash you want to calculate from, and a file with
globs to match against.

We've implemented this in Jenkins. You can push multiple commits at once,
trigger several / multiple sub tasks / jobs, conditionally running tasks
depending on what has changed in the monorepo.

There is no mapping of git commit to components, the mapping is from file
globs (dependencies) to jobs (what should run if a file changes). Nor is there
any need for the storing of hashes of subdirectories on some file.
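
A minimal sketch of such a glob-to-job mapping, for illustration only. The job
names, the glob patterns, and the `jobs_to_run` helper are hypothetical and not
taken from the Jenkins setup described above.

```python
from fnmatch import fnmatch

# Hypothetical mapping from file globs (dependencies) to CI job names.
JOB_GLOBS = {
    "build-api": ["services/api/*", "libs/common/*"],
    "build-web": ["services/web/*", "libs/common/*"],
    "build-docs": ["docs/*"],
}


def jobs_to_run(changed_files, job_globs=JOB_GLOBS):
    """Return the sorted names of jobs whose globs match any changed file.

    Note: fnmatch's "*" also matches "/" characters, so "docs/*" covers
    nested paths such as "docs/guide/intro.md".
    """
    selected = set()
    for job, patterns in job_globs.items():
        if any(
            fnmatch(path, pattern)
            for path in changed_files
            for pattern in patterns
        ):
            selected.add(job)
    return sorted(selected)
```

With this mapping, a change under `libs/common/` selects both `build-api` and
`build-web`, while a change to a file that matches no glob selects nothing.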

Pushing two commits does not cause the loss of code.

TL;DR, your first two TL;DR points are exactly what git is for, so it seems
you are reinventing functionality that exists in the tool already

~~~
taleodor
So the case I'm talking about: I'm pushing 2 commits at once - A, B (B being
the latest). The last commit before that was X.

Commit A changes component 1 relative to X. Commit B doesn't change this
component. In your case, how can you know that you need to compare Commit B
with Commit X and not Commit A - and actually build component 1? (you can
extrapolate to 3 or more commits being pushed at once). This is the problem
I've seen in the git diff implementations that I encountered. They either only
compare current commit with previous commit - which can be wrong - or use
extra tools to keep track of commit mapping (as I suggest).
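
The scenario can be illustrated with a deliberately simplified model in which
each pushed commit is just the set of paths it touched. The file paths and
function names are made up for this sketch, and real git diffs compare trees
rather than unions of touched paths, so treat it as an illustration only.

```python
# Simplified model of one push: commits A then B, pushed on top of X.
# Each commit is represented only by the set of paths it touched.
PUSHED = [
    ("A", {"component1/main.py"}),
    ("B", {"component2/app.py"}),
]


def components(paths):
    """Treat the top-level directory of each path as the component name."""
    return {path.split("/", 1)[0] for path in paths}


def changed_vs_previous_commit(pushed):
    """Compare only the newest commit with its parent (B against A)."""
    _, newest_paths = pushed[-1]
    return components(newest_paths)


def changed_vs_last_build(pushed):
    """Compare the newest commit with the last built commit (B against X)."""
    touched = set()
    for _, paths in pushed:
        touched |= paths
    return components(touched)
```

In this model, comparing B only with A reports `component2` alone, while
comparing B with X reports both `component1` and `component2`.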

One way around that I can understand is to force a build for every commit -
so now for my push there would be a separate build for A and a separate build
for B. But to me it's a waste, as I push A and B together for a reason (so why
do I suddenly need 2 builds now?).

If you solved this problem and could share your solution in detail, it would
be useful for me - I would certainly include a reference in my article.

Also, as mentioned, I tried to build a solution that is agnostic of various
tools (so, an approach that would work for any VCS or CI software).

~~~
verdverm
Read the git diff docs, it can span multiple commits. Builds are tied to
pushes, not commits. You will see the changes for both A and B. When you
create a new branch, you add the latest hash in git history to a file. When
you build, you compare the latest commit against X from the file.
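
A rough sketch of that comparison, assuming the base commit X is stored in a
tracked file. The file name `.ci-base-commit` and all helper names here are
hypothetical; the only git invocation used is `git diff --name-only <base>
<head>`, which lists the files that differ between two commits.

```python
import subprocess
from pathlib import Path


def read_base_commit(path=".ci-base-commit"):
    """Read commit X from a tracked file (the file name is an assumption)."""
    return Path(path).read_text().strip()


def diff_command(base, head="HEAD"):
    """Build the git command listing files that differ between base and head."""
    return ["git", "diff", "--name-only", base, head]


def parse_changed_files(output):
    """Turn the command's output into a list of non-empty path strings."""
    return [line for line in output.splitlines() if line.strip()]


def changed_files(base, head="HEAD"):
    """Run git in the current repository and return the changed paths."""
    result = subprocess.run(
        diff_command(base, head),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_changed_files(result.stdout)
```

Inside a checkout, `changed_files(read_base_commit())` would cover every commit
pushed since X, so both A and B from the earlier example are included.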

Note, this is around branches, not commits. Generally people tie features to
branches and build branches. You want to build all changes from the new work
that is off of the main branch. Don't think about it at the commit level,
that's just how the bookkeeping is done.

The hard / manual part is maintaining commit X. You end up with merge
conflicts when two branches are in PR and one gets merged. Order doesn't
matter.

I'll try to post something this weekend, I'll link it here and probably create
a small repo

~~~
taleodor
Thanks, looking forward. Yeah, totally get about git diff - I'm talking
exactly about how you know that it is commit X that you need to compare with
and not commit A.

Also my preferred workflow is different from yours. I try to use trunk based
development with commits to master directly, branches generally allowed either
on dev machines only or long-term release branches which are never merged back
to master (on some high-compliance projects).

I try to reduce the number of branches as much as I can - there is some pain
there in committing to master and ensuring a green build, but then you don't
have that pain when merging (following the advice of the Continuous Delivery
book by Jez Humble and David Farley).

~~~
verdverm
Trunk doesn't really work for teams because you need multiple features to be
worked on in parallel.

Commit X is stored in a well known file and committed to git

~~~
taleodor
Google does trunk-based development for thousands of people -
https://trunkbaseddevelopment.com/

(Although if you mean short-lived 1-day feature branches always merged back to
trunk - we're on the same page here).

Worked fine for my teams as well, although I usually use multi-repo - part of
what Reliza Hub does is give a holistic view of a multi-repo structure.

~~~
verdverm
Just because Google does something a particular way does not mean it's right
for everyone else. Most of us will never experience the problems of their
scale.

Do you check out subtrees from the monorepo like Google does?

