
I wish they explained how to merge existing repos into one new (mono)repo while keeping git history. I still haven't cracked that problem.



Here's a way you can do this with git. This trick relies on `git merge --allow-unrelated-histories`.

Assume you have repos `foo` and `bar` and want to move them into a new repo `mono`.

    $ ls
    foo
    bar
    
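    # Prepare for import: we want to move all files into a new subdir `foo` so
    # we don't get conflicts later. This uses Zsh's extended globs. See
    # https://stackoverflow.com/questions/670460/move-all-files-except-one for
    # bash syntax. Note that the glob skips dotfiles (so `.git` stays put, but
    # so does e.g. `.gitignore`); move those by hand if you need them.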
    $ cd foo
    $ setopt extended_glob
    $ mkdir foo
    $ mv ^foo foo
    $ git add .
    $ git commit -m "Prepare foo for import"
    
    # Follow those "move to subdir" steps for `bar` as well.
    
    # Now make the final monorepo
    $ cd ..
    $ mkdir mono
    $ cd mono
    $ git init
    $ touch README.md
    $ git add README.md
    $ git commit -m "Initial commit in mono"
    
    $ git remote add foo ../foo
    $ git fetch foo
    $ git remote add bar ../bar
    $ git fetch bar
    
    # Replace `main` with `master` or whatever branch you want to import.
    $ git merge --allow-unrelated-histories -m "Import foo" foo/main
    $ git merge --allow-unrelated-histories -m "Import bar" bar/main

    # Inspect the final history:
    $ git log --oneline --graph
    *   8aa67e5 (HEAD -> main) Import bar
    |\
    | * eec0abd (bar/main) Prepare bar for import
    | * 9741d6d More stuff in bar
    | * 634ba3d Initial commit bar
    *   43be6e9 Import foo
    |\
    | * d4805a0 (foo/main) Prepare foo for import
    | * 4d2ca10 More stuff in foo
    | * 72072a1 Initial commit foo
    * bfcb339 Initial commit in mono
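
If you are on bash rather than Zsh, one way to do the "move to subdir" step without extended globs is to let git list the top-level entries and `git mv` each one. This is only a sketch: it builds a throwaway repo under a temp directory, and the file names in it are made up for the demo. Unlike the glob, it also moves tracked dotfiles such as `.gitignore`.

```shell
#!/usr/bin/env bash
set -eu

# Throwaway demo repo standing in for `foo` (all file names are illustrative).
work=$(mktemp -d)
git init -q "$work/foo"
cd "$work/foo"
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello" > a.txt
echo "*.log" > .gitignore
mkdir src
echo "code" > src/main.c
git add -A
git commit -q -m "Initial commit foo"

# Move every tracked top-level entry (dotfiles included) into ./foo.
# `git ls-tree --name-only HEAD` lists only the top-level entries of the commit;
# -z / -0 keep names with spaces or newlines intact.
mkdir foo
git ls-tree --name-only -z HEAD | xargs -0 -I{} git mv -- {} foo/
git commit -q -m "Prepare foo for import"

# Every tracked path should now start with foo/.
moved_files=$(git ls-tree -r --name-only HEAD)
echo "$moved_files"
```

This assumes the repo does not already track a top-level entry with the same name as the new subdirectory; if it does, move into a temporary name first and rename afterwards.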


For the "move to subdir" step, I recommend git-filter-repo, which should be preferred over the older git-filter-branch that many existing code snippets still use.

Use git-filter-repo's --to-subdirectory-filter and --tag-rename:

https://github.com/newren/git-filter-repo#solving-this-with-...
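
Roughly, that could look like the sketch below. The function name, the `foo-` tag prefix, and the paths are my own placeholders; the two flags are the ones named above. git-filter-repo rewrites history and expects a fresh clone, so this works on a copy rather than on the original repo, and it has to be installed separately from git.

```shell
#!/usr/bin/env bash
set -eu

# Rewrite the history of a fresh clone so that every file lives under <name>/
# and every tag gets a "<name>-" prefix. The function name and the prefix
# scheme are illustrative, not part of git-filter-repo itself.
move_history_to_subdir() {
  local source_repo=$1
  local name=$2
  local dest=$3
  # --no-local makes the clone count as "fresh" for filter-repo's safety check.
  git clone -q --no-local "$source_repo" "$dest"
  git -C "$dest" filter-repo \
    --to-subdirectory-filter "$name" \
    --tag-rename "":"$name-"
}

# Example (paths are placeholders):
#   move_history_to_subdir ../foo foo /tmp/foo-rewritten
```

After this, the rewritten clone can be added as a remote and merged with `--allow-unrelated-histories` as in the parent comment, and every old commit already shows its files under the subdirectory.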


Do you think this will speed things up? I tried the above suggestion and it has already been running for four hours to merge two repos into one (3 years' worth of git history).


I am not sure; I don't know much about performance of git operations. Which step is slow? Have you figured out why? I am curious.


Thanks, I will give this a shot


There are several ways to do this. Having extensively experimented with all of them I can say that the best are josh[0] (if you need external history continuity) and git subtree[1] (if you just need the commits to remain valid within your repository).

[0]: https://github.com/josh-project/josh

[1]: https://manpages.debian.org/testing/git-man/git-subtree.1.en...


Thank you, Josh looks interesting. I will need to look into this. At first read it looks like the end result is not a brand new Git repo that combines/merges a bunch of repos. I am not sure if a proxy is going to work well with GitLab CI.


If you can define exactly what you mean by "keeping history" (i.e. which operations do you want to support, and in what context?) I might be able to tell you how to do it :)



I'm curious about that as well. Maybe it'd be possible to start a repo with a single empty commit, rebase everything on that in a separate branch for each of the git repos, and then merge them all into the master branch? Although some file renaming may be in order, otherwise everything ends up in the same folder.


I've used this approach before (https://mattsch.com/2015/06/19/move-directory-from-one-repos...) with good results.


It is possible and I did it.

In the target repo, you create a folder, and in that folder you rebase your dependency repository.

Maybe I can find better documentation; I remember writing it down somewhere.


You can use git pull with the --allow-unrelated-histories option.
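
For example, something like the following, which is only a sketch with throwaway repos under a temp directory (the `make_repo` helper and file names are made up for the demo). Note that pull fetches and merges in one step but does not move anything into a subdirectory, so files from both repos land side by side at the top level.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)
cd "$work"

# Demo-only helper: create repo $1 with one commit on branch `main`
# containing a single file $2.
make_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/main
  git -C "$1" config user.email "demo@example.com"
  git -C "$1" config user.name "Demo"
  echo "$1" > "$1/$2"
  git -C "$1" add -A
  git -C "$1" commit -q -m "Initial commit $1"
}

make_repo foo foo.txt
make_repo mono README.md

# Fetch foo's main and merge it in one step, even though the two
# histories share no common ancestor.
cd mono
git pull -q --no-rebase --no-edit --allow-unrelated-histories ../foo main

# A merge commit prints as "<commit> <parent1> <parent2>", i.e. three words.
parents=$(git rev-list --parents -n 1 HEAD | wc -w)
git log --oneline --graph
```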


git subtree


I will have to look into this. I always understood that this won't generate a new repo but somehow combine the other repos. The idea is to merge the existing repos into a monorepo and then archive the old repos. I don't think that's possible when using subtrees.


Subtree merges a whole repo into the subdirectory of another repo. You can git blame yourself back to the original repo. Unlike submodules, there's nothing in the file tree which signifies there is something special about this directory (it searches commit messages to get that metadata). From the monorepo POV, archiving is just never doing another pull. Using submodules is a nightmare.
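
A minimal sketch of that, using throwaway repos under a temp directory (the `make_repo` helper and file names are made up for the demo). git subtree ships in git's contrib area, so on some systems it needs a separate package.

```shell
#!/usr/bin/env bash
set -eu

work=$(mktemp -d)
cd "$work"

# Demo-only helper: create repo $1 with one commit on branch `main`
# containing a single file $2.
make_repo() {
  git init -q "$1"
  git -C "$1" symbolic-ref HEAD refs/heads/main
  git -C "$1" config user.email "demo@example.com"
  git -C "$1" config user.name "Demo"
  echo "$1" > "$1/$2"
  git -C "$1" add -A
  git -C "$1" commit -q -m "Initial commit $1"
}

make_repo foo lib.txt
make_repo mono README.md

# Import all of foo's history under the foo/ directory of mono.
# Without --squash, the original commits become part of mono's history.
cd mono
git subtree add --prefix=foo ../foo main

git log --oneline --graph
```

After this, the old `foo` repo can be archived; mono keeps working without it unless you later want to run `git subtree pull` from it again.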



