Hacker News
Highlights from Git 2.31 (github.blog)
54 points by chmaynard on March 15, 2021 | 30 comments

Whoa, `git maintenance` looks amazing.

It runs scheduled maintenance in the background once you enable it with `git maintenance start`.

> This cross-platform feature allows Git to keep your repository healthy while not blocking any of your interactions. In particular, this will improve your git fetch times by pre-fetching the latest objects from your remotes once an hour.

For anyone who was worried like me that this would screw up your refs:

    Instead, the remote refs are stored in refs/prefetch/<remote>/. Also, tags are not updated.

    This is done to avoid disrupting the remote-tracking branches. The end users expect these refs to stay unmoved unless they initiate a fetch. With prefetch task, however, the objects necessary to complete a later real fetch would already be obtained, so the real fetch would go faster. In the ideal case, it will just become an update to a bunch of remote-tracking branches without any object transfer.

So it looks like they thought of it, and it looks like there's no reason not to do this?
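
For the curious, here is a rough sketch of doing one prefetch by hand. The refspec follows the `refs/prefetch/<remote>/` layout quoted above. The function names `prefetch_refspec` and `prefetch_remote` are made up for illustration, and the exact flags Git's own prefetch task passes may differ.

```shell
# Build a refspec that sends a remote's branches into refs/prefetch/<remote>/
# instead of refs/remotes/<remote>/ (layout taken from the quoted docs).
prefetch_refspec() {
    remote=$1
    printf '+refs/heads/*:refs/prefetch/%s/*\n' "$remote"
}

# Download objects for one remote without moving remote-tracking branches or tags.
# The empty --refmap= tells git to ignore the configured fetch refspecs
# and rely only on the refspec given on the command line.
prefetch_remote() {
    remote=$1
    git fetch --prune --no-tags --refmap= "$remote" "$(prefetch_refspec "$remote")"
}
```

Calling `prefetch_remote origin` and then `git for-each-ref refs/prefetch/` should list the prefetched refs, while `refs/remotes/origin/` stays where your last real fetch left it.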

Sounds a lot like Microsoft's Scalar. [1]

The commits referenced in the blog post look to be from the Scalar team lead as well. Looks like some of their work is getting upstreamed which is exciting. [2]

Although, I wonder what this means for the Scalar project itself.

[1] https://devblogs.microsoft.com/devops/introducing-scalar/

[2] e.g., https://github.com/git/git/compare/26bb5437f6defed72996b6a2b...

You're right that we used what we learned from Scalar's background maintenance and applied that to Git itself.

Putting background maintenance into Git was actually part of our effort to get Scalar on Linux.

You might be interested in our "Philosophy of Scalar" [1] document, which includes this paragraph:

> Scalar intends to do very little more than the standard Git client. We actively implement new features into Git instead of Scalar, then update Scalar only to configure those new settings. In particular, we are porting features like background maintenance to Git to make Scalar simpler and make Git more powerful.

[1] https://github.com/microsoft/scalar/blob/main/docs/philosoph...

How large does a repository need to get to run into this issue?

> cross-platform

Implemented via crontab on Linux, launchctl on macOS, and schtasks on Windows.
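
To make that mapping concrete, here is a tiny sketch that picks the scheduler name from `uname -s`. It is illustrative only: `scheduler_for` is a hypothetical helper, and this is not how Git itself detects the platform.

```shell
# Map a `uname -s` value to the scheduler named above.
scheduler_for() {
    case "$1" in
        Linux)                 echo crontab ;;
        Darwin)                echo launchctl ;;
        MINGW*|MSYS*|CYGWIN*)  echo schtasks ;;
        *)                     echo unknown ;;
    esac
}

# Print the scheduler for the machine this runs on.
scheduler_for "$(uname -s)"
```

On Linux, running `crontab -l` after `git maintenance start` should show the entries Git added.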


At the risk of sounding curmudgeonly, I can't help but feel like `git maintenance` is a bit of unneeded feature creep. Does it sound useful? Sure. But should it be part of Git itself, when its functionality could easily be achieved by crontabbing a relatively simple shell script that runs existing git commands? I feel like the answer is "No." Of course, not everyone would go out of their way to write the aforementioned shell script; they even address this directly in the release:

> You could manage these data-structures yourself, but you might not want to invest the time figuring out when and how to do that.

On the other hand, not everyone will use `git maintenance`, so by the same token you could argue it's unnecessary bloat.

As noted in the documentation[1], doing this in the background on a repository that might be simultaneously in use is a bit non-trivial. You can do it using commands that already exist in Git, but doing so safely requires knowing a fair amount of details about Git's internals. By the same argument, you could say that it's overkill to have "git commit" when we could instead just construct commit objects using the lower-level plumbing commands.

This seems like a relatively small feature (about 2K lines of code by my count, a great deal of which is tests and docs) that results in a big quality-of-life improvement for anyone who works on large repositories.

[1]: https://git-scm.com/docs/git-maintenance

I'd argue that if we eliminated all features that not everyone will use, we'd have a bare-bones tool that's barely useful, except to those who like it so much that they'll implement all the features they need via scripts.

Since it's not something that turns itself on, eating your RAM behind your back, or slowing the computer silently, I'd say let the feature creep in as long as the maintainers are happy to maintain it.

`git maintenance` does set a crontab. Per the doc, it sets up the following:

  0 1-23 * * * "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=hourly
  0 0 * * 1-6 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=daily
  0 0 * * 0 "/<path>/git" --exec-path="/<path>" for-each-repo --config=maintenance.repo maintenance run --schedule=weekly

`git for-each-repo` was a useful addition in 2.30, by the way.
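
Roughly what those crontab lines do, as a sketch: `git maintenance register` (which `git maintenance start` also performs) adds the repository path to the multi-valued `maintenance.repo` config key, and `for-each-repo` runs the given command in each listed path. The shell loop below approximates that. `for_each_repo` and the `GIT` override variable are hypothetical names, not part of Git.

```shell
# Approximate `git for-each-repo --config=<key> <args>` with a plain loop.
# GIT can be overridden to point at a different git binary (an assumption of this sketch).
GIT=${GIT:-git}

for_each_repo() {
    key=$1
    shift
    # One repository path per line, one line per value of the multi-valued key.
    "$GIT" config --get-all "$key" | while IFS= read -r repo; do
        # </dev/null keeps the command from swallowing the remaining paths.
        "$GIT" -C "$repo" "$@" </dev/null
    done
}
```

For example, `for_each_repo maintenance.repo maintenance run --schedule=hourly` would run the hourly tasks in every registered repository.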

for-each-repo looks great, I'm considering moving my karn[0] configuration to this instead.

[0]: https://github.com/prydonius/karn

The `git maintenance` documentation links seem to go to the wrong place - nothing on that page mentions "maintenance". Maybe this[1] is what they meant?

Time for a plug for Folder Git[2], which runs a Git command in a bunch of folders, as in

  fgit maintenance run -- ~/dev/*

[1] https://git-scm.com/docs/git-maintenance

[2] https://gitlab.com/victor-engmark/fgit/

As another comment already mentioned, we now have git-for-each-repo built-in.

That requires configuring a static list of repos. I might see if that can be changed, but I'm not exactly a C expert.

It’s both cool and confusing that GitHub is posting Git release notes.

It is completely to GitHub’s benefit to blur the lines between Git and GitHub. Many newer devs don’t even realize there’s a difference.

This feels overly cynical. GitHub is a major Git contributor, as is their parent company Microsoft, both supporting use cases that either directly or indirectly benefit their users. And in GitHub’s case, their product is tightly coupled to Git. Why shouldn’t they want to highlight changes that may be useful to the readers of their engineering blog?

That was me years ago.

I feel like Git let its brand get used by others and now that's that. Kind of illustrates why companies defend their trademarks so vigorously.

The full story of Git* trademarks is here: [0]

TL;DR: GitHub applied for a US trademark 5 years before the Git project did, so the initial application from the Git project was denied as confusingly similar to GitHub. Rather than litigate who got to use what, the parties came to an agreement to continue using the marks in parallel as they had been doing, which allowed the Git project to be granted a US trademark. The Git project also grandfathered in others who had been operating in good faith like GitLab.

[0] https://public-inbox.org/git/20170202022655.2jwvudhvo4hmueaw...

They posted a blog post about it. The first sentence:

"The open source Git project just released Git 2.31 with features and bug fixes from 85 contributors, 23 of them new"

with a link to the lore.kernel.org mailing list archive.

Are there any plans to add support for large files to Git? (Not as an extension)

This was the subject of a presentation at the 2019 Git Merge conference:


Git has always supported large files! What it does not do is treat them specially, with different semantics, which is what git-lfs, git-annex, etc, do.

Of course, but large files don't work well in Git without one of the external plugins you mentioned. And external plugins have other problems, e.g. I can't use them without setting up another server (can't use them in git-shell).

Right now I just use "--depth 1" often to prevent the download of the entire history. But it is not ideal.

Maybe not exactly what you're looking for, but I'm building a tool that's meant to be a companion to Git for large files[1]. The core concept is to track large files/directories in a separate content-addressed store and have Git track references to said files. To your "I can't use them without setting up another server" comment, I'm making use of rclone[2] to replicate the file store to any reputable storage service/platform. If you already use S3, for instance, just set up a new bucket.

I'm happy to answer questions and take any feedback.

[1]: https://github.com/kevin-hanselman/dud

[2]: https://rclone.org/

Only GitHub (and maybe other hosts) has a size limit for files.

git-osx-installer is still at 2.27 :( It seems unusual that such a popular project doesn't have up-to-date binaries for a popular OS.

brew install git installs 2.31.0

The Tim Harper version on SourceForge is indeed 2.27.0 and was last built for Mavericks. A general poke about activity suggests that he's much less active in various online communities now than he has been in the past - he might have gotten bored.

I don't see an update for Xcode command line tools yet - it was only released today. That said, I'm not overly optimistic as the version bundled in Xcode appears to be 2.24.3 (which is from April of 2020 https://github.com/git/git/releases/tag/v2.24.3 ).

MacPorts git is 2.30.2 (as of this typing... which is only 7 days old).

If you're on a Mac, use one of the package managers... or see if the git organization needs a new build maintainer for OSX.

Thanks, but our JAMF configuration does not allow Homebrew or MacPorts.

I think macOS users either use brew or use the version that comes with Xcode.
