Hacker News
A security vulnerability in Git that can lead to arbitrary code execution (microsoft.com)
402 points by martinwoodward on May 29, 2018 | 83 comments

A few important points that aren't mentioned in the post:

- you have to tell git to use submodules for this to trigger (so `clone --recurse-submodules` or a manual `git submodule update --init`)

- credit for discovery goes to Etienne Stalmans, who reported it to GitHub's bug bounty program

- most major hosters should prevent malicious repositories from being pushed up. This is actually where most of the work went. The fix itself was pretty trivial, but detection during push required a lot of refactoring. It also involved many projects: I wrote the patches for Git itself, but others worked on libgit2, JGit, and VSTS.

Another thing not mentioned in the post, although admittedly more obscure, is that a 2.17.1 client will still happily ferry the evil objects along in its default configuration. I.e. in this sort of setup:

    unpatched hosting site ->
    in-house (patched) v2.17.1 --bare mirror ->
    unpatched client

The transfer.fsckObjects setting needs to be explicitly turned on for the in-house mirror so that it doesn't collude in passing the bad objects along from the unpatched hosting site.

The protection in v2.17.1 only gets enabled by default if you're checking out a repository yourself, not if you're merely fetching and re-serving git objects[1].

Turning on receive.fsckObjects as the official v2.17.1 release notes suggest is not sufficient to protect against this attack. It needs to be transfer.fsckObjects, which also turns on fetch.fsckObjects, which is what's needed here.
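
As a minimal sketch of the setting described above, enabling the check on a bare mirror might look like the following. The scratch path is made up for illustration so that nothing outside a temp directory is touched. As far as I know from the Git config docs, `fetch.fsckObjects` and `receive.fsckObjects` fall back to `transfer.fsckObjects` when they are unset.

```shell
# Scratch bare repository standing in for the in-house mirror (hypothetical path).
mirror="$(mktemp -d)/mirror.git"
git init -q --bare "$mirror"

# Check objects on both fetch and receive for this repository.
git -C "$mirror" config transfer.fsckObjects true

# Show the effective value.
git -C "$mirror" config --get transfer.fsckObjects
```

On a real mirror you would run the `git config` line inside the existing bare repository, or use `--global` to cover every repository owned by that user.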

1. https://public-inbox.org/git/20180529211950.26896-1-avarab@g...

Thanks, peff, for the feedback. I pushed some changes to try to clarify that this does indeed require `clone --recursive`, and I added a note to credit Etienne Stalmans explicitly. That was an oversight in my haste.

No problem! Thanks for all your work on this.

I should have clarified above, too: there were folks from GitHub, Microsoft, and Google working on the various fixes.

The Git community is great because even though many of the interested parties compete with each other in some form or another, we always put that aside. And that's especially true for security issues.

The monthly security release for GitLab was today, and this release was coordinated with the Git security release. https://about.gitlab.com/2018/05/29/security-release-gitlab-...

In addition to our recently implemented monthly non-critical security release process (we already had a critical release process before), we are making a number of changes in how we secure GitLab.com, which includes expanding our HackerOne program this year to be a public bounty program. As always, we appreciate the contributions of security researchers.

The initial Git for Windows 2.17.1 releases published on GitHub earlier today [0] apparently didn't include the patch. So just a heads up if you updated right after this was published: You probably want to make sure you have the fixed version (2.17.1.windows.2) [1]

Not sure why the erroneous releases haven't been removed? Seems a bit confusing.

[0] https://github.com/git-for-windows/git/releases/tag/v2.17.1....

[1] https://github.com/git-for-windows/git/releases/tag/v2.17.1....

The link on https://git-scm.com/ still points to the old 2.17.0.

Chocolatey has the original 2.17.1, the 2.17.1-2 update is not yet approved.

Cygwin still only has 2.17.0-1. I wonder whether they were even part of the embargo.

For some reason git-scm.com is almost always slow at updating links to new Git for Windows releases.

I usually just track the repository's atom feed [0] and download urgent updates directly from there (git-scm.com eventually links to the releases published on GitHub anyway).

[0] https://github.com/git-for-windows/git/releases.atom

Edit: The Git Chocolatey package has now been approved https://chocolatey.org/packages/git/

Perhaps it would be best if sensitive options such as the post-checkout hook could only be stored outside of the repository altogether. Given this vulnerability and the semi-recent .GiT/config vulnerability[0], I would not be surprised if other attack vectors are lurking under the surface.

Storing config data outside the repo would not be a foolproof solution, but it would probably make things a little safer. (Having the <repo_root>/.git folder has always felt a little bit "in-band" to me, and I don't like it.)

[0]: https://news.ycombinator.com/item?id=8769667

The correct CVE numbers are CVE-2018-11233 and CVE-2018-11235. The Microsoft blog mentions 11234, but that's not the Git vulnerability.

Thanks for catching that, I've updated the blog.

I know this isn't Reddit, but I can't help but point out that 11234 = average(11233, 11235)

That's most likely the result of how CVE numbers are allocated: in sequence.

Another reason to train yourself to always think before you execute code you found on a site.

Too many of us are so used to git clone'ing a repo and building the software with make or its descendants that we overlook the security considerations.

> Too many of us are so used to git clone'ing a repo and building the software with make or its descendants that we overlook the security considerations.

The issue here is that the "git clone ..." allows for arbitrary code execution so the flow of 1) clone 2) analyze 3) make breaks at step 1.

Looks like it's back to tarballs for me.

From what I understood, it's only "git clone --recursive"/"git clone --recurse-submodules" or "git submodule init ..." that's vulnerable, not bare "git clone". So the flow of (1) clone (2) look at .gitmodules (3) git submodule init (4) analyze would still work.
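
Step (2) of that flow can be partly scripted. A rough sketch, assuming (as I read the advisory for CVE-2018-11235) that the malicious part is a submodule name in `.gitmodules` containing `..` path components. The helper names and the demo file are invented here, and this is a heuristic, not a substitute for upgrading:

```shell
# List the submodule names declared in a .gitmodules file, one per line.
submodule_names() {
    sed -n 's/^[[:space:]]*\[submodule "\(.*\)"][[:space:]]*$/\1/p' "$1"
}

# Print names that contain a ".." path component; succeeds if any were found.
suspicious_submodule_names() {
    submodule_names "$1" | grep -E '(^|/)\.\.(/|$)'
}

# Demo input: one ordinary submodule and one with a traversal-style name.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
[submodule "docs"]
    path = docs
    url = https://example.invalid/docs.git
[submodule "../../modules/evil"]
    path = evil
    url = https://example.invalid/evil.git
EOF

suspicious_submodule_names "$demo"   # prints: ../../modules/evil
```

In a real checkout you would pass the path of the cloned repository's `.gitmodules` and only run `git submodule init` if nothing is reported.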

Even with tarballs all it takes is one fuck-up in a script and your home directory is gone. In that sense nothing is safe. You could of course audit all the code that the makefile or build script executes, but I really don't know anybody who would do that for anything beyond the trivial.

We've been totally conditioned to just wget some archive, unpack it and build it, and even if git clone takes it one step further in day-to-day practice there is no difference between the two.

I think the parent's concern is that there should at least be SOME way to audit everything. That doesn't necessarily mean that everyone ought to be doing it every time they clone a repo.

Most web interfaces to git will allow you to review the repo. If you blindly clone some repo that someone asks you to clone recursively, I would see that mostly as a social engineering hack with a technical hook rather than a purely technical hack.

And depending on the sophistication you might want to do that in a chrooted environment or a VM.

There's no way to know if what you see on the web will be what you clone. The inspection must occur locally.

> The issue here is that the "git clone ..." allows for arbitrary code execution

It does not, actually. It's the submodule steps that create the vulnerability, and those have to be done manually. There are no standard or automated ways of pulling submodules, every project that uses them has its own scheme and provides its own build instructions. Frankly they're pretty obscure and in most communities replaced by tricks like npm's dependency management instead. It's a goof, and fixed, but even for the most naive users the exposure is fairly low.

It's true that it happens early in the process, but it's not true that a simple git clone command is a vector.

> Looks like it's back to tarballs for me.


This is going to be interesting for the Go guys, since a lot of Go package management is built around git clones. It's probably fine for things hosted on GitLab, GitHub or Bitbucket, until you miss that one sneaky little dependency stabbing you in the back.

Why is it more interesting for the go packaging situation? As far as I'm aware, most packaging systems run scripts from inside the packages anyway. If you're running the packages' scripts, being vulnerable to this is a side-note. You have to trust the packagers anyway.

Have you ever looked at a makefile before calling make? A typical makefile might take days to audit.

I’m pretty sure that it would be trivial to hide malicious code in a tarball that you wouldn’t find (especially if you’re not expecting it).

Wouldn't that preclude the usage of npm almost entirely?

npm from a security standpoint feels a little like a house of cards.

Using npm has never been secure, here is a fun little article about how someone used npm to inject malware into thousands of sites. https://hackernoon.com/im-harvesting-credit-card-numbers-and...

That’s a hypothetical piece (read it to the end), not the account of something that actually happened.

Regardless, it still highlights the fact that everything is built on a rather fragile layer of trust.

I mean, there hasn't been anything as malicious as that, but...

The left-pad issue [0] highlights how one developer can unpublish their code and break every package that depended on that package. "Unpublish" can be substituted for anything from "unpublish" to "publish broken code" or "publish actively malicious code."

Further, issues like the is-odd/is-even package popularity spike show how developers can develop minimally beneficial encapsulation packages and then insert them into other packages as dependencies to pump up their numbers. [1] Well, what if someone's motivation wasn't to make their packages look important but instead to give themselves a way to inject code into all the locations (or maybe just one!) that run `npm update` or similar automatically on a daily basis?

No, neither of these events that have actually happened are particularly malicious. On a scale of "excessive use of service" to "full network worm ransomware" they're somewhere around "suspiciously sketchy." But, the same problem can be exploited to cause real damage. Yeah, it very much reveals how fragile the web of trust in NPM is.

I really do hate to keep posting these links, but people keep bringing NPM up where they're relevant!

[0]: https://www.theregister.co.uk/2016/03/23/npm_left_pad_chaos/

[1]: https://www.reddit.com/r/programming/comments/886zji/why_has...

> The left-pad issue [0] highlights how one developer can unpublish their code and result in breaking every package depended on that package.

How they could unpublish their code. left-pad's specific issue is no longer allowed by npm and thus could not arise today. That's not to say other issues could not crop up, of course.

NPM is still not secure. It's dangerous enough that many companies run their own NPM repos with analyzed packages.

In a similar vein, here is a fun issue I filed: https://github.com/npm/npm/issues/20072

npm might be one of the riskiest, but most other package managers aren't materially better.

Over the past few years there have been a few vulnerabilities in Git that result from an attacker injecting hooks into a repo. I wonder whether it'd be possible / worthwhile to disable hooks by default, and only enable them on a per-repo basis.

Of course, then the goal just becomes attacking that whitelist, and all the complexity that comes with that. Security is hard.

A good starting point would be requiring a specific flag when executing any git command that should be attached to a hook. Something like `git clone --allow-hooks` to enable post-checkout hooks, or `git commit --allow-hooks` to allow pre/post-commit hooks.

After reading the responses to my previous question, I'd like to know if there's a global way to turn off post-checkout hooks.

None of the use-cases I read are convincing enough to allow `git clone` to do anything but what its short man description says.

I'm not even thinking about security, just basic separation of concerns. If `git clone` leaves a script-hooked repo in an unusable state for building, I want to know up front so I can complain to the maintainer and get that problem fixed.
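
One candidate for a global switch is `core.hooksPath` (available since Git 2.9, as far as I know), pointed at an empty directory. Below is a sketch using a throwaway `HOME` and a scratch repository; all paths are made up for the demo. Note that a repository's own config can override the global value, so I would not assume this defends against a repository whose `.git` contents are attacker-controlled.

```shell
scratch="$(mktemp -d)"
export HOME="$scratch/home"            # keep the demo away from the real ~/.gitconfig
mkdir -p "$HOME" "$scratch/no-hooks"

# Point every repository of this user at an empty hooks directory.
git config --global core.hooksPath "$scratch/no-hooks"

# Scratch repository with a post-checkout hook that would leave a marker file.
git init -q "$scratch/repo"
cd "$scratch/repo"
mkdir -p .git/hooks
printf '#!/bin/sh\ntouch "%s/hook-ran"\n' "$scratch" > .git/hooks/post-checkout
chmod +x .git/hooks/post-checkout

# Create a commit, then switch branches, which is when post-checkout normally fires.
git -c user.name=demo -c user.email=demo@example.invalid commit -q --allow-empty -m init
git checkout -q -b other

if [ -e "$scratch/hook-ran" ]; then hook_status="ran"; else hook_status="skipped"; fi
echo "post-checkout hook: $hook_status"
```

To apply it for real, create an empty directory somewhere permanent and run only the `git config --global core.hooksPath` line against it.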

Hooks aren't part of the repo. The only way your copy of a repo will have hooks is if you put them there.

Not if you `git clone --recursive` an evil repo with anything < 2.17.1 (which is the point of the article and the reason for this discussion): then your copy will have hooks that you did not put in.

What's a (common) example use case for a post-checkout hook?

That's a really interesting question. I tried to think about that when I was writing this blog post and couldn't really think of anything. Perhaps setting some permissions on a file or running `chattr` / `setfacl` is the best idea I had, but I omitted any suggestions because I've never actually used a post-checkout hook in anger.

The post-checkout hook gets used for things like concatenating JS and CSS assets in a repository that's meant to house templates, i.e. a poor man's invocation of "make" whenever a user does a "git pull" without the user having to do anything extra.

It can also be used for other cases where you'd like to amend what git does by default when updating the tree. See this recent thread[1] where some users want to have mtime behavior on files that's different from git's defaults, and one way to do that is via a post-checkout hook.

1. https://public-inbox.org/git/20180413170129.15310-1-mgorny@g...

Checking out submodules. Put

    git submodule sync && git submodule update --init --recursive

into `.git/hooks/post-checkout` and never again wonder why your code doesn't work after switching branches (because you forgot to update submodules).

Also put it into `post-rewrite` so that the same works for `git rebase`.
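
For anyone who wants to try that, here is a small sketch of installing the one-liner as both hooks. The function name and the scratch directory are made up. On an unpatched Git this would also run the vulnerable submodule update automatically on every checkout, so it probably only makes sense after upgrading.

```shell
# Write the submodule one-liner into post-checkout and post-rewrite for a repository.
install_submodule_hooks() {
    hooks_dir="$1/.git/hooks"
    mkdir -p "$hooks_dir"
    for hook in post-checkout post-rewrite; do
        printf '%s\n' '#!/bin/sh' \
            'git submodule sync && git submodule update --init --recursive' \
            > "$hooks_dir/$hook"
        chmod +x "$hooks_dir/$hook"   # hooks must be executable to run
    done
}

# Demo against a scratch directory rather than a real repository.
repo="$(mktemp -d)"
install_submodule_hooks "$repo"
ls "$repo/.git/hooks"
```

Pass the top-level directory of a real working copy instead of the scratch directory to use it.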

In one repo I use the post-checkout script is "npm install" so I always get the package versions for the branch I just checked out. One less thing to worry about.

I'm stretching here, but maybe compiling assets with webpack and then HUPing your web server? So when you switch between branches, your local server is always ready to serve up the version you've got on disk?

Notifications. Not for people, but for machines: "checkout is complete on X, and you're subscribed to this event, now do something based on that." If you publish on any event, it's the client's responsibility to choose which it wishes to subscribe to.

I use a set of pre/post hooks to save uid & permission data for a repo that contains www & system data.

Pulling a dependency project when a dependent project is checked out maybe

I'm sure there are security experts out there who have managed to create tools that scan source code to find potential security vulnerabilities.

Although I'm not sure those tools could 'find' and build a vuln, there could be ways to analyze an algorithm and detect that it can do dangerous things it's not supposed to do. A little like how static analysis works.

I'm sure those tools are already built by the NSA at least, so they just have to peek into github repos, point out what code is vulnerable, give it to some developer to make an exploit. Done.

That way the NSA would clearly win the cyber arms race, and those pairs of eyes Torvalds gets quoted about would surely be obsolete.

And the announcement from the Git project is at https://marc.info/?l=git&m=152761328506724&w=2. See also CVE-2018-11233 and 11235.

So to an extent, this is bad: you can be compromised by a malicious git repo. However, given that most people already trust code cloned from git (or acquired from similarly untrusted sources like npm, rubygems, maven central etc.), it may not change the equation that much.

If you run code without trusting the author, you're likely going to have a bad time.

It's not that you can be compromised by running the code contained in a git repository, it's that there's a bug that allows somebody to create a git repository that will cause you - merely by (recursively) cloning the repository - to run arbitrary code that they've provided.

This is, of course, unexpected. And while you should perhaps raise an eyebrow if somebody you don't know asks you to recursively clone a repository that you're not interested in - this is indeed a problem and you should upgrade your Git client.
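
A quick way to check whether a client is at or past the 2.17.1 release mentioned in this thread. This is a simplified sketch: `version_ge` is a helper invented here, it relies on `sort -V` (GNU coreutils and recent BSDs), it ignores any fixes backported to older maintenance series, and it cannot tell a patched vendor build from an unpatched one carrying the same number (see the Git for Windows note elsewhere in the thread).

```shell
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reduce e.g. "git version 2.17.1.windows.2" to "2.17.1"; empty if git is missing.
installed="$(git --version 2>/dev/null | sed -E 's/^git version ([0-9]+(\.[0-9]+)*).*/\1/')"

if [ -z "$installed" ]; then
    echo "git not found"
elif version_ge "$installed" 2.17.1; then
    echo "git $installed: at or above 2.17.1"
else
    echo "git $installed: older than 2.17.1, consider upgrading"
fi
```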

Oh yeah, I'm not saying don't upgrade. What I'm saying is that you're already cloning a repo, presumably to run the code therein.

Given that running the code therein (if the owner of the repo is malicious) will hurt you, this doesn't do too much apart from have it happen earlier on :)

I guess one scenario where it could be a problem is if you were planning to clone untrusted code and read it all carefully before running it.

Suppose that you are, say, a security researcher, running purely static analysis tools on the code from every git repo that you can crawl, specifically looking for malware or common bugs without running the code (simple example: you are grep-ing for AWS keys and other secrets that should never be committed to a repo). This would be unexpected and very dangerous behavior in that setting.
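
The grep example could look roughly like this. It is a sketch: the function name and demo tree are made up, the pattern only covers the common `AKIA` prefix of AWS access key IDs, and the sample value is the placeholder key ID used in AWS documentation. Nothing in the tree is executed, only read.

```shell
# List files under a directory that contain something shaped like an AWS access key ID.
scan_for_aws_key_ids() {
    grep -rlE 'AKIA[0-9A-Z]{16}' "$1"
}

# Demo tree: one file with a placeholder key ID, one without.
tree="$(mktemp -d)"
printf 'aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n' > "$tree/settings.env"
printf 'nothing secret here\n' > "$tree/README"

scan_for_aws_key_ids "$tree"
```

When pointing this at a real checkout made with an unpatched client, the clone itself would need to be done without `--recurse-submodules` for the scan to stay read-only.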

Sure, in theory, "unexpected and dangerous behavior" is par for the course for security research and you isolate even data that you don't intend to execute if you suspect it is malicious. But, in practice, this is an easy mistake to make.

As another example, consider an automatic git mirror, or whatever the internal GitHub/GitLab/Bitbucket infra might do to move repos around, without intending to execute the code.

Thank you for posting this. I found myself mostly agreeing with the user you replied to, even though I myself am crawling and cloning a large number of git repos to do some static analysis work on them. That was a niche attack vector that I'm clearly exposed to, and somehow I still wasn't seeing it.

I’d guess that people eventually execute a repo only about 50% of the time they git pull it. I know that’s the figure for me personally - I pull a lot of projects down locally so I can open them in my IDE and study how they work - never to be executed.

You're not wrong for the individual laptop case. But there's a lot of systems out there that download code, inspect a text, YAML, Dockerfile, or whatever kind of file, then sandbox it to run it. That's more or less how most CI systems work, and this vulnerability happens before the sandbox.

> But there's a lot of systems out there that download code, inspect a text, YAML, Dockerfile, or whatever kind of file, then sandbox it to run it.

I've never seen this. It smacks of terrible practice that would not last long in a daily CI system. Once you have a good version, locking on to that version is one thing. Just randomly downloading new code versions for a daily build is impractical.

Sounds like it could be devastating for shared git hosting like github, bitbucket, gitlab, and for anyone just mirroring repos.

Part of this bugfix includes changes to "git fsck" to reject these problematic repositories. Git hosting sites like GitHub and Visual Studio Team Services are actively blocking these repositories today to prevent being used as vectors for propagating this vulnerability.

Unless I'm reading it wrong, the attacker still has to have a repo under their control, then get the victim to clone the repo.

I'm not saying it's trivial, but more that people execute code from GH and similar all the time without reading/evaluating, and this won't do any worse than that.

Let's say you sign up for a free account on a shared git-whatever.example.com and set it up to auto-mirror a malicious repo you host elsewhere. If git-whatever.example.com(1) had implemented its mirror feature on top of regular git, boom: you might get shell access on their server and full access to any and all private repos stored on the same service.

(1) could be any shared service like gitlab.com, github.com, bitbucket.org, etc

Oh yeah, cool :) I was thinking about the issue from the perspective of an ordinary user of git, not of services that consume git repos from other places.

I guess this will make some bug bounties for researchers who can find services that haven't patched quickly...

Seems like this actually happened on github: https://twitter.com/_staaldraad/status/1001542421161930752

> then get the victim to clone the repo


I don't think I've ever done that, even on the few places which have used submodules, but I guess some people do.

This is not what this is about. This is similar to figuring out that copying a file also executes it (in some cases).

While I try not to run code I don't trust, I have a much more liberal point of view when it comes to cloning it. I assume others do too.

It's interesting that you say you try not to run code you don't trust.

I'd be interested in hearing how you establish trust in the software you run. Assuming you're using git for cloning software code, do you include libraries that are dependencies of the code you're running in your trust calculations?

Hmm - the "official" git PPA isn't updated yet: https://launchpad.net/~git-core/+archive/ubuntu/ppa. Wonder if this is due to dependencies or oversight?

PS: anyone running Debian stretch can get this via stretch-backports:

  echo "deb http://ftp.debian.org/debian stretch-backports main" | tee /etc/apt/sources.list.d/stretch-backports.list 
  apt-get update
  apt-get -t stretch-backports install git

It's there - thanks ;)

Am I the only one who didn't even know git submodules and recursive clones were a thing?

What the link doesn't mention, that I can see, is what the first version affected was. Does this bug go back to the introduction of submodules in git?

Page is down -- anyone have a mirror?

Surely Microsoft should have enough resources to survive the usual HN-hug-of-death?

Maybe HN, Reddit and /. struck at the same time?

It's probably not worth upgrading anyways.

What I really love about this is that they give details of the vulnerability. They never do this in any Microsoft security advisories.

Guess when it's not your direct product this is OK.
