Please stop serving .git to the outside world (pythonsweetness.tumblr.com)
247 points by _wmd on June 10, 2013 | 85 comments




Apparently fixed as it's now a redirect loop.


What! This is a huge security hole!


To be clear (it wasn't mentioned explicitly in the blog post, but it's obviously what the author was referring to), this is about people who deploy web sites whose static content is managed via git.


And it only applies when the document root is also the git repo root. If the document root is under a subdirectory, the .git directory won't be served.


I'm curious: what security risk could static content pose by serving .git?


If you don't expect your source code to be made public and don't take care to keep secrets out of it, then you will be surprised when attackers have (for example) your cookie signing key.

I know checking secrets into source code is already a bad practice, but accidental publication takes bad practice and makes it a security hole.


Parent post was asking about static content, not dynamic. No code to keep secret.


Off the top of my head, if you use an email (to commit) that you don't want the outside world to see, now it's exposed.


For simple static sites, I use a workflow very similar to this one[1]. It takes a minute or two to set up, but once it's all configured, you can deploy to your heart's content without ever worrying about exposing your .git directory to the world.

[1] http://toroid.org/ams/git-website-howto


No. Not this. index.html has no business being in the root of your project.

Always, and I mean always, put your web content into a directory separate from the root of your .git archive. This is the easiest way to avoid all of these problems.

Rails calls this directory "public", but it could be whatever you want, so long as what's mounted on your web server is not the root.


* Any folder in your project can have index.html and indeed it should have.

* Parent talks about static pages, not about rails-ruby/php/python project.

But I get your point: if somebody somehow screws up the server's config, there is a risk of exposing your app's files and configs. You can follow the parent's advice and just set your root path in the apache/nginx config to /var/www/www.example.org/public instead of /var/www/www.example.org


Even a static site can be structured so that the .git directory is outside of the main public mount.

This gives you a natural place to store notes, documentation, and other non-public content.

It's not about screwing with the server config, it's so that it takes several stupid mistakes before your .git folder is flapping in the breeze, not just one. Being one configuration directive away from embarrassing failure is not a good idea.


> Being one configuration directive away from embarrassing failure is not a good idea.

I can't help thinking about PHP webapps. WordPress, for instance, serves index.php from the same directory as wp-config.php, so it is indeed only one configuration directive away from blowing up in your face.

Every few months or so I encounter a huge site that serves me PHP source code. For instance the BBC: http://www.bbc.co.uk/radio4/hitchhikers/zmachine/hhguide.xml

But hey, it's PHP. I guess anyone who would care about something like that stopped using it years ago.


WordPress can work with wp-config.php in your document root, but it's recommended to move it one directory up (which is supported out of the box).


Why in the world would anyone serve a php file as .xml? That seems to be the problem... xml files are meant to be readable as text.


I would guess it's a configuration error around something like smart extensions, maybe? "If the clean URL has a .xml at the end, send the request through PHP."

Dumb, but it's the only thing I can come up with offhand.


I recently discovered that I had been serving .git on my blog for a couple of years. All it took to fix was a simple rule in my Nginx config:

    # Don't expose hidden files to the web
    location ~ /\. {
        return 404;
    }


Though, keep in mind the potential conflict with /.well-known/ - <http://tools.ietf.org/html/rfc5785>.
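
To see the conflict concretely, here is a rough sketch that models the rule with Python's `re` module standing in for nginx's regex matching. The names `blocked` and `HIDDEN_EXCEPT_WELL_KNOWN` are made up for illustration, and the negative-lookahead variant is my assumption about one way to carve out an exception, not something taken from the config above.

```python
import re

# Same pattern as the nginx rule `location ~ /\.`:
# a slash followed by a dot anywhere in the request path.
HIDDEN = re.compile(r"/\.")

# Hypothetical variant: skip the match when the dot segment is "well-known/".
HIDDEN_EXCEPT_WELL_KNOWN = re.compile(r"/\.(?!well-known/)")


def blocked(path, pattern=HIDDEN):
    """Return True if the request path would match the 404 rule."""
    return pattern.search(path) is not None
```

With the original pattern, `blocked("/.well-known/host-meta")` is true just like `blocked("/.git/config")`; with the variant passed as the second argument only the `.git` path is still caught.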


This isn't the best solution. Instead do as others have suggested and make a subdirectory of your project the webserver root.

This solution just stops nginx (or whatever web server you're using) from displaying the file. If someone finds a remote-file inclusion vulnerability in your app, in all likelihood they can use said vulnerability to browse your .git directory -- because, hey, it's in your webserver directory, so the permissions are almost certainly set up so whatever user the webserver is running as (www-data probably/hopefully) can view it!

(Obviously if your webserver user can view important files such as these elsewhere on the system, you're still screwed -- but reducing attack surface, etc etc.)


Yep. It's good practice to disallow . files because one time you may forget you put one there.


deny all; works too


OK, so I knew about this one before. But if you didn't, there's a better solution than just remembering it, which seems to be the gist of this post.

Solution: run one or more automated security tools across your sites, before deploying them to public locations. If possible, automate this process, so it happens all the time. The tools won't catch everything, but they will catch something you didn't, some of the time. Testing is good practice with all software! Do it.
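
As a tiny example of the kind of check that is easy to automate, here is a sketch of a pre-deploy step that walks the directory you are about to publish and reports version-control metadata. It is not one of the real scanners; `find_vcs_metadata` and the `VCS_DIRS` set are hypothetical names, and the list of directory names is just the obvious few.

```python
import os

# Metadata names to look for (an illustrative, non-exhaustive list).
VCS_DIRS = {".git", ".svn", ".hg"}


def find_vcs_metadata(docroot):
    """Return sorted paths, relative to docroot, of VCS metadata entries."""
    hits = []
    for current, dirs, files in os.walk(docroot):
        # Check files too, because ".git" can be a plain file pointing elsewhere.
        for name in list(dirs) + files:
            if name in VCS_DIRS:
                full = os.path.join(current, name)
                hits.append(os.path.relpath(full, docroot))
        # Do not descend into the metadata directories themselves.
        dirs[:] = [d for d in dirs if d not in VCS_DIRS]
    return sorted(hits)
```

A deploy script could refuse to continue whenever the returned list is non-empty.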


Can you recommend a couple of good automated security tools?



Very useful list. None of these has anything running as a service that you can try out easily. I guess that would be a Bad Idea anyway, as this kind of tool might either accidentally cause stuff to happen to a server or be used as part of an attack. I guess script youngsters will not be much deterred by having to install e.g. a Python or Ruby library, but any lazy bum who drops the idea of scanning someone else's site is a win.


https://www.tinfoilsecurity.com is a hosted automated security tool (Disclaimer: I work for Tinfoil)


Just tried it, not impressed.

"give us your email, give us your email, give us your email" on the first page isn't very welcoming, it sounds desperate. I can wait a short while for a scan (I came for the scan, not to give you my email address - what are you going to do with it, by the way?)

The scan returns 3 vulnerabilities - I have to create an account to see them. Alright, it's a "security company" (even though I've never heard of it before), how bad can it get? I have good spam filters anyway, whatever.

After creating the account, I have to verify I am the owner (this is a good thing!). I choose the meta tag option - it's broken, and the error is at the top of the page, not at the bottom where I have clicked "Verify". The link in the error doesn't even work.

So, I upload the HTML file and proceed to my report, where the 3 vulnerabilities have been reduced to 1, a vague "Entrance scan". It's impossible to see the contents; the only possible action is another scan, which returns 0 vulnerabilities. My website is safe.

Just before closing the page, I notice my email address at the top right, I see "My Account" there... and it's a good thing I've checked, because I've been signed up to 2 newsletters.


(Disclaimer: I'm a cofounder of Tinfoil)

Sorry about the experience you had! We only ask for your email address in the event that you leave the scan running and go away, so that we can email you a link to your report once the scan is finished.

We require verification before viewing any vulnerability data because we wouldn't want to show the vulnerabilities to someone who shouldn't have access. We'll take a look into the Meta tag issues you were having - normally that works great, but it's possible we messed something up.

The 3 vulnerabilities being reduced to 1 is /incredibly/ unusual, and we've never seen that before. Could you email me at borski@tinfoilsecurity.com with the URL or email associated with your account? I definitely want to look into this and fix it if there's an issue.

As for the two newsletters, you're more than welcome to unsubscribe. We think the information contained in those emails is actually very useful to anybody programming web applications, but we certainly understand that it is more useful for some than others. :)

TL;DR: Would love to fix the issues you ran into - please email me and we'll make it right. :)


The communication needs to improve at every step. I've just discovered that my one-off scan is actually a daily scan.

I admit that my intent was curiosity rather than something I actually need, however there are more things happening in the background than expected.


I make it a habit to put all my public files in a directory such as www/, whereas .git and other non-public but site-related files/directories are kept above it.


IMO this is the correct approach - Exposing every single file in the repository (including files that don't need to be public) sounds terrible.


Yes. I mean, even 6-year-old PHP articles get this right: don't serve files from the same place your executing code is running from.

public_html has specific connotations for a reason.


This is too much of a blanket statement. As long as there's nothing secret in the repository, serving up .git is perfectly fine. Both http://codemirror.net and http://ternjs.net (projects by me) have websites that are simply checkouts of the projects' repositories. Which were already public.


This is pretty much a blanket statement for any project of reasonable size (or even with just a sole clueless developer). But don't take my word for it, check the evidence first hand by poking around the list for yourself :)

The solution is so effortless that it seems indefensible to serve .git, when the risk of doing so is a fleeting moment of forgetfulness leading to your site and databases getting pwned. Kind of like /etc/passwd being "public" so long as nobody puts their password in their GECOS field. Why take the risk.


The repository is already public. It can't get any more public by being exposed over the website.


No, it really isn't a blanket statement. Almost all of my organization's web properties are public github repos (yay, non-profit organizations!). I doubt any of our properties serve .git, but that's just for technical architecture reasons (they're mostly Django apps), and there's literally nothing in the .git directories that isn't already available for free.


Back in the 90s I found I could get to /etc/passwd of a campus machine via anonymous ftp. I got all worried and I reported to the sysadmin, and they said anyone of the thousands of students who could log into the machine could read /etc/passwd anyway, and it was not a big deal because the passwords were in /etc/shadow and the anonymous ftp user could not read that file. This was back before ssh, and I think I expressed a concern about people knowing who had accounts on the machine, but you could tell if someone had an account via finger. Still seems like a bad idea to me. But if .git is from a public repo on github already I don't see the issue.


So long as the initial checkout wasn't made via an https://user:password@github.com/your/repo/ URL, it might not be an issue. But git doesn't warn in this case; it happily writes the password you provide in plaintext to your web root, and you may never notice anything is wrong until much later. If it's not that, then perhaps it's the rude comment about a coworker in COMMIT_EDITMSG that you abandoned before committing.
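
If you want to check your own checkout for the first case, a rough sketch follows. It assumes the usual `.git/config` layout where a remote's address sits on a `url = ...` line; `urls_with_passwords` is a made-up helper name, and it only looks at `url` keys.

```python
from urllib.parse import urlsplit


def urls_with_passwords(git_config_text):
    """Return the url values in a .git/config body that embed a password."""
    leaks = []
    for line in git_config_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == "url":
            url = value.strip()
            # urlsplit exposes the "user:password@" part of the address.
            if urlsplit(url).password:
                leaks.append(url)
    return leaks
```

Feed it the contents of `.git/config`; anything it returns is a credential that an exposed `.git` directory would hand out.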

Even if neither of these were the case, the general principle of unnecessarily exporting chunks of internal state is asking for trouble somewhere, even if you can't think of a good reason why it would bite today.


I think there's nothing wrong with this if there aren't (and weren't) any secrets directly embedded in the source code and all configuration files that contain sensitive information are (and always were) properly gitignore'd.

Tech-savvy users can even be encouraged to pull the code and send patches. :)


Somebody correct me if I'm wrong here, but doesn't the .git directory essentially contain the entire history of the repository? The history could easily contain sensitive information like passwords. It will contain names and email addresses of contributors, too. Try it yourself: cat .git/logs/HEAD
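
To make the names-and-emails point concrete, here is a small sketch that pulls identities out of that file. It assumes the usual reflog line layout (old hash, new hash, name, email in angle brackets, timestamp, timezone, then a tab and the message); `identities` and `REFLOG_LINE` are hypothetical names.

```python
import re

# "<old hash> <new hash> Name <email> <unix time> <tz>\t<message>"
REFLOG_LINE = re.compile(
    r"^[0-9a-f]{40,64} [0-9a-f]{40,64} "
    r"(?P<name>.*) <(?P<email>[^>]*)> \d+ [+-]\d{4}"
)


def identities(reflog_text):
    """Return the set of (name, email) pairs found in reflog text."""
    found = set()
    for line in reflog_text.splitlines():
        match = REFLOG_LINE.match(line)
        if match:
            found.add((match.group("name"), match.group("email")))
    return found
```

Anyone who can download `.git/logs/HEAD` can run the same extraction.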


Why would it contain passwords?


Naive users have left passwords in by mistake, only to make another mistake in thinking a commit takes care of it.


I've done this, accidentally published my livejournal passwords on a project. Now I make certain that at the very least I have a folder covered by .gitignore in my local repo that contains any auth tokens or whatnot and I double check the history before pushing it anywhere.


Some people are naïve. All people make mistakes. And many people are naïve about whether they make mistakes.


Well, it could contain OAuth tokens for external services (eg Twitter), as well as secret tokens (used in Rails, Django for cookies).

Worse still, they could be using passwords in an external service (eg, for a database) and have included those as well.


Stop putting shit like this in your repo. Developers should not have access to credentials that make their way onto production.


So how else should they handle them? Assuming the repo is private, keeping keys in the repo is the most frictionless way to ensure everyone has everything set up correctly.

Environment variables get annoying quickly if you ever need different ones for different projects, and if you create a shell script (or Vagrantfile) to do it for you, you're still keeping the keys in the repo.


> Assuming the repo is private

Git is designed to facilitate sharing. Repos are a poor tool for managing secrets, especially intermixed with a general software project.

Use something else for secrets. Ideally you would generate the secret on the same server on which it will be used and not move it over the network (except for a one-off backup).


Put them in deployment-specific configuration files.
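
A minimal sketch of what that could look like, assuming a JSON file that lives outside the repository on each server, with an environment variable taking precedence. The function name `load_secret`, the JSON format and the lookup order are all my assumptions, not anything prescribed here.

```python
import json
import os


def load_secret(name, config_path=None, environ=os.environ):
    """Look up a secret in the environment, then in a per-deployment file."""
    if name in environ:
        return environ[name]
    if config_path and os.path.exists(config_path):
        # The file is kept out of version control and written on the server.
        with open(config_path) as fh:
            return json.load(fh)[name]
    raise KeyError(name)
```

The application then calls something like `load_secret("COOKIE_SIGNING_KEY", "/etc/myapp/deploy.json")` (both names hypothetical), and the repository never contains the value.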


Granted, that stuff shouldn't be in the repo, but some of us are both dev and ops, or just working on our personal site. Chill out.


  // TODO: replace password and username with user input
  password = "johnsmithAdminpass1234"
  username = "johnsmith01"
  login(username, password)

If your test DB is the same as your live one and username and password are real admin passwords, then you are in trouble. It's horrible and you shouldn't do this, but it happens.

Edit: formatting


Fortunately, it can be fixed just by changing your passwords, which you should do from time to time anyway.


The history could. If you ever stored passwords in Git, even if you have now removed them, they would still be in the history.


Luckily, you can use git to rewrite history.

http://git-scm.com/book/ch6-4.html


I feel incredibly stupid asking this, but... how?

I'm running a (very small) personal site on lighttpd, and updating it via git. I checked my.server.com/.git/config - and was duly served the config file. Eek!

So I edited my /var/www/lighttpd/lighttpd.conf file, and added the following lines:

    $HTTP["url"] =~ ".git" { url.access-deny = ("") }

and (I have used [asterisk] for the symbol, here, as otherwise it rendered my text italic)

    $HTTP["host"] =~ "(.[asterisk])" { url.redirect = ( "^/.git(.*)" => "%1/nope.html" ) }

And restarted the server.

But my.server.com/.git/config still served up the config file. What am I missing?


Another way would be to only deploy releases to your server and not have the .git directory at all. Look at git archive.


Instead of messing around with access and redirect stuff I would suggest, as others have in this thread, that you just put your files in a subdirectory and use this subdirectory as root for lighttpd.

Problem solved with two commands and you can also add additional stuff to your repository like sources before they went through different processors (coffee script, SASS,...) or various drafts. This way you also have a full copy of everything you need for your site in case something happens to your workstation.


OK, now I'm REALLY confused.

I've moved all my documents from /var/www/lighttpd/mainSite to /var/www/lighttpd/mainSite/webdocs (leaving .git in .../mainSite).

I've updated /etc/lighttpd/lighttpd.conf to point at /var/www/lighttpd/mainSite/webdocs, and confirmed that it's pointing at the right place by editing a file, and confirming that that change shows up on the site.

I've confirmed that there is no .git folder in webdocs (you never know!)

But calling my.site.com/.git/config still serves a file!?

Time to go ask on some lighttpd fora...


Hit refresh in your browser. :)


Cleared the cache, refreshed, it's still there. Weirdness abounds.


So, instead of having:

    .git
    some_page.html
    some_css.css

You would advocate this?

    .git
    web_docs/some_page.html
    web_docs/some_css.css

Makes sense. It will require some reorganisation, but it seems (from everyone's comments) to be best practice. Thanks!

(Still kinda bugs me that I can't work out how to prevent lighttpd from serving individual files, though!)


You can turn the .git directory into a simple "shortcut".

To explain it briefly:

  mv .git ../git-repo
  echo "gitdir: ../git-repo" > .git
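
For anyone wondering what git does with that one-line file, here is a sketch of the lookup in Python. It assumes a relative `gitdir:` path is taken relative to the directory holding the `.git` file; `resolve_git_dir` is a hypothetical helper, not a git API.

```python
import os


def resolve_git_dir(worktree):
    """Return the repository directory a working tree points to, or None."""
    dot_git = os.path.join(worktree, ".git")
    if os.path.isdir(dot_git):
        # Ordinary layout: the repository lives inside the working tree.
        return dot_git
    if os.path.isfile(dot_git):
        with open(dot_git) as fh:
            first = fh.readline().strip()
        if first.startswith("gitdir:"):
            target = first[len("gitdir:"):].strip()
            # Resolve relative targets against the working tree directory.
            return os.path.normpath(os.path.join(worktree, target))
    return None
```

Note that the small `.git` file itself still sits in the web root, so a server that serves dotfiles will reveal the path it contains, though not the repository contents.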


.git should never be in your web root... I can't think of any situation (other than a very simple site) where you'd want to just stick the whole Git repository in the web root. Normally there's a bunch of other things in the repository (documentation, database scripts, etc.) that you wouldn't want exposed publicly.


You can hide it very easily with any of the virtual host syntax, and it's easy to deploy that way. I say, why not.


You could also very easily add one additional command to your deploy script to copy all the web content from the directory being pulled to to the web root, and not have to worry about any accidents.
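
A sketch of such a copy step, written as a Python function rather than a shell one-liner. `publish` and both path arguments are hypothetical; it relies on `shutil.copytree` with `dirs_exist_ok`, which needs Python 3.8 or later.

```python
import shutil


def publish(checkout, web_root):
    """Copy the checkout into web_root, leaving VCS metadata behind."""
    shutil.copytree(
        checkout,
        web_root,
        # Skip repository metadata at any depth of the tree.
        ignore=shutil.ignore_patterns(".git", ".svn", ".hg"),
        # Allow re-deploying over an existing tree (Python 3.8+).
        dirs_exist_ok=True,
    )
```

This only adds and overwrites files; files deleted from the repository stay in the web root until you remove them some other way.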


Or you could set the web root to be that directory.


Hiding it is riskier than just not having it there in the first place. Why blacklist when you can whitelist?


Because of all the reasons parent mentioned.


You should do both, though.


Same problem with subversion. Try http://yoursite/.svn/entries

If you use Apache, you can add the following to prevent the .svn directories from being served:

    <LocationMatch ".\.svn.">
        Order allow,deny
        Deny from all
    </LocationMatch>
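
If you want to check your own site for both problems, here is a sketch of the probe list plus a test for what a served `.git/HEAD` looks like. The helper names and the choice of probe paths are mine; the code only builds URLs and classifies a response body, so fetching them is left to whatever HTTP client you prefer.

```python
import re
from urllib.parse import urljoin

# Well-known metadata files that should never be reachable over HTTP.
PROBE_PATHS = [".git/HEAD", ".git/config", ".svn/entries"]


def probe_urls(base_url):
    """Return the URLs to request for a given site root."""
    if not base_url.endswith("/"):
        base_url += "/"
    return [urljoin(base_url, path) for path in PROBE_PATHS]


def looks_like_git_head(body):
    """Return True if a response body resembles the contents of .git/HEAD."""
    text = body.strip()
    # HEAD is either a symbolic ref or a bare 40-character commit hash.
    if text.startswith("ref: refs/"):
        return True
    return re.fullmatch(r"[0-9a-f]{40}", text) is not None
```

Checking the body matters because some servers answer every URL with a 200 status and a custom error page.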


This is a brilliant post. I was about to deploy a site before I read this. It's so easy just to forget that directory.


You can keep the same workflow but just have .git somewhere else. Just define GIT_DIR.


Last I checked, GIT_DIR wasn't actually respected by a lot of git subcommands. Is that no longer the case?


As far as I'm aware, it's respected by every git subcommand. I've never had a problem using it. Do you have any specific examples?


Defcon 19: Adam Baldwin - Pillaging DVCS Repos For Fun And Profit: https://www.youtube.com/watch?v=3Tq8tUDKUH0


In a word: Rsync. What I do is have source files in one directory, filter them over to another with, at the moment, jinja2, then rsync that to my server using a shortcut script. The filtering is the tricky part.


I used to restrict that globally in my nginx / apache configurations (since I mainly work with these two web servers).


I just push to a repo out of the web root that has a git hook which sets GIT_WORK_TREE to my web root and does a checkout.
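
Hooks can be any executable, so here is a sketch of that hook as a Python `post-receive` script. `WEB_ROOT` is a placeholder path and `checkout_command` is a made-up helper; I am assuming git exports `GIT_DIR` when it runs the hook and using that as the signal to act.

```python
#!/usr/bin/env python3
import os
import subprocess

WEB_ROOT = "/var/www/www.example.org"  # placeholder: your document root


def checkout_command(web_root, base_env=None):
    """Build the command and environment for a forced checkout into web_root."""
    env = dict(os.environ if base_env is None else base_env)
    # Point git at the web root as the working tree; the repo stays elsewhere.
    env["GIT_WORK_TREE"] = web_root
    return ["git", "checkout", "-f"], env


# Only act when invoked by git as a hook (assumes GIT_DIR is set by git).
if __name__ == "__main__" and "GIT_DIR" in os.environ:
    argv, env = checkout_command(WEB_ROOT)
    subprocess.run(argv, env=env, check=True)
```

Saved as `hooks/post-receive` in the repository outside the web root and made executable, it leaves only the checked-out files in the document root.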


Devs aren't always the ones deploying the sites they develop! Also, source code with non-standard file extensions (e.g. .module for Drupal sites) is often exposed and should probably be hidden.


Just keep the .git bare repo outside the document root.


This is clearly an Ops problem.


yes, but that doesn't make it not a dev problem.


devops


or, to put it another way, please stop writing closed source software ;)



