In response to this disclosure, we have set up a continuously running scanner for credential leaks of various kinds. It's not foolproof, but it's made things a lot better. We'll write a proper blog post about this at some point, but we've been really busy!
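For illustration, here's a minimal sketch of the kind of pattern matching a scanner like that might do. The regex targets the well-known shape of AWS access key IDs; real scanners combine many such patterns with entropy checks and allow-lists, and this is not the poster's actual implementation:

```python
import re

# AWS access key IDs have a distinctive shape (AKIA followed by 16
# uppercase alphanumerics), which makes them one of the easier
# credential types to flag.
AWS_KEY_RE = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

def find_leaks(text: str) -> list[str]:
    """Return any strings in `text` shaped like AWS access key IDs."""
    return AWS_KEY_RE.findall(text)

# Using the example key from AWS's own documentation:
print(find_leaks("aws_access_key_id = AKIAIOSFODNN7EXAMPLE"))
# ['AKIAIOSFODNN7EXAMPLE']
```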
I've also written about some less security-critical things, like shell history (http://corte.si/posts/hacks/github-shhistory), custom aspell dictionaries (http://corte.si/posts/hacks/github-spellingdicts), and seeing whether one could come up with ideas for command-line tools by looking at common pipe chains in shell histories (http://corte.si/posts/hacks/github-pipechains).
I've held back some of the more damaging leaks that are easy to exploit en masse with a tool like this (some are discussed in the linked post, but there are many more), because there's just no way to counteract them effectively without co-operation from Github. I've reported this to Github with concrete suggestions for improving things, but have never received a response.
This works pretty well too, doesn't suffer from Github blocking your script, and is probably even easier.
Github might add something like a warning on your repo noting that it appears to contain data you might not want out there.
You can access all of this functionality with ghrabber.
One of my suggestions to Github is that they disable indexing of dotfiles of all persuasions (including contents of dot-directories), unless the repo owner explicitly opts in. That would make it much harder to find a very large fraction of the more obvious leaks.
If true, doesn't this make crippling the usefulness of GH's search really superfluous?
Full disclosure: I'm never a fan of crippling search to cover the ass of someone who has pushed sensitive information to a publicly accessible location. I'm still sore about Google's decision to do things like prevent one from searching for (for instance) credit card numbers. :(
Just look at some of these chains:
ps | grep
cat | grep
find | grep
find | xargs
grep | wc
ls | grep
echo | grep
grep | grep
A particularly odd one on the list was `type | head`. Does anyone know the purpose of this?
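A rough sketch of how chains like the ones above can be tallied from a history file (simplified, and not the linked post's actual method: it keeps only the first word of each pipeline stage and ignores quoting, subshells, and pipes inside strings):

```python
from collections import Counter

def pipe_chains(history_lines):
    """Count simplified 'cmd1 | cmd2 | ...' chains in shell history lines.

    Each stage is reduced to its first word, so 'ps aux | grep nginx'
    and 'ps -ef | grep java' both count toward 'ps | grep'.
    """
    counts = Counter()
    for line in history_lines:
        if "|" not in line:
            continue
        stages = [seg.strip().split()[0]
                  for seg in line.split("|") if seg.strip()]
        if len(stages) >= 2:
            counts[" | ".join(stages)] += 1
    return counts

# Hypothetical sample history:
sample = ["ps aux | grep nginx", "cat log | grep ERROR", "ps -ef | grep java"]
print(pipe_chains(sample).most_common(1))  # [('ps | grep', 2)]
```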
Remember, we are always intermediates at most things.
Oops? Ironically (assuming two distinct values of PATTERN), I think you just answered your own question. (They are different: the first is a disjunction of patterns, the second a conjunction.)
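To spell out that difference with an illustrative sketch in Python rather than grep itself: `grep -e A -e B` keeps lines matching either pattern, while `grep A | grep B` keeps only lines matching both:

```python
import re

lines = ["apple pie", "apple tart", "banana split", "cherry"]

# Disjunction, like `grep -e apple -e banana`: either pattern matches.
either = [l for l in lines if re.search(r"apple|banana", l)]

# Conjunction, like `grep apple | grep pie`: both patterns must match.
both = [l for l in lines if re.search(r"apple", l) and re.search(r"pie", l)]

print(either)  # ['apple pie', 'apple tart', 'banana split']
print(both)    # ['apple pie']
```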
Your point has merit for scripts (performance) but for data exploration at the prompt it's almost always irrelevant: the simplicity of pipe composition outweighs anything else.
I'm curious about Koa; I've seen almost every pull request that's gone in there. Anyway, nice post.
It's enough for one of the packages down the line to break compatibility without changing its version correctly (i.e. bumping the major version number), or to have slightly too loose version requirements, and everything down the line breaks. OK, if something gets broken it's relatively easy to notice, provided test coverage is good enough.
It's much, much harder, however, when it comes to security breaches (like the one described in the linked article); you might not notice one for a long, long time.
Anecdotal data, but I tried to teach the interns to use Yeoman while they were working on a small AngularJS project, and it just didn't work because some dependency somewhere was broken. It happened to me as well, and the solution was to try updating again a few days later (I should have opened an issue, I know).
I'm using npm shrinkwrap to avoid surprises, but still... it just doesn't feel right. I shouldn't risk breaking the project just by updating the dependencies, unless I've decided to move one of them to a new major version.
In practice that means you can push a semicolon fix, your CI server will fetch a different (newer) version of a dependency, and something completely unrelated breaks.
For one it generates huge files. Like 750KB on a project with a couple dozen dependencies: https://github.com/metabase/metabase/blob/master/npm-shrinkw...
Secondly, it's not deterministic and will generate huge diffs every time you run it even if nothing changes.
Uber has a tool called npm-shrinkwrap that in theory is supposed to solve the latter, but I've never gotten it working on my current projects: https://github.com/uber/npm-shrinkwrap
The idea is to rely on semver. If you specify ~1.3.4 for a dependency, and that dependency follows semver properly, you'll get 1.3.5 when it's out and your stuff will still work, but you're getting bug fixes and patches without having to keep an eye on what are sometimes hundreds of dependencies. Luckily, tools like greenkeeper.io are around now.
The drawback is that many people don't follow semver, so I opt to append --save-exact to all npm installs (actually, I have npm config set save-exact true).
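As a sketch of what a tilde range promises (heavily simplified: real semver also handles pre-release tags, build metadata, caret ranges, and more):

```python
def satisfies_tilde(version: str, base: str) -> bool:
    """Does `version` satisfy `~base`?

    A tilde range pins major.minor and allows patch-level updates only.
    Simplified sketch: assumes plain X.Y.Z versions with no pre-release
    or build metadata.
    """
    vmaj, vmin, vpat = (int(p) for p in version.split("."))
    bmaj, bmin, bpat = (int(p) for p in base.split("."))
    return (vmaj, vmin) == (bmaj, bmin) and vpat >= bpat

print(satisfies_tilde("1.3.5", "1.3.4"))  # True: patch bump is allowed
print(satisfies_tilde("1.4.0", "1.3.4"))  # False: minor bump is not
```

The `--save-exact` setting mentioned above opts out of ranges entirely: every install is pinned to one exact version, trading automatic bug fixes for reproducibility.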
Security flagging that can parse your dependency tree is a good solution to this in my opinion.
Edit: the author notes that these are excluded by npm anyway these days. The documentation does not reflect this.
> npm will no longer include .npmrc when packing tarballs.
If I remember in the morning, I'll send them a PR to update their docs.
Wow, this is the scariest part. Your details have already leaked, you get notified about it, and you still decide that resetting the token/login back to its original value is the best thing to do.
An example of this happening on AWS just like you mentioned: https://www.humankode.com/security/how-a-bug-in-visual-studi...
> That's pretty sweet, but by then the damage's already done.
I'm not sure that's true. I was able to disable the key before anyone used it (although it was locked down so far that they couldn't have charged anything to my account, since I didn't trust the code I was testing with real money).
Of course, 93% of our repositories are private so this feature may not be exceedingly useful to our customers vs other things we could be spending our time on.
Edit: I shouldn't have said not useful, rather, comparatively there may be more value in us pursuing other work first. E.g., provide a mechanism for 3rd party pre-receive hooks via our add-on system.
Is BitBucket separate from Atlassian? Are you hiring? ;)
Yes, we're part of Atlassian and we're hiring in San Francisco.
PHP's package repository, Packagist, doesn't have this problem because the whole flow happens in the browser. You never enter or store any credentials on the command line; you click a button on the Packagist site and it tracks your already-published GitHub repository.
Finally, if GitHub can automate some of the simple checks, so can we, for different tools and environments of course.
The article states that many popular Node.js packages have had leaks (in the past). Also, this article was not the source of many of those leaks (for example, Bower's GitHub OAuth token was expired by GitHub itself when it was posted to the website).
You should review the stuff you publish more carefully. That includes commit review, reviewing package contents before publishing, reviewing config files, and reviewing logs before sharing them.
If you have an org, it would be better to educate your devs and to put each commit through independent review. Also, don't forget to check package contents.
Naturally, anyone can become dependent on anything designed to assist them. I'm not really passionate about either direction.