GitHub hit by DDoS attack (status.github.com)
405 points by theyeti on Mar 27, 2015 | 209 comments



This article [0] summarizes what happened. It is, however, in Chinese, so let me put a simple summary here:

Baidu has Baidu Analytics, a service similar to Google Analytics. In short, a website includes a JavaScript file from Baidu, and Baidu reports some basic analytics to the site manager: how many visitors per day, how much time they spend on average per page, etc.

Someone in the middle between clients outside China and Baidu (allegedly the Great Firewall) modified the JavaScript file from Baidu, adding code so that any client executing it will periodically access https://github.com/greatfire/ and https://github.com/cn-nytimes/. This means any user visiting a site that uses Baidu Analytics becomes an attacker against github.
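Based on the linked analysis, the injected code behaves roughly like this sketch (the interval, structure, and helper name are illustrative; only the two target URLs come from the write-up):

```javascript
// Illustrative sketch only, NOT the actual injected payload
// (see the gists linked elsewhere in the thread for the real thing).
var targets = [
  "https://github.com/greatfire/",
  "https://github.com/cn-nytimes/"
];
var i = 0;
function nextTarget() {          // round-robin over the victim URLs
  return targets[i++ % targets.length];
}

if (typeof document !== "undefined") {
  // In a browser, fire a cross-origin GET every couple of seconds by
  // appending a <script> tag; CORS cannot block this kind of request.
  setInterval(function () {
    var s = document.createElement("script");
    s.src = nextTarget();
    document.head.appendChild(s);
  }, 2000);
}
```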

Here is a simple solution: block any JavaScript from Baidu if you do not use it. For Chrome users, add the pattern [*.]baidu.com. See here [1].

Edit 1: Added a solution.

Edit 2: Format.

Edit 3: Oh, it's not only Baidu Analytics. Baidu Ads' JavaScript is also being hijacked and modified [2]. Imagine all sites carrying Google Ads using their visitors to attack github; that is literally what is happening to Baidu and its customers (and their customers' visitors). The JavaScript is only modified for visitors outside China, which is why people believe this was done by the Chinese government, the only entity with total access to all outgoing routers in China. Since many Chinese users use a VPN or another type of proxy to access the Internet, they are all considered visitors outside China.

0. http://drops.wooyun.org/papers/5398

1. http://www.howtogeek.com/tips/how-to-block-javascript-and-ad...

2. http://www.solidot.org/story?sid=43489


According to this [1] post, GitHub (or someone else in between) started changing the responses to alert("Malicious Script Detected") [2]. That's an awesome counterattack: it stops the script from looping indefinitely and annoys the users.

1. http://insight-labs.org/?p=1682

2. https://github.com/greatfire/


For github, this is a smart move.

But, really, you can hardly negotiate with the Chinese government. I'm pretty sure they will deny this attack and re-emphasize their so-called Internet policy.

If I were github, instead of a warning message I would redirect the workload to some Chinese government website and let them suffer from what they've created. Let's face it, they are waging a war on the Internet first.

Edit: Disclaimer: I know that my posts are quite biased, especially this one. I'm not suggesting that people should wage war on the Chinese government. Please take my words as just a (biased?) sample from an ordinary Chinese citizen who is really tired of the government's censorship.


> Let's face it, they are waging a war on the Internet first.

The two most important aspects in war are casus belli and plausible deniability. China has the latter and github lacks the former. Thus github would lose by default in any 'war' against the Chinese government.


How would github redirect the load?



I think you are underestimating the volume of traffic. Simply generating that many 301s would be an issue. And... where would you redirect to?


Generating a 301 is certainly less work than rendering a user's entire profile page.


Generating a 301 is likely more work. Profile pages are simple database hits, and they may be dynamically or even statically cached (for popular pages). You're probably severely underestimating how much traffic China can produce [1].

1. http://furbo.org/2015/01/22/fear-china/


A 301 avoids a database hit, and avoids having to go through the cache framework altogether.

It just says: given this URL, we return a header that tells the browser to redirect. The only thing faster would be dropping the connection as soon as it's handed to you.


Excuse me, why not a 403 or 404?


If I scroll up a bit, I see the reason is:

> redirect the workload to some Chinese government's website and let them suffer what they've created.


A 301 is unequivocally less work in every respect.


If Verizon and the like can hold up traffic to/from Netflix... couldn't they just do the same for Chinese traffic? Maybe redirect requests for Baidu to a Chinese version of Google?

I'm not saying this is a great idea, just that something could be done.


[flagged]


Could you be more of a sock puppet in this discussion?


That's an insult to sock puppets everywhere.


The Chinese government is for sure a plausible culprit here, but consider that the American government (NSA, or another TLA) is also a plausible culprit. If this ends up successfully blamed on the Chinese, it builds support for "cyberwar" defence funding in the US.


kardos, that is fair. I do not know which government entity paid for this poster, just that as usual someone is trying to sway online opinion with their workforce.


The wu-mao is real.


Instead of that "script detected" alert, they should convert the whole page to something the Chinese government really hates, like the Tiananmen Square massacre.


I hope they do. I hope they regret it happened. It should never have happened.


And DDoS the end sites?


If the message had some information about the Tiananmen Square massacre or some other censored information, the attack would probably stop. At least temporarily.


This is a genius idea: using the GFW to deflect GFW attacks!


They are using IPs outside China...


Very interesting defense. It seems to work because the attacking AJAX call is made with dataType 'script'. I don't think it'll be too hard for the attacker to fix that.


It's either that or call a jsonp endpoint, which could still throw up the alert. CORS protects standard AJAX from requesting anything outside the current domain.


So Baidu is using eval() instead of JSON.parse()? What kind of engineers did they hire?!


They're using neither. It's a cross-domain call, so Github could block a regular AJAX GET just by not including ACAO headers. So they use $.get with dataType 'script'. This is basically JSONP without the callback: it adds a script tag with the remote URL to the page, which means the client has no choice but to run the contents.


Even I use JSON.parse() without knowing its benefit :3


JSON.parse takes a string of "JSON" and turns it into a JS object. It doesn't evaluate the string in a JS context at all, which is what eval() does.

Some people have used eval() to do JSON parsing because JSON is a subset of JS, but if the user has any control over the JSON, they can malform it to create JS that does anything the page can in the context of another user, otherwise known as Cross-Site Scripting (XSS).
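A minimal illustration (the strings here are made up):

```javascript
// Well-formed JSON parses safely, with no code execution involved.
var good = '{"user": "alice", "visits": 3}';
var visits = JSON.parse(good).visits;   // 3

// Attacker-controlled "JSON" stops being data once eval() is involved:
var evil = '{"user": alert("XSS")}';
// eval("(" + evil + ")");              // would run alert() in the page

// JSON.parse simply rejects it instead of executing anything.
var rejected = false;
try {
  JSON.parse(evil);
} catch (e) {
  rejected = true;                      // SyntaxError: not valid JSON
}
```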


I wonder what percentage of Baidu users read English rather than just Chinese.


Wikipedia suggests the English-reading Chinese population is in the hundreds of millions (much higher than the English-speaking Chinese population), and I would guess that group overlaps quite a bit with Baidu users.


Why would the attacker run the loaded content as a script instead of just dumping whatever they get?

Edit: I think it is the dataType: "script" part. From the jQuery docs:

> "script": Evaluates the response as JavaScript and returns it as plain text.


They have no choice. If they used an AJAX call, it could be blocked by (lack of) ACAO headers. The only way to hit a remote URL that cannot be blocked is by adding a <script src="//github..."></script> tag to the page, which means the client has no choice but to run the contents.
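The distinction can be modeled as a toy version of the browser's decision (the helper name and origins are made up; the real enforcement lives inside the browser):

```javascript
// Toy model: can a page read a cross-origin XHR/fetch response?
// <script src="..."> skips this check entirely; the request is always
// sent and the response body is executed as JavaScript.
function xhrCanReadResponse(pageOrigin, requestUrl, acaoHeader) {
  var requestOrigin = new URL(requestUrl).origin;
  if (requestOrigin === pageOrigin) return true;  // same-origin is fine
  // Cross-origin reads need the server to opt in via
  // an Access-Control-Allow-Origin header.
  return acaoHeader === "*" || acaoHeader === pageOrigin;
}

// A Baidu-customer page cannot read github.com via plain AJAX,
// because GitHub sends no ACAO header for it:
var blocked = xhrCanReadResponse(
  "http://example.cn", "https://github.com/greatfire/", undefined); // false
```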


What about an img tag?


Hmm, yes that would probably work... Not sure though.


The JavaScript used is very amateurish, with many outmoded features, and poorly optimized. I couldn't believe they were loading jQuery, for example. This looks like the work of a script kiddie rather than a superpower's cyber warriors.


I agree, but it might just be deflection. The Chinese could use the same argument to assert they bore no responsibility. Besides, everybody has jQuery cached. Why create AJAX from scratch and add to the weight of the crap they are injecting into the script?

For quick and dirty, I like it. It's not exactly long-term or really destructive, but it's kind of a cute and clever attack.

Github's response of popping an alert was priceless. Sure, it probably annoyed the hell out of millions of Chinese people, and their government will probably claim Github attacked them, but the truth will out... maybe.

Totalitarian regimes are shady as hell.


Or a script kiddie working for a superpower.


I feel like this alert should be in Chinese for greatest effect.


I asked this in another thread, but it went down, it seems because the linked post was taken down [0]:

Would it have been prevented if Baidu served the .js files only over https? Are there any reasons for using http for anything that Baidu serves?

[0] https://news.ycombinator.com/item?id=9275201


Probably, yes. But considering that CNNIC, a root CA from China, is issuing unauthorized certificates [0], I can't help but connect these two events. I wouldn't be surprised if the Chinese government were using unauthorized certificates to mount MITM attacks specifically targeting TLS traffic. If that is the case, there will be really bad days ahead for the whole Internet.

0. http://googleonlinesecurity.blogspot.com/2015/03/maintaining...


Well, that sucks. That effectively makes HTTPS worthless there doesn't it?

Also on the other link I have seen another relevant article [0] on how BitTorrent could be used for attacks from China.

Scary stuff.

[0] http://furbo.org/2015/01/22/fear-china/


CAs aren't geographically limited. Any CA trusted by your computer is trusted for any domain anywhere (with the exception of certificate pinning, which isn't commonly used). That means that a single rogue CA is enough to make HTTPS worthless everywhere.


Mozilla actually has done this (sort of), once. They restricted French agency ANSSI's root CA to only be valid for TLDs ending in .fr, .gp, .gf, .mq, .re, .yt, .pm, .bl, .mf, .wf, .pf, .nc, .tf.

https://wiki.mozilla.org/CA:IncludedCAs


They could also strip the https and serve everything over http through the firewall. The fact that the firewall exists is accepted in China so I don't see why they couldn't pull that off too.


For this to work properly, it requires https only, to prevent downgrade attacks (as you stated).

Google Analytics is also served over both http and https? Can anyone shed some light on that?


From https://developers.google.com/analytics/devguides/collection...

    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';


"Gee coldpie, why do you use NoScript? All you're doing is breaking every website you visit!"

Shoe's on the other foot now, hahaha!

:)


Having no sites on the internet actually work finally paid off huh


Personally, the content I'm usually interested in when I visit a page is just the text and images. There are some exceptions, but that's what a whitelist is for. Even if you don't count the numerous security benefits, denying scripts by default has made my websurfing much more enjoyable.


It's a great filter for nonsense. Usually sites that cause problems for NoScript are crap anyways. Not always, but often enough that I see it as a net benefit.


I use NoScript as well (and I wish more people would) but to be fair I doubt the users who were part of the botnet even noticed it at all. It's only github who would've benefited from these users running NoScript.


Yeah, it was obviously a lighthearted comment, but the larger issue is that every web user is running someone else's untrusted code on every website they visit. Frankly I'm surprised these kinds of attacks aren't more common. NoScript helps mitigate this issue, and while it has lots of other incidental bonuses that a nerd like myself cares about, I freely admit it results in a worse end-user experience for almost everyone else.


They could have done a similar attack with an <IMG> tag. Or do you block images too?


Erm, you don't? I suggest you read the Basilisk FAQ before you get into real trouble...

http://ansible.uk/writing/c-b-faq.html


Something like Request Policy could cover this


What I don't get is why they didn't inject a script into all html passing through the firewall. That would have achieved a much greater effect if they really wanted to take out GitHub - the Baidu Analytics tracker is just a single script.


Easy to implement?

This is just my theory: I think the GFW is currently entering its next stage, which probably includes MITM attacks on TLS traffic and attacks specific to websites outside China. I suppose that since everything is now in a "research" stage, they are just trying to see whether the technique works and how far it can go.

Disclaimer: I was a user inside China, blocked from the real Internet. So please take my words with a grain of salt.


This is actually the most plausible explanation I have seen so far: they just finished implementing this new injection feature and they needed something to test it on. For lack of a better target, they chose those two github projects.

Everybody's talking about how this is a targeted attack against GH, but I'm starting to think you might just have hit the nail on the head...


What is GFW?


The Great FireWall of China - https://en.wikipedia.org/wiki/Great_Firewall


Thanks.


Using jQuery to send this request is really kind of amateur. They could just append a <script> tag to the page, which is effectively what that $.ajax call does.


Can we report it to Baidu and ask them to clean up the scripts immediately?


If it's a MITM attack, they can't do anything, I guess.


Wow, this must be a major blow to Baidu. This time their scripts were hijacked to DDoS; not that bad.

Since we all know that the Chinese government would never do a thing like this, it must mean that there is a very powerful hacker group behind this. And they are probably DDoSing for profit. Who knows what they may do for profit next? Spy on users? Steal passwords? Credit cards? Impersonating users?

Until Baidu implements a secure crypto solution that can prevent this malevolent hacker gang from sending corrupted scripts, it would be very irresponsible to use baidu analytics!


> Since we all know that the Chinese government would never do a thing like this

Why?


They've mobilised the Troll Department


I'm kind of amazed that these trolls are so obvious. It makes me wonder how much user generated content is really government generated content that we miss because it's not as apparent.


What if it's all government generated content? Maybe there are no real users online at all.


Wow, this one has >100 karma and his other comments are actually reasonable.. Normally they're new accounts. Maybe this is just a regular nut?


I'd assumed it was sarcasm, but clearly Poe's law applies here.


I think that was sarcasm.


The much-vaunted principle of non-interference in other countries' internal affairs, of course.


wooooooooooooosh


Something bothers me about this post. It's obvious astroturfing, but this user has a large number of legit comments. Those are in perfect English, while this one clearly is not, and it uses a totally different tone and voice than their previous comments.


Baidu's JavaScript CDN is being hijacked by the national firewall, which injects this JS attack script. Any website that includes a JavaScript library from Baidu's CDN will automatically run a script that DDoSes GitHub. The attack script is here:

https://gist.github.com/zhufenggood/7bb040b1effb71d14bcc

Here is a deobfuscated version, made with http://jsbeautifier.org/

https://gist.github.com/zhufenggood/6a38c2a2b2185977b3cb

Github noticed this and replaced the response to those DDoS HTTP requests with alert("WARNING: malicious javascript detected on this domain"). That is why some Chinese users get a weird pop-up with English text when visiting Chinese websites.


A clever move, we should say? Or does anyone have a better idea?


Looks like another case of Chinese traffic being tampered with to load resources from another domain - in this case Baidu searches: http://insight-labs.org/?p=1682


So who's doing this? The Chinese government? Baidu themselves?


Sad but true: it is better to use tcpwrappers to block Chinese botnets. See for example http://g14n.info/2015/03/server-hardening-tips/#restrict-ssh...


I was going to make the point that Baidu should serve its analytics JS over https only.

But thinking about it, there are failures on so many levels.


This is far more interesting than the OP. Thanks for sharing.


You mean TFA.


This is a reminder that having a single service, like Google Analytics or Google Fonts, injected into just about every major site on the web might not be a great idea.


This!

I don't think this would have prevented this particular DDoS, but it couldn't be closer to the truth.

CDNs hosting libraries / fonts / resources for the web are going to be targeted more and more; they are just too attractive to malicious people.


This is why I wish we could have a file hash attribute added to certain tags (such as script). It could improve caching across domains and validate the content you're serving up. A proposal was posted here a while ago.


Subresource Integrity - W3C Editor's Draft http://w3c.github.io/webappsec/specs/subresourceintegrity/


Someone turning a widely-used, third-party-hosted JS file "evil" seems like an incredibly difficult layer-7 DDoS to address. Even assuming you have great capacity to filter at the edge (CloudFlare, being Google, etc.) but a limited backend, it's still very hard to identify legit vs. non-legit traffic and filter on it.

(Obviously if the attack is against, say, Chinese users, and your site's legitimate users are mainly in Estonia, you can do filtering, or if the attack only hits an obscure URL, but the attack doesn't have to be weak in that way.)

There are a bunch of potential ways to address it, but they all work best if you have a site with a defined community of users. If you're a large public site without login, it's hard. Some of the better techniques are in-browser challenges (JS, CAPTCHA, etc.), but it's conceivable that with enough endpoints with real browsers and real humans on them, these could be defeated.


GitHub seems to have done just this. Both attack URLs return

>alert("WARNING: malicious javascript detected on this domain")

They can probably serve that without hitting their database servers.


A good warning sign for companies that keep their codebase only on Github, which seems more common to me nowadays. If you run your own server, at least you can physically restrict access to the local network only.


Why though? The whole point of git is that it's distributed source control - you _always_ have a local copy of the source and the history. That's one of the biggest wins of using git rather than centralised source control in the first place.


But you don't necessarily have an up-to-date copy. If you had a local mirror, you could queue everyone's commits until the github server becomes available again.


> But you don't necessarily have an up to date copy.

You would if you designed your build process that way. I think it's a good idea to eliminate all third party build-time dependencies. In practice this means keeping a Git clone of everything to use for official builds, or anything that can't risk a third-party being unavailable when you need it most.


Exactly! If you have a sane build process everything is always synchronized anyway.


You could set one up in approximately no time at all using your local copies.

I mean, having a proper local master backup is a great idea, but you could survive without it in a case like this. Just set up a new repository somewhere everyone can get to it and have everyone push to it.


You have your own local branches...


If people have their whole workflow in git, kernel-dev style, sure. But the reality is that for a lot of people their workflow is github. Leaving them with raw git is equivalent to leaving them with nothing.


Then perhaps we need to make workflows that don't rely on Github a lot easier.

https://github.com/toolmantim/bananajour is pretty nifty.

I've worked at places with enormous enterprisey SVN checkouts that might take an hour or two to slowly download off the internet even though the developer at the next desk may have the exact same bunch of bits available without having to go all the way out to the internet and back. Just solving that problem with better Git tooling could mean a better experience than the current re-centralised decentralised version control...


Well yes, but if you need to set up the codebase on another computer, for example, or are very used to using Github for some "extra things" like browsing and following commits, it will disrupt the workflow.


I don't know. If you're a really small shop the odds are frankly higher that poor administration will lead to your internal servers going down than that your (or Github's, really) server is down because of an attack by a state actor or someone with similar resources.


Internal servers might theoretically be under your control, but in practice your sysadmin team probably isn't better than Github's and you're not going to have 100% uptime. If you're spending time and effort doing stuff in-house, be sure you're getting real value and not just the feeling of being in charge.


No, but even if your sysadmin team is crap and you have a 95% uptime, that's a 95% chance that your solution will be a backup on the fairly rare occasions that github is down. Unless you have some reason to have your downtime correlated with github downtime.


If you just want a failover option, github + bitbucket is quite possibly more reliable and/or cheaper than github + internal.


That's possible, but since internet connectivity isn't 100% reliable, it's quite likely that bitbucket and github are effectively down at the same time.


No.

Our internal source control server is more reliable than our local shared ethernet with cable backup combined with the reliability of github.

We've had 43 minutes (scheduled and unscheduled combined) downtime in the last 11 years across three VCS technologies hosting internally.

If it was github and remote, we're talking about 7 days of downtime in the last 2 years alone.

We're definitely in charge and it has consumed virtually no admin effort in that time. The VCS server currently has an uptime of 3 years, 5 days, 20 hours and 38 minutes. It's also shifting 10mbit peaking at 200mbits pretty much all day every day 24/7/365.

(CentOS 6, SVN, HP DL380 for ref)


This is wildly uncommon. Borderline unheard of. In every job I've had, internally managed services failed with unscheduled outages at least monthly, sometimes for multiple days at a time. 3 years of uptime? If I were hiring system administrators and a candidate told me they had achieved that, I would just think they were lying.


We're very good at what we do and plan carefully. No joke. That's it. 2-4s of downtime for a 40GB svn repo migration is what we aim for.

To be fair we haven't patched that box's kernel since it was rebooted but we really don't have to bounce kit very often. Most updates are handled through service restarts. Only updates that are an attack vector internally are actually applied too.

As for the usual problems of power, we have battery and generator backed support. Redundant NICs on two VLANs and redundant power supplies on two feeds and a RAID array obviously.

We also have a large memcache deployment that hasn't been restarted in that timescale as well. Even the memcache processes are that old. Works wonderfully.

Also CentOS is pretty damn stable.

If this was public facing it would be a slightly different story as we'd be patching the kernel too.


Your systems administrators in every job you've had should be fired. There is nothing particularly astounding about three years of uptime from a correctly-administered internal system.


Yeah, frankly, for most VCSs it's pretty hard to achieve an uptime as poor as github's locally (1). There's just not much that can go wrong; even doing virtually no maintenance you'd do better, simply because github is hosted elsewhere and affected by all the vagaries of the internet.

(1): by uptime I mean time up during office hours under the assumption that dev workstations can run. There's no point in measuring uptime during a power-outage...


I've worked at a place where our public web site would go down for more than a day at a time, more than once a month... I think I'll take my chances with Github.


There is opportunity cost for this. How much does it cost to run a system like that?


Over 5 years, probably about the same ($3000)


I'm not sure if I understand what you're saying.

They're using git, aren't they? Obviously you miss the web interface for issues, pull requests, etc., but if you don't have git repositories distributed across your own infrastructure, you're doing it wrong.


And what about when your own server hosting the repos is down?

The problem is not having a third-party host; it's having no redundancy.

We use this in our capistrano tasks:

    set :repository,  ( ENV[ 'FALLBACK' ] ? fallback_url : github_url )
If github is down, we just push to that secondary repo (which is indeed on one of our servers) and process business as usual.


If the reason you're using github is its tools and (nice) interface, why not start your deploy by syncing github to your private repo, then have capistrano always pull from there? Fewer code paths to test and no need to define env[FALLBACK]. Also, it's probably faster.


A simpler way is to just define two remotes to push and pull from under a single name, say origin. If one is up, you'll have no real problem. No need for extra code paths to test or fallback machinery either. Just make git use both remotes. Linus uses git that way.

http://marc.info/?l=git&m=116231242118202&w=2
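For reference, `git remote set-url --add origin <url>` produces a remote with several `url` lines in `.git/config` (hostnames below are placeholders):

```ini
[remote "origin"]
    url = git@github.com:example/project.git
    url = git@backup.example.com:example/project.git
    fetch = +refs/heads/*:refs/remotes/origin/*
```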


Nice, I didn't know one could have multiple urls for the same remote.

This post does not specify what happens on a `git fetch --all`. I suppose it takes the first one, and you have to specify other remote entries if you want to pinpoint which remote repo to use?

Note that the deploy I mention will do a fetch, so this solution does not replace the fallback variable.


No idea about `git fetch --all`, but for a normal fetch it just iterates: if the first one fails, it goes to number 2 in the list, etc.

Past that I've not used it that often outside of having a cheap cluster solution in a way. Try it out and let me know!


The fallback repo is used about as often as its name suggests :)

We don't use github only for issues and PRs, though, but also for repo hooks: pushing to a branch triggers a build on our CI, which then deploys our staging env if successful.

Of course, we could do just as well with git hooks and writing code to do the web push ourselves. I just can't see any reason to migrate to that.


Today when I deployed, a dependency from NPM was unavailable due to Github being spotty, and somehow the deploy process let the app start without all its dependencies, causing it to 502 everywhere.

I guess I'll have to look into what ElasticBeanstalk is doing.


Or run something like https://toranproxy.com/ if you use composer, and also mirror your code remotely.


First of all, physically restricting access to a local network pretty much negates the benefits of using Git. Ignoring that, though..

I don't think there are many companies who have the depth of info security knowledge that Github can draw upon. You might believe that running your own server and locking it down to your local network (as best your team can do that) is better, but I'd rather trust Github even if that means my code is available online. While the chance of being victim to a non-specific attack is far higher (Github is a much bigger target; I'm affected if they're attacked), the chance of someone targeting my code and actually getting it are far, far lower because Github is better at making things secure than I am, and they have people who are paid to make sure things stay that way.

A policy of only having your code on Github has its flaws, but making sure it's secure isn't one of them.


> I don't think there are many companies who have the depth of info security knowledge that Github can draw upon

That's true, and I believe that as well. However, GitHub also presents a much larger attack surface than any single company, and it's safe to assume that once someone is in, they're going to get _everything_. Apart from that, you're also vulnerable to disgruntled GitHub employees accessing your data, and to the inherent vulnerability of having a remotely accessible repo in the first place.


I thought Google protected your data from employees. Can't github do the same?


Just because they have a policy against these things doesn't mean they don't happen:

http://www.wired.com/2010/09/google-spy/

>Google acknowledged Wednesday that two employees have been terminated after being caught in separate incidents allegedly spying on user e-mails and chats.

>David Barksdale, 27, was fired in July after he reportedly accessed the communications of at least four minors with Google accounts, spying on Google Voice call logs, chat transcripts and contact lists

>In the case of one 15-year-old boy Barksdale met through a technology group in Seattle, Washington, he allegedly tapped into the boy’s Google Voice call logs after the boy refused to tell him the name of his new girlfriend. Barksdale then reportedly taunted the boy with threats to call the girl.

>Barksdale also allegedly accessed contact lists and chat transcripts of account holders and, after one teen blocked him from his Gtalk buddy list, reversed the block. A source told Gawker that Barksdale’s intent didn’t appear to be to prey on minors for sexual purposes, but simply to goad them and impress them with his level of access and power.


It makes sense that Google protects your data from most employees, but there's always a core of employees that have everything accessible. It's probably a small operational core in Google (still probably way bigger than the entire headcount of Github).


It depends. I know that old-school telcos use security vetting for people with wide access to systems; this is DV (Developed Vetting), or TS in American usage.

And our internal security team (BT Security) was bad news if you were investigated; they have a ferocious reputation.


> First of all, physically restricting access to a local network pretty much makes the benefits of using Git redundant.

No, it doesn't. Why should it?


> "I don't think there are many companies who have the depth of info security knowledge that Github can draw upon."

I recall at least two vulnerabilities that GitHub was exposed to. The mass-assignment one from Rails (which they didn't fix until after they were the poster-child for it), and the cross-site one, which prompted them to use github.io.

I suspect their security teams are better now but you're still taking it on faith. By using 3rd-party services, you are increasing your exposure, not diminishing it. Someone who wants to attack you specifically will do so regardless.


Parent post is not concerned with security, but with availability.


Availability is a security concern. Not as important as confidentiality and integrity, perhaps, but still within the security remit.


GitHub enterprise


If only git were a decentralized RCS, you could have both.


Because GitHub's uptime of 99.9912% isn't good enough for you? Even during this attack we were able to push and pull our projects off and on. https://status.github.com/graphs/past_month


We use bower and npm for our project. Every couple of months github is under attack by a DDoS or not working correctly, leaving us with broken deploy scripts. What is the best way to fix this? We don't like the idea of committing the node_modules or bower_components folder. Is there a tool which will cache the npm and bower sources so they only have to be downloaded if something changes?


You could install something like Angry Caching Proxy (https://www.npmjs.com/package/angry-caching-proxy) to cache commonly downloaded packages in your local network. That should also please whoever's paying for the NPM repository.

Another thing I used to do was have a simple script that tried to do npm install and, if that failed, ran npm install again using the NPM Europe mirror - although I see now that one's been deprecated because NPM's primary repository has become more reliable and is hosted at a CDN with multiple locations across the world, etc.
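That fallback script is only a few lines of shell. A sketch, where the mirror URL is a placeholder (substitute whatever mirror or internal registry you actually trust):

```shell
#!/bin/sh
# Try the primary npm registry first; fall back to a mirror only if that fails.
# The mirror URL is a placeholder - point it at a registry you actually use.
PRIMARY="https://registry.npmjs.org/"
MIRROR="https://registry.npmjs.example-mirror.org/"

install_deps() {
    npm install --registry "$PRIMARY" || npm install --registry "$MIRROR"
}
```

With this, a flaky primary registry degrades to a slower install instead of a broken deploy.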


Another option is Sinopia (https://www.npmjs.com/package/sinopia), which is a private npm repository server. After installing it, you'll have to change the npm registry URL to your new private server, and any npm requests (install, update, etc.) will go through your private server and fall back to the official registry if a package is unavailable.
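For reference, a rough sketch of that setup (4873 is Sinopia's default port; global installs may need sudo on your machine):

```shell
# Install and start Sinopia, then point npm at it. Packages not cached locally
# are fetched from the official registry and stored on disk for next time.
npm install -g sinopia
sinopia &                                  # listens on http://localhost:4873 by default
npm config set registry http://localhost:4873/
# To switch back later:
#   npm config set registry https://registry.npmjs.org/
```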


Sinopia is great. It is a good idea to run your own npm repository anyway; not only do you have some protection from tampered packages (in combination with shrinkwrap), but you can also publish your own minor bugfix updates and private packages, without having to rely on GitHub URLs.

You're also insulated from npm and GitHub downtime - win-win-win.


Committing downloaded packages is not a bad practice. Yes, it can be a bit big, but otherwise I don't see much problem with it. You will be always sure that the installed packages are compatible with each other.


That works great until you have compiled extensions. Then it's misery.


To be sure everything we know works together, we use npm-shrinkwrap files. We don't like it because it makes the git diffs a lot bigger and almost unreadable when you want to review a pull request.


You could commit them to their own repository so they don't taint your main repo. Then use a submodule to pull that repo in to the main repo...
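A rough sketch of that layout, with a hypothetical repository URL:

```shell
# Keep the bulky dependency checkout in its own repository and pull it into
# the main repo as a submodule. The URL below is a placeholder.
git submodule add https://example.com/yourorg/vendored-deps.git node_modules
git commit -m "track vendored dependencies as a submodule"
# Fresh clones then need:
#   git clone --recursive <main-repo-url>
# or, after a plain clone:
#   git submodule update --init
```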


Yes, the way you should do it is with shrinkwrap, to ensure the consistency of your dependencies. As for the actual files: if you depend on them, you should have your own npm repo caching them that you deploy from, and have that mirror the public one.

But for small projects or quick deploys, absolutely just go ahead and commit the modules.
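The shrinkwrap half of that is a short, repeatable flow; a sketch, assuming a standard npm project:

```shell
# Inside the project: resolve dependencies, then pin the exact tree.
npm install                    # install per package.json
npm shrinkwrap                 # writes npm-shrinkwrap.json with exact versions
git add npm-shrinkwrap.json
git commit -m "pin dependency tree"
# Later installs (CI, other devs) reproduce exactly these versions.
```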


Try https://github.com/uber/npm-shrinkwrap, it produces deterministic shrinkwrap files that actually diff properly.


> We don't like the idea of committing the node_modules or bower_components folder

The bower documentation actually advises checking bower_components into source control.

http://stackoverflow.com/a/22328067


We currently only use it for private modules, but [private bower](https://www.npmjs.com/package/private-bower) also allows caching bower packages.


It's much more robust to have a dedicated build step that is separate from your deploy step.

The build step is where you install deps, do compilation, etc. Then it saves the output. The deploy step can just copy all the files from the last known good build to the new/updated servers.

This way you can always deploy a last-known-good release quickly and without external dependencies.

It also lets you take advantage of package caching on the build server so that as long as your deps don't change, you can deploy new releases to old or new servers without hitting any external services.
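A minimal sketch of such a build step, with the dependency install elided so the shape is clear (all paths and names are made up):

```shell
#!/bin/sh
# Build step: produce a timestamped artifact and mark it as the last known
# good build. A real build would also run `npm install --production`, compile
# assets, etc. inside "$out" before the "latest" link is flipped.
build_release() {
    stamp=$(date +%Y%m%d%H%M%S)
    out="releases/build-$stamp"
    mkdir -p "$out"
    cp -R src/. "$out"/
    ln -sfn "build-$stamp" releases/latest   # deploy just copies releases/latest
}
```

Because the link is only flipped after the build finishes, a deploy that runs mid-build still picks up the previous good release.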


I recently rolled out a Sonatype Nexus server at work to handle both NPM and Maven artifacts. Allows us to publish internally, and a combined internal/proxy repo group allows us to serve both our packages and cache global registry stuff. Took ~3 minutes off some of our CI builds.


Don't use npm for deployment. That's even the recommendation from Joyent.


Git submodules are designed for your use case, but I don't know how well they handle it.


Making the not unrealistic assumption that the data is _not_ normally distributed, mean is a useless average here. You should be looking at median, which will be much less distorted by the long tail.


Would be interesting to see the 10/90% too.


This, or if you are evil and know that your competitor has a stupidly-designed build process that depends on GitHub being available - by DDoSing GitHub itself you'll make your competition unable to work.


Some smaller companies depend on github. I think it's stupid, but I know of at least one Ivy League university subgroup who depend on github for everything. Management wanted to outsource everything they could; they just saw developers and IT as cost sinks.


What we really need is a free and open source distributed version control system.


Could github whitelist ip addresses who did commit to protect normal users from DDoS effects (splitting traffic to two sets of servers during DDoS etc)?


Seems like a reasonable strategy to me, but probably very infeasible for an attack already in progress if this tactic weren't planned and ready to go in advance. It would be something I'd investigate post-attack however to see if it's a viable strategy for mitigating future attacks.


maybe the point is to scare people into preemptively blocking Chinese IPs so the Chinese gov doesn't have to swat flies with the Great Firewall. it's good marketing too: "look at all those foreigners who refuse to let us access the free internet!" "anti-Chinese prejudice!" etc etc


If that is their plan it seems to be working, judging by some of the comments here.


They should redirect the attack to the server hosting the script as a friendly encouragement to use encryption.


Still getting lots of instability in spite of the status page.


Why the hell would anyone launch a DDoS attack against GitHub? Seriously, the only point I see in DDoS-ing GitHub is to prove yourself that you can DDoS it.


Given that the DDoS wasn't targeted at "https://github.com/" but rather "https://github.com/greatfire/" and "https://github.com/cn-nytimes/", two projects that can reasonably be described as not being pro Chinese government, it seems that github was targeted for hosting anti-Chinese 'propaganda'.


Thanks for the clarification!


There is the Github you see and use, then there is the Github you don't see. It's a service that allows files/information to be uploaded, downloaded, shared, by pretty much anyone. That's something some governments sadly don't like.


Note that it seems to be attacking two specific projects, according to the details above, one of which is to get around the restrictions of the Great Firewall of China. It's most likely a threat about the advisability of hosting something unapproved-of.


Interestingly enough they have succeeded in pulling those projects down.


I wonder what's the positive effect of sharing the same Internet with the Chinese. I never visit, nor do I know anyone who visits, sites from China.

Major ISPs in the West should definitely consider blackholing all traffic coming from there to avoid DoS attacks, spam, etc. From my experience, this would cut spam by 50% or even more.


So you agree with the Chinese government and think we should just censor the Internet for all Chinese citizens?


Yeah, screw all those Chinese people living overseas who want to keep in contact with their relatives back home. Screw everybody who visits China and doesn't want to be cut off from the world while they're there. Screw the hundreds of billions of dollars in trade to and from China which the internet facilitates.


You my friend really know how to poke.


I'm not sure if that's a good thing or a bad thing, but I'm amused regardless.


I wonder what's the positive effect of sharing the same Internet with the French?

Or the Americans? How about the Scottish? I know I personally want rid of the Irish from the internet, so we should black-hole all traffic from there too while we're at it.

/s


Not that I agree with the parent, but you're being disingenuous. The Americans, Scottish, French, and Irish are not actively trying to wreck the Internet, certain US Congress members notwithstanding. There's a difference between throwing someone out of your store because you don't like the look of them and throwing someone out of your store because they shit on the floor every time they come in.


The Chinese people aren't actively trying to wreck the Internet either. It's more like banning everyone with black skin from your store because a black person once robbed it.


That's a pretty stupid idea. Lots of occidental businesses are expanding (or trying to expand) their market to China. Take a look at this map [1] and you'll understand why China is a massive opportunity.

[1] http://en.wikipedia.org/wiki/List_of_countries_and_dependenc...


Read a little more closely. The way this DDoS was achieved, only endpoints from outside China are being used to carry out the attacks.


I wonder what's the positive effect of sharing the same internet with you? I never visit nor do I even know you


I assume Github could block it themselves if it were that simple.


If someone wouldn't mind explaining, what could the motive possibly be for the Chinese government to be doing this?


Maybe because the github repo is for this website: https://zh.greatfire.org/ ?


I have written a brief summary of issues, as a tweetstorm: https://twitter.com/bitinn/status/581350026217013248


Anyone in Europe having trouble with github this morning? I can't get bower to install and it fails with a "cannot connect to github" error; the status page seems to suggest everything is working.


Did Baidu or its employees say anything about their script being used to attack Github? I'd like to know what they think about it.


The internet is so cool. I hope no one ever fixes the ability for shit to go crazy online. It makes me so happy to be alive in the age of data leaks, ddosses, and malware. It's all the more awesome that it isn't just individuals but entire nation states fucking shit up. This is a really neat attack. I hadn't thought of a MITM being used on a such a massive scale, and to leverage uninfected computers as a botnet is pretty great.

Props to China. 很好!


What I think is most interesting of all is not that a foreign country is attacking a US company, nor that the company has no support from the vast pool of three-letter agencies, but the fact that github as a company designed their architecture in such a way that a sub-site is allowed to eat all of the resources, bringing the whole company down. Kudos to all the engineers and architects with 100k+ salaries over there.


Is it confirmed the Chinese government is behind this? Could it be possible to also be a competing company?


Given Snowden's recent allegations that the Canadian government is engaging in false flag operations (causing havoc and placing the blame on other nations), it could even be another country that just wants to make China look bad. To be honest, I would expect an attack from China to be a little bit more subtle... Of course it could be a double bluff... but then... Basically, it's pretty hard to know what the heck is going on.


Question: why can't `code.jquery.com` help here? Or has the attack moved past this MITM attack?


Apologies - it seems my earlier comment was made during a brief respite.


And we're down again


the page says that everything is operating normally, but the main github.com page doesn't even load for me...

Edit: works again now


def noticed this like 5 minutes ago. DNS completely failed for a second.


oh boy, think we have to go home early today, and it's TGIF.


How is this news ?


[flagged]


You would have made a valid point if git wasn't fully distributed.

But it is. And every dev has a full copy of the repository. This is only a temporary DDoS situation. And you can host your code repository elsewhere. You can even use multiple remotes simultaneously. Not that it's convenient (that's why we use Github), but hell, you could just create some ssh accounts for your colleagues and host the repository on your laptop.

Sorry, who was laughing and why ?
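The multiple-remotes part really is just a couple of commands. A sketch, where the second remote is any host you control (a colleague's machine over ssh, a bare repo on a spare server) and the branch name `master` is assumed:

```shell
#!/bin/sh
# Mirror the current branch to a second remote so no single host (GitHub
# included) is a single point of failure for the code itself.
mirror_to() {
    url=$1
    # Register the remote once, then push the current commit to master there.
    git remote get-url backup >/dev/null 2>&1 || git remote add backup "$url"
    git push -q backup HEAD:refs/heads/master
}
```

Deploy and CI scripts can then fall back to the mirror when the primary host is unreachable.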


The real problem that I can see is not for developing, but all the build and deployment systems that are hard coded to pull in dependencies from github.


Fair point. But now you're not using Github for development.

If your deploy process involves Github, you've actually promoted it to be part of your production/operations infrastructure.

That's where the 'wise and mature tech person' comes into play, I guess ;-)


It's a very good point. I had a slight inconvenience today because I was missing one commit on a branch that I wanted. But I could easily get it from my colleague. If my colleague wasn't available, then I've only missed one commit. It's not a massive issue to just redo the work. When I get access to the commit I'm missing I can even diff my changes against it and choose which one I like better.

Even if Github went away completely, we wouldn't have that much of a problem. It is important to have systems that maintain a copy of old projects, though. It would be easy to think, "Oh it's in Github. I don't have to worry about it." But that's stupid, of course (not that it will stop people from doing it...)


> You would have made a valid point if git wasn't fully distributed. But it is.

Except for Issues and Pull Requests, which are essential to continuing ongoing development.

Mailing lists are better distributed for this than Github.


Wise and mature tech person: "Here's what you can do to mitigate it." (Gives some helpful advice).


Fxxx GFW!


To the guys who gave me -1: If you cannot visit Google, Facebook, Twitter, Pinterest, and many other websites, you will say "Fxxx GFW!" too.


I didn't downvote, but I can understand this getting pushed down as it doesn't bring anything to the discussion.


However, I still prefer http://gitlab.com :-)




