
GitHub censored my research data - doctorshady
https://gwillem.gitlab.io/2016/10/14/github-censored-research-data/
======
gwillem
GL sent me this statement. For the record, I didn't publish vulnerable
systems, I published stores that have malware.

---

Willem,

GitLab has opted to remove the list of servers that you posted in your
snippet. GitLab views the exposure of the vulnerable systems as egregious and
will not abide it. While GitLab reserves the right to take further action, up to
and including termination
([https://about.gitlab.com/terms/](https://about.gitlab.com/terms/)), we have
chosen not to terminate or lock your account.

Please know this decision was not reached lightly and we appreciate your
understanding on the matter.

Regards, GitLab

GitLab Support Team GitLab, Inc.

~~~
zAy0LfpBZLC8mAC
And even if it were (a list of vulnerable systems, that is), why the fuck do
they think that they should censor serious journalism? If you operate a public
venue, then it is an important societal role of journalism to report on it if
that public venue poses a risk to the public. Whether that might also have
negative consequences for the people operating it is completely irrelevant.

~~~
llukas
You mistook "free and accessible" for "public".

You may exercise freedom of speech, but not on a server that belongs to a
private company - it is their right to limit what kind of content they host.

But in essence you are right - companies should exist to benefit society,
but that is not exactly how it works right now.

------
sh1392
We at GitLab believe the author did not responsibly disclose this security
information in a proper manner, and today we removed the list of hosts in
accordance with our terms of service
([https://about.gitlab.com/terms/](https://about.gitlab.com/terms/)).

The author says that he contacted "about 30 merchants directly", but the
published list includes over 1000 merchants. Most merchants were neither
informed nor given a chance to respond in a timely manner. We did not feel
comfortable hosting information that could be construed as an open invitation
for malicious users to exploit.

~~~
eridius
This is completely unacceptable. You're treating this as though the author was
publishing a list of _vulnerabilities_ about sites. That's not what the author
did. The author published a list of sites that are already infected with
malware and thus are dangerous for users to visit. This is a public service
and there is zero expectation of "responsible disclosure" to the sites. The
only thing that disclosing to the sites without telling the public does is
protect the reputation of the site, but there's no expectation for anyone to
try and protect the reputation of sites that are serving malware.

> _We did not feel comfortable hosting information that could be construed as
> an open invitation for malicious users to exploit._

The sites were already exploited. That's the thing. At best I can see the
argument you're making being "we don't want the sites to get exploited a
second time", but that really shouldn't be a concern. The sites were already
exploited, there's nothing left to protect. And publishing a list saying "this
site has malware on it" doesn't actually tell anybody _how_ the site was
vulnerable anyway, unlike disclosing security vulnerabilities which by their
very nature informs people how to take advantage of them.

~~~
Singletoned
I agree. GitLab's reputation has taken a pretty big hit in my opinion. I kind
of expected GitHub's reaction, but had hoped for something better from GitLab
(if nothing else, just to differentiate themselves from their opposition).

~~~
sqldba
Definitely stuffed my opinion of them. I thought they were for the people and
made good decisions.

Nope - corporate suits - and they give us a response which doesn't even make
logical sense. Do they think we're idiots?

------
ddeck
Archive of the list on Gitlab which is now 404:

[https://web.archive.org/web/20161014133252/https://gitlab.co...](https://web.archive.org/web/20161014133252/https://gitlab.com/gwillem/public-snippets/snippets/28813)

~~~
eth0up
A pastebin link with spaces converted to newlines, apparently for purposes of
copy/paste: [http://pastebin.com/rYqEeuNm](http://pastebin.com/rYqEeuNm)
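
For anyone who wants to redo that conversion on a local copy, a minimal sketch in Python. The function name and the sample hostnames are made up for illustration; nothing here is taken from the pastebin itself.

```python
def one_host_per_line(blob: str) -> str:
    """Turn a whitespace-separated list of hosts into one host per line."""
    # str.split() with no arguments splits on any run of whitespace
    # and drops empty strings, so repeated spaces are harmless.
    return "\n".join(blob.split())


if __name__ == "__main__":
    # Placeholder hostnames, not entries from the real list.
    print(one_host_per_line("shop-a.example shop-b.example  shop-c.example"))
```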

------
Animats
Do this kind of thing on your own domain.

I have a list of major sites with currently active phishing pages.[1] This is
basically a join of PhishTank and DMOZ. Nobody seems to be upset by that.

Google is at the top of the list because of their hosting business. It's not
just Google Sites. You can put a web site in a Google Spreadsheet cell, which
Google doesn't seem to check as a possible phishing site.

If you host for others, or offer a URL shortening service, you need automated
checking against all available phishing lists or you will be exploited.

[1]
[http://sitetruth.com/reports/phishes.html](http://sitetruth.com/reports/phishes.html)
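
The "join" mentioned above could look roughly like the following Python sketch. It assumes the phishing feed has already been reduced to a plain list of URLs and the directory to a plain list of domains; the real PhishTank and DMOZ formats are not modelled, and `base_domain` and `phishes_by_site` are hypothetical names.

```python
from collections import Counter
from urllib.parse import urlsplit


def base_domain(host: str) -> str:
    """Crude registered-domain guess: keep the last two labels.

    This is wrong for suffixes like .co.uk; a real tool would consult
    the Public Suffix List instead.
    """
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def phishes_by_site(phish_urls, directory_domains):
    """Count phishing URLs per listed site (an inner join on domain)."""
    listed = {base_domain(d) for d in directory_domains}
    counts = Counter()
    for url in phish_urls:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if host is None:
            continue  # not an absolute URL
        domain = base_domain(host)
        if domain in listed:
            counts[domain] += 1
    # Sites with the most active phishing pages come first.
    return counts.most_common()
```

Grouping by base domain rather than full hostname is what puts a large hosting provider at the top: every subdomain and hosted page counts against the same parent.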

~~~
bsder
Okay, so assume he hosted the list himself and is now DDoS'd.

Now what? I'll give you a budget of $100 a year.

~~~
giancarlostoro
Well there is Frantech / BuyVM. We had a VM we used as a proxy for about $15 a
year + the $3 IP DDoS filtering. We had several proxies on top of our
Limestone Networks server (they didn't offer protection at the time - still
this is cheaper).

Game servers become hot targets for kids who just don't care and will do
anything to get your server taken down, whether it's competition, a user who
got banned, or just about any other reason they can come up with.

Edit:

Forgot to mention the free CloudFlare option as well (unless they stopped
doing this, haven't had to deal with these things in a few years, but I have a
feeling it's still about the same, there's likely even more offers now out
there).

------
jrochkind1
> I understand that Github doesn’t have the resources to investigate each and
> every DMCA notice.

The DMCA as written really encourages no investigation whatsoever on the part
of the service provider, and this is pretty much how everyone acts. File a
counter-notice with the service provider if you don't think your content
violates anyone's copyright.

In this case, if Github took it down because of a DMCA notice, I think Github
actually behaved _better_ than Gitlab. Github is simply following DMCA, if you
file a counter-notice, they'll probably restore it -- if they don't, and say
it's not an issue of copyright, it's just that they don't want to host your
material, then at that point they'll be behaving similarly to Gitlab. Gitlab
did not take it down because of a DMCA notice, they took it down because they
decided it was 'egregious' and they just didn't want to host it.

[https://help.github.com/articles/guide-to-submitting-a-dmca-...](https://help.github.com/articles/guide-to-submitting-a-dmca-counter-notice/)

I can't find any gitlab docs on filing a DMCA counter notice. Their DMCA
policy at [https://about.gitlab.com/dmca/](https://about.gitlab.com/dmca/) is
short and solely targeted at those claiming infringement; there is no
description of how to file a counter-notice.

In this case, I think github wins. The terrible parts of github's
counter-notice policy (10-14 days until your content comes back) are part of
the DMCA law. Take it up with your congresspeople.
[http://io9.gizmodo.com/the-dmca-how-it-works-and-how-its-abu...](http://io9.gizmodo.com/the-dmca-how-it-works-and-how-its-abused-1616830093)

However, reading OP again -- it's not clear to me that Github took it down
because of DMCA. They may simply be acting exactly like Gitlab, taking it down
because they don't want to host it, unrelated to DMCA. But I wanted to clear
up some things about the DMCA, since OP mentioned it.

~~~
syshum
>>>it's not clear to me that Github took it down because of DMCA.

Exactly, there seems to be a common misunderstanding that if anything is taken
down then it is because of the DMCA. There are a number of ways content may be
removed from a platform, be it GitHub, Facebook, YouTube, etc. Not all of it is
DMCA.

In fact, for large platforms that offer takedown processes outside of the DMCA,
I would say the vast majority is not. For example, anything taken down via
ContentID on YouTube is NOT a DMCA takedown.

------
anondon
How exactly does publishing a list of malware-infected stores fall under the
DMCA? I always thought DMCA was meant to be for copyright infringement cases.

I didn't see the list, but did it by any chance contain the logos of the
online stores? If it did, the DMCA notices make sense.

~~~
notatoad
It falls under the DMCA because the DMCA mandates a takedown-first,
check-the-validity-later workflow. To be in compliance, a site _must_ respond to any
DMCA takedown notice as quickly as reasonably possible, regardless of how
fraudulent it might be.

As long as you're okay being found guilty of perjury by a US court, or have no
plans to enter a US jurisdiction in the near future, you can get any content
you want taken down from any site that's required to follow the DMCA.

~~~
kevin_thibedeau
DMCA is a copyright law. It requires a takedown for alleged _copyright_
violation. That doesn't apply here. The stores don't have copyright over their
domain names on a list.

~~~
sveiss
Yes. So to get something taken down, all you have to do is _allege_ copyright
violation. If you're not concerned about perjury in the US courts, the claim
doesn't have to have merit.

Service providers have a choice:

a) follow the DMCA takedown process and be shielded from all liability,
whether or not the claim has actual merit

b) evaluate each takedown notice to decide if the material falls within the
scope of copyright, ignore takedowns where _in the service provider's
opinion_ the material is outside the scope of copyright, and should it go to
court, be forced to defend that position on its merits instead of having an
automatic liability shield

What sensible service provider is going to choose option b?

------
gwillem
Gitlab CEO just called me and apologized, will restore data shortly.

I am personally very sorry that GL got in a bad light here. They had
misinterpreted my data and have acknowledged that. For comparison, I have
heard nothing from GH over the last two days.

Gitlab, you rock.

~~~
turncharlie
FWIW, if GitLab sees this: Well done. We were planning on buying their hosted
service (GitHost) for my employer next week and after this story initially
broke I was going to table that and use something else. This has restored our
faith in GitLab as a company.

~~~
sytse
Thanks Charlie, glad to hear that. For reference our response blog post is
[https://about.gitlab.com/2016/10/15/gitlab-reinstates-list-o...](https://about.gitlab.com/2016/10/15/gitlab-reinstates-list-of-servers-that-have-malware/)

------
sqldba
GitLab and GitHub are both pretty active on HN. I look forward to their
response - where is it already?!

I'm especially disappointed by GL. GH is already too big to care about such
things.

------
inlineint
There is a service
[http://www.cryptograffiti.info/](http://www.cryptograffiti.info/) which can
be used for posting sensitive information that should not be removed or
hidden. It lets you write a message and store it in the Bitcoin blockchain as a
transaction. It costs about 0.0015 BTC or $1 per kB. Large files can be posted
as magnet links to torrents containing them.

Even if the service's site were shut down, everyone would still be able to
obtain the transaction from its hash using any Bitcoin client or blockchain
explorer, convert it to ASCII and read the text.

It is worth signing all messages posted that way with GPG, so that you can
post updates and others can verify authorship.
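
The "convert it to ASCII" step might look like the Python sketch below. It assumes you have already pulled the message bytes out of the transaction as hex strings; how the service packs them into outputs is not covered here, and treating trailing zero bytes as padding is a guess. `payload_to_text` is a hypothetical helper name.

```python
def payload_to_text(hex_chunks):
    """Concatenate hex-encoded chunks and render them as ASCII text."""
    raw = b"".join(bytes.fromhex(chunk) for chunk in hex_chunks)
    # Assumption: trailing zero bytes are padding, not message content.
    raw = raw.rstrip(b"\x00")
    # Keep printable ASCII plus tab/newline/carriage return; show
    # everything else as "?" so binary junk stays visible.
    return "".join(
        chr(b) if 32 <= b < 127 or b in (9, 10, 13) else "?"
        for b in raw
    )
```

Verifying a GPG signature on the decoded text would be a separate step with your usual GPG tooling.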

------
yincrash
After briefly looking at
[https://github.com/github/dmca/tree/master/2016](https://github.com/github/dmca/tree/master/2016)
there doesn't appear to be any DMCA request for gwillem's content, so maybe it
wasn't removed through the DMCA process?

------
eeeeeeeeeeeee
Definitely feels like a bad interpretation on Gitlab's part, but not done out
of malice.

The person was not exposing sites that nobody previously knew about -- the
sites were already compromised, there is nothing to compromise again except
maybe having more than one attacker in your compromised account. The damage is
already done, though.

These are likely web applications that were not kept up to date so the
responsible security disclosure already happened when it was reported for
WordPress/Drupal/Joomla. It is the site owner's responsibility to pay attention
to those security disclosures, which they likely failed to do.

And those compromised sites, in my experience, are usually attacking and
infecting other sites and servers on the Internet. That makes them a public
nuisance and so public disclosure is necessary so they can be appropriately
blocked/isolated.

------
mattip
Discussion of origin of the list here

[https://news.ycombinator.com/item?id=12707860](https://news.ycombinator.com/item?id=12707860)

------
x1798DE
Article doesn't say that gitlab censored the list, though the gitlab link is a
404.

Also, are they actually using DMCA to get these lists taken down? If so, isn't
there some penalty for filing a false DMCA?

~~~
vSanjo
The first comment mentions it was taken down about an hour ago.

~~~
x1798DE
Hmm. I didn't (still don't) see any comments. Maybe it's a mobile thing.

------
cweiske2
List is at [http://p.cweiske.de/366](http://p.cweiske.de/366)

------
ptman
For once, the Gitlab employees on HN don't comment on a Gitlab-related story.

~~~
icebraining
Hum...
[https://news.ycombinator.com/item?id=12712996](https://news.ycombinator.com/item?id=12712996)

~~~
sli
It's a new account that offers no proof that they are legitimately speaking on
behalf of Gitlab.

------
duncan_bayne
Perhaps distributing this list is a use case for IPFS?

[https://ipfs.io/docs/getting-started/](https://ipfs.io/docs/getting-started/)

------
Pyxl101
> I understand that Github doesn’t have the resources to investigate each and
> every DMCA notice. However, it still took me by surprise that Github censors
> data so easily.

Send a counter-notification asserting that the data is not under copyright and
have it put back up. Assuming this is really DMCA.

------
ChuckMcM
So it seems the real bug here is that a site that is hosting malware is doing
so because it's actually vulnerable to being hacked, was hacked, and had
malware installed. So posting the site name identifies a vulnerable site (which
is wrong), but not posting it stops informing people that those sites have
malware on them (which is an issue as well).

That is quite the Catch-22. And of course many of the site owners are
clueless and don't even know how to patch or fix their systems.

My, isn't that a mess?

~~~
jcoffland
It's not wrong to expose negligence which endangers others.

------
pdq
Isn't pastebin the correct site to post lists like this?

~~~
r3bl
Not if you want to update it.

Also, Pastebin has some shady practices. One example: they offer HTTPS support
as a premium feature.

~~~
oolongCat
I am sorry, but I fail to see how offering https support as a premium feature
could be considered "shady".

~~~
chill1
All web traffic should be encrypted, regardless of purpose. This increases the
work that nation-state level adversaries must do to effectively spy on the
population. And it's cheap and easy to do these days.

~~~
oolongCat
It's not a question of whether it's easy to do or not. Yes, I prefer an
encrypted connection over one that is not.

BUT! The point here is that the OP claims it is a shady practice to provide
https only to paying customers.

A shady practice is if they take your personal information and sell it to
another without telling you. Shady is when companies lie to their customers.

And this situation is not.

~~~
chill1
Fair enough. I don't think this is a "shady" practice either. But I think it's
an old, out-dated practice to charge extra money for a feature that should be
on by default.

------
0xmohit
Would be fun to see if the list is censored on Google docs.

~~~
ythl
You should host the list and post a high visibility link (i.e. news article).
You'll probably discover real quick the pressures that caused GitHub and
GitLab to buckle so quickly.

------
r3bl
Isn't it the whole point of GitLab that it's decentralized? As in, you can
roll your own instance and stop worrying about censorship?

I'm pretty sure someone here has a GitLab instance that is willing to share
for this purpose.

~~~
pmontra
And get DDoSed by the malware guys who don't want their victims to know they
are on the list and fix their sites. It would be an altruistic offer but I would
think twice about it.

However it's a problem and we need a solution.

A torrent? Yes but Google won't index it.

A file on S3 wouldn't work because they could just download it as many times
as needed to make the bill skyrocket. Better than a DMCA.

Anything that is unaffected by DDoS, costs no money, can be indexed?

Edit. One more requirement: not easy to censor.

~~~
TACC_2016
Blogspot/Blogger?

~~~
pmontra
DMCA, again. I forgot that constraint.

------
ComodoHacker
Back to the original problem of skimming.

I think the fastest way to get sites fixed is to run a script that crawls
sites in the list, parses their Twitter and posts a warning there with link to
original article.

Can someone help with that?
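
A rough starting point for the "parses their Twitter" step only, in Python. It takes HTML you have already fetched and pulls out linked Twitter handles; crawling and the actual posting are left out, and the regex, the exclusion list and the function name are all assumptions for illustration.

```python
import re

# Matches links such as https://twitter.com/SomeShop or //www.twitter.com/@SomeShop
_TWITTER_LINK = re.compile(
    r"""href=["'](?:https?:)?//(?:www\.)?twitter\.com/@?([A-Za-z0-9_]{1,15})(?=["'/?#])""",
    re.IGNORECASE,
)

# Paths on twitter.com that are not account names (a partial, guessed list).
_NOT_ACCOUNTS = {"share", "intent", "home", "search", "hashtag", "i"}


def twitter_handles(html: str):
    """Return the distinct Twitter handles linked from a page, in page order."""
    seen = []
    for handle in _TWITTER_LINK.findall(html):
        if handle.lower() in _NOT_ACCOUNTS:
            continue
        # Handles are case-insensitive, so compare lowercased.
        if handle.lower() not in (h.lower() for h in seen):
            seen.append(handle)
    return seen
```

Stores without a linked account would simply return an empty list and need some other contact channel.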

------
mankash666
And just like that, we discover how helpless the average Joe is against
corporate money.

Let's crowdfund an AWS S3+CloudFront hosted site. DDoSing that is no easy
feat, and if corps do try it, the logs can prove their complicity, which has
legal implications, I presume.

~~~
Waterluvian
It's an endless cycle. The Man keeps you down, so you throw together resources
and collaborate to have a crowdfunded solution. It becomes successful, grows,
hires some employees to maintain things, keeps growing, and becomes The Man.

------
epalmer
@gwillem thanks for doing this important investigative work.

------
bruce_one
As soon as I read this I assumed it was as a libel prevention method.

I'd be curious whether Gitlab/hub could be held responsible for proving the
accuracy of the claims? (That was my initial assumption as to the reason they
were taken down.)

------
zimbatm
Why is a third party required to publish the list? Couldn't it be hosted on
the blog post itself? That would have the advantage of being able to archive
the whole thing on a single page.

------
cyanbane
To be fair, maybe some automated method took it down and the hub/lab owners
have not vetted the data overall. I hope your list stays up/public personally as long as you
are willing to take responsibility for its upkeep. I wish there was some
format to submit this list (and your responsibility for keeping up with it) to
vendors on the lookout for this kinda stuff (up to them to decide inclusion).

------
blahblah12356
Please post it to pastebin.com or something like that! Put up a torrent. I
want to know which sites so I can steer clear of them.

------
juskrey
What is the problem to publish on your own site?

~~~
jhou2
It would probably get DDOS'ed just as quickly.

~~~
ajmurmann
Just put it behind the free CloudFlare protection.

~~~
eriknstr
>Only limited DDoS protection and mitigation is provided to domains on a free
or Pro plan through "I'm Under Attack" mode. If you are looking for advanced
DDoS protection and mitigation and frequently suffer sizable DDoS attacks,
please consider looking at our Business or Enterprise plans.

[https://support.cloudflare.com/hc/en-us/articles/200170186-D...](https://support.cloudflare.com/hc/en-us/articles/200170186-Does-CloudFlare-offer-DDoS-protection-to-Free-and-Pro-plans-)

------
vSanjo
I don't know enough about these kinds of situations yet to form a reasonable
argument for-or-against. Is what was done considered a kind, favourable thing
for the developers behind those sites or is it something that shouldn't have
been displayed?

~~~
chipperyman573
It doesn't matter what is kind for the devs, it matters that their website has
malware causing their users' data to be compromised.

------
vacri
"Moderated", not "censored". Neither GitHub nor GitLab have stopped the
message going out from outlets other than their own. Would we be comfortable
calling the moderators here on HN "censors"?

------
stepik777
I tried to pay in several of these shops. Most didn't even have the
functionality to pay with card. The only one in my sample where I was able to
get to the payment form had redirected me to the proper payment gateway.

------
akerro
[http://gogsys33repvmfz5.onion/](http://gogsys33repvmfz5.onion/) Free Gogs git
server on Tor.

Also, it will ask for an email on registration, but it isn't verified and no
email is sent.

------
leni536
I wonder how google acts, if you dump the list here:

[https://www.google.com/safebrowsing/report_badware/](https://www.google.com/safebrowsing/report_badware/)

~~~
Svenskunganka
That's covered in the article:

> I have, prior to publication, submitted all URLs and malware samples to
> Google’s Safe Browsing team. They have since only acted upon a small portion
> of the sites.

------
beedogs
Money talks. Unless it's consumers' money being stolen, it seems.

------
rawfan
Now I see. I thought this was a list of stores hacked through the current
vulnerability. This is a list of stores hacked through 1-2y old
vulnerabilities.

------
problems
Post it on a WordPress.com blog or host it on an OVH box or put it behind
CloudFlare. All of these are quite censorship resistant in my experience.

------
inimino
The article states the lists were taken down but does not say why. Perhaps
there will be an explanation forthcoming from Github or the researcher.

------
webjunkie01
The only way this will get fixed is if a script is written to take advantage
of the vulnerabilities and clean the sites affected.

------
blahblah12356
Post it to Pastebin.com or make a torrent!

------
qwertyuiop924
Well, as an absolute last resort, you can use Freenet or Dat to store your
list.

------
franciscop
Have you thought about contacting Adblockers or even Browsers? They might be
interested in this data to block the sites for the average Joe.

~~~
halter73
From the article:

7. I have, prior to publication, submitted all URLs and malware samples to
Google’s Safe Browsing team. They have since only acted upon a small portion
of the sites.

~~~
franciscop
When I read that I assumed it was the Google Search team, not the Google Chrome team.

------
lucaspiller
I'm kind of with Gitlab on this one, just publishing a list of broken sites
isn't going to help them get fixed. Most of the owners probably barely know
the Googles from the Facebooks, so even if you email them saying 'you have
this JavaScript thing that's bad' they won't understand and will blow you off.

OP doesn't go into details of how they check the stores, but I'd assume they
have some sort of script as they checked 255k. If that's the case it would be
trivial to send an automated email if malware is detected, and include links
explaining how to fix it.

It won't resolve everything but it's a lot nicer than naming&shaming
businesses who have effectively done nothing wrong. What I mean is they
probably hired a developer or team to build their website, and assumed that
they would build a secure website - they didn't go out purposely and find
someone to build them a site that would be hacked.

~~~
bsder
> I'm kind of with Gitlab on this one, just publishing a list of broken sites
> isn't going to help them get fixed.

So the malware should be allowed to continue stealing credit card numbers just
because the site owners don't know any better?

Is that really a position you wish to defend?

~~~
lucaspiller
No I don't agree that it should be allowed to continue, but how is
naming&shaming people going to fix anything? Nothing is going to come of this,
other than maybe some other hackers will see them as weak targets. Do you
expect this list to be read on prime time CNN or something? People who want to
buy something online aren't going to search through GitLab to check if the
site has been hacked (maybe they should though), they just look for the green
padlock and assume it's safe.

As I said, and GitLab suggested, OP should at least contact them. If you
contact them and they say they won't do anything, now that's a different
story...

~~~
flopto
> So far, between Oct 10 and Oct 14, 631 stores have been fixed. [0]

So it sounds like it does a pretty good job of fixing things.

[0] [https://gwillem.gitlab.io/2016/10/14/github-censored-researc...](https://gwillem.gitlab.io/2016/10/14/github-censored-research-data/)

------
formula_ninguna
The malicious code on those websites isn't your business guys. If their
owners wish not to respond and not fix it -- they have a right to do so. Why
are you all so anxious? It has nothing to do with you all. Just don't buy from
them, that is simple.

Needless to say, I've seen most of those websites for the 1st time.

The world is unfair? No, it's fair and this proves that it is fair.

~~~
orf
What a stupid comment. Here is a list of thousands of sites that people other
than you _do_ visit and buy from, causing their cards to be skimmed, and your
response is 'stop being so uptight'?

Imagine if this is instead a list of real world ATMs. The correct response is
not "well I've never seen these ATMs before just don't use them, duh", it's
"how the fuck can this be allowed to happen and how do we fix it".

