RawGit is now in a sunset phase and will soon shut down (rawgit.com)
248 points by marvindanig 37 days ago | 84 comments


"Unfortunately, RawGit has also become an attractive distribution mechanism for malware. RawGit was meant to improve people's lives, but jerks are increasingly using it to hurt people.

Since I have almost no time to devote to fighting malware and abuse on RawGit (and since it would be no fun even if I did have the time), I feel the responsible thing to do is to shut it down. I would rather kill RawGit than watch it be used to hurt people."

Where does one draw the line? The internet is used for that, so should we shut down the internet? Is the content of the internet the responsibility of its creators, or operators?

I completely sympathize with this position for a one-person service, but I think the real problem is that we still haven't figured out how to fix incentives so we can distribute the work of keeping the jerks in line. On the internet, the jerks are winning.

> On the internet, the jerks are winning.

Arguably, it's not just on the internet. But I try to be optimistic.

Sure. I guess I'm old enough I still see the internet as a thing that we could hypothetically choose to shut down (without killing half the population) -- unlike, say, agriculture and civilization. Maybe that's not realistic, though.

I am having a similar problem with npm. I guess people decided to upload warez to npm and it's affecting my tool that reads from npm. This is why we can't have nice things :(


I figured this day was coming. Excellent writeup. Sorry to see it go. Thanks for all the raw fish.

Sounds like the cost of hosting this was not insignificant. Curious how it was financed.


To answer my own question, I found the FAQ page, which is also great:


Sounds like donations helped (maybe). Oh and Stackpath. Kudos to them, too.

Hi! I'm the guy behind RawGit.

I've actually never accepted donations. I paid the meager cost of the $10/month DigitalOcean droplet for the origin server, and StackPath (formerly MaxCDN) sponsored the CDN, which handled the majority of RawGit's traffic.

Once, I worked out how much the CDN bill would have been if StackPath hadn't sponsored me and my head nearly exploded. So yeah, I definitely couldn't have done it without them!

How much would the bill have been? Rough estimate for 176 TB per month: $3,500 to $4,200.

It was a while ago now so my memory is hazy, but that sounds about right.
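For anyone who wants to sanity-check that estimate, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 1000 GB) and a flat per-GB price, neither of which is stated in the thread; the function names are made up for illustration.

```python
# Back-of-the-envelope check of the "176 TB/month costs $3,500 to $4,200" estimate.
# Assumptions (not from the thread): 1 TB = 1000 GB, and a flat per-GB price.

TB_PER_MONTH = 176
GB_PER_TB = 1000


def implied_rate_per_gb(monthly_cost_usd, tb_per_month=TB_PER_MONTH):
    """Per-GB price implied by a monthly bill at the given traffic volume."""
    return monthly_cost_usd / (tb_per_month * GB_PER_TB)


def annual_cost(monthly_cost_usd):
    """Twelve months at the same monthly bill."""
    return monthly_cost_usd * 12


if __name__ == "__main__":
    for monthly in (3500, 4200):
        print(monthly, round(implied_rate_per_gb(monthly), 4), annual_cost(monthly))
```

Under those assumptions the estimate works out to roughly two cents per GB, or about $42,000 to $50,400 per year.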

It was a pleasure working with you on this Ryan. You did an amazing job.

Thanks Justin! Likewise.

Did you consider trying to sell the thing to GitHub or GitLab? That many users must be worth something I'd imagine.

MaxCDN must be very happy with all the data they gather from your project... I stopped using all those CDN-based JS delivery services when I saw all the cookies they were adding to my users' browsers.

StackPath doesn't add any cookies to cdn.rawgit.com responses. Here are the complete headers from a sample response:

  HTTP/1.1 200 OK
  Date: Fri, 12 Oct 2018 17:56:30 GMT
  Content-Type: application/json;charset=utf-8
  Transfer-Encoding: chunked
  Connection: close
  X-Content-Type-Options: nosniff
  X-Robots-Tag: none
  Access-Control-Allow-Origin: *
  ETag: "9c153866d0cad7024c0f31eb7b65be11582a6737"
  Cache-Control: max-age=86400
  Vary: Accept-Encoding
  RawGit-Cache-Status: HIT
  Server: NetDNA-cache/2.2
  Strict-Transport-Security: max-age=31536000; preload
  X-Cache: HIT

The ETags are identical to those from GitHub:

$ curl -I https://cdn.rawgit.com/sindresorhus/awesome/master/readme.md

etag: "1218d241f6c717fa83d0f1afa64809d6ce7451c8"

$ curl -I https://raw.githubusercontent.com/sindresorhus/awesome/maste...

ETag: "1218d241f6c717fa83d0f1afa64809d6ce7451c8"
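The same comparison can be scripted. This is only a sketch: `get_etag`, `same_etag`, and `head` are names invented here, and `head` needs network access, so it is defined but not called.

```python
from urllib.request import Request, urlopen


def get_etag(headers):
    """Return the ETag value from a header mapping, ignoring header-name case."""
    for name, value in headers.items():
        if name.lower() == "etag":
            return value.strip()
    return None


def same_etag(headers_a, headers_b):
    """True only if both responses carry an ETag and the values match exactly."""
    a, b = get_etag(headers_a), get_etag(headers_b)
    return a is not None and a == b


def head(url):
    """Issue a HEAD request (like `curl -I`) and return the response headers."""
    with urlopen(Request(url, method="HEAD")) as resp:
        return dict(resp.headers.items())
```

The header-name lookup is case-insensitive because the two servers above return `etag` and `ETag` respectively.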

> It's super nice of you to offer, but I don't need any donations at this time. RawGit's server costs are minimal, and the lovely people at StackPath provide RawGit's CDN service free of charge. Thank you though!

RawGit was an amazing service. It's truly the end of an era in which you could render the HTML page for the source code you were viewing just by changing the domain.

> "Unfortunately, RawGit has also become an attractive distribution mechanism for malware. RawGit was meant to improve people's lives, but jerks are increasingly using it to hurt people."

This has been my experience too, and I'm now convinced that hosting any online services can only be done by corporations whose business model entirely relies on it, and that a small full-time staff is needed to combat malicious users effectively and keep the service usable for everyone else.

This is something the decentralization fans need to take into account too: if successful, ipfs will be full of malware being distributed from compromised routers etc.

This is Service Management 101 and why “free” services don’t exist offline. There is always a catch.

A corporation whose entire business model relies on it is actively incentivized NOT to combat spam any more than absolutely required to minimize its own monetary losses as a direct result of the abuse. As long as it isn't held responsible for its users' actions, fighting abuse just eats away at a revenue stream.


It was a really cool project and I think the first to offer this kind of feature. Sad to see it go.

But right now jsDelivr supports both GitHub and npm as source for CDN files. So here is an easy tool for migration https://www.jsdelivr.com/rawgit
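For bulk migration, the URL rewrite can be sketched in a few lines. This assumes RawGit repo URLs of the form `rawgit.com/<user>/<repo>/<ref>/<path>` (as in the curl example elsewhere in this thread) and jsDelivr's `cdn.jsdelivr.net/gh/<user>/<repo>@<ref>/<path>` layout. The function name is made up, Gist URLs are not handled, and a ref containing a slash would be split incorrectly.

```python
from urllib.parse import urlsplit

RAWGIT_HOSTS = {"rawgit.com", "cdn.rawgit.com"}


def rawgit_to_jsdelivr(url):
    """Rewrite a RawGit repo URL to the equivalent jsDelivr GitHub URL."""
    parts = urlsplit(url)
    if parts.hostname not in RAWGIT_HOSTS:
        raise ValueError(f"not a RawGit URL: {url}")
    # Path layout assumed here: /<user>/<repo>/<ref>/<path...>
    segments = parts.path.lstrip("/").split("/", 3)
    if len(segments) < 4 or not all(segments):
        raise ValueError(f"unexpected RawGit path: {parts.path}")
    user, repo, ref, path = segments
    return f"https://cdn.jsdelivr.net/gh/{user}/{repo}@{ref}/{path}"
```

Pinning `<ref>` to a tag or commit hash rather than a branch is the safer choice for anything used in production.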

Not really a full replacement, unfortunately:

> For security reasons, we serve HTML files with Content-Type: text/plain. We recommend using GitHub Pages if this is a problem.

Wish the URLs were a bit cleaner (short hashes, no subdomains, no GH path), but I get that jsDelivr doesn't cater just to GitHub.

There are ~1M RawGit URLs embedded in open source projects on GitHub: https://github.com/search?q=rawgit.com&type=Code I'm curious to see how many there are a year from now.

How many of them will have SRI hashes added? If this domain eventually expires, whoever buys it next can make trillions!

I'm not going to let the domain lapse for exactly this reason.
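For anyone who wants to add SRI hashes to their own embeds in the meantime, here is a minimal sketch of computing one. The function name is made up; the output format (algorithm name, a dash, then the base64-encoded digest) is the one browsers expect in the `integrity` attribute.

```python
import base64
import hashlib


def sri_hash(content: bytes, algorithm: str = "sha384") -> str:
    """Return a Subresource Integrity value such as 'sha384-<base64 digest>'."""
    if algorithm not in {"sha256", "sha384", "sha512"}:
        raise ValueError(f"unsupported SRI algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"
```

The result goes in the `integrity` attribute of the `<script>` or `<link>` tag, alongside `crossorigin="anonymous"`. If the file served from the domain ever changes, the browser refuses to run it.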

This will be interesting... I just did a grep of my local codebases and found 11 instances where CSS or JS resources were being downloaded from rawgit.com.
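If grep isn't handy, a rough equivalent of that search is sketched below. The function name is made up, and it skips only `.git` directories, so vendored dependencies get scanned too.

```python
import os


def find_rawgit_references(root, needle="rawgit.com"):
    """Walk `root` and return (path, line number, line) for each line containing `needle`."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        if needle in line:
                            hits.append((path, lineno, line.strip()))
            except OSError:
                # Unreadable file (permissions, broken symlink): skip it.
                continue
    return hits
```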

I have a feeling when it's turned off, a lot of sites are going to break.

Yeah, I really, really don't want to break things, so I've tried to be very careful about how I'm doing this.

In the shutdown announcement I committed to keeping the site running in sunset mode for at least a year. Hopefully that's plenty of time for everyone who's aware of the shutdown to migrate, but I expect there will be stragglers.

My unofficial, subject-to-change plan for dealing with that is that at the end of the sunset year, if there's still a significant amount of traffic, I'll start throttling requests to make RawGit slower. Hopefully people will notice their websites are slow and will investigate. I'll also try to notify stragglers individually by filing issues against their GitHub repos if possible.

Instead of throttling, you could also consider doing incremental brownouts where you drop requests for the first ten minutes every hour. PyPI did this recently when they phased out TLS 1.0, which worked really well IMO.

I wish more services shuttered this way because you can’t miss it.

I once had a web host generously give me three months to pay a delinquent bill that I missed the emails for. Sadly it just meant I thought things were fine. When they finally shut my service down, my users made me aware within minutes but it was too late.

Thanks for the suggestion! This does seem potentially more effective.

IIRC Google also uses (or used) a similar system for deprecating old APIs.

Start with failing 1% of requests randomly and slowly ramp up from there.
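Both ideas are easy to sketch. The numbers below (ten minutes per hour, a 1% start, one percentage point added per day) are illustrative assumptions, not anything RawGit, PyPI, or Google is known to use, and the function names are made up.

```python
import random
from datetime import datetime


def in_brownout(now, minutes_per_hour=10):
    """Scheduled brownout: True during the first `minutes_per_hour` minutes of every hour."""
    return now.minute < minutes_per_hour


def failure_fraction(days_elapsed, start=0.01, step_per_day=0.01, cap=1.0):
    """Random-failure ramp: begin at `start` and add `step_per_day` each day, up to `cap`."""
    return min(cap, start + step_per_day * max(0, days_elapsed))


def should_fail_randomly(days_elapsed, rng=random):
    """Decide for one request whether to fail it under the ramping scheme."""
    return rng.random() < failure_fraction(days_elapsed)
```

The scheduled version is easier for users to recognize as deliberate, since failures line up with the clock. The random version spreads the pain more evenly but looks more like flakiness.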

Incremental brownouts only work well if there are mechanisms to ensure that your service's users realize that the brownout is a deprecation warning.

By default, pip doesn't show the contents of HTTP error messages [1], so users affected by the brownout would have to take extra steps (using `-v`, visiting the PyPI status page) in order to figure out what was wrong. I think it could easily appear as a networking issue or some other sort of intermittent problem.

There was also no notification of the impending blackout on python.org. [2]

[1]: https://github.com/pypa/packaging-problems/issues/130

[2]: https://lwn.net/Articles/751800/

That's above and beyond what's needed IMO, good on you for going to that length.

In the end the only thing that will fix the broken sites is to cut it off entirely.

Thanks for your project and I'm glad you're able to bring it to a successful conclusion!

This is one of the biggest reasons I want the web to start moving towards content-addresses instead of location-addresses.

It's unreasonable to expect that a service like rawgit would stay online forever. But even if someone else steps in and builds something equivalent, or if sites turn around and start self-hosting, all of those URLs still need to change.

People focus on the flashy parts of DAT and IPFS like, "oh, someone else could host my website." But there's a much more mundane and arguably much more important side of that which is, "One day NPM might have a different URL." Rehosting content is pretty easy, getting sites and dependencies to link to it is very hard.

And it's not just the problem of updating all of your own projects, there are projects that aren't going to be updated. Rgrove is being super nice about all of this, but there are sites that are going to break in a year, and nobody can really do much of anything about it.

Ryan is not going to allow any sites to break. Still, everyone should switch to an alternative.

Until “at least October of 2019”. Who knows what’ll happen after that

I know! :D

…and? :)

And, ultimately, it's not rawgit's job to maintain a website. If you use a free service as part of your deployment it's your job to keep an eye on it in case it shuts down.

Rawgit has provided a fantastic service that's proved very useful and I'm glad it's been available as long as it has.

It is a bit more complex than that. It is incredibly hard to keep an eye on whether a library you use has a dependency which uses such services, and even if they fix it in the next version it may not be fixed in the version you use.

If you're deploying your website and you don't know where assets are coming from then something has gone very, very wrong. You can't just ignore security completely and then blame a CDN provider if things break.

I've never used rawgit for my own projects, but it did make life a lot easier to check out demos and test run examples in JS libraries hosted on github.

Totally understand that the effort required to fight malware is tedious. But instead of shutting it down you may want to try a $5/mo plan. The fee is not for making money; it's more like how Google's Chrome Web Store requires a $5 payment before you can publish your first Chrome extension.

Spammers hate paying it, and I think as soon as money changes hands there is verification and a trail, which makes them nervous too (many forum owners do this as well).

I got the impression the author would rather not deal with the headache anymore, and charging money is trading one set of headaches for another. Building things is fun. Operating things is stressful. I don't blame them.

Given the stats he quoted in the FAQ, it wouldn't be surprising if it started making $10K/mo (just a wild guess), with the malware reduced as a side effect too.

But I guess you're right. Not everyone is driven by money and it would certainly turn a hobby project into a full time job. Sounds fair to shut it down.

It was great while it lasted. I currently have a small extension that depends on it. Oh well, I knew the risks of using a free service so no complaints. Thanks for the free service!

By now you can configure GitHub to publish your master branch as GitHub Pages. So no need to use RawGit either.

The only sad part is all the links that are breaking.

Is there a service that can replace RawGit for Gists, specifically serving them with the correct MIME type? This was a convenient feature for users of https://reviewable.io to hack up a quick custom stylesheet or keyboard shortcuts override. It's not clear to me whether any of the suggested replacements support Gists.

Try replacing "rawgit.com" with "rawgit2.com".

Congratulations, it was a great gift for the web. Nothing lasts forever anyway.

Precisely why I never use free CDNs or stuff like this to serve static content. Just host your own thing. It maybe costs a little bit more but at the end you'll have websites that will run for decades.

RawGit was useful. I wish it had stayed up a little longer. Perhaps someone like Cloudflare could sponsor it?

CDNJS is from Cloudflare, running with what I believe is better DNS and more POPs.

Maybe Red Hat (who use Rawgit a lot) or, for that matter, StackPath want to take over?

Thanks for everything! Are you going to open source the Rawgit codebase? I'd be interested just for educational purposes.

RawGit has been open source from day one: https://github.com/rgrove/rawgit

Oh well, a few hundred of the working answers I posted on Stack Overflow will stop working. Not sure how I can fix them.

Thankfully, it's pretty easy to find the GitHub page for the linked source with a bit of URL surgery.
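A rough sketch of that URL surgery, assuming the RawGit link has the `rawgit.com/<user>/<repo>/<ref>/<path>` shape and that the GitHub page lives at `github.com/<user>/<repo>/blob/<ref>/<path>`. The function name is made up, and refs containing slashes or Gist links would need extra handling.

```python
import re

# Matches both rawgit.com and cdn.rawgit.com repo URLs.
_RAWGIT_URL = re.compile(
    r"^https?://(?:cdn\.)?rawgit\.com/([^/]+)/([^/]+)/([^/]+)/(.+)$"
)


def rawgit_to_github_page(url):
    """Return the GitHub blob page for a RawGit repo URL, or None if it does not match."""
    match = _RAWGIT_URL.match(url)
    if match is None:
        return None
    user, repo, ref, path = match.groups()
    return f"https://github.com/{user}/{repo}/blob/{ref}/{path}"
```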

And SO will probably fix this automatically for all answers.

They might, depending on how widespread it is. It certainly wouldn't be without precedent. Alternately, there might just be a manual effort to fix them all.

"You either die a hero, or live long enough to see yourself become a villain"

Why not sell or give away the project to someone who has time to fight the abuse?

I did consider this, and several people have offered to take over, but ultimately I feel that the thing that was most useful about RawGit — it made serving HTML at scale dead simple for anyone with a GitHub repo — is also the thing that made it most prone to abuse.

It came down to a simple equation: fighting abuse on RawGit will _always_ take more time and effort than spreading abuse via RawGit. One persistent jerk working a few hours a day could do so much damage so quickly that mitigating it would require multiple people working full time.

There's just no good way to scale that and retain the functionality that actually made RawGit useful.

I talked about this a little more here: https://github.com/rgrove/rawgit/pull/191#issuecomment-42831...

Look at how enthusiastic people are about moderating forums. Would it not be possible for someone to add a web application where volunteers could go to manage blacklists or something? Like prescreened volunteers.

It’s not free. You will have to now spend time managing people.

It's a bad idea to shut down a service that hundreds of thousands of people find useful due to theoretical reasons.

People have volunteered to maintain the service. Why not let them?

Pretty sure you could recreate this with an nginx reverse proxy and a little configuration to set the Content-type header dynamically. I don't think any code would be required.
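To make the header part concrete, here is the extension-to-MIME-type lookup such a proxy would need, sketched in Python rather than nginx config. The function name and the fallback type are assumptions; the lookup uses the standard library's `mimetypes` table.

```python
import mimetypes


def content_type_for(path, default="application/octet-stream"):
    """Pick a Content-Type from the file extension, falling back to `default`."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or default
```

A proxy would fetch the file from the upstream raw URL, which serves repo files as plain text, and replace the Content-Type header with this value before responding.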

As mentioned, the bandwidth costs and chasing abuse would be the higher effort part.

Exactly. And plenty of people have done exactly this. There's nothing technically interesting or unique about RawGit except that it got popular.

Like Twitter or Bitly clones. There's nothing technically interesting or unique to write, but just try getting a userbase.

For the time being I threw this up.

You should be able to replace rawgit.com with rawgit2.com.

Please don't use the name RawGit.

This is likely to confuse people, and it's also likely to cause people to reach out to me when they need help with rawgit2.com.

Are you going to self fund the $50,000/year in hosting costs and devote time to fixing the malware problem?

If the time comes, sure.

Also it can be hosted for much less.

I used this for a lot of Codepen stuff - sad to see it go.

You might find this useful: https://github.com/mallendeo/cdp-rawgit-fix

It'll scan all your pens and try to suggest jsDelivr URLs wherever you've used RawGit.

I loved your product and enjoyed it. Thanks always!

This was a very useful service, sad to see it go.

> 176 terabytes of bandwidth

How did he pay for this?

It looks like he didn't, for the most part: https://github.com/rgrove/rawgit/blob/master/FAQ.md#can-i-do...

"It's super nice of you to offer, but I don't need any donations at this time. RawGit's server costs are minimal, and the lovely people at StackPath provide RawGit's CDN service free of charge. Thank you though!"

The FAQ says "CDN hosting generously donated by StackPath".

Yet another trendy hip service folds with little to no notice. Meanwhile sourceforge keeps on going.

... what?
