Git-scm.com status report (marc.info)
    > We (the Git project) got control of the git-scm.com domain this year. We
    > have never really had an "official" website, but I think a lot of people
    > consider this to be one.
So, uh, git-scm.com wasn't an official website all these years?

It was set up and run by GitHubbers for Git advocacy/evangelism.

Maybe so, but since git-scm.com's 2012-05-05 redesign [1], the professional visual design, the new footer, the internally-hosted download pages, and the verbiage have no longer given the impression of an "unofficial" site. Compare with the old design used until 2012-05-04 [2].

[1] http://web.archive.org/web/20120505190309/http://git-scm.com...
[2] http://web.archive.org/web/20120504151545/http://www.git-scm...

git history says it's been (kinda?) official since 2009:

https://github.com/git/git/commit/69fb8283937a18a031aeef12ea...

I am always amazed at how quickly Heroku gets prohibitively expensive when you start scaling.

When I ran https://jscompress.com/ on Heroku, I was up to $100 per month for 2 2x Dynos. Completely absurd for a simple one-page Node.js app. I put in a little work moving it to DigitalOcean, and had it running great (and faster) on a $10 VPS.

I get the appeal of Heroku (I have used it several times), but man sometimes it feels like gouging when you can least afford it.

If it were my site, I think I wouldn't even bother with the search: just stick it all on S3, and have one of those 'Google custom search' or similar boxes, so it's static as far as your site's concerned, and just redirects to Google with the `site:foo` filter.

I don't really have a handle on what S3 costs 'at scale', but I think I'm willing to bet it would knock at least the 0 off the end.

Search could easily be done if content is indexed in json files.

CDN hosting of a static site is nearly $0 so def the best option in this case. Plenty of providers give free PRO service to OSS projects as well (e.g. Netlify).

Why even bother paying for S3? Just use a static site generator and put it on GitHub Pages. GitHub is already paying for the Heroku VMs, so I bet they'd be happy to pay less to use their own infrastructure, even if the git-scm.com/org site uses more than their bandwidth requirements.

Probably the best/cheapest solution is:

A) Get a Linode VM, put Elasticsearch on it and have it load the text. Probably $20/month with that little text, tops.

B) Use something like KeyCDN to cache everything for long periods of time.

I doubt it'll cost $50/month.

A linode vm's cost is not measured in its price tag but in how much effort it takes to maintain. Updates, security advisories, hypervisor crashes, crashes under sudden unexpected load, crashes because you're on vacation and you are having too much uninterrupted fun, ...

As a famous writer once said: "ain't nobody got time for dat."

The generated Rust API docs, which are uploaded to https://doc.rust-lang.org/std/ (but also the same ones you can download), do something where they generate an index of the entire site as a JavaScript object, so searches can happen client-side. So it's a static website, but search functionality works.

See https://doc.rust-lang.org/search-index.js for the messy back-end.
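For anyone wondering what "ship the index as a static file" looks like in practice, here is a rough sketch in Python. It is not how rustdoc builds its index; the page paths and text are made up, and the `search` function only stands in for what the browser-side JavaScript would do after fetching the JSON.

```python
import json
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the sorted list of page paths that contain it."""
    index = defaultdict(set)
    for path, text in pages.items():
        for word in tokenize(text):
            index[word].add(path)
    return {word: sorted(paths) for word, paths in index.items()}


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


# Hypothetical pages, for illustration only.
pages = {
    "docs/git-commit.html": "Record changes to the repository",
    "docs/git-push.html": "Update remote refs along with associated objects",
    "docs/git-status.html": "Show the working tree status",
}

# At build time the index is serialized once and uploaded next to the HTML.
index_json = json.dumps(build_index(pages))

# At query time the client loads the JSON and searches it locally.
loaded = json.loads(index_json)
print(search(loaded, "remote refs"))
```

The server never runs a query, so the whole thing can sit behind a CDN. The trade-off is that every visitor who searches downloads the full index.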

Why bother with S3? I'd buy a Raspberry Pi, plug it in at home and call it a day.

Pedantic analysis:

At 5 Watts and $0.30 per kilowatt-hour, a Raspberry Pi would cost $1.08/month to run.

With 1 free GB and $0.09/GB after that, S3 would be able to deliver 13 GB/month at a cost of $1.08/month.

So, RasPi at home gives you "unlimited" egress and a fixed cost, but you have increased latency, a rather small outgoing bandwidth (most likely), and all the downsides of running your own server.

S3 gives you unlimited bandwidth, low latency, and no server maintenance, but it's only competitive on price if you don't exceed 13 GB of egress per month.

Overall I prefer S3, even if I think their egress prices are ridiculous. RasPi at home has some geeky cool factor, though...

NOTE: $0.30/kwh is basically what I pay (California) for any additional usage. These equations will favor home hosting if your electricity is cheaper.
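If you want to plug in your own electricity rate, the arithmetic is easy to script. This is just a sketch using the numbers from this comment (5 W, $0.30/kWh, 1 free GB, $0.09 per extra GB); the 30-day month and the function names are my own assumptions.

```python
# Figures taken from the comment above; a 30-day month is assumed.
WATTS = 5
PRICE_PER_KWH = 0.30
HOURS_PER_MONTH = 24 * 30
FREE_GB = 1
PRICE_PER_GB = 0.09


def pi_monthly_cost(watts=WATTS, price_per_kwh=PRICE_PER_KWH):
    """Electricity cost of running the Pi around the clock for a month."""
    kwh = watts * HOURS_PER_MONTH / 1000
    return kwh * price_per_kwh


def s3_egress_gb_for_budget(budget, free_gb=FREE_GB, price_per_gb=PRICE_PER_GB):
    """How many GB of egress the same monthly budget buys on S3."""
    return free_gb + budget / price_per_gb


pi_cost = pi_monthly_cost()
breakeven_gb = s3_egress_gb_for_budget(pi_cost)
print(f"Pi: ${pi_cost:.2f}/month; S3 break-even: {breakeven_gb:.0f} GB/month")
```

At $0.10/kWh, for example, `pi_monthly_cost(price_per_kwh=0.10)` drops to about $0.36, which buys only about 5 GB of S3 egress.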

A raspberry pi is also not a CDN. Why even manage servers when all you need is a static site? Deploy to a CDN and if properly configured, everything just works.

Especially with the site being considered in 'maintenance mode', I doubt they want to manage a server aside from the other things they need to do.

Not sure if it's sarcasm, but upvoted just in case it is.

That's not sarcasm. It's not like anyone here has numbers to say I'm wrong, and I'm not hearing any good arguments as to why it's a bad idea. I've had my blog (which does 3 mysql queries for each pageload) hit the HN front page with 105KB/s uplink and either a Pentium 3 or Intel Atom as CPU (I forgot when I switched servers). In any case, something much like a Pi can handle HN front-page traffic with a database. And people are downvoting hosting a website that's even less heavy on a Pi? Unless it's 500 hits per second, I don't get it.

As much as I enjoy the Raspberry Pi, I doubt if it could handle the traffic or if it would last long with the limited write cycles of an SDcard.

I know someone who optimized searching the leaked Adobe database (of a few hundred gigabytes if I remember correctly) on a Pi to a sub-second search. That was super impressive and the method used wasn't even obscure (binary search). The same doesn't apply here, but I'm trying to say a Pi isn't entirely worthless.

For example, what pages get accessed the most anyway? I'm guessing the latest source code and maybe the latest release, though most people probably just apt-get git instead so it's probably mostly the source code. Then there are man pages and some other info pages, if I remember correctly. Sounds like the latest release + 90% of those text pages can easily fit in RAM. So memcached? Nah, the Linux kernel happily caches the files that you read from disk.

I don't know the actual numbers but it doesn't sound infeasible to me. A $230 hosting bill is very heavy though, I guess you'd need some serious fiber as well to provide the uplink. But again, without numbers it's all "maybe" and "probably".

You could put the site itself on an external drive via USB, and ideally most of the access would be read-only for a static site, so the life of the SD card shouldn't be as big of a concern.

That said, I'd agree: Raspberry Pis are great but not quite fast enough for serving a high-traffic website.

Just a note:

> The deployed site is hosted on Heroku. It's part of GitHub's meta-account, and they pay the bills.

So why aren't they just using a GitHub page for this?

GitHub Pages still doesn't support HTTPS on custom domains.

I wonder how hard it would be to convince folks to drop their expensive setups in favor of nearly $0 static sites, as well as how much up front cost they'd be willing to shovel out for the transition. S3 + CDN (+ Lambdas optionally) feels really ready to me for almost any straightforward "website." For most things GitHub/Lab pages is an easy path to that.

A lot of folks who have static websites aren't technical and invested thousands of dollars for a WP website. If their site already runs, you wouldn't be providing anything.

> It uses three 1GB Heroku dynos for scaling, which is $150/mo. It also uses some Heroku addons which add up to another $80/mo.

Wow, why? You can get a VPS with 2 GB RAM + 10 GB SSD for 3€ these days (https://www.ovh.com/fr/vps/).

That seems very expensive.

Probably the tooling around Heroku and the scaling. If the site suddenly gets more popular, you tweak a slider and suddenly you have more compute.

For a static website you can easily live with 10000 users on this machine. But let's say you need more: for 40 dollars you get 30 GB of RAM and a 250 Mbps pipe (https://www.ovh.com/fr/vps/vps-cloud-ram.xml). To serve HTML pages plus a bunch of CSS files that should be OK.

What's the best way to optimize cost here? Complete site cached and served from memory (no disc access -> faster response times -> scales better)?

Quick win would be to put CloudFlare in front of it

Why would that do anything? They aren't paying $230/mo for bandwidth, but for shitty VMs.

It's mostly cacheable, so they should theoretically be able to scale down the cluster with a caching reverse proxy or CDN.

This was my thinking and experience when I set out to build Cachoid[0]. There's so much to gain from caching stuff in RAM it should be ubiquitous. The thing is CDNs don't always have the scale to stick all tenants in RAM. Hence the caching to disk.

[0] - https://www.cachoid.com/

From the page:

> Do we really need three expensive dynos, or a $50/mo database plan?

Sounds like there's a chance to optimize for what is, as they say, a static website. Why pay for a database that you're not using? (And what kind of a database plan do you have when it costs $50/month for what is apparently a (nearly?) empty database?!)

Dropping Heroku

IMHO, since it's a static website, they can use a static website generator and simply use something like GitLab Pages to deploy it (for free).

There is a bit of work to be done, but it shouldn't be too terrible if the templates and stuff are okay.

Or just throw a CDN with a decent cache lifetime in front of the Rails app and scale the Heroku side way down if you don't want to go through the hassle of changing anything. It's pretty much static after all.

I was thinking the same thing after reviewing the repo and output HTML but I wonder if that would really lower the monthly hosting costs for the Heroku instance and the various addons. It would be simple to modify the RoR app to output the proper caching headers that would allow any CDN to cache the HTML output and obey the various cache limits, but on demand rendering the output from time to time is still required once the cache expires.
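To make the "proper caching headers" part concrete, here is a minimal sketch. The real site is a Rails app, so this Python WSGI wrapper is only an illustration of the idea, and the lifetimes (5 minutes for browsers, a day for the CDN) are numbers I picked, not anything from the repo.

```python
def with_cache_headers(app, max_age=300, cdn_max_age=86400):
    """Wrap a WSGI app so successful GET responses carry Cache-Control.

    max_age applies to browsers; s-maxage applies to shared caches such as a CDN.
    """
    value = f"public, max-age={max_age}, s-maxage={cdn_max_age}"

    def wrapped(environ, start_response):
        def caching_start_response(status, headers, exc_info=None):
            is_get = environ.get("REQUEST_METHOD") == "GET"
            if is_get and status.startswith("200"):
                # Drop any existing Cache-Control header, then add ours.
                headers = [(k, v) for k, v in headers if k.lower() != "cache-control"]
                headers.append(("Cache-Control", value))
            return start_response(status, headers, exc_info)

        return app(environ, caching_start_response)

    return wrapped


# Hypothetical stand-in for the app that renders a documentation page.
def page_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<h1>docs</h1>"]


cached_app = with_cache_headers(page_app)
```

With headers like these the CDN only goes back to the origin when its copy expires, which is the occasional on-demand render mentioned above.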

I think moving the site to a normal static site generator (like Jekyll) would deliver the most bang for the buck but would be quite the transition. The site would only need to be built upon a new commit, and with the proper site generator it would only update the underlying HTML files that require a change. After that, it's just a matter of syncing the updated HTML to whatever CDN is chosen.
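The "only push what changed" step could look something like this. It's a sketch under my own assumptions: the manifest format, the function names and the example pages are all hypothetical, and the actual upload to a CDN or bucket is left out.

```python
import hashlib


def digest(content):
    """Content hash used to detect whether a built file changed."""
    return hashlib.sha256(content).hexdigest()


def plan_sync(previous_manifest, built_pages):
    """Compare a fresh build against the manifest saved by the last deploy.

    Returns (paths to upload, paths to delete, manifest to save for next time).
    """
    new_manifest = {path: digest(content) for path, content in built_pages.items()}
    to_upload = sorted(
        path for path, h in new_manifest.items() if previous_manifest.get(path) != h
    )
    to_delete = sorted(path for path in previous_manifest if path not in new_manifest)
    return to_upload, to_delete, new_manifest


# Hypothetical builds: one page edited, one added, one removed.
first_build = {"index.html": b"<h1>git</h1>", "old.html": b"<p>old</p>"}
second_build = {"index.html": b"<h1>Git</h1>", "docs.html": b"<p>docs</p>"}

_, _, manifest = plan_sync({}, first_build)
to_upload, to_delete, manifest = plan_sync(manifest, second_build)
print(to_upload, to_delete)
```

Only the paths in `to_upload` need to be pushed (and invalidated on the CDN); everything else stays cached.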

As far as I understand it nobody but the git team is paying for hosting. Why is neither GitHub nor Heroku paying for this? They are built on top of git. Millions of tech dollars go to political causes right now, yet nobody is willing to give $230/mo of free hosting to the git website, the most used VCS today? Talk about priorities. And it's not the first time: plenty of open source projects used by billion-dollar companies receive zero funding.

Edit: GitHub seems to be paying for that, but Heroku shouldn't even bill them.

> The deployed site is hosted on Heroku. It's part of GitHub's meta-account, and they pay the bills.

Sounds like GitHub foots the bill.

Did you read the entire thing? GitHub is paying for it today.

From the Hacker News Guidelines[0]

> Please don't insinuate that someone hasn't read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that."

[0] https://news.ycombinator.com/newsguidelines.html

The article says in the first few paragraphs that GitHub is paying for the hosting right now.

It was a website built by a GitHub co-founder on his own initiative, who happened to also be a git contributor. It wasn't particularly a thing that git the open-source project requested.

The previous git website was http://git.or.cz/ , also run by a git contributor, and releases were (and still are) at https://www.kernel.org/pub/software/scm/git/ .

