
Exactly this. I don't think bare metal is cheaper once you take all the parameters into account. My other problem is that they are moving from cloud to bare metal because of performance while using a bunch of software that is notoriously slow and wasteful. I would optimise the hell out of my stack before committing to a change like this. Building your own racks does not deliver business value, and it is an extremely error-prone process (been there, done that). There are a lot of vendors where the RMA process is a nightmare. We will see how it turns out for GitLab.



The RMA process is pretty much a moot point at this scale. It simply doesn't make sense to buy servers with long warranties. Save the money, stock spares, and when a component dies, replace it. In the long run it will cost a lot less. I'd much rather pay far less up front, knowing that a component may fail here and there and I will have to buy a new one.
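To make the spares-versus-warranty argument concrete, here is a rough back-of-the-envelope sketch. Every number you feed it (fleet size, failure rate, part cost, warranty price) is an assumption of mine for illustration, not a figure from GitLab or any vendor, and the model ignores labor and downtime.

```python
# Rough cost model: self-stocked spares vs. a per-server warranty.
# All inputs are hypothetical; labor and downtime costs are ignored.

def expected_spares_cost(n_servers, annual_failure_rate, years, part_cost):
    """Expected spend on replacement parts when you stock your own spares."""
    expected_failures = n_servers * annual_failure_rate * years
    return expected_failures * part_cost


def warranty_cost(n_servers, warranty_per_server_per_year, years):
    """Total spend on a warranty priced per server per year."""
    return n_servers * warranty_per_server_per_year * years


def break_even_failure_rate(warranty_per_server_per_year, part_cost):
    """Annual failure rate at which both options cost the same."""
    return warranty_per_server_per_year / part_cost


if __name__ == "__main__":
    # Hypothetical fleet: 100 servers over 3 years.
    spares = expected_spares_cost(100, 0.05, 3, 400.0)
    warranty = warranty_cost(100, 150.0, 3)
    print(f"spares: {spares:.0f}, warranty: {warranty:.0f}")
    print(f"break-even failure rate: {break_even_failure_rate(150.0, 400.0):.3f}")
```

Under this simplified model, stocking spares wins whenever the real annual failure rate stays below the break-even rate. Plug in your own numbers before drawing conclusions.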

As far as moving from cloud to bare metal, another thing to take into consideration (that this was replied to), if you don't architect your AWS (or other cloud solution) to take advantage of multiple geographic regions, the cloud won't benefit you.

I 100% agree that there should be more than one region deployed for this service. As others have said, all it takes is one event and the site will be down for days, weeks, or even months. (It may not happen often, but when it does, you go out of business.) The size and complexity of this infrastructure will make it nearly impossible to reproduce at short notice in a new facility. If I were the lead on this, I would have either active/active sites or an active/replica setup.

I would also have both local backups (for fast restores) and off-site backups of all data. A replica site protects against site failure, not against data loss, and it does not give you point-in-time recovery.


"As far as moving from cloud to bare metal, another thing to take into consideration (that this was replied to), if you don't architect your AWS (or other cloud solution) to take advantage of multiple geographic regions, the cloud won't benefit you."

Yep, this is why scaling starts with a scalable distributed design. We once moved a fairly large logging stack from NFS to S3 for the same reason GitLab is trying to move to bare metal now. Moving off the cloud was not an option; moving to a TCO-efficient service was. NFS did not scale, and there was the latency problem. I don't think moving to bare metal can help with scale as much as a good architecture can. We will see how deep the datacenter hole goes. :)


Agreed. Application architecture is far more important than cloud vs. bare metal. It is just easier and more cost-effective to throw more bare metal hardware at the problem than it is to throw more cloud instances at it. For some, this does make bare metal the better option.

To add to my previous comment though, AWS (and cloud in general) tends to make much more sense if you are using their features and services (such as Amazon RDS, SQS, etc.). If you aren't using these services, I can absolutely guarantee I can deliver a much lower TCO on bare metal than on AWS (which is why I offered to consult for them). I see this all the time: a company moves from bare metal to AWS because bare metal is getting expensive, then quickly finds out AWS can't deliver the performance they need without massive scale (because they aren't using a proper scalable distributed design and can't afford to re-architect their platform).



