
Just a few quick notes. I have experience running ~300TB of usable Ceph storage.

Stay away from the 8TB drives. Performance and recovery will both suck. 4TB drives still give the best cost per GB.

Why are you using fat twins? Honestly, what does that buy you? You need more spindles, and fewer cores and less memory. With your current configuration, what are you getting per rack unit?

Consider a 2028u based system. 30 of those with 4TB drives gets you the 1.4PB raw storage you're looking for. 2683v4 processors will give you back your core count, yielding 960 cores (1920 vCPUs) across that entire set. You can add a half terabyte of memory or more per system in that case.
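The capacity and core math above, sketched out. The per-chassis drive count isn't stated here, so 12 bays is inferred from the 1.4PB figure (it matches the 12x 3.5" layout mentioned further down); the core count assumes the E5-2683 v4's 16 cores per socket, two sockets per node:

```python
# Rough capacity/core math for the proposed 30-node build (a sketch;
# the 12-bay count is inferred, not stated, in the comment above).
nodes = 30
drives_per_node = 12        # inferred; matches a 12x 3.5" 2U chassis
drive_tb = 4
raw_pb = nodes * drives_per_node * drive_tb / 1000

sockets_per_node = 2
cores_per_socket = 16       # Intel Xeon E5-2683 v4
cores = nodes * sockets_per_node * cores_per_socket
vcpus = cores * 2           # with hyper-threading enabled

print(raw_pb)   # 1.44 (PB raw)
print(cores)    # 960 physical cores
print(vcpus)    # 1920 vCPUs
```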

Sebastien Han has written about "hyperconverged" Ceph with containers. Ping him for help.

The P3700 is the right choice for Ceph journals. If you wanted to go cheap, you could maybe run a couple of M.2 NVMe drives on adapters in PCIe slots.

I didn't really need the best price per GB in my setup, so I went with 6TB HGST Deskstar NAS drives. I'm suggesting you use 4TB since you need the IOPS and aren't willing to deploy SSD. Those particular drives have 5 platters and a relatively high areal density, giving them some of the best throughput numbers among spinning disks.

If you can figure out a way to make some 2.5" storage holes in your infrastructure, the Samsung SM863 gives amazing write performance and is way, way cheaper than the P3700. I recently picked up about $500k worth, I liked them so much. They run around $.45/GB. Increase over-provisioning to 28% and they outperform every other SATA SSD on the market (Intel S3710 included).
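A back-of-the-envelope sketch of what 28% over-provisioning costs you in usable space and effective price. The drive capacity here is illustrative (a 960GB-class part), and the $0.45/GB figure is the rough street price cited above, not a quote:

```python
# Illustrative over-provisioning math for a SATA SSD (example numbers).
raw_gb = 960                  # e.g. a 960GB-class SM863 (illustrative)
op_fraction = 0.28            # reserve 28% of the flash as spare area
usable_gb = raw_gb * (1 - op_fraction)

price_per_gb = 0.45                              # rough street price
effective_price = price_per_gb / (1 - op_fraction)

print(round(usable_gb))       # ~691 GB left usable
print(effective_price)        # roughly $0.62-0.63 per *usable* GB
```

So the extra over-provisioning effectively raises the per-usable-GB price by about 40%, which is still well under the P3700's cost.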

You'll probably want to use 40GbE networking. I've not heard good things about Supermicro's switches. If I were doing this, I'd buy switches from Dell and run Cumulus Linux on them.

Treat your bare-metal deployment as IaC, just like any cloud deployment. Everything in git, including the network configs. Ansible seems to be the tool of choice for NetDevOps.

Thanks for the great suggestions.

We're considering the fat twins so we get both a lot of CPU and some disk. GitLab.com is pretty CPU heavy because of ruby and the CI runners that we might transfer in the future. So we wanted the maximum CPU per U.

The 2028u has 2.5" drives. For that I only see 2TB drives on http://www.supermicro.com/support/resources/HDD.cfm for the SS2028U-TR4+. How do you suggest getting to 4TB?

Also whatever you do, don't buy all one kind of disk. That'll be the thing that dies first and most frequently. Buy from different manufacturers and through different vendors to try and get disks from at least a few different batches. That way you don't get hit by some batch of parts being out of spec by 5% instead of 2% and them all failing within a year, all at the same time.

If you did somehow manage to pick the perfect disk, then sure, having everything from a single batch would be best, since that would give you the longest MTBF. But how sure are you that you'll pick the perfect batch by blind luck?

I had this problem with the Supermicro SATA DOMs. Had problems with the whole batch.

That said, I bought the same 6TB HGST disk for two years.

As long as you're not buying all the disks at once, sticking with one manufacturer and brand should be fine. If you're buying 25% of your total inventory every year, it all works out to just a few percent per month.

But when you're buying 100% of your disk inventory at once there's a serious "all eggs in one basket" risk.

Sorry, I was confused by the part numbers. I was thinking of the 6028u based systems that have 12x 3.5" drives. These are what I used for the OSD nodes in my Ceph deployment.

As for CPU density, I still feel like you're going to need more spindles to get the IO you're looking for.
