The Hardest Scaling Issue (codeberg.org)
59 points by zdw on Jan 28, 2023 | 20 comments



It takes 8 paragraphs to work up to:

> The hardest scaling issue is: scaling human power.

I just point this out because the author makes a couple of head-fakes at technical issues, haha.


I have all of these rah-rah things I say about building teams and skill ladders and good in-house or purchased tooling, but I realized at some point in the last five years that what I’m really doing is scaling people vertically. I have had a histamine reaction to the idea of 300 developers trying to work on anything together for about as long as I can remember, so that tracks.


That could be the title even tbh.


Is it optimal for Codeberg to host on their own hardware? They write that it puts immense pressure on them:

  We transitioned from a basic stack (ext4 filesystem, Postfix, not much more) to one including so many shiny new things: Starting with the duty of maintaining own hardware (which is a journey on its own), we introduced LXC containers, Ansible, BTRFS, ZFS, Ceph and more.

  In the past months, I found myself reading a lot of documentation … Still, there was no time left to dig deep into Ansible and ZFS, and by now both tools have even been mostly dropped, to reduce the stress on our team members.
They write that self-hosting also forces the trade-off of what to spend scarce donor money on:

  For donation-powered non-profits, this uses donations that could have served better, and leads to burn-out in the team.
The argument against cloud hosting is that the data then resides with a big hosting company:

  No matter which instance you join or if you host yourself, the data will reside on one of the few big cloud provider's systems.
But Codeberg’s mission is to host public repositories of open source projects. Why is it a problem if the data sits on someone else’s hardware? Wouldn’t the mission be better served by focusing on “Codeberg the software” instead of “Codeberg the hardware?”


We run petabytes of storage, and growing.

We dropped Ceph (which is too complex with a minefield of issues, and few people have deep experience in it) and just use ZFS (which tons of people have experience in and performs flawlessly).

We also use Ansible since it automates and codifies our deployments. I can't imagine dropping it... in favor of what? Manually typing install commands on servers?
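For context, codifying a deployment in Ansible means the install steps live in a version-controlled playbook instead of someone's shell history. A minimal sketch of what that looks like (the host group, package list, paths, and service name here are all made up for illustration, not Codeberg's actual setup):

```yaml
# Hypothetical playbook: provision a git-hosting node.
# Group name, packages, paths, and service name are illustrative only.
- hosts: git_servers
  become: true
  tasks:
    - name: Install required packages
      apt:
        name: [git, zfsutils-linux]
        state: present

    - name: Deploy service configuration from a template
      template:
        src: app.ini.j2
        dest: /etc/gitea/app.ini
      notify: restart gitea

  handlers:
    - name: restart gitea
      service:
        name: gitea
        state: restarted
```

Running `ansible-playbook site.yml` against a fresh server then reproduces the whole setup, which is the alternative to "manually typing install commands" the comment alludes to.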

It is good they have lifted the covers on their service in this blog post. It screams "stay away".


> We run petabytes of storage, and growing.

That can't be right -- can it? They have fewer than 50,000 users. For petabytes of storage to be necessary, their average per-user storage would need to be dozens of gigabytes. I've been using GitHub for more than a decade and the sum total of every repository I've published is less than that. I found a few repositories on Codeberg that use gigabytes of storage, but the majority are normal-sized Git repositories.


I'm talking about my own business.


Out of curiosity, are you running ZFS across datacenters or do you have a single DC?

My understanding is that ZFS doesn't inherently provide clustering, so each server has its own pool. If you have multiple DCs, how do you handle disaster recovery? Do you use async replication?
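For reference, the async replication the question refers to is typically done in ZFS with incremental snapshot streams shipped to a remote pool. A minimal sketch, where the dataset, snapshot dates, and host names are all hypothetical:

```shell
# Hypothetical names: tank/repos is the local dataset, dr-host the remote box.
# Take a fresh snapshot of the dataset:
zfs snapshot tank/repos@2023-01-28

# Send only the delta since the previous snapshot to the DR pool:
zfs send -i tank/repos@2023-01-27 tank/repos@2023-01-28 | \
    ssh dr-host zfs receive -F backup/repos
```

Run on a schedule (cron, or a tool like sanoid/syncoid that wraps this pattern), this gives crash-consistent, point-in-time copies in the second DC without any cluster-aware filesystem.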


We have multiple DCs, although the data isn't replicated across them. We are a Filecoin storage provider. Users of that network choose where they want to store things and are effectively responsible for geo data replication; then, using the protocol, we can prove to them that their data is on our machines.


If Codeberg's mission were only to host public repos, with no further qualifiers, they could just use GitHub. The whole point is to not use services provided by untrustworthy and ideologically misaligned megacorps.


I sympathize with that. I do see a point in-between those two extremes though.

I don't want to give GitHub free advertising by hosting my code with them nor do I want to feed their ML models. The latter is not completely avoidable since someone could mirror my repo on GitHub or they could just scrape it from Codeberg. But the former is successfully avoided by using Codeberg, even if Codeberg is hosted in the cloud.


100%, you can't always reach your goal in the ideal way, so don't put up with unnecessary things that hinder your progress toward your end goal.


Isn't there a wide ecosystem of options between "I have to physically go to DC to configure new server" and "Storing everything in EvilCorp Public Cloud"? There are several hosting providers out there that offer anything from bare metal servers, to VPS, to OpenStack, to Kubernetes and beyond. Going into a DC physically to configure a new server sounds like the hardest option possible short of them managing the DC themselves.

I don't feel like the author gives enough rationale for the choices made. Making the decision so binary (Hard Way vs. EvilCorp) makes it easier to justify the Hard Way. I think if the author accepted that there could be other options, that might help with the problem stated.


> Isn't there a wide ecosystem of options between "I have to physically go to DC to configure new server" and "Storing everything in EvilCorp Public Cloud"? There are several hosting providers out there that offer anything from bare metal servers, to VPS, to OpenStack, to Kubernetes and beyond. Going into a DC physically to configure a new server sounds like the hardest option possible short of them managing the DC themselves.

Equinix Bare Metal Cloud[1] is an example of something in-between those two options. Not affiliated, just interested in the space.

[1] https://deploy.equinix.com/metal/


Just host in the cloud and scale it with a few clicks


I don’t know if this was missing a ‘/s’ but... it’s never “just” (a four letter word in operations) and never “a few”.


The question is whether they should even take on the onus of hosting the service. Design-wise, projects could self-host their repositories and be discovered via a service managed by Codeberg. That would save a lot of donor money. I'm sure the storage cost alone runs into a few thousand every month. For a not-for-profit org, I'd argue that cost is hard to justify. To me it feels like at some point they will have to make a hard call and switch the model.


Just read the article before commenting with “just do X”


Presumably cost/performance are of concern, along with probably wanting to use CephFS directly rather than over NFS.


What's wrong with using CephFS over NFS? cephadm can automatically deploy NFS Ganesha containers to serve out NFSv4; it's all automated.
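The cephadm workflow the comment describes looks roughly like the following sketch. The cluster ID, placement, filesystem name, and pseudo-path are made up, and the exact flags vary between Ceph releases, so treat this as illustrative rather than copy-paste:

```shell
# Hypothetical names throughout; syntax differs across Ceph versions.
# Ask cephadm to deploy an NFS Ganesha service on two hosts:
ceph nfs cluster create mynfs "2 host1,host2"

# Export an existing CephFS filesystem over NFSv4:
ceph nfs export create cephfs --cluster-id mynfs \
    --pseudo-path /repos --fs-name myfs
```

Clients then mount the export with a plain `mount -t nfs host1:/repos /mnt`, without needing Ceph client code at all.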



