The curse of scalable technology (lukeplant.me.uk)
62 points by luu 10 months ago | 22 comments



Slightly ranty: Ugh, I've been feeling this recently in a few areas. I think a lot of it comes down to imprecise terminology and the assumptions that get made from it.

1.) We have a very large, powerful server we are using for webapps/databases. I wanted a simple way to "orchestrate" containers on it (from a friendly web interface or something). I've come to the conclusion that such a solution does not exist - it's either the CLI (docker/docker compose) or running kubernetes (which I tried, and got working, but it's too complicated for me right now, even on one server. Maybe someday).

2.) I want to aggregate various logs from the apps running on the server, and be able to visualize them in something like Grafana. Some currently log to their own postgres database, others to a file. The answer to this is to install half a dozen (or more) services, each with their own config language, quirks, and 200-page manual, and hook it all together. The good news is I can use a "simple" config since I have less than 100GB of logs a day (wtf? I have more like 50MB a day).

There seems to be a completely missing middle class of software/devops/sysadmin information. It's either toy programs or "web-scale" 1000 node clusters.

But coming back to the article, it's really frustrating to try to talk to others or find answers. "I want to aggregate logs" causes people to think my logging needs are bigger than they really are. Same with "container orchestration". And then I get told I'm doing it wrong (which believe me, under my current constraints, is the best we can do). I guess overall I wish people would respect my current constraints.


If you're locked into containers as the solution, the answer to your problem is docker-compose. But verify that assumption first. Why containers and not VMs? VMs solved this particular problem years before anyone had ever heard of Docker. You don't need containers if you're just splitting up a single-server monolith. I don't know if I'd call them "friendly," but basically every VMM has a GUI. Docker's the easiest way to solve the runs-on-my-machine problem, but Vagrant isn't much harder.

k8s is a non-starter for this use case. So incredibly overcomplicated, and what did you think you were getting for that complexity? It's for large-scale deployments. You have one server. Resume-driven development is getting out of control...

If you don't have a dedicated devops team, or you're not using a managed k8s service like AKS, just don't. Please don't. Stop spreading the madness to these beautifully simple environments. KISS: keep it simple, stupid. Running k8s because the Docker CLI is too hard is like learning to fly an F-16 because riding a bike is too hard.


Docker compose is what we are going with (for now), but with one complication: how can I allow co-workers access without giving them root access to the server? I generally trust them (we're a tight knit group), but mistakes happen.

Now I could (and did) add them to the 'docker' group so they can run docker(-compose), but the issue ends up being permissions on any bind mounts. They end up not being able to read logs, view some config, etc.

It's not necessarily a single-server monolith, but a server that is meant to handle a multitude of (possibly unrelated) web-apps at once. I might have to think about VMs more. I really do like docker, and traefik is a god-send (making routing and certificates a heck of a lot easier).


Security is generally considerably more challenging with containers than VMs.

https://docs.docker.com/engine/security/#docker-daemon-attac...

"Running containers (and applications) with Docker implies running the Docker daemon. This daemon requires root privileges unless you opt-in to Rootless mode..."

If they can log into the server and run docker containers themselves, you've probably inadvertently granted permissions that could be escalated to gain root access, if they're a sufficiently skilled adversary (or their user account is compromised by one).

You typically avoid this particular problem by having a CI/CD pipeline build, deploy, and run the containers. No one has permission to do this except the pipeline. Grant devs ssh access to the containers, using appropriate network rules to limit access to trusted IPs, and ideally behind a jump box in a VPC.

VMs give you a lot of OS-level protections for free. For one thing, getting root access to a VM shouldn't compromise the host or other VMs (in the absence of a 0-day, anyway).

Unfortunately, it sounds like you're pretty far down the Docker rabbit hole already. It's probably much more trouble to switch now than it's worth.


Going into engineer-minded, solution-focused mode and ignoring the meta here - Hashicorp Nomad is a lightweight wrapper around docker, works just fine on a single server, GUI or CLI at your discretion, and the automation options scale nicely.

Logging - look at Loki, also by Grafana Labs. It will run in a container, and Nomad has a docker driver extension to send logs to a Loki endpoint (podman has some sharper edges, but probably less so on a single-host setup).

EDIT: I haven't updated this in a while, but https://github.com/jedd/nomad-recipes/blob/master/loki.nomad gives a taste for a very basic Loki job under Nomad. In terms of prereqs, you'd need docker & nomad running, and have a persistent volume (to local disk) configured in Nomad.
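
If anyone's curious what "a Loki endpoint" looks like from the sending side, here is a rough sketch against Loki's HTTP push API from Python (the localhost URL, labels, and log line are made-up placeholders; the Nomad docker-driver plumbing mentioned above does the equivalent for you):

    import json
    import time
    import urllib.request

    # Minimal push of one log line to Loki's HTTP API. Assumes Loki is
    # listening on localhost:3100; adjust the URL and labels to your setup.
    def push_to_loki(line, labels):
        payload = {
            "streams": [{
                "stream": labels,                        # label set, e.g. {"app": "demo"}
                "values": [[str(time.time_ns()), line]]  # [ns timestamp as string, log line]
            }]
        }
        req = urllib.request.Request(
            "http://localhost:3100/loki/api/v1/push",
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)

    push_to_loki("hello from the single big server", {"app": "demo", "host": "bigbox"})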


Did you consider MRSK[1], k3s[2], or dokku[3]? They are all significantly simpler to operate than Kubernetes; curious to hear your take.

On logs, I agree and have looked for the same. A simple way to aggregate logs on one machine, heck it could even be backed by SQLite, and query them via a web UI. Doesn't seem to exist for this scale.
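
Roughly the sort of thing I mean, as a throwaway sketch (file names and schema are invented): tail a few log files into one SQLite table that any web UI could sit on top of.

    import sqlite3
    import time
    from pathlib import Path

    # Tiny single-machine log aggregator: one append-only SQLite table,
    # one row per log line. Polling the files is good enough at ~50MB/day.
    LOG_FILES = [Path("/var/log/app1.log"), Path("/var/log/app2.log")]

    db = sqlite3.connect("logs.db")
    db.execute("CREATE TABLE IF NOT EXISTS logs (ts REAL, source TEXT, line TEXT)")

    offsets = {p: 0 for p in LOG_FILES}  # how far into each file we've read

    while True:
        for path in LOG_FILES:
            with path.open() as f:
                f.seek(offsets[path])
                for line in f:
                    db.execute("INSERT INTO logs VALUES (?, ?, ?)",
                               (time.time(), str(path), line.rstrip("\n")))
                offsets[path] = f.tell()
        db.commit()
        time.sleep(5)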

[1] https://github.com/mrsked/mrsk [2] https://k3s.io/ [3] https://dokku.com/


I took a good hard look at dokku, and it almost fits (especially if you put the ledokku web interface in front). I think my main issue was that we have services with multiple components in a docker compose (not just a webapp + database). These appear awkward for dokku to handle. However, I am still very interested in deploying it, and will revisit in the future.

I looked at k3s some, and it is promising. I think overall anything kubernetes related is probably in my future, but I am not familiar enough with the ecosystem to run it in production right now.

I did install rancher w/ rke2 and even managed to deploy some of my apps. It's really nice, and runs okay-ish on a single server once you increase the pod limit, but again, not comfortable running it for production (yet).

Had never heard of mrsk, but bookmarked for the future. Thanks!


How can k3s be simpler to operate than Kubernetes when k3s is a Kubernetes distribution?


k3s is a single binary install that unpacks a lot of k8s parts for you and uses sqlite instead of etcd for the data store. So it's simpler to spin up and manage, not necessarily simpler to use.


I'm a DevOps/SRE guy by trade, spend my days wrangling supermassive Kubernetes clusters etc. I've come to the conclusion that the "middle class" is especially well covered by NixOS. Generally, for any situation where you're running a bunch of stuff on one big ol' server, NixOS is going to be the least stressful and most productive way to go once you become familiar with the ecosystem.


> I wanted a simple way to "orchestrate" containers on it (from a friendly web interface or something). I've come to the conclusion that such a solution does not exist - it's either CLI (docker/docker compose) or run kubernetes

Have you tried https://www.portainer.io/ ?


> I wanted a simple way to "orchestrate" containers on it

I recently started to use Portainer for this. It seems pretty serviceable.


+1 for Portainer, especially when a GUI is needed. It should also be mentioned that Proxmox[0] has container support.

[0]: https://www.proxmox.com/en/


My favorite observation on this topic is that annual industry surveys (Stack Overflow survey, State of JS/CSS) consistently report around a quarter to a third of respondents work in companies of 20 people or fewer.

Companies that size I've worked at often had 2–5 technical staff total. That's where I've spent most of my career.

In contrast, some people feel a project with 25 developers is best described as having "only" 25. And I know that's not the top of the scale.

It's easy to imagine how many choices I'd make differently if I had too many people to reach organic consensus, or could count on a half-dozen team members leaving and getting replaced every year, or couldn't count on a certain baseline of skill or technical taste.

This stuff doesn't really come up when we talk online. We just hand each other assertions without context - "K8s is overkill, you should use systemd on bare metal", or "your app will collapse if you use the DB as a queue", etc.

I think the way we discuss our choices is worse for not clarifying our assumptions about the environment.


There seems to be this notion that we can make technical choices in a vacuum, that there are inherent qualities which universally make X better than Y. The job is then to search out the expert assessment on the matter. Without consideration of the software dimensions mentioned here, you're basically giving up on engineering.

It's a knowledge gap problem, which is precisely why it's comforting to seek out and lean on the consensus opinion. The alternative takes work: filling that knowledge gap with data from empirical observation and logical analysis. Well-designed experiments? Well-specified requirements? Design? Nobody has time for that /s. It's much easier to say "k8s bad 'cause I read it on HN"


> endless debates where we talk past each other

Ah yes.

My favourite is when I look at a system that is slow as molasses even for one user, and the predictable refrain is: “we can scale up!”

Adding more lanes to a road with a speed limit of 5 doesn’t fix the problem of each car going slowly.


Half of programming is making excuses so we don’t have to take a hard look at the other half.

Let’s go on an adventure instead of having a sober conversation about what should be table stakes skills.


Many of this article's concerns can be addressed by a handful of tenets:

- spend time to understand the problem being solved.

- research applicable technologies for the problem which align closest with what the team knows.

- keep an open mind for other applicable technologies, but do not incorporate them without cause.

- accept that there are innumerable alternate ways to solve any given problem, but the time to solve is finite.

- do not emotionally attach oneself to the current solution so that other approaches can be objectively considered and employed when beneficial.

- alter technology used when benefit is identified and risk to success is minimized.


A relevant recent article on how Prime Video had something interesting come out of some of these curses: https://www.primevideotech.com/video-streaming/scaling-up-th...

And the HN thread: https://news.ycombinator.com/item?id=35853148


The more different things a tool can be used for, the less context can be assumed from knowing that that tool is being used.


Scalability does not really exist in the general case as soon as there are both reads and writes to the same data. There are various tricks that exploit particular properties of a particular business case - mostly physical or temporal sharding of data updates, which sacrifices accessibility.
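
For the physical-sharding case, the trick is really just giving each key a single owner, at the cost of cheap cross-shard reads - a toy sketch (shard count and key scheme invented):

    import hashlib

    NUM_SHARDS = 4
    shards = [dict() for _ in range(NUM_SHARDS)]  # stand-ins for separate databases

    def shard_for(key: str) -> int:
        # Every key has exactly one owning shard, so writes never contend
        # across shards.
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_SHARDS

    def write(key, value):
        shards[shard_for(key)][key] = value

    def read(key):
        return shards[shard_for(key)].get(key)

    # The accessibility you give up: anything spanning keys has to touch every
    # shard, with no cheap consistent global view or cross-key transaction.
    def read_all():
        return {k: v for s in shards for k, v in s.items()}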


This is mostly the trick - today you start a pet project in Python, and in a year you need to scale it to 1M clients. No one knows what the right technology is, because requirements change too quickly.



