I don’t understand the storage story, imo different apps deal differently with replication and there’s no « one size fits all ». How are compose volumes replicated? Is split brain handled by corrosion or is it up to the uncloud user to set up stuff right in the docker compose?
ps: I self host and have moved from compose to k8s on single node, kustomize helps keep app configs as DRY as possible
I agree there is no one-size-fits-all for data replication. Currently Uncloud doesn't handle volume replication. Moreover, it doesn't support regular Docker volumes yet, only mounting a host path. The reason is I haven't had time to give proper thought to how to design volumes in a cluster context without getting into full-blown PV support like in K8s.
I suspect that I will implement support for regular local Docker volumes such that each service container will use its own volume on the machine it runs on. Uncloud won't automatically replicate data between volumes as storage replication adds significant complexity and potential failure modes. Apps that need HA such as databases can handle their own replication. I'm getting inspiration from Fly for this: https://fly.io/docs/volumes/overview/. Maybe it would make sense to implement handy commands for cloning, moving, and backing up volumes between machines, not sure yet.
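To make the distinction concrete, in plain Docker terms it's roughly the difference between these two (names and paths below are made up for illustration, and Uncloud's actual syntax may differ):

    # host path mount: the container sees a fixed path on whichever machine it runs on
    docker run -v /srv/myapp/data:/data myapp

    # named local volume: Docker manages where the data lives, but the volume
    # still exists only on the one machine where it was created
    docker volume create myapp-data
    docker run -v myapp-data:/data myapp

Either way the data is local to a single machine; the difference is just who manages the path, which is why per-machine volumes plus app-level replication (the Fly approach) keeps the orchestrator simple.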
Corrosion (the embedded CRDT SQLite) is only used for Uncloud's internal state management, not for application data.
I see, so off the top of my head I'm wondering whether uncloud could host some of my containers: nextcloud, plex, radarr, transmission, listmonk, linkding. And it seems it's no better than my single-node k8s, i.e. they all have either sqlite or filesystem state and no clear way for HA replication… mind you, this isn't a problem with uncloud, more so with the apps themselves
Yep agreed, but on the other hand do you really need HA for them when you're perhaps a single user? I wonder, what motivated you to migrate to single-node K8s from Compose for your self-hosted setup?
No, I don't really need HA; for most of my services high availability comes last. I moved to k8s for learning purposes and stayed because I like it. When I do decide to move some services to HA, I'll have a much more powerful base with k8s than with compose.
> I realize you are probably using bitwarden directly, in which case don’t you trust them to safeguard your data?
Yes, I use bitwarden directly, no self-hosting. I do trust them to keep my data safe (although I also trusted LastPass at some point, big mistake), but why not also keep a local copy, just in case. The type of data you store in bitwarden is worth the hassle, and if Bitwarden Inc. ever gets into big trouble you'll suddenly be glad to have the backup.
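In case it's useful, the official CLI makes the local-copy part pretty painless. Something like this (flags from memory, so double-check against bw --help):

    # after bw login / bw unlock, dump an encrypted export of the vault
    bw export --format encrypted_json --output ~/backups/bitwarden-$(date +%F).json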
Funny this pops up today, I've just finished migrating from KeePassXC to a self-hosted vaultwarden. The official bitwarden apps and browser extension are super well made, so far so good with the switch.
Seems like the constant is another representation of the set of all prime numbers. I wonder if there is a branch of math that formalizes these different representations. There is the infinite series here that defines the constant, but how does that relate to the set of primes? And what are the other representations?
> which has the curious property that as you substitute nonnegative integers for the variables, the positive values of the polynomial are exactly the set of prime numbers. (The polynomial also yields negative values.)
When put like this, it sounds like the polynomial must reveal something deep about the primes... but it's another cool magic trick. The MRDP theorem (famous for solving Hilbert's 10th problem negatively) implies that this kind of multivariate polynomial exists for exactly those sets of natural numbers that are computably enumerable, so the polynomials could be seen as a really esoteric programming language for set-enumeration algorithms.
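For reference, here's a rough formalization of that MRDP consequence (my wording, not taken from the article): for a set S of positive integers,

    S \text{ is computably enumerable}
    \;\iff\;
    \exists\, n \ \exists\, P \in \mathbb{Z}[x_1, \dots, x_n] :\;
    S = \{\, P(a_1, \dots, a_n) : a_1, \dots, a_n \in \mathbb{N} \,\} \cap \mathbb{Z}_{>0}

Take S to be the primes and you get a polynomial like the one being discussed; take S to be any other enumerable set (squares, Fibonacci numbers, indices of halting Turing machines) and an analogous polynomial exists, which is why it says more about computation than about primes.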
I don't think of the ssd and hdd as holding different parts of the filesystem. Rather, / is a zfs/btrfs/bcachefs pool, and the ssd is added as a read cache to the pool.
Edit: it can also be a write cache, but that's trickier; usually with a battery-backed hardware RAID it's fine.
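For the ZFS flavour of that setup, a minimal sketch (pool name and device paths are placeholders):

    # add an SSD partition to an existing pool as a read cache (L2ARC)
    zpool add tank cache /dev/nvme0n1p1

    # the trickier write-side option: a separate intent log (SLOG),
    # typically mirrored so a device failure around a crash can't lose synchronous writes
    zpool add tank log mirror /dev/nvme0n1p2 /dev/nvme1n1p2

btrfs and bcachefs express the same idea differently (bcachefs, for example, uses per-device foreground/background/promote targets), so treat this as ZFS-specific.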
Doesn't matter when bottom-feeding lawyers run an automated tool on your site and sue you in the hope of a quick $5k settlement. That is what drives this, not any notion of accessibility.
I really want to migrate my zfs pool to bcachefs so I can finally follow the latest version of fedora from day one, but this crap is making me doubt it’s a good move…
Regardless of everything else, most people should not be using bcachefs yet. Kent has even stated that unless you're okay not being able to access your data for chunks of time while bugs are being fixed, you shouldn't be using it. The conventional wisdom would be to wait 10 years after a new filesystem is introduced for it to stabilize before switching, so we're looking at summer next year at the earliest.
Apart from that, there are (or were, last I tried it six months ago) some performance bugs in the code.
Nothing that completely breaks it, but I found at the time that the high variance on read requests for Samsung 970 series NVMe causes the filesystem to also dispatch reads of cached data to the HDDs, even when it’s fully cached.
Which predictably increases latency a lot.
Really I should take another stab at fixing that, but the whole driver is C, and I'm not good at writing flawless C. Combine that with the problem actually being hard…
(“Always read from SSD” isn’t a valid solution here.)
Sorry to say, I have some old SSDs that are only 3-4 times faster than the HDDs. Especially when there’s a lot of HDDs in the pool, ignoring them could be leaving a lot of performance on the floor.
Though it would be an improvement over what I saw last time I tried, sure.
If we're talking about my desktop, its current configuration is 3x 2TB NVMe (configured as zfs cache) plus 2x 12TB HDDs (mirrored). I've set sync=disabled, with transaction groups committing every 10 minutes — this is fine for my use case — so the HDDs spend most of their time spun down.
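For anyone wanting to reproduce roughly that setup on OpenZFS on Linux, my guess at the knobs involved (pool name is a placeholder; the txg interval is a kernel module parameter, in seconds):

    # disable synchronous writes pool-wide: a power loss can drop up to one
    # txg interval of writes, even ones apps think were fsync'd
    zfs set sync=disabled tank

    # stretch the transaction-group commit interval from the default 5 s to
    # ~10 minutes, so the HDDs can stay spun down between commits
    echo 600 | sudo tee /sys/module/zfs/parameters/zfs_txg_timeout

    # make it persistent across reboots
    echo 'options zfs zfs_txg_timeout=600' | sudo tee -a /etc/modprobe.d/zfs.conf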
I only actually have 4TB of data on the system. It keeps growing, but the working set is probably much less than that.
Which means, it's 100% cached. A single read sent to the HDDs would have a latency of multiple seconds; absolutely catastrophic for a desktop workload. In this case _always_ using the cache _is_ the right answer, but I've been trying to think of an algorithm that would be able to do so without hardcoding it.
Is there not some sort of standardized, stringent filesystem test yet, like Jepsen is for databases? One that, if passed, means the filesystem is reasonably free from bugs? Guess not.
For high-level summaries, I actually appreciate such framing because the Linux ecosystem sorely lacks the coherence found in other platforms:
"USB4 V2 is supported in Windows 11 24H2"
"The Translation API is now available in macOS 15"
"To use the built-in camera in these specific laptops, you need Linux kernel 6.10+, libcamera 0.3.1+, pipewire with some downstream patches, Firefox 116..."
You get the idea. For users and application developers, "Fedora 41" represents a _coherent whole_ (roughly) and is a more productive subject to center discussions around.