jackhalford's comments | Hacker News

I don’t understand the storage story. IMO different apps deal differently with replication and there’s no "one size fits all". How are compose volumes replicated? Is split brain handled by corrosion, or is it up to the uncloud user to set things up correctly in the docker compose file?

ps: I self-host and have moved from compose to k8s on a single node; kustomize helps keep app configs as DRY as possible


I agree there is no one-size-fits-all for data replication. Currently Uncloud doesn't handle volume replication. Moreover, it doesn't support regular Docker volumes yet, only mounting a host path. The reason is I haven't had time to properly think through how to design volumes in a cluster context without getting into full-blown PV support like in K8s.

I suspect that I will implement support for regular local Docker volumes such that each service container will use its own volume on the machine it runs on. Uncloud won't automatically replicate data between volumes as storage replication adds significant complexity and potential failure modes. Apps that need HA such as databases can handle their own replication. I'm getting inspiration from Fly for this: https://fly.io/docs/volumes/overview/. Maybe it would make sense to implement handy commands for cloning, moving, and backing up volumes between machines, not sure yet.

Corrosion (the embedded CRDT SQLite) is only used for Uncloud's internal state management, not for application data.


I see, so off the top of my head I’m thinking about whether uncloud could host some of my containers: nextcloud, plex, radarr, transmission, listmonk, linkding. And it seems it’s no better than my single-node k8s, i.e. they all have either sqlite or filesystem state and no clear way to do HA replication… mind you, this isn’t a problem of uncloud, more so of the apps themselves

Yep agreed, but on the other hand do you really need HA for them when you're perhaps a single user? I wonder, what motivated you to migrate to single-node K8s from Compose for your self-hosted setup?

No, I don’t really need HA; for most of my services high availability comes last. I moved to k8s for learning purposes and stayed because I like it. When I decide to move some services to HA, I’ll have a much more powerful base with k8s than with compose.

Personally I just back up the underlying filesystem (i.e. /data) that vaultwarden uses.
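
(A minimal sketch of what I mean, with a hypothetical container name and paths:)

  # stop the container so the SQLite database is quiescent,
  # then archive the data directory and start it again
  docker stop vaultwarden
  tar czf /backups/vaultwarden-$(date +%F).tar.gz /data
  docker start vaultwarden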

Edit: I realize you are probably using bitwarden directly, in which case don’t you trust them to safeguard your data?

ps: if it’s just ssh keys, just store them as key-value pairs? I haven’t kept ssh keys for a long time thanks to tailscale ssh…


> I realize you are probably using bitwarden directly, in which case don’t you trust them to safeguard your data?

Yes, I use bitwarden directly, no self-hosting. I do trust them to keep my data safe (although I also trusted LastPass at some point, big mistake), but why not also keep a local copy, just in case. The type of data you store in bitwarden is worth the hassle, and if Bitwarden Inc. ever gets into big trouble suddenly, you'll be glad to have the backup.


If the data is worth the hassle to back up, isn’t it worth the hassle to self-host? Especially if you were part of the LastPass breach.


Funny this pops up today: I’ve just finished migrating from KeePassXC to a self-hosted vaultwarden. The official bitwarden apps and browser extension are super well made, so so far so good with the switch.


Seems like the constant is another representation of the set of all prime numbers. I wonder if there is a branch of math that formalizes these different representations. There is the infinite series here that defines the constant, but how does that relate to the set of primes? And what are the other representations?


One such representation is a polynomial with integer coefficients in 26 variables given in

https://www.tandfonline.com/doi/abs/10.1080/00029890.1976.11...

which has the curious property that as you substitute nonnegative integers for the variables, the positive values of the polynomial are exactly the set of prime numbers. (The polynomial also yields negative values.)

When put like this, it sounds like the polynomial must reveal something deep about the primes... but it's another cool magic trick. The MRDP theorem (famous for solving Hilbert's 10th problem negatively) implies that this kind of multivariate polynomial exists for exactly those sets of natural numbers that are computably enumerable, so the polynomials could be seen as a really esoteric programming language for set-enumeration algorithms.

More tricks: https://en.wikipedia.org/wiki/Formula_for_primes


I don’t think of the ssd and hdd as holding different parts of the filesystem. Rather, / is a zfs/btrfs/bcachefs pool, and the ssd is added as a read cache to the pool.

Edit: it can also be a write cache, but that’s more tricky; usually with a battery-backed hardware raid it’s fine
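
For zfs specifically, the read-cache part is just an L2ARC device; a rough sketch (pool and device names are made up):

  # attach an SSD partition to an existing pool as an L2ARC read cache
  zpool add tank cache /dev/nvme0n1p3

  # verify the cache device is present and watch it warm up
  zpool status tank
  zpool iostat -v tank 5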


Reminder that sometimes empty alt text is _more_ accessible, in particular for images that have no bearing on the main subject of the page


Doesn’t matter when bottom-feeding lawyers run an automated tool on your site and sue you in the hope of a quick $5k settlement. That is what drives this, not any notion of accessibility.


I really want to migrate my zfs pool to bcachefs so I can finally follow the latest version of fedora from day one, but this crap is making me doubt it’s a good move…


Regardless of everything else, most people should not be using bcachefs yet. Kent has even stated that unless you're okay not being able to access your data for chunks of time while bugs are being fixed, you shouldn't be using it. The conventional wisdom would be to wait 10 years after a new filesystem is introduced for it to stabilize before switching, so we're looking at summer next year at the earliest.


Apart from that, there are (or were, last I tried it six months ago) some performance bugs in the code.

Nothing that completely breaks it, but I found at the time that the high variance in read latency on Samsung 970 series NVMe causes the filesystem to also dispatch reads of cached data to the HDDs, even when the data is fully cached.

Which predictably increases latency a lot.

Really I should take another stab at fixing that, but the whole driver is C, and I’m not good at writing flawless C. Combine that with the problem actually being hard…

(“Always read from SSD” isn’t a valid solution here.)


"Always read from SSD" seems like what you'd want, no?

I have something on the back burner to start benchmarking devices at format time; that would let us start making better decisions in these situations.


Sorry to say, I have some old SSDs that are only 3-4 times faster than the HDDs. Especially when there are a lot of HDDs in the pool, ignoring them could be leaving a lot of performance on the floor.

Though it would be an improvement over what I saw last time I tried, sure.


Oh, that is tricky. If you want to play around with the algorithm that picks which device to read from, it's in fs/bcachefs/extents.c

  static inline bool ptr_better(struct bch_fs *c,
                                const struct extent_ptr_decoded p1,
                                const struct extent_ptr_decoded p2)
  {
          if (likely(!p1.idx && !p2.idx)) {
                  u64 l1 = dev_latency(c, p1.ptr.dev);
                  u64 l2 = dev_latency(c, p2.ptr.dev);

                  /* Pick at random, biased in favor of the faster device: */

                  return bch2_rand_range(l1 + l2) > l1;
          }

          if (bch2_force_reconstruct_read)
                  return p1.idx > p2.idx;

          return p1.idx < p2.idx;
  }
Perhaps just squaring the device latencies would balance things out more the way we want.
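
A minimal sketch of that tweak, just the biased branch from the function above with the weighting changed (untested, and a real patch would need to think about u64 overflow for very large latencies):

  if (likely(!p1.idx && !p2.idx)) {
          u64 l1 = dev_latency(c, p1.ptr.dev);
          u64 l2 = dev_latency(c, p2.ptr.dev);

          /*
           * Bias quadratically in favor of the faster device: a device
           * that's 10x slower is now picked ~1% of the time instead of ~9%.
           */
          return bch2_rand_range(l1 * l1 + l2 * l2) > l1 * l1;
  }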


I remember this code!

If we're talking about my desktop, its current configuration is 3x 2TB NVMe (configured as zfs cache) plus 2x 12TB HDDs (mirrored). I've set sync=disabled, with transaction groups committing every 10 minutes — this is fine for my use case — so the HDDs spend most of their time spun down.
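
(Roughly, those two knobs on OpenZFS; the pool name here is hypothetical:)

  # disable synchronous writes for the pool
  zfs set sync=disabled tank

  # commit transaction groups every 10 minutes instead of the default 5 seconds
  echo 600 > /sys/module/zfs/parameters/zfs_txg_timeout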

I only actually have 4TB of data on the system. It keeps growing, but the working set is probably much less than that.

Which means, it's 100% cached. A single read sent to the HDDs would have a latency of multiple seconds; absolutely catastrophic for a desktop workload. In this case _always_ using the cache _is_ the right answer, but I've been trying to think of an algorithm that would be able to do so without hardcoding it.


Is there not some sort of standardized, stringent filesystem test yet, like Jepsen is for databases? One that, if passed, gives reasonable confidence the filesystem is free from bugs? Guess not.


The thing is that filesystems are inherently stateful, so the same test might trigger different edge cases depending on the state of the fs.


Databases are for saving state, no?


The building process happens in a container?

> If everything goes well, Asterinas is now up and running inside a VM.

Seems like the developers are very confident about it too


Isn’t driver support tied to a kernel version? Why would it be tied only to a fedora version?

Edit: the whole chain is kernel → libcamera → pipewire | pipewire-camera-consuming-app, from the article. So other distros will be getting it too


For high-level summaries, I actually appreciate such framing because the Linux ecosystem sorely lacks the coherence found in other platforms:

"USB4 V2 is supported in Windows 11 24H2"

"The Translation API is now available in macOS 15"

"To use the built-in camera in these specific laptops, you need Linux kernel 6.10+, libcamera 0.3.1+, pipewire with some downstream patches, Firefox 116..."

You get the idea. For users and application developers, "Fedora 41" represents a _coherent whole_ (roughly) and is a more productive subject to center discussions around.


Yes, and work like this really makes me appreciate the impact that the work Red Hat pays for has on the greater Linux ecosystem.

As someone noted, these cameras had support on Ubuntu, but those changes were not upstreamed to the different projects.

Also, if people want more info you can look at this presentation: https://archive.fosdem.org/2024/schedule/event/fosdem-2024-3...


Nobody is saying other distros won't. This is just talking about it in Fedora and what it took to make it happen.


I know the ziglang project has been doing "sub-issues" ad hoc by writing issue lists in a main meta issue. These enhancements will be welcome.

