Hacker News | Koshkin's comments

Yes, too many times I have seen coders solve problems in O(n) instead of O(1)… But that’s nothing compared to not having any sensible, provable theoretical foundation for the design of a piece of software, most of which looks as if it were cobbled together based on a consensus that “it might work”, the demand of the “highest paid person in the room”, or the opinion of a “subject matter expert”. The situation is truly sad.

Did Rust make pointers complicated?


Wouldn't a USB hub and, say, BTRFS do the job?


It would work.

I RAIDed a bunch of cheap USB 2.0 flash drives on a hub with MDRAID as a learning tool back in the day.

It was horrendously unreliable. USB wasn’t a good choice for storage back then, and I’m convinced the hub had issues. This would work much better now.

I did, however, get to watch the blinkin lights, learn how to recover from failures, and discover quite a few gotchas.


MDRAID is good for availability and fault tolerance but no good for integrity.

For example, in a RAID-1, if one of the drives has a silently corrupted block, MDRAID will happily return the corrupted data if you're unlucky enough for it to pick that drive to satisfy the read request. If you have error detection at a higher level, you can leave only one drive in the array at a time and re-issue the read request until it gives you bad data again (at that point you know which drive is bad).

If you have an 8-drive RAID-6 and one of the data blocks in a stripe is corrupt, again, it will happily return that (it won't even read the parity blocks, because every drive is present). Again you would have to pull one drive at a time and re-issue the read request until you get back good data, assuming you have a way to recognize it (e.g. a Zip archive with a CRC32). Good data comes back when you pull the drive holding the corrupted block, because MDRAID then reconstructs that block from the parity in the stripe. If you're still getting bad data, you didn't pull the bad drive; re-add it and pull the next one.

Most distros have something akin to a monthly scrub job, where MDRAID will confirm mirrors against each other and parity against data. Unfortunately this will only tell you when they don't agree; it's still your responsibility to identify which drive is at fault and correct it (by pulling the corrupted drive from the array, nuking the MDRAID metadata on it, and re-adding it, simulating a drive replacement and thus array rebuild).
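As a sketch, the check and the manual "replacement" dance map to commands roughly like this (the array /dev/md0 and the member /dev/sdb1 are placeholders, and all of it needs root on a machine with a real array):

```shell
# Trigger a verification pass by hand (the scheduled scrub does the same):
echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt    # non-zero means the members disagree

# Once you have worked out which member is bad, simulate a replacement:
mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
mdadm --zero-superblock /dev/sdb1     # nuke the MDRAID metadata on it
mdadm /dev/md0 --add /dev/sdb1        # re-add, forcing a full rebuild
```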

Worse still, the RAID-5 and RAID-6 levels in MDRAID have a "repair" action in addition to the "check" action that detects the above. This doesn't do what you think it does; instead it just iterates every stripe in the array and recalculates the parity blocks based on the data blocks, writing new parity back. Thus you lose the above option of pulling 1 drive at a time because now your parity is for corrupted data.

You need a filesystem that detects and repairs corruption. Btrfs and ZFS both do this, but Btrfs' multi-device RAID support is (still) explicitly marked experimental and for throwaway data only.

In ZFS, you can either do mirroring (equivalent to an MDRAID RAID-1), a RAID-Z (equivalent to an MDRAID RAID-5 in practice but not implementation), a RAID-Z2 (RAID-6), a RAID-Z3 (no MDRAID equivalent), set the "copies" property which writes the same data multiple times (e.g. creating a 1 GiB file with copies=2 uses 2 GiB of filesystem free space while still reporting that the file is 1 GiB in size), or some combination thereof.
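For illustration, those topologies correspond to zpool/zfs invocations along these lines (the pool name "tank", the dataset name, and the bare device names are all made up; pick one topology per pool):

```shell
zpool create tank mirror sda sdb              # ~ MDRAID RAID-1
zpool create tank raidz1 sda sdb sdc          # single parity, ~ RAID-5
zpool create tank raidz2 sda sdb sdc sdd      # double parity, ~ RAID-6
zpool create tank raidz3 sda sdb sdc sdd sde  # triple parity
zfs create -o copies=2 tank/important         # each block stored twice
```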

ZFS checksums everything (data and metadata blocks), and every read request (for metadata or data) is verified against its checksum. If it detects a checksum mismatch and it has options for recovery (RAID-Z parity, another drive in a mirror, or an extra filesystem-level copy created by the "copies" property being greater than 1), it will automatically correct the corruption and then return good data. If it doesn't, it will NOT return corrupted data; it will return -EIO.

Better still, checksums are themselves metadata, so they are also replicated under any of the above topologies (except copies=). This protects against corruption that destroys a checksum (which would ordinarily prevent a read) rather than the (meta)data itself. A ZFS scrub will similarly detect every instance of checksum and (meta)data mismatch and automatically correct any corruption it can, and it only needs to operate on filesystem-level allocated space, not every stripe or block.
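A scrub is a one-liner, and status shows per-device error counters and any affected files (again assuming a pool named "tank"):

```shell
zpool scrub tank       # walks allocated blocks, verifying every checksum
zpool status -v tank   # read/write/checksum error counts, scrub progress
```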

tl;dr: Don't use MDRAID on questionable storage media. It won't go well.


> about "civilization ending" events this kind of compute

Realistically though, we'd be extremely lucky to need the abacus, even.


I recommend SciTE. It seems like a perfect middle ground between Notepad and more feature-rich text editors like Notepad++.


Dusa McDuff rocks!


64kB was huge. The first version of UNIX needed 24kB (half of which was taken by the kernel).


Off-topic, but just wanted to note that using floating-point numbers as keys is generally a bad idea (unless you use a custom comparator that takes into account the error that can accumulate during calculations).


especially for an order book...


‘auto’ (in C++) and ‘var’ (in C# and Java) are a blessing; they make code much less verbose. Also good for refactoring: less code to change.


I’m only a C++ amateur, but IMHO C++ vs C#/Java isn’t really a fair comparison here. The latter don’t have template shenanigans, so types are much more transparent to the reader (by which I mean that you don’t have to execute a dynamically-typed program in your head to get from the term on the right-hand side to the type on the left).


Verbosity is not bad. When it makes the code clearer, it is even a good thing.


Complex type parameters make it highly impractical to use explicit typing everywhere.

I like Rust's approach, which allows a mixture of explicit types and type inference using placeholders.

For example: "let x: Result<Foo<i32, _>, _> = make_foo();"


Oh well... At least, in 1980 we got Xenix.


