Hacker News

ReiserFS does have its benefits in highly specific use-cases.

I have a filesystem with an absurd number of tiny files in it. I host a statically rendered wikipedia mirror. Tens of millions of gzipped HTML files with file sizes in the range 1-5 KB.

ReiserFS is the only filesystem I know that deals even the slightest bit gracefully with this thanks to tail-packing.




Have you tried btrfs? It has tail packing, block subdivision and inline files.


Is btrfs stable yet?


Core features are stable.

It's good for most use cases, but you have to do your research.

I would rather use another filesystem.


It's always been. I've been using it on my servers in RAID 10 mode and on my daily driver for 3 years now.

As long as you don't do btrfs-level RAID 5/6 (just do LVM-level or md-raid-level RAID 5/6) and don't do many subvolumes with quotas (many subvolumes alone are fine), it's production ready.

I don't know where this "btrfs is not stable" idea is coming from. According to Hacker News, I should have lost my data to btrfs 10 times already.


It never ceases to amaze me how many people only consider their own experiences relevant.

"btrfs is not stable" is coming from people for whom btrfs has not been stable. Why is that so hard to understand?


Okay, was it "not stable" because they expected it to be ext3-like, and btrfs detected that their system eats bytes and refused to mount (so they needed recovery mode, a second disk, a copy, and then a proper restore)? Or was it not stable because they ran into a kernel oops/bug/deadlock/etc.?


From https://btrfs.wiki.kernel.org/index.php/Main_Page

"The Btrfs code base is stable"

And https://btrfs.wiki.kernel.org/index.php/Status is exactly what I said in the GP. Besides quotas and RAID56, everything is stable.


Try the Redbean web server. You put everything into a single zip file, and serving a file simply copies the pre-gzipped content directly into a TCP stream. Obviously some of the FS read-performance cost is moved into navigating the zip file, but the storage overhead is not, and neither is the read+write-able nature of the FS.


Do zip files really offer performant random access with tens of millions of entries?


I don't see why they wouldn't as long as the TOC (i.e. filenames + headers) fits in ram. The offsets are all there for a simple seek.


Yeah. Tens of millions sounds fine. And you don’t have to keep navigating the raw zip file, you just do it once. “All file offsets available in a single immutable in-memory hash map” is basically the dream scenario. I imagine if you were desperate you could pack more in by representing your file names efficiently in memory, a bit of path compression or a trie or whatever, but if it already works it already works.
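To make the "parse the TOC once, then seek" idea concrete, here's a minimal sketch with Python's stdlib zipfile (the entry names and counts are made up; the access pattern is the same at any scale, only the in-memory TOC grows):

```python
import io
import zipfile

# Build an in-memory zip with many small entries, as a stand-in
# for the tens-of-millions case.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    for i in range(10_000):
        zf.writestr(f"articles/{i:08d}.html", f"<p>article {i}</p>")

# Opening the archive parses the central directory (the TOC) once;
# after that, each lookup is a dict hit plus one seek + read.
archive = zipfile.ZipFile(buf)
info = archive.getinfo("articles/00004242.html")  # O(1) name lookup
data = archive.read(info)                         # seek to offset, read
```

With ZIP_STORED and pre-gzipped payloads, reading an entry really is just a seek and a copy, which is the Redbean trick mentioned above.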


Yes, that’s why for instance Java uses it to store large amounts of class files.


Years ago this was very similar to our use case, only much smaller file sizes, and Reiser saved an amazing amount of overhead.

The systems were hosting map tiles of the entire Earth, and a lot of them were solid blue (ocean) or solid beige (unoccupied land). Those files were something like 60-300 bytes, going from memory. Reiser was basically the only FS that would handle that reasonably.


Oh, I have a repository of tens of millions of small JSON files, and XFS chokes on that specific partition; it's amazing. Five minutes for an ls in the root folder (which contains only 2 folders). I tried moving it from disk to disk in case the original disk was broken, but no.


Unless there is a good reason, in that situation I would probably try to avoid storing those tens of millions of small JSON files directly on the filesystem. For example: have you considered storing that data as jsonb in PostgreSQL https://www.compose.com/articles/faster-operations-with-the-... ? (or even just in an SQLite db https://www.sqlite.org/fasterthanfs.html )
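A minimal sketch of the SQLite variant (table and column names are made up; use a file path instead of :memory: in practice):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (path TEXT PRIMARY KEY, body BLOB)")

# Store each small JSON document as a blob keyed by its former path.
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    ((f"doc/{i}.json", json.dumps({"id": i}).encode()) for i in range(1000)),
)
conn.commit()

# One indexed lookup replaces a directory walk + open + read.
(body,) = conn.execute(
    "SELECT body FROM docs WHERE path = ?", ("doc/42.json",)
).fetchone()
```

A single database file also sidesteps the per-file inode and block overhead entirely.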


Stuff like this would probably benefit a lot from using a userspace VFS.


Indeed, I had a partition with a huge number of small files back in the ext3 era, and reiserfs was the only one even remotely able to perform decently. I stumbled upon it last week and was happy and surprised to see that reiserfs had not been removed.


We had millions of users of a mail service with everything stored in Maildirs. Reiserfs handled it fantastically on hardware ridiculously slow by modern standards. But I've not used it for maybe 15 years...


bcachefs has inline data extents, and should do pretty well here.


Wouldn’t it be better to keep such files inside a database?


https://www.sqlite.org/fasterthanfs.html They're selling it quite well there, though integrating it into other products might be the hard part.


In my experience, integration with products is not really that hard. I've used sqlite3 bindings in half a dozen languages over the years, and they are generally fairly robust.



I literally discovered this just the other day. The whole concept is extremely intriguing, but it hasn't been updated in some time (I'm guessing some forks have been, though).

Relatedly, it would be pretty cool to be able to build your own filesystem features as layers on each other (such as block-level compression or encryption, block-level forward-error correction, cache promote, striping or other redundancy etc.). A modular FS with pluggable (but also upgradable) components and some sort of structured feature-flag schema metadata, and a way to migrate between variations (such as switching to a different compression or encryption algorithm).


Some of what you're describing is in ZFS. E.g., it has multiple compression algorithms; you can create many datasets -- each of these is either a ZFS filesystem or a volume which can have another FS deployed to it, but with compression and encryption handled by the "host" ZFS.

You may also be interested in FreeBSD's GEOM subsystem for storage.
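As a rough sketch of those per-dataset knobs (pool and dataset names are hypothetical):

```shell
# Per-dataset compression: each dataset can use a different algorithm,
# and you can switch later (new writes use the new setting).
zfs create -o compression=zstd tank/wiki
zfs set compression=lz4 tank/wiki

# Per-dataset encryption, handled by the "host" ZFS.
zfs create -o encryption=on -o keyformat=passphrase tank/private

# A volume (zvol) that another filesystem can be deployed onto.
zfs create -V 10G tank/vol0
```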


A filesystem is a database.


But it's a special kind of database, one that is usually bad with lots of small entries, especially if they don't have a hierarchy (many files in one folder). Most SQL or NoSQL databases are much better at organizing small "files".


The difficult use case isn't many files in a folder, it's just many files period.


So what people think of as a database is a database inside a database?


Yeah. It's like that old Xzibit meme.

File systems aren't relational databases, but more like NoSQL half a lifetime before it was cool. They're typically backed by the same data structures and solve the same problems.


Indeed, some database servers (Oracle, IIRC) don't use a filesystem but instead write to a raw disk partition directly.


> I host a statically rendered wikipedia mirror. Tens of millions of gzipped HTML files with file sizes in the range 1-5 KB.

Have you ever tried Kiwix (https://www.kiwix.org ) and looked at the ZIM format (https://en.wikipedia.org/wiki/ZIM_(file_format) ), in particular those for Wikipedia (https://download.kiwix.org/zim ) ?


Yeah, I unpacked a ZIM to get the data I'm deriving content from. But reading ZIMs is kinda slow compared to just reading a file from disk.


> I have a filesystem with an absurd number of tiny files in it. I host a statically rendered wikipedia mirror. Tens of millions of gzipped HTML files with file sizes in the range 1-5 KB.

> ReiserFS is the only filesystem I know that deals even the slightest bit gracefully with this thanks to tail-packing.

Are there any newer file systems that cope with this use-case?


Trailing off topic, but what's the static wikipedia mirror? I contemplated making something like that some years ago when wikipedia was quite slow.


I've modified the HTML of the pages quite a lot as it's largely a design experiment.

https://encyclopedia.marginalia.nu/

The data is a bit stale right now, since there's a new dump available and I haven't converted it yet. But it's pretty easy to set up: just grab a zim file from here[1] and unpack it with an appropriate library, with an optional step of parsing and transforming the DOM, then save the results into gzipped HTML files and set up a server for decompressing and serving them.

The conversion job takes a few days but it's not that bad.

[1] http://wikimedia.bytemark.co.uk/ (use a mirror if it doesn't work)
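The "save into gzipped HTML files" step might look something like this sketch (the sharded directory layout and function name are my assumptions, not the actual scheme the site uses):

```python
import gzip
import hashlib
from pathlib import Path

def save_article(root: Path, name: str, html: str) -> Path:
    # Shard by hash prefix so no single directory ends up
    # holding millions of entries.
    digest = hashlib.sha1(name.encode()).hexdigest()
    out = root / digest[:2] / digest[2:4] / (name + ".html.gz")
    out.parent.mkdir(parents=True, exist_ok=True)
    # mtime=0 keeps the output byte-identical across runs.
    out.write_bytes(gzip.compress(html.encode(), mtime=0))
    return out
```

The serving side can then stream the stored .gz bytes as-is with Content-Encoding: gzip, never decompressing on the hot path.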


Just mount a romfs image of site data as loopback.
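For instance (image and mount-point names made up; romfs is read-only, so you'd regenerate the image to update the site):

```shell
# Build a read-only romfs image from the rendered site...
genromfs -f site.img -d ./rendered-site

# ...and mount it via a loop device.
mount -o loop -t romfs site.img /mnt/wiki
```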


Why can't I just use a filesystem to store files in?


yeah I'd stick all of those as blobs inside a sqlite or other db




