Ask HN: Recommended guides for learning filesystems?
131 points by mgpela on Jan 20, 2017 | 32 comments
Hi, *nix newbie here. I'm struggling with understanding how my Ubuntu's filesystem works and what general practices various applications follow when they install to my Ubuntu instance. I'm currently reading the ldconfig man page and feeling pretty lost. Any good guides or resources for understanding how filesystems and links work?


I think you're asking two different questions here. What are some standard conventions for where system files are placed in Ubuntu & Linux? And how are filesystems implemented?

For file placement you might start here: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

Though I'd caution that it's not a strict standard, and you'll find that many Linux distributions deviate from it in particular ways.

For the implementation of file systems on Unix/Linux/Ubuntu, I'd say start at understanding inodes: (this link is just the first I found that seems like a reasonably short overview from my standpoint) https://www.cs.nmsu.edu/~pfeiffer/classes/474/notes/inodefs....

There is a world of different implementations when you get to the details, but the Linux kernel builds on or reuses this core idea of inodes in many different specific filesystems. From there, there is a lot of info, but the Linux VFS layer handles many common functions of filesystems in Linux (even if your system is specifically using ext4, btrfs, etc.). So the Linux kernel VFS docs may be interesting if you're looking to go even deeper.
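To make the inode idea a bit more concrete, here is a minimal sketch in Python (my choice of language, not something the linked notes use; the file names are made up). It creates a file in a temporary directory, adds a hard link to it, and uses `os.stat` to show that both names refer to the same inode.

```python
import os
import tempfile


def inode_demo():
    """Create a file plus a hard link and return the stat result for each name."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")  # hypothetical file name
        alias = os.path.join(tmp, "alias.txt")        # hypothetical file name
        with open(original, "w") as f:
            f.write("hello\n")
        # A hard link is a second directory entry pointing at the same inode.
        os.link(original, alias)
        return os.stat(original), os.stat(alias)


if __name__ == "__main__":
    a, b = inode_demo()
    print("inode numbers:", a.st_ino, b.st_ino)
    print("link count:", a.st_nlink)
```

The point to notice is that the name lives in the directory, while size, ownership, timestamps and the link count live in the inode that the names share.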

> For file placement you might start here: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

And if you're at all interested in BSD file layout, there's hier(7): https://www.freebsd.org/cgi/man.cgi?hier(7) .

Re: Linux inodes (FreeBSD calls them "vnodes," short for virtual inodes, to distinguish the generic API from filesystem-specific per-file data, which it calls inodes), there's vnode(9) and VFS(9) to get started:



"The Design and Implementation of the FreeBSD Operating System" not only does a great job of covering how the UFS filesystem is implemented, but also does a great job of explaining how Unix systems are implemented. I highly recommend this book to anyone with an interest in Unix internals.


Is that book similar in structure to the book "The Design of the Unix Operating System" by Maurice Bach? (or is it Design and Implementation of ... don't remember - had read it some years ago).

Came here to say this. Fantastic book for self-learners.

A good book, but not widely known (maybe a little low level, but still...)

Ah, it's also free ;)

From the Haiku project website: https://www.haiku-os.org/legacy-docs/practical-file-system-d...

This also covers the design of the Be File System, for those who don't know.

Written by Dominic Giampaolo, now working at Apple on APFS.

In university, one of my projects in Operating Systems was to build a file system.

I'd recommend anyone to do the same!

It does not have to be anything magical...just try to build a basic program that can emulate a file system (read files, write files, list files, have links, edit files, edit names, delete files, copy etc.)

The aim here is NOT to have a file system that you'll use daily, but to understand the concepts by programming them.

While this book goes into some details about it [0], I "understood" how it all worked together by building one.

FYI - I think the book focuses on C when giving examples, but if you know the concepts, it can be built in any language (Go, Java etc.)

[0] - https://www.amazon.com/Operating-System-Concepts-Abraham-Sil...
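For anyone wondering how small such an exercise can start, here is a rough sketch of a flat, in-memory toy file system in Python. Everything in it (the `ToyFS` class, its method names, the single flat directory) is my own simplification for illustration, not what the course or the book prescribes. It keeps names and inodes in separate tables so that hard links fall out naturally.

```python
class ToyFS:
    """A flat, in-memory toy file system: names -> inode numbers -> data."""

    def __init__(self):
        self.inodes = {}     # inode number -> {"data": bytes, "nlink": int}
        self.directory = {}  # file name -> inode number
        self.next_ino = 1

    def write(self, name, data):
        """Create the file if needed, then replace its contents."""
        ino = self.directory.get(name)
        if ino is None:
            ino = self.next_ino
            self.next_ino += 1
            self.inodes[ino] = {"data": b"", "nlink": 1}
            self.directory[name] = ino
        self.inodes[ino]["data"] = bytes(data)

    def read(self, name):
        return self.inodes[self.directory[name]]["data"]

    def link(self, existing, new):
        """Hard link: give an existing inode a second name."""
        if new in self.directory:
            raise FileExistsError(new)
        ino = self.directory[existing]
        self.directory[new] = ino
        self.inodes[ino]["nlink"] += 1

    def unlink(self, name):
        """Remove a name; free the inode once no names point at it."""
        ino = self.directory.pop(name)
        self.inodes[ino]["nlink"] -= 1
        if self.inodes[ino]["nlink"] == 0:
            del self.inodes[ino]

    def rename(self, old, new):
        if old == new:
            return
        ino = self.directory[old]  # raises KeyError if old is missing
        if new in self.directory:
            self.unlink(new)       # replace an existing target
        del self.directory[old]
        self.directory[new] = ino

    def ls(self):
        return sorted(self.directory)


if __name__ == "__main__":
    fs = ToyFS()
    fs.write("notes.txt", b"first draft")
    fs.link("notes.txt", "backup.txt")
    fs.unlink("notes.txt")
    print(fs.ls(), fs.read("backup.txt"))
```

Natural next steps would be nested directories, permissions, and storing the tables in fixed-size blocks of one big file instead of Python dicts.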

Xv6 is usually a good source to learn about OS primitives: https://pdos.csail.mit.edu/6.828/2016/xv6.html

I have not looked at the filesystem implementation, but since it is designed as a teaching OS, my guess is that it should be a good entry point.


> what general practices various applications follow when they install to my Ubuntu instance


  $ man hier 
It describes the file system hierarchy.

Wow, this is really helpful! Thanks!

A very interesting approach is the forensic analysis of the effects of modern file systems, as a way to explore their inner workings. I very much enjoyed this way of learning about how they work and how one can observe their principles. The best book is without doubt:

Brian Carrier File System Forensic Analysis https://g.co/kgs/lv7gVN

For what it's worth I had asked someone the very same question when I started my career.

From the horse's mouth, take a look at ext4 on Ubuntu: https://help.ubuntu.com/community/LinuxFilesystemsExplained

Then, for a 10,000-mile view, I'd take a look at UNIX and Linux https://www.amazon.com/UNIX-Linux-System-Administration-Hand...

There are more books and papers on filesystems that I could list. I will leave you with three particular books that have guided my career.

If you want to look solely at Ubuntu, there's the Ubuntu Server book https://www.amazon.com/Official-Ubuntu-Server-Book-3rd/dp/01...

Unix Made Easy - https://www.amazon.com/UNIX-Made-Easy-John-Muster/dp/0072193...

Hope this helps

I've really enjoyed the book The Linux Programming Interface by Kerrisk as a guide for learning some of these types of low level concepts.


Do you mean the technical details behind how a file system works internally, or how the files and the filesystem layout in a Linux/Ubuntu system are structured?

Right now I'm more interested in the second, as I believe that will help me understand more about how Ubuntu boxes are best maintained/structured, and hopefully help me to understand how tools are "installed" on Unix. I think I got what I asked for and more from the various responses on this thread so far, though...

These recommendations are off the top of my head, so take with a pinch of salt [1] or better, verify them by looking up the terms (which you should do anyway to get any value out of the recommendations).

[1] Why I say "take with a pinch of salt" is because this is based on what I learned a while ago about Unix file systems (and some of the terms are also related to DOS and Windows), and some enhanced file systems have come out after that, so some of the terms I mention may be out of date (but likely not, or not many).

Look up these words and terms (not given in any logical order):

(Unix) inode, directory entry, file system, disk partition, physical partition, logical partition, fdisk, cfdisk, boot sector, boot record, master boot record, GRUB, LILO, boot loader, ext2, ext3, ext4, file system types, journaling file system, fsck, fsdb, superblocks, disk 'cylinders, heads, tracks and sectors', disk block size, buffer cache, logical volumes, disk spanning, disk striping, RAID, disk formatting, block device, character device, device file, device driver, ...

You will get many of those terms explained in the resources that others in this thread have mentioned. But this can be a fast way of getting an overview of many of these terms via Wikipedia articles or other pages about them, which may even motivate you to read the longer explanations.

If you are a programmer and know some C, you can also look up functions / system calls / man pages like stat, dirent, opendir, readdir, etc., and have some fun getting (and for some things, setting) the contents of directories programmatically, and getting file metadata like file size, type, times (access, modification, etc.), owner and group, permissions for owner, group, other, etc.
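If you'd rather poke at this from a scripting language before writing C, Python's `os` module exposes the same ideas: `os.scandir` for reading directory entries and stat results for the metadata. Below is a minimal sketch; the choice of Python and the helper name `describe_dir` are mine, not part of the C APIs mentioned above.

```python
import os
import stat
import time


def describe_dir(path="."):
    """Return one dict of metadata per directory entry, sorted by name."""
    rows = []
    with os.scandir(path) as entries:
        for entry in entries:
            # follow_symlinks=False describes the link itself, like lstat(2).
            st = entry.stat(follow_symlinks=False)
            rows.append({
                "name": entry.name,
                "mode": stat.filemode(st.st_mode),  # type and permission string
                "size": st.st_size,
                "uid": st.st_uid,
                "gid": st.st_gid,
                "mtime": st.st_mtime,
            })
    return sorted(rows, key=lambda r: r["name"])


if __name__ == "__main__":
    for row in describe_dir("."):
        when = time.strftime("%Y-%m-%d %H:%M", time.localtime(row["mtime"]))
        print(row["mode"], row["uid"], row["gid"], row["size"], when, row["name"])
```

Run in any directory, it prints something close to a stripped-down `ls -l`, which makes it easy to compare against the real tool.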

Edited to add a few more terms.

Note: not all of the terms above will necessarily result in good hits, but many or most should. As usual, caveat lector.

I can't think of any guides. I picked up most of my know-how through using Linux (Ubuntu, specifically).

A couple things for you which may be obvious, but a good place to start nonetheless:

/ - This is Root. Most of your system files are in this directory.

/home/username/ - This is your home directory; a common abbreviation is "~/" (tilde). If you are a Windows user, this is the equivalent of the file system you are used to using. In this directory are also many dot files (.ssh/, .local/, etc.). These are (usually hidden) system files that are specific to your user. Generally, there are defaults in / (root) and these files customize them for your user only.

Knowing these two things and a handful of basic commands will allow you to get started. Look into the commands 'cd', 'pwd', 'ls', 'sudo', 'touch', 'mkdir'.

Just to clarify: you typically don't have any files directly under `/`. It is literally the `root` of the directory structure. Nothing else. It could theoretically not be mounted at all, if all subdirectories are mounted individually (or at least it used to be possible; I'm not convinced that's still the case).

In addition, you most typically wouldn't find the defaults in `/`, for the reason listed above. The default files copied over by the `useradd` (or `adduser`) command are usually sourced from `/etc/skel`.

To answer the OP's question, there are a number of differences in the way different distributions choose to implement their directory structure. That, and the default packages they ship with, are the whole reason they exist. My best recommendation is to pick a distro, and learn the generalities you can from it.

Most configurations are found in `/etc/`. Most defaults can be found in `/etc/default`. `/opt` is usually for aftermarket software, or software placed there so as not to conflict with the base system (for example, the Fedora/CentOS SCL installs newer toolsets, such as GCC/LD/Make, into `/opt`, though loads of proprietary software also installs there to avoid bothering with proper packaging). `/usr` is where all the system-provided tools and documentation live. `/var` is used by programs and daemons to store their state.

Any directory can live on a different partition or disk. Use the `df -h` command to get an overview of your mount points. Also, unlike Windows, a disk or partition doesn't live next to all the others. They can all be interleaved into the root directory structure. Just because `/` is only 5GB doesn't mean you can't have a 2TB home directory or 30GB of programs and libraries installed.
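As a rough way to see this interleaving from a program rather than from `df -h`, the Python sketch below walks upward from a path until it reaches a mount point, and compares device numbers to tell whether two paths sit on the same filesystem. The helper names are hypothetical, and `os.path.ismount` is known to miss some cases (for example bind mounts within one filesystem), so treat the output as a hint rather than the full picture.

```python
import os


def mount_point_of(path):
    """Walk upward from path until we reach a directory that is a mount point."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


def same_filesystem(a, b):
    """True when both paths report the same device number (st_dev)."""
    return os.stat(a).st_dev == os.stat(b).st_dev


if __name__ == "__main__":
    # These directories are just common examples; some may not exist everywhere.
    for p in ["/", "/home", "/usr", "/var", "/tmp"]:
        if os.path.exists(p):
            print(p, "is mounted at", mount_point_of(p),
                  "| same filesystem as /:", same_filesystem(p, "/"))
```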

> It could theoretically not be mounted at all, if all subdirectories are mounted individually (or it used to, I'm not convinced that's still the case).

The submounts need something to mount on. Root could conceivably be an in-memory only filesystem with a few directories automatically populated for submounts to mount on, though.

That's called initramfs.

In the case of Linux there's actually a (specially renamed) instance of a tmpfs mounted on /. Usually the real system root is mounted over that.

Lots of great links here. It is important to note, however, that the Filesystem Hierarchy Standard is not the one true way from which you must never deviate. The FHS is full of design decisions that make sense in their historical context but are less than ideal today. There are alternatives.

If you are interested in alternatives, you can have a look at GoboLinux: https://gobolinux.org/?page=at_a_glance

Or have a look at NixOS, which has a different take: https://nixos.org/nixos/about.html http://funloop.org/post/2015-08-01-why-i-use-nixos.html

"man hier" and "man ln" are a decent start (the second one is because you mentioned links).
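Since links were part of the question, here is a small Python sketch of the difference the `ln` man page describes between hard and symbolic links. Python and the file names are my own picks for illustration. It creates one of each, removes the original name, and then checks what still works.

```python
import os
import tempfile


def link_demo():
    """Show what each kind of link does once the original name is removed."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")  # hypothetical file names
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")
        with open(target, "w") as f:
            f.write("data")
        os.link(target, hard)     # roughly what `ln target hard` does
        os.symlink(target, soft)  # roughly what `ln -s target soft` does
        os.remove(target)
        with open(hard) as f:
            hard_content = f.read()
        return {
            "hard_content": hard_content,           # data is still reachable
            "soft_is_link": os.path.islink(soft),   # the symlink itself remains
            "soft_resolves": os.path.exists(soft),  # but it now points nowhere
        }


if __name__ == "__main__":
    print(link_demo())
```

The hard link keeps the data alive because it is just another name for the same inode, while the symbolic link only stores a path and dangles once that path is gone.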

I'd recommend using FUSE for experimentation. You won't have to like hack on the kernel and rebuild it all the time and stuff.


Also see the Linux Standard Base project: https://en.wikipedia.org/wiki/Linux_Standard_Base

These are both somewhat dated but still very worthwhile and relevant reads. You can find them cheap on Amazon:

"UNIX Filesystems: Evolution, Design, and Implementation 1st Edition" by Steve D. Pate

"Practical File System Design with the Be File System" by Dominic Giampaolo

Read the "Persistence" chapter: http://pages.cs.wisc.edu/~remzi/OSTEP/

Any book about operating systems should be a good start. You'll learn about drivers, virtual memory, file systems, task switching and a lot of other things, and how they work together.
