

How does mkdir() really work? - spolsky
http://unix.stackexchange.com/questions/797/hacking-into-linux-kernel

======
antics
Linux is a monolithic kernel designed to operate on all sorts of hardware, in
all sorts of environments. So it's complicated. In some cases it can be
beneficial to see a simpler implementation, which is a task I think Minix
[<http://www.minix3.org/>] is well suited for. Edit: Why? It's smaller, it's a
micro-kernel, and there is a lot of documentation, both about the source and
about the theory behind it.

For those who are just starting, the author (Tanenbaum) also wrote a book
called Operating Systems: Design and Implementation, which is a great resource
for learning the ins and outs of OSs. You do need to know C beforehand, though,
and you should have a reasonable understanding of basic CS first.

Edit 2: Oh, also, Minix is in a lot of ways responsible for the genesis of
Linux, for those who didn't know.
[[http://groups.google.com/group/comp.os.minix/msg/b813d52cbc5...](http://groups.google.com/group/comp.os.minix/msg/b813d52cbc5a044b)]

~~~
recampbell
I disagree: Don't wait to have an understanding of C or CS, just dive in.

Find something interesting and figure out how it works. The best learning
happens when you get in over your head. Once you get sufficiently lost, go
back and learn some C. You'll appreciate it more and have a context to apply
what you learned.

~~~
antics
I tried that when I started in the fall of '09 and it failed miserably. When I
came back this summer after intensively studying C and computer systems in
general, things went a LOT smoother.

If you want to learn that way, you have to be incredibly tenacious. Some
people are, some people aren't, but I think my time was better served by
learning all the dependencies and then breezing through it when I was in the
right place. I mean, you could learn Organic Chemistry and just backtrack to
learn Chemistry where applicable, but that's a very hard way of doing it.

~~~
bconway
_I tried that when I started in the fall of '09 and it failed miserably. When
I came back this summer after intensively studying C and computer systems in
general, things went a LOT smoother._

I'm not doubting your story, but it's not a hard-and-fast rule. Con Kolivas,
who has written some interesting/thought-provoking schedulers for the Linux
kernel (and stirred up some politics along the way) had never touched C before
jumping in.
([http://apcmag.com/interview_with_con_kolivas_part_1_computin...](http://apcmag.com/interview_with_con_kolivas_part_1_computing_is_boring.htm))

~~~
antics
I completely agree. There certainly are bright people out there. What I'm
saying is, I consider myself to be of about average intelligence, so my
example is probably more typical, although I do concede there are probably
outliers.

------
Locke1689
The first response, about C not knowing how to do syscalls with multiple
arguments, is wrong. This kind of thing is done via the system call
interface. In fact, the more I think about it the more wrong he becomes.
System call interfaces don't have any such thing as "language dependence" --
if you can't do a syscall in straight object code you're screwed no matter
what language you're using. The syscall intrinsic in C and C++ (really a
library function, syscall()) is a nonstandard, Linux-defined bridge between
the OS syscall interface (per the OS convention for the ISA) and the code
being run.
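
To see the same thing from the C side, here is a minimal sketch using
syscall(2) as glibc exposes it on Linux. The helper names raw_mkdir and
raw_mkdirat are made up for illustration, and the fallback branch assumes that
an architecture without SYS_mkdir still provides SYS_mkdirat:

```c
#define _GNU_SOURCE
#include <fcntl.h>        /* AT_FDCWD */
#include <sys/stat.h>     /* mode_t */
#include <sys/syscall.h>  /* SYS_mkdir, SYS_mkdirat */
#include <unistd.h>       /* syscall() */

/* Hypothetical helper: the 2-argument form, mkdir(path, mode).
 * Returns 0 on success, -1 on failure with errno set by syscall(). */
static long raw_mkdir(const char *path, mode_t mode)
{
#ifdef SYS_mkdir
    return syscall(SYS_mkdir, path, mode);
#else
    /* Assumption: no plain mkdir syscall here, so use mkdirat
     * relative to the current working directory instead. */
    return syscall(SYS_mkdirat, AT_FDCWD, path, mode);
#endif
}

/* Hypothetical helper: the 3-argument form, mkdirat(dfd, path, mode). */
static long raw_mkdirat(int dfd, const char *path, mode_t mode)
{
    return syscall(SYS_mkdirat, dfd, path, mode);
}
```

Calling raw_mkdir("/tmp/demo", 0700) should behave like mkdir(2) itself. The
argument count is just whatever the syscall number expects; nothing about C
limits it.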

Here's how you do a 2-arg mkdir:

    
    
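      push %rax              # save the registers we are about to clobber
      push %rbx
      push %rcx
      movq $0x27, %rax       # 39 = mkdir in the 32-bit (int $0x80) syscall table
      movq $path, %rbx       # arg 1: pointer to the path string
      movq $mode, %rcx       # arg 2: permission bits
      int $0x80              # trap into the kernel; the result comes back in %rax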
    

Here's how you do a 3-arg:

    
    
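      push %rax              # save the registers we are about to clobber
      push %rbx
      push %rcx
      push %rdx
      movq $0x128, %rax      # 296 = mkdirat in the 32-bit (int $0x80) syscall table
      movq $dfd, %rbx        # arg 1: directory file descriptor
      movq $path, %rcx       # arg 2: pointer to the path string
      movq $mode, %rdx       # arg 3: permission bits
      int $0x80              # trap into the kernel; the result comes back in %rax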
    

Edit: Hmm, maybe he was referring to overloading the _intrinsic_. That kind of
makes sense, although there's a standard way to do that if necessary: pass the
number of args as the first arg and have the caller pop them off the stack
after the call (see printf, which does the same thing with its format string).
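
The count-first convention can be sketched in C with <stdarg.h>. sum_ints is a
hypothetical helper, not anything from glibc or the kernel. On 32-bit x86 with
the cdecl convention the caller cleans up the stack after the call, which is
what lets the callee accept a varying number of arguments:

```c
#include <stdarg.h>

/* Hypothetical helper: adds up the `count` int arguments that follow.
 * The callee trusts `count`; passing fewer ints than promised is
 * undefined behavior, just like a wrong printf format string. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next int argument */
    va_end(ap);

    return total;
}
```

For example, sum_ints(3, 1, 2, 3) walks three arguments and sum_ints(0) walks
none.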

~~~
rwmj
"int $0x80" still works for backwards compatibility, but it's not been used
for making syscalls from modern code for many many years.

~~~
dododo
oh? on a recent glibc:

    
    
      $ objdump -D /lib/libc.so.6 | grep 'int[[:space:]]*$0x80' | wc -l
      447
    

what were you thinking they used instead? sysenter? iirc, this turns out to be
slower than "int $0x80."

~~~
Locke1689
I was also trying to be Intel/AMD ISA independent. Intel's instruction is
SYSENTER, AMD's is SYSCALL. He's right though, I probably would have used
SYSENTER in production code. You probably want to use int $0x80 in shellcode,
though (fewer registers to save).

------
koenigdavidmj
Apparently on old Unices, mkdir was not a system call, so the mkdir command
had to mknod(2) the directory manually and then hard-link the '.' and '..'
entries into it. This also required the binary to be setuid root.

------
drv
The question and answers are all missing the key "glue" between the C mkdir()
function and the Linux syscall - glibc.

~~~
meastham
I get this, but why is there a declaration of the libc mkdir() function inside
of a kernel header?

~~~
caf
There isn't - <sys/stat.h> is a glibc header, not a kernel one.

~~~
Seth_Kriticos
True, to be more precise, the GNU C library has it in the sysdeps/unix/mkdir.c
file:

      [..]
      char *cmd = __alloca (80 + strlen (path));
      (mkdir command line parsing)
      status = system (cmd);
      [..]

That's right, it just relays. I'm not sure how it gets to the kernel. I
suspect with a system call somewhere.

What I know is that it arrives in the kernel, in the fs source files: namei.c
for the vfs part and <fs-name>/namei.c for filesystem specific implementations
(that are called by the vfs code in the end, I guess).

P.S. Feel free to correct me. I only concluded this by poking around the
sources a bit; I'm not into kernel development myself.

~~~
caf
That's a fallback mechanism, used by glibc on any "unix" system that doesn't
have a more specific implementation deeper in the sysdeps/ hierarchy (there's
a Linux one that defers to the syscall somewhere in there).
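
From the application's side none of this plumbing is visible. A minimal sketch
of calling the libc wrapper and reading the kernel's error back through errno
(make_dir is a hypothetical helper name, not part of glibc):

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create `path` through the libc mkdir() wrapper.
 * Returns 0 on success, otherwise the errno value the call left behind. */
static int make_dir(const char *path, mode_t mode)
{
    if (mkdir(path, mode) == 0)
        return 0;

    int saved = errno;   /* save it before perror() can overwrite errno */
    perror(path);        /* e.g. "<path>: File exists" on stderr */
    return saved;
}
```

On Linux the wrapper ends up in the mkdir or mkdirat syscall; the caller only
sees the return value and errno (EEXIST, ENOENT and so on).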

------
probablyrobots
Back in school I helped write a file system based on ramfs that used the RAM
in all of the computers in our lab as one big shared super fast hard disk. It
was a great learning experience.

I wouldn't say it's necessary for every programmer to know how the Linux VFS
works and how files, directories and links are stored behind the scenes, but
it is really interesting. Here's a good description of VFS
[http://www.mjmwired.net/kernel/Documentation/filesystems/vfs...](http://www.mjmwired.net/kernel/Documentation/filesystems/vfs.txt)
. The implementation in ramfs is pretty simple (compared to ext3). You can
find it in your kernel source in fs/ramfs/inode.c. There's a function in there
called ramfs_mkdir that allocates and configures a new inode. Anyway, that's
how I'd answer the question.

------
rnicholson
Why wouldn't this question be on StackOverflow?

~~~
MarkBook
There's a scavenging pack of higher rep OCD types on SO who immediately kill
any question they perceive to be weak.

~~~
philwelch
StackOverflow has its own deletionists now? Is there any other site that does?

I think now that it's happened to more than one online community, it's worth a
lot of careful thinking to figure out why and how these communities develop
deletionist subcultures.

