
Distri: 20x faster initramfs (initrd) from scratch - based2
https://michael.stapelberg.ch/posts/2020-01-21-initramfs-from-scratch-golang/
======
andrewstuart
For those who haven't dug that deep.

initramfs is essentially a tiny Linux root filesystem image that the kernel
loads into RAM and boots from, before it goes on to boot from hard disk.

You can actually configure Linux to never go on to the next step of booting
from hard disk, and run the operating system purely from RAM. Indeed there are
distros such as Tinycore Linux that are designed exactly this way, which can
make things very fast, compact, secure, and diskless.

~~~
megous
You can also go in reverse, and escape the running distribution's mounted / by
pivoting to a tmpfs /, kill all processes and unmount the old root. Figuring
out how to do this was quite fun. Systemd makes it actually fairly easy,
because you can easily hijack the PID 1 if it's systemd, which is necessary to
be able to umount the old /.

~~~
Hello71
systemd makes it the _most_ easy. all you need is systemctl switch-root ROOT.
on other inits, you usually need to replace /sbin/init, then use some non-
portable (sysvinit is not the only competitor) method of telling init to re-
exec itself. sometimes that method doesn't even stop the running services, so
you need to either manually stop them, or kill them.

~~~
CameronNemo
In what situations would you expect a re-exec to stop the services? The normal
case is that the reexec preserves state.

~~~
Hello71
IIRC, some init implementations have this capability. I'm not saying that it's
good in general, but just that it facilitates this particular case.

------
matttproud
If the article reads as a so-what matter, it is worth remembering that the
major Linux distributions often rebuild the ramdisk on kernel upgrade as a
post-installation step, which is semi-frequent. Worse is how slow rebuilding
is.

~~~
semi-extrinsic
On Arch it's like ten seconds tops. How slow is it on others?

~~~
garmaine
Same order of magnitude. But that’s kinda ridiculous, no?

~~~
vbezhenar
Windows updates can take 10-30 minutes with computer unusable for anything
else. 10 seconds of background activity does not sound too bad. Why do you
care? You should be able to work on your tasks at this time.

~~~
GordonS
While Windows updates are nowhere near 10s, nowadays they are also nowhere
near 30m.

 _Years_ ago, _rarely_, you'd get a Windows update that took up to 30m to
complete. These days it's more like a few minutes - certainly noticeable, but
let's not exaggerate.

~~~
z3t4
If the upgrade fails it's very annoying though. First you have to wait 10
minutes for the upgrade, then it reboots, then it upgrades some more, then it
discovers something is wrong, it reboots, uninstalls the update which takes
maybe minutes, reboots again. That can easily kill half an hour if not more of
your work day, as you are not able to use the computer when it's upgrading and
rebooting. Compared to Linux where you rarely need to reboot after an update.
And you can also choose _when_ you want to upgrade, e.g. you can shut off your
computer without upgrading. And most importantly, Linux lets you _start_ your
computer without having to wait 30 minutes for upgrades.

I'm currently using Linux for Windows apps that are no longer supported by
Windows, e.g. they don't even work in compatibility mode. But they do work in
Wine!

~~~
viraptor
If the upgrade fails, that's a completely different issue, not a typical path.
We need to separate those, otherwise we'll end up with "On <any> system, when
the upgrade catastrophically fails, I have to reinstall the whole machine, so
upgrades take hours."

~~~
pixl97
Eh, windows updates, especially when you're including .net 4+ take a pretty
considerable amount of time, especially when on a HDD. Server 2016 updates are
very memory hungry.

------
Mave83
We at croit.io use it to boot all systems in our environment. Debian uses a
better live-boot than most other distributions with Dracut (only half the ram
required).

After more than 10 years of PXE booting most systems, I personally can
definitely say it is rock solid, reduces maintenance times, and makes scaling
simple.

------
bogomipz
I really enjoyed this article. However the following passage seemed a bit
hand-wavy to me:

>"How will our userland program know which kernel modules to load? Linux
kernel modules declare patterns for their supported hardware as an alias,
e.g.:"

    
    
      initrd# grep virtio_pci lib/modules/5.4.6/modules.alias
      alias pci:v00001AF4d*sv*sd*bc*sc*i* virtio_pci
    

Devices in sysfs have a modalias file whose content can be matched against
these declarations to identify the module to load:

    
    
      initrd# cat /sys/devices/pci0000:00/*/modalias
      pci:v00001AF4d00001005sv00001AF4sd00000004bc00scFFi00
      pci:v00001AF4d00001004sv00001AF4sd00000008bc01sc00i00
    
    

>"Hence, for the initial round of module loading, it is sufficient to locate
all modalias files within sysfs and load the responsible modules."

Could someone elaborate on this a bit? I'm familiar with udevd. Is /sysfs
enumerated first and then a match is searched for in
lib/modules/5.4.6/modules.alias? Or the other way around? Are these files
parsed before normal kernel PCI bus enumeration then? Any other specifics or
detail would be greatly appreciated.

~~~
secure
> Is /sysfs enumerated first and then a match is searched for in
> lib/modules/5.4.6/modules.alias?

Yes :)
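
The order confirmed above (read each device's modalias from sysfs, then look for a matching alias pattern) can be sketched roughly as follows. This is only an illustration, not the article's Go code: the function names are made up, and it assumes the alias patterns behave like shell-style globs.

```python
import fnmatch
import os


def parse_modules_alias(text):
    """Return (pattern, module) pairs from modules.alias content."""
    aliases = []
    for line in text.splitlines():
        # Lines look like "alias <pattern> <module>"; skip comments and blanks.
        if not line.startswith("alias "):
            continue
        pattern, _, module = line[len("alias "):].strip().rpartition(" ")
        if pattern and module:
            aliases.append((pattern, module))
    return aliases


def modules_for(modalias, aliases):
    """Return every module whose alias pattern matches one device modalias."""
    return [module for pattern, module in aliases
            if fnmatch.fnmatchcase(modalias, pattern)]


def scan_sysfs(aliases, sysfs_root="/sys/devices"):
    """Walk sysfs, read each modalias file, and collect the modules to load."""
    wanted = set()
    for dirpath, _dirs, files in os.walk(sysfs_root):
        if "modalias" not in files:
            continue
        try:
            with open(os.path.join(dirpath, "modalias")) as f:
                modalias = f.read().strip()
        except OSError:
            continue  # some sysfs entries cannot be read; ignore them
        wanted.update(modules_for(modalias, aliases))
    return wanted
```

With the alias line and the first modalias value quoted above, `modules_for` returns `["virtio_pci"]`. Actually loading the module (and its dependencies) is a separate step not shown here.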

------
andrewstuart
This is 20x faster _build time_ to create the initramfs image, is that
correct?

It's not 20x faster to boot?

~~~
tlamponi
Yes, 20x faster build of the initramfs image. But boot is also a bit faster due
to the faster Go-based initial init process, I'd guess:

> Measuring early boot time using qemu, I measured the dracut-generated
> initramfs taking 588ms to display the full disk encryption passphrase
> prompt, whereas minitrd took only 195ms.

~~~
glandium
Is that an interesting difference, though? At least on my machine, boot time
is dominated by... grub loading the kernel and initramfs. Which is actually
ridiculous when you consider it's a few dozen megabytes on an NVMe. I haven't
investigated what's up yet.

~~~
arminiusreturns
fyi a quick way to see this is with systemd-analyze

    
    
      systemd-analyze plot > bootplot.svg
      systemd-analyze critical-chain
      systemd-analyze blame

~~~
glandium
Systemd-analyze doesn't know about grub.

------
Twirrim
I want to like dracut, I really do. It's amazingly flexible and dynamic.
Mostly, though, I find it a complete pain in the neck any time I go remotely
outside the default behaviour.

It gets even worse when you enable debug. It's a series of really nested bash
scripts, and enabling debug does "set -x" for everything. That spews out so
much information over serial console / screen that actually tracking down the
bug is almost impossible. The sheer complexity of the embedding and sequence
of events means that even innocuous changes can have a way bigger impact than
expected. RedHat changed something between RHEL7.6 and RHEL7.7 and suddenly we
were having all sorts of issues with CentOS7 instances booting off iSCSI.

The one big positive about it (from my perspective at least) is that it _does_
make it pretty easy to inject additional functionality in to the initramfs.
Just drop in a bash script with a number prefix relevant to where you want it
to occur in the boot order.

------
cure
Related work is being done in the u-root project
([https://github.com/u-root/u-root](https://github.com/u-root/u-root)).

------
ghthor
This is an excellent article giving insight into the initrd pattern.

------
hinkley
He cuts 390ms off of time to passphrase prompt. Does that translate to 390ms
faster boot time?

I can’t remember the last time I sat in front of a Linux box during reboot. Is
that a pittance or a noticeable improvement?

~~~
michaelmrose
With full disk encryption you can turn on a system and have a usable
lightweight desktop up in 20 seconds.

Alternatively you can suspend and resume in most cases and have it in 2.

~~~
swiley
If your DE is light enough just put it and its dependencies in the initramfs
with your home folder (and /etc /var?) encrypted.

I used to do this, although without the encryption.

------
djsumdog
I use this on all of my systems. It's great and supports unlocking disks with
full disk encryption.

[https://github.com/slashbeast/better-initramfs](https://github.com/slashbeast/better-initramfs)

------
Hello71
this article is oddly really knowledgeable and also really unknowledgeable at
the same time. despite its length, it doesn't explain what an initramfs is or
how it's created until basically the end (excluding the appendix). an
initramfs is simply a cpio archive with a minimum of one file, an executable
"/init", then compressed with a standard compressor. currently gzip, xz, and
lz4 are the useful ones.
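
To make the "cpio archive with one file, then compressed" description concrete, here is a minimal sketch that writes a "newc"-format cpio archive containing a single `init` entry and gzips it. The helper names are invented for illustration, and the payload is a placeholder: a bootable image would need a real, self-sufficient `/init` (for example a statically linked binary).

```python
import gzip


def _pad4(buf):
    """Pad a bytearray with NULs to a 4-byte boundary, as newc requires."""
    buf.extend(b"\0" * (-len(buf) % 4))


def cpio_entry(name, data, mode):
    """Build one newc-format cpio record for a file name and its contents."""
    name_bytes = name.encode() + b"\0"  # namesize includes the trailing NUL
    fields = [
        0,                # inode
        mode,             # file type and permission bits
        0, 0,             # uid, gid
        1,                # nlink
        0,                # mtime
        len(data),        # filesize
        0, 0, 0, 0,       # devmajor, devminor, rdevmajor, rdevminor
        len(name_bytes),  # namesize
        0,                # check (unused in the 070701 variant)
    ]
    # Header: 6-byte magic followed by 13 fields of 8 hex digits each.
    record = bytearray(b"070701" + b"".join(b"%08X" % f for f in fields))
    record += name_bytes
    _pad4(record)
    record += data
    _pad4(record)
    return bytes(record)


def build_initramfs(init_payload):
    """Return a gzip-compressed cpio archive holding only an executable init."""
    archive = cpio_entry("init", init_payload, 0o100755)
    archive += cpio_entry("TRAILER!!!", b"", 0)  # end-of-archive marker
    return gzip.compress(archive)


if __name__ == "__main__":
    with open("initramfs.img", "wb") as f:
        f.write(build_initramfs(b"#!/bin/sh\n"))  # placeholder payload only
```

Entry names are stored without a leading slash, which is why the record is called `init` rather than `/init`.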

this means that an optimized initramfs creator ought to take as long as a tar
file creator, because it basically is a glorified tar file creator. on my
system, with a full cache, mkinitcpio takes about 1.5 seconds to run. my
custom initramfs generator (not released yet) takes 0.03 seconds to run using
"cat" as compressor. it is written entirely in POSIX shell and uses only the
external commands ldd, gen_init_cpio, and the compressor. however, if gzip -9
is used for the compressor, then compressing the 12 MB takes 1.8 seconds. so,
we can see that compression can significantly inflate (pun intended) the time
consumed. as a matter of fact, looking at
[https://github.com/distr1/distri/blob/master/cmd/distri/init...](https://github.com/distr1/distri/blob/master/cmd/distri/initrd.go),
it appears to pass no options to gzip, so I assume the default is used,
probably -6.
[http://man7.org/linux/man-pages/man8/dracut.8.html](http://man7.org/linux/man-pages/man8/dracut.8.html)
indicates that dracut uses gzip -9 by default, which would increase the time
required. if dracut is configured in a non-default manner (e.g. lz4 -12) that
would further increase the time.
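
The effect of the compression level is easy to measure for yourself. The sketch below times Python's `zlib` (the same DEFLATE algorithm gzip uses, so levels 1, 6 and 9 only approximate `gzip -1`, `-6` and `-9`) on whatever bytes you hand it. The function name and the `initramfs.cpio` path are assumptions for illustration; actual timings depend on the machine and the data.

```python
import time
import zlib


def time_compression(data, levels=(1, 6, 9)):
    """Return {level: (seconds, compressed_size)} for zlib at each level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        results[level] = (time.perf_counter() - start, len(compressed))
    return results


if __name__ == "__main__":
    # Hypothetical input: an uncompressed cpio archive you generated earlier.
    with open("initramfs.cpio", "rb") as f:
        data = f.read()
    for level, (seconds, size) in time_compression(data).items():
        print(f"level {level}: {seconds:.3f}s, {size} bytes")
```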

the author credits Go, concurrency, no external dependencies, and threads (?!)
for the improved performance. this is manifestly unnecessary: no threads are
needed to improve the performance of "tar", firstly because it is I/O bound,
and secondly because the compressor is usually the slowest part anyways. it
might be possible to slightly improve the performance by using native code,
but there is no substantial difference between running mkinitcpio in 1.0 vs
1.5 seconds, and there's definitely no difference between 10.0 and 11.5
seconds (lz4 -12 takes about 10 seconds to run, and I assume the remaining 1.5
seconds can be fully optimized to require 0 seconds).

regarding the rest of the post, it's... a little weird. I don't understand why
anyone would manually trudge through modules.dep files when that is the main
objective of the "modprobe" command, as opposed to "insmod", which does no
dependency resolution. (modprobe also supports config files, aliases, and some
other things, but the main and original objective is dependency resolution.)
it's also a bad idea to reimplement libblkid: it supports a ton of
filesystems, many of which one might actually want to use as a root
filesystem, but are not supported by this basic implementation, including xfs,
btrfs, or zfs. it also doesn't appear to support LUKS2. libblkid isn't even
necessary: busybox findfs works perfectly well for most common filesystems,
and runs very quickly (0.02 seconds on my machine). but, again, the main cost
when booting with a necessarily cold cache is likely to be I/O, not spawning a
process. I don't understand why someone would faff about with manually parsing
ELF files when ldd works just fine with simple text parsing and handles
interpreter, rpath, and transitive dependencies built-in.
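
As an illustration of how little text parsing ldd's output needs, here is a rough sketch that pulls the resolved library paths out of it. The function names are invented, and the line shapes handled are the common glibc ones (`name => /path (addr)`, a bare `/path (addr)` for the interpreter, `=> not found`, and `statically linked`). Since ldd may execute the binary it inspects, this should only be pointed at trusted files.

```python
import subprocess


def parse_ldd(output):
    """Extract absolute library paths from ldd's text output."""
    paths = []
    for line in output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "=>" in fields:
            # "libc.so.6 => /lib/libc.so.6 (0x...)": the path follows the arrow.
            idx = fields.index("=>")
            candidate = fields[idx + 1] if idx + 1 < len(fields) else ""
        else:
            # "/lib64/ld-linux-x86-64.so.2 (0x...)": the interpreter line.
            candidate = fields[0]
        # Drops "not found", the vdso entry, and "statically linked".
        if candidate.startswith("/"):
            paths.append(candidate)
    return paths


def library_deps(binary):
    """Run ldd on a trusted binary and return the libraries it resolves to."""
    result = subprocess.run(["ldd", binary], capture_output=True, text=True)
    return parse_ldd(result.stdout)
```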

this all just seems overly complicated to me. my initramfs generator is 77
lines of code, including automatic ldd dependency generation, and init is 37
lines, including LUKS decryption, dropbear for remote password entry, and
e2fsck with automatic reboot if requested by e2fsck, for a total of about 110
lines. I didn't count the lines of code for minitrd, but adding together the
two files linked is already about 1000 lines of code. mkinitcpio has far more
flexibility and is only about 1500 lines for the core code.

------
3xblah
"I currently have no desire to make minitrd available outside of distri. While
the technical challenges (such as extending the generator to not rely on
distri's hermetic packages) are surmountable, I don't want to support people's
initramfs remotely.

Also, I think that people's efforts should in general be spent on rallying
behind dracut and making it work faster, thereby benefiting all Linux
distributions that use dracut (increasingly more)."

As described on the website unixsheikh.com recently posted to HN^1 and making
the front page^2, I think this illustrates one difference between "Linux" and
BSD. Much of what unixsheikh describes has been said before; however, I think it
still remains true today. I am currently using Linux and open to being
persuaded otherwise. I want to learn "the Linux way" if there is such a thing.

That said, I always found booting to ramdisk, "root on mfs", "mfsroot", or
whatever people might call it, very straightforward on BSD. They maintained
and provided all the necessary tools, some of which have no Linux equivalent.
Things stayed more or less the same so once the process was learnt, there
seemed little chance of the goalposts moving. One did not need to be a
contributor to understand these processes. Sometimes I get the feeling I am
just not smart enough for Linux, and it _seems_ like these processes are
constantly subject to change. (I hope to find the truth is otherwise.)

As a BSD end user, I found it relatively easy to compile a custom kernel and a
custom, statically-linked, multi-call binary as a userland, like busybox but
using utilities "as is" not pared down with features missing, to be inserted
into the kernel. I could boot from USB stick^3 to ramdisk root, pull out the
USB stick, e.g., if I needed the USB port, then chroot into any userland. This
was extremely flexible and allowed for easy experimentation. Sometimes I might
have that userland prepared on internal media, the USB stick, or some other
external media, or I might create it dynamically by downloading and extracting
binary sets. This "targetroot"^4 userland might be mounted, usually r/o, on
HDD or it might just be mfs/tmpfs.

Apparently, the author was inspired to make his own distribution because he
found the package installation process too slow for large packages, e.g.
qemu.^5 He tried using SquashFS images to speed up the process. I did the same
thing in 2012 using cloop2 compressed filesystem images for large packages,
e.g. ghostscript, on NetBSD. I would keep them on the USB stick and just mount
them over an executable directory as necessary. I was not using these large
programs too frequently and I was running entirely from tmpfs, no HDD, so I
did not want them occupying space and depleting RAM.

1\.
[https://news.ycombinator.com/from?site=unixsheikh.com](https://news.ycombinator.com/from?site=unixsheikh.com)

2\.
[https://news.ycombinator.com/front?day=2020-01-20](https://news.ycombinator.com/front?day=2020-01-20)

3\. Thanks to the excellent BSD bootloaders I could manually choose the kernel
during boot. I would have several kernels on the USB stick. Of course the
kernel did not have to reside on the USB stick, I could load it from any
connected media. At one point I was able to boot NetBSD kernels using the
FreeBSD bootloader.

4\. TBH, I still do not know the true story behind targetroot. Things in BSD
always have a history behind them. Surely there is a story behind the
"targetroot" directory.

5\.
[https://michael.stapelberg.ch/posts/2019-08-17-linux-package...](https://michael.stapelberg.ch/posts/2019-08-17-linux-package-managers-are-slow/)

"How can minitrd be 20 times faster than dracut?

dracut is mainly written in shell, with a C helper program. It drives the
generation process by spawning lots of external dependencies (e.g. ldd or the
dracut-install helper program). I assume that the combination of using an
interpreted language (shell) that spawns lots of processes and precludes a
concurrent architecture is to blame for the poor performance."

Interesting that dracut uses bash, not dash, for its scripts. Would dash be any
faster? Are bash features a necessity for the simple tasks dracut performs?

Also, could dracut use readelf instead of the bash shell script ldd? According
to Wikipedia, the Linux man page for ldd asserts it is a security risk.

Does default Ubuntu include a manpage for ldd? Mine is missing.

~~~
emj
The only thing in your post I really want in Linux is a better bootloader; the
FreeBSD one is pretty neat. I've seen bootloaders that include kexec to change
Linux kernels in early boot. But I seldom see the need for a better pre boot
env, and I think this is the general consensus. Also the key to boot speed is
to forgo initramfs completely, at least if you look at those sub 500ms boots
that are out there.

~~~
3xblah
Seems there is an ever-growing list of Linux bootloaders. Which ones are not
satisfactory and why?

I was very satisfied with the BSD bootloaders. Plus the kernels comply with
the multiboot specification so it was possible to use other bootloaders. This
way I could boot kernels from different BSD projects from the same USB stick
using a single bootloader. I do not use GRUB.

~~~
LargoLasskhyfv
Since you seem to like to experiment, have you ever heard of

[https://www.plop.at/en/bootmanagers.html](https://www.plop.at/en/bootmanagers.html)
?

Though that seems to be for older hardware, mostly. The one for current systems
is in development. Anyways, on older hardware it enabled otherwise impossible
things for me.

