Ramroot – Run Arch Linux Entirely from RAM (2017) (ostechnix.com)
146 points by e18r on June 20, 2019 | hide | past | favorite | 81 comments



Many OSes out there take advantage of RAM disks. Alpine Linux, for instance, can boot with its root device in RAM by default. SmartOS runs its boot image completely from RAM and keeps your storage free for a few configs and VMs.

Virtually all network-booted computers run their OSes from RAM. If you have never PXE booted a machine, it is a beautiful experience once you get past a few initial hurdles: easy upgrades and rollbacks, being able to use a machine with different contexts/platforms by just rebooting, and having your servers cleaned up just by power cycling (assuming you don't have local storage).

If you enjoy the idea of RAM root devices, please try PXE/iPXE to boot your computer from the network. Also, if you have a sufficiently fast network... it is probably faster than booting from disk!

EDIT: I missed the word "all" on the second paragraph, and another typo... sorry!
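If you want to experiment, dnsmasq can run in proxy-DHCP mode next to your existing router, so you don't have to touch your real DHCP server. A minimal sketch (the subnet, paths, and boot filename below are illustrative assumptions, not from any particular setup):

```shell
# Write a minimal dnsmasq PXE config (illustrative values; adjust subnet and paths).
cat > /tmp/pxe.conf <<'EOF'
# Proxy-DHCP mode: leave address assignment to the existing DHCP server
dhcp-range=192.168.1.0,proxy
# Serve boot files from dnsmasq's built-in TFTP server
enable-tftp
tftp-root=/srv/tftp
# Filename handed to the PXE ROM (e.g. an iPXE chainloader placed in tftp-root)
dhcp-boot=undionly.kpxe
EOF
echo "wrote $(grep -cv '^#' /tmp/pxe.conf) directives"
```

You'd then run something like `dnsmasq --conf-file=/tmp/pxe.conf` as root; only PXE clients pay attention to the proxy responses.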


Couldn't agree more, I've got at least a couple of dozen computers (from workstations and thin clients through 486s and Pi's) at home and only the main server and its backup have persistent storage.

Especially useful for Windows; it is rather slower diskless (compared to BSD/Linux), but it makes Windows instances disposable.

Takes a bit of effort (more these days, FUVM systemd) but having a RAMdisk + diskless RO /usr is a great way of having computers everywhere all singing the same song.


Ditto the love for PXE — booting from a SAN appliance stocked with a few dozen fast SSDs is amazing. When I was doing big VMWare deployments, I had a PXE bootstrap setup that helped me build an entire 40-node cluster from bare metal in an hour.

Even in the event you have to use some secure zone crap, it’s pretty trivial to build since all PXE requires are dhcpd and tftpd — which are often available by default on nearly any *nix variant.


PXE boot FTW! I used to work on a project that used LTSP to run a couple dozen thin clients. The thin clients were slow, but the OS was fast. This wasn't entirely the OS in RAM, as a big chunk of the file system was mounted over the network, but most of the OS was in RAM.


Sadly, PXE booting a modern Linux OS is easier said than done. Gone are the days of just handing out the kernel over TFTP and providing an NFS root. systemd gets super cranky and you can't even boot the thing without setting some undocumented flags.


A little tooting of my own horn here, but nyble[1] allows you to build a ramdisk bootable image and kernel, that you can serve trivially with any http server and iPXE. A related project, tiburon[2] allows you to make this JSON db controlled. Still working on documenting them in greater depth, but you can see an example of what nyble can do[3].

[1] https://github.com/joelandman/nyble

[2] https://github.com/joelandman/tiburon

[3] https://scalability.org/2019/05/nyble-ftw-installing-my-ramb...


What? I just did this today with Ubuntu 18, using dnsmasq and an nfsroot based on a virt-builder image. It runs fine.


I highly recommend using pixiecore as an iPXE server -- your remote machines only need to have PXE enabled. Previously, it was a pain in the ass to install the TFTP server and configure a DHCP server that responds only to PXE requests (as the standard allows) instead of binding to every request. pixiecore does everything for you in like 10 seconds: https://github.com/danderson/netboot/tree/master/pixiecore. We're using it on-premises to spin up a server rack into stateless Kubernetes nodes. None of the blades have a hard drive/SSD :)


PXE booting Linux live media is fairly straightforward. Grub/syslinux syntax translates pretty easily to iPXE (which is simply a different type of bootloader). You just provide a kernel, a ramdisk, and whatever kernel arguments you need. If you're trying to roll your own ramdisk environment from scratch you may run into trouble, but in general network booting Linux isn't any trickier than making a bootable ISO.
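For the roll-your-own case, the ramdisk format itself is unintimidating: a gzipped newc-format cpio archive with an executable /init at the top. A skeletal sketch (the real contents, e.g. a static busybox, are left out, and all paths here are illustrative):

```shell
# Build a skeletal initramfs image: a cpio (newc) archive, gzipped.
mkdir -p /tmp/initramfs/bin /tmp/initramfs/proc /tmp/initramfs/sys
cat > /tmp/initramfs/init <<'EOF'
#!/bin/sh
# A real image would mount /proc and /sys here, assemble the root
# filesystem (NFS, tmpfs copy, etc.), and switch_root into it.
exec /bin/sh
EOF
chmod +x /tmp/initramfs/init
( cd /tmp/initramfs && find . | cpio -o -H newc 2>/dev/null | gzip ) > /tmp/initrd.img
gzip -t /tmp/initrd.img && echo "initrd built: $(stat -c %s /tmp/initrd.img) bytes"
```

The kernel unpacks such an archive into a RAM-backed rootfs at boot; you hand it over with `initrd` in pxelinux/iPXE.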


How did the systemd cabal manage to mess that up? I ditched that thing a while ago for good reasons, so I don't keep track anymore.


I'm not entirely sure of everything, but problems with UUIDs (which we expected), and some dbus signals not being generated when run over a NFS mounted root causing the boot to hang. We hacked some timeouts to get around the problem but never figured out exactly where the signals were supposed to be generated from.

A hint if you're doing this on Linux: we PXE boot an iPXE loader to boot the machines. Unfortunately it doesn't work properly on UEFI; you have to use BIOS boot.

If it helps, I have notes on how to set that up:

Go to http://rom-o-matic.net and choose gPXE git. Click on the "Customize" button to expand all of the options.

Choose:

  1. PXE bootstrap loader image [Unload PXE stack] (.pxe)
  2. all-drivers
  3. PCI VENDOR CODE: [blank]  PCI DEVICE CODE: [blank]

  X CONSOLE_PCBIOS
  _ CONSOLE_SERIAL
    BANNER_TIMEOUT [20]
  _ NET_PROTO_IPV6
    (Serial Port Options are irrelevant)
  X DOWNLOAD_PROTO_TFTP
  X DOWNLOAD_PROTO_HTTP
  _ DOWNLOAD_PROTO_HTTPS
  _ DOWNLOAD_PROTO_FTP
  _ SANBOOT_PROTO_ISCSI
  _ SANBOOT_PROTO_AOE
  X DNS_RESOLVER
  X IMAGE_ELF
  X IMAGE_NBI
  X IMAGE_MULTIBOOT
  X IMAGE_PXE
  X IMAGE_SCRIPT
  X IMAGE_BZIMAGE
  X IMAGE_COMBOOT
  X AUTOBOOT_CMD
  X NVO_CMD
  X CONFIG_CMD
  X IFMGMT_CMD
  X IWMGMT_CMD
  X ROUTE_CMD
  X IMAGE_CMD
  X DHCP_CMD
  _ SANBOOT_CMD
  X LOGIN_CMD
  _ TIME_CMD
  _ DIGEST_CMD
  X PXE_CMD
  _ IPV6_CMD
  _ CRYPTO_80211_WEP
  _ CRYPTO_80211_WPA
  _ CRYPTO_80211_WPA2

Embedded Script:

  #!gpxe
  dhcp any
  initrd http://<your_server_here>/initrd.img
  kernel http://<your_server_here>/pxelinux.0
  imgargs pxelinux.0 root=/dev/nfs rw boot=nfs nfsroot=<your_nfs_server_here>:/netroot root ip=dhcp nfsrootdebug
  boot pxelinux.0


> Also, if you have a sufficiently fast network... it is probably faster than booting from disk!

Is this also true on Windows, with all the hardware initialization it has to do on boot when it finds new hardware?


I ran a cluster of 6 nodes in 2001 (before PXE) with network boot and NFS root. It was awesome.


Never having finished my home PXE setup is a great sadness of mine.


Turn that sadness into joy! Even an RPi can host your TFTP server and your boot files... today!

http://ipxe.org/start has all the info you'll need.


I shall attempt soon then. Thanks


I built something like this back in 2002 when I was at Red Hat, for a client that wanted their firewalls to have read-only configurations on a diskless system. They would update the rules/config/system by burning a new CD and booting it up.

It worked basically the way a Live CD works: creating a temporary filesystem in RAM. I only learned later that Live CDs already existed (I didn't know at the time, and the Internet wasn't as good at finding things as it is today :) )


I wonder whether this achieves anything performance-wise that just cat'ing every file to /dev/null to warm up the buffer cache wouldn't achieve.

In theory, the kernel uses a buffer cache that will hold on to disk pages until they're invalidated by writes. It'll evict the cache if there's memory pressure. But this setup will presumably just crash if there's memory pressure, so that doesn't seem like a win for the RAM disk.


I can't hard-core prove anything, but I've read the theory on how if I'm accessing warm disk cache, putting things into a RAM disk shouldn't speed them up, and every time I've done it and tested it, putting things in a RAM disk is faster than having it on disk, even when the very act of copying the stuff into RAM should definitely have just warmed everything up just before the test. I've not done it very often, but every time I've tried it in the last ~15 years it's been the case. I don't know exactly why. I don't think I've done this test with my NVMe disk, though. Nominally, since the entire point of this exercise is that we never physically touch the drive it shouldn't matter what sort of drive it is we aren't touching, but reality and theory can differ.


It shouldn't matter _if_ all you're doing is reading. If you boot Linux, there's a ton of things that do small writes and then call fsync. That won't do anything on a RAM disk but is relatively slow, even on an SSD.
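That fsync cost is easy to observe: /dev/shm is tmpfs on most Linux systems, so doing many small synced writes there versus a (likely) disk-backed directory shows the gap. A rough sketch, assuming $HOME sits on a real disk; the absolute timings will vary wildly by machine:

```shell
# Compare many small fsync'd writes on a (likely) disk-backed dir vs tmpfs.
for dir in "$HOME" /dev/shm; do
  start=$(date +%s%N)
  i=1
  while [ "$i" -le 50 ]; do
    # conv=fsync forces dd to fsync the output file before exiting
    dd if=/dev/zero of="$dir/fsync_test_$i" bs=4k count=1 conv=fsync status=none
    i=$((i + 1))
  done
  end=$(date +%s%N)
  echo "$dir: $(( (end - start) / 1000000 )) ms for 50 fsync'd writes"
  rm -f "$dir"/fsync_test_*
done
```

On tmpfs the fsync is essentially a no-op, which is exactly the effect the comment above describes.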


Interesting! Did you make sure there aren't any writes happening in the warm disk cache case, even for things like atime?


Not in the rigidly scientific sense, but on a Linux system where you are the only user and nothing else "major" is happening, there isn't that much writing of any kind going on.


That is interesting. Could you please describe in short your methodology? Did you try to read/write many small files, or one big file?


More than any performance boost that could be achieved from this, I think what's cooler is the ability to unplug the storage device that held the OS.

For example, if you want to run hardware diagnostics on multiple machines at the same time with one consistent, custom environment, you can just set up a single USB stick with this and plug it into each machine in turn to boot them up. You wouldn't need to keep the USB stick plugged in while they're running.
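Mechanically, that unplug-after-boot trick is a small amount of initramfs-side work. The sketch below shows the shape of such a hook; the device names and tmpfs size are made up, and since it needs root (and an initramfs context) to actually run, here it is only written out and syntax-checked:

```shell
# Sketch of a copy-root-to-RAM boot hook; written to a file and syntax-checked only.
cat > /tmp/ramroot-hook.sh <<'EOF'
#!/bin/sh
mount -t tmpfs -o size=8G ramroot /new_root   # RAM-backed root filesystem
mount /dev/sdb2 /mnt/usb                      # the USB install media (hypothetical device)
cp -a /mnt/usb/. /new_root/                   # copy the entire system into RAM
umount /mnt/usb                               # the stick can be unplugged after this point
exec switch_root /new_root /sbin/init         # hand PID 1 over to the in-RAM system
EOF
sh -n /tmp/ramroot-hook.sh && echo "syntax OK"
```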


One of the big advantages is that you do not need to install file systems on your storage for single purpose servers, which allows your application to run on top of the raw block devices. Databases and similar run much better in this configuration (assuming they support raw block devices), especially if you are running on top of a hypervisor.


Do you know of or could list which databases support raw block devices?


Most of your classic closed source RDBMSs (e.g. Oracle) had this as an option; no idea if they still do. Some purpose-built cloud databases operate this way because it makes perfect sense in that environment. Many domain-specialized database kernels (e.g. for real-time sensor data models) support or require this. It isn't something I recall seeing in open source, but that is probably because it only works if you are doing full kernel bypass, which also hinders portability.

I started implementing raw block device support for my own databases about five years ago and it turned out to be brilliant for more reasons than I expected. I should have tried it much sooner. Ironically, it requires less code than using the file system.
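The access pattern involved is simple: fixed-size pages read and written at block offsets, with no filesystem in between. A toy sketch with dd, using an ordinary file as a stand-in for a real /dev/sdX (this is an illustration of the addressing model, not a database):

```shell
# Treat a file as a stand-in for a raw block device: 4 KiB "pages" addressed by index.
dev=/tmp/fake-blockdev
truncate -s 16M "$dev"
# Write a payload into page 3 (seek counts in bs-sized blocks); notrunc keeps the "device" size.
printf 'hello' | dd of="$dev" bs=4096 seek=3 conv=notrunc status=none
# Read page 3 back and trim to the payload length.
dd if="$dev" bs=4096 skip=3 count=1 status=none | head -c 5   # -> hello
echo
```

A real engine would open the device with O_DIRECT and manage its own page cache, which is where the "less code than a filesystem" observation comes from.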


I saw and replied to a question on StackOverflow [1] recently which discussed it. That was the first time I've tried using a block device. It was kinda fun.

1: https://stackoverflow.com/a/55737206/1111557


Yes, because a database really is just a filesystem. And a filesystem is just a (cranky) database. A database that uses files is like networking through a NAT.


Maybe sqlite? :P

  sqlite3 /dev/sda
(Don't hold me responsible if you run that as root)


Warming up the page cache is in fact already done on modern systems: http://manpages.ubuntu.com/manpages/xenial/man8/ureadahead.8...

I'd agree that making ureadahead more aggressive would make more sense than just sticking your rootfs in RAM. It shouldn't impact your boot time very much and is much more flexible!

But maybe there's some special perf benefit to the ramroot approach...


One theoretical benefit is that linux won't evict file data from a ramdisk, but it will evict it from cache and buffers if it thinks they are not needed that much. The kernel algorithms for memory management are very opaque and linux may do some suboptimal decisions, especially when applications are starving for memory.


This is neat, but nothing new. I remember using this builtin feature on live distros like Slax and DSL over a decade ago. It was fun to see old computers run (comparably) blazingly fast.


The Amiga was able to create a "Recoverable RAM Disk" that survived reboots, back in the late 80s: https://grimore.org/amiga/rad


The new thing is that it can just be switched on/off on an existing install.

Sure, people have been running distros from RAM for a while, but getting the tooling situated so it works like any other feature is super cool.


> The new thing is that it can be just be switched on/off on an existing install.

Is that only true for Arch, or is that also working on other distros?


Alternatively, one could use NBD (network block devices) to create a network block device that resides entirely on a server's RAM.

The nice thing about NBD is that it's a super simple protocol: the server runs in user space, and it's easy to modify to suit your own needs. Some time ago I built a version with block deduplication, in ~1k LOC, for a farm of diskless clusters that had very little RAM. The main disk had persistence activated, while the swap drives were pure RAM.
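A RAM-resident NBD export is just a backing file that lives on tmpfs. A sketch of the server side (the export name and size are made up; nbd-server uses an INI-style config with one section per export):

```shell
# Create a 64 MB backing file in RAM and an nbd-server config exporting it.
truncate -s 64M /dev/shm/ramdisk.img          # /dev/shm is tmpfs, so this lives in RAM
cat > /tmp/nbd-server.conf <<'EOF'
[generic]
# defaults are fine for a quick test
[ram0]
exportname = /dev/shm/ramdisk.img
EOF
echo "export ram0 -> $(stat -c %s /dev/shm/ramdisk.img) bytes backed by RAM"
```

A client would then attach it with something like `nbd-client -N ram0 <server> /dev/nbd0` (root and the nbd kernel module required on the client side).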


It's worth mentioning Tiny Core Linux[1], which runs entirely from RAM as well! It's a wonderful little distro with a small footprint. I boot TCL off a USB stick, load it entirely into RAM, and am able to use it as my daily driver.

[1]: tinycorelinux.net


I always keep TCL around as an emergency boot environment, Just In Case™.


Author here. Awesome to see some interest in my project.

I'm currently working on the next version in my spare time (you'll see it in the dev branch). Improvements include: configuration now done via /etc/ramroot.conf, ability to specify actions taken for other partitions, ability to copy files to any location only when booting from RAM (allowing custom configs and whatnot to be used when in the live environment), new install hook that includes binaries and modules rather than adding them to /etc/mkinitcpio.conf, sudo will no longer be a required package, custom memory requirement settings, and more...

Also, I have gotten this to work on Debian, Ubuntu, and Kali with minor modifications. I plan to include a makefile for installing to these distros but don't plan on packaging for them at this time.

https://github.com/arcmags/ramroot


Once upon a time I did this using Mac OS 6 on a PowerBook 100. The 100 had pseudostatic RAM so the RAM disk would persist between shutdowns. Using OS6 left enough space on the disk to also run an old-for-the-time version of Word, a perfect silent student writing machine.

Some info about it here:

http://www.pugo.org/collection/faq/21/


Surprised no one here has mentioned mfsBSD. It's an unofficial FreeBSD answer to this problem. Works very well for some maintenance tasks, like ensuring a new install's disks are clear of partitions/zpools. I have even booted its ISO via IPMI over the internet with OpenVPN.

https://mfsbsd.vx.sk/


Nowadays, there's "mfslinux" [0] (based on OpenWRT) as well (also by Martin).

[0]: https://github.com/mmatuska/mfslinux


Thank you. Wasn't aware of this. Will certainly try it out.


I'm on mobile so I'll keep this short, but I found this[0] a long time ago and it still seems to work. It uses strace, mmap, and mlock to load any program you want, and its libraries, etc., into RAM, but only those programs, so if you're short on memory, no problem. During the setup you can even mouse around in the program and preload anything involved in that. Back in the day I used it on really slow stuff like OpenOffice, Firefox, GIMP, etc., and it sped up the opening of those programs significantly. The great thing, again, is you set it to preload only the specific things you need into RAM and nothing you don't. And, once done, it's pretty much set and forget.

[0]https://forums.gentoo.org/viewtopic-t-622085-start-0.html
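A cruder shell approximation of the same idea (without mlock, so the kernel remains free to evict it): read a binary plus everything ldd says it links against, which pulls it all into the page cache. gzip stands in here for a big application; the ldd-parsing is a simplistic assumption that absolute paths in its output are the libraries:

```shell
# Warm the page cache for a binary and its shared libraries.
bin=$(command -v gzip)
libs=$(ldd "$bin" 2>/dev/null | awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i }')
for f in "$bin" $libs; do
  cat "$f" > /dev/null   # reading the file populates the page cache
done
echo "warmed: $bin plus $(echo "$libs" | grep -c .) libraries"
```

The strace/mmap/mlock approach from the forum post goes further by pinning only the pages the program actually touches, rather than whole files.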


I run Arch Linux and I'm really excited to try this out tonight. I always make huge /tmp ramdisks (50 GB+) and run everything in there that's filesystem intensive. It's so much faster. This could be perfect for easy, stateless Arch Linux servers.


"Please note that this prompt (y/N) defaults to yes with a 15 second timeout if 4G or more of RAM is detected"

I understood the capitalisation to indicate the default. Or does the default change? That seems bad if you're intending to wrap this in a script.


The author's demo/test machine only had 2 GB of RAM so perhaps it defaulted to "N" for that reason.


https://en.wikipedia.org/wiki/List_of_Linux_distributions_th...

I came to know about this from Puppy Linux.


In college, I would use computer lab computers as a proxy for my... experiments. I would just load up a small Linux distro from a CD into ram. That way, I could just reboot the computer when done, and it would reboot into Windows.


I would love to see this feature added to CentOS/RHEL. In the past, we used NFS diskless, which is anything but diskless. Hacking together an initrd that loads everything into RAM is, well, hacky. To have a dracut function or a toggle to enable this would be great for lightweight deployments, testing, labs, vagrant, etc... I'm sure RHEL must have been contemplating it, because there is /etc/sysconfig/readonly-root... unless that was just a better way to do NFS diskless.


We use warewulf (via OpenHPC) for diskless CentOS 7 compute nodes at work. Basically, the initramfs creates a tmpfs, downloads the OS image to it, and switches root to it.


Oh nice, I thought it was also using NFS diskless. I will take a look at it. Thank you!


This is nothing new, or exciting. Most Linux distros will do this either via PXE with an NFS root, or by rsyncing a rootfs to RAM and then booting, etc. etc. There are literally so many ways of doing this that it would take me eons to list them all.

Want to do something cool? Boot a whole computer cluster using BitTorrent as a backend, diskless, diskfull, at lightning speeds: https://github.com/dchirikov/luna


I believe FreeNAS does something like this during startup. You can “install” it on mirrored USB drives in case of power loss. The redundant USBs are used to install the OS back into RAM: an extra USB in case of corruption, and they’re way cheaper than SSDs. I’m assuming you still need to write config files back to the USB.

FreeNAS prefers ZFS, so it’s RAM hungry anyway. The article recommends 500 MB more than you need.

Can you do this to an Arch VM?


SSDs cost 26€ for 128GB and offer significantly more performance and reliability. USB drives are not significantly cheaper.


USB thumbdrives are far more portable than NVMe or PCIe SSDs though. Performance doesn’t matter too much in this type of setup; once the OS is loaded into RAM it shouldn’t hit the thumb drive again.


SSDs would also require additional SATA (in my case) ports and a place to install the SSDs. I used all of those for the devices that are actually storing my data.

Because this is a "real" (headless) 2U server, none of the USB ports were being used so, while dedicated SSDs for the boot volume would be nice, a pair of decent 16 GB USB flash drives have been doing a wonderful job serving as my mirrored "freenas-boot" pool.


Why do I need 128GB when 8GB will do fine? I have so many USB drives I wouldn’t even have to buy one. Microcenter gave me a free 32GB USB 3.0 flash drive last time I went.

Amazon has $4.99 16 GB USB 2.0 flash drives from SanDisk. I’m sure you could find cheaper Chinese brands.

The FreeNAS community recommends 5+ year old server hardware running refurbished or used enterprise SAS drives. Performance is not an issue when you never want to shut it down.


Darch does this as well: https://godarch.com/

It also works with Ubuntu/Debian/Void.

Here are my personal recipes: https://github.com/pauldotknopf/darch-recipes

PS: I'm the author of Darch.


Out of curiosity, does data persist after a shutdown when an OS runs on RAM? Are there dumps to a HDD?


Nope.

The Pirate Bay used to run their servers like this. They'd boot off a USB stick, pivot root to a RAM disk, and then unmount the USB stick. Then when the cops would seize the machines and cut power to save as much evidence as they could, they'd actually be doing the opposite.


> Then when the cops would seize the machines and cut power to save as much evidence as they could, they'd actually be doing the opposite.

Let's for a second assume that forensic computer technicians aren't completely incompetent, and actually understand that pulling the plug on a system that potentially contains evidence is not the first thing you do.

While this has been known to happen in the past, I can almost guarantee it's because of "helpful" police officers, much in the same way the job of any forensic technician can be ruined by the good intentions of those who don't know better.


Lol, forensic technicians are pretty incompetent overall. I remember a peer-reviewed article on PS3 forensics where they explicitly had a section on trying to run a PS3 disk image in VMware and didn't understand why they couldn't get it to work.

And this piratebay mitigation was in the early 2000s. At that point most of the anti forensic mitigations were booby traps on input channels (keyboard, console, etc) and maybe a tilt sensor that'd wipe the drive if anyone tried to access it. Under that scheme, pulling power before inspection is what you want to do.


> Under that scheme, pulling power before inspection is what you want to do.

Why? Are you saying there were no measures to make data inaccessible in case of power failure?


Unfortunately no, all data in the RAM disk is gone. You can mount a HDD after boot to save data if needed.


Does TAILS not do this? Forgive my ignorance, I only know the basic idea of it.


Most live distros do this— rather than have an initramfs that finds and mounts the removable media, they just put everything in the initramfs itself.

What's new here is being able to optionally and seamlessly copy the regular disk install into RAM on boot.


What do you mean by "flip to an existing install?"


I think the mechanism is that it's installed in a normal persistent fashion, but per-boot you can opt-in to loading the current install into memory.


Have updated my comment to be more clear.


Can you use this to repartition your disk?


I remember this feature on Xubuntu circa 2009. Not sure why this is noteworthy.


Why would this be useful?


Many reasons, but the ones that come to mind are:

- Testing - quickly deploy a version of an OS and software to a fleet, then reboot into a pristine state or new state.

- Lab cost reduction. No hard drives in most of the fleet. Less power used.

- Consistent version across many test servers. People change things, reboot or power cycle and you are back into the same state.

I've used something like this in production before using NFS diskless. That was a bit messy. For labs, testing, it is great.


Many, many years ago (~1995ish) I was tasked with putting in a small network for a factory. I settled on the newly released NetWare 4.0 for the server and ran Cat4 around the place and coax to link the hubs (yes hubs). PCs ran Win 3.11.

One of them had such a shit slow (RLL I think) hard disc, I decided to network boot it instead and saved quite a lot of money.


Because RAM is faster than disk?


A lot of modern laptops come with 32 GB of RAM on the lower end. With something like Syncthing, you only need permanent storage as a cache.


Time traveler from the future brags about lower end laptops!


And I'll bet his system still takes 3 whole seconds to launch a word processor.


And me still having an 8GB one.



