Takeover.sh – Wipe and reinstall a running Linux system via SSH without reboot (github.com/marcan)
515 points by tambourine_man on Feb 11, 2017 | hide | past | favorite | 75 comments



Pretty cool, although I'm fairly sure I would never use something like this.

What has saved my skin on a number of occasions is the ability to boot remote servers into rescue mode and chroot into the broken system. That way you can use package managers, all your diagnostic tools, and everything else the boot image doesn't provide.

Basically you just mount the different partitions, bind-mount the rescue image's /proc, /sys and /dev into the target, chroot in, and BAM, you're back in business.

For details see threads like: http://superuser.com/questions/111152/whats-the-proper-way-t...
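The dance described above can be sketched as a small helper; the device name and mount point here are assumptions, so adjust them to your layout:

```shell
# Minimal sketch of the rescue-image chroot procedure.
# Run as root from the rescue image; /dev/sda2 and /mnt are placeholders.
rescue_chroot() {
    root_dev="$1"          # e.g. /dev/sda2, the broken system's root
    target="${2:-/mnt}"
    mount "$root_dev" "$target" || return 1
    # Bind the rescue image's live pseudo-filesystems into the target
    for fs in proc sys dev; do
        mount --bind "/$fs" "$target/$fs" || return 1
    done
    chroot "$target" /bin/bash
}
# usage: rescue_chroot /dev/sda2
```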

I know that for many of you this isn't rocket surgery, but for those who don't yet know that "chroot" is the thing to google when you boot into a rescue image and discover you can't do anything, you might just remember this post.


Haha yes, this is the default install procedure for Gentoo! :D


Something like this has been my default way of installing Linux for years, and I've installed Debian, Ubuntu (with debootstrap), Gentoo and Arch this way. Typically I just add another logical volume to my LVM volume group, chroot in and install over there, and then reboot without removing the old OS install. I run the install procedure from the old OS (instead of booting from a live CD / USB stick).

I do this because I know it works and there's no guesswork about what the OS installer does. Installers aren't really intended for installing beside another system, and the default partitioning options aren't always that great (RAID, LVM, crypto, etc).
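For the curious, the rough shape of that procedure on a Debian-family system looks like this; the volume group name, size, and mirror URL are all assumptions:

```shell
# Sketch: install a fresh Debian into a new logical volume from the
# running OS, debootstrap-style. vg0/newroot and the mirror are placeholders.
install_newroot() {
    lvcreate -L 20G -n newroot vg0 &&
    mkfs.ext4 /dev/vg0/newroot &&
    mkdir -p /mnt/newroot &&
    mount /dev/vg0/newroot /mnt/newroot &&
    debootstrap stable /mnt/newroot http://deb.debian.org/debian
    # then: chroot in, install a kernel, add a bootloader entry, reboot
}
```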

Basically you only need 3 things to run Linux: kernel, initramfs and rootfs.


You really only need two (kernel and initrd/initramfs), at least to get into a basic running state. This requires building your own initrd, of course, but it's pretty common in non-graphical Linux installers (Slackware's install media still does this, IIRC).

I think it's even possible to embed the initrd in the kernel binary itself, but I've never really investigated that.


CoreOS's "system volumes" are just rather large initramfs images. This means you don't really have a rootfs partition on CoreOS, just a boot partition (containing a newer and older set of kernel + initramfs) and a "state" partition (containing whatever you like.)

This choice creates a very nice upgrade-management strategy for CoreOS clusters: rather than letting each machine have its own boot partition and asynchronously downloading updates into it, you can just stand up a PXE server to serve the current CoreOS kernel+initramfs and tell your machines to boot from that. Now the entire OS (~200MB) is streamed to each machine on boot; to "update" a machine, you just reboot it. (And, unlike a shared NFS/iSCSI rootfs disk server, you don't have to be careful when updating to tiptoe around the machines that haven't rebooted yet; they just stay happily on the old OS until you kick them over.)
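A PXE setup along those lines might look like the following pxelinux.cfg entry. The kernel and initramfs file names follow CoreOS's published PXE image naming; the cloud-config URL is an assumption:

```
default coreos
label coreos
  kernel coreos_production_pxe.vmlinuz
  append initrd=coreos_production_pxe_image.cpio.gz cloud-config-url=http://pxe-server/pxe-cloud-config.yml
```

Rebooting a machine then re-streams whatever kernel+initramfs the PXE server currently offers, which is exactly the "update by reboot" behavior described above.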

As an added benefit, if the programs you're running on those CoreOS nodes don't need any persistent storage either (i.e. they're not database nodes), then they can be entirely diskless, and just let the rootfs tmpfs hold everything.


Yes, it's a fun exercise to build a tiny Linux install that's fully on the initrd. Not that you'd want to have that kind of system in daily use outside of special applications.

> I think it's even possible to embed the initrd in the kernel binary itself, but I've never really investigated that.

Yes, the kernel config has an option to embed the initrd in the kernel image. I'm not sure if there are any advantages to this.
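For reference, this is the relevant kernel config knob; the path is a placeholder:

```
# Point this at a cpio archive, a directory, or a description file of
# the initramfs contents; the kernel then carries the initramfs inside
# its own image.
CONFIG_INITRAMFS_SOURCE="/path/to/initramfs.cpio.gz"
```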


I use the initramfs method to put a small SSH server in it that I can use to unlock full-disk-encrypted headless boxen, so I could see initrd-in-the-kernel being used in a similar way.
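On Debian-family systems the usual way to set this up is the dropbear-initramfs package; a hedged sketch (the authorized_keys path varies by release, and the key itself is a placeholder):

```shell
# Sketch: SSH server in the initramfs for remote LUKS unlocking.
setup_initramfs_ssh() {
    apt-get install -y dropbear-initramfs &&
    # Restrict the key to unlocking only. On older releases the path is
    # /etc/dropbear-initramfs/authorized_keys instead.
    echo 'command="cryptroot-unlock" ssh-ed25519 AAAA... admin' \
        >> /etc/dropbear/initramfs/authorized_keys &&
    update-initramfs -u
}
```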


The advantage of embedding the initrd is usually that you don't have to worry about a separate initrd (for example, if you're reliant on some really basic bootloader). It's also more in line with, say, OpenBSD (AFAICT), which doesn't use a separate initrd/initramfs (not sure if this is true of all BSDs).


> I'm not sure if there are any advantages to this

This comes to mind: http://kroah.com/log/blog/2013/09/02/booting-a-self-signed-l...


Having the initrd merged with the kernel is slightly less hassle when netbooting to test during kernel development.


You can ditch the initrd if you compile the drivers you need to boot the system into the kernel: as a minimal example, SATA/SCSI and ext4/JFS, which are usually built as kernel modules.
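In config terms that means flipping the relevant options from =m to =y; the exact symbols depend on your hardware, but for a typical SATA + ext4 box it would be something like:

```
CONFIG_ATA=y
CONFIG_SATA_AHCI=y       # AHCI SATA controller driver
CONFIG_SCSI=y
CONFIG_BLK_DEV_SD=y      # SCSI disk support
CONFIG_EXT4_FS=y
```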


This only works if you stick to a simple filesystem setup: no crypto, RAID, etc. Otherwise you'll need filesystem utilities (lvm2, mdadm, cryptsetup, zfsprogs, etc.) on the initrd to get your rootfs mounted.


Damn, you are absolutely right!

It never crossed my mind that LVM and the like need userspace utilities to access the filesystem, and to be honest, I thought initrd/initramfs contained only kernel modules, not executables and scripts... ^__^;


How most sane distros do it.

Build a kernel with the most common hardware built into it, and use that to bootstrap. No need for messy things like balled up temporary rootfs in a ram drive.

Initrd/initramfs have become an excuse for piling on complexities that frankly should be added by the sysadmin after initial install.


The purpose of the initrd is to bootstrap to the rootfs. This involves userspace tools like mdadm, lvm2, cryptsetup, zfsprogs, etc.

It's not very common to need kernel modules (drivers) on initrd on typical hardware.


The keywords there are 'typical hardware', but it can get worse. How about 'badly selected' hardware on a short time frame, in situ, waiting for an install and application port process to be performed by non-technical users following a recipe? Proprietary drivers for network and soft RAID, with a large enterprise Linux vendor's support and site license basically voided. There are worse things than starting an install, loading storage drivers, creating the LVM partitioning, hup'ing disk druid, installing, rebooting, adding the network driver to the initrd, and having to explain and document that manual procedure to the same people who ordered the hardware... but I've forgotten them.


It's not common to need them, perhaps, but it's still useful if you don't want to recompile your kernel just to add on-boot support for various devices. Slackware's 'mkinitrd' tool is one example of this sort of approach; you can add various modules (like for your root filesystem, keyboard, etc.) by adding to the $MODULE_LIST variable defined in '/etc/mkinitrd.conf' or by running 'mkinitrd -m $MODULES'.

You can of course accomplish similar things by just recompiling the kernel (which Slackware makes very easy to do), but if you still need to use 'mkinitrd' anyway (perhaps because you're using LVM or softraid or LUKS), it's often more convenient to just throw in the modules you need while you're at it.
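As an illustration of those flags, here's a hedged one-liner wrapped in a helper; the kernel version, module list, and root device are all assumptions, and the flags follow Slackware's mkinitrd conventions:

```shell
# Sketch: build an initrd with ext4 support for a given kernel.
# -c clears the initrd tree, -k names the kernel version, -m lists
# modules, -f names the root fs type, -r the root device.
make_rescue_initrd() {
    mkinitrd -c -k "$1" -m ext4 -f ext4 -r "$2"
}
# e.g. make_rescue_initrd 4.4.14 /dev/sda2
```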


> Basically you only need 3 things to run Linux: kernel, initramfs and rootfs.

And networking, fully functional networking.

If you've missed the firmware packages for your wifi drivers, you'll have a hard time using your package manager to get those missing bits ;)

Probably not an issue for servers, but for those who'd like to do "complex" desktop/laptop install with ZFS/btrfs roots or whatever, it's an easy mistake to make.


Arch also does it. The first time I encountered this was when I accidentally deleted my EFI partition and was unable to boot. Boot into live media, chroot, run grub-install and rejoice.


Yeah, as an Arch user I've used it a few times when a pacman -Syu has caused something to break; it's pretty easy to chroot into your system and fix the broken package.


It's funny, I knew how to do that when I borked my systems back in 2004 only because the Gentoo installation instructions were like that.


Sort of the same for me but I learnt it from the Arch Linux installation process.


This is one of those things you want to have practiced before it's critical.


I like the term rocket surgery, I will steal it from you for sure.


If I'm not mistaken, credit goes to Chris Rock (who is not me).


Pretty sure the conflation has been around ever since rocket science and brain surgery have been used as a comparison. :)


I don't tend to link youtube videos on HN, particularly of the comedic type, but I'm just going to leave this here for anyone who wants a good laugh for their morning.

https://www.youtube.com/watch?v=THNPmhBl-8I


While funny, I still cringe because of the http://www.dict.cc/?s=fremdschaemen


I was thinking the exact same thing.


That was great!


But it's not exactly brain science, now is it?


I work at an IT helpdesk and we build a custom GRML image which we write to our pen drives. We use it, for example, to quickly live-boot Debian/Ubuntu PCs and run a script that mounts all the necessary file systems (including pseudo file systems). This makes it really easy to chroot into a broken system right at a user's desk.

On our Linux installations we have separated most of the directories into different LVM partitions, such as '/home', '/usr', '/var' and of course '/' (the root). Just last week I helped a user live-boot his system and ran an fsck on his '/home' partition; the system had "frozen" and we had hard-reset his computer, which led to a couple of corrupt blocks and inodes.


I recommend copying a rescue disk ISO image and/or network installation image to the boot partition and adding it to the GRUB menu. This way it's possible to boot into a working environment without risking damage to the root partition.

https://wiki.archlinux.org/index.php/Multiboot_USB_drive


That usually doesn't work for remote servers, because you can't access the GRUB menu over SSH, and you want something that works even in those cases where GRUB or your boot record is trashed. So you just want to netboot from a generic rescue ISO. Zero dependencies. Works even when your hard drives are severely damaged and you just need to /bin/dd some raw sectors.


> Works even when your hard drives are severely damaged and you just need to /bin/dd some raw sectors.

That reminds me of the time when some partitioning utility overwrote the first block of my filesystem and then set the filesystem start to the wrong block. I thought I was hosed, but I managed to use hexdump to find a backup superblock, calculate the correct filesystem offset from that, and dd the backup superblock into the primary location. I may be misremembering some details.


> That usually doesn't work for remote servers because you can't access the grub menu over SSH

Depends on your environment. My hosting company offers a remote console you can enable that gives access to a virtual serial console via SSH; with that, you can access your bootloader.


For anyone interested in adding this to their toolkit, I would suggest reading this StackOverflow answer: http://unix.stackexchange.com/a/227318/189858

In short, the answer details how to switch your running system to use an in-memory only root filesystem, without restarting. This allows installing a new OS, resizing the OS disks, etc.

It's a risky operation, but the linked answer covers many pitfalls that you might run into - I recently used it to shrink the root partition on a remote server, very much appreciated the detail.


I've been using this procedure to remotely replace operating systems for years. The most common scenario is a VPS provider that doesn't give you the choice of OS you want.

In fact, I will be using it later today to replace a Debian system with Gentoo, no less. The README in OP's link is spot-on here (the last few paragraphs).


This is exactly what I thought when I saw this post. I really want to use Void Linux on a provider, but no one supports it (except for those who allow custom images like Linode). It'd be great to be able to provision a standard Ubuntu or CentOS box and then replace it with the OS I actually want to run.


Thanks for this! I've been doing wacky stuff like kexec, which was completely unnecessary when I could've just done a pivot_root...


Another nice trick of this family (with reboots, or without with using systemd-nspawn) lies with clever btrfs usage. Long story short:

* use btrfs, and create your main root filesystem as a btrfs subvolume, plus another subvolume for snapshots (both subvolumes of the master btrfs partition)

* before any risky operation (e.g. installing the whole of GNOME and 500 different packages you MIGHT WANT TO REVERT in the future), create a snapshot of the current filesystem (btrfs subvolume snapshot / /.snapshots/yournameofsnap)

* experiment in any way :)

* switch between the old root (the snapshot you created) and the new one with btrfs subvolume set-default

* delete any of them (btrfs subvolume delete)

btrfs copy-on-write allows all of these commands to happen instantly, without (almost) any actual copying. Booting from either volume is also possible without any additional steps, as long as the master btrfs partition is the one booted from UEFI.

https://wiki.archlinux.org/index.php/Btrfs
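The steps above can be sketched as a pair of helpers; the /.snapshots layout is an assumption from the parent comment:

```shell
# Snapshot the live root before a risky operation.
snap_before() {
    btrfs subvolume snapshot / "/.snapshots/$1"
}

# Make a named snapshot the default subvolume for the next boot.
# Relies on 'btrfs subvolume list' printing "ID <id> ... path <path>".
snap_boot_into() {
    id=$(btrfs subvolume list / | awk -v p=".snapshots/$1" '$NF == p {print $2}')
    [ -n "$id" ] && btrfs subvolume set-default "$id" /
}
# e.g. snap_before pre-gnome; <experiment>; snap_boot_into pre-gnome; reboot
```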


Congratulations, you've reinvented ZFS Boot Environments :)


Only to a very high level. btrfs snapshots are writable, but share blocks, so you can boot directly from a snapshot. It's a better system.


This sounds similar to the more cleverly named FreeBSD Depenguinator project, which could be written over the top of a remote Linux server, replacing it with FreeBSD, without a console.

If you have remote console access, a similar thing can be done for OpenBSD by dd(1)'ing a miniroot ramdisk install image.


Seeing if I understand what this is doing: this keeps running the same Linux kernel and kernel modules, but swaps out absolutely everything else up to and including the init system - is that right?


More precisely, what it does is start a new init system on a pseudo-filesystem, start an sshd chrooted to that new system, and let you log in there. From there you can, if you want, umount the original root fs, wipe it, install a new OS, then reboot the system.
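A rough, much-simplified sketch of that sequence; the mount point, port, and $PAYLOAD directory are assumptions, and the real script handles many more details (signal handling, console takeover, process cleanup):

```shell
# Sketch of the takeover idea: a second root in RAM with its own sshd.
takeover_sketch() {
    mount -t tmpfs none /takeover &&       # fresh root that lives in RAM
    cp -a "$PAYLOAD"/. /takeover/ &&       # static busybox, sshd, stub init
    chroot /takeover /bin/sshd -p 2222     # second sshd, independent of old root
    # log in on :2222, kill everything still holding the real root,
    # then umount it, wipe, reinstall, reboot
}
```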


Thanks!


I guess this could be really useful for installing distributions that are not available at some VPS providers.


Another trick to do that is to use the VPS's rescue mode, download qemu, and start it with the remote desktop / VNC console. Then, do the install from there. Poor man's IPMI.


There is a similar script on github somewhere that does some more work and can install arch over debian on a DigitalOcean host. It's beautiful.


Reminds me of Debian Takeover from more than 10 years ago :-)

https://wiki.debian.org/DebianTakeover


https://github.com/elitak/nixos-infect is similar, but doesn't require pivoting the root.


FYI, there's vps2arch that does the same thing with a different approach: https://github.com/drizzt/vps2arch

Edit: it doesn't really do the same thing. vps2arch could be implemented on top of takeover.sh for better reliability.


Somebody correct me if I'm wrong, but this script somehow allows the session to live entirely in RAM. Once the OS is running directly from RAM, the hard drive can be wiped and a new OS installed. The system is then rebooted to run off the hard drive.


I feel like this is, on one hand, an amazing thing to be able to do because it removes the necessity to have IPMI to remotely re-install a system.

On the other hand, this seems like it would be an incredibly easy thing to screw up and potentially leave yourself with a corrupted or unbootable system.


Hence the advice to only do this on machines where you have physical access, or at least can trigger a PXE boot via IPMI.


Which is fundamentally how an installer works already.

Look up the chroot-install method for Debian/Ubuntu as an example of this in a user-accessible form. The process uses debootstrap, but as I understand it, it was originally developed via a chroot shell on a running system, with either partition- or ramdisk-based bootstrapping of the new system.


This script creates another root, then substitutes /sbin/init using `mount --bind`, then reloads /sbin/init, which pivots into the new root with fakeinit.

IMHO, systemd in container mode can be used instead of fakeinit.


This is conceptually similar to the chroot installation method, which has been a documented, if not entirely standard, method on Debian for quite some time.

https://www.debian.org/releases/stable/amd64/apds03.html.en

https://wiki.debian.org/chroot


It's also a common installation method for the more "advanced" distros. The "SAG Trifecta" (Slackware/Arch/Gentoo) uses a lot of chrooting in its installation procedures.

What's interesting about takeover.sh, though, is that it goes a step further and causes the chrooted system to actually replace an existing OS without a reboot and (theoretically) without involving additional boot media.


I wonder if this would help me switch from Ubuntu Desktop to Ubuntu Server on my laptop that has a broken screen.


you can always connect an external screen and go from there


I can connect HDMI and get into Ubuntu, but I cannot get into the BIOS to tell it to boot from USB. If I could figure that out, I think HDMI would activate on boot of the ISO, but I'm not sure. I even tried VGA to a monitor, but that doesn't show me the BIOS either; it doesn't wake up till Ubuntu starts to boot.


Can you physically disconnect the hard drive? Maybe you'll get lucky and USB will be next in the boot order. Contrary to popular belief, you can USUALLY plug the hard drive back in while the system is running from USB without any problems.


Oh, you may be onto something. I'm pretty sure USB is boot option 2, and if not it would be my disc drive, but I can make that work too. It's a Lenovo (maybe Acer) laptop that seems to need some weird tool to open it up. I can try to figure it out though. I just want to use it as an in-home server, so it doesn't need to look pretty.


Keep in mind, you might get a kernel panic when you try to hot-plug the drive back in. It's not guaranteed to work, but it's worth a try.

If that doesn't work (kernel panic or whatever), you can disconnect the drive and connect it via a SATA-to-USB adapter, in which case hot-plugging should definitely work.


Holy shit, that's really cool.


Wish I had known about this hack before today, even if it is just experimental at this point.


removed. No delete on HN comments? interesting


I think you meant to comment on the GitLab postmortem thread.


oops, yeah..


You can delete comments but only up until a certain amount of time has passed, I think. I've been able to do it.


I believe once someone replies you can no longer delete it. And you can edit for up to 2 hours after you've made the comment.


Where does this takeover.sh readme mention anything about missing pg_dump files?


It doesn't... but wiping and reinstalling a system would result in missing pg_dump files...



