
Wipe and reinstall a running Linux system via SSH (2017) - 1nvalid
https://github.com/marcan/takeover.sh
======
mirimir
> This script does not have any provisions for exiting out of the new
> environment back into something sane. You will have to reboot when you're
> done. If you get anything wrong, your machine won't boot. Tough luck.

So then "without rebooting" in the title is inaccurate, no?

I haven't studied TFA carefully. But reinstalling running Linux systems via
SSH is pretty routine, no? Using debian-installer with network-console, I
mean.

I occasionally reinstall remote servers with LUKS via SSH. I just log in, build
the installer, and reboot into it. Then I SSH to the installer, and almost
complete it. Just before rebooting, I go to single-user mode, and set up
dropbear in initramfs. Then I reboot.

So OK, that's two reboots, not one. And if it fails, I just reboot using the
control panel. If I've really fucked up, I reinstall and try again.

~~~
cliffy
Maybe kexec-ing into the new kernel would be feasible?

I investigated this many years ago to see if it was, but I found scant
information on compatibility requirements when using kexec to hand over
execution to an arbitrary kernel.

The problem seems really hard though. One issue that stands out to me is that
even if you properly shut down the old kernel, will all system devices be in a
'good enough' state to be reinitialized properly by the new one? Or do some
devices require a reboot for some reason?

~~~
xxpor
kexec is basically the same as a reboot, as far as userspace is concerned.
Your sshd is going to go away.

~~~
cliffy
Userspace sure, but what about underlying hardware? How will device drivers
react if they come up and encounter hardware that is not coming out of an
ACPI-induced reboot? Will some devices and their corresponding drivers be OK?
Or will the drivers panic when they encounter a device in a weird state?

I'm genuinely curious, is all. At the time I was pursuing this I decided it was
going to get too complicated and that I had to live with a reboot.

~~~
etaoins
This is a concern but it usually works for two reasons:

1\. Most firmware is sufficiently broken that Linux drivers are already
hardened against devices being brought up in arbitrary states.

2\. kexec walks the device tree to shut down all devices before starting the
new kernel. This usually gets devices closer to a startup state, or at least a
smaller number of known shutdown states.
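
For illustration, here is a minimal dry-run sketch of that kexec path. It only prints the commands it would run. It assumes the `kexec` binary from kexec-tools and a systemd host; the function name and the kernel and initrd paths are placeholders, not something from this thread.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for staging a new kernel with kexec
# and then switching to it. Nothing is executed; paths are placeholders.
set -eu

plan_kexec() {
    kernel=$1   # hypothetical path to the new kernel image
    initrd=$2   # hypothetical path to the matching initramfs
    # Stage the new kernel, reusing the running kernel's command line.
    echo "kexec -l $kernel --initrd=$initrd --reuse-cmdline"
    # Let systemd stop services and unmount filesystems before jumping
    # into the staged kernel, instead of calling 'kexec -e' directly.
    echo "systemctl kexec"
}

plan_kexec /boot/vmlinuz /boot/initrd.img
```

As noted upthread, everything in userspace (including your sshd) goes away at the second step, so the new kernel has to bring networking and SSH back up by itself.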

------
saagarjha
The Stack Overflow answer linked is horrifyingly delightful:
[https://unix.stackexchange.com/questions/226872/how-to-shrin...](https://unix.stackexchange.com/questions/226872/how-to-shrink-root-filesystem-without-booting-a-livecd/227318#227318)

~~~
mehrdadn
I love how the first comment praises it for being "straightforward". For
reference, this is how you shrink a file system on Windows:

    
    
      diskpart
      select volume C:
      shrink desired=4096
    

I guess it's nice that the Linux version lets you _move_ the partition as well
(or "shrink from behind", so to speak), but damn, "straightforward" is not a
word I would have used to describe it.

~~~
cormacrelf
I love that this line sounds like an old sea shanty:

    
    
        for i in dev proc sys run; do mount --move /oldroot/$i /$i; done

~~~
perlgeek
[https://en.wikipedia.org/wiki/Black_Perl](https://en.wikipedia.org/wiki/Black_Perl)
comes to mind :-)

------
lloeki
Not needed anymore since DO supports custom OS images now, but I found this
script quite interesting: it sets up a "blockplan" and applies it from a
minimal root FS in RAM to entirely replace a Debian OS with an ArchLinux
install, unattended and without rebooting (except as a very last step to
actually boot into the Arch kernel using the replaced GRUB2).

[https://github.com/gh2o/digitalocean-debian-to-arch#how-it-w...](https://github.com/gh2o/digitalocean-debian-to-arch#how-it-works)

I forked it to try and convert Debian installs from ext4 to btrfs.
Unfortunately, while it does work, that's a very bad idea since btrfs-convert
produces a fs that fails to operate properly in the long run (IIRC there is,
or was, some increasing random space usage that cannot be reclaimed, ever).

[https://github.com/lloeki/digitalocean-ext4-to-btrfs/blob/ma...](https://github.com/lloeki/digitalocean-ext4-to-btrfs/blob/master/ext2btrfs)

BTW, the process to convert to btrfs is excellent, only creating btrfs
metadata in unallocated ext3 space, writing the btrfs header at the last
minute, and using subvolumes allowing you to keep the ext3 metadata around as
long as you want to roll back (obviously losing any subsequent modifications)
because barring the header the whole ext filesystem and data is _untouched_.

[https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3](https://btrfs.wiki.kernel.org/index.php/Conversion_from_Ext3)
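
A rough dry-run sketch of that convert-and-roll-back flow, printing the commands rather than running them. It assumes `e2fsck` and `btrfs-convert` from e2fsprogs and btrfs-progs; the function names and the `/dev/sdX1` device are placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for an in-place ext to btrfs conversion
# and for rolling it back. Nothing is executed; the device is a placeholder.
set -eu

plan_convert() {
    dev=$1   # hypothetical block device holding the unmounted ext filesystem
    # btrfs-convert expects a clean, unmounted filesystem.
    echo "e2fsck -f $dev"
    # Builds btrfs metadata in free space and keeps the original
    # filesystem as an image inside the ext2_saved subvolume.
    echo "btrfs-convert $dev"
}

plan_rollback() {
    dev=$1   # same placeholder device as above
    # Restores the original ext filesystem from the saved image; changes
    # made after the conversion are lost.
    echo "btrfs-convert -r $dev"
}

plan_convert /dev/sdX1
plan_rollback /dev/sdX1
```

Deleting the `ext2_saved` subvolume frees the space held by the saved image, and from that point the rollback is no longer possible.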

I suppose Apple did something similar to convert from HFS+ to APFS so swiftly
and so reliably.

------
thom_nic
I actually implemented something like this on an embedded Linux product once:
a busybox initramfs that you could boot into as a "recovery mode," then, along
with dropbear, SSH into a completely in-memory system and re-flash the entire
system image without having to pop out an SD card or connect a cable for DFU.

The recovery mode could even be initiated remotely, so you could re-flash a
device without ever touching it. Of course you have to be careful: if the
re-flash failed you could be SOL :) Apparently I need to go back and improve
it so we can re-flash without rebooting!

These days you can use things like containers (Balena also looks very cool) to
achieve a similar goal in possibly a "safer" way. But the idea of being able
to re-flash the entire system while running it felt sort of like changing the
engine of a car while driving it down the freeway!

~~~
mikepurvis
I've implemented something very similar for upgrading headless Linux mobile
robots. One nice property of this approach is that the in-memory installer
environment and associated scripts can be common between USB-based install
media and a remotely-triggerable kexec type installer.

At first, it surprised me there wasn't more standard tooling out there for
this kind of thing, but as I got more into it, I realised how specific to our
particular needs my solution had become, and I could see how it would be hard
to offer something generic that would be a good fit for a wide range of
use-cases without being super-bloated.

------
vdloo
We upgraded a whole fleet of AWS / DigitalOcean instances without floating IPs
from Ubuntu Precise to Xenial based on this method back in the summer of 2017.
While obviously not having to do crazy stuff like this would be better, it's
nice to know that it is possible if you really need it.

[https://www.reddit.com/r/programming/comments/6o7i8p/dont_ru...](https://www.reddit.com/r/programming/comments/6o7i8p/dont_run_this_on_any_system_you_expect_to_be_up/)

Recently we actually ran into another use case for this in production: we
needed to wipe a lot of servers in our datacenter remotely, and we figured one
of the options would be to install some OS in memory with the relevant wiping
tools, pivot_root to that, unmount all disks and then perform the wipe. In the
end we went a different route and opted for a custom PXE-boot image instead,
which the servers would boot into and which scripted the whole thing.
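
For the curious, the in-memory option we considered looks roughly like the dry-run sketch below. It only prints the steps. The function name and `/tmp/tmproot` are placeholders, the lines starting with `#` stand for manual steps, and the order follows the Stack Exchange answer linked elsewhere in this thread, not our actual tooling.

```shell
#!/bin/sh
# Dry-run sketch: print the steps for moving a running system onto a
# tmpfs root so the disks can be unmounted. Nothing is executed.
set -eu

plan_ram_root() {
    newroot=$1   # hypothetical mount point for the in-memory root
    echo "mkdir -p $newroot"
    echo "mount -t tmpfs tmpfs $newroot"
    echo "# populate $newroot with a minimal userland, sshd and the wiping tools"
    echo "mkdir -p $newroot/oldroot"
    # pivot_root fails on shared mount propagation, so make / private first.
    echo "mount --make-rprivate /"
    echo "pivot_root $newroot $newroot/oldroot"
    # Carry the kernel-backed filesystems over to the new root.
    echo "for i in dev proc sys run; do mount --move /oldroot/\$i /\$i; done"
    echo "# restart sshd from the new root and stop everything still using /oldroot"
    echo "umount -R /oldroot"
}

plan_ram_root /tmp/tmproot
```

Once `/oldroot` is unmounted, nothing holds the disks open and the wipe can run against the raw devices.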

------
stevekemp
The closest I think I came to doing this was to migrate a running Debian
system from being an i386 system to being an amd64-system, in-place:

[https://wiki.debian.org/CrossGrading](https://wiki.debian.org/CrossGrading)

The first step is to switch the kernel from an i386 one to an amd64 one, so
that it can run both i386 and amd64 binaries, but then you essentially
overwrite every package with the version from the new architecture, and hope
like hell it doesn't mess up.
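
From memory, that first kernel step looks something like the dry-run sketch below, which only prints the commands. The function name is made up for illustration, and the wiki page above remains the authoritative procedure, including the per-package replacement that follows.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for putting an amd64 kernel on an
# i386 Debian install. Nothing is executed.
set -eu

plan_amd64_kernel() {
    # Let dpkg and apt consider amd64 packages alongside the i386 ones.
    echo "dpkg --add-architecture amd64"
    echo "apt-get update"
    # An amd64 kernel can still run the existing i386 userland.
    echo "apt-get install linux-image-amd64:amd64"
    echo "reboot"
}

plan_amd64_kernel
```

After the reboot, `dpkg --print-architecture` still reports i386, and it keeps doing so until dpkg itself has been replaced with the amd64 build.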

At the time I had a pair of servers, a mail-host, and a web-host, and I
managed to successfully upgrade both, although it was a little scary. At least
I had console access if things did get horribly screwed up.

~~~
raverbashing
You could first chroot to a minimal stable system, then update everything
without having to rely on hope so much (but yeah, big updates are always
"exciting" and yum/apt/etc _never_ get it completely right).

------
simula67
Previously:

[https://news.ycombinator.com/item?id=18741529](https://news.ycombinator.com/item?id=18741529)

[https://news.ycombinator.com/item?id=13618378](https://news.ycombinator.com/item?id=13618378)

[https://news.ycombinator.com/item?id=13622301](https://news.ycombinator.com/item?id=13622301)

~~~
exegete
The first two links you posted have 1 and 0 comments respectively. The third
one is helpful though with lots of discussion.

------
ratling
This whole thread just makes me glad I refuse to work on bare metal anymore
for anything but the most niche of cases.

AWS everything.

------
raintrees
"you know you want to" Okay, I'm in.

------
Ixiaus
You can also do this with NixOS...

