I'd love to see this in the bootloader, along with a selection of binaries useful for recovery. Might sound silly but over the years I have had many a remote system get to the bootloader and then no further after an upgrade. Nowadays we've usually got a nicely sized EFI partition, why not stuff it all in there? Gimme a full Linux userspace from the bootloader, it would feel luxurious when I'm up at 3 am trying to recover a broken system halfway across the country.
Or is there already a solution to this that I've been missing? (Yeah, KVM/IPMI/etc, I know, but not all hosters make it easy to get to that.)
On new installs you do stuff everything into the EFI partition and skip the old separate /boot partition.
The better solution is to use a TPM, a unified kernel image, and secure boot, skipping the network unlock.
The whole process goes like this:
1. enable secure boot;
2. generate and install your own secure boot keys (using sbctl);
3. use clevis to enable automatic unlocking of the root fs only when secure boot check passes;
4. generate the unified kernel image (in EFI partition) that is signed by your secure boot key;
5. use efibootmgr to enable booting of said kernel image.
(6.) If your CPU supports it, enable memory encryption in BIOS (to mitigate cold boot attacks).
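A rough sketch of the command-line part of the list above, written as a Python dry run that only prints the commands and executes nothing. The device paths, the UKI location, and the boot entry label are hypothetical placeholders. Enabling secure boot itself happens in the firmware setup, and generating the UKI depends on your distro's tooling, so both are left out. Binding to PCR 7 is one common choice for tying the unlock to the secure boot state, not the only one.

```python
import shlex


def boot_chain_plan(luks_device, esp_disk, esp_part, uki_path, loader_path):
    """Return the commands for the steps above as argv lists (nothing is run)."""
    return [
        # Generate your own secure boot keys, then enroll them
        # (the firmware usually has to be in setup mode for enrollment).
        ["sbctl", "create-keys"],
        ["sbctl", "enroll-keys"],
        # Bind the LUKS volume to the TPM; PCR 7 reflects the secure boot state.
        ["clevis", "luks", "bind", "-d", luks_device, "tpm2", '{"pcr_ids":"7"}'],
        # Sign the already generated UKI and remember it for later re-signing.
        ["sbctl", "sign", "-s", uki_path],
        # Register the UKI as a boot entry in the firmware.
        ["efibootmgr", "--create", "--disk", esp_disk, "--part", str(esp_part),
         "--label", "Linux UKI", "--loader", loader_path],
    ]


# Hypothetical example values; adjust to your own layout.
plan = boot_chain_plan(
    luks_device="/dev/nvme0n1p2",
    esp_disk="/dev/nvme0n1",
    esp_part=1,
    uki_path="/efi/EFI/Linux/linux.efi",
    loader_path="\\EFI\\Linux\\linux.efi",
)
for argv in plan:
    print(shlex.join(argv))
```

Review the printed commands before running any of them by hand; enrolling keys and rebinding LUKS slots are not things you want to get wrong on a remote box.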
The unified kernel image doesn't accept additional kernel parameters, so only parameters that are set during generation of the initramfs are used. Secure boot makes sure no one else has tampered with the boot chain. And the TPM stores the disk key securely.
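If you want to double-check from the running system that secure boot is actually on, one way is to read the SecureBoot EFI variable from efivarfs. A minimal sketch, assuming a Linux system with efivarfs mounted at the usual place; the function names are made up for illustration. The file holds 4 bytes of attributes followed by a single data byte, where 1 means enabled.

```python
from pathlib import Path

# Standard EFI global variable GUID; efivarfs exposes it under this name.
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)


def parse_secure_boot(raw: bytes) -> bool:
    """Decode the efivarfs payload: 4 attribute bytes, then the value byte."""
    if len(raw) < 5:
        raise ValueError("SecureBoot variable is shorter than expected")
    return raw[4] == 1


def secure_boot_enabled(path: Path = SECURE_BOOT_VAR):
    """Return True/False, or None if the state cannot be determined."""
    try:
        return parse_secure_boot(path.read_bytes())
    except (OSError, ValueError):
        # Not booted via UEFI, efivarfs not mounted, or unexpected content.
        return None


print("secure boot:", secure_boot_enabled())
```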
You can still add some additional network level check to make sure that your computer is in your expected location before unlocking.
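As an illustration of what such a check could look like, here is a very small sketch that only tests whether the address the machine received is inside the network you expect. The function name and the example addresses are hypothetical, and an address alone is easy to fake, so treat this as a weak signal to combine with something stronger rather than as the check itself.

```python
import ipaddress


def in_expected_network(local_addr: str, expected_cidr: str) -> bool:
    """True if local_addr falls inside expected_cidr (e.g. the home rack's subnet)."""
    return ipaddress.ip_address(local_addr) in ipaddress.ip_network(expected_cidr)


# Documentation-range addresses used purely as an example.
print(in_expected_network("192.0.2.17", "192.0.2.0/24"))
print(in_expected_network("198.51.100.5", "192.0.2.0/24"))
```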
And you can also include some recovery tools plus dropbear in your initramfs (within the unified kernel image), if you expect that you will have to do some recovery from the other side of the world.
> 3. use clevis to enable automatic unlocking of the root fs only when secure boot check passes;
Can also use systemd-cryptsetup/systemd-cryptenroll for this. I've not used clevis myself, but I'd imagine you have to do somewhat more rolling-your-own compared to the systemd tools.
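For comparison, the systemd route is roughly a single enrollment command. The sketch below just builds and prints it; the device path is a hypothetical placeholder and PCR 7 is an assumed choice matching the secure boot binding discussed above. Nothing is executed.

```python
import shlex


def cryptenroll_argv(luks_device, pcrs=(7,)):
    """Build a systemd-cryptenroll command binding a LUKS2 volume to the TPM."""
    # systemd expects multiple PCR indexes joined with '+', e.g. 0+7.
    pcr_list = "+".join(str(p) for p in pcrs)
    return [
        "systemd-cryptenroll",
        "--tpm2-device=auto",
        f"--tpm2-pcrs={pcr_list}",
        luks_device,
    ]


# Hypothetical device path.
print(shlex.join(cryptenroll_argv("/dev/nvme0n1p2")))
```

When run for real, the tool asks for an existing passphrase of the volume before adding the TPM-backed key slot.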
> The unified kernel image doesn't accept additional kernel parameters, so only parameters that are set during generation of the initramfs are used. Secure boot makes sure no one else has tampered with the boot chain. And the TPM stores the disk key securely.
FYI, multi-profile UKIs are a thing. You can have one UKI with multiple different command lines, e.g. one for regular boot, one for emergency mode, etc.
Sounds like you want ZFSBootMenu.org, which offers remote SSH access with FDE in addition to snapshots in case of update failures or other issues. As long as you don't format the disk itself or wipe the ZFSBootMenu EFI file, you can recover and revert from anything remotely.
The solution is "don't apply untested upgrades to critical servers at 3am" :)
If you must do such upgrades, solutions include hot standby hardware, IPMI, an on-site tech with a screen and keyboard, or moving everything to the cloud.