Recently I decided to move my second Linux Mint environment to another disk. One tiny detail left me stranded in the GRUB console.
I keep a second Mint installation on a portable hard drive (actually a very small thing I can plug into a USB port on any hardware). I don't use it for anything serious, just some browsing, Netflix and the like, leaving everything development-wise to my beefier main system.
Not long ago, I got a small laptop that was perfect for this role, which tempted me to let go of the USB thingie altogether and just use that laptop like a normal person. But, not wanting to reconfigure Mint, I decided to simply dd the entire thing to the new disk.
This was uneventful. dd did its job, I rebooted the system, booted it once from USB to check I hadn't broken anything, then rebooted and let it boot from internal storage. This seemed OK, so I unplugged the USB drive and carried on with the important task of watching some The Crown.
A couple of days later, a routine Mint update on this system blew up GRUB and the system stopped booting. Having recently used the USB disk with the original installation on other hardware and updated it without incident, I tried booting from the external drive, was cheerfully welcomed by GRUB, selected the first entry and was greeted with
Loading Linux linux ...
error: file `/boot/vmlinuz-5.4.0-72-generic' not found.
Loading initial ramdisk ...
error: you need to load the kernel first.
"Well, that can't be right", I thought, and sure enough, the external USB drive booted fine on another laptop. So the issue had to be in the system itself. Rebooting from external, I pressed e to check what GRUB was dreaming of, but everything seemed fine: I compared the root UUID with the external drive's on another system and it was a match.
I'm sort of kinda familiar with GRUB's console, so a Ctrl+C later I was staring at a grub> prompt. So, let's start with the basics:
> ls
(proc) (hd0) (hd0,msdos1) (hd1) (hd1,msdos1)
As expected so far: since I had booted from external, hd0 would be my external (working) system and hd1 would be the broken internal system.
> ls (hd0,msdos1)/boot
(a bunch of vmlinuz, initrd, etc, including the vmlinuz supposedly MIA)
> ls (hd1,msdos1)/boot
(another bunch of vmlinuz, initrd, but not the MIA vmlinuz)
Well, this also made sense, as the internal installation broke before the last update (and latest kernel) I got on the external system. So the next step was to boot Linux manually from GRUB:
> set root=(hd0,msdos1)
> set prefix=(hd0,msdos1)/boot/grub
> linux /boot/vmlinuz-5.4.0-72-generic
> initrd /boot/initrd.img-5.4.0-72-generic
> boot
Finally I started seeing Mint boot, but soon enough I got the dreaded
Target filesystem doesn't have /sbin/init. No init found. Try passing init= bootarg.
and I got dropped into the emergency console. This one confused me a bit more. After running fsck -f on the disks and retrying the boot, I still got the same result. And that's more or less when it hit me: I did a very dumb thing when cloning this Mint installation.
To confirm this, from the emergency console I ran
> blkid
/dev/sda1: UUID="xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx" TYPE="ext4"
/dev/sdb1: UUID="xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx" TYPE="ext4"
and obviously the UUIDs matched, as I had totally forgotten to change the UUID of the target disk after running dd. So GRUB was dutifully booting from external but, when searching for the root to load the kernel, kept trying the internal disk, as it was the first block device with the root UUID.
I mounted both disks from the emergency console, checked fstab and yep, I had totally forgotten about changing the UUIDs. I didn't have uuidgen in this console, so I wasn't too keen on messing with GRUB and fstab from there. However, GRUB lets us easily bypass this sort of stupidity from its console by naming the root device explicitly. So, another reboot from external, and now my manual boot sequence was
> set root=(hd0,msdos1)
> set prefix=(hd0,msdos1)/boot/grub
> linux /boot/vmlinuz-5.4.0-72-generic root=/dev/sda1
> initrd /boot/initrd.img-5.4.0-72-generic
> boot
After a while, I was greeted with Mint's login screen. Now to fix this screw-up: open a terminal and run
> sudo tune2fs -U "$(uuidgen)" /dev/sdb1
After the confirmation prompt and a short wait (changing the UUID on a live disk takes a bit), I double-checked
> blkid
/dev/sda1: UUID="xxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxx" TYPE="ext4"
/dev/sdb1: UUID="yyyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyy" TYPE="ext4"
I mounted the internal system disk and replaced the old UUID with the new one in its /etc/fstab.
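The fstab edit is a plain search-and-replace, so it can be scripted. This is a minimal sketch under assumptions: replace_uuid_in_fstab is a hypothetical helper name, the file is assumed to reference the filesystem as UUID=<value>, and the UUIDs are assumed to contain only hex digits and dashes, so they are safe inside a sed pattern.

```shell
# Hypothetical helper: replace OLD_UUID with NEW_UUID in an fstab-style
# file, keeping a .bak copy of the original next to it.
# Usage: replace_uuid_in_fstab OLD_UUID NEW_UUID FILE
replace_uuid_in_fstab() {
    old_uuid=$1
    new_uuid=$2
    fstab=$3
    # Back up first; bail out if the copy fails.
    cp -- "$fstab" "$fstab.bak" || return 1
    # Rewrite from the backup so the original file keeps its permissions.
    sed "s/UUID=$old_uuid/UUID=$new_uuid/g" "$fstab.bak" > "$fstab"
}
```

From a root shell, with the internal disk mounted at /mnt (an assumed mount point), the call would look like replace_uuid_in_fstab OLD NEW /mnt/etc/fstab, with OLD and NEW taken from the blkid output.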
Another reboot from external, I selected the first entry at the GRUB screen, and the system booted fine.
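Changing a filesystem UUID also leaves any previously generated grub.cfg pointing at the old value until update-grub is run again. A minimal sketch for spotting that, assuming the usual generated layout where the UUID is the last word on each "search ... --fs-uuid" line; list_grub_uuids is a hypothetical helper name.

```shell
# Hypothetical helper: list the distinct filesystem UUIDs a grub.cfg-style
# file refers to, taken from "root=UUID=..." kernel arguments and from
# "search ... --fs-uuid ..." lines (UUID assumed to be the last word).
# Usage: list_grub_uuids FILE
list_grub_uuids() {
    cfg=$1
    {
        grep -oE 'root=UUID=[0-9A-Fa-f-]+' "$cfg" | sed 's/^root=UUID=//'
        grep -E '^[[:space:]]*search .*--fs-uuid' "$cfg" | awk '{ print $NF }'
    } | sort -u
}
```

Comparing its output for /mnt/boot/grub/grub.cfg (again assuming the internal disk is mounted at /mnt) with blkid -s UUID -o value /dev/sdb1 shows whether the config still refers to the old UUID.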
Next I tried to get booting from the internal disk working again, and hit a roadblock. GRUB displayed, but as I hadn't run update-grub it would be looking for the external block device, which I had unplugged at this point. So, trying the same routine, I ran into a different issue:
> set root=(hd0,msdos1)
> set prefix=(hd0,msdos1)/boot/grub
> linux /boot/vmlinuz-5.4.0-66-generic root=/dev/sdb1
symbol grub_calloc not found
GRUB was broken on internal. Booting back from external, I tried a fresh GRUB install to the internal disk:
> sudo mount /dev/sdb1 /mnt
> sudo grub-install --root-directory=/mnt /dev/sdb
which, after another reboot, no longer complained about grub_calloc but unfortunately also hung when loading the kernel, without any more info.
However, this one will be fixed next time. For now, my second system is working again and I can drown my sorrows in another episode of The Crown (I know, I'm ridiculously late!)
Update: I actually figured out the kernel hang: I'm missing a kernel module needed to drive the internal disk. Fixing this one will be uneventful, so it doesn't merit a separate blog post. What a downer :)