Install and Boot from RAID Doesn't Work
Open, High, Public, BUG

Description

I attempted to install VyOS to a pre-configured Linux RAID 1. The installer reports that it installed successfully, but on reboot the system finds nothing to boot, even when I manually select the drive, and continues to the next boot item (virtual CD). The system specifications are:
Processor: Intel Xeon-E 2274G - 4 c / 8 t - 4 GHz / 4.9 GHz
Memory: 32GB DDR4
Disks: 2 x 960GB Samsung NVMe SSDs (model unknown)
More info: https://www.ovh.com/world/dedicated-servers/infra/infra-1/

I don't know the motherboard model beyond the fact that it's an ASRock, because this is a leased server in a remote location and I have yet to install an OS on it to check. Given the nature of this issue, I've taken a screen recording of the installation and the reboot.

Notes:

  • This is booting in Legacy Mode, not UEFI mode
  • Attempting to install in UEFI mode fails to even show a boot item in the BIOS
  • This uses GPT
  • There is a pause in the screen recording; this is because I'm using IPMI virtual media, so copying the squashfs file is slow

Details

Difficulty level
Normal (likely a few hours)
Version
1.3-rolling-201912060242
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Unspecified (possibly destroys the router)
Issue type
Bug (incorrect behavior)

Event Timeline

Have a screen recording 😄

Unknown Object (User) added a subscriber: Unknown Object (User). Dec 11 2019, 7:18 AM

Hello @trae32566, do you get the same result when installing 1.2-rolling?

@dmitry yes, I tried 1.2 rolling as well. I have not been able to try 1.2.3 stable due to a lack of access.

This comment was removed by hagbard.
hagbard changed the task status from Open to Confirmed. Dec 11 2019, 4:41 PM

Looks like an issue with the RAID metadata and GRUB. Problem confirmed in VirtualBox; tested with the latest rolling, 1.2.3, and 1.2.4-epa1.

Just for clarity, I did try both 1.0 and 1.2 metadata formats, since md1.1+ is not bootable.
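
For readers unfamiliar with the metadata formats mentioned above, here is a minimal sketch of where each v1.x md superblock sits on a member device. It assumes the layout described in the mdadm documentation (1.0 near the end of the device, 1.1 at the start, 1.2 at 4 KiB from the start). The function name and the 512-byte sector size are illustrative assumptions, not part of the VyOS installer.

```python
SECTOR_SIZE = 512  # assumed logical sector size in bytes


def md_v1_superblock_offset(minor_version: int, device_size_bytes: int) -> int:
    """Return the byte offset of a v1.x md superblock on a member device.

    Hypothetical helper for illustration only.
    """
    if device_size_bytes < 16 * SECTOR_SIZE:
        raise ValueError("device too small to hold an md superblock")
    if minor_version == 0:
        # 1.0: at least 8 KiB from the end, rounded down to a 4 KiB boundary.
        sectors = device_size_bytes // SECTOR_SIZE
        sectors -= 8 * 2
        sectors &= ~(4 * 2 - 1)
        return sectors * SECTOR_SIZE
    if minor_version == 1:
        # 1.1: at the very start of the device.
        return 0
    if minor_version == 2:
        # 1.2: 4 KiB from the start of the device.
        return 4096
    raise ValueError(f"unknown v1 minor version: {minor_version}")


if __name__ == "__main__":
    one_gib = 1024 ** 3
    for minor in (0, 1, 2):
        print(f"1.{minor}: superblock at byte {md_v1_superblock_offset(minor, one_gib)}")
```

With 1.0 the metadata stays out of the way at the end of the member, so the start of the member looks like a plain partition. With 1.1 and 1.2 the metadata (and the data offset that follows it) sits at the front.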

Yeah, I figured. VyOS is being installed into /dev/mdX; I can boot from a live CD and mount it, and everything is in there. But something seems to be going wrong when writing the boot sector, since I would expect to see at least GRUB there. Instead it is empty.
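
The "empty boot sector" observation can be checked from a live CD with a short script. This is a sketch, not the procedure used in this task: it reads the first 512 bytes of a disk and reports whether they are all zero, whether the 0x55AA boot signature is present, and whether the bytes contain the string "GRUB". The string check assumes GRUB's first-stage boot image embeds that text. The function name and the device path in the usage note are hypothetical.

```python
def describe_boot_sector(sector: bytes) -> dict:
    """Summarise the first 512 bytes of a disk (hypothetical helper)."""
    if len(sector) < 512:
        raise ValueError("need at least 512 bytes")
    first = sector[:512]
    return {
        # True when nothing at all was written to the sector.
        "all_zero": not any(first),
        # Classic MBR boot signature in the last two bytes.
        "boot_signature": first[510:512] == b"\x55\xaa",
        # Assumption: GRUB's boot image carries the literal text "GRUB".
        "mentions_grub": b"GRUB" in first,
    }


if __name__ == "__main__":
    import sys

    # Example: sudo python3 check_boot.py /dev/nvme0n1  (path is illustrative)
    with open(sys.argv[1], "rb") as disk:
        print(describe_boot_sector(disk.read(512)))
```

Run it against each RAID member disk rather than /dev/mdX, since in legacy mode the firmware reads the boot sector from the physical disk.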

@trae32566

I pushed a fix earlier which might fix this in UEFI mode. Can you check the rolling release tomorrow (or build it yourself today)? If you are interested, I also have a custom-built ISO with the fix in it.

@kroy I tried just now with vyos-1.3-rolling-202001160217 in UEFI mode (even forced UEFI boot only in the BIOS to make sure) and am still having the same problem.

erkin set Issue type to Bug (incorrect behavior). Aug 31 2021, 6:07 PM
c-po triaged this task as High priority. Oct 18 2021, 6:08 PM
c-po edited projects, added VyOS 1.3 Equuleus (1.3.0-epa3); removed VyOS 1.3 Equuleus.
c-po added a subscriber: c-po.

This can easily be reproduced using ESXi and UEFI bios. Looks like an issue with live-boot.

UnicronNL changed the task status from Confirmed to On hold. Nov 11 2021, 2:39 PM

We have removed the option to reuse a RAID partition. Instead, the installer will remove the old RAID and check for configs and keys on the drives you selected for a RAID 1 installation.

I think this actually inadvertently broke things even more, because now:

  1. Existing RAIDs are detected after selecting the disk to use, meaning you can't actually select a RAID
  2. Existing partitions can now cause the installer to fail (I only had this happen once, and it appears to be a result of existing partition table metadata causing parted to prompt to continue, which the installer does not appear to respond to)

Also, as you can see in the video here, it appears to be trying to install to the disk initially selected and failing, apparently because it tries to reformat after formatting and mounting. I even zeroed the entire array (dd if=/dev/zero of=/dev/md127 bs=8192) and it still happens. On a related note, I also have to run sudo killall progress-indicator to stop the progress indicator from spamming the terminal after the installer fails.

Why is this closed? If you don't want the functionality, fine, but don't leave broken functionality in the installer. At least take it out so you're not confusing your users when it doesn't work.

syncer removed subscribers: kroy, hagbard, Unknown Object (User), Active contributors.
syncer added a subscriber: UnicronNL.