Page MenuHomeVyOS Platform

System hangs at boot after mounting config
Closed, DuplicatePublicBUG

Description

bug report: 1.3-rolling-202009011736 and later fails to boot. Previous releases (eg. 1.3-rolling-202008301444) boot ok

This router is installed as a KVM VM running on a CentOS 8 host.

Output of serial console during boot follows:

Quoted Text

Press ESC to enter the Grub menu...

GNU GRUB  version 1.98+20100804-14+vyos1+helium1

+--------------------------------------------------------------------------+

VyOS 1.3-rolling-202009011736 (KVM console)
VyOS 1.3-rolling-202009011736 (Serial console)
Lost password change 1.3-rolling-202009011736 (KVM console)
Lost password change 1.3-rolling-202009011736 (Serial console)
VyOS 1.3-rolling-202008301444 (KVM console)
VyOS 1.3-rolling-202008301444 (Serial console)
Lost password change 1.3-rolling-202008301444 (KVM console)
Lost password change 1.3-rolling-202008301444 (Serial console)
VyOS 1.2.0-rolling+201907090337 (KVM console)
VyOS 1.2.0-rolling+201907090337 (Serial console)
Lost password change 1.2.0-rolling+201907090337 (KVM console)v

+--------------------------------------------------------------------------+

    Use the ^ and v keys to select which entry is highlighted.
    Press enter to boot the selected OS, 'e' to edit the commands
    before booting or 'c' for a command-line.

 The highlighted entry will be executed automatically in 0s.
Booting 'VyOS 1.3-rolling-202009011736 (Serial console)'

mount: mounting /dev/vda on /live/persistence/ failed: No such device
umount: /root: target is busy.
mount: /root/dev: /dev is not a block device.

Welcome to Debian GNU/Linux 10 (buster)!

[ OK ] Listening on initctl Compatibility Named Pipe.
[ OK ] Reached target Swap.
[ OK ] Set up automount Arbitrary…s File System Automount Point.
[ OK ] Listening on Syslog Socket.
[ OK ] Listening on Journal Audit Socket.
[ OK ] Listening on udev Control Socket.
[ OK ] Created slice User and Session Slice.
[ OK ] Listening on Journal Socket (/dev/log).
[ OK ] Started Forward Password R…uests to Wall Directory Watch.
[ OK ] Created slice system-serial\x2dgetty.slice.
[ OK ] Started Dispatch Password …ts to Console Directory Watch.
[ OK ] Reached target Paths.
[ OK ] Listening on udev Kernel Socket.
[ OK ] Reached target Local Encrypted Volumes.
[ OK ] Reached target Remote File Systems.
[ OK ] Created slice system-getty.slice.
[ OK ] Reached target Slices.
[ OK ] Listening on Journal Socket.

Mounting Huge Pages File System...
Starting Load Kernel Modules...
Starting Remount Root and Kernel File Systems...
Starting Journal Service...
Starting Create list of re…odes for the current kernel...
Mounting Kernel Debug File System...
Starting udev Coldplug all Devices...
Mounting POSIX Message Queue File System...

[ OK ] Mounted Huge Pages File System.
[ OK ] Started Remount Root and Kernel File Systems.
[ OK ] Started Load Kernel Modules.
[ OK ] Started Create list of req… nodes for the current kernel.
[ OK ] Mounted Kernel Debug File System.
[ OK ] Mounted POSIX Message Queue File System.

Starting Apply Kernel Variables...
Starting Load/Save Random Seed...

[ OK ] Started VyOS configuration daemon.
[ OK ] Started VyOS DNS configuration keeper.

Starting Create System Users...

[ OK ] Started Journal Service.
[ OK ] Started Apply Kernel Variables.
[ OK ] Started Load/Save Random Seed.

Starting Flush Journal to Persistent Storage...

[ OK ] Started Flush Journal to Persistent Storage.
[ OK ] Started Create System Users.

Starting Create Static Device Nodes in /dev...

[ OK ] Started udev Coldplug all Devices.

Starting Helper to synchronize boot up for ifupdown...

[ OK ] Started Create Static Device Nodes in /dev.

Starting udev Kernel Device Manager...

[ OK ] Reached target Local File Systems (Pre).

Mounting /tmp...

[ OK ] Started Helper to synchronize boot up for ifupdown.
[ OK ] Mounted /tmp.
[ OK ] Reached target Local File Systems.

Starting Create Volatile Files and Directories...
Starting Raise network interfaces...

[ OK ] Started udev Kernel Device Manager.
[ OK ] Started Create Volatile Files and Directories.
[ OK ] Reached target System Time Synchronized.
[ OK ] Started Entropy daemon using the HAVEGE algorithm.

Starting live-config conta…t process (late userspace)....
Starting Update UTMP about System Boot/Shutdown...

[ OK ] Started Update UTMP about System Boot/Shutdown.
[ OK ] Reached target System Initialization.
[ OK ] Started Daily rotation of log files.
[ OK ] Started Daily Cleanup of Temporary Directories.
[ OK ] Reached target Timers.
[ OK ] Listening on D-Bus System Message Bus Socket.
[ OK ] Listening on ACPID Listen Socket.
[ OK ] Reached target Sockets.
[ OK ] Found device /dev/ttyS0.
[ OK ] Started Raise network interfaces.
[ OK ] Reached target Network.
[ OK ] Reached target Network is Online.
[ OK ] Started live-config contai…oot process (late userspace)..
[ OK ] Reached target Basic System.
[ OK ] Started D-Bus System Message Bus.

Starting Login Service...

[ OK ] Started VyOS Router.

Starting OpenBSD Secure Shell session cleanup...
Starting LSB: powerful, ef…alable Mail Transport Agent...
Starting /etc/rc.local Compatibility...
Starting Permit User Sessions...

[ OK ] Started ACPI event daemon.

Starting LSB: AWS EC2 inst…tch and load ssh public key...
Starting LSB: Brings up/down network automatically...
Starting Atop process accounting daemon...
Starting Deferred execution scheduler...
Starting System Logging Service...

[ OK ] Started Regular background program processing daemon.

Starting Restore /etc/reso… the ppp link was shut down...

[ OK ] Started OpenBSD Secure Shell session cleanup.
[ OK ] Started LSB: powerful, eff…scalable Mail Transport Agent.
[ OK ] Started /etc/rc.local Compatibility.
[ OK ] Started Permit User Sessions.
[ OK ] Started LSB: Brings up/down network automatically.
[ OK ] Started Atop process accounting daemon.
[ OK ] Started Deferred execution scheduler.
[ OK ] Started Restore /etc/resol…re the ppp link was shut down.
[ 40.788514] vyos-router[712]: Waiting for NICs to settle down: settled in 1sec..
[ 46.450987] vyos-router[712]: Started watchfrr.
[ 46.530202] vyos-router[712]: Mounting VyOS Config...done.

Quoted Text

After the above the router simply hangs. This router's only purpose is to provide tunneled IPv6 connectivity (courtesy Hurricane Electric) to a lab environment. Config follows:

Quoted Text

firewall {

all-ping enable
broadcast-ping disable
config-trap disable
group {
    address-group admin_addresses {
        address **.***.***.***-**.***.***.***.***
        address **.***.***.***
        address **.***.*.**
        description "admin nets"
    }
    ipv6-network-group admin_nets_IPv6 {
        description "admin nets"
        network ****:***:****:**::/64
        network ****:****:****:****::/64
    }
    network-group admin_nets {
        description "admin nets"
        network **.**.**.**/28
        network ***.***.***.*/22
        network ***.***.***.*/18
    }
}
ipv6-name WANv6_IN {
    default-action drop
    enable-default-log
    rule 10 {
        action accept
        state {
            established enable
            related enable
        }
    }
    rule 20 {
        action drop
        description "Drop invalid state"
        state {
            invalid enable
        }
    }
    rule 30 {
        action accept
        description "Allow IPv6 icmp"
        protocol icmpv6
    }
}
ipv6-name WANv6_LOCAL {
    default-action drop
    description "WAN inbound traffic to the router"
    enable-default-log
    rule 10 {
        action accept
        description "Allow established/related sessions"
        state {
            established enable
            related enable
        }
    }
    rule 15 {
        action accept
        protocol all
        source {
            group {
                network-group admin_nets_IPv6
            }
        }
    }
    rule 20 {
        action drop
        description "Drop invalid state"
        state {
            invalid enable
        }
    }
    rule 30 {
        action accept
        description "Allow IPv6 icmp"
        protocol ipv6-icmp
    }
    rule 40 {
        action accept
        description "allow dhcpv6"
        destination {
            port 546
        }
        protocol udp
        source {
            port 547
        }
    }
}
ipv6-receive-redirects disable
ipv6-src-route disable
ip-src-route disable
log-martians enable
name WAN_IN {
    default-action drop
    description "WAN to internal"
    rule 20 {
        action accept
        description "Allow established/related"
        state {
            established enable
            related enable
        }
    }
    rule 30 {
        action drop
        description "Drop invalid state"
        state {
            invalid enable
        }
    }
}
name WAN_LOCAL {
    default-action drop
    description "WAN to router"
    rule 5 {
        action accept
        description "HE.NET IPv6 Tunnel source"
        protocol all
        source {
            address **.**.***.**
        }
    }
    rule 10 {
        action accept
        description "Allow established/related"
        state {
            established enable
            related enable
        }
    }
    rule 15 {
        action accept
        description "allow admin nets"
        protocol all
        source {
            group {
                address-group admin_addresses
            }
        }
    }
    rule 20 {
        action accept
        description "allow admin nets"
        protocol all
        source {
            group {
                network-group admin_nets
            }
        }
    }
    rule 30 {
        action drop
        description "Drop invalid state"
        state {
            invalid enable
        }
    }
}
receive-redirects disable
send-redirects enable
source-validation disable
syn-cookies enable
twa-hazards-protection disable

}
interfaces {

ethernet eth0 {
    address **.***.***.***/24
    duplex auto
    firewall {
        local {
            ipv6-name WANv6_LOCAL
            name WAN_LOCAL
        }
    }
    hw-id **:**:**:**:**:**
    ipv6 {
        address {
            eui64 ****:***:****:**::/64
        }
        dup-addr-detect-transmits 1
    }
    smp-affinity auto
    speed auto
}
loopback lo {
}
tunnel tun0 {
    address ****:***:****:**::2/64
    description "HE.NET IPv6 Tunnel"
    encapsulation sit
    local-ip **.***.***.***
    multicast disable
    remote-ip **.**.***.**
}

}
protocols {

static {
    interface-route6 ::/0 {
        next-hop-interface tun0 {
        }
    }
    route 0.0.0.0/0 {
        next-hop **.***.***.* {
        }
    }
}

}
service {

lldp {
    interface all {
    }
    legacy-protocols {
        cdp
    }
}
router-advert {
    interface eth0 {
        hop-limit 64
        interval {
            max 600
        }
        name-server 2001:470:20::2
        name-server 2001:4860:4860::8844
        prefix ****:***:****:**::/64 {
            valid-lifetime 2592000
        }
        reachable-time 0
        retrans-timer 0
    }
}
ssh {
    disable-host-validation
    port 22
}

}
system {

config-management {
    commit-revisions 20
}
console {
    device ttyS0 {
        speed 9600
    }
}
domain-name *********
host-name rtr1
login {
    user vyos {
        authentication {
            encrypted-password ****************
            plaintext-password ****************
        }
    }
}
name-server 74.82.42.42
name-server 2001:470:20::2
ntp {
    server 2.pool.ntp.org {
    }
    server 2.rhel.pool.ntp.org {
    }
    server 2.us.pool.ntp.org {
    }
    server ********* {
    }
    server stratum2-01.sfo01.publicntp.org {
    }
}
syslog {
    global {
        facility all {
            level notice
        }
        facility protocols {
            level debug
        }
    }
}
time-zone US/Hawaii

}

Quoted Text

Please let me know if you need additional info.

Details

Difficulty level
Unknown (require assessment)
Version
1.3-rolling-202009011736
Why the issue appeared?
Will be filled on close
Is it a breaking change?
Behavior change
Issue type
Bug (incorrect behavior)

Event Timeline

@querubin please try booting with the vyos-configd service masked: add the kernel boot parameter:

systemd.mask=vyos-configd.service

If that is successful, any errors from the previous boot related to that service will be in /var/log/messages (or messages.1 if rotated from previous boot) on lines containing 'vyos-configd'.

If you could share that information, it would be very useful; thanks.

I don't see problems with clean installed latest rolling

Tested configuration.

set firewall all-ping 'enable'
set firewall broadcast-ping 'disable'
set firewall config-trap 'disable'
set firewall group address-group admin_addresses address '192.168.122.1'
set firewall group address-group admin_addresses description 'admin nets'
set firewall group ipv6-network-group admin_nets_IPv6 description 'admin nets'
set firewall group ipv6-network-group admin_nets_IPv6 network '2001:DB8::/64'
set firewall group network-group admin_nets description 'admin nets'
set firewall ipv6-name WANv6_IN default-action 'drop'
set firewall ipv6-name WANv6_IN enable-default-log
set firewall ipv6-name WANv6_IN rule 10 action 'accept'
set firewall ipv6-name WANv6_IN rule 10 state established 'enable'
set firewall ipv6-name WANv6_IN rule 10 state related 'enable'
set firewall ipv6-name WANv6_IN rule 20 action 'drop'
set firewall ipv6-name WANv6_IN rule 20 description 'Drop invalid state'
set firewall ipv6-name WANv6_IN rule 20 state invalid 'enable'
set firewall ipv6-name WANv6_IN rule 30 action 'accept'
set firewall ipv6-name WANv6_IN rule 30 description 'Allow IPv6 icmp'
set firewall ipv6-name WANv6_IN rule 30 protocol 'icmpv6'
set firewall ipv6-name WANv6_LOCAL default-action 'drop'
set firewall ipv6-name WANv6_LOCAL description 'WAN inbound traffic to the router'
set firewall ipv6-name WANv6_LOCAL enable-default-log
set firewall ipv6-name WANv6_LOCAL rule 10 action 'accept'
set firewall ipv6-name WANv6_LOCAL rule 10 description 'Allow established/related sessions'
set firewall ipv6-name WANv6_LOCAL rule 10 state established 'enable'
set firewall ipv6-name WANv6_LOCAL rule 10 state related 'enable'
set firewall ipv6-name WANv6_LOCAL rule 15 action 'accept'
set firewall ipv6-name WANv6_LOCAL rule 15 protocol 'all'
set firewall ipv6-name WANv6_LOCAL rule 15 source group network-group 'admin_nets_IPv6'
set firewall ipv6-name WANv6_LOCAL rule 20 action 'drop'
set firewall ipv6-name WANv6_LOCAL rule 20 description 'Drop invalid state'
set firewall ipv6-name WANv6_LOCAL rule 20 state invalid 'enable'
set firewall ipv6-name WANv6_LOCAL rule 30 action 'accept'
set firewall ipv6-name WANv6_LOCAL rule 30 description 'Allow IPv6 icmp'
set firewall ipv6-name WANv6_LOCAL rule 30 protocol 'ipv6-icmp'
set firewall ipv6-name WANv6_LOCAL rule 40 action 'accept'
set firewall ipv6-name WANv6_LOCAL rule 40 description 'allow dhcpv6'
set firewall ipv6-name WANv6_LOCAL rule 40 destination port '546'
set firewall ipv6-name WANv6_LOCAL rule 40 protocol 'udp'
set firewall ipv6-name WANv6_LOCAL rule 40 source port '547'
set firewall ipv6-receive-redirects 'disable'
set firewall ipv6-src-route 'disable'
set firewall ip-src-route 'disable'
set firewall log-martians 'enable'
set firewall name WAN_IN default-action 'drop'
set firewall name WAN_IN description 'WAN to internal'
set firewall name WAN_IN rule 20 action 'accept'
set firewall name WAN_IN rule 20 description 'Allow established/related'
set firewall name WAN_IN rule 20 state established 'enable'
set firewall name WAN_IN rule 20 state related 'enable'
set firewall name WAN_IN rule 30 action 'drop'
set firewall name WAN_IN rule 30 description 'Drop invalid state'
set firewall name WAN_IN rule 30 state invalid 'enable'
set firewall name WAN_LOCAL default-action 'drop'
set firewall name WAN_LOCAL description 'WAN to router'
set firewall name WAN_LOCAL rule 5 action 'accept'
set firewall name WAN_LOCAL rule 5 description 'HE.NET IPv6 Tunnel source'
set firewall name WAN_LOCAL rule 5 protocol 'all'
set firewall name WAN_LOCAL rule 5 source address '100.64.0.2'
set firewall name WAN_LOCAL rule 10 action 'accept'
set firewall name WAN_LOCAL rule 10 description 'Allow established/related'
set firewall name WAN_LOCAL rule 10 state established 'enable'
set firewall name WAN_LOCAL rule 10 state related 'enable'
set firewall name WAN_LOCAL rule 15 action 'accept'
set firewall name WAN_LOCAL rule 15 description 'allow admin nets'
set firewall name WAN_LOCAL rule 15 protocol 'all'
set firewall name WAN_LOCAL rule 15 source group address-group 'admin_addresses'
set firewall name WAN_LOCAL rule 20 action 'accept'
set firewall name WAN_LOCAL rule 20 description 'allow admin nets'
set firewall name WAN_LOCAL rule 20 protocol 'all'
set firewall name WAN_LOCAL rule 20 source group network-group 'admin_nets'
set firewall name WAN_LOCAL rule 30 action 'drop'
set firewall name WAN_LOCAL rule 30 description 'Drop invalid state'
set firewall name WAN_LOCAL rule 30 state invalid 'enable'
set firewall receive-redirects 'disable'
set firewall send-redirects 'enable'
set firewall source-validation 'disable'
set firewall syn-cookies 'enable'
set firewall twa-hazards-protection 'disable'
set interfaces ethernet eth0 address '192.168.122.11/24'
set interfaces ethernet eth0 duplex 'auto'
set interfaces ethernet eth0 firewall local ipv6-name 'WANv6_LOCAL'
set interfaces ethernet eth0 firewall local name 'WAN_LOCAL'
set interfaces ethernet eth0 ipv6 address eui64 '2001:DB8::/64'
set interfaces ethernet eth0 ipv6 dup-addr-detect-transmits '1'
set interfaces ethernet eth1 address '100.64.0.1/24'
set interfaces loopback lo
set interfaces tunnel tun0 address '2001:DB8::2/64'
set interfaces tunnel tun0 description 'HE.NET IPv6 Tunnel'
set interfaces tunnel tun0 encapsulation 'sit'
set interfaces tunnel tun0 local-ip '100.64.0.1'
set interfaces tunnel tun0 multicast 'disable'
set interfaces tunnel tun0 remote-ip '100.64.0.2'
set protocols static interface-route6 ::/0 next-hop-interface tun0
set protocols static route 0.0.0.0/0 next-hop 192.168.122.1
set service lldp interface all
set service lldp legacy-protocols cdp
set service router-advert interface eth0 hop-limit '64'
set service router-advert interface eth0 interval max '600'
set service router-advert interface eth0 name-server '2001:DB8::2'
set service router-advert interface eth0 name-server '2001:DB80:4860::8844'
set service router-advert interface eth0 prefix 2001:DB8::/64 valid-lifetime '2592000'
set service router-advert interface eth0 reachable-time '0'
set service router-advert interface eth0 retrans-timer '0'
set service ssh disable-host-validation
set service ssh port '22'
set system config-management commit-revisions '20'
set system console device ttyS0 speed '9600'
set system domain-name 'foo'
set system host-name 'rtr1'
set system name-server '1.1.1.1'
set system name-server '74.82.42.42'
set system name-server '2001:470:20::2'
set system ntp server 0.pool.ntp.org
set system ntp server 1.pool.ntp.org
set system ntp server 2.pool.ntp.org
set system ntp server 2.rhel.pool.ntp.org
set system ntp server 2.us.pool.ntp.org
set system ntp server stratum2-01.sfo01.publicntp.org
set system syslog global facility all level 'notice'
set system syslog global facility protocols level 'debug'
set system time-zone 'US/Hawaii'

Version

vyos@rtr1:~$ show version 

Version:          VyOS 1.3-rolling-202009080118

Adding that kernel option results in a successful boot. Below is
/var/log/messages from an unsuccessful boot followed by a successful boot
using the kernel option.

Sep 9 23:28:09 rtr1 ntpd[1841]: ntpd exiting on signal 15 (Terminated)
Sep 9 23:28:50 rtr1 systemd-sysctl[495]: Couldn't write '1' to 'net/netfilter/nf_conntrack_helper', ignoring: No such file or directory
Sep 9 23:28:50 rtr1 systemd-tmpfiles[501]: [/usr/lib/tmpfiles.d/heartbeat.conf:3] Line references path below legacy directory /var/run/, updating /var/run/heartbeat → /run/heartbeat; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[501]: [/usr/lib/tmpfiles.d/heartbeat.conf:4] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/ccm → /run/heartbeat/ccm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[501]: [/usr/lib/tmpfiles.d/heartbeat.conf:5] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/crm → /run/heartbeat/crm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[501]: [/usr/lib/tmpfiles.d/heartbeat.conf:6] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/dopd → /run/heartbeat/dopd; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:3] Line references path below legacy directory /var/run/, updating /var/run/heartbeat → /run/heartbeat; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:4] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/ccm → /run/heartbeat/ccm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:5] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/crm → /run/heartbeat/crm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:6] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/dopd → /run/heartbeat/dopd; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:28:50 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/resource-agents.conf:1] Duplicate line for path "/run/resource-agents", ignoring.
Sep 9 23:28:50 rtr1 systemd-udevd[519]: Process '/etc/console-setup/cached_setup_font.sh' failed with exit code 1.
Sep 9 23:28:50 rtr1 kernel: [ 0.000000] Linux version 5.8.5-amd64-vyos (vyos_bld@4f5a23bf28b9) (gcc (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Sun Sep 6 15:53:44 UTC 2020
Sep 9 23:28:50 rtr1 kernel: [ 0.116402] Kernel command line: BOOT_IMAGE=/boot/1.3-rolling-202009100118/vmlinuz boot=live quiet rootdelay=5 noautologin net.ifnames=0 biosdevname=0 vyos-union=/boot/1.3-rolling-202009100118 console=tty0 console=ttyS0,9600
Sep 9 23:28:50 rtr1 kernel: [ 0.120225] random: get_random_u64 called from kmem_cache_create+0x39/0x4e0 with crng_init=0
Sep 9 23:28:50 rtr1 kernel: [ 0.349258] audit: type=2000 audit(1599730097.535:1): state=initialized audit_enabled=0 res=1
Sep 9 23:28:50 rtr1 kernel: [ 0.785024] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
Sep 9 23:28:50 rtr1 kernel: [ 1.287833] Initialise system trusted keyrings
Sep 9 23:28:50 rtr1 kernel: [ 1.289637] Key type asymmetric registered
Sep 9 23:28:50 rtr1 kernel: [ 1.289648] Asymmetric key parser 'x509' registered
Sep 9 23:28:50 rtr1 kernel: [ 1.335073] Loading compiled-in X.509 certificates
Sep 9 23:28:50 rtr1 kernel: [ 1.820978] SCSI subsystem initialized
Sep 9 23:28:50 rtr1 kernel: [ 2.080428] virtio_blk virtio0: [vda] 4194304 512-byte logical blocks (2.15 GB/2.00 GiB)
Sep 9 23:28:50 rtr1 kernel: [ 7.527595] process '/usr/bin/fstype' started with executable stack
Sep 9 23:28:50 rtr1 kernel: [ 8.284571] random: fast init done
Sep 9 23:28:50 rtr1 kernel: [ 14.145674] EXT4-fs (vda1): warning: maximal mount count reached, running e2fsck is recommended
Sep 9 23:28:50 rtr1 kernel: [ 16.489401] random: systemd: uninitialized urandom read (16 bytes read)
Sep 9 23:28:50 rtr1 kernel: [ 16.489446] random: systemd: uninitialized urandom read (16 bytes read)
Sep 9 23:28:50 rtr1 kernel: [ 16.548556] random: systemd: uninitialized urandom read (16 bytes read)
Sep 9 23:28:50 rtr1 kernel: [ 16.980997] Bridge firewalling registered
Sep 9 23:28:50 rtr1 kernel: [ 21.472061] random: crng init done
Sep 9 23:28:50 rtr1 kernel: [ 21.472065] random: 7 urandom warning(s) missed due to ratelimiting
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop bfdd: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop ldpd: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop pimd: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop ospf6d: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop ospfd: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop staticd: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop ripngd: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop ripd: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop zebra: pid file not found
Sep 9 23:28:52 rtr1 watchfrr.sh: Cannot stop bgpd: pid file not found
Sep 9 23:29:00 rtr1 modprobe: FATAL: Module nf_nat_proto_gre not found in directory /lib/modules/5.8.5-amd64-vyos
Sep 9 23:29:10 rtr1 Report Time: 2020-09-09 23:29:10
Sep 9 23:29:10 rtr1 Image Version: VyOS 1.3-rolling-202009100118
Sep 9 23:29:10 rtr1 Release Train: equuleus
Sep 9 23:29:10 rtr1 Built by: [email protected]
Sep 9 23:29:10 rtr1 Built on: Thu 10 Sep 2020 01:18 UTC
Sep 9 23:29:10 rtr1 Build UUID: 5fcaca09-7cc8-47d4-98a6-9855824a9be6
Sep 9 23:29:10 rtr1 Build Commit ID: 5fbfd67b1a767b
Sep 9 23:29:10 rtr1 Architecture: x86_64
Sep 9 23:29:10 rtr1 Boot via: installed image
Sep 9 23:29:10 rtr1 System type: KVM guest
Sep 9 23:29:10 rtr1 Hardware vendor: Red Hat
Sep 9 23:29:10 rtr1 Hardware model: KVM
Sep 9 23:29:10 rtr1 Hardware S/N:
Sep 9 23:29:10 rtr1 Hardware UUID: 19049165-3eee-d9a7-5f70-b5c0bdb37149
Sep 9 23:29:10 rtr1 Traceback (most recent call last):
Sep 9 23:29:10 rtr1 File "/usr/lib/python3/dist-packages/vyos/configsource.py", line 315, in
init__
Sep 9 23:29:10 rtr1 self._running_config = ConfigTree(running_config_text) if running_config_text else None
Sep 9 23:29:10 rtr1 File "/usr/lib/python3/dist-packages/vyos/configtree.py", line 134, in init
Sep 9 23:29:10 rtr1 raise ValueError("Failed to parse config: {0}".format(msg))
Sep 9 23:29:10 rtr1 ValueError: Failed to parse config: Syntax error on line 1, character 1: Invalid syntax.
Sep 9 23:29:10 rtr1 During handling of the above exception, another exception occurred:
Sep 9 23:29:10 rtr1 Traceback (most recent call last):
Sep 9 23:29:10 rtr1 File "/usr/libexec/vyos/services/vyos-configd", line 217, in <module>
Sep 9 23:29:10 rtr1 config = initialization(socket)
Sep 9 23:29:10 rtr1 File "/usr/libexec/vyos/services/vyos-configd", line 137, in initialization
Sep 9 23:29:10 rtr1 session_config_text=session_string)
Sep 9 23:29:10 rtr1 File "/usr/lib/python3/dist-packages/vyos/configsource.py", line 318, in init
Sep 9 23:29:10 rtr1 raise ConfigSourceError(f"Init error in {type(self)}")
Sep 9 23:29:10 rtr1 vyos.configsource.ConfigSourceError: Init error in <class 'vyos.configsource.ConfigSourceString'>
Sep 9 23:29:10 rtr1 systemd[1]: vyos-configd.service: Main process exited, code=exited, status=1/FAILURE
Sep 9 23:29:10 rtr1 systemd[1]: vyos-configd.service: Failed with result 'exit-code'.
Sep 9 23:30:20 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:30:20 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:30:20 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:30:20 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:30:20 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:30:20 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:30:20 rtr1 systemd[1]: [email protected]: Found left-over process 1208 (sleep) in control group while starting unit. Ignoring.
Sep 9 23:30:20 rtr1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 9 23:31:50 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:31:50 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:31:50 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:31:50 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:31:50 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:31:50 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:31:50 rtr1 systemd[1]: [email protected]: Found left-over process 1389 (sleep) in control group while starting unit. Ignoring.
Sep 9 23:31:50 rtr1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 9 23:31:51 rtr1 systemd[1]: [email protected]: Found left-over process 1390 (sleep) in control group while starting unit. Ignoring.
Sep 9 23:31:51 rtr1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Found left-over process 1571 (sleep) in control group while starting unit. Ignoring.
Sep 9 23:33:21 rtr1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 9 23:33:21 rtr1 systemd[1]: [email protected]: Found left-over process 1572 (sleep) in control group while starting unit. Ignoring.
Sep 9 23:33:21 rtr1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Start-pre operation timed out. Terminating.
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Failed with result 'timeout'.
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Found left-over process 1753 (sleep) in control group while starting unit. Ignoring.
Sep 9 23:34:51 rtr1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 9 23:34:51 rtr1 systemd[1]: [email protected]: Found left-over process 1754 (sleep) in control group while starting unit. Ignoring.
Sep 9 23:34:51 rtr1 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Sep 9 23:35:23 rtr1 systemd-logind[627]: System is powering down.
Sep 9 23:36:41 rtr1 systemd-sysctl[500]: Couldn't write '1' to 'net/netfilter/nf_conntrack_helper', ignoring: No such file or directory
Sep 9 23:36:41 rtr1 systemd-tmpfiles[502]: [/usr/lib/tmpfiles.d/heartbeat.conf:3] Line references path below legacy directory /var/run/, updating /var/run/heartbeat → /run/heartbeat; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[502]: [/usr/lib/tmpfiles.d/heartbeat.conf:4] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/ccm → /run/heartbeat/ccm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[502]: [/usr/lib/tmpfiles.d/heartbeat.conf:5] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/crm → /run/heartbeat/crm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[502]: [/usr/lib/tmpfiles.d/heartbeat.conf:6] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/dopd → /run/heartbeat/dopd; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:3] Line references path below legacy directory /var/run/, updating /var/run/heartbeat → /run/heartbeat; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:4] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/ccm → /run/heartbeat/ccm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:5] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/crm → /run/heartbeat/crm; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/heartbeat.conf:6] Line references path below legacy directory /var/run/, updating /var/run/heartbeat/dopd → /run/heartbeat/dopd; please update the tmpfiles.d/ drop-in file accordingly.
Sep 9 23:36:41 rtr1 systemd-tmpfiles[506]: [/usr/lib/tmpfiles.d/resource-agents.conf:1] Duplicate line for path "/run/resource-agents", ignoring.
Sep 9 23:36:41 rtr1 systemd-udevd[514]: Process '/etc/console-setup/cached_setup_font.sh' failed with exit code 1.
Sep 9 23:36:41 rtr1 kernel: [ 0.000000] Linux version 5.8.5-amd64-vyos (vyos_bld@4f5a23bf28b9) (gcc (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Sun Sep 6 15:53:44 UTC 2020
Sep 9 23:36:41 rtr1 kernel: [ 0.118111] Kernel command line: BOOT_IMAGE=/boot/1.3-rolling-202009100118/vmlinuz boot=live quiet rootdelay=5 noautologin net.ifnames=0 biosdevname=0 vyos-union=/boot/1.3-rolling-202009100118 console=tty0 console=ttyS0,9600 systemd.mask=vyos-configd.service
Sep 9 23:36:41 rtr1 kernel: [ 0.121963] random: get_random_u64 called from __kmem_cache_create+0x39/0x4e0 with crng_init=0
Sep 9 23:36:41 rtr1 kernel: [ 0.352719] audit: type=2000 audit(1599730571.195:1): state=initialized audit_enabled=0 res=1
Sep 9 23:36:41 rtr1 kernel: [ 0.787659] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
Sep 9 23:36:41 rtr1 kernel: [ 1.288813] Initialise system trusted keyrings
Sep 9 23:36:41 rtr1 kernel: [ 1.290553] Key type asymmetric registered
Sep 9 23:36:41 rtr1 kernel: [ 1.290564] Asymmetric key parser 'x509' registered
Sep 9 23:36:41 rtr1 kernel: [ 1.336211] Loading compiled-in X.509 certificates
Sep 9 23:36:41 rtr1 kernel: [ 1.831778] SCSI subsystem initialized
Sep 9 23:36:41 rtr1 kernel: [ 2.087306] virtio_blk virtio0: [vda] 4194304 512-byte logical blocks (2.15 GB/2.00 GiB)
Sep 9 23:36:41 rtr1 kernel: [ 7.530062] process '/usr/bin/fstype' started with executable stack
Sep 9 23:36:41 rtr1 kernel: [ 8.286268] random: fast init done
Sep 9 23:36:41 rtr1 kernel: [ 14.153789] EXT4-fs (vda1): warning: maximal mount count reached, running e2fsck is recommended
Sep 9 23:36:41 rtr1 kernel: [ 16.486178] random: systemd: uninitialized urandom read (16 bytes read)
Sep 9 23:36:41 rtr1 kernel: [ 16.486223] random: systemd: uninitialized urandom read (16 bytes read)
Sep 9 23:36:41 rtr1 kernel: [ 16.543938] random: systemd: uninitialized urandom read (16 bytes read)
Sep 9 23:36:41 rtr1 kernel: [ 17.081202] Bridge firewalling registered
Sep 9 23:36:41 rtr1 kernel: [ 21.250567] random: crng init done
Sep 9 23:36:41 rtr1 kernel: [ 21.250572] random: 7 urandom warning(s) missed due to ratelimiting
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop bfdd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop ldpd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop staticd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop ospf6d: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop pimd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop ripd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop ripngd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop ospfd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop bgpd: pid file not found
Sep 9 23:36:44 rtr1 watchfrr.sh: Cannot stop zebra: pid file not found
Sep 9 23:36:49 rtr1 watchfrr.sh: Cannot stop bfdd: pid file not found
Sep 9 23:36:49 rtr1 watchfrr.sh: Failed to start bfdd!
Sep 9 23:36:50 rtr1 modprobe: FATAL: Module nf_nat_proto_gre not found in directory /lib/modules/5.8.5-amd64-vyos
Sep 9 23:37:02 rtr1 systemd[1]: [email protected]: Control process exited, code=killed, status=15/TERM
Sep 9 23:37:43 rtr1 ntpd[1838]: ntpd [email protected] (1): Starting
Sep 9 23:37:53 rtr1 systemd[1]: /lib/systemd/system/radvd.service:11: PIDFile= references path below legacy directory /var/run/, updating /var/run/radvd.pid → /run/radvd.pid; please update the unit file accordingly.
Sep 9 23:37:56 rtr1 ntpd[1844]: giving up resolving host stratum2-01.sfo01.publicntp.org: No address associated with hostname (-5)
Sep 9 23:37:58 rtr1 commit: Successful change to active configuration by user root on unknown
Sep 9 23:37:59 rtr1 systemd[1]: /lib/systemd/system/radvd.service:11: PIDFile= references path below legacy directory /var/run/, updating /var/run/radvd.pid → /run/radvd.pid; please update the unit file accordingly.

I attempted clean installations of VMs using both
vyos-1.3-rolling-202008301444-amd64.iso and
vyos-1.3-rolling-202009011736-amd64.iso. The first image boots up and
allows configuration. However, the latter hangs and never reaches a vyos
command line prompt. The last lines on the boot console are:

[ 6.358142] systemd[1]: Inserted module 'autofs4'
[ 6.390366] random: crng init done
[ 6.403494] systemd[1]: systemd 241 running in system mode. (+PAM
+AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP
+GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN
-PCRE2 default-hierarchy=hybrid)
[ 6.413395] systemd[1]: Detected virtualization kvm.
[ 6.415873] systemd[1]: Detected architecture x86-64.
[ 6.435045] systemd[1]: Set hostname to <localhost.localdomain>.
[ 6.438876] systemd[1]: Initializing machine ID from KVM UUID.
[ 7.438653] systemd[1]: usr-lib-live-mount-medium.mount: Unit is bound
to inactive unit dev-sr0.device. Stopping, too.
[ 7.504517] systemd[1]: Listening on initctl Compatibility Named Pipe.
[ 7.510565] systemd[1]: Set up automount Arbitrary Executable File
Formats File System Automount Point.
[ 7.521718] systemd[1]: Created slice system-serial\x2dgetty.slice.
[ 7.527163] systemd[1]: Created slice User and Session Slice.
[ 7.571020] systemd[312]: usr-lib-live-mount-medium.mount: Failed to
connect stdout to the journal socket, ignoring: No such file or directory
[ 7.971723] bridge: filtering via arp/ip/ip6tables is no longer
available by default. Update your scripts to load br_netfilter if you
need this.
[ 7.991012] Bridge firewalling registered
[ 8.013765] mpls_gso: MPLS GSO support

Perhaps it's some kind of hosting compatibility issue? The host is
running Centos8 with the latest patches. The VM is configured with the
512MB of memory and 2GB of disk storage. No other special customization.

@querubin Thank you for the detailed results --- firstly, these issues may be overdetermined due to several updates earlier this month; one notable issue is that we had moved to a 5.x series kernel, which showed several problems re QAT support, and an identified kernel bug. We have reverted to 4.19 as of yesterday until the next LTS kernel is available. I would suggest trying the most recent rolling, and then we will diagnose any persistent issues.

Tried the latest rolling. It boots/runs if you give it 768MB of memory.
At 512MB it hangs as before. I guess minimum requirements will be
changing.

@querubin thanks for the info; that requirement should not persist, as current work should lessen the overhead. I'll link the task back here when defined.

erkin renamed this task from boot problem beginning with VyOS 1.3-rolling-202009011736 to System hangs at boot after mounting config.Aug 29 2021, 1:10 PM
erkin set Issue type to Bug (incorrect behavior).
erkin removed a subscriber: Active contributors.