TWSD-1296 Failed to adjust Platform Resource Assignment on the AVX 9800 after the vAPV Instance is created
Review Request #1473 — Created March 31, 2026 and updated
| Information | |
|---|---|
| Owner | ngurunathan |
| Repository | AVX2 |
| Branch | rel_avx_2_7_6 |
| Bugs | TWSD-1296 |
| Reviewers | bsrivalli, stevenku, wli |
On some AVX devices with 512GB RAM, NIC initialization is slow, so netdev events such as renaming interfaces to slot-based names may not complete before backend comes up. This was fixed in https://arraynetworks.atlassian.net/browse/TWSD-863 by adding "ExecStartPre=/usr/bin/udevadm settle --timeout=30" to backend.service so that it waits for the events to settle.
In some scenarios udevadm settle times out, so backend.service fails and is retried later. However, avx_monitor.service does not wait for backend.service and starts anyway. Its port_monitor process then creates /var/run/avx_port.cache before interface renaming is done or the VFs are created. As a result, VFs are shown as 0 in Platform Resource Assignment.
To fix this, avx_monitor.service is modified to check that backend.service is active before it is brought up. "Restart on failure" is also added so that the start is retried if backend is inactive, based on https://github.com/systemd/systemd/issues/1312#issuecomment-228874771
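For reviewers, a minimal sketch of what the relevant part of avx_monitor.service might look like. This is not the actual diff. It assumes the check is done with "systemctl is-active" in ExecStartPre, which is consistent with the "systemctl: failed" / "systemctl: active" lines and the status=3 exit in the logs in this description. RestartSec and its value are hypothetical and not taken from the change.

```ini
# avx_monitor.service: sketch of the relevant directives only, not the real unit file.
[Service]
# Assumed form of the check. "systemctl is-active" prints the unit state and
# exits 0 only when backend.service is active; a non-zero exit (3 when the
# unit is not active) makes this unit's start fail.
ExecStartPre=/usr/bin/systemctl is-active backend.service

# Retry the start when the pre-check (or the service itself) fails.
Restart=on-failure

# Hypothetical delay between retries; the value used in the change is not stated.
RestartSec=10
```

A delay between retries is worth considering here: with systemd's short default restart interval, repeated pre-check failures may hit the unit's start rate limit before backend.service has been retried.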
After the fix, when udevadm settle times out in backend:
Mar 31 14:22:13 AN systemd: backend.service: control process exited, code=exited status=1
Mar 31 14:22:13 AN systemd: Failed to start AVX 2.0 Management Console Daemon.
Mar 31 14:22:13 AN systemd: Dependency failed for AVX 2.0 VA/RESOURCE Management Daemon.
Mar 31 14:22:13 AN systemd: Job avxd.service/start failed with result 'dependency'.
Mar 31 14:22:13 AN systemd: Unit backend.service entered failed state.
Mar 31 14:22:13 AN systemd: backend.service failed.
Mar 31 14:22:13 AN systemd: Starting avx monitor service...
Mar 31 14:22:13 AN systemd: Starting watchdog daemon...
Mar 31 14:22:13 AN watchdog[7000]: starting daemon (5.13):
Mar 31 14:22:13 AN systemd: Started watchdog daemon.
Mar 31 14:22:13 AN systemctl: failed
Mar 31 14:22:13 AN systemd: avx_monitor.service: control process exited, code=exited status=3
Mar 31 14:22:13 AN systemd: Failed to start avx monitor service.
Mar 31 14:22:13 AN systemd: Unit avx_monitor.service entered failed state.
Mar 31 14:22:13 AN systemd: avx_monitor.service failed.
Later, it starts up correctly:
Mar 31 14:23:04 AN backend: grep: write error: Broken pipe
Mar 31 14:23:04 AN systemd: Started AVX 2.0 Management Console Daemon.
Mar 31 14:23:04 AN backend: groupadd: group 'nova' already exists
Mar 31 14:23:04 AN backend: Failed to create openstack grp for nova
Mar 31 14:23:04 AN systemd: Starting avx monitor service...
Mar 31 14:23:04 AN systemctl: active
Mar 31 14:23:04 AN backend: adduser: user 'nova' already exists
Mar 31 14:23:04 AN backend: Failed to create openstack user nova
Mar 31 14:23:04 AN systemd: Started avx monitor service.
----------NIC Resource Status--------
domain1:
port 1( 8VFs): 6 available.
port 2( 8VFs): 7 available.
port 3( 8VFs): 8 available.
port 4( 8VFs): 7 available.
port13(16VFs): 12 available.
port14( 0VFs): 0 available.
domain2:
port 5( 8VFs): 8 available.
port 6( 8VFs): 7 available.
port 7( 8VFs): 7 available.
port 8( 8VFs): 7 available.
port 9( 8VFs): 8 available.
port10( 8VFs): 8 available.
port11( 8VFs): 8 available.
port12( 8VFs): 8 available.
