Interrupts from PCI card no longer seen by Linux kernel after a reboot

Hi,

We have an LX2160 HoneyComb board with a network interface PCI card plugged into its PCI slot. What we have observed is that, after a reboot, the interrupts sent from the PCI card to the host CPU are no longer seen by the host CPU (/proc/interrupts shows zero counts for the card’s interrupts). We have tried both our own custom PCI network card and an off-the-shelf Intel 1Gb NIC, and both exhibit the same “loss of interrupts”.

Note that a full power cycle restores operation.

It looks as though a different code path is taken to initialise the hardware on a reboot compared to a power cycle. The only evidence I have for this is that on a reboot the following message appears in the boot log, which is not present after a power cycle:

Re-Distributor 0 LPI is already enabled

which comes from arch/arm/lib/gic-v3-its.c in the U-Boot sources. This may or may not be relevant.

Any help solving this issue would be appreciated.

Antony.
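For anyone wanting to repeat the log comparison described above, here is a minimal sketch in Python. It assumes both console logs have been saved as text; the function names are made up for illustration. It strips the leading kernel timestamps so that lines differing only in timing are not reported, then lists the lines that appear in one log but never in the other.

```python
import re

# Matches a leading kernel timestamp such as "[    1.234567] ".
_TIMESTAMP = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def normalise(line):
    """Drop trailing whitespace and any leading kernel timestamp."""
    return _TIMESTAMP.sub("", line.rstrip())


def lines_only_in(first, second):
    """Return the normalised lines of `first` that never occur in `second`.

    Order of first appearance is kept and duplicates are reported once.
    """
    known = {normalise(line) for line in second.splitlines()}
    seen = set()
    result = []
    for line in first.splitlines():
        text = normalise(line)
        if text and text not in known and text not in seen:
            seen.add(text)
            result.append(text)
    return result
```

Calling `lines_only_in(bad_log_text, good_log_text)` with the contents of the two saved logs gives the messages that only show up on the bad reboot. Lines that embed addresses or counters that change between boots will still be reported and need filtering by eye.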

Can you provide a pastebin of your dmesg output from an unsuccessful boot, as well as the output of cat /proc/interrupts? Thanks.

The console output from the board on a bad reboot is at SolidRun LX2160 Bad Reboot - Pastebin.com, and the console output on a good boot after a power cycle is at SolidRun LX2160 Good Boot - Pastebin.com. The output from cat /proc/interrupts:

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       CPU8       CPU9       CPU10      CPU11      CPU12      CPU13      CPU14      CPU15      
  9:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  25 Level     vgic
 11:       4103       3063       4548       3384       3501       4431       4950       3255       3326       3784       4181       3416       1773       1778       1808       1857     GICv3  30 Level     arch_timer
 12:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  27 Level     kvm guest vtimer
 14:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  23 Level     arm-pmu
 19:       4436          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  66 Level     2000000.i2c
 20:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  67 Level     2020000.i2c
 21:         58          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 106 Level     2040000.i2c
 22:          3          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  57 Level     20c0000.spi
 23:       6547          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  60 Level     mmc0
 24:        265          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  95 Level     mmc1
 25:       1535          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  64 Level     uart-pl011
 27:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  68 Level     gpio-cascade, gpio-cascade
 28:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  69 Level     gpio-cascade, gpio-cascade
 30:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  76 Level     2800000.timer
 31:          1          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 112 Level     xhci-hcd:usb1
 32:        115          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 113 Level     xhci-hcd:usb3
 33:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 165 Level     ahci-qoriq[3200000.sata]
 34:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 168 Level     ahci-qoriq[3210000.sata]
 35:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 129 Level     ahci-qoriq[3220000.sata]
 36:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 132 Level     ahci-qoriq[3230000.sata]
 37:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 150 Level     PCIe PME, aerdrv
 38:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 160 Level     PCIe PME, aerdrv
 39:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  45 Level     arm-smmu global fault
 40:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  46 Level     arm-smmu global fault
 41:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  47 Level     arm-smmu global fault
 42:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3  48 Level     arm-smmu global fault
 43:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 243 Level     arm-smmu global fault
 44:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 244 Level     arm-smmu global fault
 45:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 245 Level     arm-smmu global fault
 46:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 246 Level     arm-smmu global fault
 47:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 247 Level     arm-smmu global fault
 48:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 248 Level     arm-smmu global fault
 49:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 249 Level     arm-smmu global fault
 50:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 250 Level     arm-smmu global fault
 51:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 251 Level     arm-smmu global fault
 52:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 252 Level     arm-smmu global fault
121:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230000 Edge      dpmac.10
122:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230001 Edge      dpmac.9
123:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230002 Edge      dpmac.8
129:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230008 Edge      dprtc.0
130:          3          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230009 Edge      dpio.15
131:          0          4          0          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230010 Edge      dpio.14
132:          0          0          4          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230011 Edge      dpio.13
133:          0          0          0          4          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230012 Edge      dpio.12
134:          0          0          0          0         44          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230013 Edge      dpio.11
135:          0          0          0          0          0          2          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230014 Edge      dpio.10
136:          0          0          0          0          0          0         40          0          0          0          0          0          0          0          0          0  ITS-fMSI 230015 Edge      dpio.9
137:          0          0          0          0          0          0          0         15          0          0          0          0          0          0          0          0  ITS-fMSI 230016 Edge      dpio.8
138:          0          0          0          0          0          0          0          0         27          0          0          0          0          0          0          0  ITS-fMSI 230017 Edge      dpio.7
139:          0          0          0          0          0          0          0          0          0         91          0          0          0          0          0          0  ITS-fMSI 230018 Edge      dpio.6
140:          0          0          0          0          0          0          0          0          0          0          1          0          0          0          0          0  ITS-fMSI 230019 Edge      dpio.5
141:          0          0          0          0          0          0          0          0          0          0          0          4          0          0          0          0  ITS-fMSI 230020 Edge      dpio.4
142:          0          0          0          0          0          0          0          0          0          0          0          0          1          0          0          0  ITS-fMSI 230021 Edge      dpio.3
143:          0          0          0          0          0          0          0          0          0          0          0          0          0          3          0          0  ITS-fMSI 230022 Edge      dpio.2
144:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          2          0  ITS-fMSI 230023 Edge      dpio.1
145:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          5  ITS-fMSI 230024 Edge      dpio.0
146:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  ITS-fMSI 230025 Edge      dprc.1
147:          0          0          0          0          0          0          1          0          0          0          0          0          0          0          0          0  ITS-fMSI 230026 Edge      dpni.0
148:          0          0          0          0          0          0          0          1          0          0          0          0          0          0          0          0  ITS-fMSI 230027 Edge      dpni.1
377:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  mpc8xxx-gpio   0 Edge      sfp-0-mod-def0
378:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  mpc8xxx-gpio   9 Edge      sfp-1-mod-def0
379:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  mpc8xxx-gpio  10 Edge      sfp-2-mod-def0
380:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  mpc8xxx-gpio  11 Edge      sfp-3-mod-def0
381:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0  mpc8xxx-gpio   6 Edge      power
382:        485          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 172 Level     8010000.jr
383:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 173 Level     8020000.jr
384:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0     GICv3 174 Level     8030000.jr
385:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   ITS-MSI 524288 Edge      bh2-bmi
386:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   ITS-MSI 524289 Edge      bh2-rx
417:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   ITS-MSI 526336 Edge      bh2-bmi
418:          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0   ITS-MSI 526337 Edge      bh2-rx
IPI0:      1002       1272       1556       1071       1866       1067       1057        896        985       1306       3999       1225         17         18         19         18       Rescheduling interrupts
IPI1:       953        301        204        177        185        173        166        227        159        170        187        181        153        153        153        154       Function call interrupts
IPI2:         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0       CPU stop interrupts
IPI3:         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0       CPU stop (for crash dump) interrupts
IPI4:         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0       Timer broadcast interrupts
IPI5:         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0       IRQ work interrupts
IPI6:         0          0          0          0          0          0          0          0          0          0          0          0          0          0          0          0       CPU wake-up interrupts
Err:          0

Note that the kernel configuration we are using is not the stock version, since we have disabled everything related to sound and graphics and enabled most networking options (we can provide the config if needed). We are also running a minimalist Ubuntu 22.04 LTS as the distribution.
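A listing this wide is hard to scan by eye, so here is a rough Python sketch that sums the per-CPU counts for each line of /proc/interrupts and picks out the ITS-MSI entries that have never fired. The function names are made up for illustration, and the parsing assumes the usual layout (a header row of CPU names, then one row per interrupt).

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into (irq, total_count, description) tuples."""
    lines = text.splitlines()
    if not lines:
        return []
    ncpu = len(lines[0].split())  # header row holds one name per CPU
    rows = []
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        # At most one count per CPU; stop at the first non-numeric field.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        rows.append((label.strip(), sum(counts), description))
    return rows


def silent_msi(rows):
    """Return (irq, description) for ITS-MSI interrupts with a zero total."""
    return [
        (irq, desc)
        for irq, total, desc in rows
        if total == 0 and "ITS-MSI" in desc.split()
    ]
```

On the board this would be used as `silent_msi(parse_interrupts(open("/proc/interrupts").read()))`. Matching on the exact `ITS-MSI` token keeps the `ITS-fMSI` entries (dpio, dpni and so on) out of the result.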

Can you also provide the output of lspci -vvv in both cases? In both cases your devices should be using MSI interrupts, so I don’t believe the error message you are seeing is relevant.

In both cases the cards are detected, so my guess is that this is most likely due to PCIe power saving misbehaving. Could you please also do a boot with pcie_aspm=off added to the kernel command line to see if that makes a difference?
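When testing a boot option like this, it is worth confirming that the parameter actually reached the kernel. The following Python sketch splits a kernel command line into a dictionary; the function name is made up for illustration, and it deliberately ignores quoted values containing spaces.

```python
def kernel_params(cmdline):
    """Split a kernel command line into {name: value}.

    Flags without '=' map to None. Quoted values containing spaces are
    not handled; this is only meant for simple checks.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params
```

On the running system, `kernel_params(open("/proc/cmdline").read()).get("pcie_aspm")` should return `"off"` if the option was picked up.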

See SolidRun LX2160 Good lspci -vvv - Pastebin.com for the lspci -vvv output after a good boot and see SolidRun LX2160 Bad lspci -vvv - Pastebin.com after a bad reboot. This is with our custom PCI card installed (rather than the Intel NIC). The only difference I see in the output is the following change:

  • Good: Masking: fffffff8 Pending: 00000000
  • Bad: Masking: fffffffc Pending: 00000000

in the capabilities configuration of the PCI card.

I added pcie_aspm=off to the kernel boot options, but this did not make any difference (the problem is still present).
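To make the Masking difference quoted above easier to read, here is a small Python sketch that lists which vectors a mask value leaves unmasked. It assumes the value lspci prints is the MSI Mask Bits register, in which a set bit masks the corresponding vector; the function name is made up for illustration.

```python
def unmasked_vectors(mask, width=32):
    """Return the vector numbers whose mask bit is clear (i.e. unmasked)."""
    return [bit for bit in range(width) if not (mask >> bit) & 1]


print(unmasked_vectors(0xFFFFFFF8))  # good boot: [0, 1, 2]
print(unmasked_vectors(0xFFFFFFFC))  # bad reboot: [0, 1]
```

Under that assumption, the good boot leaves three vectors unmasked and the bad reboot leaves two. This only decodes the values; it does not by itself explain the missing interrupts.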

Does the card work if MSI interrupts are disabled (pci=nomsi)?

Oh, I just noticed: are you using SR-IOV on these cards?

The answers to your questions:

  • Does it work with pci=nomsi? - No
  • Are you using SR-IOV? - No

Hi,

We have run into the exact same issue, even with LSDK21.08.

Does anyone have a solution to this?

Cheers,

Neil

Is the card detected fine and shows up in lspci? My initial thought is that the card is having issues by not getting a hard reset on reboot.

The card is detected fine, enumerates correctly (all BARs are allocated correctly), and the driver loads correctly, but no interrupts trigger when the card writes to its MSI address. The card only supports MSI interrupts and does not support ASPM; this is correctly advertised in the PCIe configuration space and reported by lspci. The card is reset correctly and we can communicate with it, so it seems the host is ignoring the write to the MSI address. Power cycle the board and all is fine.
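One way to compare what the host programmed into the card on a good boot and on a bad reboot is to decode the MSI capability straight from the raw configuration space. The Python sketch below walks the standard capability list and decodes the MSI capability (ID 0x05) following the PCI layout; the function names are made up for illustration, and it assumes the full configuration space has been read into a bytes object.

```python
import struct

MSI_CAP_ID = 0x05


def find_capability(cfg, cap_id):
    """Return the config-space offset of a capability, or None if absent."""
    status = struct.unpack_from("<H", cfg, 0x06)[0]
    if not status & 0x10:  # Status bit 4: capabilities list present
        return None
    ptr = cfg[0x34] & 0xFC  # low two bits of the pointer are reserved
    visited = set()
    while ptr and ptr not in visited and ptr + 2 <= len(cfg):
        visited.add(ptr)
        if cfg[ptr] == cap_id:
            return ptr
        ptr = cfg[ptr + 1] & 0xFC
    return None


def decode_msi(cfg):
    """Decode the MSI capability; returns a dict, or None if there is none.

    Raises struct.error if `cfg` is too short to hold the capability.
    """
    off = find_capability(cfg, MSI_CAP_ID)
    if off is None:
        return None
    control = struct.unpack_from("<H", cfg, off + 2)[0]
    is_64bit = bool(control & 0x80)
    address = struct.unpack_from("<I", cfg, off + 4)[0]
    if is_64bit:
        address |= struct.unpack_from("<I", cfg, off + 8)[0] << 32
    data_off = off + (0x0C if is_64bit else 0x08)
    info = {
        "enabled": bool(control & 0x01),
        "is_64bit": is_64bit,
        "address": address,
        "data": struct.unpack_from("<H", cfg, data_off)[0],
    }
    if control & 0x100:  # per-vector masking capable
        info["mask"] = struct.unpack_from("<I", cfg, data_off + 4)[0]
    return info
```

On Linux the raw bytes can be read from the device's `config` file under `/sys/bus/pci/devices/`, for example `/sys/bus/pci/devices/0000:01:00.0/config` (the address here is a placeholder; use the one lspci reports for the card). Reading as root is needed to get more than the first 64 bytes. Comparing the decoded address, data and enable bit between the two boots would show whether the card was programmed differently or whether the difference lies on the host side.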

Thanks for resurrecting this thread…

“My initial thought is that the card is having issues by not getting a hard reset on reboot.”

If this is the case, then what can be done to get the PCIe reset (i.e. the PERSTn signal) asserted on reboot? Do we need a CPLD/FPGA firmware update?

Without a working PERSTn, a power cycle will be needed after every reboot.

Note that in addition to NICs (our own design and off the shelf) we have seen a similar "not working after reboot" problem appear on an NVMe card (a Sabrent in this case).

I was curious about PERST, because our reset for PCIe and all the other external devices is tied to the reset of the system. You can review this under the Reset Logic and Boot Select section of the CEX7 simplified schematic. The SYSRST_OUT signal is tied into most of the hardware reset signals. Sometimes PCIe devices such as FPGAs require a longer time to reset and load firmware, and the timing of PERST vs CLOCK etc. can cause issues on reset.

If you don’t mind (this isn’t a recommended solution, just a debugging option), it would be interesting to know whether you still see this issue when using our UEFI firmware.

Otherwise we will probably need some more debug output from the IRQ and PCIe initialization to dig into this further.