Trouble booting a Cubox-i4P

Hi everyone. I’ve been using a Solidrun Cubox-i4P with 2 HDDs mounted as RAID1 since 2014 as a personal home server. The hardware is getting pretty old, but it is still serving my needs.

A couple of days ago, however, the machine started rebooting by itself several times around midnight. I didn’t find any concerning error messages in journald, just some pending sectors on one of the HDDs (which I believe are due to the abrupt reboots). I assumed the cause was brief power outages on the electrical network, so after a few days I shut it down for the night. When I tried to boot it in the morning, the machine started rebooting again and again, each time getting a bit further in the boot process. On the serial console, at first it just printed:

U-Boot 2021.10 (Jan 26 2022 - 12:00:00 +0000)

in a loop. Then, after a couple of minutes:

U-Boot 2021.10 (Jan 26 2022 - 12:00:00 +0000)

CPU:   Freescale i.MX6Q rev1.2 1200 MHz (running at 792 MHz)
CPU:   Extended Commercial temperature grade (-20C to 105C) at 37C
Reset cause: POR
Board: MX6 Cubox-i
DRAM:  2 GiB

U-Boot SPL 2021.10 (Jan 26 2022 - 12:00:00 +0000)
WDT:   Not found!
Trying to boot from MMC1

also in a loop. Later, it was able to find the EFI partition, start GRUB, and launch the Linux kernel, and then it rebooted.

Then it went a bit further and started systemd, but then rebooted…

On and on, up to the point where the system was fully booted and stable (i.e., it doesn’t reboot anymore, at least for a couple of hours).
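When capturing the serial console during a loop like this, it can help to tally the "Reset cause:" lines that U-Boot prints, since the value hints at why the board restarted (POR, as in the output above, stands for power-on reset). The following is only a minimal sketch: the function name and the sample text are made up for illustration, and it assumes the capture was saved as plain text.

```python
import re
from collections import Counter

# Matches lines such as "Reset cause: POR" in a captured console log.
# Trailing spaces and carriage returns (common in serial captures) are ignored.
RESET_RE = re.compile(r"^Reset cause:[ \t]*(\S.*?)[ \t\r]*$", re.MULTILINE)


def tally_reset_causes(log_text: str) -> Counter:
    """Count how often each reset cause appears in the log text."""
    return Counter(RESET_RE.findall(log_text))


# Illustrative sample only, modelled on the console output quoted above.
SAMPLE_LOG = (
    "CPU:   Freescale i.MX6Q rev1.2 1200 MHz (running at 792 MHz)\n"
    "Reset cause: POR\n"
    "Board: MX6 Cubox-i\n"
    "Reset cause: POR\r\n"
)

for cause, count in tally_reset_causes(SAMPLE_LOG).most_common():
    print(f"{cause}: {count}")
```

If every entry is a power-on reset rather than, say, a watchdog reset, that is at least consistent with the board losing power, although it does not prove it.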

I checked the filesystem on the memory card with fsck.ext4 and it is clean; I forced the check to be sure, and scanned for bad sectors too (none found). I haven’t done any maintenance on the machine for weeks. One thing I haven’t tried is booting from a different memory card, but I don’t have one at hand.

I’m not sure if the several reboots at midnight are the same problem as this “reboot cascade,” given that in the former case the system was fully operational between boots (logs getting written, etc.). Since this is a home server, the last (voluntary) reboot of the machine was maybe last year. The cascade of reboots is concerning, and it never happened before.

I’m wondering if this may be a sign of old age (11 years, at least 7 of them running non-stop), and that the hardware is starting to fail. What do you think?

We have had customers running devices non-stop since we first started manufacturing them. I personally have a CuBox-i that has been running for over a decade. A couple of things to check. First, just feel the CuBox and see if it is running warmer than expected. Second, it may be the power supply that is starting to fail rather than the system.

The memory card could be an issue, but if other devices are able to read it I kind of doubt it.

If you don’t mind, I would recommend monitoring the temperature, and possibly testing another power supply first.
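For the temperature monitoring, a small script that logs the SoC temperature at regular intervals may be enough to spot a trend. This is only a sketch under assumptions: it supposes the kernel exposes the sensor as thermal_zone0 under /sys/class/thermal (the zone number may differ on a given system), and the function names and default interval are illustrative.

```python
import time
from pathlib import Path

# Assumption: the SoC sensor shows up as thermal_zone0.
# List /sys/class/thermal on the board to confirm the right zone.
SENSOR = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millicelsius(raw: str) -> float:
    """Convert a sysfs reading in millidegrees Celsius to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_temp_c(sensor: Path = SENSOR) -> float:
    """Read the current temperature in degrees Celsius from the sensor file."""
    return parse_millicelsius(sensor.read_text())


def log_temperature(interval_s: float = 60, samples: int = 5,
                    sensor: Path = SENSOR) -> None:
    """Print one timestamped reading per interval, `samples` times."""
    for _ in range(samples):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} {read_temp_c(sensor):.1f} C", flush=True)
        time.sleep(interval_s)
```

Calling `log_temperature(interval_s=60, samples=1440)` on the board and redirecting the output to a file would cover roughly a day, which can then be compared against the times of the reboots.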

I have to check again, but the temperature was around 55-60°C (this summer was quite hot). I cannot tell by touching the Cubox whether it’s warmer than usual. I have a throttling daemon which adjusts the CPU frequency according to the temperature. I read another post where the culprit was the power supply, so I’ll try replacing it.

I just replaced the power supply, and the Cubox successfully booted on the first try. Thanks for the good advice!