[x86-64] Crashes overnight

  • I was unhappy with the performance of my Raspberry Pi 4 running LibreELEC, so I switched my HTPC to a ThinkCentre M720q. While the performance is much improved, it will crash overnight if left alone.

    The setup is: Computer (with USB CEC dongle) -> Marantz NR1604 receiver -> Sharp Aquos 4T-C70BK2UD TV. LibreELEC 12.2.1.

    Symptoms: No video output, no response to the keyboard, not connected to the network (it's on WiFi). When I turn the TV on, it calls the input "CEC Adapter" instead of Kodi.

    The crash seems to be time-based. I typically watch stuff in the evenings. If I turn it off around 9pm and then try turning it on the next day around 6-7pm, it is always non-responsive. As an experiment, I turned the TV on the next morning, and it was fine.

    I left an SSH session logged in and tailing the logs the last time it crashed, but there was nothing helpful. It seemed to show only application logs. Is there some way to get kernel logs out of this thing?

  • You can enable persistent logging in the LE settings add-on. Then you can run journalctl -b -1 | paste to pastebin the previous system log (current boot = 0, previous = -1, etc.). I'd be looking for Out Of Memory (OOM) issues and other kernel splats in the log.
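
    As a rough sketch of what "looking for OOM issues and other kernel splats" could mean in practice, the filter below greps log text for a few well-known kernel trouble messages. The function name find_splats and the pattern list are illustrative assumptions, not a complete set, and journalctl's -k flag (kernel messages only) is used together with -b -1 (previous boot).

```shell
# Filter log text on stdin for common signs of kernel trouble.
# The pattern list is an illustrative assumption, not an exhaustive set.
find_splats() {
    grep -iE 'out of memory|oom-killer|kernel panic|kernel BUG|Oops|call trace|soft lockup|blocked for more than|general protection fault'
}

# Typical use on the box itself, after persistent logging is enabled:
#   journalctl -k -b -1 --no-pager | find_splats
```

    If the filter prints nothing, that only rules out the patterns listed here; a hard hang can stop the machine before anything reaches the journal.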

  • Thanks. I enabled persistent logs, and it stayed up a few days. Found it crashed again this morning. Nothing interesting in the logs.

    Here's when it last booted:

    Code
    Mar 28 09:48:46 LibreELEC kernel: Linux version 6.16.12 (docker@c7f39856707b) (x86_64-libreelec-linux-gnu-gcc-13.2.0 (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41) #1 SMP Sat Nov  1 07:12:40 UTC 2025

    And the last message in the logs is:

    Code
    Mar 29 15:14:21 baryon connmand[537]: ntp: adjust (slew): -0.000993 sec


    Last seven hours of logs are NTP messages and this temp file cleanup:

    Code
    Mar 29 10:04:58 baryon systemd[1]: Starting systemd-tmpfiles-clean.service...
    Mar 29 10:04:58 baryon systemd[1]: systemd-tmpfiles-clean.service: Deactivated successfully.
    Mar 29 10:04:58 baryon systemd[1]: Finished systemd-tmpfiles-clean.service.


    These four messages are the only ones that look like something that might be off:

    Code
    Mar 28 15:31:01 baryon kernel: perf: interrupt took too long (2525 > 2500), lowering kernel.perf_event_max_sample_rate to 79200
    Mar 28 19:07:47 baryon kernel: perf: interrupt took too long (3162 > 3156), lowering kernel.perf_event_max_sample_rate to 63000
    Mar 29 00:35:16 baryon kernel: perf: interrupt took too long (3972 > 3952), lowering kernel.perf_event_max_sample_rate to 50100
    Mar 29 08:44:23 baryon kernel: perf: interrupt took too long (4974 > 4965), lowering kernel.perf_event_max_sample_rate to 40200


    From what I see, these typically happen when the CPU frequency scales up or down, which doesn't sound like a problem.
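
    To make the trend in those perf messages easier to read, something like the following awk filter could boil each line down to its numbers. The function name perf_summary and the output field labels are made up for illustration. In the four lines above, each new limit is about 1.25 times the previously measured value (2525 to 3156, 3162 to 3952, 3972 to 4965), so the limit looks like it is simply being ratcheted up after each event.

```shell
# Summarize "perf: interrupt took too long" lines read from stdin.
# Prints: <month> <day> <time> took=<measured> limit=<allowed> new_rate=<rate>
perf_summary() {
    awk '/perf: interrupt took too long/ {
        split($0, part, /[()]/)      # part[2] is e.g. "2525 > 2500"
        split(part[2], val, " > ")   # val[1] = measured, val[2] = limit
        print $1, $2, $3, "took=" val[1], "limit=" val[2], "new_rate=" $NF
    }'
}

# Typical use on the box itself:
#   journalctl -k --no-pager | perf_summary
```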

    Any other ideas?

  • Hi, this may not be the "right solution", but I've been using a daily reboot cron job for a long time. These OSes are designed to run continuously, but that only holds in an ideal world. For "strange behaviors" like this, a regular reboot can be good prevention. A cron job like "15 04 * * * reboot" will reboot your device every day at 04:15 am.
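
    For reference, the five fields in that entry are minute, hour, day of month, month and day of week, followed by the command. A minimal sketch of adding the line without creating duplicates is below; the helper name add_reboot_entry is hypothetical, and the crontab path in the usage comment is an assumption about where LibreELEC keeps root's crontab, so check it on your own install.

```shell
# Append the daily-reboot entry to a crontab file unless it is already there.
# Fields: minute hour day-of-month month day-of-week command.
add_reboot_entry() {
    cron_file=$1
    entry='15 04 * * * reboot'
    if ! grep -qxF "$entry" "$cron_file" 2>/dev/null; then
        printf '%s\n' "$entry" >> "$cron_file"
    fi
}

# Possible use (the path is an assumption, verify it first):
#   add_reboot_entry /storage/.cache/cron/crontabs/root
```

    Running crontab -e over SSH and pasting the line by hand does the same job.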

  • It crashed again at 3:17am this morning. Same stuff in the logs, NTP, perf, tmpfiles cleanup. Nothing that looks obviously wrong. What else can I look at here? Is there somewhere to report a bug?

  • You can enable persistent logging in the LE settings add-on (requires a reboot afterwards). This lets you reboot and then share the previous system log with e.g. journalctl -b -1 --no-pager | paste. But if the Kodi process simply goes unresponsive instead of actually crashing (which would leave traces), there's probably still nothing to see. You can also look at kodi.old.log (boot -1) to see if there are signs of problems there. Kodi also records crash logs, but (again) that requires the process to terminate, and other problems like resource starvation won't necessarily effect a crash.
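
    A quick way to skim kodi.old.log is to pull out only the error and fatal lines. This is a sketch that assumes the log format of recent Kodi versions, where the level appears as a lowercase word followed by a component in angle brackets (for example "error <general>:"). The function name kodi_log_errors is made up, and the log path in the usage comment is the usual LibreELEC location but is stated here as an assumption.

```shell
# Print only the error/fatal lines from a Kodi log file.
# Assumes the "<level> <component>:" layout of recent Kodi logs.
kodi_log_errors() {
    grep -E ' (error|fatal) <' "$1"
}

# Possible use (path is an assumption):
#   kodi_log_errors /storage/.kodi/temp/kodi.old.log
```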

    You can also update to a current nightly to see if that magically fixes anything. Back up /storage/.kodi to /storage/.kodi-old first to make downgrading possible.
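
    The backup step could look roughly like this. The helper name backup_dir is hypothetical, and stopping Kodi while copying (shown in the usage comment) is an extra precaution assumed here, not something the advice above requires.

```shell
# Copy a directory tree, preserving permissions and timestamps,
# and refuse to overwrite an existing backup.
backup_dir() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "missing source: $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        echo "refusing to overwrite: $dst" >&2
        return 1
    fi
    cp -a "$src" "$dst"
}

# Possible use over SSH (stopping Kodi first is an assumption):
#   systemctl stop kodi
#   backup_dir /storage/.kodi /storage/.kodi-old
#   systemctl start kodi
```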

  • I think I may have a hardware issue, as the machine hung while I was watching stuff on it last night. Will try a different machine to see if I have the same symptoms there.

    Thanks for the ideas.

  • Also check for thermal issues. If the SoC gets too hot it will self-throttle, resulting in reduced performance. These boards are happy running at much hotter temps than we're comfortable with, but there are limits. Software issues that leave CPU cores stuck or spinning can get things hot though, and RPi4 requires notably more cooling than RPi3/RPi5.
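
    One way to check temperatures over SSH is to read the kernel's thermal zones from sysfs, which report millidegrees Celsius. This is a minimal sketch. The function name show_temps is made up, and the optional directory argument exists only so the function can be tried against a fake tree. Which zones exist, and what they are called, depends on the hardware.

```shell
# Print "<zone type> <temperature in whole degrees C>" for each thermal zone.
# The optional argument overrides the sysfs base directory (for testing only).
show_temps() {
    base=${1:-/sys/class/thermal}
    for zone in "$base"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue
        name=$(cat "$zone/type" 2>/dev/null || echo unknown)
        millideg=$(cat "$zone/temp")
        echo "$name $((millideg / 1000))"
    done
}

# Possible use on the box itself:
#   show_temps
```

    Running it every minute or so from a second SSH session while the machine is busy would show whether temperatures climb before a hang.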