LE11/12: CIR Stops Working after Upgrade

  • I've been troubleshooting an issue for months now and have gotten nowhere. I recently built a custom image with kernel debug symbols enabled, hoping it would provide some insight into my issue. Since I never see a coredump or anything similar, I'm guessing it was a pointless effort...

    Before opening an issue in git, I figured I should check whether there is anything else I could collect or try, so that the report includes as much information as possible.

    I have multiple 6th generation Intel NUCs with a built-in Nuvoton consumer infrared (CIR) module that has functioned for many years on LE7/8/9/10 without any issue. After an upgrade to any of LE11/12/13 (basically kernel 6.x+) my remote randomly stops working. The issue usually occurs if I start scrolling through my PVR/TV menu shortly after boot. Once I notice that no input is being recognized, I run ir-keytable -t to look for button presses and nothing shows up.

    To fix this I can simply unload and reload the kernel module (nuvoton_cir) and everything works as normal including ir-keytable populating properly. Generally after performing this it will work until it's restarted--but not always. Kodi debug logs show absolutely no issue/change when the problem occurs. dmesg output also isn't of much use:

    The issue did happen right at the second RX FIFO message. The next message (417.036419) is me unloading/reloading the kernel module. What's strange is if I sit and jam a bunch of input buttons I see this message, even though it doesn't freeze. Looking through source code, it looks like this is just an informative message maybe?

    Lircd is not set up, configured, or running, and this is with a new default install of LE11/12/13.


    This happens with both the legacy and generic image (as expected). Any ideas? It's driving me crazy.
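
    For anyone hitting the same symptom, the unload/reload workaround described above could be wrapped up roughly like this. It is only a sketch: reload_cir, run_cmd and the DRY_RUN switch are made-up names for illustration, and the only things taken from this post are the module name (nuvoton_cir) and the use of ir-keytable -t to check for input afterwards.

```shell
#!/bin/sh
# Sketch of the unload/reload workaround. reload_cir, run_cmd and DRY_RUN
# are hypothetical names, not part of LibreELEC.

# Run a command, or just print it when DRY_RUN=1 is set.
run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Unload the CIR driver, then load it again (needs root when not a dry run).
reload_cir() {
    module="${1:-nuvoton_cir}"
    run_cmd modprobe -r "$module" && run_cmd modprobe "$module"
}
```

    Running DRY_RUN=1 reload_cir only prints the two modprobe commands; running reload_cir as root performs the reload, after which ir-keytable -t can be used to confirm that button presses show up again.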

  • From the problem description it sounds like a kernel issue, so triage needs to involve building custom LE images with kernel changes that attempt to pinpoint changes to the nuvoton driver. The normal git bisect workflow (on the whole kernel) will be challenging, as rolling back to a much earlier kernel requires more than just the kernel githash to be changed in the LE buildsystem, and you'll end up going down a rabbit hole of changes. I would instead see if there are interesting looking changes that can simply be reverted to see if anything has a positive impact.

    This is the history of the driver: https://github.com/torvalds/linux…c/nuvoton-cir.c

    If you are sure that problems start around the 5.x to 6.x transition we have to look back some way. Most of the changes look unrelated and harmless. The first one that sounds a little interesting to me is https://github.com/torvalds/linux…e6233beadc650c4 but to set expectations, I'm not a code guru.

    However, to illustrate the process and test the theory I would git clone kernel sources and then revert that specific change with "git revert 4345e2e5c75895232a17e6783e6233beadc650c4" .. and if we are lucky this will revert cleanly. If it does, you can then do "git format-patch HEAD~1" to create a diff patch of the revert commit and copy the patch file to packages/linux/patches/ in the LE buildsystem. If you now respin the image, the buildsystem will detect the patch and only rebuild the kernel package with the patch applied, and then you can test the image. If the git revert fails, it's because there are other changes to the same code area since that commit, and then you need to revert the commits that touch the code in reverse sequence. That will either be a relatively simple task with a couple more steps (and generating a few more patches to copy/respin/test) or it'll be a huge challenge. The nuvoton driver doesn't look like it had that much real-world change in recent years though, so I have a hunch specific commits will be okay to revert and test. No promises though.
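
    As a rough sketch of those steps, something like the helper below would revert one commit in a kernel clone and write the revert out as a patch file. make_revert_patch is a made-up name for illustration and the directory paths in the usage note are assumptions; only the commit hash, the two git commands and the packages/linux/patches/ destination come from the description above.

```shell
#!/bin/sh
# Sketch: revert one commit in a kernel clone and export the revert as a patch.
# make_revert_patch is a hypothetical helper, not part of the LE buildsystem.
make_revert_patch() {
    repo="$1"      # path to a git clone of the kernel sources
    commit="$2"    # hash of the commit to revert
    outdir="$3"    # where to write the patch file

    mkdir -p "$outdir" || return 1
    # git -C changes directory first, so hand format-patch an absolute path.
    outdir_abs=$(cd "$outdir" && pwd) || return 1

    # --no-edit keeps the default "Revert ..." message without opening an editor.
    if ! git -C "$repo" revert --no-edit "$commit"; then
        # Conflicts mean later commits touched the same code: back out cleanly.
        git -C "$repo" revert --abort
        echo "revert of $commit did not apply cleanly" >&2
        return 1
    fi

    # HEAD~1 exports only the newest commit, which is the revert itself.
    git -C "$repo" format-patch -o "$outdir_abs" HEAD~1
}
```

    Assuming the kernel clone is in ~/linux and the LE buildsystem checkout is in ~/LibreELEC.tv (both paths are placeholders), a call such as make_revert_patch ~/linux 4345e2e5c75895232a17e6783e6233beadc650c4 ~/LibreELEC.tv/packages/linux/patches would leave the patch where the buildsystem picks it up on the next respin.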

    The problem description shows you have a moderate level of Linux skills (enough to compile images at least) so hopefully you might have half a clue on what I'm rambling on about :)

  • chewitt That seems to have done the trick! I was going down the right path and looking at the same nuvoton-cir.c changes but wasn't sure how I could do the revert/patch in git--network engineer, not a developer so thanks for this!

    I just cloned the master Linux branch and didn't bother even checking out a branch closer to what LE11 is running. Ran the revert and patch and recompiled the image.

    I'll apply this to my other PCs and test for a few days. I could attempt to do a pull request for this but it may take a few attempts.. ;)

    Thanks!

  • No luck reverting to the next commit: https://github.com/torvalds/linux…a91b5d0dd6c7871 Lots of conflicts/add/removes that I'm unsure how to proceed fixing.

    This next commit (183e19) happened in the 4.20rc1 branch/tag, and the commit you referenced (4345e2e) was the first in the 5.x branch/tag. Since the problem for sure appears between Linux Kernel 5.10.x and Linux Kernel 6.1.x, I'm guessing that reverting to this one wouldn't make a lot of sense.
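
    One way to narrow the candidates to that window is to list only the commits between two kernel tags that touch a given file. The sketch below is illustrative: list_commits_between is a made-up helper name, and the tags and path in the usage note are assumptions that would need adjusting to the kernels actually in use.

```shell
#!/bin/sh
# Sketch: list commits between a known-good and a known-bad ref that touch
# one path. list_commits_between is a hypothetical helper name.
list_commits_between() {
    repo="$1"   # path to a git clone of the kernel sources
    good="$2"   # ref where the remote still worked
    bad="$3"    # ref where the problem is present
    path="$4"   # file or directory inside the kernel tree

    # "good..bad" selects commits reachable from bad but not from good;
    # the pathspec after "--" keeps only commits that change that path.
    git -C "$repo" log --oneline "$good..$bad" -- "$path"
}
```

    For example, list_commits_between ~/linux v5.10 v6.1 <path-to>/nuvoton-cir.c, with the path filled in from the kernel tree, would print one line per candidate commit, newest first.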

    Is there anything else you think I could try reverting or any additional ways to debug the nuvoton module? I just don't know enough about how this kernel module would interact with other parts of the kernel. It looks like the rc-core driver may work in conjunction with the nuvoton driver, but I'm not sure...

  • smp that's my conclusion as well and the changes seem pretty basic in that commit. My next step was to look at maybe using kgdb (after enabling in the kernel config) to try to find some additional info.

    I'm an experienced Linux user (20+ years) and have been using xbmc/Kodi since the beginning, but this falls outside what I know and I'm not sure that it will be helpful. From initial research, debugging an in-tree loadable kernel module is tough to say the least..

    My configs are vanilla and have been unchanged since LE7, and I'm willing to test/try just about anything rather than refresh 5 Skull Canyon PCs; I'm sure someone out there is facing the same issue? :(

    Thanks for all the helpful responses from everyone so far.

  • chewitt I've been attempting to do a bisect and you're correct, it's been very challenging especially with patches and gcc versions not liking certain kernel versions. Am I able to compile the vanilla kernel outside of the LE build environment and simply install it to an existing running LE instance?

    I'm having trouble determining the best workflow to complete the bisect and any guidance would be appreciated. I assume this is really the only way to fix this unless another LE developer has the same hardware and is willing to test. I'm happy to do the legwork but I'm hours into the bisect, screwing around with build errors.
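
    For reference, git bisect can be limited to commits that touch a particular path, which cuts down the number of kernels that have to be built. The sketch below only starts such a bisect; start_path_bisect is a made-up helper name, the refs and path in the usage note are placeholders, and it assumes the regression really lives in the path given (if it is in rc-core rather than the nuvoton driver, a wider path would be needed). Every step still needs a bootable build to test, so this does not get around the build errors.

```shell
#!/bin/sh
# Sketch: start a bisect restricted to commits touching one path.
# start_path_bisect is a hypothetical helper name.
start_path_bisect() {
    repo="$1"   # path to a git clone of the kernel sources
    good="$2"   # ref where the remote still worked
    bad="$3"    # ref where the problem is present
    path="$4"   # file or directory the bisect is limited to

    # Argument order is bad first, then good; the pathspec after "--"
    # makes bisect consider only commits that change that path.
    git -C "$repo" bisect start "$bad" "$good" -- "$path"
}
```

    After something like start_path_bisect ~/linux v5.10 v6.1 <path-to-driver-directory>, git checks out a candidate commit. Build and test it, mark the result with git bisect good or git bisect bad, repeat until git names the first bad commit, then run git bisect reset to return to the original checkout.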