How to *properly* diagnose a persistent memory leak?

  • Hi all.

    TL;DR - How can I *properly* diagnose a persistent memory usage issue, beyond the very basic info that normal debug.logs supply?

    I've been having an annoying and persistent issue with RAM usage in LE9.x/Kodi18.x that's proving very hard to nail down.

    (On a mini-HTPC with Intel CPU & Nvidia GPU, 4GB RAM)

    After a day or two of uptime I can easily reach 3GB of used RAM, and if I let it get much further the machine will reliably crash. That often trashes any open files such as guisettings.xml, which means tedious clean-up/restore-from-backup work to recover.

    Use case: The mini-HTPC is attached to the rear of the TV on the VESA mount & controlled via IR remote, so I usually turn it off and on with suspend-to-ram rather than full poweroff/cold-boot, which would require reaching round the back of the TV to power it back on each time. This means that its effective uptime can get very long, since it's really one long session with periodic STR interruptions. Any issues with poor memory handling over long sessions are therefore more apparent, as they don't get masked/hidden by frequent reboots.

    About the only vague conclusion I have after much investigation & trial/error is that it's likely something to do with buffered video segments not being properly cleared after they are no longer needed.

    This is most pronounced with extensive use of YouTube & Twitch streams, and jumping around/seeking a lot, but is also apparent with locally stored content such as regular MP4/MKV video, or TS files from TVHeadend recordings, albeit to a lesser degree.
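
To tie the growth to a particular activity (streaming, seeking, local playback), one option is to log the resident memory of the Kodi process at regular intervals and compare the timestamps against what was being watched. The following is only a minimal sketch: it assumes a Python 3 interpreter is available on the machine and that the PID of kodi.bin is already known. The function names `parse_vm_rss_kb` and `sample_rss` are made up for illustration; the only real interface used is the `VmRSS` field of `/proc/<pid>/status`.

```python
import re
import time
from pathlib import Path


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_rss(pid, interval_s=60, samples=10):
    """Print a timestamped VmRSS reading for `pid` every `interval_s` seconds."""
    status_path = Path(f"/proc/{pid}/status")
    for _ in range(samples):
        rss_kb = parse_vm_rss_kb(status_path.read_text())
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} rss_kb={rss_kb}")
        time.sleep(interval_s)
```

Redirecting the output of `sample_rss(<pid>)` to a file gives a simple timeline that can be attached to a bug report alongside the debug.log.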

    What technical tools & procedures can I use to give meaningful technical information & diagnoses to the coding team that go beyond "give us a debug.log", which doesn't contain enough information on the underlying processes to usefully investigate further?

    Cheers,

    --

    kurai

  • Yep - saw that but at the time I looked at it, it didn't seem a promising lead since it seemed to be specific to VAAPI, which is not a factor in my setup with VDPAU.

    I was hoping there might be some more useful reporting tools tucked away in the release packaged build of LE, but I guess I'll have to construct a build/compile environment with the development toolchain & valgrind and see if I can muddle my way through to any meaningful result :/

  • Well, that was an ... adventure :/

    I recompiled an LE 9.02 image (Generic x86_64) with debug options on plus valgrind.

    Unfortunately it results in a running state that's so slow it is pretty much unusable for navigating the GUI, let alone when trying to replicate regular usage behaviour by watching videos/streams etc.

    Also ... valgrind stopped reporting errors after the first 1000 - which it reached before all the various Kodi startup operations & plugins had finished initialising. At that point (what would be about 30 seconds into a regular Kodi startup session) it had already generated a 6MB logfile.

    Does anyone have any advice about a saner way to go about this and/or how to interpret the valgrind logfile to derive usable information?
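
One way to make a multi-megabyte Memcheck log digestible is to pull out only the leak "loss record" entries and rank them by size, so the biggest offenders and their top stack frames come first. The sketch below assumes the log was produced with `--leak-check=full` (which is what makes Memcheck print individual loss records) and that it uses the usual `==PID==` line prefix. The function name `top_loss_records` is hypothetical, and "still reachable" records are deliberately ignored. Separately, Valgrind's `--error-limit=no` option is the documented way to stop it giving up after 1000 different errors, though whether that helps here is untested.

```python
import re

# Matches Memcheck loss-record headers such as:
#   ==123== 1,024 bytes in 2 blocks are definitely lost in loss record 7 of 90
#   ==123== 72 (24 direct, 48 indirect) bytes in 1 blocks are definitely lost in ...
LOSS_RE = re.compile(
    r"^==\d+==\s+([\d,]+)\s+(?:\([^)]*\)\s+)?bytes in ([\d,]+) blocks "
    r"are (definitely|indirectly|possibly) lost"
)
# Matches stack frame lines: "==123==    at 0x...: func (file.c:10)"
FRAME_RE = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (.+)$")


def top_loss_records(log_text, limit=10):
    """Return the largest loss records as (bytes, blocks, kind, first_frames)."""
    records = []
    current = None
    for line in log_text.splitlines():
        header = LOSS_RE.match(line)
        if header:
            current = {
                "bytes": int(header.group(1).replace(",", "")),
                "blocks": int(header.group(2).replace(",", "")),
                "kind": header.group(3),
                "frames": [],
            }
            records.append(current)
            continue
        frame = FRAME_RE.match(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
        elif not frame:
            current = None  # any non-frame line ends the current record
    records.sort(key=lambda r: r["bytes"], reverse=True)
    return [
        (r["bytes"], r["blocks"], r["kind"], r["frames"][:3])
        for r in records[:limit]
    ]
```

The first few frames of the largest records are usually the part worth quoting to developers, rather than the whole logfile.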

  • A bit too basic to be useful, I'm afraid.

    That only tells us that it's kodi.bin that's holding on to all the memory - not which subsection/component is at fault.

    Running pmap against the process is not much more useful - it just shows all the missing RAM as being assigned to the general heap, not what put it there.
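
For anyone wanting to reproduce that observation, the per-mapping figures can also be read from `/proc/<pid>/smaps` and totalled by mapping name. This is only a rough sketch with a hypothetical function name (`rss_by_mapping`); it works on the text of an smaps file, so the file can be copied off the box and analysed elsewhere. As noted above, it shows where the resident memory sits (for example `[heap]` or anonymous mappings), not which component allocated it, although comparing two snapshots taken hours apart at least shows which mappings are growing.

```python
import re
from collections import defaultdict

# Mapping header: "start-end perms offset dev inode [pathname]"
HEADER_RE = re.compile(r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+\S+\s+\S+\s+\S+\s*(.*)$")
RSS_RE = re.compile(r"^Rss:\s+(\d+)\s+kB")


def rss_by_mapping(smaps_text):
    """Total resident kB per mapping name, largest first."""
    totals = defaultdict(int)
    name = None
    for line in smaps_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            # Mappings without a pathname are grouped under "[anon]".
            name = header.group(1).strip() or "[anon]"
            continue
        rss = RSS_RE.match(line)
        if rss and name is not None:
            totals[name] += int(rss.group(1))
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```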

  • Thanks for responding, but we're already quite a long way past that kind of thing. ;)

    I'm really looking for advice on how to extract information useful to the coders from a specially compiled debug build with special analysis tools, rather than the general options that are available with the standard release build.

  • "Thanks for responding, but we're already quite a long way past that kind of thing. ;)"

    No problem, you're just already deep into it. :thumbup:

    So, I assume you know that all add-ons are OK, and all switchable services (Bluetooth etc.) are OK, too.

    You can play around with compile switches now, and start inserting your own debug log messages. That's the way I would go from there.