external HDDs - random unmount of drives on rpi4

  • I have an external hdd tower with 5 hdds inside. The hdd tower has its own 12V 6.5A power supply and uses a single USB cable to connect to rpi4. I can access media on all 5 drives most of the time, however, the system seems to randomly unmount some of the disks from time to time. The issue is resolved when I unplug the HDD hub from rpi4, plug it into a windows PC (which automatically prompts me to scan hdds for errors), check the disks, gracefully unmount from windows, and then plug back into rpi4. The disks are not damaged or corrupt. All drives are mounted into kodi, and I can watch media sources on all drives.

    ...until it happens again, without any obvious reason or trigger. Wash, rinse, repeat. I am tired of having to do this unplug/scan for errors/unmount/remount process at random times, maybe twice a week. Why is this happening?

    dmesg output:

    (external link: pastebin.com)


    kodi log attached (too large for pastebin, and ix.io is down)

  • This sounds like a power problem (despite the fact that your HDD tower is self-powered), maybe some power peaks at night etc. I would suggest adding a reliable powered USB hub between the RPi and the HDD tower (like this) and seeing if that makes any difference.

  • You don't state which format the disks are in, but from the reference to the Windows PC I'm guessing NTFS. If so, and you need to keep Windows compatibility, you could try formatting to exFAT, which seems more stable with LibreELEC. If you don't need Windows compatibility then go to ext4 and read this thread: fsck

  • Quote: "This sounds like a power problem (despite the fact that your HDD tower is self-powered), maybe some power peaks at night etc. I would suggest adding a reliable powered USB hub between the RPi and the HDD tower (like this) and seeing if that makes any difference."

    The hdd tower has its own ~7A external supply. It powers both the hdds and the SATA-USB bridge. It does not make sense to add a USB hub, and the manufacturer explicitly recommends not using a hub. Why add more USB in the mix? HDDs do not draw power over USB, they are externally powered.


    All disks are NTFS. I am not aware of any hdd formatting issues with LE. Shouldn't all of the hdds equally have issues (all unmount instead of just 2-3)?


    The hdd tower is JBOD using this hardware: https://www.amazon.com/gp/product/B06XK972L1

    Maybe the tower is putting some inactive drives into sleep mode?


    There's clearly some useful debugging in dmesg if someone could help interpret it.


    [ 33.086977] ntfs3: sda5: volume is dirty and "force" flag is not set!
    [ 33.141480] ntfs3: sdc2: volume is dirty and "force" flag is not set!
    [ 33.178834] ntfs3: sde2: volume is dirty and "force" flag is not set!
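
    On interpreting those lines: the ntfs3 kernel driver declines to mount a volume that is marked dirty unless the "force" mount option is given, which would fit three partitions (sda5, sdc2, sde2) staying unmounted until Windows has checked them. To pull the affected partitions out of a longer dmesg dump, here is a minimal Python sketch. The function name is made up for illustration, and the regex assumes the exact message format shown above.

```python
import re

# Matches messages of the form:
#   ntfs3: sda5: volume is dirty and "force" flag is not set!
DIRTY_RE = re.compile(r'ntfs3: (\w+): volume is dirty')

def dirty_ntfs_partitions(dmesg_text):
    """Return the partitions ntfs3 reported as dirty, in order of appearance."""
    return DIRTY_RE.findall(dmesg_text)

# The three lines quoted from the dmesg output in this thread.
sample = """\
[ 33.086977] ntfs3: sda5: volume is dirty and "force" flag is not set!
[ 33.141480] ntfs3: sdc2: volume is dirty and "force" flag is not set!
[ 33.178834] ntfs3: sde2: volume is dirty and "force" flag is not set!
"""

print(dirty_ntfs_partitions(sample))  # ['sda5', 'sdc2', 'sde2']
```

    On the box itself, the same function could be fed the saved output of dmesg instead of the sample string.
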
  • Quote: "The hdd tower has its own ~7A external supply. It powers both the hdds and the SATA-USB bridge. It does not make sense to add a USB hub, and the manufacturer explicitly recommends not using a hub. Why add more USB in the mix? HDDs do not draw power over USB, they are externally powered."

    Yes, I noticed this :D Your hdd tower might simply be broken, or it might not fully cooperate with the RPi by design, or maybe your energy supplier has random nightly brownouts. So I think a few small experiments may reveal the root cause of your problem better than reading the hdd tower manual aloud: try an additional powered USB hub, maybe invest in a small power monitoring tool like this, see if upgrading to LE 12 changes anything... In my experience, sudden hdd "unmounts" typically point to power-related issues.

  • I have one of those USB pass thru sticks to measure current draw. The hdd tower draws 0A from the rpi USB port, because it is externally powered. Everything is on UPS.

    Looking into the specific hdd tower, some people mentioned in reviews that it would put drives to sleep, and sometimes only some of the drives, not necessarily the inactive ones. My guess is some Chinese-special firmware that is randomly sleeping some of the drives, which may cause an ungraceful unmount and remount attempt, which only Windows can resolve ...because NTFS. The hdd enclosure manufacturer has a firmware utility to modify the sleep function. I have just used it to disable sleep. We'll see if that does it. Since the unmount is an intermittent issue, it could be some time before I can say for sure.

    The NTFS issue I had no idea about until now. It had never been an issue for me in the past with several (the exact same) NTFS disks on older SATA-USB docking stations and several previous versions of LE (Leia, Matrix). Hopefully it's not the problem here since I'd have to buy a large new hdd to transfer/preserve data and reformat all disks. But I suppose that will be the next thing I try if the problem persists.

  • My 6TB Seagate HDD is set to power down (sleep) after (I think) 20 minutes of inactivity. I've had that on RPi3, RPi4 and currently a Dell Optiplex, all running LibreELEC. Same drive. Initially formatted as NTFS, switched to exFAT (because of the sort of problem you're having) and now back to NTFS (long story to do with timestamps).

    I have a feeling (I'm a Linux/LE newbie, so no more than a feeling) that LE is somehow not handling the dirty bit correctly, and then when it has to remount the drive it fails. Hooking the drive up to Windows and running chkdsk (either manually or as part of the connection) clears that, and so things work again. I say this because I've had the problem when nothing had been written to the disk for weeks, just reads to play videos.

    LE will check the ext4 partition on the boot media at boot but that's it, or that's my interpretation of what I've been told.

  • Well, the issue repeated. So the sleep setting on the hdd tower was not the issue.

    There's one other complication to my story. After upgrading (clean fresh install) from LE10 to LE11 - all using the same hardware - that is when this issue first appeared. IIRC I had the same hdd tower on LE10 and did NOT have this unmount problem. I also think I was using docker and transmission on LE10, without issues.

    But on the current build, LE11, docker and transmission often give me problems: I have to stop/restart the service and sometimes even remove and reinstall transmission. The symptom I get is that, when adding a new magnet link to transmission, the free space on the target drive shows up as 1.99 GB when the drive actually has some 100 GB free. That is how I know something has gone awry. This is sometimes coupled with the disk unmount problem. Typically the unmounted disk is the target download disk, plus sometimes additional disks. But it is hard to say if that itself is a symptom of a deeper problem or the cause. I've uninstalled transmission and I'm going to run without it for some time to see if the problem keeps happening without any docker services running.
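
    One possible explanation for the 1.99 GB figure (a guess, not something confirmed in this thread) is that the target path is no longer the drive's mount point, so the free space being reported belongs to whatever filesystem sits underneath it. A minimal Python sketch to check that is below; the function name and the /var/media/DOWNLOADS path are placeholders, not the actual setup.

```python
import os
import shutil

def describe_target(path):
    """Report whether `path` exists, is a mount point, and its reported free space."""
    if not os.path.exists(path):
        return {"path": path, "exists": False, "is_mount": False, "free_gb": None}
    usage = shutil.disk_usage(path)
    return {
        "path": path,
        "exists": True,
        # False here means the directory lives on some other filesystem,
        # i.e. the drive is not mounted at this path right now.
        "is_mount": os.path.ismount(path),
        "free_gb": round(usage.free / 1e9, 2),
    }

# Placeholder download directory (assumption); replace with the real one.
print(describe_target("/var/media/DOWNLOADS"))
```

    If "is_mount" comes back False while "free_gb" is around 2, the download client is most likely looking at a different filesystem than the intended drive.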

    Edited once, last by mklod (December 21, 2023 at 11:20 PM).

  • I'll make you a bet.

    Buy a different media box from some other company. Use the second media box the same way you were and see if the same issue arises. I bet each and every SBC you try has the same problem.

    My experience with 7 external HDs on 2 different Ankor hubs (and other brands, and 3 different SBCs) is that all will be well when you first set the new media box up. All drives will show up and scan in initially, and some time after that the problem of drives disappearing will arise again, regardless of brand of SBC or hub. The file system stuff is related to the box locking up and being shut down in an unfriendly way after one of the drives goes senile.

    Been there, done that, then I bought a NAS. The externals back up the NAS, (for now).

  • By 'media box', do you mean the hdd tower?

    My only SBC in this scenario is rpi4. The hdd tower is a simple ATAPI bridge that turns 5x internal SATA hdds into a single USB device connected to the rpi4 by a single USB cable.

    I _could_ turn this hdd tower into a NAS, but if the hdd tower is the problem, that presumably wouldn't fix the issue. You bought a multi-bay NAS that doesn't have this problem? What is the model so I can look it up and check its hardware specs?

    It feels like overkill for my needs, but a true NAS would give me greater flexibility... On the other hand, this feels like a situation where a simple one-liner script to keep all 5 hdds awake might be the next step in identifying the true cause of the problem.
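
    A rough idea of what such a keep-awake script could look like, as a Python sketch rather than a one-liner. It writes a small timestamp file to each drive and forces it to disk with fsync, on the assumption that a real write is enough to stop the enclosure from sleeping that drive. The function name, the file name, and the mount paths in the comment are all placeholders.

```python
import os
import time

def touch_keepalive(mount_points, filename=".keepalive"):
    """Write a timestamp file on each mount point and fsync it.

    Returns the mount points where the write failed, which can also
    serve as a crude log of when a drive dropped out.
    """
    failed = []
    for mount_point in mount_points:
        try:
            with open(os.path.join(mount_point, filename), "w") as f:
                f.write(str(time.time()))
                f.flush()
                os.fsync(f.fileno())  # push the write past the page cache
        except OSError:
            failed.append(mount_point)
    return failed

# Example loop (placeholder paths, left commented out so nothing runs forever):
# while True:
#     print(time.ctime(), touch_keepalive(["/var/media/DISK1", "/var/media/DISK2"]))
#     time.sleep(300)
```

    One caveat: if a drive has dropped but its mount directory still exists, the write lands on the underlying filesystem and is not reported as a failure, so this is only a first-pass diagnostic.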

  • I've used Synology NAS boxes for years. I started out with the cheaper consumer ones but switched to Intel CPU models to make future updates from e.g. 4-bay to 6-bay easier. They aren't the cheapest, but I appreciate something that has regular updates and the few times I've interacted with their support and engineering people they were good. I could easily build something myself for less, but I greatly value being able to just hit the update button and benefit from someone else's maintenance effort instead of it all being my own responsibility.

  • I spent a full day looking into proper NAS solutions. Synology looks like the way to go for an off the shelf ($700) solution, but I've decided to make my own for ~$300 with TrueNAS. Will update the thread with results, but I expect a proper NAS will fix the unmount issue.