Manage USB autosuspend of hard drives

  • My LE 8.2 running on an Odroid C2 keeps hanging every now and then. I believe the root cause is the USB hard drive going to sleep, then failing to wake up in a timely manner. A reboot usually fixes the issue, but of late this has started happening almost daily.

    When I SSH in and try to access a folder on the hard drive via the 'ls' command or any other way, that SSH session hangs too. Even Ctrl+C does not help. I have to disconnect the session and reconnect. My SSH session works as long as I stay away from accessing anything on the HDD.

    Here are the log entries from "journalctl -r" that make me believe the problem is the hard drive:

    Oct 12 11:50:58 libreElec kernel: EXT4-fs warning (device sda1): __ext4_read_dirblock:901: error reading directory block (ino 9306186, block 0)

    Oct 12 11:50:58 libreElec kernel: end_request: I/O error, dev sda, sector 297863464

    Oct 12 11:50:58 libreElec kernel: cdb[0]=0x88: 88 00 00 00 00 00 11 c1 09 28 00 00 00 08 00 00

    Oct 12 11:50:58 libreElec kernel: sd 0:0:0:0: [sda] CDB:

    Oct 12 11:50:58 libreElec kernel: ASC=0x44 <<vendor>> ASCQ=0x81

    Oct 12 11:50:58 libreElec kernel: sd 0:0:0:0: [sda]

    Oct 12 11:50:58 libreElec kernel: Sense Key : 0x4 [current]

    Oct 12 11:50:58 libreElec kernel: sd 0:0:0:0: [sda]

    Oct 12 11:50:58 libreElec kernel: Result: hostbyte=0x00 driverbyte=0x08

    Oct 12 11:50:58 libreElec kernel: sd 0:0:0:0: [sda]

    Oct 12 11:50:58 libreElec kernel: sd 0:0:0:0: [sda] Unhandled sense code

    Oct 12 11:50:58 libreElec kernel: sd 0:0:0:0: timing out command, waited 180s


    Two questions:
    - How do I disable USB auto-suspend for hard drives and make it persist after reboots?

    - Is there any way to recover from the errors shown in the logs without a reboot, such as restarting a service or remounting the hard drive?

    Ideally, I would want the hard drives to spin down only at night. But that assumes they will come back up day after day without these errors.

    Please advise. Thanks in advance.
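
    On the first question, a minimal sketch rather than a confirmed fix: on Linux, each USB device exposes a power/control file under /sys/bus/usb/devices, and writing "on" to it keeps the device out of runtime autosuspend ("auto" permits it). The helper name below is hypothetical. Note that this controls the USB link only, not the drive's own spin-down timer. To persist across reboots it would have to run at boot; on LibreELEC that is usually done from /storage/.config/autostart.sh, which is an assumption to verify on your build.

```python
import glob
import os


def disable_usb_autosuspend(sysfs_root="/sys/bus/usb/devices"):
    """Write "on" to every <device>/power/control file under sysfs_root.

    "on" asks the kernel to keep the device powered; "auto" allows
    runtime suspend. Returns the list of control files that were updated.
    """
    updated = []
    pattern = os.path.join(sysfs_root, "*", "power", "control")
    for control in sorted(glob.glob(pattern)):
        try:
            with open(control, "w") as handle:
                handle.write("on")
            updated.append(control)
        except OSError:
            # Entries can be read-only or disappear if a device is unplugged.
            continue
    return updated
```

    Calling disable_usb_autosuspend() as root and printing the returned paths shows which devices were changed. The sysfs_root parameter exists only so the function can be tried against a scratch directory first.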

  • Looks like your HDD has errors. This could be for many reasons. My first thought is whether it is getting enough power - try using a powered hub and see if this improves things.

    Otherwise, you should check your HDD on another machine or use fsck to check the drive - google will help you on how to do this.

    As with everything - BACKUP - before doing anything that is potentially destructive.

  • Thanks Iridium

    I'll try fsck and clean out any errors. I do occasionally power cycle the LE device which could have caused errors on HDD. I'll report back on how this goes.

    Other thoughts:

    The power supply I'm using has more than enough power for sure; it's a 5V 3A adapter. The Odroid came with a 5V 2A one.

    Any idea whether LE generates any logs if the hard drive is not getting enough power?

    Do I not need to worry about auto sleep of the HDD on LE? I do want to turn off auto sleep of HDDs in any case.
    Thanks!

  • The log seems to suggest an error on the HDD.

    NEVER assume enough power. PSU manufacturers are notorious for overstating their specifications.

    Provide the output of "dmesg|paste" and "journalctl|paste".

    I think the "sleep" is a red herring in this instance. Most modern HDDs are VERY intelligent (and LE knows this), so that should be discounted.


  • fsck came back clean:

    Code
    fsck from util-linux 2.29
    e2fsck 1.43.4 (31-Jan-2017)
    /dev/sda1: clean, 10816/244187136 files, 218674802/976745728 blocks

    I noticed that the USB micro cable was not fully inserted (about 5% out). I pushed it into the HDD, and after that I was able to access the directories on the HDD. So that may have been the problem. I'll know for sure if this doesn't happen again in the next 3-4 days.

    One more question: is there a way for me to install SMART monitoring on my LE installation?

    I have one USB HDD connected to this LE device permanently and I might add another one soon. I'll make sure there is enough power for the device under full load and for the two HDDs.

    I'd like to put some sort of monitoring in place to keep an eye on the HDDs and any errors that may come up from time to time.

    Please advise on what the right approach to accomplish this would be. Thanks in advance.
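
    One possible starting point for such monitoring, offered as a sketch and not an LE-provided tool: count the kernel messages that showed up in the journal excerpt from the first post. The pattern names and the function name are made up for illustration, and the patterns only cover the three message types seen in this thread.

```python
import re
from collections import Counter

# Patterns modelled on the kernel lines quoted in the first post.
ERROR_PATTERNS = {
    "io_error": re.compile(r"I/O error, dev \w+"),
    "ext4_warning": re.compile(r"EXT4-fs (?:warning|error) \(device \w+\)"),
    "timeout": re.compile(r"timing out command"),
}


def count_disk_errors(lines):
    """Return a Counter of how many lines match each error pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in ERROR_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

    The input can be any iterable of log lines, for example the output of "journalctl -k --no-pager" split into lines. Running it periodically and alerting when any count is non-zero would flag the kind of failure described above.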

  • output of dmesg and journalctl as requested: (The device was rebooted just a few minutes ago to unmount sda1 and run fsck on it)

    Code
    libreElec:~ # dmesg|paste
    http://ix.io/1p0N
    
    libreElec:~ # journalctl|paste
    http://ix.io/1p0O

    I had tested the power capability of the PSU by running a script that would max out the CPU while reading and writing on both hard drives continuously for a couple of hours.

    The PSU became hot but I don't think it missed any read/write operation. This was a while ago and I didn't write down all my observations, so I'm not 100% sure. I'll probably do it again before I add the second HDD.

    As for what is happening these days: one HDD, almost no activity on the CPU. I try to play a video in Kodi and hit play, but it hangs with the message "working". Journalctl had the errors I posted in the first message.
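
    The original test script was not posted, so the following is only a hypothetical sketch of one read/write check such a test could repeat in a loop: write random data, force it to the device, read it back, and compare checksums. The function name and default sizes are assumptions.

```python
import hashlib
import os


def write_and_verify(path, size_bytes=1024 * 1024, chunk=64 * 1024):
    """Write random data to path, read it back, and compare SHA-256 digests.

    Returns True when the data read back matches what was written.
    """
    written = hashlib.sha256()
    remaining = size_bytes
    with open(path, "wb") as handle:
        while remaining > 0:
            block = os.urandom(min(chunk, remaining))
            handle.write(block)
            written.update(block)
            remaining -= len(block)
        handle.flush()
        os.fsync(handle.fileno())  # push the data out of the page cache

    read_back = hashlib.sha256()
    with open(path, "rb") as handle:
        # The read may still be served from cache, so use files larger
        # than RAM if the goal is to exercise the drive itself.
        for block in iter(lambda: handle.read(chunk), b""):
            read_back.update(block)
    return written.hexdigest() == read_back.hexdigest()
```

    Pointing path at a file on the mounted drive and calling this repeatedly while the CPU is loaded would record any mismatch or I/O exception instead of relying on memory of how the run went.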

    Edited once, last by ajiratech: added note that device was recently rebooted, so logs are for a short uptime. (October 12, 2018 at 8:04 PM).