Posts by frakkin64

    NB: I have the impression that Armbian support for Amlogic isn't the best due to everyone on staff deliberately trying to look the other way and avoid the noise from unsupported "TV Box" users and board vendors not funding support. The net result seems to be a mixed bag of patches from various places and a generic defconfig. I respond to Q's when asked but I don't follow their development.

    Yeah, I am not super impressed with how the patches are managed. It's a hodge-podge of diffs & am/format-patch styles, which of course am hates, so it's a PITA to pull it into Git and rebase the patches.

    So this same issue affects LE booted SDs, which means it's running an LE kernel & LE DTB. I am mostly using Armbian kernels as a host because it is working & installed on the eMMC. So it seems like whatever the issue is, it is related to the latest DT. So do you think there is a missing patch with LE as well?

    Edit: I actually rebuilt 5.16.9 with only patches for bootsplash/fb & patches from LE, same result. I think I also tested 5.16.9 mainline with just bootsplash/fb patches, and had the same result.

    Got to the bottom of it, it is actually a DT issue.

    5.16.9 works fine with this DT:

    build/arm64-dts-amlogic-add-support-for-Radxa-Zero.patch at master · armbian/build (github.com)

    Same kernel (literally the same one, just swapped DTB) with this DT causes SD corruption and HW resets on SD (this is the one that should be in stable 5.16.y, I believe, because Armbian kicks it out):

    build/arm64-dts-amlogic-add-support-for-Radxa-Zero-0001.patch at master · armbian/build (github.com)

    Not quite sure why yet. Didn't see anything directly touching &sd_emmc_b, but didn't spend a lot of time looking at the full DT. Also repeated the test with 5.13 (which oddly has Wifi issues) and the 2 DTBs and found the same result.

    Re-loaded the vendor U-Boot (Android bootloader), popped in CE on an SD card and, well, everything is totally fine. So it seems like it may be a mainline issue.

    Tested more kernels with Armbian:

    5.16.9 bad

    5.15.18 bad

    5.15.8 bad

    Tested the Radxa image for Ubuntu Focal (with vendor kernel):

    5.10.69-10-amlogic-g617a45dd0fce ok

    So it appears to be a mainline issue, here is where I am at -- working on bisecting to find the offending commit:

    5.10.100 ok

    5.11 ok

    5.12 ok
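    For anyone following along, the bisect itself is just a binary search between a known-good and a known-bad build. A minimal sketch in Python, assuming the candidates are ordered oldest to newest and that everything after the first bad one is also bad (`first_bad` and `is_bad` are names made up for illustration; in practice `git bisect` does this over commits and `is_bad` would be "boot it and run the SD write test"):

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad entry in an ordered list of builds.

    Assumes versions[0] is known good, versions[-1] is known bad, and that
    once a build is bad every later build is bad as well.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return versions[hi]
```

    Each round halves the range, so even a few thousand commits between two releases only need a dozen or so test boots.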

    I'll update in the next few days to see if I can isolate it. The test is pretty simple: write 1GB to the SD card with the offending kernel using "dd if=/dev/zero of=/mnt/data bs=8192 count=131072", and you should get failures from dd and some of the stuff above in dmesg.
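    If you would rather script the write test than run dd by hand, here is a rough Python equivalent. It is only a sketch: `write_zeros` is a made-up helper, and it assumes the card is already mounted so that the target path lands on it. 8192 * 131072 bytes is exactly 1 GiB, same as the dd invocation.

```python
import os


def write_zeros(path: str, block_size: int = 8192, count: int = 131072) -> int:
    """Write `count` blocks of zeros to `path`, then fsync.

    Roughly mirrors: dd if=/dev/zero of=path bs=block_size count=count
    Returns the number of bytes written. A failing card should surface
    as an OSError from write() or fsync().
    """
    block = bytes(block_size)  # block_size zero bytes
    written = 0
    with open(path, "wb") as f:
        for _ in range(count):
            written += f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push the data out of the page cache to the card
    return written
```

    Calling `write_zeros("/mnt/data")` with the defaults matches the dd command. The fsync matters because otherwise the writes can sit in the page cache and the errors only show up later in dmesg.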

    None of the Amlogic configs have it set by default but it shouldn't be an issue for LE images due to https://github.com/LibreELEC/Libr…otloader/config which is applied at build time.

    Yeah, I haven't tested the U-Boot built with LE. I was using the box image on SD, booting from Armbian U-Boot. The other two (LZ4, LZMA) already appear to be defaults when run through U-Boot's Kconfig, but LZO defaults to no/off. I doubt it's a U-Boot issue, because I would expect that once the kernel is up, U-Boot shouldn't play a role anymore.

    I think Radxa might have a build with legacy kernels, probably worth testing with that. If that doesn't work, then I am going to assume it is an SD issue.

    I've moved this into a separate thread so it's easier to track the discussion. Can you share diff patches or clearer instructions on what you've enabled or disabled for LZO?

    This is what was done for LZO on U-Boot, just enabling it on the defconfig. Armbian is just using straight mainline with minimal patches, so that issue is in upstream (It's hit or miss, some boards have it enabled and some do not in upstream).

    Diff
    diff --git a/configs/radxa-zero_defconfig b/configs/radxa-zero_defconfig
    index a9afb64ae06..bd3027a9c33 100644
    --- a/configs/radxa-zero_defconfig
    +++ b/configs/radxa-zero_defconfig
    @@ -63,3 +63,4 @@ CONFIG_VIDEO_DT_SIMPLEFB=y
     CONFIG_SPLASH_SCREEN=y
     CONFIG_SPLASH_SCREEN_ALIGN=y
     CONFIG_OF_LIBFDT_OVERLAY=y
    +CONFIG_LZO=y

    With that patch applied, it will decompress the kernel, and things look fine. As I mentioned both Armbian and LE have the same issue with the SD card, so I am leaning towards hardware/defective unit/marginal design. I guess the only way to rule that out would be to try vendor U-Boot and a vendor kernel.
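    To check quickly whether a given board's defconfig has the option, something like the following works. It is a minimal sketch (`enabled_options` is a made-up helper) and it only looks at what the defconfig sets explicitly. Options that Kconfig already defaults to on will not be listed there, so it is only meaningful for options that default to off, like LZO on this board.

```python
def enabled_options(defconfig_text: str) -> set:
    """Return the CONFIG_* symbols explicitly set to 'y' in a defconfig."""
    enabled = set()
    for line in defconfig_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set"
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return enabled
```

    For example, `"CONFIG_LZO" in enabled_options(open("configs/radxa-zero_defconfig").read())` should be False on an unpatched tree and True after applying the diff above.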

    Running mainline U-Boot 2022.01 from Armbian on a Radxa Zero, the box image fails to boot because LZO decompression is not enabled in mainline on this board. I enabled LZO, rebuilt U-Boot, and now it looks like LibreELEC boots from SD, goes to resize the filesystem and ends up with file system write errors & corruption.

    Tried 3 different SD cards, from 2 different brands. So I guess either the SD slot is highly marginal, or perhaps an issue in LE's DT? I'll have to play with it some more from Armbian and see if that also causes filesystem corruption. Armbian is loaded into the eMMC and is running 5.16.9 kernel from mainline.

    Testing from Armbian to SD cards:

    kernel: [ 269.339082] mmc0: error -84 writing Cache Enable bit
    kernel: [ 269.339107] mmc0: tried to HW reset card, got error -84
    kernel: [ 269.378798] I/O error, dev mmcblk0, sector 2080 op 0x1:(WRITE) flags 0x100000 phys_seg 87 prio class 0
    kernel: [ 269.378827] Buffer I/O error on dev mmcblk0p1, logical block 32, lost async page write
    kernel: [ 269.378841] Buffer I/O error on dev mmcblk0p1, logical block 33, lost async page write
    kernel: [ 269.378846] Buffer I/O error on dev mmcblk0p1, logical block 34, lost async page write
    kernel: [ 269.378852] Buffer I/O error on dev mmcblk0p1, logical block 35, lost async page write
    kernel: [ 269.378857] Buffer I/O error on dev mmcblk0p1, logical block 36, lost async page write
    kernel: [ 269.378862] Buffer I/O error on dev mmcblk0p1, logical block 37, lost async page write
    kernel: [ 269.378867] Buffer I/O error on dev mmcblk0p1, logical block 38, lost async page write
    kernel: [ 269.378872] Buffer I/O error on dev mmcblk0p1, logical block 39, lost async page write
    kernel: [ 269.378890] Buffer I/O error on dev mmcblk0p1, logical block 40, lost async page write
    kernel: [ 269.378896] Buffer I/O error on dev mmcblk0p1, logical block 41, lost async page write
    kernel: [ 269.380164] I/O error, dev mmcblk0, sector 17216 op 0x1:(WRITE) flags 0x100000 phys_seg 87 prio class 0
    kernel: [ 269.405032] FAT-fs (mmcblk0p1): error, fat_get_cluster: invalid cluster chain (i_pos 484866)
    kernel: [ 269.405056] FAT-fs (mmcblk0p1): Filesystem has been set read-only
    kernel: [ 269.405065] FAT-fs (mmcblk0p1): error, fat_free_clusters: deleting FAT entry beyond EOF
    kernel: [ 269.625336] FAT-fs (mmcblk0p1): error, fat_get_cluster: invalid cluster chain (i_pos 484866)
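    When comparing kernels it helps to boil the log down to counts. Here is a small sketch that tallies the mmc host errors and the failed sectors from dmesg/journal text in the format above (`summarize` and the two regexes are my own, written against these particular messages, so other kernels may word things differently). As far as I know, -84 is -EILSEQ on Linux, which the MMC core uses for CRC/bad-data type failures on the bus.

```python
import re
from collections import Counter

# e.g. "mmc0: error -84 writing Cache Enable bit"
#      "mmc0: tried to HW reset card, got error -84"
MMC_ERROR = re.compile(r"\b(mmc\d+): .*?error (-\d+)")
# e.g. "I/O error, dev mmcblk0, sector 2080 op 0x1:(WRITE) ..."
IO_ERROR = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def summarize(log_text):
    """Return (Counter of (host, errno) pairs, list of (device, sector))."""
    mmc_errors = Counter()
    failed_sectors = []
    for line in log_text.splitlines():
        m = MMC_ERROR.search(line)
        if m:
            mmc_errors[(m.group(1), int(m.group(2)))] += 1
        m = IO_ERROR.search(line)
        if m:
            failed_sectors.append((m.group(1), int(m.group(2))))
    return mmc_errors, failed_sectors
```

    The "Buffer I/O error" and FAT-fs lines are deliberately not counted, since they are fallout from the failed block writes rather than separate failures.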


    So it looks like either both are broken, the hardware is broken, or it's the kernel, since they are both on 5.16. There are also similar initialization errors (false starts) during detection, which show up 2x in dmesg:

    [ 2.465881] mmc0: error -84 reading general info of SD ext reg
    [ 2.465914] mmc0: error -84 whilst initialising SD card

    Then eventually detected:

    [ 4.643469] mmc0: new high speed SDHC card at address 59b4
    [ 4.649821] mmcblk0: mmc0:59b4 LX32G 29.5 GiB
    [ 4.651880] mmcblk0: p1

    I think if you read the terms on some of these sites you would see that what you're doing is not strictly legal. For example, tvtv.us says:

    Quote

    Permission is granted to view the materials (information or software) on LocalTvTv Inc.'s website for personal, non-commercial transitory viewing only.

    This says the information can only be viewed on their website. It doesn't say you can scrape it, export it, or load it into another system and drive recording schedules on it. The word "transitory" means not permanent, and making use of it as an EPG makes it permanent.

    By the way, it seems to me that tvtv.us probably earns money through ad revenue on the website, which is why their terms of use force use of the data through their website.

    Code is obfuscated and seems sketchy, not to mention that the obfuscation seems to create massive bloat with nonsense gotos everywhere. At least in the US, most legitimate sources for EPG data are paid subscriptions.

    At least with Schedules Direct there is a support channel for line-up issues, which does come up from time to time. The subchannels do change networks from time to time.

    TomTom, I would suggest running the HDR test build. You can get it in the LE11 nightlies, or this LE10 build:

    HiassofT
    December 18, 2021 at 9:25 PM

    And report back, it looks like it is an HDR 4K file.

    But the current kits I'm building for family and friends need to be compact and self sustaining.

    Ahh, I get it. Catering to a more novice user base that is familiar with Windows. I don't envy the tech support you will have to endure.

    I may be jumping the gun, but... I would already have had a CIFS mount dropout by now... It appears to be working! Woohoo!

    Good to hear. Sounds like it was a combination of a few things: a firmware bump to perhaps address the rev 1.5 board firmware timeouts/splats, and the Samba bump/mount option changes. Care to summarize what you think solved it? It may help others that come across this problem.

    I thought that the issue was with the client, like you, but I ruled it out by using my home NAS as the server. All went well.


    I confirmed that it was a LE11 server issue when I fresh installed kodiplayer with Raspberry Pi OS. Same server hardware, different server software, same client hardware and software. Same network in all cases.

    So you ran Samba on Raspberry Pi OS and did the same testing and everything was fine? Interesting. I kind of got lost in all your updates. :)

    Well, the other thing to point out is there should be a Samba bump to 4.13.17 in tomorrow's nightly along with HiassofT's updates, so you can maybe look out for that as well. Personally, I don't use Samba at all with Kodi/LE, just NFS. Samba has a lot of protocol overhead that NFS doesn't have. Scanning videos on a large library takes a long time over the network with Samba. I also only use the Pi to play media, that's all, no docker crap, no samba server crap (it's disabled), not even cron.

    Beyond that, perhaps you need to fire up Wireshark and do some packet capturing to see what is happening at a protocol level. I assume the only problem left is this Samba issue? I am assuming the crash on kodiplayer is because the connection was reset from the peer.

    But the remote CIFS client still drops. Oh well...

    CIFS issue probably has nothing to do with firmware. I assume you are using Samba on "kodiplayer" and it's crashing from what is noted above.

    This to me suggests the client "grabber"(?) is the one with the issues:

    Code
    smb2_sendfile_send_data: sendfile failed for file Transit/Incoming/The.Book.of.Boba.Fett.S01E06.2160p.WEB-DL.x265.10bit.HDR.DDP5.1.Atmos-BobarBiriyani[rartv]/The.Book.of.Boba.Fett.S01E06.Chapter.6.2160p.WEB-DL.DDP5.1.Atmos.HDR.HEVC-BobarBiriyani.mkv (Connection reset by peer) for client ipv4:10.25.25.1:44272

    Connection reset by peer is initiated by the peer -- they send an RST packet. Seems like a network issue perhaps, because 7 minutes earlier CIFS is "reconnecting" -- what was the cause of that? Maybe on the client see if vers=3.0 works on the mount options?

    start.elf and fixup.dat, right?

    Three files, I believe they are renamed from the Raspberry Pi OS distro.

    LibreELEC.tv/package.mk at 00c51c726c15954625629788105840b0816f2688 · LibreELEC/LibreELEC.tv (github.com)

    These would be renamed and go into the /flash partition. I have never done this myself.

    Have you tested it with Raspberry Pi OS in a minimal USB configuration (no external hub), and then started adding devices back in? You could also report the issue to https://github.com/raspberrypi/linux/ (they prefer testing on Raspberry Pi OS, but have been known to look into other OS issues, it really depends). Once you have Raspberry Pi OS up, you will probably want to use rpi-update to test the next branch (5.15). The external hub & hard drive are questionable: they may be over-drawing current from the USB ports (design maximum is 1.2 amps from all USB ports, from what I read). It could be your power adapter as well.

    I would be surprised if it is a hardware revision issue. At this point the hardware revisions should be fairly minor and well tested. Opening that issue and putting your speculation into it may reveal what the hardware change was and what the engineers think about that theory. It is likely an issue with the 5.15 kernel. If you were getting flip timeouts/DRM-related issues earlier when the firmware timed out, then it wouldn't surprise me, because those seem to cause kernel soft-lockups/deadlocks. It seems like your earlier splat is from VC4 DRM.

    Perhaps the kernel push that HiassofT referred to will help, he pushed the PR yesterday so it may show up in the nightly soon.

    Now it would also be nice if someone explained to me HOW this string acts on the pi4 / OS.

    It has to do with disabling 1.8V signaling for UHS cards, some cards & extenders do not like it. You would potentially have slower SD-card speeds. You can google around the raspberrypi.com forums or github for perhaps a more precise answer from one of their engineers.

    It also helps if you have sporadic boot failures in "mount_common" on the Pi4 with LE.

    If someone is compiling their own images and understands how to track CPU usage, that might be of interest.

    I'm seeing under 6% CPU usage at idle on Kodi. It was around 18% before this patch, and CPU temp also dropped about 2 degrees (51C -> 49C).

    I haven't noticed anything negative with UI rendering with that patch either. My impression from reading the ticket and related PRs is that this tweak affects UI rendering speed? 40ms per frame should be 25fps, so it goes from 100fps to 25fps.
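    The fps figures are just the frame interval inverted. A trivial sketch of the arithmetic (the function name is made up, and treating the patch value as a per-frame interval in milliseconds is my reading of the ticket):

```python
def fps_from_interval_ms(interval_ms: float) -> float:
    """Frames per second for a given per-frame interval in milliseconds."""
    return 1000.0 / interval_ms
```

    So `fps_from_interval_ms(40)` gives 25.0, and the old 100fps figure corresponds to a 10ms interval.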

    BTW, I went back to 5.6, where there were a few issues to sort out, and MPEG2 doesn't appear to work on that build either. 5.6 is pretty rocky, as you would expect :). So my guess is something is missing: there could be a patch not pushed, or a DT component missing, or perhaps it's very specific hardware.

    SW decoding MPEG2 for now is fine for me, so can wait for upstream.