Posts by camelreef

    Right, now using ext4 as the file system on the Samba-shared 5 TB USB 3.0 WD Elements 2.5" external hard drive.

    I'm trying to hammer the system as much as I can using the planned workflow. So far, so good!

    [UPDATE] I've added multiple massive file writes and reads in parallel over the network, on top of the unusually high-load workflow, and everything is holding up like a champ and behaving as expected. I'm getting very realistic, solid and consistent read and write performance numbers.
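    For the curious, the parallel hammering is nothing exotic; a minimal sketch of the sort of load I mean, run from a client over the CIFS mount (the mount point, file names and sizes below are placeholders, not my exact workload):

    Code
    # Sketch only: a few large sequential writes, then reads, in parallel over
    # the CIFS mount (paths and sizes are placeholders, not my exact workload).
    for i in 1 2 3 4; do
        dd if=/dev/zero of=/mnt/content/stress_$i.bin bs=1M count=4096 &
    done
    wait
    for i in 1 2 3 4; do
        dd if=/mnt/content/stress_$i.bin of=/dev/null bs=1M &
    done
    wait
    rm -f /mnt/content/stress_*.bin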

    I'm still holding off on declaring this a successful solution, as the occurrence of the drops was pretty random.

    It looks like the ntfs3 (Paragon) kernel driver gives really bad performance in my context. I'll try to find out when it replaced ntfs-3g (FUSE) as the default driver in LE, as that may explain why the other identical setup worked well for a good while. Ah! December 2021! This correlates!

    Replace ntfs-3g_ntfsprogs with Linux kernel 5.15 ntfs3 by heitbaum · Pull Request #5838 · LibreELEC/LibreELEC.tv
    github.com
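    Side note for anyone checking their own box: which of the two drivers a mount is actually using shows up as the filesystem type, ntfs3 for the Paragon kernel driver, fuseblk for ntfs-3g. A quick sketch (the mount point is just an example):

    Code
    # Sketch: the filesystem type column reads "ntfs3" for the kernel driver
    # and "fuseblk" for the ntfs-3g FUSE driver (mount point is an example).
    grep -E 'ntfs3|fuseblk' /proc/mounts
    mount | grep /var/media/content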

    heitbaum, I don't know how high this will ever rank in the LE crew's priorities, or whether anything can be done about it, but ntfs3 may need attention.

    I'm losing some flexibility by going from NTFS to ext4, but everyone will survive; it's better than losing basic functionality!

    Many thanks to all who have pitched in!

    Quite clear.

    But along the way, data could get silently lost or damaged...

    It's bulky data, but nothing that can't be replaced! That being said, I can't remember the last time cp copied anything other than perfectly.

    It will have taken three days, but I'm nearly done with the partition swap from NTFS to ext4 and the associated data juggling.

    Evaluating how that improves Samba will resume shortly!

    If you need to do it under Windows:

    https://fastcopy.jp/

    Faster copy with verify => unattended copy.

    Thanks! I will take a look in case I need it in the future. However, copying from NTFS to NTFS is only half the story, as I will be using ext4 partitions in the end.

    I'm using a dead-classic cp -rv inside a multiplexed terminal session on a spare Linux box. It will take the time it needs to take, while I live my life normally!
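    If silent corruption during the bulk copy is a worry (the point raised above), a checksum pass afterwards is cheap insurance. A sketch with placeholder paths; rsync -rc would do much the same in one step:

    Code
    # Sketch: copy, then compare checksums of the source and destination trees
    # (paths are placeholders).
    cp -rv /mnt/ntfs_src/content /mnt/ext4_dst/
    ( cd /mnt/ntfs_src/content && find . -type f -exec md5sum {} + | sort -k2 ) > /tmp/src.md5
    ( cd /mnt/ext4_dst/content && find . -type f -exec md5sum {} + | sort -k2 ) > /tmp/dst.md5
    diff /tmp/src.md5 /tmp/dst.md5 && echo "All copies match"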

    NTFS on Linux is really slow; it's not a "native" filesystem (just a FUSE driver) on Linux at the moment, and that's the reason the driver is rather slow.

    So if you have a slow device combined with a slow FS, you get very slow performance :)

    That's why you have a NAS :)

    ntfs3, from Paragon, as currently used in LE11, is not FUSE like ntfs-3g, and is supposed to provide very nice performance (it does on my Mac!) without resorting to user space.

    The system that is performing very poorly is using ntfs3, not the FUSE ntfs-3g.

    I have another identical system (but a different RPi 4 h/w rev.), so also using ntfs3, that is performing well.

    I have provided a reference benchmark on an Ubuntu box running on a NUC7, using ntfs-3g/FUSE, that is performing well, or at least adequately.
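    For an apples-to-apples test on the same box, the driver can also be forced at mount time, assuming both drivers are actually present in the image (they may not both be on LE11; the device and mount point below are placeholders):

    Code
    # Sketch: mount the same partition with each driver in turn and re-run the
    # dd test (device/mount point are placeholders; needs both drivers present).
    mount -t ntfs3 /dev/sda1 /mnt/test       # Paragon kernel driver
    umount /mnt/test
    mount -t ntfs-3g /dev/sda1 /mnt/test     # FUSE driver, via the mount.ntfs-3g helper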

    Both RPis use the exact same external HDD brand/model/size; the NUC uses an older and smaller model from the same brand.

    As for a NAS, I do have one at home (see the signature's link) because I'm a geek :). That little system is for a friend who is very much not a geek, and has no NAS...

    If it's a "WD_BLACK P10 Game Drive" (WDBA3A0050BBK-WESN), it:

    - has CMR, and

    - supports TRIM (=> untrimmed => write performance?); a quick check is sketched below.
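    Whether the drive reports discard/TRIM support through its USB bridge can be checked from sysfs (a sketch; sdX is a placeholder, and lsblk --discard shows the same information where available):

    Code
    # Sketch: non-zero values mean the device reports discard (TRIM) support;
    # sdX is a placeholder for the actual drive.
    cat /sys/block/sdX/queue/discard_max_bytes
    cat /sys/block/sdX/queue/discard_granularity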

    It's a basic and cheap WD Elements spindle. The same in both systems.

    WD Elements Portable USB 3.0 External Hard Drive Storage (1 TB to 5 TB) | Western Digital | Western Digital
    www.westerndigital.com

    sudo hdparm --direct -tT /dev/sdX


    might (I'm unsure) make the "echo 3 > /proc/sys/vm/drop_caches" superfluous?

    In the end, even if the test benefits from some caching, you would expect better numbers than what I am getting on that particular RPi 4 Rev. 1.5.
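    For what it's worth, the dd test itself can be made cache-independent, which sidesteps the drop_caches question entirely: GNU dd (and, I believe, recent busybox builds) accepts direct-I/O flags, so the page cache is bypassed for the test file. A sketch with a placeholder path:

    Code
    # Sketch: direct-I/O variant of the dd test (path is a placeholder).
    # O_DIRECT bypasses the page cache, so no drop_caches dance is needed.
    dd if=/dev/zero of=/var/media/content/testfile bs=1M count=2048 oflag=direct
    dd if=/var/media/content/testfile of=/dev/null bs=1M iflag=direct
    rm /var/media/content/testfile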

    Nico

    You mount an NTFS disk on Linux and use it as a native device? That has basically been a no-no for a long time, because the NTFS FUSE driver is slow and not feature-complete.

    This may change with more recent kernels, where a proper NTFS driver was finally merged.

    Can you try an ext4-formatted disk?

    I do not understand your comment.

    I use an NTFS-formatted disk for practical reasons, as it is meant to be plugged regularly into Windows machines.

    Also, LE11 uses ntfs3, which is, I believe, the proper NTFS driver you mention.

    Finally, my reference system uses the old FUSE ntfs-3g driver and gets acceptable performance. Another system uses the same ntfs3 driver and also gets acceptable performance.

    I will try an ext4-formatted disk, but it will just be a data point; I need NTFS for my purpose (I also believe that NTFS disk usage is to be expected around LE).

    I will also try another spare NTFS-formatted disk I have and compare performance, as another data point.
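    For the ext4 data point, the reformat itself is trivial on any full Linux box; a sketch with placeholder device and label (this obviously wipes the partition, so only once the data has been juggled off it):

    Code
    # Sketch: reformat the partition as ext4 (device/label are placeholders;
    # this destroys whatever is currently on the partition).
    umount /dev/sda1
    mkfs.ext4 -L Content /dev/sda1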

    Oooooh.... Interesting turn!

    The Samba guys noticed that the disk subsystem was very slow and offered options to set.

    Right... It's fine adapting smbd to a slow disk... But is it really slow?

    So, let's test the disk subsystem!
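    First the raw read speed of the device itself, independent of any filesystem; something like the hdparm test mentioned elsewhere in the thread (a sketch; sdX is a placeholder, and --direct keeps the page cache out of the measurement):

    Code
    # Sketch: raw sequential read test of the device, bypassing the filesystem
    # (sdX is a placeholder; --direct uses O_DIRECT, bypassing the page cache).
    hdparm --direct -tT /dev/sdX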

    LE11 RPi 4 system

    That is indeed quite poor! This is a new 5TB WD 2.5" USB 3 external HDD, one NTFS partition mounted using ntfs3, connected to one of the USB 3 sockets of the Pi 4.

    Just to gather some perspective, I've done a perf test on another Linux box of mine, with a much older 1 TB WD 2.5" USB 3 external HDD, one NTFS partition mounted using ntfs-3g, so quite similar:

    Reference system

    So, that's read speed.

    Let's take a look at writing and reading on the actual file system:

    LE11 RPi 4 system

    Code
    kodiplayer:/var/media/content # sync && echo 3 > /proc/sys/vm/drop_caches && dd if=/dev/zero of=testfile bs=128k count=16k && ls -lah testfile && dd if=testfile of=/dev/null bs=128k && rm testfile
    16384+0 records in
    16384+0 records out
    2147483648 bytes (2.0GB) copied, 186.714454 seconds, 11.0MB/s
    -rw-r--r--    1 root     root        2.0G Feb 24 10:24 testfile
    16384+0 records in
    16384+0 records out
    2147483648 bytes (2.0GB) copied, 274.038229 seconds, 7.5MB/s

    Reference system

    Code
    root@rastaman:/mnt/Blue-1TB# sync && echo 3 > /proc/sys/vm/drop_caches && dd if=/dev/zero of=testfile bs=128k count=16k && ls -lah testfile && dd if=testfile of=/dev/null bs=128k && rm testfile
    16384+0 records in
    16384+0 records out
    2147483648 bytes (2.1 GB, 2.0 GiB) copied, 27.9964 s, 76.7 MB/s
    -rwxrwxrwx 1 nico root 2.0G Feb 24 10:21 testfile
    16384+0 records in
    16384+0 records out
    2147483648 bytes (2.1 GB, 2.0 GiB) copied, 42.7862 s, 50.2 MB/s

    Well... That's not glorious for the LE11 RPi 4!

    Remember that this is a H/W rev. 1.5 RPi 4 (running nightly-20220219-ae4b7da). I have a carbon-copy system in production elsewhere, but with H/W rev. 1.2 (running LE11 nightly-20220120-f7f2fd5; I'm a bit scared of upgrading it, it works):

    Code
    LibreELEC:/var/media/Media # sync && echo 3 > /proc/sys/vm/drop_caches && dd if=/dev/zero of=testfile bs=128k count=16k && ls -lah testfile && dd if=testfile of=/dev/null bs=128k && rm testfile
    16384+0 records in
    16384+0 records out
    2147483648 bytes (2.0GB) copied, 17.060671 seconds, 120.0MB/s
    -rw-r--r--    1 root     root        2.0G Feb 24 10:36 testfile
    16384+0 records in
    16384+0 records out
    2147483648 bytes (2.0GB) copied, 0.748839 seconds, 2.7GB/s

    A striking difference!

    I'm probably going to upgrade the production system to the same LE11 nightly build as the problematic system. I'm not too thrilled about it, but it will help establish whether it's a software issue...

    So, maybe not a Samba issue, but a disk I/O performance issue...

    If their bug system is broken, just try the mailing list:

    https://www.samba.org/samba/bugreports.html

    It's sadly quite normal that Linux stuff has no user-friendly way to report bugs etc.

    Sometimes mailing lists are the only way. Welcome to 1990 :)

    See, your negativity shamed them enough to finally send me an email and allow me to open an account! ^^

    14988 – smbd pauses activity for 10s of seconds before resuming, silently

    Let's see what happens now

    Ah!

    On the client:

    Code
    Feb 14 21:46:10 grabber kernel: CIFS: Attempting to mount //10.25.25.2/content
    Feb 14 21:55:20 grabber kernel: CIFS: VFS: \\10.25.25.2 sends on sock 00000000d18678b6 stuck for 15 seconds
    Feb 14 21:55:20 grabber kernel: CIFS: VFS: \\10.25.25.2 Error -11 sending data on socket to server
    Feb 14 21:58:32 grabber kernel: CIFS: VFS: \\10.25.25.2 has not responded in 180 seconds. Reconnecting...
    Feb 14 21:58:58 grabber kernel: CIFS: VFS: \\10.25.25.2 sends on sock 000000009546efff stuck for 15 seconds
    Feb 14 21:58:58 grabber kernel: CIFS: VFS: \\10.25.25.2 Error -11 sending data on socket to server
    Feb 14 21:59:27 grabber kernel: CIFS: VFS: \\10.25.25.2 sends on sock 0000000022e8f049 stuck for 15 seconds
    Feb 14 21:59:27 grabber kernel: CIFS: VFS: \\10.25.25.2 Error -11 sending data on socket to server

    On the server:

    https://youplala.net/~nico/log.10.25.25.1

    The two machines agree on time (NTP).

    Again, it's as if smbd is taking a break without noticing...

    Such bets are illegal (correct wording?), because only *you* know how big your disk is! ;)

    Code
    /dev/mmcblk0p2 13.5G 2.1G 11.4G 16% /storage
    kodiplayer:~ # ls -lah log.10.25.25.1
    -rw-r--r-- 1 root root 1.9G Feb 14 16:23 log.10.25.25.1

    No drops... I've tried rebooting both machines a few times... Grrrrr... Sad, considering that it was dropping constantly while I was setting up the client-specific logging!

    [MANY HOURS LATER] At this rate, I'm going to believe that logging fixes the problem! Still no drop, at all... Solution: client-specific log at level 10, redirected to /dev/null! Only joking...

    UK morning all!

    I have a bit of time to dedicate to this today.

    First, I don't believe that I've ever given the client's mount parameters:

    Code
    //10.25.25.2/content on /mnt/content type cifs (rw,relatime,vers=3.1.1,cache=strict,username=guest,uid=1000,forceuid,gid=1000,forcegid,addr=10.25.25.2,file_mode=0770,dir_mode=0770,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1,x-systemd.automount)

    Funny to see actimeo in there; I associate it with NFS, but mount.cifs does accept it (attribute cache timeout)...

    The server's parameters are default LE, automatically sharing external discs.
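    One experiment I may try, purely an assumption on my part rather than anything suggested in the thread: the mount above is soft, so the -11 errors are what a timeout looks like; remounting hard would turn a stall into a plain hang, which can help tell a server-side pause from a client-side timeout. A sketch reusing the share and mount point from the output above:

    Code
    # Sketch: remount the share "hard" so a server stall hangs the I/O instead
    # of erroring out after a timeout (share/mount point as in the output above).
    umount /mnt/content
    mount -t cifs //10.25.25.2/content /mnt/content \
        -o guest,vers=3.1.1,uid=1000,gid=1000,hard,echo_interval=60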

    The gaps in the logs were in the Samba logs, with log level 10 on.

    Those Samba bugs are not really helpful, but I'm not the best person to judge.

    I'll try to coerce the LE samba to do some client-specific logging and come back here with data.
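    For reference, the knobs involved are just standard smbd ones; a sketch of what I mean by client-specific logging, to be merged into the [global] section (the log path, and where LE expects its user Samba config, are assumptions on my part):

    Code
    # Sketch: per-client log files at debug level 10 (log path is an assumption).
    # %I expands to the client's IP address; %m would give the NetBIOS name.
    [global]
        log level = 10
        log file = /storage/samba-logs/log.%I
        max log size = 0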

    [EDIT] It's on and working, giving very dense/verbose logs. How much do you want to bet that the error won't crop up for hours, or at least not before the /storage partition of my SD card is full?

    Oh, and one last data point, not the smallest: the similar production setup running LE11 20220120, which I believed was not subject to the issue, has displayed the same problem. This is depressing, as it removes the possibility of a known-good reference point...