fluctuating write performance #1332
Can you explain why you think this is a mergerfs issue? Drives have caches, the OS caches, and you have cache.files=partial, meaning you have double page caching going on... so having inconsistent speed after the caches fill is normal and expected. You could also have SMR drives, and those can slow down to single-digit MB/s speeds at times.
It doesn't necessarily have to be an issue related to mergerfs, but I'm a bit "lost in space" trying to find the "real" issue (no dmesg/syslog errors), because in the past (before the OS upgrade) I didn't notice this behaviour. I started with
If it is related to anything I mentioned, there wouldn't be any log errors. It would be normal, expected behavior.

- Are you copying the same amounts of data? To the same branches?
- Are you copying large files or many small files?
- Are you copying to a filesystem that is fuller or more empty?
- Have you checked your buffer sizes when it gets slow?
- When it gets slow, have you run checks against all the branches directly?
- Have you created a minimal setup with only one branch and tried the same?
- Confirmed that it isn't one particular filesystem/drive of yours?

All of that matters. Every drive, every different interconnect, etc. matters. SMR drives will completely drop their performance to single-digit write speeds once caches fill and they start flushing and need to write back to disk. You have 8TB drives, and SMR was pretty common for some brands of 8TB drives. Stop-and-go behaviour sounds like a full cache being flushed and then normal operation resuming. I see this all the time on my HDDs, including writes directly to their filesystems. There is a reason tools like nocache exist for use with rsync: buffer cache bloat is a real problem in certain workloads like the one you describe. There really isn't anything I can do without a lot more info about the situation.
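A rough way to see the buffer-cache bloat described above in action (paths and sizes here are illustrative stand-ins, not from this report) is to watch the kernel's `Dirty` counter around a copy; the `nocache` tool mentioned can then keep a copy out of the page cache entirely:

```shell
#!/bin/sh
# Sketch: observe dirty page-cache growth around a copy.
# SRC/DST are temporary stand-ins; in this issue they would be the
# NVMe source and the mergerfs pool.
SRC=$(mktemp -d); DST=$(mktemp -d)
dd if=/dev/zero of="$SRC/big.bin" bs=1M count=64 status=none

awk '/^Dirty:/ {print "dirty kB before copy:", $2}' /proc/meminfo
cp "$SRC/big.bin" "$DST/"
awk '/^Dirty:/ {print "dirty kB after copy: ", $2}' /proc/meminfo

# With the `nocache` package installed, the same copy without
# polluting the page cache would be:
#   nocache rsync -a "$SRC/big.bin" "$DST/"
rm -rf "$SRC" "$DST"
```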
According to the strace of mergerfs, your branches were clearly busy at times: orders of magnitude between the fastest and the slowest writes. mergerfs is just another app like any other, so this is representative of any random app trying to interact with those filesystems.
Over 1/10th of a second for some writes to complete, vs. the following when it is fast:
Yes.
Large files.
No different whether the filesystem (ext4) is about 50% full or nearly empty.
No; how can I check this?
No, because it worked in the past... and if I copy directly to the branch (with the same amount of data) without using mergerfs, I get the expected behaviour.
All my drives are CMR.
Run
To test properly you should have a single branch and disable page caching, because having it enabled increases cache usage, as mentioned in the docs; and since you have large files, dropcacheonclose won't matter as much.
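A minimal test mount along these lines might look like the following sketch (the branch path and mountpoint are placeholders; the `cache.files=off` and `dropcacheonclose=true` option names are as documented for mergerfs):

```shell
# Single spare-disk branch, page caching disabled, for an isolated test:
mergerfs -o cache.files=off,dropcacheonclose=true,allow_other \
    /mnt/spare /mnt/testpool
```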
I will create one with my spare disk this evening.
Done!
No change with a single branch. Some tests:
I still have the problem. The write speed from mergerfs to the external drive is often limited to 40 MB/s after some time, starting at 100 MB/s+. Overall it seems to be a caching issue, but with Ubuntu 20.04 I never had this kind of issue.
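One thing worth checking after an OS upgrade (an assumption on my part, not something established in this thread) is the kernel's writeback tuning. With a lot of RAM, a high dirty ratio lets many GB of dirty pages pile up before writeback throttles the copying process, which looks exactly like "fast for ~2 minutes, then a sharp drop":

```shell
#!/bin/sh
# Sketch: inspect the kernel writeback thresholds. dirty_ratio is the
# percentage of RAM that may be dirty before the writing process is
# throttled; dirty_background_ratio is where background flushing starts.
for f in dirty_ratio dirty_background_ratio; do
  printf '%s = %s\n' "$f" "$(cat /proc/sys/vm/$f)"
done
```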
There's nothing I can do if I don't have information to work with. Even if I did... nothing has changed in mergerfs in years in terms of abilities or features related to writes. When it comes to writes there is almost zero logic. As I pointed out before, your system/filesystems are clearly busy if a single write takes over 1/10th of a second.
Yes, I know... but what is the cause of the "busy filesystem", if that is what caused the problem, and how can I find the app/process/whatever?
You have to look at the running system when it is occurring. mergerfs is just another piece of software like any other running on your machine, in userspace, as root. If it says that writes to the underlying filesystem are taking over 1/10th of a second, then you should be able to run iotop or similar IO-monitoring tools to see who is reading and writing. And if it is mergerfs, then your problem is probably your hardware: lower the number of threads, change mergerfs priorities, or otherwise limit concurrency. Lots of consumer-grade hardware doesn't handle concurrency well.
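A sketch of how one might act on this advice (commands are common Linux tooling, not specific to this report; mergerfs option names vary by version, so check the docs for yours):

```shell
#!/bin/sh
# When the slowdown occurs, see who is generating IO.
# iotop (needs root) shows per-process throughput; -o limits output
# to processes actually doing IO, -P aggregates by process:
#   sudo iotop -oP
#
# To limit mergerfs concurrency, recent releases expose thread-count
# options, e.g. (name varies by version; see the mergerfs docs):
#   mergerfs -o read-thread-count=4 ...
#
# Without iotop, per-process IO counters live in /proc/<pid>/io:
cat /proc/self/io
```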
Describe the bug
I can't remember the exact point in time, but since my Ubuntu upgrade from 20.04 to 22.04 (+ NVMe upgrade) I've been getting fluctuating write performance after ~120-180 seconds, starting at >200 MB/s and dropping to <30 MB/s (from a single NVMe drive to mergerfs-pooled hard drives). I've played around with different mergerfs settings but always with the same result.
Working mergerfs settings with v2.28 / v2.33:
/mnt/data/* /mnt/mysnapraid fuse.mergerfs defaults,direct_io,allow_other,use_ino,category.create=epmfs,moveonenospc=true,minfreespace=20G,fsname=mergerfsPool 0 0
To Reproduce
Copy a number of large files (e.g. TV recordings) from a single NVMe drive to the mergerfs-pooled hard disks (cp via Midnight Commander); after ~120-180 s the write performance problems appear.
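To reproduce more controllably than with an interactive copy, a timed sustained write with an fsync at the end shows whether throughput collapses once caches fill (the destination path is a placeholder; point it at the mergerfs pool in real use):

```shell
#!/bin/sh
# Sketch: timed sustained write. conv=fsync makes dd flush before
# reporting, so the figure reflects actual disk writeback rather than
# just filling the page cache. Repeated runs expose the drop-off.
DST="${1:-$(mktemp)}"
for run in 1 2 3; do
  dd if=/dev/zero of="$DST" bs=1M count=64 conv=fsync 2>&1 | tail -n1
done
rm -f "$DST"
```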
System information:
Ubuntu 22.04.4 LTS, 5.15.0-92-generic
v2.40.2
cp.strace.txt.gz
mergerfs_cp.strace.txt.gz