hung zfs task #15481
Comments
After about two hours, the hang seems to have resolved itself. It might also be worth noting that the system currently has 1669 snapshots, and …
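(Side note, not from the original report: for anyone trying to correlate the hang with snapshot count, a quick way to count snapshots is sketched below; "tank" is just a placeholder pool name.)

    # count all snapshots on the system
    zfs list -H -t snapshot -o name | wc -l

    # or restrict the count to one pool, e.g. a pool named "tank"
    zfs list -H -t snapshot -o name -r tank | wc -l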
A new set of stacktraces, from several processes:
Oh my god, I thought my server had a hardware defect. Same problem here, thanks for sharing! Edit: I switched my server from the LTS kernel (6.1.61) to the stable kernel (6.5.9) and the problem went away. Seems to be a very specific case when using zfs 2.2.0 with the 6.1.61 kernel. Edit: The problem hasn't really been solved. My system doesn't freeze anymore after some hours, however I/O-heavy programs freeze and lock up individual CPU threads. Those processes also can't be killed and block shutdowns until I cut the power. Edit: After upgrading to the 6.5.9 kernel and shutting down I/O-heavy programs, my server ran for 9 hours without any errors in journalctl. Also worth mentioning: #15275 (comment)
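(Aside, commands illustrative rather than taken from this thread: unkillable, I/O-blocked processes normally sit in uninterruptible sleep, state D, and the kernel's hung-task watchdog logs them, so both symptoms can be checked with something like the following.)

    # list processes stuck in uninterruptible sleep (state D)
    ps -eo pid,state,wchan:32,cmd | awk '$2 == "D"'

    # search the kernel log for hung-task warnings
    journalctl -k | grep -i "blocked for more than"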
I experience an issue that shares many of the characteristics mentioned above: I/O-heavy workload, tasks freeze, can't be killed, delayed shutdowns (~15 min). My versions are a bit older since I run Ubuntu 22.04 LTS. The configuration here is RAID10.
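(For comparing affected setups, the kernel and OpenZFS versions the issue template asks for can be collected roughly like this; the second and third commands assume OpenZFS 0.8 or newer.)

    uname -r                      # running kernel
    zfs version                   # userland and kernel module versions
    cat /sys/module/zfs/version   # version of the loaded zfs module only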
We do have hung_task panics on the 6.5.11-4-pve kernel with zfs-kmod-2.2.0-pve3 while trying to work with 1000+ snapshots.
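(Aside: whether a hung task only logs a warning or panics the machine depends on the kernel's hung-task watchdog sysctls; the commands below are illustrative, not a recommendation for production.)

    # inspect the hung-task watchdog settings
    sysctl kernel.hung_task_timeout_secs kernel.hung_task_panic

    # example only: log a warning instead of panicking on a hung task
    sysctl -w kernel.hung_task_panic=0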
System information
Describe the problem you're observing
Linux reports that the zfs task is hung. Several processes (local as well as remote via NFS) block trying to access some files. However, not all accesses are blocked.
Describe how to reproduce the problem
I don't have steps to reproduce the problem. The hang happened on a production system, with the following dominant actions happening:
Include any warning/errors/backtraces from the system logs
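(The backtraces themselves are not preserved in this excerpt. For completeness, one common way to capture blocked-task backtraces like those referenced above, assuming root access, is sketched below; <pid> is a placeholder.)

    # dump kernel stacks of all blocked (D-state) tasks into the kernel log
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 200

    # or read the kernel stack of one specific stuck process
    cat /proc/<pid>/stack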