You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Intermittent communication dropouts are occurring with SSD drives that are attached to Marvell 88SE9235 SATA controllers via Marvell 88SM9705 port multipliers. The SSD drives are M.2 form factor and are typically models from WD or SanDisk. When the issue occurs, communication with all SSD drives (5) connected to port multiplier is lost and the driver performs recovery steps in order to re-establish connection with the SSD drives. This results in ZFS I/O errors being reported from zpool status. Multiple events with unsuccessful recovery steps by driver can lead to pool suspension.
The issue occurs with both small and large I/O workloads, though usually takes longer to manifest with small I/O workload.
The issue does not occur with older version of CentOS and ZFS running on same hardware. Details:
Distribution: CentOS 7.9
Kernel Version: 3.10.0-1160.15.2
Architecture: x64
OpenZFS Version: 0.8.6-1
Typically, the ZFS pools in use on AlmaLinux were created in CentOS. Creating the pools in AlmaLinux did not resolve issue.
Have tried the following, in different combinations:
Disabling NCQ
Lowering SATA speed to 3.0
Upgrading ZFS to 2.1.13
Changing SATA power management from max_performance -> medium_power
Changing I/O scheduler from None -> mq-deadline
Change max_sectors_kb -> 512
The issue can be reproduced as follows:
Small I/O workload: Boot-up system w/ apps that generate small sustained I/O load on the ZFS pool and let it run w/o interaction
Large I/O workload: Use fio to generate heavy I/O workload on ZFS pool
Syslog Snippet
Dec 17 07:41:00.384 test01 kernel: ata7.00: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:00.384 test01 kernel: ata7.01: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:00.384 test01 kernel: ata7.02: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:00.384 test01 kernel: ata7.03: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:00.384 test01 kernel: ata7.04: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:00.384 test01 kernel: ata7.00: exception Emask 0x100 SAct 0x4200000 SErr 0x0 action 0x6 frozen
Dec 17 07:41:00.384 test01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Dec 17 07:41:00.384 test01 kernel: ata7.00: cmd 61/0b:a8:da:66:d1/00:00:08:00:00/40 tag 21 ncq dma 5632 out
res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:00.384 test01 kernel: ata7.00: status: { DRDY }
Dec 17 07:41:00.384 test01 kernel: ata7.00: failed command: WRITE FPDMA QUEUED
Dec 17 07:41:00.384 test01 kernel: ata7.00: cmd 61/15:d0:28:26:fe/00:00:06:00:00/40 tag 26 ncq dma 10752 out
res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:00.384 test01 kernel: ata7.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:02.242 test01 kernel: ata9.01: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:02.242 test01 kernel: ata9.02: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:02.242 test01 kernel: ata9.03: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:02.242 test01 kernel: ata9.04: failed to read SCR 1 (Emask=0x40)
Dec 17 07:41:02.242 test01 kernel: ata9.00: exception Emask 0x100 SAct 0x40e36043 SErr 0x0 action 0x6 frozen
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.242 test01 kernel: ata9.00: cmd 60/05:00:bd:0e:c7/00:00:25:00:00/40 tag 0 ncq dma 2560 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.242 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.242 test01 kernel: ata9.00: cmd 60/05:08:e1:0d:18/00:00:1b:00:00/40 tag 1 ncq dma 2560 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.242 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.242 test01 kernel: ata9.00: cmd 60/05:30:83:32:7d/00:00:1d:00:00/40 tag 6 ncq dma 2560 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.242 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.242 test01 kernel: ata9.00: cmd 60/05:68:f3:24:75/00:00:1d:00:00/40 tag 13 ncq dma 2560 in
res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.242 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.242 test01 kernel: ata9.00: cmd 60/05:70:37:ad:66/00:00:1d:00:00/40 tag 14 ncq dma 2560 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.242 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: WRITE FPDMA QUEUED
Dec 17 07:41:02.242 test01 kernel: ata9.00: cmd 61/20:80:1d:26:fe/00:00:06:00:00/40 tag 16 ncq dma 16384 out
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.242 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.242 test01 kernel: ata9.00: cmd 60/05:88:93:bf:6d/00:00:23:00:00/40 tag 17 ncq dma 2560 in
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.242 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.242 test01 kernel: ata9.00: failed command: WRITE FPDMA QUEUED
Dec 17 07:41:02.243 test01 kernel: ata9.00: cmd 61/cc:a8:60:a0:6e/00:00:29:00:00/40 tag 21 ncq dma 104448 out
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.243 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.243 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.243 test01 kernel: ata9.00: cmd 60/05:b0:bb:23:78/00:00:1d:00:00/40 tag 22 ncq dma 2560 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.243 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.243 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.243 test01 kernel: ata9.00: cmd 60/05:b8:ff:9d:61/00:00:1d:00:00/40 tag 23 ncq dma 2560 in
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.243 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.243 test01 kernel: ata9.00: failed command: READ FPDMA QUEUED
Dec 17 07:41:02.243 test01 kernel: ata9.00: cmd 60/05:f0:4d:f5:6a/00:00:1d:00:00/40 tag 30 ncq dma 2560 in
res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.243 test01 kernel: ata9.00: status: { DRDY }
Dec 17 07:41:02.243 test01 kernel: ata9.02: exception Emask 0x100 SAct 0x800 SErr 0x0 action 0x6 frozen
Dec 17 07:41:02.243 test01 kernel: ata9.02: failed command: WRITE FPDMA QUEUED
Dec 17 07:41:02.243 test01 kernel: ata9.02: cmd 61/1d:58:a8:1f:ab/00:00:3c:00:00/40 tag 11 ncq dma 14848 out
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.243 test01 kernel: ata9.02: status: { DRDY }
Dec 17 07:41:02.243 test01 kernel: ata9.04: exception Emask 0x100 SAct 0x1000 SErr 0x0 action 0x6 frozen
Dec 17 07:41:02.243 test01 kernel: ata9.04: failed command: WRITE FPDMA QUEUED
Dec 17 07:41:02.243 test01 kernel: ata9.04: cmd 61/1d:60:a8:1f:ab/00:00:3c:00:00/40 tag 12 ncq dma 14848 out
res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 17 07:41:02.243 test01 kernel: ata9.04: status: { DRDY }
Dec 17 07:41:10.384 test01 kernel: ata7.15: softreset failed (1st FIS failed)
Dec 17 07:41:12.242 test01 kernel: ata9.15: softreset failed (1st FIS failed)
Dec 17 07:41:20.384 test01 kernel: ata7.15: softreset failed (1st FIS failed)
Dec 17 07:41:22.242 test01 kernel: ata9.15: softreset failed (1st FIS failed)
Dec 17 07:41:55.384 test01 kernel: ata7.15: softreset failed (1st FIS failed)
Dec 17 07:41:55.384 test01 kernel: ata7.15: limiting SATA link speed to 3.0 Gbps
Dec 17 07:41:57.242 test01 kernel: ata9.15: softreset failed (1st FIS failed)
Dec 17 07:41:57.242 test01 kernel: ata9.15: limiting SATA link speed to 3.0 Gbps
Dec 17 07:42:00.384 test01 kernel: ata7.15: softreset failed (1st FIS failed)
Dec 17 07:42:00.384 test01 kernel: ata7.15: failed to reset PMP, giving up
Dec 17 07:42:00.384 test01 kernel: ata7.15: Port Multiplier detaching
Dec 17 07:42:00.384 test01 kernel: ata7.00: disabled
Dec 17 07:42:00.384 test01 kernel: ata7.02: disabled
Dec 17 07:42:00.384 test01 kernel: ata7.03: disabled
Dec 17 07:42:00.384 test01 kernel: ata7.04: disabled
Dec 17 07:42:00.384 test01 kernel: ata7.00: disabled
Dec 17 07:42:00.384 test01 kernel: sd 6:0:0:0: [sdb] tag#21 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=90s
Dec 17 07:42:00.384 test01 kernel: sd 6:0:0:0: [sdb] tag#21 Sense Key : Not Ready [current]
Dec 17 07:42:00.384 test01 kernel: sd 6:0:0:0: [sdb] tag#21 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:00.385 test01 kernel: sd 6:0:0:0: [sdb] tag#21 CDB: Write(10) 2a 00 08 d1 66 da 00 00 0b 00
Dec 17 07:42:00.385 test01 kernel: blk_update_request: I/O error, dev sdb, sector 147941082 op 0x1:(WRITE) flags 0x700 phys_seg 11 prio class 0
Dec 17 07:42:00.385 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=75744785408 size=5632 flags=40080c80
Dec 17 07:42:00.385 test01 kernel: sd 6:0:0:0: [sdb] tag#26 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=90s
Dec 17 07:42:00.385 test01 kernel: sd 6:0:0:0: [sdb] tag#26 Sense Key : Not Ready [current]
Dec 17 07:42:00.385 test01 kernel: sd 6:0:0:0: [sdb] tag#26 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:00.385 test01 kernel: sd 6:0:0:0: [sdb] tag#26 CDB: Write(10) 2a 00 06 fe 26 28 00 00 15 00
Dec 17 07:42:00.385 test01 kernel: blk_update_request: I/O error, dev sdb, sector 117319208 op 0x1:(WRITE) flags 0x700 phys_seg 10 prio class 0
Dec 17 07:42:00.385 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=60066385920 size=10752 flags=40080c80
Dec 17 07:42:00.385 test01 kernel: sd 6:2:0:0: rejecting I/O to offline device
Dec 17 07:42:00.385 test01 kernel: blk_update_request: I/O error, dev sdc, sector 1017847720 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Dec 17 07:42:00.385 test01 kernel: zio pool=datapool vdev=/dev/sdc1 error=5 type=2 offset=521136984064 size=14848 flags=180880
Dec 17 07:42:00.385 test01 kernel: sd 6:0:0:0: rejecting I/O to offline device
Dec 17 07:42:00.386 test01 kernel: blk_update_request: I/O error, dev sdb, sector 443372922 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Dec 17 07:42:00.386 test01 kernel: sd 6:4:0:0: rejecting I/O to offline device
Dec 17 07:42:00.386 test01 kernel: blk_update_request: I/O error, dev sde, sector 1017847720 op 0x1:(WRITE) flags 0x700 phys_seg 1 prio class 0
Dec 17 07:42:00.386 test01 kernel: blk_update_request: I/O error, dev sdb, sector 629693050 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227005887488 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=datapool vdev=/dev/sde1 error=5 type=2 offset=521136984064 size=14848 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=322401793024 size=2560 flags=180880
Dec 17 07:42:00.386 test01 kernel: blk_update_request: I/O error, dev sdb, sector 494258904 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=253059510272 size=2560 flags=180880
Dec 17 07:42:00.386 test01 kernel: blk_update_request: I/O error, dev sdb, sector 443372488 op 0x1:(WRITE) flags 0x700 phys_seg 11 prio class 0
Dec 17 07:42:00.386 test01 kernel: blk_update_request: I/O error, dev sdb, sector 104967442 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Dec 17 07:42:00.386 test01 kernel: blk_update_request: I/O error, dev sdb, sector 449342746 op 0x0:(READ) flags 0x700 phys_seg 1 prio class 0
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=230062437376 size=2560 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227005665280 size=44032 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=53742281728 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227005843456 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227006019584 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=476445496320 size=2560 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=224627123712 size=4096 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=317432351744 size=2560 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=92569902080 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=253234531328 size=2560 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227005711360 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=224627127808 size=4096 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227005799424 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227005931520 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227005975552 size=43520 flags=180880
Dec 17 07:42:00.386 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=330506853376 size=102400 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: sd 6:3:0:0: rejecting I/O to offline device
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=330506955776 size=88064 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: ata7.00: detaching (SCSI 6:0:0:0)
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=60066396672 size=7168 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: zio pool=datapool vdev=/dev/sdc1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=75744791040 size=2560 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: zio pool=datapool vdev=/dev/sdc1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=datapool vdev=/dev/sdc1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=76127388672 size=5632 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=87584065024 size=1024 flags=180880
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=330507043840 size=88064 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=330507131904 size=103424 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: zio pool=datapool vdev=/dev/sde1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=datapool vdev=/dev/sde1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=datapool vdev=/dev/sde1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:00.387 test01 kernel: zio pool=extpool vdev=/dev/sdb1 error=5 type=2 offset=227006063104 size=88576 flags=40080c80
Dec 17 07:42:00.387 test01 kernel: sd 6:0:0:0: [sdb] Stopping disk
Dec 17 07:42:00.387 test01 kernel: sd 6:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:00.394 test01 kernel: ata7.02: detaching (SCSI 6:2:0:0)
Dec 17 07:42:00.406 test01 kernel: sd 6:2:0:0: [sdc] Stopping disk
Dec 17 07:42:00.406 test01 kernel: sd 6:2:0:0: [sdc] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:00.416 test01 kernel: ata7.03: detaching (SCSI 6:3:0:0)
Dec 17 07:42:00.418 test01 kernel: sd 6:3:0:0: [sdd] Stopping disk
Dec 17 07:42:00.418 test01 kernel: sd 6:3:0:0: [sdd] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:00.442 test01 kernel: ata7.04: detaching (SCSI 6:4:0:0)
Dec 17 07:42:00.447 test01 kernel: sd 6:4:0:0: [sde] Stopping disk
Dec 17 07:42:00.447 test01 kernel: sd 6:4:0:0: [sde] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:02.241 test01 kernel: ata9.15: softreset failed (1st FIS failed)
Dec 17 07:42:02.242 test01 kernel: ata9.15: failed to reset PMP, giving up
Dec 17 07:42:02.242 test01 kernel: ata9.15: Port Multiplier detaching
Dec 17 07:42:02.242 test01 kernel: ahci 0000:0e:00.0: FBS is disabled
Dec 17 07:42:02.243 test01 kernel: ata9.00: disabled
Dec 17 07:42:02.243 test01 kernel: ata9.01: disabled
Dec 17 07:42:02.243 test01 kernel: ata9.02: disabled
Dec 17 07:42:02.243 test01 kernel: ata9.03: disabled
Dec 17 07:42:02.243 test01 kernel: ata9.04: disabled
Dec 17 07:42:02.243 test01 kernel: ata9.00: disabled
Dec 17 07:42:02.243 test01 kernel: sd 8:0:0:0: [sdk] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=90s
Dec 17 07:42:02.244 test01 kernel: sd 8:0:0:0: [sdk] tag#0 Sense Key : Not Ready [current]
Dec 17 07:42:02.244 test01 kernel: sd 8:0:0:0: [sdk] tag#0 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.244 test01 kernel: sd 8:0:0:0: [sdk] tag#0 CDB: Read(10) 28 00 25 c7 0e bd 00 00 05 00
Dec 17 07:42:02.245 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=324505795072 size=2560 flags=180880
Dec 17 07:42:02.245 test01 kernel: sd 8:0:0:0: [sdk] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=90s
Dec 17 07:42:02.246 test01 kernel: sd 8:0:0:0: [sdk] tag#1 Sense Key : Not Ready [current]
Dec 17 07:42:02.246 test01 kernel: sd 8:0:0:0: [sdk] tag#1 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.246 test01 kernel: sd 8:0:0:0: [sdk] tag#1 CDB: Read(10) 28 00 1b 18 0d e1 00 00 05 00
Dec 17 07:42:02.247 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=232734310912 size=2560 flags=180880
Dec 17 07:42:02.247 test01 kernel: sd 8:0:0:0: [sdk] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=91s
Dec 17 07:42:02.247 test01 kernel: sd 8:0:0:0: [sdk] tag#6 Sense Key : Not Ready [current]
Dec 17 07:42:02.321 test01 kernel: sd 8:0:0:0: rejecting I/O to offline device
Dec 17 07:42:02.321 test01 kernel: sd 8:0:0:0: [sdk] tag#6 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.321 test01 kernel: sd 8:0:0:0: [sdk] tag#6 CDB: Read(10) 28 00 1d 7d 32 83 00 00 05 00
Dec 17 07:42:02.393 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253307979264 size=2560 flags=180880
Dec 17 07:42:02.393 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005842944 size=44032 flags=180880
Dec 17 07:42:02.393 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005798912 size=44032 flags=180880
Dec 17 07:42:02.393 test01 kernel: sd 8:2:0:0: [sdm] tag#11 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=91s
Dec 17 07:42:02.454 test01 kernel: sd 8:2:0:0: [sdm] tag#11 Sense Key : Not Ready [current]
Dec 17 07:42:02.457 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005931008 size=44032 flags=180880
Dec 17 07:42:02.457 test01 kernel: sd 8:2:0:0: [sdm] tag#11 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.457 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=224627123712 size=4096 flags=180880
Dec 17 07:42:02.457 test01 kernel: sd 8:2:0:0: [sdm] tag#11 CDB: Write(10) 2a 00 3c ab 1f a8 00 00 1d 00
Dec 17 07:42:02.458 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005975040 size=44032 flags=180880
Dec 17 07:42:02.458 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005710848 size=44032 flags=180880
Dec 17 07:42:02.458 test01 kernel: zio pool=datapool vdev=/dev/sdm1 error=5 type=2 offset=521136984064 size=14848 flags=180880
Dec 17 07:42:02.458 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253135649280 size=2560 flags=180880
Dec 17 07:42:02.458 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005665280 size=44032 flags=180880
Dec 17 07:42:02.458 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=252951240192 size=2560 flags=180880
Dec 17 07:42:02.458 test01 kernel: sd 8:4:0:0: [sdo] tag#12 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=91s
Dec 17 07:42:02.480 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=224627127808 size=4096 flags=180880
Dec 17 07:42:02.480 test01 kernel: sd 8:4:0:0: [sdo] tag#12 Sense Key : Not Ready [current]
Dec 17 07:42:02.482 test01 kernel: sd 8:4:0:0: [sdo] tag#12 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.507 test01 kernel: sd 8:4:0:0: [sdo] tag#12 CDB: Write(10) 2a 00 3c ab 1f a8 00 00 1d 00
Dec 17 07:42:02.522 test01 kernel: zio pool=datapool vdev=/dev/sdo1 error=5 type=2 offset=521136984064 size=14848 flags=180880
Dec 17 07:42:02.523 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005886976 size=44032 flags=180880
Dec 17 07:42:02.523 test01 kernel: sd 8:0:0:0: [sdk] tag#13 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=91s
Dec 17 07:42:02.523 test01 kernel: sd 8:0:0:0: [sdk] tag#13 Sense Key : Not Ready [current]
Dec 17 07:42:02.523 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227005754880 size=44032 flags=180880
Dec 17 07:42:02.523 test01 kernel: sd 8:0:0:0: [sdk] tag#13 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.523 test01 kernel: sd 8:0:0:0: [sdk] tag#13 CDB: Read(10) 28 00 1d 75 24 f3 00 00 05 00
Dec 17 07:42:02.523 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253037766144 size=2560 flags=180880
Dec 17 07:42:02.523 test01 kernel: sd 8:0:0:0: [sdk] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=91s
Dec 17 07:42:02.523 test01 kernel: sd 8:0:0:0: [sdk] tag#14 Sense Key : Not Ready [current]
Dec 17 07:42:02.524 test01 kernel: sd 8:0:0:0: [sdk] tag#14 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.524 test01 kernel: sd 8:0:0:0: [sdk] tag#14 CDB: Read(10) 28 00 1d 66 ad 37 00 00 05 00
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=252552310272 size=2560 flags=180880
Dec 17 07:42:02.524 test01 kernel: sd 8:0:0:0: [sdk] tag#16 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=92s
Dec 17 07:42:02.524 test01 kernel: sd 8:0:0:0: [sdk] tag#16 Sense Key : Not Ready [current]
Dec 17 07:42:02.524 test01 kernel: sd 8:0:0:0: [sdk] tag#16 Add. Sense: Logical unit not ready, hard reset required
Dec 17 07:42:02.524 test01 kernel: sd 8:0:0:0: [sdk] tag#16 CDB: Write(10) 2a 00 06 fe 26 1d 00 00 20 00
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=60066380288 size=16384 flags=40080c80
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=304329205248 size=2560 flags=180880
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=355898277888 size=104448 flags=40080c80
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253138269696 size=2560 flags=180880
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227006019072 size=88064 flags=40080c80
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=252382543360 size=2560 flags=180880
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=252695976448 size=2560 flags=180880
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=330506852864 size=102400 flags=40080c80
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=330506955264 size=88064 flags=40080c80
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227006107136 size=88064 flags=40080c80
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=330507043328 size=88064 flags=40080c80
Dec 17 07:42:02.524 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253023013376 size=2560 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=227006195200 size=88064 flags=40080c80
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=330507131392 size=103936 flags=40080c80
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253031490048 size=2560 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253238470144 size=2560 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=209465724416 size=2560 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253234530816 size=3072 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=330507235328 size=113152 flags=40080c80
Dec 17 07:42:02.525 test01 kernel: ata9.00: detaching (SCSI 8:0:0:0)
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=92569901568 size=44032 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=230062436864 size=3072 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=53742281216 size=44032 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=253059509760 size=3072 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=322401792512 size=3072 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=317432351232 size=3072 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=476445495808 size=3072 flags=180880
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=60066396672 size=7168 flags=40080c80
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:02.525 test01 kernel: zio pool=extpool vdev=/dev/sdk1 error=5 type=2 offset=75744782336 size=10752 flags=40080c80
Dec 17 07:42:02.525 test01 kernel: sd 8:2:0:0: rejecting I/O to offline device
Dec 17 07:42:02.526 test01 kernel: zio pool=datapool vdev=/dev/sdm1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:02.526 test01 kernel: zio pool=datapool vdev=/dev/sdm1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:02.526 test01 kernel: zio pool=datapool vdev=/dev/sdm1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:02.526 test01 kernel: sd 8:4:0:0: rejecting I/O to offline device
Dec 17 07:42:02.532 test01 kernel: zio pool=datapool vdev=/dev/sdo1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:02.532 test01 kernel: zio pool=datapool vdev=/dev/sdo1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:02.532 test01 kernel: zio pool=datapool vdev=/dev/sdo1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:02.532 test01 kernel: sd 8:1:0:0: rejecting I/O to offline device
Dec 17 07:42:02.536 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=40090974208 size=512 flags=180880
Dec 17 07:42:02.536 test01 kernel: sd 8:3:0:0: rejecting I/O to offline device
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=2 offset=40090974208 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=5946151936 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=2 offset=5946151936 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=2 offset=40090974720 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=40090974720 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=13820093952 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=2 offset=13820093952 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=2 offset=11875658240 size=512 flags=180880
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:02.543 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=2 offset=150898688 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=11875657728 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=1 offset=270336 size=8192 flags=b08c1
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=2 offset=28639322112 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=150898688 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=1 offset=1024199237632 size=8192 flags=b08c1
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=1 offset=1024199499776 size=8192 flags=b08c1
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=2 offset=150898688 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=2 offset=11875657728 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=2 offset=150899200 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=150899200 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdn1 error=5 type=2 offset=150899200 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdl1 error=5 type=2 offset=28639322112 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: zio pool=datapool vdev=/dev/sdd1 error=5 type=2 offset=28639322624 size=512 flags=180880
Dec 17 07:42:02.544 test01 kernel: WARNING: Pool 'datapool' has encountered an uncorrectable I/O failure and has been suspended.
Dec 17 07:42:02.559 test01 kernel: sd 8:0:0:0: [sdk] Stopping disk
Dec 17 07:42:02.559 test01 kernel: sd 8:0:0:0: [sdk] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:02.605 test01 kernel: ata9.01: detaching (SCSI 8:1:0:0)
Dec 17 07:42:02.608 test01 kernel: sd 8:1:0:0: [sdl] Stopping disk
Dec 17 07:42:02.608 test01 kernel: sd 8:1:0:0: [sdl] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:02.627 test01 kernel: ata9.02: detaching (SCSI 8:2:0:0)
Dec 17 07:42:02.658 test01 kernel: sd 8:2:0:0: [sdm] Stopping disk
Dec 17 07:42:02.658 test01 kernel: sd 8:2:0:0: [sdm] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:02.692 test01 kernel: ata9.03: detaching (SCSI 8:3:0:0)
Dec 17 07:42:02.692 test01 kernel: sd 8:3:0:0: [sdn] Stopping disk
Dec 17 07:42:02.692 test01 kernel: sd 8:3:0:0: [sdn] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:02.735 test01 kernel: ata9.04: detaching (SCSI 8:4:0:0)
Dec 17 07:42:02.737 test01 kernel: sd 8:4:0:0: [sdo] Stopping disk
Dec 17 07:42:02.737 test01 kernel: sd 8:4:0:0: [sdo] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
Dec 17 07:42:08.671 test01 kernel: WARNING: Pool 'datapool' has encountered an uncorrectable I/O failure and has been suspended.
Dec 17 07:42:13.338 test01 kernel: WARNING: Pool 'datapool' has encountered an uncorrectable I/O failure and has been suspended.
Dec 17 07:42:13.343 test01 kernel: WARNING: Pool 'datapool' has encountered an uncorrectable I/O failure and has been suspended.
Zpool Status
pool: datapool
state: SUSPENDED
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-HC
config:
NAME STATE READ WRITE CKSUM
datapool DEGRADED 0 0 0
raidz3-0 DEGRADED 2 94 0
7766977507585751734 FAULTED 0 0 0 was /dev/sdc1
sdd ONLINE 3 169 0
sdg ONLINE 0 0 0
sdi ONLINE 0 0 0
sdl ONLINE 3 178 0
sdn ONLINE 3 168 0
sdq ONLINE 0 0 0
sds ONLINE 0 0 0
sdv ONLINE 0 0 0
sdx ONLINE 0 0 0
sdaa ONLINE 0 0 0
sdac ONLINE 0 0 0
raidz3-1 ONLINE 1 2 0
sdc ONLINE 3 2 0
sde ONLINE 3 2 0
sdh ONLINE 0 0 0
sdj ONLINE 0 0 0
sdm ONLINE 3 2 0
sdo ONLINE 3 2 0
sdr ONLINE 0 0 0
sdt ONLINE 0 0 0
sdw ONLINE 0 0 0
sdy ONLINE 0 0 0
sdab ONLINE 0 0 0
sdad ONLINE 0 0 0
errors: List of errors unavailable: pool I/O is currently suspended
pool: extpool
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
config:
NAME STATE READ WRITE CKSUM
extpool DEGRADED 0 0 0
raidz3-0 DEGRADED 0 0 0
sdb UNAVAIL 26 1.23K 0
sdf ONLINE 0 0 0
sdk UNAVAIL 26 1.16K 0
sdp ONLINE 6 32 1
sdu ONLINE 0 0 0
sdz ONLINE 0 0 0
errors: No known data errors
The text was updated successfully, but these errors were encountered:
Distribution: AlmaLinux 8.8
Kernel Version: 4.18.0-477.13.1
Architecture: x64
OpenZFS Version: 2.1.5-1
Intermittent communication dropouts are occurring with SSD drives that are attached to Marvell 88SE9235 SATA controllers via Marvell 88SM9705 port multipliers. The SSD drives are M.2 form factor and are typically models from WD or SanDisk. When the issue occurs, communication with all SSD drives (5) connected to port multiplier is lost and the driver performs recovery steps in order to re-establish connection with the SSD drives. This results in ZFS I/O errors being reported from
zpool status
. Multiple events with unsuccessful recovery steps by driver can lead to pool suspension.The issue occurs with both small and large I/O workloads, though usually takes longer to manifest with small I/O workload.
The issue does not occur with older version of CentOS and ZFS running on same hardware. Details:
Distribution: CentOS 7.9
Kernel Version: 3.10.0-1160.15.2
Architecture: x64
OpenZFS Version: 0.8.6-1
Typically, the ZFS pools in use on AlmaLinux were created in CentOS. Creating the pools in AlmaLinux did not resolve issue.
Have tried the following, in different combinations:
Disabling NCQ
Lowering SATA speed to 3.0
Upgrading ZFS to 2.1.13
Changing SATA power management from max_performance -> medium_power
Changing I/O scheduler from None -> mq-deadline
Change max_sectors_kb -> 512
The issue can be reproduced as follows:
Small I/O workload: Boot-up system w/ apps that generate small sustained I/O load on the ZFS pool and let it run w/o interaction
Large I/O workload: Use
fio
to generate heavy I/O workload on ZFS poolSyslog Snippet
Zpool Status
The text was updated successfully, but these errors were encountered: