-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/colcontainer: TestExternalDistinct leaks Queues, file descriptors #81413
Comments
This commit includes non-trivial changes to account for the change in the `vfs.WithDiskHealthChecks` function signature change. Additionally, this PR uncovered cockroachdb#81413 and necessitated some changes to work around it. ``` e567fec8 db: ensure Open closes opened directories on error b8c9a560 internal/metamorphic: overwrite unused bounds buffers 7c5f0cbb db: copy user-provided bounds and optimize unchanged bounds f8897076 *: Add IterOption to optionally read L6 filter blocks. d79f9617 vfs: extend disk-health checking to filesystem metadata operations 5ae21746 db: remove newRangeKeyIter closure 6d975489 db: add BenchmarkIteratorScan 782d102e sstable: fix invariant check for sstable size estimation 738a7f0b db: fix NewIter regression resulting in extra memtable levels 37558663 *: Use keyspan.LevelIter for rangedels in compactions e6c60c71 db: use sublevel level iters for all compactions out of L0 d8f63bef db: extend documentation on ingested file overlap; improve test cases b9e970a8 internal/keyspan: correct and document MergingIter key stability 498177bb internal/keyspan: collapse fragmentBoundIterator into MergingIter ``` Release note (bug fix): Fix a gap in disk-stall detection. Previously, disk stalls during filesystem metadata operations could go undetected, inducing deadlocks. Now stalls during these types of operations will correctly fatal the process.
This commit includes non-trivial changes to account for the change in the `vfs.WithDiskHealthChecks` function signature change. Additionally, this PR uncovered cockroachdb#81413 and necessitated some changes to work around it. ``` e567fec8 db: ensure Open closes opened directories on error b8c9a560 internal/metamorphic: overwrite unused bounds buffers 7c5f0cbb db: copy user-provided bounds and optimize unchanged bounds f8897076 *: Add IterOption to optionally read L6 filter blocks. d79f9617 vfs: extend disk-health checking to filesystem metadata operations 5ae21746 db: remove newRangeKeyIter closure 6d975489 db: add BenchmarkIteratorScan 782d102e sstable: fix invariant check for sstable size estimation 738a7f0b db: fix NewIter regression resulting in extra memtable levels 37558663 *: Use keyspan.LevelIter for rangedels in compactions e6c60c71 db: use sublevel level iters for all compactions out of L0 d8f63bef db: extend documentation on ingested file overlap; improve test cases b9e970a8 internal/keyspan: correct and document MergingIter key stability 498177bb internal/keyspan: collapse fragmentBoundIterator into MergingIter ``` Release note (bug fix): Fix a gap in disk-stall detection. Previously, disk stalls during filesystem metadata operations could go undetected, inducing deadlocks. Now stalls during these types of operations will correctly fatal the process.
This commit includes non-trivial changes to account for the change in the `vfs.WithDiskHealthChecks` function signature change. Additionally, this PR uncovered cockroachdb#81413 and necessitated some changes to work around it. ``` e567fec8 db: ensure Open closes opened directories on error b8c9a560 internal/metamorphic: overwrite unused bounds buffers 7c5f0cbb db: copy user-provided bounds and optimize unchanged bounds f8897076 *: Add IterOption to optionally read L6 filter blocks. d79f9617 vfs: extend disk-health checking to filesystem metadata operations 5ae21746 db: remove newRangeKeyIter closure 6d975489 db: add BenchmarkIteratorScan 782d102e sstable: fix invariant check for sstable size estimation 738a7f0b db: fix NewIter regression resulting in extra memtable levels 37558663 *: Use keyspan.LevelIter for rangedels in compactions e6c60c71 db: use sublevel level iters for all compactions out of L0 d8f63bef db: extend documentation on ingested file overlap; improve test cases b9e970a8 internal/keyspan: correct and document MergingIter key stability 498177bb internal/keyspan: collapse fragmentBoundIterator into MergingIter ``` Release note (bug fix): Fix a gap in disk-stall detection. Previously, disk stalls during filesystem metadata operations could go undetected, inducing deadlocks. Now stalls during these types of operations will correctly fatal the process.
With the changes from #81419, I still saw a couple file descriptors leak in For now, I'm going to continue disable disk-health checking for these in-memory unit test engines, so that we can merge this Pebble bump. |
Thanks Jackson! Could you please drop a quick comment on how to enable those checks back so that I could debug? |
Oh, I see, it's just |
81389: vendor: bump Pebble to e567fec84c6e r=jbowens a=jbowens This commit includes non-trivial changes to account for the change in the `vfs.WithDiskHealthChecks` function signature change. Additionally, this PR uncovered #81413 and necessitated some changes to work around it. ``` e567fec8 db: ensure Open closes opened directories on error b8c9a560 internal/metamorphic: overwrite unused bounds buffers 7c5f0cbb db: copy user-provided bounds and optimize unchanged bounds f8897076 *: Add IterOption to optionally read L6 filter blocks. d79f9617 vfs: extend disk-health checking to filesystem metadata operations 5ae21746 db: remove newRangeKeyIter closure 6d975489 db: add BenchmarkIteratorScan 782d102e sstable: fix invariant check for sstable size estimation 738a7f0b db: fix NewIter regression resulting in extra memtable levels 37558663 *: Use keyspan.LevelIter for rangedels in compactions e6c60c71 db: use sublevel level iters for all compactions out of L0 d8f63bef db: extend documentation on ingested file overlap; improve test cases b9e970a8 internal/keyspan: correct and document MergingIter key stability 498177bb internal/keyspan: collapse fragmentBoundIterator into MergingIter ``` Release note (bug fix): Fix a gap in disk-stall detection. Previously, disk stalls during filesystem metadata operations could go undetected, inducing deadlocks. Now stalls during these types of operations will correctly fatal the process. 81535: colmem: improve the behavior of ResetMaybeReallocate r=yuzefovich a=yuzefovich Previously, `Allocator.ResetMaybeReallocate` would always grow the capacity of the batch until `coldata.BatchSize()` (unless the memory limit has been exceeded). This behavior allows us to have batches with dynamic size when we don't know how many rows we need to process (for example, in the `cFetcher` we start out with 1 row, then grow it exponentially - 2, 4, 8, etc). However, in some cases we know exactly how many rows we want to include into the batch, so that behavior can result in us re-allocating a batch needlessly when the old batch already have enough capacity. This commit improves the situation by adding a knob to indicate that if the desired capacity is satisfied by the old batch, then it should not be re-allocated. All callers have been audited accordingly. Release note: None 81552: roachtest: fix pgjdbc and typeorm r=otan a=rafiss fixes #81515 fixes #81431 See individual commits. 81558: sql,stmtdiagnostics: remove some version gates r=yuzefovich a=yuzefovich This commit addresses several TODOs with my name on them about removing the version gates, mostly around the conditional statement diagnostics introduced in 22.1 cycle. Release note: None 81579: roachtest: allow TC_BUILDTYPE_ID to be accessible by Docker r=rickystewart a=renatolabs In #81103, the process of generating TeamCity links in test failure reports started relying on the `TC_BUILDTYPE_ID` environment variable. While that variable was added to TeamCity builds, it was not being passed down to Docker where the tests actually run. As a result, links generated by the GitHub poster were broken (see, for example, #81572). This commit makes `TC_BUILDTYPE_ID` accessible by Docker for every build that was already passing `TC_BUILD_BRANCH`. This should be sufficient to cover all existing cases and more, in case having access to this variable becomes useful in the future. Release note: None Co-authored-by: Jackson Owens <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Renato Costa <[email protected]>
In #81389, I originally moved where Pebble's disk-health checking is layered onto the filesystem. Disk-health checking monitors for excessively slow writes, crashing the process if writes appear to be stalled in the kernel. The changes in #81389 caused disk-health checking to be used in
TestExternalDistinct
.TestExternalDistinct
began to fail with leaked goroutines in the disk-health checking. I dug into it a bit, and traced it to adiskQueue
that is neverClose
d, causing itswriteFile
to never be closed.For now I should be able to refactor #81389 to continue omitting disk-health checking in this code path and unblock the vendor bump, so this isn't a blocker for me.
Jira issue: CRDB-15202
The text was updated successfully, but these errors were encountered: