release-22.1: colexecdisk: make sure to release resources in all cases #81492

yuzefovich · 2022-05-19T00:26:08Z

Backport 1/1 commits from #81419.

/cc @cockroachdb/release

Previously, the external distinct and the external hash aggregator
could fail to release some resources when their execution was
short-circuited (either because of an error or because of a LIMIT on
the query). This happened when a sort was planned on top of the
external operation to restore the desired ordering: the hash-based
partitioner (which abstracts away the disk-backed algorithm) was not
added to the OpWithMetaInfo.ToClose slice because there is a sort on
top of it, and that sort did not close it either.

Here is an example diagram for all the infra that is set up for the
disk-backed distinct when ordering needs to be maintained:

         diskSpillerBase (disk-backed distinct)
           |                      |
UnorderedDistinct     diskSpillerBase [1] (disk-backed sort)
                      |              |                  |
                in-mem sorter   external sorter   hash-based partitioner [2]

In this diagram, hash-based partitioner [2] is the external distinct
that is the input to the diskSpillerBase [1]. In the happy path (when
[2] is exhausted), it is Closed automatically. However, if its
execution is short-circuited, [2] will never be closed because:

- due to the way the infra was created, it was never added to the
  ToClose slice (so it will not be closed on the flow cleanup)
- diskSpillerBase [1] doesn't close its inputs
- neither the external sorter nor the in-memory sorter ends up closing
  [2], because there are other utility operators between the sorters
  and [2] that don't implement the Closer interface.
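
The last reason can be illustrated with a minimal Go sketch. All types and
the closeIfCloser helper below are hypothetical stand-ins, not the actual
colexec code, and the sketch assumes that a parent discovers closable inputs
through a type assertion. Under that assumption, a wrapper that does not
implement Closer hides the closable operator beneath it:

```go
package main

import "fmt"

// Operator and Closer are simplified, hypothetical stand-ins for the
// interfaces used by the vectorized engine.
type Operator interface{ Next() int }

type Closer interface{ Close() error }

// partitioner stands in for hash-based partitioner [2]; it owns disk
// resources that are released only by Close.
type partitioner struct{ closed bool }

func (p *partitioner) Next() int { return 0 }

func (p *partitioner) Close() error {
	p.closed = true
	return nil
}

// utilityOp stands in for a helper operator that wraps its input but does
// not implement Closer.
type utilityOp struct{ input Operator }

func (u *utilityOp) Next() int { return u.input.Next() }

// closeIfCloser mimics a parent that closes its input only when the input
// itself implements Closer (an assumption made for this sketch). It reports
// whether Close was called.
func closeIfCloser(op Operator) bool {
	if c, ok := op.(Closer); ok {
		_ = c.Close()
		return true
	}
	return false
}

func main() {
	p := &partitioner{}
	wrapped := &utilityOp{input: p}

	// The wrapper is not a Closer, so the partitioner underneath stays open.
	fmt.Println(closeIfCloser(wrapped), p.closed) // false false

	// Closing the partitioner directly works as expected.
	fmt.Println(closeIfCloser(p), p.closed) // true true
}
```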

As a result of not closing [2], some disk resources might be leaked.
This commit fixes the issue by making diskSpillerBase close all of its
inputs (which is a single input in the case of the disk-backed distinct
and disk-backed hash aggregator). Close is allowed to be called
multiple times, so it is ok if there happen to be other codepaths
calling it.
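
A minimal Go sketch of the shape of this fix follows. The spiller and
diskResource types are hypothetical and do not reproduce the real
diskSpillerBase; the sketch only shows a parent that closes every input and
an input whose Close is idempotent, so that a second call from another
codepath (for example, flow cleanup) is harmless:

```go
package main

import "fmt"

// Closer is a simplified, hypothetical stand-in for the engine's interface.
type Closer interface{ Close() error }

// diskResource is a hypothetical input that owns disk resources. Its Close
// is idempotent: only the first call releases anything.
type diskResource struct {
	closed   bool
	calls    int // how many times Close was invoked
	released int // how many times resources were actually released
}

func (d *diskResource) Close() error {
	d.calls++
	if d.closed {
		return nil // already closed, nothing left to release
	}
	d.closed = true
	d.released++
	return nil
}

// spiller is a hypothetical stand-in for diskSpillerBase after the fix:
// it closes all of its inputs and returns the first error it sees.
type spiller struct{ inputs []Closer }

func (s *spiller) Close() error {
	var firstErr error
	for _, in := range s.inputs {
		if err := in.Close(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

func main() {
	res := &diskResource{}
	s := &spiller{inputs: []Closer{res}}

	_ = s.Close()   // the spiller closes its input on short-circuit
	_ = res.Close() // another codepath closes the same input again

	fmt.Println(res.calls, res.released) // 2 1
}
```

Returning the first error while still closing the remaining inputs is a
design choice of this sketch, so that one failing Close does not prevent
the other inputs from being released.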

Fixes: #81413.

Release note: None

Release justification: low-risk bug fix.


blathers-crl · 2022-05-19T00:26:10Z

cockroach-teamcity · 2022-05-19T00:26:17Z


michae2

Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)

yuzefovich requested review from michae2 and cucaroach May 19, 2022 00:26

michae2 approved these changes May 19, 2022


yuzefovich merged commit 019ec14 into cockroachdb:release-22.1 May 19, 2022

yuzefovich deleted the backport22.1-81419 branch May 19, 2022 15:41
