-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
colmem: improve memory accounting when memory limit is exceeded #86357
Conversation
30269ce
to
984487b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT of pushing this down into BoundAccount
and BytesMonitor
? Seems like there could be other cases where we're accounting after-the-fact and would want to both throw an error and keep track of the overflow.
Reviewed 1 of 10 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)
Hm, I'm quite certain there are other cases like this, but that would be too large of a refactor and would make the API of Currently, the contract is such that you're supposed to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think it would mean expanding the contract a bit to include the possibility of debt.
It might be as simple as a change to BytesMonitor.reserveBytes
to include the possibility of BytesMonitor.mu.curAllocated > BytesMonitor.limit
.
I think we could keep the refactor small, by only doing it with new methods and/or new types and leaving the existing code alone. I.e. keep BoundAccount.Grow
the same, and add BoundAccount.GrowWithOverdraft
(or some other better name) that can put the account / monitor into debt. (Or maybe add a new type BoundOverdraftAccount
or BoundCreditCard
that has its own Grow
if a type change would be simpler than a method change.) Something like this:
diff --git a/pkg/util/mon/bytes_usage.go b/pkg/util/mon/bytes_usage.go
index 4d2f9246eb..f328ef3165 100644
--- a/pkg/util/mon/bytes_usage.go
+++ b/pkg/util/mon/bytes_usage.go
@@ -710,6 +710,25 @@ func (b *BoundAccount) Grow(ctx context.Context, x int64) error {
return nil
}
+func (b *BoundAccount) GrowWithOverdraft(ctx context.Context, x int64) error {
+ if b == nil {
+ return nil
+ }
+ if b.Mu != nil {
+ b.Mu.Lock()
+ defer b.Mu.Unlock()
+ }
+ var err error
+ if b.reserved < x {
+ minExtra := b.mon.roundSize(x - b.reserved)
+ err = b.mon.reserveBytes(ctx, minExtra, true)
+ b.reserved += minExtra
+ }
+ b.reserved -= x
+ b.used += x
+ return err
+}
+
// Shrink releases part of the cumulated allocations by the specified size.
//
// If Mu is set, it is safe for use by concurrent goroutines.
@@ -738,7 +757,7 @@ func (b *BoundAccount) Shrink(ctx context.Context, delta int64) {
// reserveBytes declares an allocation to this monitor. An error is returned if
// the allocation is denied.
// x must be a multiple of `poolAllocationSize`.
-func (mm *BytesMonitor) reserveBytes(ctx context.Context, x int64) error {
+func (mm *BytesMonitor) reserveBytes(ctx context.Context, x int64, overdraft bool) error {
mm.mu.Lock()
defer mm.mu.Unlock()
// Check the local limit first. NB: The condition is written in this manner
@@ -747,10 +766,14 @@ func (mm *BytesMonitor) reserveBytes(ctx context.Context, x int64) error {
//
// TODO(knz): make the monitor name reportable in telemetry, after checking
// that the name is never constructed from user data.
+ var err error
if mm.mu.curAllocated > mm.limit-x {
- return errors.Wrapf(
+ err = errors.Wrapf(
mm.resource.NewBudgetExceededError(x, mm.mu.curAllocated, mm.limit), "%s", mm.name,
)
+ if !overdraft {
+ return err
+ }
}
// Check whether we need to request an increase of our budget.
if mm.mu.curAllocated > mm.mu.curBudget.used+mm.reserved.used-x {
@@ -786,7 +809,7 @@ func (mm *BytesMonitor) reserveBytes(ctx context.Context, x int64) error {
log.Infof(ctx, "%s: now at %d bytes (+%d) - %s",
mm.nameWithPointer, mm.mu.curAllocated, x, util.GetSmallTrace(3))
}
- return nil
+ return err
}
// releaseBytes releases bytes previously successfully registered via
Is that too scary? Maybe I'm missing something? I don't want to block your work, but it also seems a little silly to allocate another monitor and account for something that the original monitor and account could track without very many changes. But maybe I'm minimizing it or glossing over some complexity?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another possibility to consider would be to add a method to the BufferingInMemoryOperator
interface that releases the memory reserved in the limited account and accounts for all memory used by the in-memory operator in an unlimited disk-spiller account, since the disk-spiller should "know" how long the in-memory operator's allocations will live.
If we want to stick with this approach though, the changes
Reviewed 10 of 10 files at r1, 42 of 42 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/sort.go
line 341 at r3 (raw file):
// Finalize the accounting in case the capacity of the new slice is // larger than we asked for. extraCap := cap(p.order) - len(p.order)
Is it possible for the capacity to be larger than the length requested with make
?
This commit improves the memory accounting when memory limit is exceeded. Previously, in several operators we could run into a situation where we perform some allocations and run into a memory limit error later, which results in those allocations being unaccounted for. In some cases this is acceptable (when the query results in an error), but in others the memory error is caught and spilling to disk occurs. In the latter scenarios we would under-account, and this commit fixes most of such situations. Now, each disk-spilling operator instantiates a "limited" allocator that will grow an unlimited memory account when a memory error is encountered. The idea is that even though the denied allocations cannot be registered with the main memory account (meaning the operator has exceeded its memory limit), we still will draw from the `--max-sql-memory` pool since the allocations can be live for non-trivial amount of time. If an error occurs when growing the unlimited memory account, then that error is returned (not the original memory error) so that the disk spiller doesn't catch it. This commit audits all operators in `execplan` to use the limited allocator where appropriate. The new accounting method is only used in a handful of places which cover most of the use cases. The goal is to make this commit backportable whereas the follow-up commit will audit usages of `AdjustMemoryUsage` and will not be backported. Release justification: bug fix. Release note: None
This commit audits all callers of `Allocator.AdjustMemoryUsage` to use the newly-exported `AdjustMemoryUsageAfterAllocation` where applicable (meaning that if an allocation occurs before the method is called, then the new method is now used). In many cases this won't result in a change in the behavior since the allocators are not instantiated with limited memory accounts, but in some cases it is still useful. Release justification: bug fix. Release note: None
(@yuzefovich to be clear, I can be argued out of this idea, if you don't want to do it. I just wanted to flesh it out a bit.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be as simple as a change to
BytesMonitor.reserveBytes
to include the possibility ofBytesMonitor.mu.curAllocated > BytesMonitor.limit
.
Hm, it's a good point. It's a bit difficult to reason about whether anything will break because of it (e.g. are all of the bytes correctly released when the account is closed if the overdraft occurred).
I started going down this route, and TestAdjustMemoryUsage
(that is being added in this PR) does expose a problem - allowing the monitor going into debt without making some adjustments to the memory account that is putting the monitor in debt breaks the synchronization between the two. In other words, we can now arrive at a situation where the account thinks that it is using 64MiB but it has consumed more from the monitor, so when the account is closed, not all of the bytes are correctly released, so we have a "memory accounting leak". If we do make the account update its used
field even when the monitor returns an error, then we end up in a problematic situation when the root monitor returns the error. We would have to examine the error message and update the limited account only if the error message comes from the limited monitor, and not the unlimited one.
So after all I think this idea makes the implementation more complicated and more error-prone, but offers somewhat cleaner API. I'm inclined to keep my original approach since it seems safer and thus can be backported.
Another possibility to consider would be to add a method to the
BufferingInMemoryOperator
interface that releases the memory reserved in the limited account and accounts for all memory used by the in-memory operator in an unlimited disk-spiller account, since the disk-spiller should "know" how long the in-memory operator's allocations will live.
That would only solve the problem this PR attempts to solve if the disk-backed operator also recomputed the size of all live allocations. In other words, if the limited memory account under-accounts, then it would not matter if we transfer that under-accounted reservation to the unlimited account of the disk-backed operator. We would need to ask the disk-backed op to calculate the precise footprint of all allocations it is taking over and clear the limited account.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball)
pkg/sql/colexec/sort.go
line 341 at r3 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Is it possible for the capacity to be larger than the length requested with
make
?
Yeah, you're right - it shouldn't be possible, removed this commit. My mind was in the "append-only" mode when I wrote that commit :)
I see two thumbs-ups, so I'm assuming we're all ok with this change. TFTRs! bors r+ |
Build failed (retrying...): |
Build failed (retrying...): |
Build succeeded: |
colmem: improve memory accounting when memory limit is exceeded
This commit improves the memory accounting when memory limit is
exceeded. Previously, in several operators we could run into a situation
where we perform some allocations and run into a memory limit error
later, which results in those allocations being unaccounted for. In some
cases this is acceptable (when the query results in an error), but in
others the memory error is caught and spilling to disk occurs. In the
latter scenarios we would under-account, and this commit fixes most of
such situations.
Now, each disk-spilling operator instantiates a "limited" allocator that
will grow an unlimited memory account when a memory error is
encountered. The idea is that even though the denied allocations cannot
be registered with the main memory account (meaning the operator has
exceeded its memory limit), we still will draw from the
--max-sql-memory
pool since the allocations can be live fornon-trivial amount of time. If an error occurs when growing the
unlimited memory account, then that error is returned (not the original
memory error) so that the disk spiller doesn't catch it.
This commit audits all operators in
execplan
to use the limitedallocator where appropriate. The new accounting method is only used in
a handful of places which cover most of the use cases. The goal is to
make this commit backportable whereas the follow-up commit will audit
usages of
AdjustMemoryUsage
and will not be backported.Addresses: #64906.
Fixes: #86351.
Addresses: https://github.com/cockroachlabs/support/issues/1762.
Release justification: bug fix.
Release note: None
colmem: audit callers of AdjustMemoryUsage
This commit audits all callers of
Allocator.AdjustMemoryUsage
to usethe newly-exported
AdjustMemoryUsageAfterAllocation
where applicable(meaning that if an allocation occurs before the method is called, then
the new method is now used). In many cases this won't result in a change
in the behavior since the allocators are not instantiated with limited
memory accounts, but in some cases it is still useful.
Release justification: bug fix.
Release note: None