sql: preliminary mechanism to track and limit SQL memory usage. #8691
This PR does not yet implement a flexible policy as documented by #7657; this is because I'd like to avoid making this PR larger than it already is. I will address the remaining points (policy and tracking write intents) after we reach consensus on the approach here.
Such modernity much accountability https://www.youtube.com/watch?v=n4RjJKxsamQ One thought about what we were talking about - how to get the statement that failed an allocation back to the user: how about you have the Another incoherent thought I had is to maybe think how all this fits together with the reservation mechanism Bram has done for receiving snapshots. <3 Review status: 0 of 44 files reviewed at latest revision, 29 unresolved discussions, some commit checks failed. cli/start.go, line 109 [r1] (raw file):
nit: this change seems superfluous to me server/admin.go, line 178 [r1] (raw file):
thanks, Golang sql/distinct.go, line 174 [r1] (raw file):
so sql/executor.go, line 134 [r1] (raw file):
nit: assign a dummy ResultColumns to that interface to assert that it implements it sql/plan.go, line 176 [r1] (raw file):
what's this "suggests" phrasing? :) sql/planner.go, line 228 [r1] (raw file):
I'm confused. How does this work with the sql/trace.go, line 66 [r1] (raw file):
you're adding a distinction between using a context for logging and using it for other stuff, but we're still just passing a single context to the sql/mon/mem_usage.go, line 34 [r1] (raw file):
I think this struct deserves a comment about how it's intended to be used. There's gonna be one of these per query? It's not thread safe (that might be a problem for some of the distsql nodes)? Can they be reused after sql/mon/mem_usage.go, line 34 [r1] (raw file):
May I suggest a slightly different API for this guy? I think sql/mon/mem_usage.go, line 39 [r1] (raw file):
say that all these are in bytes, or even add it to the names sql/mon/mem_usage.go, line 50 [r1] (raw file):
ummm so we should generally try to not store contexts in structs. Instead, operations that need one (or, ideally, pretty much every single function) should take a context from the caller, representing the operation that the caller is in the process of doing at that moment. sql/mon/mem_usage.go, line 60 [r1] (raw file):
sql/mon/mem_usage.go, line 88 [r1] (raw file):
sql/mon/mem_usage.go, line 108 [r1] (raw file):
panic? Say that sql/mon/mem_usage.go, line 131 [r1] (raw file):
please document. Is this really needed? Can you use sql/mon/mem_usage_test.go, line 30 [r1] (raw file):
I don't see why this is needed. The test doesn't seem to look at the logs? (and if it did, maybe we can find another option :) ) sql/parser/row_container.go, line 17 [r1] (raw file):
does this need to be in sql/parser/row_container.go, line 28 [r1] (raw file):
how about sql/parser/row_container.go, line 33 [r1] (raw file):
I think the word "managed" doesn't mean anything, and generally it'd be good if this comment was expanded a little since I guess this type is gonna become pretty important. sql/parser/row_container.go, line 34 [r1] (raw file):
I think there's a structure that at least superficially serves the same purpose of storing a bunch of nodes, maybe with some preallocations, in sql/parser/row_container.go, line 46 [r1] (raw file):
put the unit in the name? sql/parser/row_container.go, line 50 [r1] (raw file):
please document the sql/parser/row_container.go, line 68 [r1] (raw file):
ha? Despite the name, this doesn't take into account variable vs non-variable cols, or actually the type of the columns at all. sql/parser/row_container.go, line 74 [r1] (raw file):
are sql/parser/row_container.go, line 76 [r1] (raw file):
This seems weird to me. Is it particularly convenient to support sql/parser/row_container.go, line 98 [r1] (raw file):
maybe we can interact with the sql/parser/row_container.go, line 125 [r1] (raw file):
but you can't sql/parser/row_container.go, line 138 [r1] (raw file):
"inquire". We're not traveling on dual carriageways over here. sql/pgwire/v3.go, line 712 [r1] (raw file):
comment that this destroys the results
Nice stuff! I still have to go through most of the change in detail but I like the core idea of the row container. We should think in very general terms about how we will be able to use the same infrastructure for Review status: 0 of 44 files reviewed at latest revision, 32 unresolved discussions, some commit checks failed. sql/parser/row_container.go, line 28 [r1] (raw file):
Do we think we will have multiple implementations for this interface? sql/parser/row_container.go, line 39 [r1] (raw file):
it's the columns that are fixed not the row, maybe fixedColsSize sql/parser/row_container.go, line 50 [r1] (raw file):
sql/parser/row_container.go, line 98 [r1] (raw file):
I'll talk with Bram for reservations and with you both for various optimizations. In the meantime PTAL. Review status: 0 of 42 files reviewed at latest revision, 32 unresolved discussions, some commit checks pending. cli/start.go, line 109 [r1] (raw file):
TFYR btw ❤️
but I'm trying again with suggestions about changing the API of the monitor to something that makes the lifetime of each allocation more explicit. Review status: 0 of 42 files reviewed at latest revision, 18 unresolved discussions, some commit checks failed. server/admin.go, line 178 [r1] (raw file):
PTAL. So thanks to Andrei's previous remarks I saw the light and refactored the entire thing to use span objects as interface between client components and the monitor. This makes the lifecycle of allocated objects more explicit. Also I had to think more about how state is preserved in the monitor. I came to the conclusion/realization that a monitor is session-bound, not statement-bound. So I changed this too; the monitor is started and stopped with the lifecycle of the entire session. Once this was done I could then successfully instrument the pgwire code to get prepared statements and portals also tracked by the monitor. Please check I didn't screw up anything there. Then finally, before this goes in I'd like another round of advice as to how to best gather and log the allocation history in a session. To debug this code I have made a first attempt (currently commented out in the code with a "FIXME: andrei?") which logs all events to a Review status: 0 of 42 files reviewed at latest revision, 12 unresolved discussions, some commit checks failed. sql/distinct.go, line 174 [r1] (raw file):
package sql

import "github.com/cockroachdb/cockroach/sql/mon"
@andreimatei the comment below is for you. I think I got the best I could as long as #7775 is not fixed.
Review status: 25 of 51 files reviewed at latest revision, 25 unresolved discussions, some commit checks pending. sql/mon/account.go, line 91 [r6] (raw file):
We should panic if we don't have
Review status: 25 of 51 files reviewed at latest revision, 23 unresolved discussions, some commit checks failed. sql/mon/account.go, line 102 [r7] (raw file):
We don't need this here, it's exactly what (I was suggesting something else - checking that
This patch instruments SQL execution to track memory allocations whose amounts depend on user input and current database contents (as opposed to allocations dependent on other parameters, e.g. statement size).

It does so by introducing a new MemoryUsageMonitor object intended to be instantiated once per session. It is set up and torn down with the lifecycle of a session. Components can then link and report their memory usage to the monitor via MemoryAccount objects. Accounts can gate incremental allocations and keep track of the cumulative allocations so that all can be released at once.

This is used to track and limit allocations:
- in `valueNode`,
- buckets in `groupNode`,
- sorted data in `sortNode`,
- temp results in `joinNode`,
- seen prefixes and suffixes in `distinctNode`,
- window and partition slices in `windowNode`,
- the Result arrays in Executor,
- prepared statements and prepared portals in pgwire.

A moderate amount of intelligence is implemented so as to avoid computing sizes for every column of every row in a valueNode: the combined size of all fixed-size columns is precomputed and counted at once for every row.

This patch does not track the memory allocated for write intents in the KV transaction object.

For troubleshooting the following mechanisms are provided:
- an env var COCKROACH_NOTEWORTHY_MEMORY_USAGE, if set, defines the minimum total allocated size for a monitor before it starts logging how much memory it is consuming.
- detailed allocation progress is logged at level 2 or above. To trace just SQL activity and memory allocation, one can use for example `--vmodule=executor=2,mem_usage=2`.
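The monitor/account relationship described in this commit message can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: the names `Monitor`, `Account`, `OpenAccount`, `Grow`, `Close` and `ErrBudgetExceeded` are hypothetical stand-ins, the budget is an arbitrary number, and the sketch is not safe for concurrent use.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBudgetExceeded is a hypothetical error for a refused reservation.
var ErrBudgetExceeded = errors.New("memory budget exceeded")

// Monitor is a simplified stand-in for the per-session monitor.
type Monitor struct {
	curBytes int64 // bytes currently reserved across all accounts
	maxBytes int64 // budget for the whole session
}

// Account tracks the cumulative bytes reserved by one component
// (for example one sort or one group-by) against its monitor.
type Account struct {
	mon      *Monitor
	curBytes int64
}

// OpenAccount links a new, empty account to the monitor.
func (m *Monitor) OpenAccount() *Account { return &Account{mon: m} }

// Grow gates an incremental allocation of n bytes (n is assumed to be
// non-negative): it either records the reservation or refuses it.
func (a *Account) Grow(n int64) error {
	if a.mon.curBytes+n > a.mon.maxBytes {
		return ErrBudgetExceeded
	}
	a.mon.curBytes += n
	a.curBytes += n
	return nil
}

// Close releases everything the account has reserved, all at once.
func (a *Account) Close() {
	a.mon.curBytes -= a.curBytes
	a.curBytes = 0
}

func main() {
	mon := &Monitor{maxBytes: 100}
	sortAcc, groupAcc := mon.OpenAccount(), mon.OpenAccount()

	fmt.Println(sortAcc.Grow(60))  // <nil>
	fmt.Println(groupAcc.Grow(60)) // memory budget exceeded
	sortAcc.Close()                // the sort is done; its 60 bytes are released
	fmt.Println(groupAcc.Grow(60)) // <nil>
}
```

Keeping a running total per account is what lets a component release its whole reservation with a single call when it finishes, instead of matching every individual allocation with a release.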
PTAL, and please review the 3 commits separately. Review status: 18 of 51 files reviewed at latest revision, 23 unresolved discussions.
Every access to a MemoryAccount struct needs a logging context obtained from the session. Simplify the code by introducing Session wrapper methods to provide this context reference. Idea courtesy of @petermattis.
Since all uses of MemoryAccount go through the Session object (because they need the Session's logging context), and since the Session object hosts the monitor object, have Session provide the monitor pointer to the MemoryAccount methods. This saves a copy of the monitor pointer in every account instantiated.
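A rough sketch of the idea in the two commit messages above: the account holds only its own total, and the session supplies both the logging context and the monitor on every call. All names here (`Session`, `NewSession`, `GrowAccount`, `CloseAccount`, `grow`, `release`) are hypothetical and chosen for illustration; storing the context in the session struct is an assumption of this sketch, not a statement about the patch.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

var errBudget = errors.New("memory budget exceeded")

// Monitor is a simplified per-session monitor.
type Monitor struct {
	curBytes int64
	maxBytes int64
}

// Account keeps only its running total: no monitor pointer, no context.
type Account struct{ curBytes int64 }

// grow reserves n bytes for acc. In real code ctx would carry the logging
// tags; this sketch does not log, so ctx is accepted but unused.
func (m *Monitor) grow(ctx context.Context, acc *Account, n int64) error {
	if m.curBytes+n > m.maxBytes {
		return errBudget
	}
	m.curBytes += n
	acc.curBytes += n
	return nil
}

// release returns everything acc has reserved to the monitor.
func (m *Monitor) release(ctx context.Context, acc *Account) {
	m.curBytes -= acc.curBytes
	acc.curBytes = 0
}

// Session hosts the monitor and the logging context (assumed layout).
type Session struct {
	ctx context.Context
	mon Monitor
}

// NewSession builds a session whose monitor has the given budget.
func NewSession(maxBytes int64) *Session {
	return &Session{ctx: context.Background(), mon: Monitor{maxBytes: maxBytes}}
}

// The wrappers spare callers from threading ctx and the monitor by hand.
func (s *Session) GrowAccount(acc *Account, n int64) error {
	return s.mon.grow(s.ctx, acc, n)
}

func (s *Session) CloseAccount(acc *Account) { s.mon.release(s.ctx, acc) }

func main() {
	s := NewSession(1 << 10)
	var acc Account
	fmt.Println(s.GrowAccount(&acc, 512))  // <nil>
	fmt.Println(s.GrowAccount(&acc, 1024)) // memory budget exceeded
	s.CloseAccount(&acc)
	fmt.Println(s.mon.curBytes) // 0
}
```

The trade-off debated in the review is visible here: the account shrinks to a single integer, but every use of it has to go through the session.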
Ok, you know what: since I have the other important PR lined up which requires this change, I will go ahead and merge this. You will notice that I did my homework and made the change you requested to the MemoryAccount API in the first, main "large" commit, then changed back again to the simplified form in a small, well-contained commit 1e04a49 that can be easily reverted without removing the main feature. So if someone believes that the code is better without that latter commit, feel free to revert just that one and submit the result. I'll even be happy to review it.
It is evident from the discussion that three of us (Andrei, Nate, and myself) think the code is better without that commit so I'm not sure why you included it. Also, without that commit, the wrappers in
In fact, I think
My initial implementation had the contexts as well, but Andrei convinced me these should be passed at every call instead.
The wrappers are not one-liners any more, or at least not in the way they were initially. The new API preserves the desired look and feel at the points of use.
This patch instruments SQL execution to track memory allocations whose
amounts depend on user input and current database contents (as opposed
to allocations dependent on other parameters e.g. statement size).

This tracks and limits allocations in `valueNode`, buckets in `groupNode`, sorted data in `sortNode`, temp results in `joinNode`, seen prefixes in `distinctNode` and the Result array in Executor.

A moderate amount of intelligence is implemented so as to avoid
computing sizes for every column of every row in a valueNode - the
combined size of all fixed-size columns is precomputed and counted at
once for every row.
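
The precomputation described in this paragraph could look roughly like the sketch below. It is an illustration under assumptions: `Column`, `Datum`, `RowSizer` and the byte sizes are invented for the example (the field name `fixedColsSize` follows a reviewer's suggestion earlier in this thread), and real datum sizes would include per-value overhead that is ignored here.

```go
package main

import "fmt"

// Column describes one column; FixedSize is its size in bytes for
// fixed-width types and 0 for variable-width ones (assumed convention).
type Column struct {
	Name      string
	FixedSize int64
}

// Datum is a hypothetical value; Bytes holds the payload of
// variable-size values and is ignored for fixed-size columns.
type Datum struct{ Bytes []byte }

// RowSizer caches what can be known about row sizes ahead of time.
type RowSizer struct {
	fixedColsSize int64 // combined size of all fixed-size columns
	varCols       []int // indices of the variable-size columns
}

// NewRowSizer walks the column list once, at setup time.
func NewRowSizer(cols []Column) *RowSizer {
	rs := &RowSizer{}
	for i, c := range cols {
		if c.FixedSize > 0 {
			rs.fixedColsSize += c.FixedSize
		} else {
			rs.varCols = append(rs.varCols, i)
		}
	}
	return rs
}

// RowSize adds the precomputed fixed part in one step and inspects only
// the variable-size columns of the row.
func (rs *RowSizer) RowSize(row []Datum) int64 {
	size := rs.fixedColsSize
	for _, i := range rs.varCols {
		size += int64(len(row[i].Bytes))
	}
	return size
}

func main() {
	cols := []Column{{"id", 8}, {"name", 0}, {"score", 8}}
	rs := NewRowSizer(cols)
	row := []Datum{{}, {Bytes: []byte("hello")}, {}}
	fmt.Println(rs.RowSize(row)) // 21
}
```

For a table with only fixed-size columns, `RowSize` does no per-column work at all, which is the saving the paragraph is after.
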
The maximum amount of memory allocated by SQL values is limited in
this patch to 1GB. An error is reported to the client if this limit is
exceeded. Additionally, queries that allocate more than 10KB worth of
data will see their usage printed in the log files automatically.
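
As an illustration of the two thresholds in this paragraph, the sketch below hard-codes a 1GB limit and a 10KB logging threshold. The type `usage` and its methods are hypothetical, and the exact logging behavior (one log line per reservation above the threshold) is an assumption of this sketch rather than something stated in the patch.

```go
package main

import (
	"fmt"
	"log"
)

const (
	maxAllocatedBytes = 1 << 30  // 1GB limit from the commit message
	noteworthyBytes   = 10 << 10 // 10KB logging threshold
)

// usage is a hypothetical tracker for the bytes reserved by SQL values.
type usage struct{ curBytes int64 }

// noteworthy reports whether current usage is past the logging threshold.
func (u *usage) noteworthy() bool { return u.curBytes > noteworthyBytes }

// reserve records n more bytes, or returns an error (to be reported to
// the client) if the limit would be exceeded.
func (u *usage) reserve(n int64) error {
	if u.curBytes+n > maxAllocatedBytes {
		return fmt.Errorf("memory budget exceeded: %d bytes requested, %d bytes in use",
			n, u.curBytes)
	}
	u.curBytes += n
	if u.noteworthy() {
		log.Printf("memory usage is now %d bytes", u.curBytes)
	}
	return nil
}

func main() {
	var u usage
	fmt.Println(u.reserve(5<<10), u.noteworthy()) // <nil> false
	fmt.Println(u.reserve(6<<10), u.noteworthy()) // <nil> true
	if err := u.reserve(1 << 30); err != nil {
		fmt.Println("error:", err)
	}
}
```
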
This patch does not track the memory allocated for write intents in
the KV transaction object.