Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql, import: add metrics for max_row_size guardrails #69457

Merged
merged 3 commits into from
Aug 28, 2021

Conversation

michae2
Copy link
Collaborator

@michae2 michae2 commented Aug 27, 2021

sql: rename max_row_size guardrails to match transaction row limits

Addresses: #67400

Rename sql.mutations.max_row_size.{log|err} to
sql.guardrails.max_row_size_{log|err} for consistency with
transaction_rows_{read|written}_{log|err} and upcoming metrics.

Release justification: Low-risk update to new functionality.

Release note (ops change): New variables
sql.mutations.max_row_size.{log|err} were renamed to
sql.guardrails.max_row_size_{log|err} for consistency with other
variables and metrics.

sql/row: remove dependency on sql/execinfra

The next commit needs to make sql/row a dependency of sql/execinfra, so
remove two small references to sql/execinfra.

(I will squash this into the next commit before merging.)

Release justification: Low-risk update to new functionality.

Release note: None

sql, import: plumb RowMetrics from ExecutorConfig to row.Helper

Addresses: #67400

Add metrics for sql/row and pass them down from ExecutorConfig and
FlowCtx all the way to row.Helper. Like sql.Metrics, there are two
copies, one for user queries and one for internal queries. (These were
originally part of sql.Metrics, but there are several users of sql/row
that do not operate under a sql.Server or connExecutor so we are forced
to add these to the ExecutorConfig and FlowCtx instead.)

I ran into difficulty passing these through import, as FlowCtx itself
is not plumbed through.

There are only two metrics at first, corresponding to
sql.guardrails.max_row_size_{log|err}.

Release justification: Low-risk update to new functionality.

Release note (ops): Added four new metrics,
sql.guardrails.max_row_size_{log|err}.count{.internal} which are
incremented whenever a large row violates
sql.guardrails.max_row_size_{log|err}.

@michae2 michae2 requested review from rytaft, yuzefovich, a team and shermanCRL and removed request for a team August 27, 2021 07:21
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@shermanCRL shermanCRL removed their request for review August 27, 2021 13:23
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Reviewed 3 of 3 files at r1, 2 of 2 files at r2, 24 of 24 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @yuzefovich)


pkg/sql/exec_util.go, line 2975 at r3 (raw file):

}

func NewRowMetrics(internal bool) row.RowMetrics {

I'd put this in a different package other than sql.


pkg/sql/row/fetcher.go, line 343 at r2 (raw file):

	if memMonitor != nil {
		rf.mon = mon.NewMonitorInheritWithLimit("fetcher-mem", 0 /* limit */, memMonitor)

why are you using a limit of 0 here?

@shermanCRL shermanCRL requested a review from adityamaru August 27, 2021 15:02
Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for looking! 🙂

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @rytaft, and @yuzefovich)


pkg/sql/exec_util.go, line 2975 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'd put this in a different package other than sql.

Unfortunately it has to be in sql to use getMetricMeta without creating a circular dependency.


pkg/sql/row/fetcher.go, line 343 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

why are you using a limit of 0 here?

This is a paste of the body of execinfra.NewMonitor from processorbase.go which uses 0 internally. I think the de facto limit comes from the parent monitor (or an ancestor of the parent), in this case memMonitor.

@michae2
Copy link
Collaborator Author

michae2 commented Aug 27, 2021

@adityamaru I think you were tagged because of the third commit. It makes some changes to the sql/row package which then required plumbing some information through various parts of bulk IO. I was struggling with this when I saw your comment on plumbing FlowCtx and I definitely agree as this would make it much easier to pass these metrics down through import! 😄 Maybe later today I will take a shot at it, if that's alright with you.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: although it would be good to get approval from at least one other person who knows this code better than I do.

Reviewed 2 of 27 files at r4, 1 of 2 files at r5, 24 of 24 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru and @yuzefovich)


pkg/sql/exec_util.go, line 2975 at r3 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Unfortunately it has to be in sql to use getMetricMeta without creating a circular dependency.

Ah ok, makes sense.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 3 files at r1, 2 of 27 files at r4, 1 of 2 files at r5, 24 of 24 files at r6, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @adityamaru and @michae2)


pkg/sql/execinfra/server_config.go, line 293 at r6 (raw file):

// GetRowMetrics returns the proper RowMetrics for either internal or user
// queries.
func (flowCtx *FlowCtx) GetRowMetrics(internal bool) *row.Metrics {

nit: I thinkFlowCtx has EvalContext inside, so maybe we could peek into the session data and remove the need for an argument here?


pkg/sql/row/fetcher.go, line 343 at r2 (raw file):

Previously, michae2 (Michael Erickson) wrote…

This is a paste of the body of execinfra.NewMonitor from processorbase.go which uses 0 internally. I think the de facto limit comes from the parent monitor (or an ancestor of the parent), in this case memMonitor.

Yep. Just to expand on Michael's comment: passing 0 as the limit we're only making the new monitor "locally unlimited" - any reservations still need to be approved by the parent (which might also not have a local limit and will ask its parent, etc, eventually the ask will be against rootSQLMemoryMonitor).


pkg/sql/row/metrics.go, line 16 at r6 (raw file):

var (
	// Metadata for sql.guardrails.max_row_size_log.count{.internal}

nit: missing period.

@michae2 michae2 force-pushed the max_row_size_metrics branch from a7f7ce3 to ea57be2 Compare August 27, 2021 20:52
Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @adityamaru, @rytaft, and @yuzefovich)


pkg/sql/execinfra/server_config.go, line 293 at r6 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I thinkFlowCtx has EvalContext inside, so maybe we could peek into the session data and remove the need for an argument here?

Oh! Good point! Done!

(It looks like the EvalContext inside FlowCtx is often copied by operators using NewEvalCtx but that's fine, we can still use FlowCtx.EvalContext because SessionData().Internal will always be the same in both copies.)


pkg/sql/row/metrics.go, line 16 at r6 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: missing period.

Done.

@michae2
Copy link
Collaborator Author

michae2 commented Aug 27, 2021

TFTRs! Failure looks like an unrelated flake. I'd like to merge this before the 21.2 branch is cut (and before I go on vacation 😎) so I think I'll give it a shot.

bors r=rytaft,yuzefovich

@michae2
Copy link
Collaborator Author

michae2 commented Aug 28, 2021

Looks like bors got stuck.

bors r-

@craig
Copy link
Contributor

craig bot commented Aug 28, 2021

Canceled.

@michae2 michae2 force-pushed the max_row_size_metrics branch from ea57be2 to 075ff71 Compare August 28, 2021 04:08
@michae2
Copy link
Collaborator Author

michae2 commented Aug 28, 2021

Let's try this again! 🎲

bors r=rytaft,yuzefovich

@craig
Copy link
Contributor

craig bot commented Aug 28, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Aug 28, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Aug 28, 2021

Build failed:

@michae2
Copy link
Collaborator Author

michae2 commented Aug 28, 2021

bors r-

Hmm.

@maryliag
Copy link
Contributor

@michae2 there were some tests failing that we fixed and got merged this morning. I suggest rebase your branch and trying again, maybe it helps :)

Addresses: cockroachdb#67400

Rename sql.mutations.max_row_size.{log|err} to
sql.guardrails.max_row_size_{log|err} for consistency with
transaction_rows_{read|written}_{log|err} and upcoming metrics.

Release justification: Low-risk update to new functionality.

Release note (ops change): New variables
sql.mutations.max_row_size.{log|err} were renamed to
sql.guardrails.max_row_size_{log|err} for consistency with other
variables and metrics.
The next commit needs to make sql/row a dependency of sql/execinfra, so
remove two small references to sql/execinfra.

(I will squash this into the next commit before merging.)

Release justification: Low-risk update to new functionality.

Release note: None
Addresses: cockroachdb#67400

Add metrics for sql/row and pass them down from ExecutorConfig and
FlowCtx all the way to row.Helper. Like sql.Metrics, there are two
copies, one for user queries and one for internal queries. (I wanted to
make these part of sql.Metrics, but there are several users of sql/row
that do not operate under a sql.Server or connExecutor so we are forced
to add row.Metrics to the ExecutorConfig and FlowCtx instead.)

I ran into difficulty passing these through import, as FlowCtx itself
is not plumbed through.

There are only two metrics at first, corresponding to
violations of sql.guardrails.max_row_size_{log|err}.

Release justification: Low-risk update to new functionality.

Release note (ops): Added four new metrics,
sql.guardrails.max_row_size_{log|err}.count{.internal} which are
incremented whenever a large row violates the corresponding
sql.guardrails.max_row_size_{log|err} limit.
@michae2 michae2 force-pushed the max_row_size_metrics branch from 075ff71 to 590514b Compare August 28, 2021 17:32
@michae2
Copy link
Collaborator Author

michae2 commented Aug 28, 2021

Third time's the charm!

bors r=rytaft,yuzefovich

@craig
Copy link
Contributor

craig bot commented Aug 28, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Aug 28, 2021

Build succeeded:

@craig craig bot merged commit 3d20259 into cockroachdb:master Aug 28, 2021
@blathers-crl
Copy link

blathers-crl bot commented Aug 28, 2021

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 38c0a3c to blathers/backport-release-21.1-69457: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 21.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants