-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: add metrics to schema changer #54855
sql: add metrics to schema changer #54855
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up.
Reviewed 6 of 6 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schema_changer.go, line 557 at r1 (raw file):
// no-op, as we'll fail to find the job for our mutation in the jobs registry. func (sc *SchemaChanger) exec(ctx context.Context) error { sc.metrics.RunningSchemaChanges.Inc(1)
I think this should wrap the entirety of Resume
(in which case we'd want a separate metric for ongoing OnFailOrCancel
rollbacks, I think).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/sql/schema_changer.go, line 557 at r1 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
I think this should wrap the entirety of
Resume
(in which case we'd want a separate metric for ongoingOnFailOrCancel
rollbacks, I think).
Fair enough. I wanted to add a gauge on schema changes in retry backoff but was annoyed by what it would take to test it so I removed it. I was trying to make those two states different. I suppose it might make more sense to do what you're suggesting and then additionally have a gauge to deal with the retry case. I'll add a testing knob to make testing that reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schema_changer.go, line 557 at r1 (raw file):
Previously, ajwerner wrote…
Fair enough. I wanted to add a gauge on schema changes in retry backoff but was annoyed by what it would take to test it so I removed it. I was trying to make those two states different. I suppose it might make more sense to do what you're suggesting and then additionally have a gauge to deal with the retry case. I'll add a testing knob to make testing that reasonable.
I was more concerned with getting all the invocations of exec()
in the resumer being represented as a single logical schema change in the metrics. But I guess it really actually depends on what we want to measure. Do we want DROP DATABASE CASCADE to be a single schema change or not? I find it hard to predict exactly what metrics would be most useful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schema_changer.go, line 557 at r1 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
I was more concerned with getting all the invocations of
exec()
in the resumer being represented as a single logical schema change in the metrics. But I guess it really actually depends on what we want to measure. Do we want DROP DATABASE CASCADE to be a single schema change or not? I find it hard to predict exactly what metrics would be most useful here.
Maybe the more comprehensive solution is that the job callbacks get per-job-type gauges at the job registry level (or something), and we keep schema-specific gauges for individual calls to exec()
, so it would basically be a per-descriptor gauge. I'm actually fine with this PR as is.
816255e
to
698edec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/sql/schema_changer.go, line 557 at r1 (raw file):
Previously, lucy-zhang (Lucy Zhang) wrote…
Maybe the more comprehensive solution is that the job callbacks get per-job-type gauges at the job registry level (or something), and we keep schema-specific gauges for individual calls to
exec()
, so it would basically be a per-descriptor gauge. I'm actually fine with this PR as is.
I've added #55438 to deal with general metrics around jobs which makes this more sane. In light of that, I'll keep this as is.
0d1a2dc
to
4e247f1
Compare
It's far from exhaustive and hardly tested but something is better than nothing. Touches cockroachdb#52078 Release note (general change): Added some metrics surrounding schema changes.
4e247f1
to
99e74fe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm
Reviewed 6 of 6 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained
Build succeeded: |
It's far from exhaustive and hardly tested but something is better than
nothing.
Touches #52078
Release note (general change): Added some metrics surrounding schema changes.