Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add metrics to schema changer #54855

Merged
merged 1 commit into from
Oct 19, 2020

Conversation

ajwerner
Copy link
Contributor

It's far from exhaustive and hardly tested but something is better than
nothing.

Touches #52078

Release note (general change): Added some metrics surrounding schema changes.

@ajwerner ajwerner requested a review from thoszhang September 28, 2020 14:01
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up.

Reviewed 6 of 6 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/schema_changer.go, line 557 at r1 (raw file):

// no-op, as we'll fail to find the job for our mutation in the jobs registry.
func (sc *SchemaChanger) exec(ctx context.Context) error {
	sc.metrics.RunningSchemaChanges.Inc(1)

I think this should wrap the entirety of Resume (in which case we'd want a separate metric for ongoing OnFailOrCancel rollbacks, I think).

Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/sql/schema_changer.go, line 557 at r1 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

I think this should wrap the entirety of Resume (in which case we'd want a separate metric for ongoing OnFailOrCancel rollbacks, I think).

Fair enough. I wanted to add a gauge on schema changes in retry backoff but was annoyed by what it would take to test it so I removed it. I was trying to make those two states different. I suppose it might make more sense to do what you're suggesting and then additionally have a gauge to deal with the retry case. I'll add a testing knob to make testing that reasonable.

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/schema_changer.go, line 557 at r1 (raw file):

Previously, ajwerner wrote…

Fair enough. I wanted to add a gauge on schema changes in retry backoff but was annoyed by what it would take to test it so I removed it. I was trying to make those two states different. I suppose it might make more sense to do what you're suggesting and then additionally have a gauge to deal with the retry case. I'll add a testing knob to make testing that reasonable.

I was more concerned with getting all the invocations of exec() in the resumer being represented as a single logical schema change in the metrics. But I guess it really actually depends on what we want to measure. Do we want DROP DATABASE CASCADE to be a single schema change or not? I find it hard to predict exactly what metrics would be most useful here.

Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/schema_changer.go, line 557 at r1 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

I was more concerned with getting all the invocations of exec() in the resumer being represented as a single logical schema change in the metrics. But I guess it really actually depends on what we want to measure. Do we want DROP DATABASE CASCADE to be a single schema change or not? I find it hard to predict exactly what metrics would be most useful here.

Maybe the more comprehensive solution is that the job callbacks get per-job-type gauges at the job registry level (or something), and we keep schema-specific gauges for individual calls to exec(), so it would basically be a per-descriptor gauge. I'm actually fine with this PR as is.

@ajwerner ajwerner force-pushed the ajwerner/schema-changer-metrics branch from 816255e to 698edec Compare October 12, 2020 14:13
Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/sql/schema_changer.go, line 557 at r1 (raw file):

Previously, lucy-zhang (Lucy Zhang) wrote…

Maybe the more comprehensive solution is that the job callbacks get per-job-type gauges at the job registry level (or something), and we keep schema-specific gauges for individual calls to exec(), so it would basically be a per-descriptor gauge. I'm actually fine with this PR as is.

I've added #55438 to deal with general metrics around jobs which makes this more sane. In light of that, I'll keep this as is.

@ajwerner ajwerner force-pushed the ajwerner/schema-changer-metrics branch 2 times, most recently from 0d1a2dc to 4e247f1 Compare October 12, 2020 17:41
It's far from exhaustive and hardly tested but something is better than
nothing.

Touches cockroachdb#52078

Release note (general change): Added some metrics surrounding schema changes.
@ajwerner ajwerner force-pushed the ajwerner/schema-changer-metrics branch from 4e247f1 to 99e74fe Compare October 13, 2020 12:33
Copy link
Contributor

@thoszhang thoszhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

Reviewed 6 of 6 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@craig
Copy link
Contributor

craig bot commented Oct 19, 2020

Build succeeded:

@craig craig bot merged commit 47a5c7e into cockroachdb:master Oct 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants