Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachprod, roachtest: add grafana annotations API #120019

Merged
merged 3 commits into from
Mar 17, 2024

Conversation

DarrylWong
Copy link
Contributor

@DarrylWong DarrylWong commented Mar 6, 2024

This change adds the ability to create Grafana annotations from both roachprod and roachtests. The API handles authentication as well as creating an HTTP client and request. See individual commits for more details.

Release note: none
Epic: none
Fixes: #119821

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@DarrylWong
Copy link
Contributor Author

DarrylWong commented Mar 6, 2024

Last commit can be dropped if we don't think the annotations are helpful there. It was more of a proof of concept + way to test that it actually works.

image

Note: if you want to play around with it yourself, I've only added annotation queries to the sql dashboard so far.


I played around with using clusterID instead of the runID and it worked, but found an edge case where if we are passing in an already existing cluster (using roachtest --cluster), the clusterID will be reused for potentially the same test.

I also found another edge case: if we run the same test multiple times using --count. In that case, we have to know the testRunID (test name + run #) instead of just the test name. I don't see a way to get around this besides exposing the testRunID as well.

We could just not support these edge cases, but between:

  1. exposing run_id + testRunID to cluster
  2. having to change all the grafana dashboards + not supporting these two edge cases

The former seems like the less bad solution to me. So I opted to go with that, but I'm open to other's opinions though. For what it's worth, instead of directly passing run_id and testRunID as fields in cluster, I passed them via a grafanaTags field, to make it more generic + obvious this isn't something to be used outside of that context.

@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch 2 times, most recently from 2e20911 to a3c8642 Compare March 6, 2024 22:00
@DarrylWong DarrylWong marked this pull request as ready for review March 6, 2024 22:04
@DarrylWong DarrylWong requested review from a team as code owners March 6, 2024 22:04
@DarrylWong DarrylWong requested review from herkolategan and srosenberg and removed request for a team March 6, 2024 22:04
@@ -242,6 +243,13 @@ func (tr *testRunner) runSingleStep(ctx context.Context, ss *singleStep, l *logg
defer func() {
prefix := fmt.Sprintf("FINISHED [%s]", timeutil.Since(start))
tr.logStep(prefix, ss, l)
annotation := fmt.Sprintf("(%d): %s", ss.ID, ss.impl.Description())
_, err := tr.cluster.AddGrafanaAnnotation(tr.ctx, tr.logger, false /*internal*/, grafana.AddAnnotationRequest{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Do you have a screenshot of how this looks for the whole (test) run?

Copy link
Contributor Author

@DarrylWong DarrylWong Mar 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what follower-reads/mixed-version/single-region looks like.

image

Not sure how soon the data gets GC, but the grafana link in case you see this before it does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how soon the data gets GC, but the grafana link in case you see this before it does.

Thanks. By default annotations are kept forever [1], and we're using the default. I'll add the following settings to make sure we don't forget to GC,

[annotations.dashboard]
max_age = 3M

[annotations.api]
max_age = 3M

[1] https://github.com/grafana/grafana/blob/608b40b2a8860c63f37fe5ac542b2a52b7109317/conf/sample.ini#L1241

@srosenberg srosenberg requested a review from renatolabs March 10, 2024 15:35
@srosenberg
Copy link
Member

Nice work! I've only gone through the code and left a few minor comments. I'll play around with the annotations to get a better idea of the "look & feel".

@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch from a3c8642 to e381adf Compare March 11, 2024 18:28
@DarrylWong DarrylWong requested a review from a team as a code owner March 11, 2024 18:28
@DarrylWong DarrylWong requested review from dt and removed request for a team March 11, 2024 18:28
@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch 2 times, most recently from 92de6b2 to 2d54365 Compare March 11, 2024 19:57
@DarrylWong DarrylWong removed the request for review from dt March 11, 2024 19:57
@srosenberg srosenberg self-requested a review March 12, 2024 23:20
@srosenberg srosenberg requested review from srosenberg and removed request for srosenberg March 12, 2024 23:20
Copy link
Member

@srosenberg srosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotations look good to me! If need be, we can easily tweak things in subsequent PR(s). Let's give @renatolabs a chance to review the changes since it touches the mixedversion runner. Otherwise, 🚢

Copy link
Contributor

@renatolabs renatolabs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

All my comments are ignorable nits.

pkg/cmd/roachprod/grafana/annotations.go Outdated Show resolved Hide resolved
pkg/cmd/roachprod/grafana/annotations.go Outdated Show resolved Hide resolved
pkg/cmd/roachprod/grafana/annotations.go Outdated Show resolved Hide resolved
pkg/cmd/roachprod/main.go Outdated Show resolved Hide resolved
pkg/cmd/roachprod/main.go Outdated Show resolved Hide resolved
pkg/cmd/roachtest/cluster.go Show resolved Hide resolved
@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch from 2d54365 to 5200855 Compare March 13, 2024 22:01
@DarrylWong
Copy link
Contributor Author

😞 Unfortunately adding a cluster.AddGrafanaAnnotation() call to runSingleStep breaks the Test_runSingleStep unit test since the unit test doesn't have a cluster mocked.

One easy solution would be to just wrap it in a if cluster != nil, but seems wrong to change it because it doesn't work with a test. On the other hand, mocking an entire cluster interface just to fix this one thing seems overkill? Would love to hear others thoughts, maybe I'm missing a better solution.

@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch 2 times, most recently from fb9aa1a to 69dff21 Compare March 15, 2024 16:13
@renatolabs
Copy link
Contributor

mocking an entire cluster interface just to fix this one thing seems overkill?

Yes, definitely an overkill. We've had to fix similar problems in the past; e.g., see how we mock the architecture for tests:

func (t *Test) clusterArch() vm.CPUArch {
if t._arch != nil {
return *t._arch // test-only
}
return t.cluster.Architecture()
}

I kind of prefer having this type of behaviour encapsulated in a function rather than having != nil checks spread over the core logic. Maybe in this case we could have a _addAnnotation func() field and a function like:

func (r runner) addGrafanaAnnotation() {
    if r._addAnnotation != nil {
        r._addAnnotation(annotation) // test-only
        return
    }

    r.cluster.AddGrafanaAnnotation(annotation)
}

Needless to say, the above is pseudo-code to illustrate the idea. Does that make sense? You could then even inject your own annotation function in the test to validate the annotation we're sending.

@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch from 69dff21 to 8f227fa Compare March 15, 2024 16:53
@DarrylWong
Copy link
Contributor Author

Yes that makes perfect sense! Thank you!

This change adds a Grafana annotation API for roachprod. It
adds support to add annotations to a specified grafana instance.

The Grafana OpenAPI is used to create an HTTP client and easily
create requests.
AddGrafanaAnnotation allows roachtests to create grafana
annotations. Roachtests can either add to the centralized
Grafana instance that contains metrics for all GCE runs,
or the internal Grafana instance created through
cluster.StartGrafana.

Depending on the instance chosen, AddGrafanaAnnotation
will resolve the host name for the caller.
@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch 3 times, most recently from 2c69b24 to b14a2a0 Compare March 15, 2024 20:56
Now that we have the ability to create grafana annotations
in roachtests, we can annotate each mixed version step
in grafana.
@DarrylWong DarrylWong force-pushed the roachtest-grafana-annotations branch from b14a2a0 to b4994a9 Compare March 15, 2024 21:22
@DarrylWong
Copy link
Contributor Author

CI is finally happy and confirmed it works on TC with the env variables.

TFTRs!

bors r=renatolabs, srosenberg

@craig
Copy link
Contributor

craig bot commented Mar 17, 2024

@craig craig bot merged commit 0b6de9c into cockroachdb:master Mar 17, 2024
21 of 22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest: grafana API for annotations
4 participants