roachtest: include link to testeng grafana in issue posts #105894

Closed
jbowens opened this issue Jun 30, 2023 · 4 comments · Fixed by #107391

jbowens (Collaborator) commented Jun 30, 2023

This is a wishlist feature request: when a roachtest fails and its metrics were reported to the testeng Grafana instance, it would be very helpful for triage if the posted GitHub issue included a link to Grafana with the relevant cluster and test timeframe selected. For example, a failure caused by a disk stall could be identified just by looking at the fsync latency.

Jira issue: CRDB-29261

@jbowens jbowens added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-testing Testing tools and infrastructure T-testeng TestEng Team labels Jun 30, 2023
blathers-crl bot commented Jun 30, 2023

cc @cockroachdb/test-eng

@tbg tbg added the quality-friday A good issue to work on on Quality Friday label Jul 20, 2023
tbg (Member) commented Jul 21, 2023

Some breadcrumbs. Roachtest's entry point for posting issues is here:

        return g.issuePoster(
            context.Background(),
            l,
            issues.UnitTestFormatter,
            g.createPostRequest(t, t.firstFailure(), message),
        )
    }

We'd probably put this link in the HelpCommand:

    HelpCommand: func(renderer *issues.Renderer) {
        issues.HelpCommandAsLink(
            "roachtest README",
            "https://github.com/cockroachdb/cockroach/blob/master/pkg/cmd/roachtest/README.md",
        )(renderer)
        issues.HelpCommandAsLink(
            "How To Investigate (internal)",
            "https://cockroachlabs.atlassian.net/l/c/SSSBr8c7",
        )(renderer)
    },

So basically the task is to write code that generates the link

https://grafana.testeng.crdb.io/d/crdb-console/crdb-console-overview?orgId=1&from=<FROMEPOCHSECONDS>&to=<TOEPOCHSECONDS>&var-cluster=<CLUSTERNAME>

and renders it inline via HelpCommandAsLink.
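
A minimal sketch of the link generation in Go, for illustration only. The helper names grafanaLink and grafanaLinkMillis are hypothetical and are not taken from roachtest. The sketch assumes Grafana reads the from and to parameters as Unix epoch milliseconds; the 13-digit values in the go.crdb.dev example in this thread are consistent with that, but the unit is an assumption here. Note that url.Values.Encode sorts parameters by key, so their order differs from the template above while the set of parameters is the same.

```go
package main

import (
	"fmt"
	"net/url"
	"time"
)

// grafanaBaseURL is the dashboard URL quoted in the comment above.
const grafanaBaseURL = "https://grafana.testeng.crdb.io/d/crdb-console/crdb-console-overview"

// grafanaLinkMillis builds the dashboard link from a cluster name and a
// time range given in epoch milliseconds (assumed unit, see lead-in).
// Hypothetical helper, not the roachtest implementation.
func grafanaLinkMillis(clusterName string, fromMs, toMs int64) string {
	q := url.Values{}
	q.Set("orgId", "1")
	q.Set("from", fmt.Sprintf("%d", fromMs))
	q.Set("to", fmt.Sprintf("%d", toMs))
	q.Set("var-cluster", clusterName)
	// Encode escapes the values and sorts the parameters by key.
	return grafanaBaseURL + "?" + q.Encode()
}

// grafanaLink is a convenience wrapper that accepts the test's start and
// end times directly.
func grafanaLink(clusterName string, start, end time.Time) string {
	return grafanaLinkMillis(clusterName, start.UnixMilli(), end.UnixMilli())
}

func main() {
	end := time.Now()
	start := end.Add(-30 * time.Minute)
	fmt.Println(grafanaLink("silvano-cluster", start, end))
}
```

The resulting string could then be passed as the second argument to HelpCommandAsLink, alongside the two existing links.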

While we're here, it might be useful to pull out the HelpCommand impl into a free-standing function and to write unit tests for it (perhaps a small one using echotest) [1].

Footnotes

  1. Here's an example of how echotest works.
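
To illustrate the kind of test meant here, below is a rough golden-file sketch using only the Go standard library. It is not echotest and does not reproduce echotest's API; checkGolden and its rewrite parameter are hypothetical names chosen for this example. The idea is that the rendered help text is compared against a checked-in file, and the file is regenerated on request when the expected output changes.

```go
package main

import (
	"fmt"
	"os"
)

// checkGolden is a hypothetical stand-in for an echotest-style assertion.
// It compares actual with the contents of the file at path. When rewrite
// is true, it overwrites the file with actual instead of comparing.
func checkGolden(path, actual string, rewrite bool) error {
	if rewrite {
		return os.WriteFile(path, []byte(actual), 0o644)
	}
	want, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	if string(want) != actual {
		return fmt.Errorf("output differs from %s:\n got: %q\nwant: %q", path, actual, want)
	}
	return nil
}

func main() {
	// Use a throwaway file so the example does not touch real test data.
	f, err := os.CreateTemp("", "help-command-*.golden")
	if err != nil {
		panic(err)
	}
	f.Close()
	defer os.Remove(f.Name())

	rendered := "See: [Grafana](https://go.crdb.dev/p/roachfana/...)\n"
	// First run records the expected output, second run verifies it.
	if err := checkGolden(f.Name(), rendered, true); err != nil {
		panic(err)
	}
	fmt.Println(checkGolden(f.Name(), rendered, false) == nil)
}
```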

@annrpom annrpom self-assigned this Jul 21, 2023
tbg (Member) commented Jul 21, 2023

One small trailing idea: instead of hard-linking the test-eng Grafana, we could use a vanity URL via https://go.crdb.dev/, which allows rewrites. I just set one up:

https://go.crdb.dev/p/roachfana/silvano-cluster/1689957243817/1689957733137

So the link is https://go.crdb.dev/p/roachfana/<clustername>/<fromts>/<tots>.
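
A small Go sketch of building that vanity link, under stated assumptions. roachfanaLink and roachfanaLinkMillis are hypothetical names, and the timestamps are assumed to be Unix epoch milliseconds because the example above uses 13-digit values. The cluster name is path-escaped as a precaution; whether the rewrite rule expects escaped names is not something this thread confirms.

```go
package main

import (
	"fmt"
	"net/url"
	"time"
)

// roachfanaPrefix is the vanity URL prefix from the comment above.
const roachfanaPrefix = "https://go.crdb.dev/p/roachfana"

// roachfanaLinkMillis formats <prefix>/<clustername>/<fromts>/<tots>.
// Timestamps are assumed to be epoch milliseconds (hypothetical helper).
func roachfanaLinkMillis(clusterName string, fromMs, toMs int64) string {
	return fmt.Sprintf("%s/%s/%d/%d",
		roachfanaPrefix, url.PathEscape(clusterName), fromMs, toMs)
}

// roachfanaLink accepts the test's start and end times directly.
func roachfanaLink(clusterName string, start, end time.Time) string {
	return roachfanaLinkMillis(clusterName, start.UnixMilli(), end.UnixMilli())
}

func main() {
	// Reproduces the example link from the comment above.
	fmt.Println(roachfanaLinkMillis("silvano-cluster", 1689957243817, 1689957733137))

	end := time.Now()
	fmt.Println(roachfanaLink("silvano-cluster", end.Add(-10*time.Minute), end))
}
```

One possible advantage of this form is that the issue text stays stable even if the dashboard path or organization ID behind the rewrite changes later.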

annrpom added a commit to annrpom/cockroach that referenced this issue Jul 24, 2023
This adds a link, with relevant cluster and test timeframe, to the testeng
grafana instance for failed roachtests.
Fixes: cockroachdb#105894
Release note: None
annrpom added a commit to annrpom/cockroach that referenced this issue Jul 26, 2023
This adds a link, populated with relevant cluster name and test timeframe,
to the testeng grafana instance for failed roachtests.

Fixes: cockroachdb#105894
Release note: None
craig bot pushed a commit that referenced this issue Jul 28, 2023
107391: roachtest: include link to testeng grafana in issue posts r=smg260,tbg a=annrpom

This adds a link, populated with relevant cluster name and test timeframe, to the testeng grafana instance for failed roachtests.

Fixes: #105894
Release note: None

107659: serverutils: provide SQLConn/SQLConnE in ApplicationLayerInterface r=stevendanna a=knz

Fixes #107672.
Part of solving #107058.
Informs #106772.

Epic: CRDB-18499



107697: rpc: avoid crash in newPeer r=erikgrinaker a=tbg

It was previously possible to make a new peer while the old one was in the
middle of being deleted, which caused a crash due to acquiring child metrics
when they still existed.

Luckily, this is easy enough to fix: just remove some premature optimization
where I had tried to be too clever.

Fixes #105335.

Epic: CRDB-21710
Release note: None (bug never released)

107721: asim: skip TestAllocatorSimulatorDeterministic and example_fulldisk r=wenyihu6 a=wenyihu6

We found some non-deterministic behavior in the allocator simulator (see #105904
for more details). For now, we are skipping these potentially flaky tests.

Release Note: None 
Epic: None

107728: persistedsqlstats: specify background qos for compaction job r=xinhaoz a=xinhaoz

The compaction job can be an expensive operation so we should de-prioritize it with the `UserLow` qos setting.

Fixes: #99949

Release note: None

107750: ui: fix app = empty string filter on stmts page r=xinhaoz a=xinhaoz

The filter on app name = empty string was not working on
the stmts page. This was due to the fact that we use (unset)
as the option in the filter to represent selecting the empty
string app name. However when filtering statements, the empty
string app name on the stmt was not changed accordingly.
This commit fixes this and also adds testing for the unset case.

Epic: none
Fixes: #107748

Release note (bug fix): Filter on stmts page works for
app name = empty string (represented as 'unset').

https://www.loom.com/share/2fee4f0fb7b04208803e0dac1d9694ab?sid=5cabecf9-1c2a-406b-89a8-b378ed07d329



107753: backupccl: deflake TestBackupAndRestoreJobDescription r=stevendanna a=adityamaru

This change sorts the jobs based on when they
were created to ensure we get a stable sort of
job descriptions.

Fixes: #107684
Release note: None

Co-authored-by: Annie Pompa
Co-authored-by: Raphael 'kena' Poss
Co-authored-by: Tobias Grieger
Co-authored-by: wenyihu6
Co-authored-by: Xin Hao Zhang
Co-authored-by: adityamaru
@craig craig bot closed this as completed in c885ab1 Jul 28, 2023
annrpom added a commit to annrpom/cockroach that referenced this issue Aug 3, 2023
This adds a link, populated with relevant cluster name and test timeframe,
to the testeng grafana instance for failed roachtests.

Fixes: cockroachdb#105894
Release note: None