roachtest: name clusters after running test #98658
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-testeng
TestEng Team
Comments
tbg added the C-enhancement and T-testeng labels Mar 15, 2023
cc @cockroachdb/test-eng
smg260 pushed a commit to smg260/cockroach that referenced this issue Aug 2, 2023
This commit will add a `test_name` label to each VM when a particular roachtest is about to be executed on the cluster, with the label being removed at the end of the roachtest. The `test_name` label is being scraped by Prometheus to allow filtering of dashboards based on the roachtest name. GCE labelling rules mean that test names are sanitised to match `[a-zA-Z-]`.
Epic: none
Fixes: cockroachdb#98658
Release note: None
smg260 pushed a commit to smg260/cockroach that referenced this issue Aug 10, 2023
craig bot pushed a commit that referenced this issue Aug 14, 2023
107965: roachtest: roachprod: add test name and run id vm labels for metrics r=herkolategan a=smg260

These 2 commits add labels to clusters running roachtests, so that metrics can be better filtered in various dashboards.

1. Adds `test_name` label to each cluster, and removes the label at the end of the test. Thus, each cluster would have this label updated for each test that it runs during a particular roachtest invocation. The test name will be simplified to conform to cloud labelling rules `[a-zA-Z-]`.
2. Adds `test_run_id` label to each VM, *once*, for the duration of the run. Thus, each cluster would have this label added once at the beginning of a roachtest run (which would include multiple tests), and removed only after deregistration at the end. In TeamCity this would take the form `<TC_USER>-<TC_BUILD_ID>`, and when run locally `<USER>-<UNIX_TS>`.

Combined, these 2 labels make it easy for a user to find metrics for a particular run of roachtest (e.g. a specific GCE nightly).

Here is a [copy of an existing dashboard](https://grafana.testeng.crdb.io/d/qdkBruq4k/crdb-console-runtime-by-test?orgId=1&from=now-3h&to=now), modified to utilise the new labels.

Epic: None
Fixes: #98658
Release note: None

108037: server: return authoritative span statistics for db details endpoint r=THardy98 a=THardy98

Resolves: #96163

This change makes the admin API endpoint getting database statistics scan KV for span statistics instead of using the range descriptor cache. This provides authoritative output, helping deflake `TestMultiRegionDatabaseStats`.

Release note (sql change): admin API database details endpoint now returns authoritative range statistics.

108711: upgrades: deflake TestRoleMembersIDMigration1500Users r=rafiss a=rafiss

TeamCity has a new machine type where this test has started to time out more, so this change will make it take less time.

fixes #108539
Release note: None

Co-authored-by: Miral Gadani
Co-authored-by: Thomas Hardy
Co-authored-by: Rafi Shamim
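The `test_run_id` format described in 107965 above (`<TC_USER>-<TC_BUILD_ID>` in TeamCity, `<USER>-<UNIX_TS>` locally) could be sketched as follows. This is only an illustration: the function name `make_run_id` is hypothetical, and treating `TC_USER`, `TC_BUILD_ID`, and `USER` as environment variable names is an assumption, since the PR description only gives them as placeholders.

```python
import os
import time


def make_run_id(env=None, now=None) -> str:
    """Build a run id of the form described in the PR text.

    Assumption: the placeholders TC_USER, TC_BUILD_ID and USER are read
    from environment variables of the same name. `env` and `now` are
    injectable so the function can be exercised deterministically.
    """
    env = os.environ if env is None else env
    build_id = env.get("TC_BUILD_ID")
    if build_id:
        # TeamCity form: <TC_USER>-<TC_BUILD_ID>
        return f"{env.get('TC_USER', 'teamcity')}-{build_id}"
    # Local form: <USER>-<UNIX_TS>
    timestamp = int(time.time() if now is None else now)
    return f"{env.get('USER', 'unknown')}-{timestamp}"


if __name__ == "__main__":
    print(make_run_id({"TC_USER": "ci", "TC_BUILD_ID": "12345"}))
    print(make_run_id({"USER": "alice"}, now=1678829488))
```

Because the id is computed once per invocation, every test in the same roachtest run would share it, which is what lets a dashboard filter on one run and then on individual `test_name` values within it.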
smg260 pushed a commit to smg260/cockroach that referenced this issue Aug 21, 2023
smg260 pushed a commit to smg260/cockroach that referenced this issue Sep 18, 2023
Is your feature request related to a problem? Please describe.
The clusters get scraped by our internal Prom/Grafana instance, so it's helpful if the cluster name "means something".
Describe the solution you'd like
Switch from `$USER-1678829488-01-n10cpu8` to something like `$USER-nameoftestbutsanitizedandshortenedifnecessary-YYMMDD-nonce`.
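To make the proposed naming scheme concrete, here is a minimal sketch of how such a name might be assembled. Everything beyond the `$USER-<test>-YYMMDD-<nonce>` shape is an assumption made for illustration: the function name `cluster_name`, the 20-character cap on the test portion, lowercasing, and the 4-hex-digit nonce are not specified anywhere in this issue.

```python
import re
import secrets
from datetime import date


def cluster_name(user: str, test_name: str, today: date,
                 nonce=None, max_test_len: int = 20) -> str:
    """Assemble `<user>-<sanitized test>-<YYMMDD>-<nonce>`.

    The test name is lowercased, stripped of everything except letters
    and digits, and truncated to `max_test_len` (an assumed limit).
    """
    slug = re.sub(r"[^a-z0-9]+", "", test_name.lower())[:max_test_len]
    if nonce is None:
        # 2 random bytes -> 4 hex characters; the length is arbitrary here.
        nonce = secrets.token_hex(2)
    return f"{user}-{slug}-{today:%y%m%d}-{nonce}"


if __name__ == "__main__":
    print(cluster_name("alice", "kv0/enc=false/nodes=3", date(2023, 3, 15)))
```

The nonce keeps names unique when the same user runs the same test more than once on the same day.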
One problem with this approach is that we'd have to rethink cluster reuse in roachtest; it would be confusing if a cluster for test A were to be reused by test B. It's unclear if reuse is something we need to keep. (I'm uneasy about cross-pollution between tests because we increasingly do random systemd-run stuff that `roachprod wipe` won't clear up).
Describe alternatives you've considered
We could also try to export the name of the running test as a label. But I'm not sure how feasible this is.
Additional context
Slack (internal)
Jira issue: CRDB-25424