Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ui: make custom chart tool work at store level #122151

Merged
merged 1 commit into from
Apr 19, 2024

Conversation

abarganier
Copy link
Contributor

@abarganier abarganier commented Apr 10, 2024

Fixes: #121364

This patch fixes a bug in the DB Console custom chart tool, where
selecting the "Per Node" checkbox on a metric would not properly display
store-level metrics. The previous expected behavior was that the check
box would cause the metric to aggregate across stores at the node level
(e.g. if the node had 3 stores, it'd SUM the store-level timeseries
together and return a single timeseries for the node). Instead, the
feature was only showing the 1st store associated with the node.

This was due to a bug in the code used to determine if a metric was
store-level. A function was used that improperly assumed that the
cr.node.* or cr.store.* prefix had been stripped from the metric
name, which was not always the case. This led to us improperly
interpret store-level metrics as node-level.

The fix is to fix the logic used to determine if a metric is
store-level.

Additionally, this patch updates the code to no longer aggregate
store-level metrics across each node. Instead, we will now show a single
timeseries per-store to provide finer-grained observability into
store-level metrics within the custom chart tool.

Release note (bug fix): A bug has been fixed in the DB Console's custom
chart tool, where store-level metrics were not being displayed properly.
Previously, if a store-level metric was selected to be displayed at the
store-level on a multi-store node, only data for the 1st store ID
associated with that node would be displayed.

This patch ensures that data is displayed for all stores present on a
node. Additionally, it updates the behavior to show a single timeseries
for each store, as opposed to aggregating (e.g. SUM) all stores across
the node. This allows finer-grained observability into store-level
metrics when using the custom chart tool in DB Console.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@abarganier abarganier changed the title [WIP] ui: make custom chart tool work at store level ui: make custom chart tool work at store level Apr 15, 2024
@abarganier abarganier marked this pull request as ready for review April 15, 2024 17:14
@abarganier abarganier requested a review from a team as a code owner April 15, 2024 17:14
@abarganier abarganier requested review from nkodali, dhartunian, a team and koorosh and removed request for a team and nkodali April 15, 2024 17:14
Copy link
Collaborator

@koorosh koorosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @abarganier and @dhartunian)


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 216 at r1 (raw file):

    }
    return isStoreMetric(nodesSummary.nodeStatuses[0], metricName)
      ? Object.values(nodesSummary.storeIDsByNodeID).flatMap(s => s)

I think it should also filter out stores for selected node (in case only specific node selected as a source)
Screenshot 2024-04-16 at 15.28.33.png

Also it makes confusing when node metrics selected with single node selected as a source with per node/store option selected. I'd suggest that per node/store option should be disabled or does nothing as only single node selected as a source.


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 249 at r1 (raw file):

          ));
        });
      } else if (m.perSource) {

This else if condition looks identical to the first one with only exception how title and key fields defined. It could be simplified by conditionally build those strings only. WDYT?

Copy link
Contributor Author

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian and @koorosh)


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 216 at r1 (raw file):

Previously, koorosh (Andrii Vorobiov) wrote…

I think it should also filter out stores for selected node (in case only specific node selected as a source)
Screenshot 2024-04-16 at 15.28.33.png

Also it makes confusing when node metrics selected with single node selected as a source with per node/store option selected. I'd suggest that per node/store option should be disabled or does nothing as only single node selected as a source.

Thanks for catching this, I hadn't considered it!

I think keeping the per node/store option still makes sense, though.

For example, if I take a store-level metric like cr.store.capacity.used and chart it, it shows things cluster-wide:
Screenshot 2024-04-16 at 4.47.11 PM.png

If I then select a specific node as a source, it switches to show the sum of the metric across all the stores on that node:
Screenshot 2024-04-16 at 4.47.42 PM.png

Then, if I select the "Per Node/Store" checkbox, it shows me the metric for each individual store available for the node:
Screenshot 2024-04-16 at 4.47.57 PM.png


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 249 at r1 (raw file):

Previously, koorosh (Andrii Vorobiov) wrote…

This else if condition looks identical to the first one with only exception how title and key fields defined. It could be simplified by conditionally build those strings only. WDYT?

Doesn't the previous one have an additional nested level of iteration though, for the tenant IDs?

As in, the else if (m.perSource) block creates a <Metric> for each source, but the previous if block creates a <Metric> for each source, for each tenant.

Let me know if I'm misunderstanding!

Copy link
Collaborator

@koorosh koorosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

::LGTM:: Just few nits to adjust commented below.

Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @abarganier and @dhartunian)


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 216 at r1 (raw file):

Previously, abarganier (Alex Barganier) wrote…

Thanks for catching this, I hadn't considered it!

I think keeping the per node/store option still makes sense, though.

For example, if I take a store-level metric like cr.store.capacity.used and chart it, it shows things cluster-wide:
Screenshot 2024-04-16 at 4.47.11 PM.png

If I then select a specific node as a source, it switches to show the sum of the metric across all the stores on that node:
Screenshot 2024-04-16 at 4.47.42 PM.png

Then, if I select the "Per Node/Store" checkbox, it shows me the metric for each individual store available for the node:
Screenshot 2024-04-16 at 4.47.57 PM.png

🔥🔥🔥


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 249 at r1 (raw file):

Previously, abarganier (Alex Barganier) wrote…

Doesn't the previous one have an additional nested level of iteration though, for the tenant IDs?

As in, the else if (m.perSource) block creates a <Metric> for each source, but the previous if block creates a <Metric> for each source, for each tenant.

Let me know if I'm misunderstanding!

You're right, i've missed that nested iteration!


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 226 at r2 (raw file):

    if (isStoreMetric(nodesSummary.nodeStatuses[0], metricState.metric)) {
      // If a specific node is selected, return the storeIDs associated with that node.
      // Otherwise, we're at the cluster level, so we cna

nit. sentence isn't completed.


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 263 at r2 (raw file):

              downsampler={m.downsampler}
              derivative={m.derivative}
              sources={this.getSources(nodesSummary, m)}

nit. you have const sources = this.getSources(nodesSummary, m); (above the flatMap call) which can be used instead of calling it every time.

Fixes: cockroachdb#121364

This patch fixes a bug in the DB Console custom chart tool, where
selecting the "Per Node" checkbox on a metric would not properly display
store-level metrics. The previous expected behavior was that the check
box would cause the metric to aggregate across stores at the node level
(e.g. if the node had 3 stores, it'd SUM the store-level timeseries
together and return a single timeseries for the node). Instead, the
feature was only showing the 1st store associated with the node.

This was due to a bug in the code used to determine if a metric was
store-level. A function was used that improperly assumed that the
`cr.node.*` or `cr.store.*` prefix had been stripped from the metric
name, which was not always the case. This led to us improperly
interpret store-level metrics as node-level.

The fix is to fix the logic used to determine if a metric is
store-level.

Additionally, this patch updates the code to no longer aggregate
store-level metrics across each node. Instead, we will now show a single
timeseries per-store to provide finer-grained observability into
store-level metrics within the custom chart tool.

Release note (bug fix): A bug has been fixed in the DB Console's custom
chart tool, where store-level metrics were not being displayed properly.
Previously, if a store-level metric was selected to be displayed at the
store-level on a multi-store node, only data for the 1st store ID
associated with that node would be displayed.

This patch ensures that data is displayed for all stores present on a
node. Additionally, it updates the behavior to show a single timeseries
for each store, as opposed to aggregating (e.g. SUM) all stores across
the node. This allows finer-grained observability into store-level
metrics when using the custom chart tool in DB Console.
Copy link
Contributor Author

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR! Addressed nits.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dhartunian and @koorosh)


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 226 at r2 (raw file):

Previously, koorosh (Andrii Vorobiov) wrote…

nit. sentence isn't completed.

Ah, thanks for catching! Fixed.


pkg/ui/workspaces/db-console/src/views/reports/containers/customChart/index.tsx line 263 at r2 (raw file):

Previously, koorosh (Andrii Vorobiov) wrote…

nit. you have const sources = this.getSources(nodesSummary, m); (above the flatMap call) which can be used instead of calling it every time.

Good catch! Fixed to be more consistent when m.perSource && !m.perTenant (the following else if).

Copy link
Collaborator

@koorosh koorosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dhartunian)

@abarganier
Copy link
Contributor Author

abarganier commented Apr 18, 2024

TFTR!

bors r=koorosh

@abarganier abarganier added backport-23.1.x Flags PRs that need to be backported to 23.1 backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1. labels Apr 19, 2024
@abarganier
Copy link
Contributor Author

bors r=koorosh

@craig
Copy link
Contributor

craig bot commented Apr 19, 2024

This PR was included in a batch that was canceled, it will be automatically retried

@rail
Copy link
Member

rail commented Apr 19, 2024

bors retry
(it crashed)

@craig
Copy link
Contributor

craig bot commented Apr 19, 2024

Build failed (retrying...):

@craig craig bot merged commit 50393fa into cockroachdb:master Apr 19, 2024
22 checks passed
Copy link

blathers-crl bot commented Apr 19, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from c5e95e9 to blathers/backport-release-23.1-122151: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

abarganier added a commit to abarganier/cockroach that referenced this pull request Apr 25, 2024
cockroachdb#122151 fixed the custom
chart tool to properly display store-level metrics. However, it did not
include any tests. Despite metric charts being difficult to unit test,
we can still test the logic used to determine which metric sources
should be included in the timeseries query.

This commit adds tests for this logic.

Release note: none
dhartunian pushed a commit to abarganier/cockroach that referenced this pull request Apr 25, 2024
cockroachdb#122151 fixed the custom
chart tool to properly display store-level metrics. However, it did not
include any tests. Despite metric charts being difficult to unit test,
we can still test the logic used to determine which metric sources
should be included in the timeseries query.

This commit adds tests for this logic.

Release note: none
craig bot pushed a commit that referenced this pull request Apr 25, 2024
123086: ui: tests for custom chart logic to determine metric sources r=dhartunian a=abarganier

#122151 fixed the custom chart tool to properly display store-level metrics. However, it did not include any tests. Despite metric charts being difficult to unit test, we can still test the logic used to determine which metric sources should be included in the timeseries query.

This commit adds tests for this logic.

Release note: none

Epic: none

Co-authored-by: Alex Barganier <[email protected]>
abarganier added a commit that referenced this pull request Apr 26, 2024
#122151 fixed the custom
chart tool to properly display store-level metrics. However, it did not
include any tests. Despite metric charts being difficult to unit test,
we can still test the logic used to determine which metric sources
should be included in the timeseries query.

This commit adds tests for this logic.

Release note: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1 backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

db-console: per-store metrics not displayed after importing multi-store tsdump
4 participants