Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/ui: filter metric dashboard options for tenants #97995

Merged
merged 1 commit into from
Mar 8, 2023

Conversation

abarganier
Copy link
Contributor

@abarganier abarganier commented Mar 3, 2023

This patch uses the ui.Config feature flags to communicate
to the UI whether or not the currently logged in tenant
is able to view KV metric dashboards in DB Console.

The patch filters the list of dashboard options as well
as guards against users using URL params to try to view
KV dashboards as a tenant without the appropriate permissions.

Filtering is also made possible within the dashboards
for specific charts, depending on whether they display
KV-level information.

Release note (ui change): secondary tenants using DB Console
will no longer be able to view metrics dashboards that display
KV-level information.

Epic: CRDB-12100

Addresses: #97736

@abarganier abarganier requested review from dhartunian and a team March 3, 2023 20:32
@abarganier abarganier requested review from a team as code owners March 3, 2023 20:32
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@abarganier abarganier removed request for a team March 3, 2023 20:32
@abarganier abarganier force-pushed the filter-dashboards branch 2 times, most recently from d96a794 to f452219 Compare March 3, 2023 20:59
s.runtime, /* runtimeStatsSampler */
gwMux, /* handleRequestsUnauthenticated */
s.debug, /* handleDebugUnauthenticated */
s.rpcContext.TenantID, /* tenantID */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my PR I chose to have the Flags object as an argument and let server.go and tenant.go set whatever they want directly. What do you think of that alternative? https://github.com/cockroachdb/cockroach/pull/97945/files#diff-a28fa383e9ea02d78b07ec67d736a9c3e6a98a10de830399afa217d49b2ff618

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, this is much more flexible. I changed over to this approach based on your PR.

);
const filteredDropdownOptions = dashboardDropdownOptions
// Don't show KV dashboards if the logged-in user doesn't have permission to view them.
.filter(option => (canViewKvGraphs ? true : !option.isKvDashboard))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

const graphs = dashboards[dashboard].component(dashboardProps);
const graphs = dashboards[dashboard]
.component(dashboardProps)
.filter(d => (canViewKvGraphs ? true : !d.props.isKvGraph));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the ternary can be replaced with ||

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

.filter(option =>
this.props.crossClusterReplicationEnabled
? true
: option.label !== "Cross-Cluster Replication",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

This patch uses the ui.Config feature flags to communicate
to the UI whether or not the currently logged in tenant
is able to view KV metric dashboards in DB Console.

The patch filters the list of dashboard options as well
as guards against users using URL params to try to view
KV dashboards as a tenant without the appropriate permissions.

Filtering is also made capable within the dashboards
for specific charts.

Marking a graph as non-KV-related is "opt-in". New graphs added
to dashboards visible by secondary tenants will need to be
explicitly marked as non-KV-related in order to render for
secondary tenants.

For dashboards that are entirely KV-specific, we don't specify
whether the individual graphs are KV-specific since we filter
things out at the dashboard level.

Release note (ui change): secondary tenants using DB Console
will no longer be able to view metrics dashboards that display
KV-level information. When viewing metrics dashboards in DB
Console as a secondary tenant, in addition to not being able to
view certain dashboards that are entirely KV-specific, we
filter out certain graphs from visible dashboards. These
filtered graphs are only those which are KV specific, such
as the "Replicas per Node" graph on the Overview dashboard.
@abarganier abarganier requested a review from dhartunian March 6, 2023 21:35
Copy link
Collaborator

@dhartunian dhartunian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@abarganier
Copy link
Contributor Author

TFTR!

bors r=dhartunian

@craig
Copy link
Contributor

craig bot commented Mar 8, 2023

Build succeeded:

@craig craig bot merged commit 994c0f9 into cockroachdb:master Mar 8, 2023
abarganier added a commit to abarganier/cockroach that referenced this pull request Apr 4, 2023
cockroachdb#97995 updated the
DB Console to filter out KV-specific charts from the metrics page
when viewing DB Console as a secondary application tenant.

The PR missed a couple small details. This patch cleans those
up with the following:

- Removes KV latency charts for app tenants
- Adds a single storage graph for app tenants showing livebytes
- Removes the "Capacity" chart on the Overview dashboard for app
  tenants

Release note: none
abarganier added a commit to abarganier/cockroach that referenced this pull request Apr 4, 2023
cockroachdb#97995 updated the
DB Console to filter out KV-specific charts from the metrics page
when viewing DB Console as a secondary application tenant.

The PR missed a couple small details. This patch cleans those
up with the following:

- Removes KV latency charts for app tenants
- Adds a single storage graph for app tenants showing livebytes
- Removes the "Capacity" chart on the Overview dashboard for app
  tenants

Release note: none
craig bot pushed a commit that referenced this pull request Apr 7, 2023
99663: sql: update connExecutor logic for pausable portals r=ZhouXing19 a=ZhouXing19

This PR replaces #96358 and is part of the initial implementation of multiple active portals.

----

This PR is to add limited support for multiple active portals. Now portals satisfying all following restrictions can be paused and resumed (i.e., with other queries interleaving it):

1. Not an internal query;
2. Read-only query;
3. No sub-queries or post-queries.

And such a portal will only have the statement executed with a _non-distributed_ plan. 

This feature is gated by a session variable `multiple_active_portals_enabled`. When it's set `true`, all portals that satisfy the restrictions above will automatically become "pausable" when being created via the pgwire `Bind` stmt. 

The core idea of this implementation is 
1. Add a `switchToAnotherPortal` status to the result-consumption state machine. When we receive an `ExecPortal` message for a different portal, we simply return the control to the connExecutor. (#99052)
2. Persist `flow` `queryID` `span` and `instrumentationHelper` for the portal, and reuse it when we re-execute a portal. This is to ensure we _continue_ the fetching rather than starting all over. (#99173)
3. To enable 2, we need to delay the clean-up of resources till we close the portal. For this we introduced the stacks of cleanup functions. (This PR)

Note that we kept the implementation of the original "un-pausable" portal, as we'd like to limit this new functionality only to a small set of statements. Eventually some of them should be replaced (e.g. the limitedCommandResult's lifecycle) with the new code. 

Also, we don't support distributed plan yet, as it involves much more complicated changes. See `Start with an entirely local plan` section in the [design doc](https://docs.google.com/document/d/1SpKTrTqc4AlGWBqBNgmyXfTweUUsrlqIaSkmaXpznA8/edit). Support for this will come as a follow-up.

Epic: CRDB-17622

Release note (sql change): initial support for multiple active portals. Now with session variable `multiple_active_portals_enabled` set to true,  portals satisfying all following restrictions can be executed in an interleaving manner:  1. Not an internal query; 2. Read-only query; 3. No sub-queries or post-queries. And such a portal will only have the statement executed with an entirely local plan. 





99947: ui: small fixes to DB Console charts shown for secondary tenants r=dhartunian a=abarganier

#97995 updated the
DB Console to filter out KV-specific charts from the metrics page
when viewing DB Console as a secondary application tenant.

The PR missed a couple small details. This patch cleans those
up with the following:

- Removes KV latency charts for app tenants
- Adds a single storage graph for app tenants showing livebytes
- Removes the "Capacity" chart on the Overview dashboard for app
  tenants

Release note: none

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-12100

NB: Please only review the final commit. 1st commit is being reviewed separately @ #99860

100188: changefeedccl: pubsub sink refactor to batching sink r=rickystewart a=samiskin

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-13237

This change is a followup to #99086 which moves the Pubsub sink to the batching sink framework.

The changes involve:
1. Moves the Pubsub code to match the `SinkClient` interface, moving to using the lower level v1 pubsub API that lets us publish batches manually
3. Removing the extra call to json.Marshal
4. Moving to using the `pstest` package for validating results in unit tests
5. Adding topic handling to the batching sink, where batches are created per-topic
6. Added a pubsub_sink_config since it can now handle Retry and Flush config settings
7. Added metrics even to the old pubsub for the purpose of comparing the two versions

At default settings, this resulted in a peak of 90k messages per second on a single node with throughput at 27.6% cpu usage, putting it at a similar level to kafka.

Running pubsub v2 across all of TPCC (nodes ran out of ranges at different speeds):
<img width="637" alt="Screenshot 2023-03-30 at 3 38 25 PM" src="https://user-images.githubusercontent.com/6236424/229863386-edaee27d-9762-4806-bab6-e18b8a6169d6.png">

Running pubsub v1 (barely visible, 2k messages per second) followed by v2 on tpcc.order_line (in v2 only 2 nodes ended up having ranges assigned to them):
<img width="642" alt="Screenshot 2023-04-04 at 12 53 45 PM" src="https://user-images.githubusercontent.com/6236424/229863507-1883ea45-d8ce-437b-9b9c-550afec68752.png">

In the following graphs from the cloud console, where v1 was ran followed by v2, you can see how the main reason v1 was slow was that it wasn't able to batch different keys together.
<img width="574" alt="Screenshot 2023-04-04 at 12 59 51 PM" src="https://user-images.githubusercontent.com/6236424/229864083-758c0814-d53c-447e-84c3-471cf5d56c44.png">

Publish requests remained the same despite way more messages in v2
<img width="1150" alt="Screenshot 2023-04-04 at 1 46 51 PM" src="https://user-images.githubusercontent.com/6236424/229875314-6e07177e-62c4-4c15-b13f-f75e8143e011.png">



Release note (performance improvement): pubsub sink changefeeds can now support higher throughputs by enabling the changefeed.new_pubsub_sink_enabled cluster setting.

100620: pkg/server: move DataDistribution to systemAdminServer r=dhartunian a=abarganier

The DataDistribution endpoint reports replica counts by database and table. When it was built, it operated off the assumption that a range would only ever contain a single table's data within.

Now that we have coalesced ranges, a single range can span multiple tables. Unfortunately, the DataDistribution endpoint does not take this fact into account, meaning it reports garbled and inaccurate data, unless the `spanconfig.storage_coalesce_adjacent.enabled` setting is set to false (see #98820).

For secondary tenants, ranges are *always* coalesced, so this endpoint in its current state could never report meaningful data for a tenant.

Given all of this, we have decided to make this endpoint only available for the system tenant. This patch
accomplishes this by moving the endpoint away from the adminServer and into the systemAdminServer, making it effectively unimplemented for secondary tenants.

Release note: none

Informs: #97942

Co-authored-by: Jane Xing <[email protected]>
Co-authored-by: Alex Barganier <[email protected]>
Co-authored-by: Shiranka Miskin <[email protected]>
abarganier added a commit that referenced this pull request Apr 10, 2023
#97995 updated the
DB Console to filter out KV-specific charts from the metrics page
when viewing DB Console as a secondary application tenant.

The PR missed a couple small details. This patch cleans those
up with the following:

- Removes KV latency charts for app tenants
- Adds a single storage graph for app tenants showing livebytes
- Removes the "Capacity" chart on the Overview dashboard for app
  tenants

Release note: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants