
stmtdiagnostics: remove the usage of gossip #83547

Merged
merged 1 commit into cockroachdb:master from diagnostics-multi-tenant
Jul 7, 2022

Conversation

yuzefovich
Member

@yuzefovich yuzefovich commented Jun 28, 2022

This commit removes the usage of gossip in the stmt diagnostics feature
since it is really an optimization (i.e. not required for the feature to
work) and is in the way of the architecture unification (i.e. the
multi-tenancy effort).

Gossip was previously used to let all nodes in the cluster know about a new
stmt diagnostics request or about the cancellation of an existing one (in
fact, the gossip propagation of the latter was broken anyway since we forgot
to register the corresponding callback). We now rely on the polling mechanism
on each node to read from the system table and populate the registry of
current requests. This polling takes place every
`sql.stmt_diagnostics.poll_interval` (10 seconds by default).
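
For illustration, here is a minimal sketch of that polling pattern; it is not
the actual registry code, and the package, type, and function names below are
hypothetical placeholders:

```go
// Hypothetical sketch of the polling loop: every pollInterval the node
// re-reads the diagnostics requests from the system table and refreshes
// its in-memory registry. This is not the real stmtdiagnostics code.
package sketch

import (
	"context"
	"time"
)

// Request is a placeholder for an outstanding diagnostics request.
type Request struct{ ID int64 }

// readRequestsFromSystemTable stands in for the query against the system
// table that stores diagnostics requests.
func readRequestsFromSystemTable(ctx context.Context) ([]Request, error) {
	return nil, nil // placeholder
}

// pollRequests refreshes the registry every pollInterval until ctx is done.
func pollRequests(ctx context.Context, pollInterval time.Duration, refresh func([]Request)) {
	ticker := time.NewTicker(pollInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if reqs, err := readRequestsFromSystemTable(ctx); err == nil {
				refresh(reqs) // rebuild the in-memory view of current requests
			}
		}
	}
}
```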

Additionally, this commit changes the class of that setting from
TenantWritable to TenantReadOnly because

  1. we don't charge the user for the polling, and
  2. we want to prevent the abuse where a tenant could set the polling
    interval to a very low value while incurring no costs themselves (we
    recently excluded the polling from the tenant cost control).

Follow-up work remains to bring back the optimization of propagating new
requests quickly by using a range feed on the system table.

Epic: CRDB-16702
Informs: #47893

Release note: None

@yuzefovich yuzefovich requested review from knz, mgartner and a team June 28, 2022 21:48
@cockroach-teamcity
Member

This change is Reviewable

Collaborator

@mgartner mgartner left a comment


:lgtm: But I'd like someone with more experience with this code to sign off on this.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz)

@yuzefovich
Member Author

I think Radu is the most familiar with this code. Maybe @irfansharif could take a quick look too given that you looked at this code recently?

@yuzefovich yuzefovich requested a review from irfansharif June 29, 2022 15:24
@knz
Contributor

knz commented Jun 29, 2022

The access to gossip can be thought of as a "nice-to-have" because it is not required for the feature to work

Can we remove the use of gossip altogether? An optimization that does not work with multitenancy is not worth having at all. We can remove it now, and if folk would like the mechanism to work faster, we can add another optimization, later, that makes it work with multitenancy.

@yuzefovich
Member Author

The access to gossip can be thought of as a "nice-to-have" because it is not required for the feature to work

Can we remove the use of gossip altogether? An optimization that does not work with multitenancy is not worth having at all. We can remove it now, and if folk would like the mechanism to work faster, we can add another optimization, later, that makes it work with multitenancy.

We could, but we might want to bring @kevin-v-ngo to this discussion. The question is whether it's worth making the stmt diagnostics feature have a larger latency (of propagation of the diagnostics requests throughout the cluster) in order to simplify the architecture unification effort.

@kevin-v-ngo

We could, but we might want to bring @kevin-v-ngo to this discussion. The question is whether it's worth making the stmt diagnostics feature have a larger latency (of propagation of the diagnostics requests throughout the cluster) in order to simplify the architecture unification effort.

Do we have a ballpark estimate on the impact to latency? But yes, it sounds reasonable to me to be incremental: surface the functionality for CockroachDB Serverless and track improving its performance as a separate item.

@yuzefovich
Member Author

Do we have a ballpark estimate on the impact to latency? But yes, it sounds reasonable to me to be incremental: surface the functionality for CockroachDB Serverless and track improving its performance as a separate item.

Yes, that latency should have an upper bound determined by the sql.stmt_diagnostics.poll_interval cluster setting (10 seconds by default).

@yuzefovich yuzefovich force-pushed the diagnostics-multi-tenant branch from df80be3 to 849bcf0 on June 29, 2022 19:21
@yuzefovich yuzefovich requested review from a team as code owners June 29, 2022 19:21
@yuzefovich yuzefovich force-pushed the diagnostics-multi-tenant branch from 849bcf0 to 0a6bc0e on June 29, 2022 19:22
@yuzefovich yuzefovich changed the title from "stmtdiagnostics: don't error in multi-tenant setup" to "stmtdiagnostics: remove the usage of gossip" on Jun 29, 2022
@yuzefovich
Member Author

I removed the usage of gossip entirely (and realized that the propagation of the cancellation requests via gossip was already broken when it was introduced in the 22.1 time frame). Some new questions:

  • do we want to allow tenants to change the polling interval?
  • do we want to exclude the reads performed as part of polling from the tenant costs, since users will no longer have a way to disable it? cc @andy-kimball

Contributor

@irfansharif irfansharif left a comment


Code changes LGTM. I'll defer to youse about this now being up to 10s stale, and maybe wanting to power this through rangefeeds for lower latency bounds. It's worth noting that unless we do something like #70614 and make the closed timestamp target duration a property of the table (and not the global cluster setting it is today), the staleness is going to be 3s, which is presumably still slower than gossip.

@ajwerner
Contributor

the staleness is going to be 3s, which is presumably still slower than gossip.

I'm not sure how relevant this is. The events due to writes show up on rangefeeds at quite low latency. You don't need the timestamp to be resolved to just get notified.

@andy-kimball
Contributor

I had thought this PR already excludes statement diagnostics from being charged for storage reads/writes. Are you seeing something different?

Is there some reason why statement diagnostics are required for customers? If we allowed customers to turn off statement diagnostics, what problems would that cause? If they do have a way to turn them off, then perhaps we should be charging them for the reads/writes.

@yuzefovich
Member Author

I had thought #83005 already excludes statement diagnostics from being charged for storage reads/writes. Are you seeing something different?

Indeed, looks like we're good - I didn't see the change to the stmt diagnostics in #83005 before you pointed it out.

Is there some reason why statement diagnostics are required for customers? If we allowed customers to turn off statement diagnostics, what problems would that cause? If they do have a way to turn them off, then perhaps we should be charging them for the reads/writes.

Before this PR it was possible to turn off the polling mechanism of the stmt diagnostics, and the feature would still work in the single-tenant setting; however, in a multi-tenant setting any attempt to create a new stmt diagnostics request would error out due to the unavailability of gossip.

With this change I'm proposing to disallow turning off the polling so that stmt diagnostics always works. However, now that we exempt the polling queries from tenant cost control, the polling interval could be set to a very low value, like 1ms, becoming a vector of abuse in the serverless setup. Maybe we ought to have a reasonable lower bound, say on the order of 1s, to prevent such a scenario? We could also make the setting TenantReadOnly.

I could also see an argument for keeping the ability to turn it off - if a customer knows that they don't have any stmt diagnostics requests, it seems safe to turn the polling off entirely. Then we could be charging the users for reads and writes as you're suggesting. But that seems like a product decision.

@andy-kimball
Contributor

@vy-ton, do you have any opinion about whether we could/should make SQL statement diagnostics optional? They'd be on by default, just like optimizer statistics, but they'd be possible to turn off if a customer knew they did not need them. If they can be turned off, then we'd charge RUs for them, under the principle that we only exempt features from charging if they cannot be controlled by customers (i.e. turned on/off).

If they cannot be turned off, then I think we should set a minimum collection interval of 10s for tenants. Is there a good reason to allow it to be any lower than that?

@vy-ton
Contributor

vy-ton commented Jun 30, 2022

@andy-kimball Are RUs consumed for the statement diagnostics polling mechanism? So users always have some RU cost even if they don't activate statement diagnostics.

@yuzefovich
Member Author

Are RUs consumed for the statement diagnostics polling mechanism? So users always have some RU cost even if they don't activate statement diagnostics.

If this polling mechanism is not disabled, then yes, RUs will always be consumed, even if users don't ever use the stmt diagnostics feature.

@irfansharif
Contributor

I'm not sure how relevant this is. The events due to writes show up on rangefeeds at quite low latency. You don't need the timestamp to be resolved to just get notified.

@ajwerner: I was not imagining a "get notified and then go scan the table" scheme. I was imagining using a rangefeedcache.Watcher, which maintains a materialized view of the table state cheaply, with state accessible only after the closed timestamp has been closed out (trailing now() - 3s today, by default).
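
To make the distinction concrete, here is a rough, hypothetical sketch of the "materialized view" idea; it is not the actual rangefeedcache.Watcher API, and all the names below are placeholders. Row events update an in-memory map, while a frontier timestamp records how far the view is known to be complete:

```go
// Hypothetical sketch of a rangefeed-backed cache, not the real
// rangefeedcache.Watcher API: row events update an in-memory map, and a
// frontier timestamp marks how far the view is known to be complete.
package sketch

import "sync"

// Event is a placeholder for a single row change received from a feed.
type Event struct {
	Key, Value string
	Deleted    bool
}

// Cache maintains a materialized view of a table from a stream of events.
type Cache struct {
	mu       sync.RWMutex
	rows     map[string]string
	frontier int64 // last resolved (closed-out) timestamp, in nanoseconds
}

func NewCache() *Cache { return &Cache{rows: make(map[string]string)} }

// OnEvent applies a single row event from the feed.
func (c *Cache) OnEvent(ev Event) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if ev.Deleted {
		delete(c.rows, ev.Key)
	} else {
		c.rows[ev.Key] = ev.Value
	}
}

// OnResolved advances the frontier once the feed reports that all writes up
// to ts have been observed (i.e. the timestamp has been closed out).
func (c *Cache) OnResolved(ts int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if ts > c.frontier {
		c.frontier = ts
	}
}

// Snapshot returns a copy of the view and the timestamp it is complete up to.
func (c *Cache) Snapshot() (map[string]string, int64) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	out := make(map[string]string, len(c.rows))
	for k, v := range c.rows {
		out[k] = v
	}
	return out, c.frontier
}
```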

@irfansharif
Contributor

Would use of a rangefeedcache.Watcher obviate the need for periodic scans of the table state? That way stmtdiagnostics can always be left enabled without accruing RUs.

@yuzefovich
Member Author

Would use of a rangefeedcache.Watcher obviate the need for periodic scans of the table state?

Yeah, I guess so (not very familiar with that machinery). Irfan, do you have an estimate for how long implementing that would take?

@irfansharif
Contributor

Probably a day or two if you just cargo-cult the settingswatcher, which I think is the most barebones user of the library:

// Package settingswatcher provides utilities to update cluster settings using
// a range feed.
package settingswatcher

@andy-kimball
Contributor

It'd be quite valuable for Serverless to get rid of the polling mechanism and instead only trigger background activity when we know there's probably work to be done.

@yuzefovich yuzefovich force-pushed the diagnostics-multi-tenant branch from 0a6bc0e to 5bf3df6 on July 6, 2022 02:55
Member Author

@yuzefovich yuzefovich left a comment


I brought back the ability to disable the polling mechanism but changed the cluster setting class to TenantReadOnly. I think we should merge this change as is, and I might work on the range feed mechanism (I took a quick look today, and it doesn't seem trivial, at least to me).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @knz)
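
For reference, a rough sketch of what the class change amounts to, using the shape of the settings API in this era; the description string and validator below are illustrative, and the exact signatures may differ between versions:

```go
// Illustrative sketch, not the literal diff in this PR: the poll-interval
// setting registered with the TenantReadOnly class instead of TenantWritable.
package sketch

import (
	"time"

	"github.com/cockroachdb/cockroach/pkg/settings"
)

var pollingInterval = settings.RegisterDurationSetting(
	settings.TenantReadOnly, // previously settings.TenantWritable
	"sql.stmt_diagnostics.poll_interval",
	"interval at which each node polls the system table for stmt diagnostics requests (0 disables polling)",
	10*time.Second,
	settings.NonNegativeDuration, // zero disables polling; negative values are rejected
)
```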

@yuzefovich yuzefovich force-pushed the diagnostics-multi-tenant branch from 5bf3df6 to 33a0980 on July 6, 2022 16:56
@yuzefovich
Member Author

Alright, so here is what I'm proposing with this PR:

  • remove the usage of gossip entirely (this helps with the unification effort and enables the feature in the multi-tenant setup)
  • keep the ability to disable the polling
  • tenants don't get charged for polling the system table in Serverless (which is already the case, but previously they wouldn't be able to use the feature at all - we were just wasting the effort of polling)
  • make the polling interval TenantReadOnly. cc @knz it'd be good to get your sign-off on this one.

Fully addressing #47893 (implementing a cache using a range feed over the system table so that the polling mechanism can be removed completely) will be done separately, with no timeline yet, so it is left as a follow-up. cc @vy-ton @andy-kimball for awareness.

If there are no objections (and the CI is green), I'll merge this PR tomorrow.

@knz
Contributor

knz commented Jul 6, 2022

make the polling interval TenantReadOnly. cc @knz it'd be good to get your sign-off on this one.

Yes, and in the explanatory text (in commit message / PR description) you can also mention why we restrict this: because we don't charge the customer for the polling.

(If we charged them, which we may do eventually, then they could set the polling interval to whichever value they choose. But maybe this point becomes entirely moot when we do the rangefeed thing instead)

Also, who's filing the issue for the followup work?

@yuzefovich yuzefovich force-pushed the diagnostics-multi-tenant branch from 33a0980 to f422bad on July 6, 2022 19:35
@yuzefovich
Member Author

in the explanatory text (in commit message / PR description) you can also mention

Expanded the comment.

who's filing the issue for the followup work?

I think it is already covered by #47893 (which is owned by SQL o11y).

@knz
Contributor

knz commented Jul 6, 2022

thanks!

@yuzefovich
Member Author

Thanks everyone for the discussion and the reviews!

bors r+

@craig
Contributor

craig bot commented Jul 7, 2022

Build succeeded:

@craig craig bot merged commit 4dfbe65 into cockroachdb:master Jul 7, 2022
@yuzefovich yuzefovich deleted the diagnostics-multi-tenant branch July 7, 2022 18:45