-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql,backupccl: add replica oracle that prefers non-leaseholders, use in backup #93810
Conversation
…in backup Following cockroachdb#91405 which enables followers to serve ExportRequests, this patch introduces an oracle that prefers non-potential-leaseholders when selecting a replica. This oracle is then used in backups so that ExportRequests during backups have a preference to be served by non-leaseholders. Release note: None
The ability to configure the replica choice function on a per planning basis looks really useful, so I vote we move forward with this, just maybe not using PreferFollower for backups (or anything at the moment) What if we kept the signature of SetupAllNodesPlanning the same anded a second flavor, SetupAllNodesPlanningWithReplicaChoice or something -- I think that's remove a bunch of the changes from this diff that are just passing |
95710: sql: support `*` in udf bodies r=rharding6373 a=rharding6373 This change allows `*` usage in UDF bodies. We rewrite UDF ASTs in place to expand `*`s into the columns they reference. Informs: #90080 Epic: CRDB-19496 Release note (sql change): Allow `*` expressions in UDFs. 96029: pkg/util/metric: option to use legacy hdrhistogram model, increase bucket fidelity r=tbg,aadityasondhi a=dhartunian This patch reeintroduces the old HdrHistogram model to optionally be enabled in favor of the new Prometheus model, gated behind an environment variable called `COCKROACH_ENABLE_HDR_HISTOGRAMS`, allowing users a means to "fall back" to the old model in the event that the new model does not adequately serve their needs (think of this as an "insurance policy" to protect against this from happening again with no real mitigation - ideally, this environment variable should never have to be used). It also updates the pre-defined bucket boundaries used by the Prometheus backed histograms with more buckets. This aims to improve precision, especially for latency histograms, when calculating quantiles (the low precision being the core cause of the issue at hand). Note: some histograms were introduced *after* the new Prometheus histograms were added to CockroachDB. In this case, we use the `ForceUsePrometheus` option in the `HistogramOptions` struct to ignore the value of the env var, since there never was a time where these specific histograms used the HdrHistogram model. Release note (ops change): Histogram metrics can now optionally use the legacy HdrHistogram model by setting the environment var `COCKROACH_ENABLE_HDR_HISTOGRAMS=true` on CockroachDB nodes. **Note that this is not recommended** unless users are having difficulties with the newer Prometheus-backed histogram model. Enabling can cause performance issues with timeseries databases like Prometheus, as processing and storing the increased number of buckets is taxing on both CPU and storage. Note that the HdrHistogram model is slated for full deprecation in upcoming releases. Fixes: #95833 96269: sql/physicalplan: allow passing replica oracle to planning r=dt a=dt This PR pulls out a subset of the changes in #93810, namely allowing a replica-choice oracle to be passed on a per-plan basis, but without changing which oracle is actually used in any existing planning, and only introducing the API for future users. This refactors span resolver and the physical planning methods to allow passing a replica-choice oracle per planning instead of using a fixed oracle (either bin-packing or closest depending on server type). The existing API that uses the default oracle is kept as is for all existing callers, while a new WithOracle variant is added for future callers who wish to pass their own oracle. An new oracle that prefers followers over leaseholders is also added here, however it is not used at this time. Release note: none. Epic: none. 96349: logictest: skip another flaky test in ranges r=mgartner a=mgartner This commit skips a flaky test that I missed in #96271. Informs #96136 Epic: None Release note: None 96433: storage: add string repr for FullReplicaID r=pavelkalinnikov a=tbg Helpful for subsequent unit testing involving this type. NB: this type is in the wrong place. Now's not the time to move it. Epic: none Release note: None 96434: kvserver,kvstorage: move read/write methods for cluster versions r=pavelkalinnikov a=tbg Same rationale as #95432 - they belong in `kvstorage` and I need them there now as I'm working on a datadriven test. Touches #93310. Epic: CRDB-220 Release note: None 96445: backupccl: add 'aws-weekly' tag to restore/tpce/32tb test r=healthy-pod,renatolabs a=msbutler Epic: none Release note: None Co-authored-by: rharding6373 <[email protected]> Co-authored-by: David Hartunian <[email protected]> Co-authored-by: Alex Barganier <[email protected]> Co-authored-by: Rui Hu <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Tobias Grieger <[email protected]> Co-authored-by: Michael Butler <[email protected]>
Following #91405 which enables followers to serve ExportRequests, this patch introduces an oracle that prefers non-potential-leaseholders when selecting a replica. This oracle is then used in backups so that ExportRequests during backups have a preference to be served by non-leaseholders.
Release note: None