Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql,backupccl: add replica oracle that prefers non-leaseholders, use in backup #93810

Closed
wants to merge 1 commit into from

Conversation

rhu713
Copy link
Contributor

@rhu713 rhu713 commented Dec 16, 2022

Following #91405 which enables followers to serve ExportRequests, this patch introduces an oracle that prefers non-potential-leaseholders when selecting a replica. This oracle is then used in backups so that ExportRequests during backups have a preference to be served by non-leaseholders.

Release note: None

…in backup

Following cockroachdb#91405 which enables followers to serve ExportRequests, this patch
introduces an oracle that prefers non-potential-leaseholders when selecting a
replica. This oracle is then used in backups so that ExportRequests during
backups have a preference to be served by non-leaseholders.

Release note: None
@rhu713 rhu713 requested a review from stevendanna December 16, 2022 20:45
@rhu713 rhu713 requested review from a team as code owners December 16, 2022 20:45
@rhu713 rhu713 requested review from a team December 16, 2022 20:45
@rhu713 rhu713 requested a review from a team as a code owner December 16, 2022 20:45
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rhu713 rhu713 removed request for a team December 16, 2022 20:46
@rhu713 rhu713 marked this pull request as draft December 16, 2022 23:15
@dt
Copy link
Member

dt commented Jan 19, 2023

The ability to configure the replica choice function on a per planning basis looks really useful, so I vote we move forward with this, just maybe not using PreferFollower for backups (or anything at the moment)

What if we kept the signature of SetupAllNodesPlanning the same anded a second flavor, SetupAllNodesPlanningWithReplicaChoice or something -- I think that's remove a bunch of the changes from this diff that are just passing nil.

craig bot pushed a commit that referenced this pull request Feb 3, 2023
95710: sql: support `*` in udf bodies r=rharding6373 a=rharding6373

This change allows `*` usage in UDF bodies. We rewrite UDF ASTs in place
to expand `*`s into the columns they reference.

Informs: #90080

Epic: CRDB-19496
Release note (sql change): Allow `*` expressions in UDFs.

96029: pkg/util/metric: option to use legacy hdrhistogram model, increase bucket fidelity r=tbg,aadityasondhi a=dhartunian

This patch reeintroduces the old HdrHistogram model to optionally be
enabled in favor of the new Prometheus model, gated behind
an environment variable called `COCKROACH_ENABLE_HDR_HISTOGRAMS`,
allowing users a means to "fall back" to the old model in the
event that the new model does not adequately serve their needs
(think of this as an "insurance policy" to protect against
this from happening again with no real mitigation - ideally,
this environment variable should never have to be used).

It also updates the pre-defined bucket boundaries used by the Prometheus
backed histograms with more buckets. This aims to improve precision,
especially for latency histograms, when calculating quantiles (the low precision
being the core cause of the issue at hand).

Note: some histograms were introduced *after* the new
Prometheus histograms were added to CockroachDB. In this
case, we use the `ForceUsePrometheus` option in the
`HistogramOptions` struct to ignore the value of the env
var, since there never was a time where these specific
histograms used the HdrHistogram model.

Release note (ops change): Histogram metrics can now optionally
use the legacy HdrHistogram model by setting the environment var
`COCKROACH_ENABLE_HDR_HISTOGRAMS=true` on CockroachDB nodes.
**Note that this is not recommended** unless users are having
difficulties with the newer Prometheus-backed histogram model.
Enabling can cause performance issues with timeseries databases
like Prometheus, as processing and storing the increased number
of buckets is taxing on both CPU and storage. Note that the
HdrHistogram model is slated for full deprecation in upcoming
releases.

Fixes: #95833

96269: sql/physicalplan: allow passing replica oracle to planning r=dt a=dt

This PR pulls out a subset of the changes in #93810, namely allowing a replica-choice oracle to be passed on a per-plan basis, but without changing which oracle is actually used in any existing planning, and only introducing the API for future users. 

This refactors span resolver and the physical planning methods to allow passing a replica-choice oracle per planning instead of using a fixed oracle (either bin-packing or closest depending on server type). The existing API that uses the default oracle is kept as is for all existing callers, while a new WithOracle variant is added for future callers who wish to pass their own oracle.

An new oracle that prefers followers over leaseholders is also added here, however it is not used at this time.

Release note: none.
Epic: none.

96349: logictest: skip another flaky test in ranges r=mgartner a=mgartner

This commit skips a flaky test that I missed in #96271.

Informs #96136

Epic: None

Release note: None

96433: storage: add string repr for FullReplicaID r=pavelkalinnikov a=tbg

Helpful for subsequent unit testing involving this type.

NB: this type is in the wrong place. Now's not the time
to move it.

Epic: none
Release note: None


96434: kvserver,kvstorage: move read/write methods for cluster versions r=pavelkalinnikov a=tbg

Same rationale as #95432 - they belong in `kvstorage` and I need
them there now as I'm working on a datadriven test.

Touches #93310.

Epic: CRDB-220
Release note: None


96445: backupccl: add 'aws-weekly' tag to restore/tpce/32tb test r=healthy-pod,renatolabs a=msbutler

Epic: none

Release note: None

Co-authored-by: rharding6373 <[email protected]>
Co-authored-by: David Hartunian <[email protected]>
Co-authored-by: Alex Barganier <[email protected]>
Co-authored-by: Rui Hu <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
@rhu713 rhu713 closed this Mar 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants