-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: sep-raft-log: verify ReplicaID consistency at boot time #93310
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Comments
tbg
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv-replication
labels
Dec 9, 2022
cc @cockroachdb/replication |
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Dec 9, 2022
These are the beginnings of cockroachdb#93310 and cockroachdb#93244. We need to inspect and possibly reconcile the persisted state at startup time. This PR refactors what we have into a shape where new code can be added more easily. Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Dec 16, 2022
These are the beginnings of cockroachdb#93310 and cockroachdb#93244. We need to inspect and possibly reconcile the persisted state at startup time. This PR refactors what we have into a shape where new code can be added more easily. Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Dec 19, 2022
These are the beginnings of cockroachdb#93310 and cockroachdb#93244. We need to inspect and possibly reconcile the persisted state at startup time. This PR refactors what we have into a shape where new code can be added more easily. Release note: None
craig bot
pushed a commit
that referenced
this issue
Dec 19, 2022
93311: kvserver: introduce loadAndReconcileReplicas r=pavelkalinnikov a=tbg These are the beginnings of #93310 and #93244. We need to inspect and possibly reconcile the persisted state at startup time. This PR refactors what we have into a shape where new code can be added more easily. Epic: CRDB-220 Release note: None Co-authored-by: Tobias Grieger <[email protected]>
There can be uninitialized replicas that were created before we introduced RaftReplicaID. We can "just" remove these when we encounter them. |
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Jan 18, 2023
I'm working on a datadriven test for `kvstorage` and I need to be able to initialize the engine there. There's more that should move, but I'm taking it one step at a time. As we build up more and more sophisticated datadriven tests in `kvstorage` we can move whatever we need when we need it. PS: I looked at the unit test being modified by the move and it can't yet be moved to kvstorage - it also bootstraps the initial ranges, etc, requiring a lot more code movement until we can move it. Touches cockroachdb#93310. Epic: CRDB-220 Release note: None
craig bot
pushed a commit
that referenced
this issue
Jan 18, 2023
94670: metrics: add tenant name to _status/vars output r=knz a=dhartunian This commit adds a `tenant` prometheus label to each metrics output via the `_status/vars` HTTP endpoint. The label is populated with either "system" or the name of the tenant generating the metric. When queried from the system tenant's HTTP port or handler, the result will now also contain metrics from all in-process tenants on the same node. When initializing a tenant, an optional "parent" metrics recorder is passed down allowing the tenant's recorder to be registered with the parent. When we query the parent it iterates over all child recorders (much like we already do for stores) and outputs all of their metrics as well. Example: ``` sql_txn_commit_count{tenant="system"} 0 sql_txn_commit_count{tenant="demo-tenant"} 0 ``` Resolves: #94663 Epic: CRDB-18798 Release note (ops change): Metrics output via `_status/vars` now contain `tenant` labels allowing the user to distinguish between metrics emitted by the system tenant vs other app tenants identified by their IDs. Co-authored-by: Alex Barganier <[email protected]> Co-authored-by: Aaditya Sondhi <[email protected]> 95234: sql: evaluate correlated subqueries as routines r=mgartner a=mgartner #### opt: create tree.Routine planning closure in helper function The logic for creating a planning closure for a `tree.Routine` has been moved to a helper function so that it can be reused in future commits. Release note: None #### sql: evaluate correlated subqueries as routines Previously, the optimizer would error in rare cases when it was unable to hoist correlated subqueries into apply-joins. Now, scalar, correlated subqueries that aren't hoisted are executed successfully. There is remaining work to apply the same method in this commit to `EXISTS` and `<op> ANY` subqueries. Hoisting correlated subqueries is not possible when a conditional expression, like a `CASE`, wraps a subquery that is not leak-proof. One of the effects of hoisting a subquery is that the subquery will be unconditionally evaluated. For leak-proof subqueries, the worst case is that unnecessary computation is performed. For non-leak-proof subqueries, errors could originate from the subquery when it should have never been evaluated because the corresponding conditional expression was never true. So, in order to support these cases, we must be able to execute a correlated subquery. A correlated subquery can be thought of as a relational expression with parameters that need to be filled in with constant value arguments for each invocation. It is essentially a user-defined function with a single statement in the function body. So, the `tree.RoutineExpr` machinery that powers UDFs is easily repurposed to facilitate evaluation of correlated subqueries. Fixes #71908 Fixes #73573 Fixes #80169 Release note (sql change): Some queries which previously resulted in the error "could not decorrelate subquery" now succeed. 95432: kvserver,kvstorage: move InitEngine r=pavelkalinnikov a=tbg I'm working on a datadriven test for `kvstorage` and I need to be able to initialize the engine there. There's more that should move, but I'm taking it one step at a time. As we build up more and more sophisticated datadriven tests in `kvstorage` we can move whatever we need when we need it. PS: I looked at the unit test being modified by the move and it can't yet be moved to kvstorage - it also bootstraps the initial ranges, etc, requiring a lot more code movement until we can move it. 100% mechanical movement via Goland. Touches #93310. Epic: CRDB-220 Release note: None Co-authored-by: David Hartunian <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Tobias Grieger <[email protected]>
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Jan 20, 2023
Same rationale as cockroachdb#95432 - they belong in `kvstorage` and I need them there now as I'm working on a datadriven test. Touches cockroachdb#93310. Epic: CRDB-220 Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Jan 23, 2023
See cockroachdb#93310. This is also the beginning of cockroachdb#93247. Epic: CRDB-220 Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Feb 2, 2023
Same rationale as cockroachdb#95432 - they belong in `kvstorage` and I need them there now as I'm working on a datadriven test. Touches cockroachdb#93310. Epic: CRDB-220 Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Feb 2, 2023
Same rationale as cockroachdb#95432 - they belong in `kvstorage` and I need them there now as I'm working on a datadriven test. Touches cockroachdb#93310. Epic: CRDB-220 Release note: None
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Feb 2, 2023
See cockroachdb#93310. This is also the beginning of cockroachdb#93247. Epic: CRDB-220 Release note: None
craig bot
pushed a commit
that referenced
this issue
Feb 3, 2023
95710: sql: support `*` in udf bodies r=rharding6373 a=rharding6373 This change allows `*` usage in UDF bodies. We rewrite UDF ASTs in place to expand `*`s into the columns they reference. Informs: #90080 Epic: CRDB-19496 Release note (sql change): Allow `*` expressions in UDFs. 96029: pkg/util/metric: option to use legacy hdrhistogram model, increase bucket fidelity r=tbg,aadityasondhi a=dhartunian This patch reeintroduces the old HdrHistogram model to optionally be enabled in favor of the new Prometheus model, gated behind an environment variable called `COCKROACH_ENABLE_HDR_HISTOGRAMS`, allowing users a means to "fall back" to the old model in the event that the new model does not adequately serve their needs (think of this as an "insurance policy" to protect against this from happening again with no real mitigation - ideally, this environment variable should never have to be used). It also updates the pre-defined bucket boundaries used by the Prometheus backed histograms with more buckets. This aims to improve precision, especially for latency histograms, when calculating quantiles (the low precision being the core cause of the issue at hand). Note: some histograms were introduced *after* the new Prometheus histograms were added to CockroachDB. In this case, we use the `ForceUsePrometheus` option in the `HistogramOptions` struct to ignore the value of the env var, since there never was a time where these specific histograms used the HdrHistogram model. Release note (ops change): Histogram metrics can now optionally use the legacy HdrHistogram model by setting the environment var `COCKROACH_ENABLE_HDR_HISTOGRAMS=true` on CockroachDB nodes. **Note that this is not recommended** unless users are having difficulties with the newer Prometheus-backed histogram model. Enabling can cause performance issues with timeseries databases like Prometheus, as processing and storing the increased number of buckets is taxing on both CPU and storage. Note that the HdrHistogram model is slated for full deprecation in upcoming releases. Fixes: #95833 96269: sql/physicalplan: allow passing replica oracle to planning r=dt a=dt This PR pulls out a subset of the changes in #93810, namely allowing a replica-choice oracle to be passed on a per-plan basis, but without changing which oracle is actually used in any existing planning, and only introducing the API for future users. This refactors span resolver and the physical planning methods to allow passing a replica-choice oracle per planning instead of using a fixed oracle (either bin-packing or closest depending on server type). The existing API that uses the default oracle is kept as is for all existing callers, while a new WithOracle variant is added for future callers who wish to pass their own oracle. An new oracle that prefers followers over leaseholders is also added here, however it is not used at this time. Release note: none. Epic: none. 96349: logictest: skip another flaky test in ranges r=mgartner a=mgartner This commit skips a flaky test that I missed in #96271. Informs #96136 Epic: None Release note: None 96433: storage: add string repr for FullReplicaID r=pavelkalinnikov a=tbg Helpful for subsequent unit testing involving this type. NB: this type is in the wrong place. Now's not the time to move it. Epic: none Release note: None 96434: kvserver,kvstorage: move read/write methods for cluster versions r=pavelkalinnikov a=tbg Same rationale as #95432 - they belong in `kvstorage` and I need them there now as I'm working on a datadriven test. Touches #93310. Epic: CRDB-220 Release note: None 96445: backupccl: add 'aws-weekly' tag to restore/tpce/32tb test r=healthy-pod,renatolabs a=msbutler Epic: none Release note: None Co-authored-by: rharding6373 <[email protected]> Co-authored-by: David Hartunian <[email protected]> Co-authored-by: Alex Barganier <[email protected]> Co-authored-by: Rui Hu <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Tobias Grieger <[email protected]> Co-authored-by: Michael Butler <[email protected]>
tbg
added a commit
to tbg/cockroach
that referenced
this issue
Feb 3, 2023
See cockroachdb#93310. This is also the beginning of cockroachdb#93247. Epic: CRDB-220 Release note: None
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
We need to check that all replicas (incl. uninitialized ones) have a persisted ReplicaID and that it matches that in the descriptor (if initialized).
See #93244; that check will live in a similar location.
Epic: CRDB-220
Jira issue: CRDB-22267
The text was updated successfully, but these errors were encountered: