…db#93758 cockroachdb#95007 cockroachdb#95145 cockroachdb#95387 cockroachdb#95557 cockroachdb#95559 cockroachdb#95564 cockroachdb#95583 cockroachdb#95606

90222: server: add api for decommission pre-flight checks r=AlexTalks a=AlexTalks

While we have an API for checking the status of an in-progress decommission, we did not previously have an API to execute sanity checks prior to requesting a node to move into the `DECOMMISSIONING` state. This adds an API to do just that, intended to be called by the CLI prior to issuing a subsequent `Decommission` RPC request.
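A minimal sketch of the intended CLI flow, assuming hypothetical Go method and field names derived from the request/response fields documented in the `full.md` diff below; the actual RPC signatures may differ:

```go
import (
	"context"
	"fmt"

	"github.com/cockroachdb/cockroach/pkg/server/serverpb"
)

// checkBeforeDecommission runs the pre-flight checks and reports per-node
// readiness; only if every node looks ready would the CLI go on to issue the
// Decommission RPC. Field and method names here are assumptions for
// illustration, not the actual generated proto API.
func checkBeforeDecommission(
	ctx context.Context, admin serverpb.AdminClient, nodeIDs []int32,
) error {
	resp, err := admin.DecommissionPreCheck(ctx, &serverpb.DecommissionPreCheckRequest{
		NodeIds:          nodeIDs,
		NumReplicaReport: 10,   // report errors for at most 10 ranges per node
		StrictReadiness:  true, // only replacement/removal actions may be needed
	})
	if err != nil {
		return err
	}
	for _, n := range resp.CheckedNodes {
		fmt.Printf("n%d: readiness=%v, replicas=%d\n",
			n.NodeId, n.DecommissionReadiness, n.ReplicaCount)
	}
	return nil
}
```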

Fixes cockroachdb#91568.

Release note: None

91458: roachtest/mixed-versions-compat: use corpus for master r=fqazi a=fqazi

Informs: cockroachdb#91350

The CI build scripts take advantage of the branch name to upload the corpus to GCS. Unfortunately, we have no way of knowing whether the current build is master inside the roachtest. To address this, this patch supports fetching the master corpus as a fallback.
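A minimal sketch of that fallback behavior, assuming an illustrative GCS object layout and helper names rather than the actual roachtest implementation:

```go
import (
	"context"
	"errors"
	"io"

	"cloud.google.com/go/storage"
)

// fetchCorpus tries the branch-specific corpus first and falls back to the
// master corpus if the branch object does not exist.
func fetchCorpus(ctx context.Context, bucket *storage.BucketHandle, branch string) ([]byte, error) {
	for _, object := range []string{"corpus-" + branch, "corpus-master"} {
		r, err := bucket.Object(object).NewReader(ctx)
		if err != nil {
			continue // object missing for this branch; try the master fallback
		}
		defer r.Close()
		return io.ReadAll(r)
	}
	return nil, errors.New("no corpus found for branch or master")
}
```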

Release note: None

92826: multiregionccl: add a cold start latency test  r=ajwerner a=ajwerner

This commit adds a test which creates an MR serverless cluster and then
boots the SQL pods in each region while disallowing connectivity to other
regions. It also simulates latency to make sure the routing logic works
and to provide a somewhat realistic picture of what to expect.
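An illustrative sketch of one way to simulate pairwise region latency when dialing between test pods; the real test uses the `regionlatency` test utility added in this commit, and the names below are made up for the example:

```go
import (
	"context"
	"net"
	"time"
)

// latencyMap holds simulated one-way latencies between regions.
type latencyMap map[string]map[string]time.Duration

func (m latencyMap) delay(from, to string) time.Duration {
	if from == to {
		return 0
	}
	return m[from][to]
}

// dialWithLatency sleeps for the simulated inter-region latency before
// establishing the connection, approximating a cross-region dial.
func dialWithLatency(ctx context.Context, m latencyMap, from, to, addr string) (net.Conn, error) {
	select {
	case <-time.After(m.delay(from, to)):
	case <-ctx.Done():
		return nil, ctx.Err()
	}
	var d net.Dialer
	return d.DialContext(ctx, "tcp", addr)
}
```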

Epic: CRDB-18596

Release note: None

93758: server: evaluate decommission pre-checks r=kvoli a=AlexTalks

This adds support for evaluating the decommission readiness of a
node (or set of nodes) by simulating their liveness as having the
DECOMMISSIONING status and using the allocator to ensure that we are
able to perform any actions needed to repair each range. This supports a
"strict" mode, in which we expect all ranges to need only
replacement or removal due to the decommissioning status, and a more
permissive "non-strict" mode, which allows other actions to be needed, as
long as they do not encounter errors in finding a suitable allocation
target. The non-strict mode permits situations where a range
may need more than one action to repair it, such as a range that
needs to reach its replication factor before the decommissioning replica
can be replaced, or a range that needs to finalize an atomic replication
change.
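A condensed sketch of the strict vs. non-strict readiness rules described above; the action names and signature are illustrative stand-ins, not the actual allocator types:

```go
// rangeIsReady encodes the readiness rules for one range, given the
// allocator-determined action and any error from finding a suitable target.
func rangeIsReady(action string, targetErr error, strict bool) bool {
	decommissionOnly := action == "replace decommissioning voter" ||
		action == "remove decommissioning voter"
	if strict {
		// Strict mode: every range must need only replacement or removal due
		// to the decommissioning status, and that action must be able to proceed.
		return decommissionOnly && targetErr == nil
	}
	// Non-strict mode: other actions (e.g. up-replicating to the replication
	// factor first, or finalizing an atomic replication change) are allowed,
	// as long as the allocator did not fail to find a suitable target.
	return targetErr == nil
}
```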

Depends on cockroachdb#94024.

Part of cockroachdb#91568

95007: admission: CPU slot adjustment and utilization metrics r=irfansharif a=sumeerbhola

Our existing metrics are gauges (total and used slots), which don't give us insight into what is happening at smaller time scales. This creates uncertainty when we observe admission queueing but the gauge samples show total slots consistently greater than used slots. Additionally, if total slots is steady during queueing, it doesn't tell us whether that was because of roughly matching increments and decrements, or no increments/decrements at all.

The following metrics are added:
- admission.granter.slots_exhausted_duration.kv: cumulative duration when the slots were exhausted. This can give insight into how much exhaustion was occurring. It is insufficient to tell us whether 0.5 sec/sec of exhaustion is due to a single long 500ms stretch of exhaustion followed by non-exhaustion, or to alternating 1ms periods of exhaustion and non-exhaustion. But this is an improvement over what we have (see the sketch after this list for how such a cumulative duration can be maintained).
- admission.granter.slot_adjuster_{increments,decrements}.kv: Counts the increments and decrements of the total slots.
- admission.granter.cpu_load_{short,long}_period_duration.kv: cumulative duration of short and long ticks, as indicated by the period in the CPULoad callback. We don't expect long-period ticks when admission control is active (and we explicitly disable enforcement during long-period ticks), but it helps us eliminate some hypotheses during incidents (e.g. long-period ticks alternating with short-period ticks, causing a slowdown in how fast we increment slots). Additionally, if the sum of the rates of these two is significantly < 1, that would indicate that the CPULoad frequency is lower than expected, say due to CPU overload.
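A rough sketch, not the actual granter code, of how a cumulative exhausted-duration metric like the first one above can be maintained:

```go
import "time"

// slotExhaustionTracker accumulates the total time during which all KV slots
// were exhausted; the cumulative value is what a counter metric would export.
type slotExhaustionTracker struct {
	exhaustedSince    time.Time     // zero when slots are not currently exhausted
	exhaustedDuration time.Duration // cumulative exhausted time
}

// update is called whenever used or total slots change.
func (t *slotExhaustionTracker) update(used, total int, now time.Time) {
	exhausted := used >= total
	switch {
	case exhausted && t.exhaustedSince.IsZero():
		t.exhaustedSince = now // transition into exhaustion
	case !exhausted && !t.exhaustedSince.IsZero():
		t.exhaustedDuration += now.Sub(t.exhaustedSince) // transition out
		t.exhaustedSince = time.Time{}
	}
}
```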

Fixes cockroachdb#92673

Epic: none

Release note: None

95145: sql/stats: include partial stats in results of statsCache.GetTableStats r=rytaft a=michae2

We were not including partial stats in the list of table statistics returned by `statsCache.GetTableStats`. This was fine for the optimizer, which currently cannot use partial stats directly, but it was a problem for backup.

We'd like to use partial stats directly in the optimizer eventually, so this commit goes ahead and adds them to the results of `GetTableStats`. The optimizer then must filter them out. To streamline this we add some helper functions.

Finally, in an act of overzealous refactoring, this commit also changes `MergedStatistics` and `ForecastTableStatistics` to accept partial statistics and full statistics mixed together in the same input list. This simplifies the code that calls these functions.
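A minimal sketch of the kind of helper described above, using a stand-in statistic type; how partial statistics are actually distinguished in the stats package is an assumption here:

```go
// tableStat is a stand-in for the real table statistic type.
type tableStat struct {
	Name             string
	PartialPredicate string // assumed to be non-empty only for partial stats
}

// splitStats separates the combined GetTableStats-style result into full and
// partial statistics so the optimizer can ignore the partial ones for now.
func splitStats(all []*tableStat) (full, partial []*tableStat) {
	for _, s := range all {
		if s.PartialPredicate != "" {
			partial = append(partial, s)
		} else {
			full = append(full, s)
		}
	}
	return full, partial
}
```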

Fixes: cockroachdb#95056
Part of: cockroachdb#93983

Epic: CRDB-19449

Release note: None

95387: kvserver: separate loadstats cpu nanos to raft/req r=andrewbaptist a=kvoli

Previously, loadstats tracked replica raft/request CPU nanos per second separately but returned both summed together in `load.ReplicaLoadStats`. This patch separates `RaftCPUNanosPerSecond` and `RequestCPUNanosPerSecond` in the returned `load.ReplicaLoadStats` so that they may be used independently.
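Roughly the shape of the change; the struct below is a simplified illustration rather than the full `load.ReplicaLoadStats` definition:

```go
// ReplicaLoadStats (simplified) now reports raft and request CPU separately
// instead of pre-summing them.
type ReplicaLoadStats struct {
	RaftCPUNanosPerSecond    float64
	RequestCPUNanosPerSecond float64
}

// TotalCPUNanosPerSecond recovers the previous combined value for callers
// that still want the sum.
func (s ReplicaLoadStats) TotalCPUNanosPerSecond() float64 {
	return s.RaftCPUNanosPerSecond + s.RequestCPUNanosPerSecond
}
```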

Informs cockroachdb#95380

Release note: None

95557: sql: Fix testing_optimizer_disable_rule_probability usage with vtables r=cucaroach a=cucaroach

If a vtable scan query tries to use the dummy "0" column, the exec
builder errors out. This typically won't happen thanks to the prune-columns
normalization rules, and those rules are marked as "essential", but the
logic allowing those essential rules to be applied was flawed.

Epic: CRDB-20535
Informs: cockroachdb#94890
Release note: None


95559: rpc: fix comment r=andreimatei a=andreimatei

This copy-pasted comment mentioned a KV node, which was not right.

Release note: None
Epic: None

95564: cpustopwatch: s/grunning.Difference/grunning.Elapsed r=irfansharif a=irfansharif

`grunning.Elapsed()` is the API to use when measuring the running time spent doing some piece of work, with measurements taken at the start and end. This API only exists due to `grunning.Time()`'s non-monotonicity, a bug in our runtime patch: cockroachdb#95529. The bug results in slight {over,under}-estimation of the running time (the latter breaking monotonicity), but is livable with our current uses of this library, including the one here in cpustopwatch. `grunning.Elapsed()` papers over this bug by zeroing out when `grunning.Time()` stamps regress. This is unlike `grunning.Difference()`, which would return the absolute value of the regression -- not what we want here.
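A sketch of the semantic difference, assuming raw nanosecond readings where the end value may occasionally regress below the start due to the bug; this is not the actual grunning implementation:

```go
// elapsed mirrors the grunning.Elapsed behavior described above: a regression
// between readings is treated as zero elapsed running time.
func elapsed(start, end int64) int64 {
	if end < start {
		return 0
	}
	return end - start
}

// difference mirrors grunning.Difference: it returns the absolute value, so a
// regression would show up as positive running time -- not what we want here.
func difference(start, end int64) int64 {
	if d := end - start; d >= 0 {
		return d
	}
	return start - end
}
```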

Release note: None

95583: sql: fix cluster setting propagation flake take 2 r=cucaroach a=cucaroach

Previously we tried to fix this with one retry, but that was
insufficient. Extend the retry to all queries in this section of the test.
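A hedged sketch of the approach, wrapping a query in the standard `testutils.SucceedsSoon` retry helper; the setting name and expected value are placeholders:

```go
import (
	gosql "database/sql"
	"testing"

	"github.com/cockroachdb/cockroach/pkg/testutils"
	"github.com/cockroachdb/errors"
)

// waitForSetting retries the query until the cluster setting has propagated
// to the node being queried, instead of asserting on the first attempt.
func waitForSetting(t *testing.T, db *gosql.DB, expected string) {
	testutils.SucceedsSoon(t, func() error {
		var value string
		if err := db.QueryRow("SHOW CLUSTER SETTING some.setting").Scan(&value); err != nil {
			return err
		}
		if value != expected {
			return errors.Newf("setting not yet propagated: got %s", value)
		}
		return nil
	})
}
```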

Release note: None
Epic: CRDB-20535


95606: backupccl: deflake TestScheduleChainingLifecycle r=adityamaru a=msbutler

This patch skips the test if the machine clock is close to midnight, and increases the frequency of incremental backups to run every 2 minutes.

Previously, the backup schedule in this test used the following crontab recurrence: '*/5 * * * *'. In English, this means "run a full backup now, then run a full backup every day at midnight, and an incremental every 5 minutes." This test also relies on an incremental backup running after the first full backup. But what happens if the first full backup gets scheduled to run within 5 minutes of midnight? A second full backup may get scheduled before the expected incremental backup, breaking the invariant the test expects.

Fixes cockroachdb#88575 cockroachdb#95186

Release note: None

Co-authored-by: Alex Sarkesian <[email protected]>
Co-authored-by: Faizan Qazi <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: sumeerbhola <[email protected]>
Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Tommy Reilly <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: irfan sharif <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
11 people committed Jan 20, 2023
13 parents 47d6f18 + f6cf9b3 + 137e0c0 + 4c59d79 + 09181f8 + 774df8d + 3bc0992 + 7925b6e + 6dfc5e2 + 74d45b7 + cb47b78 + 73408a2 + a7f1750 commit 1b79102
Showing 51 changed files with 1,744 additions and 252 deletions.
98 changes: 98 additions & 0 deletions docs/generated/http/full.md
@@ -6280,6 +6280,104 @@ DrainResponse is the response to a successful DrainRequest.



## DecommissionPreCheck



DecommissionPreCheck requests that the server execute preliminary checks
to evaluate the possibility of successfully decommissioning a given node.

Support status: [reserved](#support-status)

#### Request Parameters




DecommissionPreCheckRequest requests that preliminary checks be run to
ensure that the specified node(s) can be decommissioned successfully.


| Field | Type | Label | Description | Support status |
| ----- | ---- | ----- | ----------- | -------------- |
| node_ids | [int32](#cockroach.server.serverpb.DecommissionPreCheckRequest-int32) | repeated | | [reserved](#support-status) |
| num_replica_report | [int32](#cockroach.server.serverpb.DecommissionPreCheckRequest-int32) | | The maximum number of ranges for which to report errors. | [reserved](#support-status) |
| strict_readiness | [bool](#cockroach.server.serverpb.DecommissionPreCheckRequest-bool) | | If true, all ranges on the checked nodes must only need replacement or removal for decommissioning. | [reserved](#support-status) |
| collect_traces | [bool](#cockroach.server.serverpb.DecommissionPreCheckRequest-bool) | | If true, collect traces for each range checked. Requires num_replica_report > 0. | [reserved](#support-status) |







#### Response Parameters




DecommissionPreCheckResponse returns the number of replicas that encountered
errors when running preliminary decommissioning checks, as well as the
associated error messages and traces, for each node.


| Field | Type | Label | Description | Support status |
| ----- | ---- | ----- | ----------- | -------------- |
| checked_nodes | [DecommissionPreCheckResponse.NodeCheckResult](#cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.server.serverpb.DecommissionPreCheckResponse.NodeCheckResult) | repeated | Status of the preliminary decommission checks across nodes. | [reserved](#support-status) |






<a name="cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.server.serverpb.DecommissionPreCheckResponse.NodeCheckResult"></a>
#### DecommissionPreCheckResponse.NodeCheckResult

The result of checking a single node's readiness for decommission.

| Field | Type | Label | Description | Support status |
| ----- | ---- | ----- | ----------- | -------------- |
| node_id | [int32](#cockroach.server.serverpb.DecommissionPreCheckResponse-int32) | | | [reserved](#support-status) |
| decommission_readiness | [DecommissionPreCheckResponse.NodeReadiness](#cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.server.serverpb.DecommissionPreCheckResponse.NodeReadiness) | | The node's decommission readiness status. | [reserved](#support-status) |
| liveness_status | [cockroach.kv.kvserver.liveness.livenesspb.NodeLivenessStatus](#cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.kv.kvserver.liveness.livenesspb.NodeLivenessStatus) | | The liveness status of the given node. | [reserved](#support-status) |
| replica_count | [int64](#cockroach.server.serverpb.DecommissionPreCheckResponse-int64) | | The number of total replicas on the node, computed by scanning range descriptors. | [reserved](#support-status) |
| checked_ranges | [DecommissionPreCheckResponse.RangeCheckResult](#cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.server.serverpb.DecommissionPreCheckResponse.RangeCheckResult) | repeated | The details and recorded traces from preprocessing each range with a replica on the checked nodes that resulted in error, up to the maximum specified in the request. | [reserved](#support-status) |





<a name="cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.server.serverpb.DecommissionPreCheckResponse.RangeCheckResult"></a>
#### DecommissionPreCheckResponse.RangeCheckResult

The result of checking a range's readiness for the decommission.

| Field | Type | Label | Description | Support status |
| ----- | ---- | ----- | ----------- | -------------- |
| range_id | [int32](#cockroach.server.serverpb.DecommissionPreCheckResponse-int32) | | | [reserved](#support-status) |
| action | [string](#cockroach.server.serverpb.DecommissionPreCheckResponse-string) | | The action determined by the allocator that is needed for the range. | [reserved](#support-status) |
| events | [TraceEvent](#cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.server.serverpb.TraceEvent) | repeated | All trace events collected while checking the range. | [reserved](#support-status) |
| error | [string](#cockroach.server.serverpb.DecommissionPreCheckResponse-string) | | The error message from the allocator's processing, if any. | [reserved](#support-status) |





<a name="cockroach.server.serverpb.DecommissionPreCheckResponse-cockroach.server.serverpb.TraceEvent"></a>
#### TraceEvent



| Field | Type | Label | Description | Support status |
| ----- | ---- | ----- | ----------- | -------------- |
| time | [google.protobuf.Timestamp](#cockroach.server.serverpb.DecommissionPreCheckResponse-google.protobuf.Timestamp) | | | [reserved](#support-status) |
| message | [string](#cockroach.server.serverpb.DecommissionPreCheckResponse-string) | | | [reserved](#support-status) |






## Decommission


16 changes: 15 additions & 1 deletion pkg/ccl/backupccl/schedule_pts_chaining_test.go
@@ -24,6 +24,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/isql"
"github.com/cockroachdb/cockroach/pkg/sql/sem/tree"
"github.com/cockroachdb/cockroach/pkg/testutils"
"github.com/cockroachdb/cockroach/pkg/testutils/skip"
"github.com/cockroachdb/cockroach/pkg/util/hlc"
"github.com/cockroachdb/cockroach/pkg/util/leaktest"
"github.com/cockroachdb/cockroach/pkg/util/log"
@@ -43,7 +44,7 @@ func (th *testHelper) createSchedules(
}
backupQuery := fmt.Sprintf("%s%s", backupStmt, backupOpts)
schedules, err := th.createBackupSchedule(t,
fmt.Sprintf("CREATE SCHEDULE FOR %s RECURRING '*/5 * * * *'", backupQuery))
fmt.Sprintf("CREATE SCHEDULE FOR %s RECURRING '*/2 * * * *'", backupQuery))
require.NoError(t, err)

// We expect full & incremental schedule to be created.
@@ -94,6 +95,19 @@ func TestScheduleChainingLifecycle(t *testing.T) {
th, cleanup := newTestHelper(t)
defer cleanup()

roundedCurrentTime := th.cfg.DB.KV().Clock().PhysicalTime().Round(time.Minute * 5)
if roundedCurrentTime.Hour() == 0 && roundedCurrentTime.Minute() == 0 {
// The backup schedule in this test uses the following crontab recurrence:
// '*/2 * * * *'. In english, this means "run a full backup now, and then
// run a full backup every day at midnight, and an incremental every 2
// minutes. This test relies on an incremental backup running after the
// first full backup. But, what happens if the first full backup gets
// scheduled to run within 5 minutes of midnight? A second full backup may
// get scheduled before the expected incremental backup, breaking the
// invariant the test expects. For this reason, skip the test if it's
// running close to midnight #spooky.
skip.WithIssue(t, 91640, "test flakes when the machine clock is too close to midnight")
}
th.sqlDB.Exec(t, `
CREATE DATABASE db;
USE db;
5 changes: 1 addition & 4 deletions pkg/ccl/backupccl/testdata/backup-restore/metadata
@@ -53,9 +53,6 @@ exec-sql
BACKUP DATABASE db1 INTO 'nodelocal://0/test/'
----

# TODO(ssd): The expectation here is 6 stats rather than 7 because the
# 'partial' stat is not backed up even though it is persisted. I think
# we may want a different API for fetching statistics.
query-sql
SELECT
json_array_length(
@@ -72,4 +69,4 @@
-> 'statistics'
)
----
6
7
9 changes: 9 additions & 0 deletions pkg/ccl/multiregionccl/BUILD.bazel
@@ -26,6 +26,7 @@ go_test(
name = "multiregionccl_test",
size = "enormous",
srcs = [
"cold_start_latency_test.go",
"datadriven_test.go",
"main_test.go",
"multiregion_system_table_test.go",
@@ -56,6 +57,7 @@ go_test(
"//pkg/kv/kvclient/kvcoord",
"//pkg/kv/kvserver",
"//pkg/roachpb",
"//pkg/rpc",
"//pkg/security/securityassets",
"//pkg/security/securitytest",
"//pkg/server",
@@ -81,22 +83,29 @@ go_test(
"//pkg/testutils",
"//pkg/testutils/datapathutils",
"//pkg/testutils/serverutils",
"//pkg/testutils/serverutils/regionlatency",
"//pkg/testutils/skip",
"//pkg/testutils/sqlutils",
"//pkg/testutils/testcluster",
"//pkg/util",
"//pkg/util/envutil",
"//pkg/util/leaktest",
"//pkg/util/log",
"//pkg/util/quotapool",
"//pkg/util/randutil",
"//pkg/util/syncutil",
"//pkg/util/timeutil",
"//pkg/util/tracing",
"//pkg/util/tracing/tracingpb",
"//pkg/util/uuid",
"@com_github_cockroachdb_apd_v3//:apd",
"@com_github_cockroachdb_datadriven//:datadriven",
"@com_github_cockroachdb_errors//:errors",
"@com_github_jackc_pgx_v4//:pgx",
"@com_github_stretchr_testify//assert",
"@com_github_stretchr_testify//require",
"@org_golang_google_grpc//:go_default_library",
"@org_golang_x_sync//errgroup",
],
)

