Update 'Stop a Node' with more draining info #2671
Merged
+67
−32
13 commits
ff04f3f Update 'Stop a Node' with more draining info (rmloveland)
4275ac1 Remove note re: which settings are "safe" (rmloveland)
7bfe68d Remove commented-out text (rmloveland)
3df0393 Update cluster setting language based on feedback (rmloveland)
948ef35 One space after a period; it's "a" best effort (rmloveland)
eb14466 Update cluster setting duration language (rmloveland)
64e718f First crack at 1.1.{5,6} shutdown updates (rmloveland)
14139e9 Clarify that node cancels all current sessions (rmloveland)
76a5622 Remove version # from range leases bullet (rmloveland)
6c72d46 Remove version # from quorum note (rmloveland)
d9d22a1 Remove duped info from para following list (rmloveland)
094d164 Update gossiped draining state note via feedback (rmloveland)
bb14e1b Make italic text bold (rmloveland)
Update 'Stop a Node' with more draining info
Addresses #2436
commit ff04f3f690fc8abcf8e180d24ad7ed45301a9728
@@ -0,0 +1,52 @@
| SETTING | TYPE | DEFAULT | DESCRIPTION |
|---|---|---|---|
| `cloudstorage.gs.default.key` | string | `` | if set, JSON key to use during Google Cloud Storage operations |
| `cloudstorage.http.custom_ca` | string | `` | custom root CA (appended to system's default CAs) for verifying certificates when interacting with HTTPS storage |
| `cluster.organization` | string | `` | organization name |
| `debug.panic_on_failed_assertions` | boolean | `false` | panic when an assertion fails rather than reporting |
| `diagnostics.reporting.enabled` | boolean | `true` | enable reporting diagnostic metrics to cockroach labs |
| `diagnostics.reporting.interval` | duration | `1h0m0s` | interval at which diagnostics data should be reported |
| `diagnostics.reporting.send_crash_reports` | boolean | `true` | send crash and panic reports |
| `kv.allocator.lease_rebalancing_aggressiveness` | float | `1E+00` | set greater than 1.0 to rebalance leases toward load more aggressively, or between 0 and 1.0 to be more conservative about rebalancing leases |
| `kv.allocator.load_based_lease_rebalancing.enabled` | boolean | `true` | set to enable rebalancing of range leases based on load and latency |
| `kv.allocator.range_rebalance_threshold` | float | `5E-02` | minimum fraction away from the mean a store's range count can be before it is considered overfull or underfull |
| `kv.allocator.stat_based_rebalancing.enabled` | boolean | `false` | set to enable rebalancing of range replicas based on write load and disk usage |
| `kv.allocator.stat_rebalance_threshold` | float | `2E-01` | minimum fraction away from the mean a store's stats (like disk usage or writes per second) can be before it is considered overfull or underfull |
| `kv.bulk_io_write.max_rate` | byte size | `8.0 EiB` | the rate limit (bytes/sec) to use for writes to disk on behalf of bulk io ops |
| `kv.bulk_sst.sync_size` | byte size | `2.0 MiB` | threshold after which non-Rocks SST writes must fsync (0 disables) |
| `kv.raft.command.max_size` | byte size | `64 MiB` | maximum size of a raft command |
| `kv.raft_log.synchronize` | boolean | `true` | set to true to synchronize on Raft log writes to persistent storage |
| `kv.range.backpressure_range_size_multiplier` | float | `2E+00` | multiple of range_max_bytes that a range is allowed to grow to without splitting before writes to that range are blocked, or 0 to disable |
| `kv.range_descriptor_cache.size` | integer | `1000000` | maximum number of entries in the range descriptor and leaseholder caches |
| `kv.snapshot_rebalance.max_rate` | byte size | `2.0 MiB` | the rate limit (bytes/sec) to use for rebalance snapshots |
| `kv.snapshot_recovery.max_rate` | byte size | `8.0 MiB` | the rate limit (bytes/sec) to use for recovery snapshots |
| `kv.transaction.max_intents_bytes` | integer | `256000` | maximum number of bytes used to track write intents in transactions |
| `kv.transaction.max_refresh_spans_bytes` | integer | `256000` | maximum number of bytes used to track refresh spans in serializable transactions |
| `rocksdb.min_wal_sync_interval` | duration | `0s` | minimum duration between syncs of the RocksDB WAL |
| `server.consistency_check.interval` | duration | `24h0m0s` | the time between range consistency checks; set to 0 to disable consistency checking |
| `server.declined_reservation_timeout` | duration | `1s` | the amount of time to consider the store throttled for up-replication after a reservation was declined |
| `server.failed_reservation_timeout` | duration | `5s` | the amount of time to consider the store throttled for up-replication after a failed reservation call |
| `server.remote_debugging.mode` | string | `local` | set to enable remote debugging, localhost-only or disable (any, local, off) |
| `server.shutdown.drain_wait` | duration | `0s` | the amount of time a server waits in an unready state before proceeding with the rest of the shutdown process |
| `server.shutdown.query_wait` | duration | `10s` | the server will wait for at least this amount of time for active queries to finish |
| `server.time_until_store_dead` | duration | `5m0s` | the time after which if there is no new gossiped information about a store, it is considered dead |
| `server.web_session_timeout` | duration | `168h0m0s` | the duration that a newly created web session will be valid |
| `sql.defaults.distsql` | enumeration | `1` | Default distributed SQL execution mode [off = 0, auto = 1, on = 2] |
| `sql.distsql.distribute_index_joins` | boolean | `true` | if set, for index joins we instantiate a join reader on every node that has a stream; if not set, we use a single join reader |
| `sql.distsql.interleaved_joins.enabled` | boolean | `true` | if set we plan interleaved table joins instead of merge joins when possible |
| `sql.distsql.merge_joins.enabled` | boolean | `true` | if set, we plan merge joins when possible |
| `sql.distsql.temp_storage.joins` | boolean | `true` | set to true to enable use of disk for distributed sql joins |
| `sql.distsql.temp_storage.sorts` | boolean | `true` | set to true to enable use of disk for distributed sql sorts |
| `sql.distsql.temp_storage.workmem` | byte size | `64 MiB` | maximum amount of memory in bytes a processor can use before falling back to temp storage |
| `sql.metrics.statement_details.dump_to_logs` | boolean | `false` | dump collected statement statistics to node logs when periodically cleared |
| `sql.metrics.statement_details.enabled` | boolean | `true` | collect per-statement query statistics |
| `sql.metrics.statement_details.threshold` | duration | `0s` | minimum execution time to cause statistics to be collected |
| `sql.trace.log_statement_execute` | boolean | `false` | set to true to enable logging of executed statements |
| `sql.trace.session_eventlog.enabled` | boolean | `false` | set to true to enable session tracing |
| `sql.trace.txn.enable_threshold` | duration | `0s` | duration beyond which all transactions are traced (set to 0 to disable) |
| `timeseries.resolution_10s.storage_duration` | duration | `720h0m0s` | the amount of time to store timeseries data |
| `timeseries.storage.enabled` | boolean | `true` | if set, periodic timeseries data is stored within the cluster; disabling is not recommended unless you are storing the data elsewhere |
| `trace.debug.enable` | boolean | `false` | if set, traces for recent requests can be seen in the /debug page |
| `trace.lightstep.token` | string | `` | if set, traces go to Lightstep using this token |
| `trace.zipkin.collector` | string | `` | if set, traces go to the given Zipkin instance (example: '127.0.0.1:9411'); ignored if trace.lightstep.token is set. |
| `version` | custom validation | `2.0` | set the active cluster version in the format '<major>.<minor>'. |
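
Several defaults in the settings table are written as duration strings such as `5m0s` or `168h0m0s`. As a small illustration (not part of the pull request), the Python sketch below converts such strings to seconds so the values are easier to compare. The helper name `duration_to_seconds` is hypothetical, and it only handles the `h`, `m`, and `s` units that appear in this table.

```python
import re

# Seconds per unit; only the units seen in the DEFAULT column are covered.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}
_PART = re.compile(r"(\d+(?:\.\d+)?)([hms])")


def duration_to_seconds(text: str) -> float:
    """Convert a duration string such as '5m0s' or '168h0m0s' to seconds.

    Hypothetical helper for illustration; rejects anything that is not a
    contiguous sequence of <number><unit> parts.
    """
    pos = 0
    total = 0.0
    for match in _PART.finditer(text):
        if match.start() != pos:
            # There is unparsed text between two parts.
            raise ValueError(f"unrecognized duration: {text!r}")
        value, unit = match.groups()
        total += float(value) * _UNIT_SECONDS[unit]
        pos = match.end()
    if pos == 0 or pos != len(text):
        # Nothing matched, or trailing text was left over.
        raise ValueError(f"unrecognized duration: {text!r}")
    return total


if __name__ == "__main__":
    for setting, default in [
        ("server.shutdown.query_wait", "10s"),
        ("server.time_until_store_dead", "5m0s"),
        ("server.web_session_timeout", "168h0m0s"),
    ]:
        print(setting, duration_to_seconds(default))
```

For example, `server.time_until_store_dead` at `5m0s` corresponds to 300 seconds, which is the "5 minutes by default" mentioned in the 'Stop a Node' text.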
@@ -14,7 +14,14 @@ For information about permanently removing nodes to downsize a cluster or react

 ### How It Works

-When you stop a node, CockroachDB lets the node finish in-flight requests and transfers all **range leases** off the node before shutting it down. If the node then stays offline for a certain amount of time (5 minutes by default), the cluster considers the node dead and starts to transfer its **range replicas** to other nodes as well.
+When you stop a node, it performs the following steps:
+
+- Finishes in-flight requests. Note that this is a best effort that times out at the `server.shutdown.query_wait` [cluster setting](cluster-settings.html).
+- Transfers all *range leases* and Raft leadership to other nodes.
+- Gossips its draining state to the cluster, so that other nodes do not try to distribute query planning to the draining node, and no leases are transferred to the draining node. Note that this is best effort that times out at the `server.shutdown.drain_wait` [cluster setting](cluster-settings.html), so other nodes may not receive the gossip info in time.
Review comment: nit: Use only one space after the first period. I know you prefer 2, @rmloveland, but convention in our docs is 1.
+- No new ranges are transferred to the draining node, to avoid a possible loss of quorum after the node shuts down.
+
+If the node then stays offline for a certain amount of time (5 minutes by default), the cluster considers the node dead and starts to transfer its **range replicas** to other nodes as well.

 After that, if the node comes back online, its range replicas will determine whether or not they are still valid members of replica groups. If a range replica is still valid and any data in its range has changed, it will receive updates from another replica in the group. If a range replica is no longer valid, it will be removed from the node.

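
The "best effort that times out" behavior and the dead-node threshold described in the How It Works text can be pictured with a toy model. The Python sketch below is an illustration under stated assumptions, not CockroachDB's implementation: the names `wait_for_queries`, `DrainOutcome`, and `considered_dead` are hypothetical, each query is modeled only by how many more seconds it needs, and the defaults are taken from the `server.shutdown.query_wait` (`10s`) and `server.time_until_store_dead` (`5m0s`) rows of the settings table.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Defaults copied from the cluster settings table (assumed unchanged).
QUERY_WAIT_S = 10.0              # server.shutdown.query_wait
TIME_UNTIL_STORE_DEAD_S = 300.0  # server.time_until_store_dead


@dataclass
class DrainOutcome:
    """Hypothetical result type: which queries finished before the timeout."""
    finished: List[str] = field(default_factory=list)
    timed_out: List[str] = field(default_factory=list)


def wait_for_queries(
    remaining_s: Dict[str, float],
    query_wait_s: float = QUERY_WAIT_S,
) -> DrainOutcome:
    """Toy model of a best-effort wait for in-flight queries.

    remaining_s maps a query name to the seconds it still needs. Queries
    that fit within query_wait_s finish; the rest hit the timeout.
    """
    outcome = DrainOutcome()
    for name, needed in sorted(remaining_s.items()):
        if needed <= query_wait_s:
            outcome.finished.append(name)
        else:
            outcome.timed_out.append(name)
    return outcome


def considered_dead(
    offline_s: float,
    threshold_s: float = TIME_UNTIL_STORE_DEAD_S,
) -> bool:
    """True once a node has been offline longer than the threshold."""
    return offline_s > threshold_s


if __name__ == "__main__":
    result = wait_for_queries({"short_select": 2.0, "long_import": 30.0})
    print("finished:", result.finished)
    print("timed out:", result.timed_out)
    print("dead after 60s offline?", considered_dead(60.0))
    print("dead after 301s offline?", considered_dead(301.0))
```

In this model, raising `query_wait_s` lets more in-flight queries complete at the cost of a slower shutdown, which is the trade-off the setting exposes.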
Review comment: Please remove this old commented-out text.