kvserver: export remaining snapshot bytes #85528
Previously, we had metrics to track the number of snapshots waiting in the snapshot queue; however, snapshots may be of different sizes, so it is also helpful to track the size of all snapshots in the queue. This change adds the following metrics to track the total size of all snapshots waiting in the queue:
range.snapshots.send-queue-bytes
range.snapshots.recv-queue-bytes
Informs: cockroachdb#85528
Release note (ops change): Added two new metrics, range.snapshots.(send|recv)-queue-bytes, to track the total size of all snapshots waiting in the snapshot queue.
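The bookkeeping behind a queue-bytes gauge like the ones described above can be sketched as follows. This is a minimal illustration, not the code from the actual change: `byteGauge`, `snapshotQueue`, and `enqueue` are hypothetical names, and the real metrics use CockroachDB's own metric types. The idea is to add a snapshot's size when it starts waiting and subtract it exactly once when it leaves the queue, whether by acquiring a reservation or by being cancelled.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// byteGauge is a hypothetical stand-in for a metrics gauge; it is not
// CockroachDB's metric type.
type byteGauge struct{ v atomic.Int64 }

func (g *byteGauge) Inc(n int64)  { g.v.Add(n) }
func (g *byteGauge) Dec(n int64)  { g.v.Add(-n) }
func (g *byteGauge) Value() int64 { return g.v.Load() }

// snapshotQueue models one side (send or receive) of a store's snapshot
// queue. Only the byte accounting is modeled here.
type snapshotQueue struct {
	queuedBytes byteGauge
}

// enqueue records a snapshot of sizeBytes as waiting in the queue. The
// returned function must be called when the snapshot leaves the queue
// (reservation acquired or request cancelled); calling it more than once
// has no further effect.
func (q *snapshotQueue) enqueue(sizeBytes int64) (leave func()) {
	q.queuedBytes.Inc(sizeBytes)
	var left atomic.Bool
	return func() {
		if left.CompareAndSwap(false, true) {
			q.queuedBytes.Dec(sizeBytes)
		}
	}
}

func main() {
	var recvQueue snapshotQueue
	leaveA := recvQueue.enqueue(64 << 20)  // 64 MiB snapshot starts waiting
	leaveB := recvQueue.enqueue(512 << 20) // 512 MiB snapshot starts waiting
	fmt.Println("queued bytes:", recvQueue.queuedBytes.Value())

	leaveA() // snapshot A gets its reservation
	leaveA() // a repeated call is a no-op
	fmt.Println("queued bytes:", recvQueue.queuedBytes.Value())

	leaveB()
	fmt.Println("queued bytes:", recvQueue.queuedBytes.Value())
}
```

Returning a one-shot `leave` function keeps the increment and decrement paired, so a cancelled request cannot leave the gauge permanently inflated.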
99275: sql: enabling forward indexes and ORDERBY on JSONB columns r=celiala a=Shivs11
Currently, #97928 outlines the scheme for JSONB encoding and decoding for forward indexes. However, the PR doesn't enable this feature to our users. This current PR aims to allow forward indexes on JSONB columns. The presence of a lexicographical ordering, as described in #97928, shall now allow primary and secondary indexes on JSONB columns along with the ability to use `ORDER BY` filter in their queries.
Additionally, JSON values consist of decimal numbers and containers, such as Arrays and Objects, which can contain these decimal numbers. In order to preserve the values after the decimal, JSONB columns are now required to be composite in nature. This shall enable such values to be stored in both the key and the value side of a K/V pair in hopes of receiving the exact value.
Fixes: #35706
Release note (sql change): This PR adds support for enabling forward indexes and ordering on JSON values.
Epic: [CRDB-24501](https://cockroachlabs.atlassian.net/browse/CRDB-24501)

100942: kvserver: add metrics to track snapshot queue size r=kvoli a=miraradeva
Previously, we had metrics to track the number of snapshots waiting in the snapshot queue; however, snapshots may be of different sizes, so it is also helpful to track the size of all snapshots in the queue. This change adds the following metrics to track the total size of all snapshots waiting in the queue:
range.snapshots.send-queue-bytes
range.snapshots.recv-queue-bytes
Informs: #85528
Release note (ops change): Added two new metrics, range.snapshots.(send|recv)-queue-bytes, to track the total size of all snapshots waiting in the snapshot queue.

101220: roachtest: prevent shared mutable state across c2c roachtest runs r=benbardin a=msbutler
Previously, all `c2c/*` roachtests run with `--count` would provide incomprehensible results because multiple roachtest runs of the same test would override each other's state. Specifically, the latest call of `test_spec.Run()` would override the `test.Test` harness and the `syncedCluster.Cluster` used by all other tests with the same registration.
This patch fixes this problem by moving all fields in `replicationSpec` that are set during test execution (i.e. a `test_spec.Run` call) to a new `replicationDriver` struct. Now, `replicationSpec` gets defined during test registration and is shared across test runs, while `replicationDriver` gets set within a test run.
Epic: None
Release note: None

Co-authored-by: Shivam Saraf
Co-authored-by: Mira Radeva
Co-authored-by: Michael Butler
Added metrics (1) and (2) above as part of #100942. Austen helped me test the change using this roachprod script:
For metrics (3) and (4), I added another metric to keep track of the total reserved bytes sent/received of snapshots with reservations (range.snapshots.recv-reserved-bytes). We also have an existing metric for total bytes sent/received (range.snapshots.rcvd-bytes). The metrics we want for (3) and (4) are essentially the difference between these two. I ran the same workload and watched node 5 ramp up on both metrics (I didn't plot the actual difference because I haven't figured out Grafana yet). The lines seem very close, so the difference will likely be too small to tell us much.
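The difference described in this comment can be sketched as a small calculation. This is an illustration only: `pendingBytes` is a hypothetical helper, the sample values are made up, and clamping at zero is an assumption (it guards against the received total briefly exceeding the reserved total if the two values are sampled at slightly different times).

```go
package main

import "fmt"

// pendingBytes estimates the bytes still to be transferred for snapshots
// that hold a reservation: total reserved bytes minus total bytes already
// received (or sent). The result is clamped at zero; that clamp is an
// assumption of this sketch, not something stated in the issue.
func pendingBytes(reservedBytes, transferredBytes int64) int64 {
	if d := reservedBytes - transferredBytes; d > 0 {
		return d
	}
	return 0
}

func main() {
	// Hypothetical samples of range.snapshots.recv-reserved-bytes and
	// range.snapshots.rcvd-bytes taken at the same moment.
	reserved := int64(512 << 20)
	received := int64(384 << 20)
	fmt.Println("pending bytes:", pendingBytes(reserved, received))
}
```

When the two series track each other closely, as observed on node 5, this value stays near zero for most samples.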
Summary
Snapshots may differ in size, so it would be beneficial to track the total remaining snapshot bytes that are queued and in progress on a store's receiver snapshot semaphore, as well as the remaining bytes that are queued on a store's sender snapshot semaphore.
Note that we currently track the current reservations in bytes, which is the current size of the snapshot(s) being processed on a store: capacity.reserved.
Solution
The solution is to add four additional exported metrics, with the last two optional and pending how useful they are:
(1) range.snapshots.queued-rcvd: a gauge tracking the sum of all snapshot bytes that are currently queued on a store's receive queue but have not yet gotten a reservation (begun processing).
(2) range.snapshots.queued-send: a gauge tracking the sum of all snapshot bytes that are currently queued on a store's send queue but have not yet gotten a reservation (begun processing).
(3) range.snapshots.pending-rcvd: a gauge tracking the sum of all snapshot bytes that remain on a store's receiving side, for snapshots that have acquired a reservation. This could be updated more frequently, to track the "remaining bytes", i.e. reservation - processed.
(4) range.snapshots.pending-send: a gauge tracking the sum of all snapshot bytes that remain on a store's sending side, for snapshots that have acquired a reservation. Similar to above, this tracks the remaining bytes to be sent.
Context
(3) and (4) may not provide much material benefit, as snapshots should in most cases be processed in under 16 seconds (512 MB at 32 MB/s), while the default metric update interval is 10 seconds. In cases where the snapshot rate is set lower, they may provide utility; however, the existing capacity.reserved metric, which tracks the total (unprocessed + processed) in-progress snapshot bytes, may be more appropriate. This issue leaves them as optional.
Related PR, for count rather than bytes: #84947
cc @AlexTalks
Jira issue: CRDB-18293