Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

c2c: add fingerprinting for internal testing #89336

Closed
adityamaru opened this issue Oct 4, 2022 · 4 comments · Fixed by #91124
Closed

c2c: add fingerprinting for internal testing #89336

adityamaru opened this issue Oct 4, 2022 · 4 comments · Fixed by #91124
Assignees
Labels
A-disaster-recovery C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-disaster-recovery

Comments

@adityamaru
Copy link
Contributor

adityamaru commented Oct 4, 2022

The following technical approach was replaced with:

...but the goal is the same.

Previous plan

crdb_internal.scan(crdb_internal.tenant_span($1)) returns the raw keys and values from the specified span. We would like to extend this generator to return timestamped ordered revisions of each key. While this is useful as a standalone tool, the motivation behind this change is to drive the on-demand fingerprinting that we are developing for c2c replication. At a high level this primitive will be executed by each processor on a pre-defined chunk of spans, the output of which will be fed to a checksum algorithm and sent downstream in the DistSQL flow.

As part of this issue, we should investigate whether we can add a mode to ExportRequest (KV request that is already capable of reading and returning timestamp-ordered revisions of keys) that does not write these keys to an SST but returns them in a kvBuf slice that we can read from. We should benchmark, and trace what parts of ExportRequest that are being used during backup are "slow" and "unnecessary" for simply reading and returning revisions of all rows in a given timebound.

Epic: CRDB-21075

Jira issue: CRDB-20208

@adityamaru adityamaru added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-disaster-recovery labels Oct 4, 2022
@blathers-crl
Copy link

blathers-crl bot commented Oct 4, 2022

cc @cockroachdb/disaster-recovery

@adityamaru adityamaru self-assigned this Oct 6, 2022
@adityamaru
Copy link
Contributor Author

From an offline conversation with @dt and @stevendanna:

  1. batcheval: ExportRequest should use a kvBuf to transport bytes instead of an SST #90450 is likely not a hard requirement for using ExportRequest as the driver to read key revisions.

  2. What if we pushed down hashing into evalExport?

Until now we were planning to use a primitive such as crdb_internal.scan() to first fetch all the KVs with revisions in a given window of time, and then hash batches of those KVs. This would likely be done in a DistSQL flow that uses PartitionSpans to divvy work based on the leaseholders and then spins up processors on each node of the source cluster to hash their respective batches. The batched approach means that we would have to preserve the batching when generating the fingerprint on the destination cluster too since the hash would depend on the order of the keys. While this gives us more observability into what batch our fingerprinting failed on there is a simpler and potentially cheaper alternative we would like to consider.

Each ExportRequest constructs an SST and ships it as part of the ExportResponse. For the purposes of fingerprinting, we could teach ExportRequest to funnel KVs into an fnv64 hasher instead of the sstWriter and return a XOR'ed hash of the span as part of the ExportResponse. This would eliminate the cost of transferring SSTs over the wire. Furthermore, ExportRequests are splittable and so we can lean entirely on the DistSender to split, send and combine all the responses (XOR the hashes) without setting up a DistSQL flow. Since we are hashing every KV (along with its timestamp) the outputted hash becomes order agnostic and can be run on the destination cluster without having to maintain any complicated batching logic.

adityamaru added a commit to adityamaru/cockroach that referenced this issue Oct 28, 2022
This is a refactor only change that pulls out the
logic in `MVCCExportToSST` into `mvccExportToWriter` that
accepts a `storage.Writer` interface. This will allow us
to pass in a `FingerprintWriter` in a future commit.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Oct 28, 2022
This change introduces a fingerprintWriter that
hashes every key/timestamp and value for point keys, and
combines their hashes via a XOR into a running aggregate.
Range keys are not fingerprinted but instead written to a pebble SST that is
returned to the caller. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
fingerprinting scheme across all returned range keys.

The fingerprintWriter is used by `MVCCExportFingerprint` that
exports a fingerprint for point keys in the keyrange
[StartKey, EndKey) over the interval (StartTS, EndTS].
The export logic used by `MVCCExportFingerprint` is the same
that drives `MVCCExportToSST`. The former writes to a fingerprintWriter
while the latter writes to an sstWriter. Currently, this
method only support using an `fnv64` hasher to fingerprint each KV.

This change does not wire `MVCCExportFingerprint` to ExportRequest
command evaluation. This will be done as a followup.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 2, 2022
This is a refactor only change that pulls out the
logic in `MVCCExportToSST` into `mvccExportToWriter` that
accepts a `storage.Writer` interface. This will allow us
to pass in a `FingerprintWriter` in a future commit.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 2, 2022
This change introduces a fingerprintWriter that
hashes every key/timestamp and value for point keys, and
combines their hashes via a XOR into a running aggregate.
Range keys are not fingerprinted but instead written to a pebble SST that is
returned to the caller. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
fingerprinting scheme across all returned range keys.

The fingerprintWriter is used by `MVCCExportFingerprint` that
exports a fingerprint for point keys in the keyrange
[StartKey, EndKey) over the interval (StartTS, EndTS].
The export logic used by `MVCCExportFingerprint` is the same
that drives `MVCCExportToSST`. The former writes to a fingerprintWriter
while the latter writes to an sstWriter. Currently, this
method only support using an `fnv64` hasher to fingerprint each KV.

This change does not wire `MVCCExportFingerprint` to ExportRequest
command evaluation. This will be done as a followup.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 2, 2022
This change adds a `crdb_internal.fingerprint` builtin
that accepts a `startTime`, `endTime`, `startKey` and `endKey`
to define the interval the user wants to fingerprint. The builtin
is powered by sending an ExportRequest with the defined intervals
but with the `ExportFingerprint` option set to true.

Setting this option on the ExportRequest means that instead of
writing all point and rangekeys to an SST and sending them back to
the client, command evaluation will use the newly introduced
`fingerprintWriter` (cockroachdb#90848) when exporting keys. This writer
computes an `fnv64` hash of the key/timestamp, value for each point key
and maintains a running XOR aggregate of all the point keys processed
as part of the ExportRequest. Rangekeys are not fingerprinted during
command evaluation, but instead returned to the client in a
pebble SST. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
ingerprinting scheme across all returned range keys.

The ExportRequest sent as part of this builtin does not set any DistSender
limit, thereby allowing concurrent execution across ranges. We are not
concerned about the ExportResponses growing too large since the SSTs
will only contain rangekeys that should be few in number. If this assumption
is proved incorrect in the future, we can revisit setting a `TargetBytes`
to the header of the BatchRequest.

Fixes: cockroachdb#89336

Release note (sql change): introduces a `crdb_internal.fingerprint` builtin
that can be used to generate a `fnv64` fingerprint of keys (and optionally their revisions)
in a given key/time interval.
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 2, 2022
This is a refactor only change that pulls out the
logic in `MVCCExportToSST` into `mvccExportToWriter` that
accepts a `storage.Writer` interface. This will allow us
to pass in a `FingerprintWriter` in a future commit.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 2, 2022
This change introduces a fingerprintWriter that
hashes every key/timestamp and value for point keys, and
combines their hashes via a XOR into a running aggregate.
Range keys are not fingerprinted but instead written to a pebble SST that is
returned to the caller. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
fingerprinting scheme across all returned range keys.

The fingerprintWriter is used by `MVCCExportFingerprint` that
exports a fingerprint for point keys in the keyrange
[StartKey, EndKey) over the interval (StartTS, EndTS].
The export logic used by `MVCCExportFingerprint` is the same
that drives `MVCCExportToSST`. The former writes to a fingerprintWriter
while the latter writes to an sstWriter. Currently, this
method only support using an `fnv64` hasher to fingerprint each KV.

This change does not wire `MVCCExportFingerprint` to ExportRequest
command evaluation. This will be done as a followup.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 4, 2022
This is a refactor only change that pulls out the
logic in `MVCCExportToSST` into `mvccExportToWriter` that
accepts a `storage.Writer` interface. This will allow us
to pass in a `FingerprintWriter` in a future commit.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 4, 2022
This change introduces a fingerprintWriter that
hashes every key/timestamp and value for point keys, and
combines their hashes via a XOR into a running aggregate.
Range keys are not fingerprinted but instead written to a pebble SST that is
returned to the caller. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
fingerprinting scheme across all returned range keys.

The fingerprintWriter is used by `MVCCExportFingerprint` that
exports a fingerprint for point keys in the keyrange
[StartKey, EndKey) over the interval (StartTS, EndTS].
The export logic used by `MVCCExportFingerprint` is the same
that drives `MVCCExportToSST`. The former writes to a fingerprintWriter
while the latter writes to an sstWriter. Currently, this
method only support using an `fnv64` hasher to fingerprint each KV.

This change does not wire `MVCCExportFingerprint` to ExportRequest
command evaluation. This will be done as a followup.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 7, 2022
This change introduces a fingerprintWriter that
hashes every key/timestamp and value for point keys, and
combines their hashes via a XOR into a running aggregate.
Range keys are not fingerprinted but instead written to a pebble SST that is
returned to the caller. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
fingerprinting scheme across all returned range keys.

The fingerprintWriter is used by `MVCCExportFingerprint` that
exports a fingerprint for point keys in the keyrange
[StartKey, EndKey) over the interval (StartTS, EndTS].
The export logic used by `MVCCExportFingerprint` is the same
that drives `MVCCExportToSST`. The former writes to a fingerprintWriter
while the latter writes to an sstWriter. Currently, this
method only support using an `fnv64` hasher to fingerprint each KV.

This change does not wire `MVCCExportFingerprint` to ExportRequest
command evaluation. This will be done as a followup.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 9, 2022
This change introduces a fingerprintWriter that
hashes every key/timestamp and value for point keys, and
combines their hashes via a XOR into a running aggregate.
Range keys are not fingerprinted but instead written to a pebble SST that is
returned to the caller. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
fingerprinting scheme across all returned range keys.

The fingerprintWriter is used by `MVCCExportFingerprint` that
exports a fingerprint for point keys in the keyrange
[StartKey, EndKey) over the interval (StartTS, EndTS].
The export logic used by `MVCCExportFingerprint` is the same
that drives `MVCCExportToSST`. The former writes to a fingerprintWriter
while the latter writes to an sstWriter. Currently, this
method only support using an `fnv64` hasher to fingerprint each KV.

This change does not wire `MVCCExportFingerprint` to ExportRequest
command evaluation. This will be done as a followup.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 10, 2022
This is a refactor only change that pulls out the
logic in `MVCCExportToSST` into `mvccExportToWriter` that
accepts a `storage.Writer` interface. This will allow us
to pass in a `FingerprintWriter` in a future commit.

Informs: cockroachdb#89336

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Nov 10, 2022
This change introduces a fingerprintWriter that
hashes every key/timestamp and value for point keys, and
combines their hashes via a XOR into a running aggregate.
Range keys are not fingerprinted but instead written to a pebble SST that is
returned to the caller. This is because range keys do not have a stable,
discrete identity and so it is up to the caller to define a deterministic
fingerprinting scheme across all returned range keys.

The fingerprintWriter is used by `MVCCExportFingerprint` that
exports a fingerprint for point keys in the keyrange
[StartKey, EndKey) over the interval (StartTS, EndTS].
The export logic used by `MVCCExportFingerprint` is the same
that drives `MVCCExportToSST`. The former writes to a fingerprintWriter
while the latter writes to an sstWriter. Currently, this
method only support using an `fnv64` hasher to fingerprint each KV.

This change does not wire `MVCCExportFingerprint` to ExportRequest
command evaluation. This will be done as a followup.

Informs: cockroachdb#89336

Release note: None
@shermanCRL shermanCRL changed the title c2c: extend crdb_internal.scan(crdb_internal.tenant_span($1)) to support revisions c2c: add fingerprinting for internal testing Nov 23, 2022
@craig craig bot closed this as completed in 1acfcc8 Nov 24, 2022
@adityamaru adityamaru reopened this Nov 24, 2022
@adityamaru adityamaru reopened this Nov 24, 2022
@adityamaru
Copy link
Contributor Author

This can be marked as done once some of our C2C tests uses the crdb_internal.fingerprint builtin.

@adityamaru
Copy link
Contributor Author

Based on an oflfine discussion with @livlobo we are closing the epic linked in this issue since crdb_internal.fingerprint has been implemented and tested. Fallout from tests (existing and future) that will use this tool should be tracked in the more appropriate epic linked here - #93078

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-disaster-recovery C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-disaster-recovery
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant