Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metamorphic: add testing for synthetic suffix #3305

Merged
merged 2 commits into from
Feb 17, 2024

Conversation

RaduBerinde
Copy link
Member

@RaduBerinde RaduBerinde commented Feb 15, 2024

ingest: require boundary keys without suffix for external ingest with suffix

Require that the smallest and largest keys for the external file have
no suffix when we are doing suffix replacement. This is sufficient for
our use case and makes things easier to reason about.

We also improve and correct a few comments around suffix replacement.

metamorphic: add testing for synthetic suffix

Add a random synthetic suffix for external ingestion.

One difficulty is that the synthetic suffix has to be larger (as an
integer value) than all suffixes in the object. We can't easily
determine this at generation time, so we generate a random suffix and
at runtime determine if it's ok to use or not.

@RaduBerinde RaduBerinde requested review from msbutler, a team and sumeerbhola February 15, 2024 18:00
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@RaduBerinde
Copy link
Member Author

I ran it for a bit but I plan to stress it for a longer period.

@RaduBerinde RaduBerinde force-pushed the meta-suffix-replacement branch 3 times, most recently from a78988f to c73c8ab Compare February 15, 2024 20:02
@RaduBerinde
Copy link
Member Author

It is finding some things: #3306

@RaduBerinde RaduBerinde force-pushed the meta-suffix-replacement branch 4 times, most recently from 0eb3697 to 42f8e63 Compare February 16, 2024 15:56
Copy link
Contributor

@msbutler msbutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome!

@@ -185,6 +185,15 @@ func ingestLoad1External(
if e.HasRangeKey {
return nil, errors.New("pebble: range keys not supported in external files")
}
if len(e.SyntheticSuffix) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a quick test for this over here? https://github.com/cockroachdb/pebble/blob/master/testdata/ingest_external#L227

I think all you have to do is add a case where the bounds have suffixes

@@ -102,6 +102,11 @@ var Comparer = &base.Comparer{
Name: "pebble.internal.testkeys",
}

// The comparator is similar to the one in Cockroach; when the prefixes are
// equal:
// - a key without a suffix is smaller than one without a suffix;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace the second "without" with a "with"

t,
pointIter, rangeDelIter, rangeKeyIter,
false, /* uniquePrefixes */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be true?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrm, maybe i guess it doesn't matter because we create the actual external sst without a prefix here

end := g.keyGenerator.prefix(o.bounds.End)
if g.cmp(start, end) < 0 &&
(i == 0 || g.cmp(sorted[i-1].bounds.End, start) <= 0) &&
(i == len(sorted)-1 || g.cmp(end, sorted[i+1].bounds.Start) <= 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add a comment along the lines of: "if the trimmed bounds overlap with adjacent file bounds, don't bother creating an external object with a syntheticSuffix, as its an unrealistic use case"

return false, errors.AssertionFailedf("block with synthetic suffix is obselete")
}
return true, nil
// Point keys should never be obsolete when suffix replacement is used; but it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrm, why would a block contain only range dels and a synthetic suffix? I don't think that should ever happen.

func (sr testKeysBlockIntervalSyntheticReplacer) AdjustIntervalWithSyntheticSuffix(
lower uint64, upper uint64, suffix []byte,
) (adjustedLower uint64, adjustedUpper uint64, err error) {
decoded, err := testkeys.ParseSuffix(suffix)
if err != nil {
return 0, 0, err
}
if uint64(decoded) < upper {
// The testKeysSuffixIntervalCollector below maps keys with no suffix to MaxUint64; ignore it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i think this diff belongs in the previous commit.

(i == len(sorted)-1 || g.cmp(end, sorted[i+1].bounds.Start) <= 0) {
o.bounds.Start = start
o.bounds.End = end
o.syntheticSuffix = g.keyGenerator.UniformSuffix()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use a different distribution to increase the probability that the synthetic suffix (integer cmp) is larger than all suffixes in the sst?

@RaduBerinde RaduBerinde force-pushed the meta-suffix-replacement branch from 42f8e63 to a3fb74a Compare February 16, 2024 21:09
… suffix

Require that the smallest and largest keys for the external file have
no suffix when we are doing suffix replacement. This is sufficient for
our use case and makes things easier to reason about.

We also improve and correct a few comments around suffix replacement.
@RaduBerinde RaduBerinde force-pushed the meta-suffix-replacement branch from a3fb74a to c516ca4 Compare February 16, 2024 21:12
Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: 0 of 14 files reviewed, 6 unresolved discussions (waiting on @msbutler and @sumeerbhola)


ingest.go line 188 at r1 (raw file):

Previously, msbutler (Michael Butler) wrote…

could you add a quick test for this over here? https://github.com/cockroachdb/pebble/blob/master/testdata/ingest_external#L227

I think all you have to do is add a case where the bounds have suffixes

Done.


internal/testkeys/testkeys.go line 107 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

replace the second "without" with a "with"

Done.


metamorphic/build.go line 214 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

hrm, maybe i guess it doesn't matter because we create the actual external sst without a prefix here

right, added a comment.


metamorphic/generator.go line 1348 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

nit: add a comment along the lines of: "if the trimmed bounds overlap with adjacent file bounds, don't bother creating an external object with a syntheticSuffix, as its an unrealistic use case"

I added a comment. I don't think we care about it being realistic or not.


metamorphic/generator.go line 1351 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

could we use a different distribution to increase the probability that the synthetic suffix (integer cmp) is larger than all suffixes in the sst?

Good point, no idea why I used the uniform one. Switched to skewed which should return larger int values much more frequently.


sstable/block_property_test_utils.go line 54 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

nit: i think this diff belongs in the previous commit.

Done.


sstable/writer.go line 2419 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

hrm, why would a block contain only range dels and a synthetic suffix? I don't think that should ever happen.

I ran into this before I realized I have to disable range dels altogether. Reverted.

Copy link
Contributor

@msbutler msbutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 14 files reviewed, 2 unresolved discussions (waiting on @sumeerbhola)


sstable/block_property_test_utils.go line 54 at r2 (raw file):

Previously, RaduBerinde wrote…

Done.

Looks like this is still in the second commit.

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 14 files reviewed, 2 unresolved discussions (waiting on @msbutler and @sumeerbhola)


sstable/block_property_test_utils.go line 54 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

Looks like this is still in the second commit.

It used to be in the first ("require boundary keys without suffix for external ingest with suffix") commit and now it is in the second ("add testing for synthetic suffix") commit (I think it belongs here).

@RaduBerinde
Copy link
Member Author

sstable/block_property_test_utils.go line 54 at r2 (raw file):

Previously, RaduBerinde wrote…

It used to be in the first ("require boundary keys without suffix for external ingest with suffix") commit and now it is in the second ("add testing for synthetic suffix") commit (I think it belongs here).

Oh I see now. I had most of it in the first commit and just the comment in the second one. I think it all belongs in the second one since we're adding this just for the new functionality in the meta test.

Copy link
Contributor

@msbutler msbutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 14 files reviewed, 2 unresolved discussions (waiting on @sumeerbhola)


sstable/block_property_test_utils.go line 54 at r2 (raw file):

Previously, RaduBerinde wrote…

Oh I see now. I had most of it in the first commit and just the comment in the second one. I think it all belongs in the second one since we're adding this just for the new functionality in the meta test.

heh. It really doesn't matter. This is ready to merge!

@RaduBerinde
Copy link
Member Author

sstable/block_property_test_utils.go line 54 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

heh. It really doesn't matter. This is ready to merge!

Thanks! I will run the meta test under stress for a while before merging.

Add a random synthetic suffix for external ingestion.

One difficulty is that the synthetic suffix has to be larger (as an
integer value) than all suffixes in the object. We can't easily
determine this at generation time, so we generate a random suffix and
at runtime determine if it's ok to use or not.
@RaduBerinde RaduBerinde force-pushed the meta-suffix-replacement branch from c516ca4 to 8c34d1d Compare February 17, 2024 00:27
@RaduBerinde
Copy link
Member Author

414 runs so far, 0 failures, over 57m15s

@RaduBerinde RaduBerinde merged commit d10a640 into cockroachdb:master Feb 17, 2024
11 checks passed
@RaduBerinde RaduBerinde deleted the meta-suffix-replacement branch February 17, 2024 00:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants