db: handle truncation of shared files with no points #3776

Closed

itsbilal
Contributor

@itsbilal itsbilal commented Jul 18, 2024

Previously, if we shared a truncated file with no points,
we would try to seek on a nil point iterator, which has undefined
behaviour. This change fixes truncateSharedFile so that it does not
use a nil point iterator.

Fixes #3761.

@itsbilal itsbilal requested a review from a team as a code owner July 18, 2024 20:34
@itsbilal itsbilal requested a review from sumeerbhola July 18, 2024 20:34
@cockroach-teamcity
Member

This change is Reviewable

Member

@RaduBerinde RaduBerinde left a comment

:lgtm:

Reviewable status: 0 of 3 files reviewed, all discussions resolved (waiting on @sumeerbhola)

Collaborator

@sumeerbhola sumeerbhola left a comment

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @itsbilal)


table_cache.go line 490 at r1 (raw file):

	if kinds.Point() && file.HasPointKeys && err == nil {
		iters.point, err = c.newPointIter(ctx, v, file, cr, opts, internalOpts, dbOpts)
	}

I am a bit confused about why this worked before. IIUC, this PR didn't need to change more code to tolerate iters.point being nil, because many callers are using iterSet.Point which returns an emptyIter if point is nil. btw, should the code in scan_internal.go also use iterSet.Point?

If the earlier code was returning an "invalid iterator" (presumably that means it returns an error), why were those callers that use iterSet.Point not ending up with an error?
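
For readers unfamiliar with the pattern discussed above, here is a minimal sketch of a nil-tolerant accessor: a `Point()` method that hands back an empty iterator when no point iterator was created. This is not Pebble's actual code. The `pointIterator` interface, `sliceIter`, and the single `SeekGE` method are simplified assumptions made up for illustration; only the names `iterSet`, `Point`, and `emptyIter` come from the comment.

```go
package main

import "fmt"

// pointIterator is a hypothetical, heavily reduced iterator interface.
type pointIterator interface {
	// SeekGE returns the first key >= target, or ok=false if there is none.
	SeekGE(target string) (key string, ok bool)
}

// sliceIter is a toy iterator over a sorted slice of keys (illustration only).
type sliceIter struct{ keys []string }

func (s *sliceIter) SeekGE(target string) (string, bool) {
	for _, k := range s.keys {
		if k >= target {
			return k, true
		}
	}
	return "", false
}

// emptyIter never yields a key.
type emptyIter struct{}

func (emptyIter) SeekGE(string) (string, bool) { return "", false }

// iterSet holds the iterators opened for one file; point may be nil
// when the file has no point keys.
type iterSet struct {
	point pointIterator
}

// Point returns the point iterator, substituting an empty one if unset,
// so callers never invoke a method on a nil iterator.
func (s iterSet) Point() pointIterator {
	if s.point == nil {
		return emptyIter{}
	}
	return s.point
}

func main() {
	noPoints := iterSet{}
	_, ok := noPoints.Point().SeekGE("a")
	fmt.Println(ok) // false

	withPoints := iterSet{point: &sliceIter{keys: []string{"b", "d"}}}
	k, ok := withPoints.Point().SeekGE("c")
	fmt.Println(k, ok) // d true
}
```

Callers that go through the accessor see "no keys" rather than an error or a nil dereference, which is consistent with the observation that they did not need to change. Code that reads the `point` field directly gets no such protection.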

@itsbilal itsbilal force-pushed the shared-truncate-no-points branch from 3a0a53d to df0081f Compare July 19, 2024 20:55
Contributor Author

@itsbilal itsbilal left a comment

TFTRs! I've simplified this change a little, and removed the table_cache.go change.

Reviewable status: 2 of 3 files reviewed, 1 unresolved discussion (waiting on @sumeerbhola)


table_cache.go line 490 at r1 (raw file):

Previously, sumeerbhola wrote…

I am a bit confused about why this worked before. IIUC, this PR didn't need to change more code to tolerate iters.point being nil, because many callers are using iterSet.Point which returns an emptyIter if point is nil. btw, should the code in scan_internal.go also use iterSet.Point?

If the earlier code was returning an "invalid iterator" (presumably that means it returns an error), why were those callers that use iterSet.Point not ending up with an error?

Good point, and this question had me go digging deeper. I found that we did return a nil iterator either way, so this change was unnecessary and I reverted this half of it. However, we still need to get truncateSharedFile to not use a nil iterator; previously it was doing just that, and that was leading to undefined behaviour.
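
As a rough illustration of the caller-side guard described here, the sketch below computes the smallest key at or after a lower bound for a file, and skips the point iterator entirely when it is nil. It is not the real truncateSharedFile. `fileMeta`, `keyIter`, `openPointIter`, and `smallestAtOrAfter` are hypothetical names, and keys are plain strings for brevity.

```go
package main

import "fmt"

// fileMeta is a hypothetical, reduced stand-in for a table's metadata.
type fileMeta struct {
	pointKeys []string // sorted; empty when the file has no point keys
	rangeKeys []string // sorted range-key start keys
}

// keyIter is a toy iterator over sorted keys.
type keyIter struct{ keys []string }

// seekGE returns the first key >= lower, or ok=false if none exists.
func (it *keyIter) seekGE(lower string) (string, bool) {
	for _, k := range it.keys {
		if k >= lower {
			return k, true
		}
	}
	return "", false
}

// openPointIter returns nil for a file without point keys
// (assumed behaviour for this sketch).
func openPointIter(f fileMeta) *keyIter {
	if len(f.pointKeys) == 0 {
		return nil
	}
	return &keyIter{keys: f.pointKeys}
}

// smallestAtOrAfter returns the smallest point or range key >= lower.
// The nil check is the guard: a nil point iterator is never seeked.
func smallestAtOrAfter(f fileMeta, lower string) (string, bool) {
	best, found := "", false
	consider := func(k string, ok bool) {
		if ok && (!found || k < best) {
			best, found = k, true
		}
	}
	if it := openPointIter(f); it != nil {
		consider(it.seekGE(lower))
	}
	if len(f.rangeKeys) > 0 {
		consider((&keyIter{keys: f.rangeKeys}).seekGE(lower))
	}
	return best, found
}

func main() {
	rangeOnly := fileMeta{rangeKeys: []string{"c", "m"}}
	fmt.Println(smallestAtOrAfter(rangeOnly, "d")) // m true
}
```

Without the `it != nil` check, the range-only file would reach `seekGE` on a nil `*keyIter` and panic on the first field access.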

@itsbilal
Contributor Author

TFTR!

Collaborator

@sumeerbhola sumeerbhola left a comment

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @itsbilal)

@itsbilal
Contributor Author

On closer inspection, this is actually a bug in Excise, and might have implications outside of disaggregated storage too. Basically, in one instance an excise of a range key makes us lose all point keys, even point keys outside the excise bounds.

The fix suggested here actually doesn't fix the metamorphic test failure. I'll close this PR and circle back with the proper fix once I've finished investigating.

@itsbilal itsbilal closed this Jul 19, 2024
Development

Successfully merging this pull request may close these issues.

github.com/cockroachdb/pebble/internal/metamorphic: TestMetaTwoInstance failed