Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: use size-carrying point tombstones #104539

Merged
merged 1 commit into from
Jun 27, 2023
Merged

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Jun 7, 2023

Adapt calls to ClearMVCC, ClearUnversioned, ClearEngineKey and ClearIntent to
write a new kind of point tombstone exposed by Pebble that carries the size of
the deleted kv pair. These sizes are used by Pebble to produce more accurate
estimations of space amplification within the database and pick more productive
compactions.

In point-tombstone/hetereogeneous-value-sizes roachtest runs, the kv database's
approximate disk size is reduced to 2.3 GiB by the end of the run.

Epic: CRDB-25405
Release note (performance improvement): Improve the time to disk space
reclamation when deleting rows. Previously, in scenarios where rows had large
variations in row size, it was possible for disk space to not be reclaimed
after MVCC garbage collection deleted the rows.

@jbowens jbowens requested review from a team as code owners June 7, 2023 17:59
@jbowens jbowens requested review from srosenberg, renatolabs and RaduBerinde and removed request for a team June 7, 2023 17:59
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jbowens jbowens requested a review from sumeerbhola June 7, 2023 17:59
@blathers-crl
Copy link

blathers-crl bot commented Jun 7, 2023

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added the O-community Originated from the community label Jun 7, 2023
@jbowens jbowens removed the O-community Originated from the community label Jun 7, 2023
@jbowens
Copy link
Collaborator Author

jbowens commented Jun 7, 2023

glad to be a part of the community

@jbowens jbowens force-pushed the delsized branch 3 times, most recently from d27cb4c to 607b6b1 Compare June 9, 2023 13:55
Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 13 of 15 files at r1, 1 of 1 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jbowens, @RaduBerinde, @renatolabs, and @srosenberg)


pkg/storage/engine.go line 667 at r3 (raw file):

	//
	// It is safe to modify the contents of the arguments after it returns.
	ClearEngineKey(key EngineKey) error

what about this case which can be used for intents?
I suppose we are only doing MVCC keys (and not SINGLEDELSIZED too, which is the common case for intents) since this estimation problem mainly manifests when doing large (bursty) GCs and not the relatively evenly spaced intent resolution?


pkg/storage/mvcc.go line 2034 at r3 (raw file):

				if err := writer.ClearMVCC(oldVersionKey, ClearOptions{
					ValueSizeKnown: curProvValRaw != nil,
					ValueSize:      uint32(len(curProvValRaw)),

How will we know if we got the size wrong: the code in mvcc.go is quite complex -- I did go through these changes carefully, but staring at the complex conditional paths in mvcc.go always leaves me with a feeling that things are correct because they gets exercised by all the tests.

Should we be adding a counter inside Pebble of wrong sizes it encountered, and after each MVCC test do a full manual compaction and check that the counter is 0?


pkg/storage/batch_test.go line 864 at r3 (raw file):

	defer log.Scope(t).Close(t)

	e, err := Open(context.Background(), InMemory(),

Is it easy to have a simple unit test to check that the plumbing is working? It's great that we have the roachtest, but it would be good to detect any future plumbing bugs with unit tests.

jbowens added a commit to jbowens/cockroach that referenced this pull request Jun 12, 2023
This commit performs a small refactor of the batch reading interface exposed by
the storage package. It removes 'Pebble' from a few names, removes the
duplicated key kinds (Pebble exports these now), and adds assertions to ensure
new key kinds trigger compilation failures. This motivated by cockroachdb#104539, in which
I never even considered the batch reader or how it might need to change.

Epic: none
Release note: None
craig bot pushed a commit that referenced this pull request Jun 13, 2023
104753: storage: remove duplicated key kind constants r=jbowens a=jbowens

This commit performs a small refactor of the batch reading interface exposed by the storage package. It removes 'Pebble' from a few names, removes the duplicated key kinds (Pebble exports these now), and adds assertions to ensure new key kinds trigger compilation failures. This motivated by #104539, in which I never even considered the batch reader or how it might need to change.

Epic: none
Release note: None

Co-authored-by: Jackson Owens <[email protected]>
@jbowens jbowens force-pushed the delsized branch 4 times, most recently from 1a29767 to 50528d8 Compare June 16, 2023 22:05
@jbowens jbowens requested a review from a team June 16, 2023 22:05
@jbowens jbowens force-pushed the delsized branch 2 times, most recently from cf7c148 to 5012c4c Compare June 20, 2023 15:30
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde, @renatolabs, @srosenberg, and @sumeerbhola)


pkg/storage/engine.go line 667 at r3 (raw file):

Previously, sumeerbhola wrote…

what about this case which can be used for intents?
I suppose we are only doing MVCC keys (and not SINGLEDELSIZED too, which is the common case for intents) since this estimation problem mainly manifests when doing large (bursty) GCs and not the relatively evenly spaced intent resolution?

hrm, I'm not sure. to be honest, this wasn't a principled exclusion; I was just focused on MVCC because I knew we always should know the key/val size due to the requirement of updating MVCC stats. I think it would be reasonable to extend it further, even to SINGLEDELSIZED, but I don't have a great sense of how frequent the estimation problem manifests there.

I've expanded the interface here so that it can be used at least in all the non-SINGLEDEL instances.


pkg/storage/mvcc.go line 2034 at r3 (raw file):

Previously, sumeerbhola wrote…

How will we know if we got the size wrong: the code in mvcc.go is quite complex -- I did go through these changes carefully, but staring at the complex conditional paths in mvcc.go always leaves me with a feeling that things are correct because they gets exercised by all the tests.

Should we be adding a counter inside Pebble of wrong sizes it encountered, and after each MVCC test do a full manual compaction and check that the counter is 0?

I did this, manually sprinkling the manual compact throughout MVCC tests. I put the assert in the ForTesting ConfigOption.


pkg/storage/batch_test.go line 864 at r3 (raw file):

Previously, sumeerbhola wrote…

Is it easy to have a simple unit test to check that the plumbing is working? It's great that we have the roachtest, but it would be good to detect any future plumbing bugs with unit tests.

Done.

@jbowens jbowens force-pushed the delsized branch 3 times, most recently from 2c8f3ac to 08dea42 Compare June 26, 2023 20:25
Adapt calls to ClearMVCC, ClearUnversioned, ClearEngineKey and ClearIntent to
write a new kind of point tombstone exposed by Pebble that carries the size of
the deleted kv pair. These sizes are used by Pebble to produce more accurate
estimations of space amplification within the database and pick more productive
compactions.

In point-tombstone/hetereogeneous-value-sizes roachtest runs, the kv database's
approximate disk size is reduced to 2.3 GiB by the end of the run.

Epic: CRDB-25405
Release note (performance improvement): Improve the time to disk space
reclamation when deleting rows. Previously, in scenarios where rows had large
variations in row size, it was possible for disk space to not be reclaimed
after MVCC garbage collection deleted the rows.
Copy link
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 11 files at r4, 15 of 18 files at r5, 5 of 5 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde, @renatolabs, @srosenberg, and @sumeerbhola)

@jbowens
Copy link
Collaborator Author

jbowens commented Jun 27, 2023

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Jun 27, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants