Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pebble: Collect Key Statistics #2713

Merged
merged 1 commit into from
Aug 3, 2023

Conversation

raggar
Copy link
Contributor

@raggar raggar commented Jul 6, 2023

Created a new function ScanStatistics that returns counts of the
different key kinds in Pebble (by level) as well as the number of
snapshot keys. Also modified ScanInternal to surface the level of each
key within each visitor function.

Informs: #1996

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@raggar raggar force-pushed the rahul_advanced_sstable branch 9 times, most recently from 1e3532f to 43027bd Compare July 11, 2023 19:17
@raggar raggar changed the title pebble: Collect Advanced Pebble Metrics pebble: Collect Key Statistics Jul 11, 2023
@raggar raggar force-pushed the rahul_advanced_sstable branch 5 times, most recently from 170501c to c4511f5 Compare July 12, 2023 15:32
@raggar raggar requested review from a team and itsbilal July 12, 2023 15:35
@raggar raggar marked this pull request as ready for review July 13, 2023 16:50
@raggar raggar force-pushed the rahul_advanced_sstable branch 2 times, most recently from fd0941c to e32aef6 Compare July 13, 2023 20:28
Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 7 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @RahulAggarwal1016)


db.go line 1179 at r2 (raw file):

	visitRangeKey func(start, end []byte, keys []rangekey.Key) error,
	visitSharedFile func(sst *SharedSSTMeta) error,
	visitKey func(key *InternalKey, value LazyValue) error,

I think it's time to put all these options into a ScanInternalOptions struct (and rename scanInternalOptions to scanInternalIterOptions). There we can document each field properly.

We should use a RestrictToLevel bool instead of using a pointer for the level (it makes the code self documenting in a way that comparing to nil doesn't).


db.go line 2509 at r2 (raw file):

// ScanStatistics returns the count of different key kinds within the lsm for a
// key span [lower, upper) as well as the number of snapshot keys.
func (d *DB) ScanStatistics(lower, upper []byte) (*LsmKeyStatistics, error) {

Since we're adding the API, we might as well add a Context argument from the getgo.


db.go line 2522 at r2 (raw file):

			func(key *InternalKey, _ LazyValue) error {
				if prevKey != nil && d.cmp(prevKey.UserKey, key.UserKey) == 0 {
					stats.levels[lvl].compactionPinnedCount++

This could use a comment explaining the logic

Copy link
Contributor Author

@raggar raggar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 7 files reviewed, 3 unresolved discussions (waiting on @itsbilal and @RaduBerinde)


db.go line 1179 at r2 (raw file):

Previously, RaduBerinde wrote…

I think it's time to put all these options into a ScanInternalOptions struct (and rename scanInternalOptions to scanInternalIterOptions). There we can document each field properly.

We should use a RestrictToLevel bool instead of using a pointer for the level (it makes the code self documenting in a way that comparing to nil doesn't).

if I use RestrictToLevel as a bool I would need to pass in another argument for the actual level

@RaduBerinde
Copy link
Member

db.go line 1179 at r2 (raw file):

Previously, RahulAggarwal1016 (Rahul Aggarwal) wrote…

if I use RestrictToLevel as a bool I would need to pass in another argument for the actual level

Yes, but they wouldn't be arguments, they would all be fields in this struct

@raggar raggar force-pushed the rahul_advanced_sstable branch 3 times, most recently from 6fc8b1a to 660dab2 Compare July 18, 2023 14:07
Copy link
Contributor Author

@raggar raggar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 7 files reviewed, all discussions resolved (waiting on @itsbilal)


db.go line 1179 at r2 (raw file):

Previously, RaduBerinde wrote…

Yes, but they wouldn't be arguments, they would all be fields in this struct

done


db.go line 2509 at r2 (raw file):

Previously, RaduBerinde wrote…

Since we're adding the API, we might as well add a Context argument from the getgo.

done


db.go line 2522 at r2 (raw file):

Previously, RaduBerinde wrote…

This could use a comment explaining the logic

done

@raggar raggar requested a review from RaduBerinde July 18, 2023 14:07
Copy link
Contributor

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 7 files at r1, 1 of 2 files at r2, 5 of 6 files at r3, all commit messages.
Reviewable status: 6 of 7 files reviewed, 12 unresolved discussions (waiting on @RaduBerinde and @RahulAggarwal1016)


db.go line 1179 at r2 (raw file):

Previously, RahulAggarwal1016 (Rahul Aggarwal) wrote…

done

I don't see why this change is necessary? visitPointKey is there for just this purpose. Fragmentation/defragmentation of range keys and range dels means you probably don't want to use the simplification of just checking for user key changes for rangedels/keys anyway, so it makes sense to just use visitPointKey here and then pass noop methods for the other two.


db.go line 1180 at r2 (raw file):

	visitSharedFile func(sst *SharedSSTMeta) error,
	visitKey func(key *InternalKey, value LazyValue) error,
	level *int,

any particular reason why this is a pointer to an int?


db.go line 2495 at r2 (raw file):

// KeyStatistics keeps track of the number of keys that have been pinned by a
// compaction as well as counts of the different key kinds in the lsm.

s/compaction/snapshot here and below


db.go line 2498 at r2 (raw file):

type KeyStatistics struct {
	compactionPinnedCount int
	kindsCount            map[string]int

Having a map of maps is pretty inefficient, you could go for just one separate counter per kind you care about or you could just create an array of size InternalKeyKindMax (i.e. [InternalKeyKindMax]int) and just mutate the appropriate entry for each Kind.


db.go line 2502 at r2 (raw file):

// LsmKeyStatistics is used by DB.ScanStatistics.
type LsmKeyStatistics struct {

Nit: s/Lsm/LSM/


db.go line 2514 at r2 (raw file):

	// statistics per level
	for lvl := 0; lvl < numLevels; lvl++ {

You'd need to exercise a little more caution here for L0 sublevels, as this simplification does not work for L0. You could probably grab a ReadState at the start of this method and then call ScanInternal on each L0 file you see in it, one file at a time. For the otehr levels, this logic works.


options.go line 261 at r3 (raw file):

	skipSharedLevels bool

	// includeObsoleteKeys uses a keyspan.InterleavingIter instead of a

An interleaving iter is used in both cases. The comment should specify that this just controls whether a pointCollapsingIter is used or not.


options.go line 271 at r3 (raw file):

	restrictToLevel bool

	// level indicates the level which should be iterated on during Scan Internal.

Nit: ScanInternal is one word


scan_internal.go line 1076 at r3 (raw file):

	if i.opts.includeObsoleteKeys {
		iiter := &keyspan.InterleavingIter{}
		iiter.Init(i.comparer, &buf.merging, &rangeDelMiter, nil, i.opts.LowerBound, i.opts.UpperBound)

nil should have a /* mask */ comment on it.


scan_internal.go line 1185 at r3 (raw file):

func (i *scanInternalIterator) unsafeRangeDel() *keyspan.Span {
	if !i.opts.includeObsoleteKeys {
		return i.pointKeyIter.(*pointCollapsingIterator).iter.Span()

This should probably be reverted?


snapshot.go line 65 at r3 (raw file):

// point keys deleted by range dels and keys masked by range keys.
func (s *Snapshot) ScanInternal(
	ctx context.Context, lower, upper []byte, scanInternalOps scanInternalIterOptions,

s/Ops/Opts/


scan_internal_test.go line 178 at r3 (raw file):

				fmt.Fprintf(&b, "Level %d:\n", lvl)
				if showCompactionPinned {
					fmt.Fprintf(&b, "  compaction pinned count: %d\n", stats.levels[lvl].compactionPinnedCount)

snapshot pinned


scan_internal_test.go line 187 at r3 (raw file):

			fmt.Fprintf(&b, "Aggregate:\n")
			if showCompactionPinned {
				fmt.Fprintf(&b, "  compaction pinned count: %d\n", stats.accumulated.compactionPinnedCount)

snapshot pinned

@raggar raggar force-pushed the rahul_advanced_sstable branch from 65a972a to e7a2ba8 Compare July 27, 2023 14:24
@raggar raggar requested a review from itsbilal July 27, 2023 14:24
Copy link
Contributor Author

@raggar raggar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 7 files reviewed, 10 unresolved discussions (waiting on @itsbilal, @jbowens, and @RaduBerinde)


scan_internal.go line 945 at r7 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

It should be the current point key, assuming pointCollapsingIterator is not being used (and you might need a comment here to the effect that iterInfo is likely inaccurate if includeObsoleteFiles=false) because the pointCollapsingIter will move the mergingIter to the next user key before returning the current one. There are some other complications as there are two interleaving iterators in between (the rangekey one and the rangedel one), but they shouldn't mess with things too much; the point key iterator (which is the merging iter) should continue to point to the current key.

It is pretty confusing how this code is making assumptions about the iterator stack though, and it "feels" unclean and buggy. I'm also not sure if I understand the issue you're observing. I wonder if there's a better solution here, such as around threading a Level() method to (some) internal iterators to get the level of the last-returned key, but that would be a bigger change. Maybe @jbowens has some thoughts here

I think some iterator in some test tries to get the level for an InternalKeyKindMerge from the heap (but its empty) which is causing the error. For now I added a change where if the top of the mergingIter heap is empty, I pass in -1 as the level (indicating it is invalid).

@raggar raggar force-pushed the rahul_advanced_sstable branch from e7a2ba8 to 127a180 Compare July 27, 2023 14:33
Copy link
Contributor Author

@raggar raggar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 7 files reviewed, 10 unresolved discussions (waiting on @itsbilal, @jbowens, and @RaduBerinde)


scan_internal.go line 945 at r7 (raw file):

Previously, RahulAggarwal1016 (Rahul Aggarwal) wrote…

I think some iterator in some test tries to get the level for an InternalKeyKindMerge from the heap (but its empty) which is causing the error. For now I added a change where if the top of the mergingIter heap is empty, I pass in -1 as the level (indicating it is invalid).

this makes the sql part a little bit awkward since the level data that I have only applies to point keys, but then since we still want to show data for range keys, if I have a column for rangeKeys, there would technically be some incorrect data for the level rows so im not too sure

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 1 of 7 files reviewed, 12 unresolved discussions (waiting on @itsbilal, @RaduBerinde, and @RahulAggarwal1016)


db.go line 2658 at r13 (raw file):

type KeyStatistics struct {
	// snapshotPinnedKeys represents the number of duplicate keys per sstable.
	// This occurs when a compaction tries to compact ta key but can't due to it being pinned by an open snapshot.

nit: maybe "when a compaction determines a key is obsolete, but cannot elide the key because it's required by an open snapshot"


db.go line 2664 at r13 (raw file):

	// Note: these fields are currently only populated for point keys.
	kindsCount   [InternalKeyKindMax + 1]int
	bytesPerKind [InternalKeyKindMax + 1]uint64

It looks like this is only the key bytes (eg, excluding the value size). Instead of summing key sizes by kind, I think we might get more value out of constructing a histogram (see prometheus.NewHistogram) of all key sizes, irrespective of key-kind, and another histogram of value sizes for all keys that carry values (eg, SET, SETWITHDEL, MERGE).

Also a cumulative total of key.Size()+value.Len() irrespective of key kind would be useful (and could be used for the pacing you're working on in the other PR).


db.go line 2689 at r13 (raw file):

			size := uint64(key.Size())
			kind := key.Kind()
			if d.cmp(prevKey.UserKey, key.UserKey) == 0 {

nit: use d.equal which is marginally more performant when you only need to check for equality


scan_internal.go line 945 at r7 (raw file):

It is pretty confusing how this code is making assumptions about the iterator stack though, and it "feels" unclean and buggy. I'm also not sure if I understand the issue you're observing. I wonder if there's a better solution here, such as around threading a Level() method to (some) internal iterators to get the level of the last-returned key, but that would be a bigger change.

I think adding a new internal interface something like this would be fine:

type IteratorLevelKind int8

const (
    IteratorLevelUnknown IteratorLevelKind = iota
    IteratorLevelLSM
    IteratorLevelFlushable
)

type IteratorLevel struct {
    kind IteratorLevelKind
    // FlushableIndex indicates the position within the flushable queue of this level.
    // Only valid if kind == IteratorLevelFlushable.
    FlushableIndex int
    // The level within the LSM. Only valid if kind == IteratorLevelLSM.
    Level int
    // Sublevel is only valid if kind == IteratorLevelLSM and Level == 0.
    Sublevel int
} 

type levelledInternalIterator interface {
    internalIterator
    IteratorLevel() base.IteratorLevel
}

scan_internal.go line 360 at r13 (raw file):

// Note: this is struct is only provided for point keys.
type iterInfo struct {
	// level indicates the level of point key called with visitPointKey (level == -1 indicates an invalid level).

What does an "invalid level" mean?

@raggar raggar force-pushed the rahul_advanced_sstable branch from 287c2b4 to 655ad56 Compare August 1, 2023 18:44
Copy link
Contributor Author

@raggar raggar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 of 7 files reviewed, 12 unresolved discussions (waiting on @itsbilal, @jbowens, and @RaduBerinde)


db.go line 2658 at r13 (raw file):

Previously, jbowens (Jackson Owens) wrote…

nit: maybe "when a compaction determines a key is obsolete, but cannot elide the key because it's required by an open snapshot"

fixed!


db.go line 2664 at r13 (raw file):

Previously, jbowens (Jackson Owens) wrote…

It looks like this is only the key bytes (eg, excluding the value size). Instead of summing key sizes by kind, I think we might get more value out of constructing a histogram (see prometheus.NewHistogram) of all key sizes, irrespective of key-kind, and another histogram of value sizes for all keys that carry values (eg, SET, SETWITHDEL, MERGE).

Also a cumulative total of key.Size()+value.Len() irrespective of key kind would be useful (and could be used for the pacing you're working on in the other PR).

I added the cumulative total but I think ill add the promtheus changes in a seperate pr (this one is a bit long)


db.go line 2689 at r13 (raw file):

Previously, jbowens (Jackson Owens) wrote…

nit: use d.equal which is marginally more performant when you only need to check for equality

fixed!


scan_internal.go line 360 at r13 (raw file):

Previously, jbowens (Jackson Owens) wrote…

What does an "invalid level" mean?

I meant for invalid level to be equivalent to be a non lsm level such as for a memtable or when it is unknown, but using the interface you provided this code is removed.

@raggar raggar force-pushed the rahul_advanced_sstable branch 4 times, most recently from f3b488f to 9c993d0 Compare August 2, 2023 13:40
@raggar raggar requested a review from jbowens August 2, 2023 13:42
@raggar raggar force-pushed the rahul_advanced_sstable branch from 9c993d0 to ca95928 Compare August 2, 2023 13:45
@raggar raggar mentioned this pull request Aug 2, 2023
@raggar raggar requested a review from a team August 2, 2023 20:29
Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 7 files reviewed, 15 unresolved discussions (waiting on @itsbilal, @RaduBerinde, and @RahulAggarwal1016)


db.go line 2669 at r15 (raw file):

type LSMKeyStatistics struct {
	accumulated KeyStatistics
	levels      [numLevels]KeyStatistics

Let's add a comment clarifying that Levels only contains statistics about point keys. Range deletions and range keys will appear in accumulated, but not Levels.


db.go line 2670 at r15 (raw file):

	accumulated KeyStatistics
	levels      [numLevels]KeyStatistics
	bytesRead   uint64

Do these fields need to be exported (capitalized)? I'm assuming yes, since this is the return value of ScanStatistics.


db.go line 2713 at r15 (raw file):

			return nil
		},
		nil,

Code snippet:

nil, /* visitSharedFile */

db.go line 2714 at r15 (raw file):

		},
		nil,
		true,

Code snippet:

true, /* includeObsoleteKeys */

scan_internal.go line 368 at r15 (raw file):

// Note: this is struct is only provided for point keys.
type IteratorLevel struct {
	kind iteratorLevelKind

nit: let's export IteratorLevelKind and the kind field since the caller may need to access it as a part of its visitPointKey implementation.

@raggar raggar force-pushed the rahul_advanced_sstable branch from ca95928 to 30b5984 Compare August 3, 2023 18:28
Copy link
Contributor Author

@raggar raggar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 7 files reviewed, 15 unresolved discussions (waiting on @itsbilal, @jbowens, and @RaduBerinde)


db.go line 2669 at r15 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Let's add a comment clarifying that Levels only contains statistics about point keys. Range deletions and range keys will appear in accumulated, but not Levels.

added!


db.go line 2670 at r15 (raw file):

Previously, jbowens (Jackson Owens) wrote…

Do these fields need to be exported (capitalized)? I'm assuming yes, since this is the return value of ScanStatistics.

oh yeah they do, fixed!


db.go line 2713 at r15 (raw file):

			return nil
		},
		nil,

added!


db.go line 2714 at r15 (raw file):

		},
		nil,
		true,

added!

@raggar raggar force-pushed the rahul_advanced_sstable branch from 30b5984 to 0ae7d14 Compare August 3, 2023 18:37
Created a new function `ScanStatistics` that returns counts of the
different key kinds in Pebble (by level) as well as the number of
snapshot keys. Also modified `ScanInternal` to surface the level of each
key within each visitor function.

Informs: cockroachdb#1996
@raggar raggar force-pushed the rahul_advanced_sstable branch from 0ae7d14 to 38f167b Compare August 3, 2023 18:38
@raggar raggar requested a review from jbowens August 3, 2023 18:49
@itsbilal itsbilal dismissed their stale review August 3, 2023 18:54

Jackson LGTM

@raggar raggar merged commit 83c9361 into cockroachdb:master Aug 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants