Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP on KV projection pushdown #92294

Closed
wants to merge 26 commits into from

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Nov 21, 2022

Epic: CRDB-14837

Informs: #82323
Informs: #87610

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the kv-pushdown branch 5 times, most recently from 0615019 to f100b72 Compare November 28, 2022 19:48
Comment on lines 2546 to 2626
// This "any type" field is a serialized descpb.IndexFetchSpec that can be
// used to initialize a columnar scan for MVCC. We use this generic type to
// resolve some painful circular dependency issues.
// TODO: better comment.
google.protobuf.Any index_fetch_spec = 29;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered making this bytes or rather string? I think that a lot of the time the index_fetch_spec is going to be the same. I imagine nobody is every going to use the protobuf directly but rather will build some other sort of decoder struct from it. I imagine we'll want to cache those things. The string itself can be a hash key into the cache. I think it'd be nice to avoid decoding and parsing the specs every time. Ideally we'd define some type like type EncodedFetchSpec string and use a cast type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I haven't really considered this, but thanks for taking a look.

I'm currently trying to make the prototype work without too much thought to this kind of questions - at the moment, I'm struggling with all the intricacies of advancing the iterator in getAndAdvance, especially in the presence of multiple versions of the same keys. That code is hard to comprehend :/

@yuzefovich yuzefovich force-pushed the kv-pushdown branch 4 times, most recently from 50a81a3 to 5a48615 Compare December 1, 2022 00:44
@yuzefovich
Copy link
Member Author

yuzefovich commented Dec 1, 2022

The good news is that this pushdown has the impact on the TPCH queries that we expect in the multi-tenant setup:
UPDATE: these results are invalid. See #94438 (comment) for details.

Q1:	before: 8.73s	after: 6.72s	 -23.09%
Q2:	before: 0.28s	after: 0.22s	 -24.56%
Q3:	before: 9.88s	after: 3.97s	 -59.85%
Q4:	before: 4.54s	after: 3.89s	 -14.30%
Q5:	before: 10.39s	after: 10.25s	 -1.31%
Q6:	before: 33.57s	after: 32.96s	 -1.82%
Q7:	before: 23.82s	after: 23.75s	 -0.29%
Q8:	before: 3.78s	after: 3.31s	 -12.41%
Q9:	before: 28.15s	after: 25.54s	 -9.25%
Q10:	before: 5.01s	after: 4.54s	 -9.43%
Q11:	before: 2.44s	after: 2.43s	 -0.41%
Q12:	before: 34.75s	after: 34.16s	 -1.69%
Q13:	before: 3.15s	after: 2.12s	 -32.63%
Q14:	before: 3.27s	after: 0.72s	 -78.04%
Q15:	before: 16.80s	after: 16.48s	 -1.89%
Q16:	before: 1.79s	after: 1.00s	 -44.34%
Q17:	before: 1.02s	after: 0.34s	 -66.20%
Q18:	before: 30.14s	after: 7.05s	 -76.60%
Q19:	before: 13.76s	after: 6.01s	 -56.31%
Q20:	before: 55.33s	after: 52.68s	 -4.80%
Q21:	before: 24.33s	after: 24.31s	 -0.07%
Q22:	before: 1.49s	after: 0.58s	 -61.18%

Multi-tenant is still slower than single tenant which is because other fetchers (e.g. index and lookup joins) don't yet use this:

Q1:	single-tenant: 5.72s	multi-tenant: 6.72s	 17.55%
Q2:	single-tenant: 0.19s	multi-tenant: 0.22s	 10.63%
Q3:	single-tenant: 4.17s	multi-tenant: 3.97s	 -4.87%
Q4:	single-tenant: 1.40s	multi-tenant: 3.89s	 177.42%
Q5:	single-tenant: 2.39s	multi-tenant: 10.25s	 328.20%
Q6:	single-tenant: 7.24s	multi-tenant: 32.96s	 355.43%
Q7:	single-tenant: 7.00s	multi-tenant: 23.75s	 239.45%
Q8:	single-tenant: 1.00s	multi-tenant: 3.31s	 230.27%
Q9:	single-tenant: 8.32s	multi-tenant: 25.54s	 206.84%
Q10:	single-tenant: 1.91s	multi-tenant: 4.54s	 137.15%
Q11:	single-tenant: 0.66s	multi-tenant: 2.43s	 269.82%
Q12:	single-tenant: 7.70s	multi-tenant: 34.16s	 343.72%
Q13:	single-tenant: 1.67s	multi-tenant: 2.12s	 27.27%
Q14:	single-tenant: 0.74s	multi-tenant: 0.72s	 -3.01%
Q15:	single-tenant: 3.92s	multi-tenant: 16.48s	 320.67%
Q16:	single-tenant: 0.86s	multi-tenant: 1.00s	 16.57%
Q17:	single-tenant: 0.26s	multi-tenant: 0.34s	 32.23%
Q18:	single-tenant: 6.77s	multi-tenant: 7.05s	 4.12%
Q19:	single-tenant: 4.60s	multi-tenant: 6.01s	 30.81%
Q20:	single-tenant: 16.69s	multi-tenant: 52.68s	 215.61%
Q21:	single-tenant: 7.41s	multi-tenant: 24.31s	 228.32%
Q22:	single-tenant: 0.62s	multi-tenant: 0.58s	 -7.01%

@yuzefovich yuzefovich force-pushed the kv-pushdown branch 7 times, most recently from 969c3d1 to 814f2b1 Compare December 8, 2022 01:57
@yuzefovich yuzefovich force-pushed the kv-pushdown branch 4 times, most recently from 77431e5 to 3977214 Compare December 14, 2022 17:31
@yuzefovich yuzefovich force-pushed the kv-pushdown branch 6 times, most recently from 5337deb to 6bab717 Compare December 21, 2022 23:39
This commit introduces `NextKVer` interface which only consists of
a single method for returning KVs from somewhere. The interface is then
used by the `cFetcher`. Currently, there is only one implementation
(`row.KVFetcher`), but soon we'll introduce another one to push the
projections into the KV layer. This required moving
`MVCCDecodingStrategy` type into `storage` package as well.

Note that I chose to not introduce other methods that `cFetcher` is
calling on `row.KVFetcher` since they won't be necessary for the
pushdown work. Also since (at least currently) the `cFetcher` doesn't
use the `spanID` value, we omit it from the interface method signature
(it's only used by the lookup joins which aren't yet vectorized).

Additionally, this commit moves a single comment to the appropriate
place.

Release note: None
This reverts commit 134f90b7ce578f63a915e08e757d64f7a255e128.
This reverts commit a2d9008fea22b80f126a54da3b06614369d5fba6.
This reverts commit cf810a18c3f1e6852818125b521fd299e580db5a.
This will allow us to import the index fetch spec into `roachpb` which
might be needed (TBD) for the KV projection pushdown work.

This required changing the custom cast types used in the spec from type
aliases defined in `descpb` to one level lower ones from `catid`.

Epic: None

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants