-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rowexec: optimize the inverted filterer #63196
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):
// No spans contain the value, so we can short-circuit processing it // altogether. return false, nil
I think this is an error case. Unless the spans generated by the optimizer are not matching what the invertedFilterer
expects, which could be a correctness issue.
And for the invertedJoiner
case the spans it fetches are the ones the batchedInvertedExprEvaluator
tells it to.
Can you reproduce the failure?
pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):
enc = []byte(*row[ifr.invertedColIdx].Datum.(*tree.DBytes)) } if shouldAdd, err := ifr.invertedEval.prepareAddIndexRow(enc, nil /* encFull */); err != nil {
not using shouldAdd
was an unfortunate oversight by me in #54283
Looks like that PR has no test for invertedFilterer
that would show that pre-filtering is doing something. This is in contrast with PR #53963 that added a test in inverted_joiner_test.go
when making a similar change for invertedJoiner
. It would be good to rectify this testing oversight in this PR.
pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):
return ifrStateUnknown, ifr.DrainHelper() } else if shouldAdd { // Transform to keyRow.
Looks like the code comments here and below were not changed when multi-column inverted indexes were added.
The keyRow
is everything other than the inverted column (not just the primary key) , which is fine, but the comments should be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
once you address @sumeerbhola's comments below. I think #63574 is the right approach to fix #63180, but this PR still seems valuable for performance reasons and to avoid panics due to other bugs.
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):
Previously, sumeerbhola wrote…
I think this is an error case. Unless the spans generated by the optimizer are not matching what the
invertedFilterer
expects, which could be a correctness issue.
And for theinvertedJoiner
case the spans it fetches are the ones thebatchedInvertedExprEvaluator
tells it to.Can you reproduce the failure?
I opened up #63574 with a reproduction of the failure and a fix. So I agree with @sumeerbhola that this should result in an assertion failed error (but I think it's good to add this check so that we avoid a panic in case of future bugs).
571ee64
to
e92d3e8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I opened up #63574 with a reproduction of the failure and a fix. So I agree with @sumeerbhola that this should result in an assertion failed error (but I think it's good to add this check so that we avoid a panic in case of future bugs).
Added an assertion error.
pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):
Previously, sumeerbhola wrote…
not using
shouldAdd
was an unfortunate oversight by me in #54283
Looks like that PR has no test forinvertedFilterer
that would show that pre-filtering is doing something. This is in contrast with PR #53963 that added a test ininverted_joiner_test.go
when making a similar change forinvertedJoiner
. It would be good to rectify this testing oversight in this PR.
I've stared at the tests and the code for like half an hour and still don't understand how to write a test case for this (in case of the inverted joiner we have access to "disassembled" things like datumsToInvertedExpr
, but for the inverted filterer we only have the processor spec). Another thing is that, IIUC, this PR only fixes an optimization (but has no influence on the correctness), so the unit test would need to look into some of the internals of the inverted filterer (i.e. something like whether pre-filtered row is added to the container or not) which will require introducing some knobs into the processor.
Am I missing something? Could you provide more details on how to test this?
pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):
Previously, sumeerbhola wrote…
Looks like the code comments here and below were not changed when multi-column inverted indexes were added.
ThekeyRow
is everything other than the inverted column (not just the primary key) , which is fine, but the comments should be updated.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)
pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Done.
[nit] there's a similar comment that could use updating at line 88.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r2.
Reviewable status: complete! 2 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft, @sumeerbhola, and @yuzefovich)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Added an assertion error.
How about also adding an assertion that bytes.Compare(b.fragmentedSpans[i].End, routingEnc) > 0
. I think we can afford to do an additional comparison for the sake of correctness.
pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I've stared at the tests and the code for like half an hour and still don't understand how to write a test case for this (in case of the inverted joiner we have access to "disassembled" things like
datumsToInvertedExpr
, but for the inverted filterer we only have the processor spec). Another thing is that, IIUC, this PR only fixes an optimization (but has no influence on the correctness), so the unit test would need to look into some of the internals of the inverted filterer (i.e. something like whether pre-filtered row is added to the container or not) which will require introducing some knobs into the processor.Am I missing something? Could you provide more details on how to test this?
I agree that this is messy to test. I was imagining constructing an invertedFilterer
in the normal way without a prefilter, and then injecting batchedInvertedExprEvaluator.{filterer,preFilterState}
-- the former is an interface and the latter is []interface{}
. We'd need a hook in ProcessorTest
to give us access to invertedFilterer
after it is constructed, and before it is used, so we can fiddle with it.
I'm fine with a TODO in inverted_filterer_test.go
.
Previously, we were ignoring the first return parameter of `prepareAddIndexRow` which indicates whether we should be adding a row for inverted expr evaluation which is suboptimal (essentially we were ignoring the result of prefiltering). This is now fixed. Additionally, this commit adds some comments and a sanity check around the usage of `batchedInvertedExprEvaluator`. Release note: None
e92d3e8
to
6236309
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @rytaft and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):
Previously, sumeerbhola wrote…
How about also adding an assertion that
bytes.Compare(b.fragmentedSpans[i].End, routingEnc) > 0
. I think we can afford to do an additional comparison for the sake of correctness.
Done. The End
is non-inclusive so that we return an error in case == 0
, right?
pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):
Previously, sumeerbhola wrote…
I agree that this is messy to test. I was imagining constructing an
invertedFilterer
in the normal way without a prefilter, and then injectingbatchedInvertedExprEvaluator.{filterer,preFilterState}
-- the former is an interface and the latter is[]interface{}
. We'd need a hook inProcessorTest
to give us access toinvertedFilterer
after it is constructed, and before it is used, so we can fiddle with it.I'm fine with a TODO in
inverted_filterer_test.go
.
Sounds good, left a TODO.
pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
[nit] there's a similar comment that could use updating at line 88.
Nice catch, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @rytaft and @sumeerbhola)
pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Done. The
End
is non-inclusive so that we return an error in case== 0
, right?
correct
TFTRs! bors r+ |
Build succeeded: |
Previously, we were ignoring the first return parameter of
prepareAddIndexRow
which indicates whether we should be adding a rowfor inverted expr evaluation which is suboptimal (essentially we were
ignoring the result of prefiltering). This is now fixed.
Additionally, this commit adds some comments and a sanity check around
the usage of
batchedInvertedExprEvaluator
.Release note: None