Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rowexec: optimize the inverted filterer #63196

Merged
merged 1 commit into from
Apr 16, 2021

Conversation

yuzefovich
Copy link
Member

@yuzefovich yuzefovich commented Apr 7, 2021

Previously, we were ignoring the first return parameter of
prepareAddIndexRow which indicates whether we should be adding a row
for inverted expr evaluation which is suboptimal (essentially we were
ignoring the result of prefiltering). This is now fixed.

Additionally, this commit adds some comments and a sanity check around
the usage of batchedInvertedExprEvaluator.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):

		// No spans contain the value, so we can short-circuit processing it
		// altogether.
		return false, nil

I think this is an error case. Unless the spans generated by the optimizer are not matching what the invertedFilterer expects, which could be a correctness issue.
And for the invertedJoiner case the spans it fetches are the ones the batchedInvertedExprEvaluator tells it to.

Can you reproduce the failure?


pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):

		enc = []byte(*row[ifr.invertedColIdx].Datum.(*tree.DBytes))
	}
	if shouldAdd, err := ifr.invertedEval.prepareAddIndexRow(enc, nil /* encFull */); err != nil {

not using shouldAdd was an unfortunate oversight by me in #54283
Looks like that PR has no test for invertedFilterer that would show that pre-filtering is doing something. This is in contrast with PR #53963 that added a test in inverted_joiner_test.go when making a similar change for invertedJoiner. It would be good to rectify this testing oversight in this PR.


pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):

		return ifrStateUnknown, ifr.DrainHelper()
	} else if shouldAdd {
		// Transform to keyRow.

Looks like the code comments here and below were not changed when multi-column inverted indexes were added.
The keyRow is everything other than the inverted column (not just the primary key) , which is fine, but the comments should be updated.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: once you address @sumeerbhola's comments below. I think #63574 is the right approach to fix #63180, but this PR still seems valuable for performance reasons and to avoid panics due to other bugs.

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola and @yuzefovich)


pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):

Previously, sumeerbhola wrote…

I think this is an error case. Unless the spans generated by the optimizer are not matching what the invertedFilterer expects, which could be a correctness issue.
And for the invertedJoiner case the spans it fetches are the ones the batchedInvertedExprEvaluator tells it to.

Can you reproduce the failure?

I opened up #63574 with a reproduction of the failure and a fix. So I agree with @sumeerbhola that this should result in an assertion failed error (but I think it's good to add this check so that we avoid a panic in case of future bugs).

@yuzefovich yuzefovich changed the title rowexec: fix and optimize the inverted filterer rowexec: optimize the inverted filterer Apr 14, 2021
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft and @sumeerbhola)


pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I opened up #63574 with a reproduction of the failure and a fix. So I agree with @sumeerbhola that this should result in an assertion failed error (but I think it's good to add this check so that we avoid a panic in case of future bugs).

Added an assertion error.


pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):

Previously, sumeerbhola wrote…

not using shouldAdd was an unfortunate oversight by me in #54283
Looks like that PR has no test for invertedFilterer that would show that pre-filtering is doing something. This is in contrast with PR #53963 that added a test in inverted_joiner_test.go when making a similar change for invertedJoiner. It would be good to rectify this testing oversight in this PR.

I've stared at the tests and the code for like half an hour and still don't understand how to write a test case for this (in case of the inverted joiner we have access to "disassembled" things like datumsToInvertedExpr, but for the inverted filterer we only have the processor spec). Another thing is that, IIUC, this PR only fixes an optimization (but has no influence on the correctness), so the unit test would need to look into some of the internals of the inverted filterer (i.e. something like whether pre-filtered row is added to the container or not) which will require introducing some knobs into the processor.

Am I missing something? Could you provide more details on how to test this?


pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):

Previously, sumeerbhola wrote…

Looks like the code comments here and below were not changed when multi-column inverted indexes were added.
The keyRow is everything other than the inverted column (not just the primary key) , which is fine, but the comments should be updated.

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)


pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Done.

[nit] there's a similar comment that could use updating at line 88.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft, @sumeerbhola, and @yuzefovich)


pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Added an assertion error.

How about also adding an assertion that bytes.Compare(b.fragmentedSpans[i].End, routingEnc) > 0. I think we can afford to do an additional comparison for the sake of correctness.


pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I've stared at the tests and the code for like half an hour and still don't understand how to write a test case for this (in case of the inverted joiner we have access to "disassembled" things like datumsToInvertedExpr, but for the inverted filterer we only have the processor spec). Another thing is that, IIUC, this PR only fixes an optimization (but has no influence on the correctness), so the unit test would need to look into some of the internals of the inverted filterer (i.e. something like whether pre-filtered row is added to the container or not) which will require introducing some knobs into the processor.

Am I missing something? Could you provide more details on how to test this?

I agree that this is messy to test. I was imagining constructing an invertedFilterer in the normal way without a prefilter, and then injecting batchedInvertedExprEvaluator.{filterer,preFilterState} -- the former is an interface and the latter is []interface{}. We'd need a hook in ProcessorTest to give us access to invertedFilterer after it is constructed, and before it is used, so we can fiddle with it.

I'm fine with a TODO in inverted_filterer_test.go.

Previously, we were ignoring the first return parameter of
`prepareAddIndexRow` which indicates whether we should be adding a row
for inverted expr evaluation which is suboptimal (essentially we were
ignoring the result of prefiltering). This is now fixed.

Additionally, this commit adds some comments and a sanity check around
the usage of `batchedInvertedExprEvaluator`.

Release note: None
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @rytaft and @sumeerbhola)


pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):

Previously, sumeerbhola wrote…

How about also adding an assertion that bytes.Compare(b.fragmentedSpans[i].End, routingEnc) > 0. I think we can afford to do an additional comparison for the sake of correctness.

Done. The End is non-inclusive so that we return an error in case == 0, right?


pkg/sql/rowexec/inverted_filterer.go, line 239 at r1 (raw file):

Previously, sumeerbhola wrote…

I agree that this is messy to test. I was imagining constructing an invertedFilterer in the normal way without a prefilter, and then injecting batchedInvertedExprEvaluator.{filterer,preFilterState} -- the former is an interface and the latter is []interface{}. We'd need a hook in ProcessorTest to give us access to invertedFilterer after it is constructed, and before it is used, so we can fiddle with it.

I'm fine with a TODO in inverted_filterer_test.go.

Sounds good, left a TODO.


pkg/sql/rowexec/inverted_filterer.go, line 243 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

[nit] there's a similar comment that could use updating at line 88.

Nice catch, done.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @rytaft and @sumeerbhola)


pkg/sql/rowexec/inverted_expr_evaluator.go, line 527 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Done. The End is non-inclusive so that we return an error in case == 0, right?

correct

@yuzefovich
Copy link
Member Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Apr 16, 2021

Build succeeded:

@craig craig bot merged commit 2c49249 into cockroachdb:master Apr 16, 2021
@yuzefovich yuzefovich deleted the inverted-filterer branch April 16, 2021 00:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants