
release-20.2: colfetcher: fix NULL checks during unique index decoding #68238

Merged: 1 commit merged into cockroachdb:release-20.2 on Aug 12, 2021

Conversation

@mgartner (Collaborator) commented Jul 29, 2021

Backport 1/1 commits from #68071.

/cc @cockroachdb/release

Release justification: This backport fixes a correctness bug in the
vectorized execution engine. See the release note for details.


colfetcher must detect NULL values in unique secondary index keys on
tables with multiple column families in order to determine whether
consecutive KVs belong to the same row or different rows. Previously,
only the columns in the key that were needed by the query were decoded
and checked for NULL. This caused incorrect query results when NULL
column values were not detected because those columns were not needed.

This commit fixes the issue by checking all columns for NULL when
decoding unique secondary index keys on tables with multiple column
families.

Fixes #66706

Release note (bug fix): A bug has been fixed that caused incorrect query
results when querying tables with multiple column families and unique
secondary indexes. The bug only occurred if 1) vectorized execution was
enabled for the query, 2) the query scanned a unique secondary index
that contained columns from more than one column family, and 3) rows
fetched by the query contained NULL values for some of the indexed
columns. This bug was present since version 20.1.
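
For illustration, here is a minimal logic-test-style sketch of the kind of schema and query affected (hypothetical table, column, and index names; not the regression test added by this commit). The unique index a_idx takes its key column from one column family and its stored column from another, and the indexed column is NULL for the fetched row, so the index entry can span consecutive KVs that the fetcher must recognize as a single row:

statement ok
CREATE TABLE t (
  k INT PRIMARY KEY,
  a INT,
  b INT,
  UNIQUE INDEX a_idx (a) STORING (b),
  FAMILY fam_0 (k, a),
  FAMILY fam_1 (b)
)

statement ok
INSERT INTO t VALUES (1, NULL, 10)

# Expected (correct) result when scanning the unique secondary index.
query III
SELECT k, a, b FROM t@a_idx
----
1  NULL  10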

@mgartner mgartner requested review from yuzefovich and a team July 29, 2021 15:26
@blathers-crl (bot) commented Jul 29, 2021

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Patches should only be created for serious issues.
  • Patches should not break backwards-compatibility.
  • Patches should change as little code as possible.
  • Patches should not change on-disk formats or node communication protocols.
  • Patches should not add new functionality.
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters.
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.

Add a brief release justification to the body of your PR to justify this backport.

Some other things to consider:

  • What did we do to ensure that a user that doesn’t know & care about this backport, has no idea that it happened?
  • Will this work in a cluster of mixed patch versions? Did we test that?
  • If a user upgrades a patch version, uses this feature, and then downgrades, what happens?

@cockroach-teamcity (Member) commented:

This change is Reviewable

# Regression test for #66706. Ensure that cfetcher can correctly determine when
# consecutive KVs in a unique secondary index belong to the same row.
statement ok
SET vectorize = experimental_always
@mgartner (Collaborator, Author) commented:

I added this because I believe it is necessary in 20.2 to ensure that the queries below use the vectorized engine.

@yuzefovich (Member) left a comment:

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner)


pkg/sql/logictest/testdata/logic_test/vectorize, line 1298 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I added this because I believe it is necessary in 20.2 to ensure that the queries below use the vectorized engine.

I think on 20.2 CREATE TABLE and INSERT might not go through vectorized (weren't supported yet), so we want to have this right before the SELECT query, and right after we want RESET vectorize;.
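
A sketch of the suggested arrangement, reusing the hypothetical table from the sketch in the PR description above (the actual test lives in pkg/sql/logictest/testdata/logic_test/vectorize):

statement ok
SET vectorize = experimental_always

query III
SELECT k, a, b FROM t@a_idx
----
1  NULL  10

statement ok
RESET vectorize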

@mgartner mgartner force-pushed the backport20.2-68071 branch from a0e2248 to 5189ea2 Compare July 29, 2021 19:22
@mgartner (Collaborator, Author) left a comment:

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/logictest/testdata/logic_test/vectorize, line 1298 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think on 20.2 CREATE TABLE and INSERT might not go through vectorized (weren't supported yet), so we want to have this right before the SELECT query, and right after we want RESET vectorize;.

Thanks! Done.

@mgartner mgartner force-pushed the backport20.2-68071 branch from 5189ea2 to 0f3f45a Compare August 11, 2021 22:39
@mgartner mgartner merged commit a05c293 into cockroachdb:release-20.2 Aug 12, 2021
@mgartner mgartner deleted the backport20.2-68071 branch August 12, 2021 00:17