65437: opt: normalize CollateExpr Locale r=mgartner a=mgartner

#### opt: format CollateExpr.Locale in opt trees

Formatted opt trees now include the `Locale` field of `CollateExpr`s. Previously, the `Locale` was never shown.

Release note: None

#### opt: normalize CollateExpr Locale

This commit normalizes the `Locale` string of a `CollateExpr` when the expression is built in optbuilder. Normalization ensures that collated string expressions with different but equivalent locales are considered equal. For example, the expressions `s COLLATE "en_us"` and `s COLLATE "en-US"` are equivalent, but prior to this commit they were considered non-equivalent.

This change allows crucial optimizer rules, like `GenerateConstrainedScans`, to apply in more cases. Consider the table:

    CREATE TABLE t (
      s STRING,
      c STRING COLLATE en_US AS (s COLLATE en_US) VIRTUAL,
      INDEX (c)
    )

Prior to this commit, none of the following queries would perform a constrained scan on the secondary index because the collated expressions on the left side of the `=` were not considered equal to the virtual column expression.

    SELECT * FROM t WHERE s COLLATE "en_US" = 'foo' COLLATE en_US
    SELECT * FROM t WHERE s COLLATE "en-US" = 'foo' COLLATE en_US
    SELECT * FROM t WHERE s COLLATE "en-us" = 'foo' COLLATE en_US

The locale is normalized in optbuilder rather than in a normalization rule for the sake of efficiency. A normalization rule would have to check that a locale is not already normalized in order to prevent an infinite normalization loop. This would require normalizing the locale multiple times: at least once to normalize it, and at least once more in the recursive call to `ConstructCollate` to detect whether the locale was already normalized. Normalizing the locale in optbuilder requires normalizing it only once.
Fixes #65343

Release note (performance improvement): The optimizer now generates query plans that scan indexes on virtual collated string columns, regardless of the casing or formatting of the collated locale in the query.

Co-authored-by: Marcus Gartner <[email protected]>
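
The effect of normalizing the locale once, at build time, can be illustrated with a small self-contained sketch. Everything below is hypothetical and is not the code changed by this commit: `normalizeLocale`, `collateExpr`, and `buildCollate` are illustrative names, and the canonical form chosen here (lowercase language, uppercase two-letter region, `-` separator) is an assumption made for the example. The real optimizer may use a different canonical form and a proper locale parser.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLocale is a hypothetical, simplified normalizer. It maps '_' to
// '-', lowercases the language subtag, and uppercases two-letter region
// subtags. It does not handle scripts, variants, or extensions.
func normalizeLocale(locale string) string {
	parts := strings.Split(strings.ReplaceAll(locale, "_", "-"), "-")
	for i, p := range parts {
		switch {
		case i == 0:
			parts[i] = strings.ToLower(p)
		case len(p) == 2:
			parts[i] = strings.ToUpper(p)
		}
	}
	return strings.Join(parts, "-")
}

// collateExpr is a hypothetical miniature of a collated string expression:
// an input column name plus a locale.
type collateExpr struct {
	Input  string
	Locale string
}

// buildCollate plays the role of the builder: it normalizes the locale
// exactly once, when the expression is constructed, so later equality
// checks are plain struct comparisons.
func buildCollate(input, locale string) collateExpr {
	return collateExpr{Input: input, Locale: normalizeLocale(locale)}
}

func main() {
	// The virtual column expression from the table definition.
	virtualCol := buildCollate("s", "en_US")

	// The three spellings used in the example queries.
	for _, l := range []string{"en_US", "en-US", "en-us"} {
		e := buildCollate("s", l)
		fmt.Printf("%s -> %s, matches virtual column: %t\n",
			l, e.Locale, e == virtualCol)
	}
}
```

Because normalization happens in the constructor, no later pass needs to ask whether a locale is already normalized, which is the efficiency argument made above for doing this in optbuilder rather than in a normalization rule.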
Showing 6 changed files with 140 additions and 4 deletions.