Reduce stats allocations in optimizer #80186
Labels
C-performance
Perf of queries or internals. Solution not expected to change functional behavior.
T-sql-queries
SQL Queries Team
Comments
rharding6373 added the C-performance label Apr 19, 2022
mgartner added a commit to mgartner/cockroach that referenced this issue Aug 19, 2022
Prior to this commit, a column's average size in bytes was included in column statistics. To fetch this average size, the coster requested an individual column statistic for each scanned column. For scans and joins involving many columns, this caused many allocations of column statistics and column sets.

Because we only use a column's average size when costing scans and lookup joins, there was no need to include it in column statistics. Average size doesn't propagate up an expression tree like other statistics do.

This commit removes average size from column statistics and instead builds a map in `props.Statistics` that maps column IDs to average size. This significantly reduces allocations in some cases.

The only downside to this change is that we no longer set a column's average size to zero if it has all NULL values, according to statistics. I believe this is a pretty rare edge case that is unlikely to significantly affect query plans, so I think the trade-off is worth it.

Fixes cockroachdb#80186

Release justification: This is a minor change that improves optimizer performance.

Release note: None
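To illustrate the approach described in the commit message, here is a minimal Go sketch of looking up average column sizes from a map keyed by column ID instead of building a column statistic per scanned column. It is not the actual CockroachDB code: the `ColumnID` and `Statistics` types, the `AvgColSizes` field, the `AvgColSize` and `ScanRowSize` helpers, and the `defaultAvgSize` fallback of 4 bytes are all hypothetical names and values chosen for illustration.

```go
package main

import "fmt"

// ColumnID is a hypothetical stand-in for the optimizer's column identifier.
type ColumnID int

// defaultAvgSize is an assumed fallback (in bytes) for columns that have no
// recorded average size. The real default may differ.
const defaultAvgSize uint64 = 4

// Statistics is a simplified, hypothetical sketch of relational statistics.
type Statistics struct {
	RowCount float64
	// AvgColSizes maps a column ID to its average size in bytes. It is
	// built once, so costing does not need a per-column statistic.
	AvgColSizes map[ColumnID]uint64
}

// AvgColSize returns the average size of col with a plain map lookup.
// Reading from a nil map is safe in Go and simply reports "not found".
func (s *Statistics) AvgColSize(col ColumnID) uint64 {
	if size, ok := s.AvgColSizes[col]; ok {
		return size
	}
	return defaultAvgSize
}

// ScanRowSize sums the average sizes of the scanned columns, which is the
// kind of quantity a coster might use when costing a scan.
func ScanRowSize(s *Statistics, cols []ColumnID) uint64 {
	var total uint64
	for _, c := range cols {
		total += s.AvgColSize(c)
	}
	return total
}

func main() {
	stats := &Statistics{
		RowCount:    1000,
		AvgColSizes: map[ColumnID]uint64{1: 8, 2: 32},
	}
	// Column 3 has no recorded size, so the assumed default is used.
	fmt.Println(ScanRowSize(stats, []ColumnID{1, 2, 3})) // prints 44
}
```

In this sketch the lookup itself allocates nothing per column; the map is the only structure built up front. Note that a column whose values are all NULL is not special-cased to zero, which mirrors the trade-off the commit message mentions.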
craig bot pushed a commit that referenced this issue Aug 22, 2022
86460: opt: reduce statistics allocations for avg size r=mgartner a=mgartner

Prior to this commit, a column's average size in bytes was included in column statistics. To fetch this average size, the coster requested an individual column statistic for each scanned column. For scans and joins involving many columns, this caused many allocations of column statistics and column sets.

Because we only use a column's average size when costing scans and lookup joins, there was no need to include it in column statistics. Average size doesn't propagate up an expression tree like other statistics do.

This commit removes average size from column statistics and instead builds a map in `props.Statistics` that maps column IDs to average size. This significantly reduces allocations in some cases.

The only downside to this change is that we no longer set a column's average size to zero if it has all NULL values, according to statistics. I believe this is a pretty rare edge case that is unlikely to significantly affect query plans, so I think the trade-off is worth it.

Fixes #80186

Release justification: This is a minor change that improves optimizer performance.

Release note: None

86528: storage: add default-off setting for MVCC range tombstones r=msbutler,nicktrav a=erikgrinaker

This patch adds the default-off cluster setting `storage.mvcc.range_tombstones.enabled` to control whether or not to write MVCC range tombstones. The setting is internal and system-only. The read path is always active; this setting only determines whether KV clients should write them. A helper function `CanUseMVCCRangeTombstones()` has also been added. Callers have not yet been updated to respect this.

Note that any in-flight jobs may not pick up this change, so these need to be waited out before being certain that the setting has taken effect.

If disabled after being enabled, this will prevent new range tombstones from being written, but already written tombstones will remain until GCed. The note on jobs above also applies in this case.

Release justification: bug fixes and low-risk updates to new functionality

Release note: None

86572: ui: update styles on sessions details page r=maryliag a=maryliag

The Session Details page was updated to use the same style of summary cards as the other details pages (e.g. statement, transaction, job).

Fixes #85257

Before: https://user-images.githubusercontent.com/1017486/185971455-3dc7b57f-07bc-45df-94e3-f0bd7b3e541a.png

After: https://user-images.githubusercontent.com/1017486/185971475-1bd563f6-a596-4321-9f90-d6c68470dbb9.png

Release justification: low risk change

Release note (ui change): New styles of summary cards on Session Details page to align with other details pages.

86579: sql/builtins: update `Info` for `pg_get_viewdef` r=ZhouXing19 a=ZhouXing19

The pg_builtin func `pg_get_viewdef` was updated from a no-op to an actual function a long time ago, but its info field is still `notUsableInfo`, which caused the docs to omit it by mistake. This PR updates the info so that the function is recorded in `functions.md`.

Release justification: bug fix, update a builtin's visibility in docs.

Release note: none

86586: README: make sure roachprod/roachtest docs use dev, not make r=rail a=rickystewart

Release justification: Non-production code changes

Release note: None

Co-authored-by: Marcus Gartner
Co-authored-by: Erik Grinaker
Co-authored-by: Marylia Gutierrez
Co-authored-by: Jane Xing
Co-authored-by: Ricky Stewart
Since the optimizer uses the table stat AvgSize to cost scans and index joins, it stores all the table stats in cases where it previously did not. We could reduce the allocations by either discarding and not storing these stats, or storing fewer stats when AvgSize is needed.
See #78592 (comment)
Jira issue: CRDB-15817