-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
costfuzz: expected required columns to be a subset of output columns in projectBuildProvided #88037
Comments
There's a panic near the top of the
I actually added this panic check in #86193. |
WRT the timeout, the most likely candidate seems to be that we need (a) cancel-checking and (b) better output batch sizing in the vectorized window function operators. The query that was executing during the timeout was probably this one:
|
This comment was marked as duplicate.
This comment was marked as duplicate.
Most recent failure is a dup of #86790 |
Nevermind, missed the other error above. |
@DrewKimball one that that stands out to me in that stack trace is this line:
Maybe the panic is only reproducible when making index recommendations due to a bug or incorrect assumption in the implementation of hypothetical indexes and tables? |
Maybe... I was trying to figure out how this could possibly happen to get a repro, and it doesn't really make sense to me - how could we require an ordering that references non-output columns? So something like that might explain how this could happen. |
We need to make sure that panics are correctly caught in the index recommendation engine. It appears that they are not because this panic crashed the node. Once we fix that, the panic should cause a "regular" test failure and this should be easier to diagnose. I've created a separate issue to track that, #88246. I think we need to urgently fix it, but I'm hesistant to label it a release blocker because it's present on previous versions. That being said, v22.2 introduces index recommendations in the statements page of the UI, making this crash more likely. I'm going to remove the release-blocker labels on this issue until we get a reproduction that proves its severity. |
@mgartner are you sure the node crashed? I can't find any reason why a panic wouldn't be caught here, since it did originate from the |
I think probably we're not getting any test output for these panics because the costfuzzer skips queries that fail on the first run: cockroach/pkg/cmd/roachtest/tests/costfuzz.go Lines 57 to 63 in bbcb10f
Oh, or maybe running the query doesn't fail, but running EXPLAIN on it does. Costfuzz ignores failures when getting the query plan:cockroach/pkg/cmd/roachtest/tests/query_comparison_util.go Lines 295 to 303 in bbcb10f
We could catch errors like this that only show when running EXPLAIN by having sqlsmith run every test query with EXPLAIN .
|
This commit modifies the SqlSmith roachtest to run each generated query with explain. This should allow the test to catch planning errors that do not happen during normal execution. Informs cockroachdb#88037 Release note: None
This commit modifies the SqlSmith roachtest to run each generated query with explain. This should allow the test to catch planning errors that do not happen during normal execution. Informs cockroachdb#88037 Release note: None
88136: backupccl: add regions column to SHOW BACKUP r=msbutler a=baoalvin1 Fixes #79681 Add regions column to display multi-region information for backup databases. Release note (sql change): Previously, the user did not have an easy way to see if a backed up database is multi-region. This change adds the `regions` column to the `SHOW BACKUP` command which will output a string of `ALTER DATABASE` commands if the database is a multi-region database and `NULL` for everything else. 88333: opt: run sqlsmith roachtest queries with EXPLAIN r=DrewKimball a=DrewKimball This commit modifies the SqlSmith roachtest to run each generated query with explain. This should allow the test to catch planning errors that do not happen during normal execution. Informs #88037 Release note: None 88343: sql: turn off plan sampling by default r=maryliag a=maryliag Previously, we were sampling plans for fingerprints and saving to statement_statistics tables. Now that we are saving plan hash and plan gist (that allow us to decode back to the plan) we are no longer using the sampled plan anywhere. Since this is a heavy opperation, we are turning it off by default, but changing the default value of `sql.metrics.statement_details.plan_collection.enabled` to `false`. If we don't receive feedback about turning it back on, we can remove this sampling entirely. Partially addresses #77944 Release note (sql change): Change the default value of `sql.metrics.statement_details.plan_collection.enabled` to false, since we no longer use this information anywhere. 88391: cfetcher: correctly update the limit hint r=yuzefovich a=yuzefovich In 41fa8b6 (which was supposed to be a "noop" refactor) we introduced a bug which made it so that we no longer update the remaining limit hint correctly. As a result, the cFetcher might no longer respect the limit hint. What makes things worse is the fact that the KV layer still does everything correctly, so when the cFetcher asks for more rows that exceed the limit, the KV layer does a BatchRequest with 10x of the original limit. This is now fixed by correctly updating the limit hint right before emitting the batch. Addresses: #88382. Release note (bug fix): CockroachDB no longer fetches unnecessary rows for queries with LIMITs. The bug was introduced in 22.1.7. 88399: opt: mark SimplifyRange as essential r=DrewKimball a=DrewKimball This commit marks the `SimplifyRange` normalization rule as essential during random rule-disabling tests. This is necessary because `RangeExpr` is expected to maintain the invariant that its input is always an `AndExpr`. Other rules can replace the `AndExpr` with a different expression over the course of normalization, at which point the `RangeExpr` needs to be removed. Otherwise, various code-paths that depend on the invariant may panic (for example, predicate implication). Fixes #88352 Release note: None Co-authored-by: Alvin Bao <[email protected]> Co-authored-by: DrewKimball <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
Previously, `EXPLAIN` operators could require an ordering on non-output columns from their enclosed expression, which causes a panic. This could happen after column-pruning rules fired. This commit modifies the order-building logic to project the ordering required of the child of an `EXPLAIN` operator to only reference output columns. Fixes cockroachdb#88037 Release note (bug fix): Fixed a longstanding bug that could cause a panic when running a query with `EXPLAIN` that attempts to order on a non-output column.
88441: opt: project required ordering for EXPLAIN to input columns r=mgartner,yuzefovich a=DrewKimball Previously, `EXPLAIN` operators could require an ordering on non-output columns from their enclosed expression, which causes a panic. This could happen after column-pruning rules fired. This commit modifies the order-building logic to project the ordering required of the child of an `EXPLAIN` operator to only reference output columns. Fixes #88037 Release note (bug fix): Fixed a longstanding bug that could cause a panic when running a query with `EXPLAIN` that attempts to order on a non-output column. Co-authored-by: DrewKimball <[email protected]>
Previously, `EXPLAIN` operators could require an ordering on non-output columns from their enclosed expression, which causes a panic. This could happen after column-pruning rules fired. This commit modifies the order-building logic to project the ordering required of the child of an `EXPLAIN` operator to only reference output columns. Fixes #88037 Release note (bug fix): Fixed a longstanding bug that could cause a panic when running a query with `EXPLAIN` that attempts to order on a non-output column.
Previously, `EXPLAIN` operators could require an ordering on non-output columns from their enclosed expression, which causes a panic. This could happen after column-pruning rules fired. This commit modifies the order-building logic to project the ordering required of the child of an `EXPLAIN` operator to only reference output columns. Fixes #88037 Release note (bug fix): Fixed a longstanding bug that could cause a panic when running a query with `EXPLAIN` that attempts to order on a non-output column.
roachtest.costfuzz failed with artifacts on master @ 9a05046ce19e7678340e82c70d61e928be95bc72:
Parameters:
ROACHTEST_cloud=gce
,ROACHTEST_cpu=4
,ROACHTEST_ssd=0
Help
See: roachtest README
See: How To Investigate (internal)
Same failure on other branches
This test on roachdash | Improve this report!
Jira issue: CRDB-19654
The text was updated successfully, but these errors were encountered: