Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: flake with distsql_automatic_stats #99751

Closed
yuzefovich opened this issue Mar 28, 2023 · 1 comment · Fixed by #100246
Closed

sql: flake with distsql_automatic_stats #99751

yuzefovich opened this issue Mar 28, 2023 · 1 comment · Fixed by #100246
Assignees
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker skipped-test T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Mar 28, 2023

Seen on this CI run on staging.

Failed
=== RUN   TestTenantLogic_distsql_automatic_stats
    test_log_scope.go:161: test logs captured to: /artifacts/tmp/_tmp/8808ff176222b1d4cd31f20d03de82b0/logTestTenantLogic_distsql_automatic_stats1986543448
    test_log_scope.go:79: use -show-logs to present logs inline
[01:21:48] setting distsql_workmem='27352B';
[01:21:48] setting distsql_workmem='27352B';
    logic.go:2829: 
         
        /home/roach/.cache/bazel/_bazel_roach/c5a4e7d36696d9cd970af2045211a7df/sandbox/processwrapper-sandbox/4242/execroot/com_github_cockroachdb_cockroach/bazel-out/k8-fastbuild/bin/pkg/ccl/logictestccl/tests/3node-tenant/3node-tenant_test_/3node-tenant_test.runfiles/com_github_cockroachdb_cockroach/pkg/sql/logictest/testdata/logic_test/distsql_automatic_stats:25: SELECT DISTINCT ON (column_names) statistics_name, column_names, row_count, distinct_count, null_count
        FROM [SHOW STATISTICS FOR TABLE data] ORDER BY column_names, created DESC
        expected:
            statistics_name  column_names  row_count  distinct_count  null_count
            __auto__         {a}           1000       10              0
            __auto__         {a,b}         1000       100             0
            __auto__         {a,b,c}       1000       1000            0
            __auto__         {b}           1000       10              0
            __auto__         {c}           1000       10              0
            __auto__         {d}           1000       1               1000
            
        but found (query options: "colnames,retry") :
            statistics_name  column_names  row_count  distinct_count  null_count

Jira issue: CRDB-26052

@yuzefovich yuzefovich added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 28, 2023
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Mar 28, 2023
@michae2 michae2 self-assigned this Mar 28, 2023
@yuzefovich
Copy link
Member Author

yuzefovich commented Mar 28, 2023

I just hit it quickly when trying to repro #99753. I used ./dev testlogic base --files=distsql_automatic_stats --config=fakedist-disk -v --stream-output --stress --cpus=48 -- --test_arg -test.timeout=10m &> stress.out on the gceworker with 24 cores.

In this case the machine is overloaded, so retry directive doesn't succeed within 45s (the auto stats aren't kicked off, or the stats cache isn't populated). Hm, we do override stats.DefaultRefreshInterval to 1ms in logic tests, so I'm not sure how else we can nudge the refresher. Perhaps the answer is that we should use longer duration for "succeeds soon" retry in the logic tests.

craig bot pushed a commit that referenced this issue Mar 29, 2023
99993: logictest: skip flaky distsql_automatic_stats for now r=knz a=yuzefovich

**execbuilder: remove redundant auto stats table overrides**

This commit removes table settings overrides of the auto stats
collection in several places. This is unnecessary given that we
explicitly disable the auto stats collection for the whole cluster in
the logic tests.

Release note: None

**logictest: skip flaky distsql_automatic_stats for now**

We've seen this file become more flaky recently, so let's skip it for
now.

Informs: #99751.

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
@rafiss rafiss added skipped-test branch-master Failures and bugs on the master branch. GA-blocker labels Mar 30, 2023
michae2 added a commit to michae2/cockroach that referenced this issue Mar 31, 2023
Now that automatic stats are disabled for system tables in logic tests,
`distsql_automatic_stats` was opening the floodgates with its first `SET CLUSTER
SETTING sql.stats.automatic_collection.enabled = true`. With this enabled, the
automatic stats refresher was executing `CREATE STATISTICS` for every system
table, one at a time, often taking longer than the 45s logic test retry limit to
get to the user-defined table we really care about.

Instead of using the cluster setting, change `distsql_automatic_stats` to use
the per-table setting.

Fixes: cockroachdb#99751

Epic: None

Release note: None
craig bot pushed a commit that referenced this issue Mar 31, 2023
100246: sql: deflake distsql_automatic_stats logic test r=rytaft,yuzefovich a=michae2

**sql: fix formatting of CREATE TABLE AS with storage parameters**

We were placing the storage parameters in the wrong place when formatting `CREATE TABLE AS` statements.

Fixes: #100243

Epic: None

Release note: None

**sql: deflake distsql_automatic_stats logic test**

Now that automatic stats are disabled for system tables in logic tests, `distsql_automatic_stats` was opening the floodgates with its first `SET CLUSTER SETTING sql.stats.automatic_collection.enabled = true`. With this enabled, the automatic stats refresher was executing `CREATE STATISTICS` for every system table, one at a time, often taking longer than the 45s logic test retry limit to get to the user-defined table we really care about.

Instead of using the cluster setting, change `distsql_automatic_stats` to use the per-table setting.

Fixes: #99751

Epic: None

Release note: None

100311: opt: prohibit hash-sharded index syntactic sugar in test catalog r=mgartner a=mgartner

The test catalog now panics when trying to build a hash-sharded index
instead of silently ignoring the `USING HASH` clause. This prevents
writing tests that incorrectly assume that `USING HASH` works as
expected in the test catalog. One test with `USING HASH` has been
updated.

Fixes #99129

Release note: None


Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
@craig craig bot closed this as completed in 5bfb0d9 Mar 31, 2023
michae2 added a commit to michae2/cockroach that referenced this issue Apr 3, 2023
Now that automatic stats are disabled for system tables in logic tests,
`distsql_automatic_stats` was opening the floodgates with its first `SET CLUSTER
SETTING sql.stats.automatic_collection.enabled = true`. With this enabled, the
automatic stats refresher was executing `CREATE STATISTICS` for every system
table, one at a time, often taking longer than the 45s logic test retry limit to
get to the user-defined table we really care about.

Instead of using the cluster setting, change `distsql_automatic_stats` to use
the per-table setting.

Fixes: cockroachdb#99751

Epic: None

Release note: None
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker skipped-test T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants