stats: index out of range crash when forecasting #109060

Closed
yuzefovich opened this issue Aug 19, 2023 · 0 comments · Fixed by #109461
Assignees
Labels
A-sql-table-stats Table statistics (and their automatic refresh). C-test-failure Broken test (automatically or manually discovered). O-rsg Random Syntax Generator T-sql-queries SQL Queries Team

Comments

@yuzefovich
Member

yuzefovich commented Aug 19, 2023

(I was mistakenly running a binary from late June, built at roughly fa47de0.)

I hit this crash on one node when running unoptimized-query-oracle/disable-rules=half/seed-multi-region with #108972:

panic: runtime error: index out of range [0] with length 0

goroutine 91801 [running]:
panic({0x5dcdee0, 0xc00ec84d98})
        GOROOT/src/runtime/panic.go:987 +0x3ba fp=0xc00ab1f280 sp=0xc00ab1f1c0 pc=0x49beda
runtime.goPanicIndex(0x0, 0x0)
        GOROOT/src/runtime/panic.go:113 +0x7f fp=0xc00ab1f2c0 sp=0xc00ab1f280 pc=0x499cff
github.com/cockroachdb/cockroach/pkg/sql/stats.quantile.fixMalformed({0xc007cfb000?, 0xb0, 0x160})
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/quantile.go:806 +0xd3b fp=0xc00ab1f468 sp=0xc00ab1f2c0 pc=0x22d589b
github.com/cockroachdb/cockroach/pkg/sql/stats.predictHistogram({0x7812f40, 0xc0052b48a0}, {0xc00315fa60, 0x3, 0x3?}, 0xc00ec84d20?, 0x3fee666666666666, 0x4053c00000000000)
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/forecast.go:413 +0x67c fp=0xc00ab1f608 sp=0xc00ab1f468 pc=0x22cbf7c
github.com/cockroachdb/cockroach/pkg/sql/stats.forecastColumnStatistics({0x7812f40, 0xc0052b48a0}, 0xc008780970?, {0xc00315fa60, 0x3, 0x4}, {0xc00bc29c00?, 0xc00bc29cc8?, 0x0?}, 0x3fee666666666666)
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/forecast.go:275 +0xca5 fp=0xc00ab1f968 sp=0xc00ab1f608 pc=0x22ca8a5
github.com/cockroachdb/cockroach/pkg/sql/stats.ForecastTableStatistics({0x7812f40, 0xc0052b48a0}, 0xc00c123280?, {0xc00a750780?, 0x9, 0xa})
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/forecast.go:115 +0x52d fp=0xc00ab1fc60 sp=0xc00ab1f968 pc=0x22c99cd
github.com/cockroachdb/cockroach/pkg/sql/stats.(*TableStatisticsCache).getTableStatsFromDB(0xc002127080, {0x7812f40, 0xc0052b48a0}, 0x7812f40?, 0x1)
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/stats_cache.go:826 +0x60a fp=0xc00ab1fe20 sp=0xc00ab1fc60 pc=0x22dcc0a
github.com/cockroachdb/cockroach/pkg/sql/stats.(*TableStatisticsCache).refreshCacheEntry.func1(0xc002127080, {0x7812f40, 0xc0052b48a0}, 0x4b6dd00?, 0xc0?, 0xc00ab1ff50, 0xc00ab1ff30)
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/stats_cache.go:451 +0x14e fp=0xc00ab1fec0 sp=0xc00ab1fe20 pc=0x22da56e
github.com/cockroachdb/cockroach/pkg/sql/stats.(*TableStatisticsCache).refreshCacheEntry(0xc002127080, {0x7812f40, 0xc0052b48a0}, 0x1b44500?, {0xc009583000?, 0xfc7c50?, 0xc0?})
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/stats_cache.go:453 +0x219 fp=0xc00ab1ff80 sp=0xc00ab1fec0 pc=0x22da239
github.com/cockroachdb/cockroach/pkg/sql/stats.(*TableStatisticsCache).refreshTableStats.func1()
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/stats_cache.go:480 +0x74 fp=0xc00ab1ffe0 sp=0xc00ab1ff80 pc=0x22daa14
runtime.goexit()
        GOROOT/src/runtime/asm_amd64.s:1594 +0x1 fp=0xc00ab1ffe8 sp=0xc00ab1ffe0 pc=0x4d1061
created by github.com/cockroachdb/cockroach/pkg/sql/stats.(*TableStatisticsCache).refreshTableStats
        github.com/cockroachdb/cockroach/pkg/sql/stats/pkg/sql/stats/stats_cache.go:478 +0x1a5

I'm attaching a somewhat reduced version of the artifacts (so that it fits under GitHub's 25 MiB limit). I didn't try to reproduce this.

artifacts.zip

Jira issue: CRDB-30757

@yuzefovich yuzefovich added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 19, 2023
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Aug 19, 2023
@yuzefovich yuzefovich added C-test-failure Broken test (automatically or manually discovered). and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Aug 19, 2023
@rharding6373 rharding6373 self-assigned this Aug 22, 2023
@rharding6373 rharding6373 moved this from Triage to Active in SQL Queries Aug 22, 2023
@michae2 michae2 added the A-sql-table-stats Table statistics (and their automatic refresh). label Aug 23, 2023
@lunevalex lunevalex added the T-sql-queries SQL Queries Team label Aug 23, 2023
@rharding6373 rharding6373 removed their assignment Aug 23, 2023
@rharding6373 rharding6373 self-assigned this Aug 24, 2023
rharding6373 added a commit to rharding6373/cockroach that referenced this issue Aug 24, 2023
This PR captures quantile information for future reproduction when we
encounter a malformed quantile that we're unable to fix, instead of
inducing an inactionable panic.

Epic: None
Fixes: cockroachdb#109060

Release note: None
craig bot pushed a commit that referenced this issue Aug 25, 2023
108099: asim: enforce len(weighted_rand) == number of stores r=kvoli a=wenyihu6

**asim: enable option to change static option settings**

This patch allows users to modify the settings for the static mode within the
randomized testing framework. The following command is now supported:
```
4. "change_static_option" [nodes=<int>] [stores_per_node=<int>]
[rw_ratio=<float64>] [rate=<float64>] [min_block=<int>] [max_block=<int>]
[min_key=<int64>] [max_key=<int64>] [skewed_access=<bool>] [ranges=<int>]
[placement_type=<gen.PlacementType>] [key_space=<int>]
[replication_factor=<int>] [bytes=<int64>] [stat=<string>] [height=<int>]
[width=<int>]
e.g. change_static_option nodes=2 stores_per_node=3 placement_type=skewed
	- change_static_option: modifies the settings for the static mode where no
	  randomization is involved. Note that this does not change the default
	  settings for any randomized generation.
	- nodes (default value is 3): number of nodes in the generated cluster
	- storesPerNode (default value is 1): number of stores per node in the
	  generated cluster
	- rwRatio (default value is 0.0): read-write ratio of the generated load
	- rate (default value is 0.0): rate at which the load is generated
	- minBlock (default value is 1): min size of each load event
	- maxBlock (default value is 1): max size of each load event
	- minKey (default value is int64(1)): min key of the generated load
	- maxKey (default value is int64(200000)): max key of the generated load
	- skewedAccess (default value is false): if true, the workload key
	  generator is skewed (zipf)
	- ranges (default value is 1): number of generated ranges
	- keySpace (default value is 200000): keyspace for the generated range
	- placementType (default value is gen.Even): type of distribution for how
	  ranges are distributed across stores
	- replicationFactor (default value is 3): number of replicas for each range
	- bytes (default value is int64(0)): size of each range in bytes
	- stat (default value is "replicas"): specifies the output to be plotted
	  for the verbose option
	- height (default value is 15): height of the plot
	- width (default value is 80): width of the plot

In addition, verbose=(static_settings) can now be used to display settings used
for static options where no randomization is involved.
```

Part of: #106311
Release note: None

----

**asim: enforce len(weighted_rand) == number of stores**

Previously, we enforced that the length of a given `weighted_rand` cannot exceed
the number of stores. This was challenging for users as they might not know the
cluster configuration that would be generated and thus do not know the number of
stores. In addition, if the length of `weighted_rand` was less than total number
of stores, any stores outside of the `weighted_rand` range would simply have
zero replicas. This could lead to confusion.

To improve user control, this patch disables the use of weighted_rand with
randomized cluster generation. Requirements to use weighted_rand:
1. use static option for cluster generation
2. specify nodes(default:3) and stores_per_node(default:1) through
change_static_option
3. ensure len(weighted_rand) == number of stores == nodes * stores_per_node

In addition to these new rules, the following existing requirements remain in
place:
1. weighted_rand should only be used with placement_type=weighted_rand and vice
versa.
2. must specify a weight in [0.0, 1.0] for each element in the array, with
each element corresponding to a store
3. sum of weights in the array should be equal to 1
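
The numeric requirements above can be sketched as a single validation function. This is only an illustration under stated assumptions, not the asim implementation: `validateWeightedRand` is a hypothetical name, and the floating-point tolerance of `1e-9` for the sum check is an assumption that the commit message does not specify.

```go
package main

import (
	"fmt"
	"math"
)

// validateWeightedRand is a hypothetical helper that checks the numeric
// rules listed above: one weight per store, each weight in [0.0, 1.0],
// and weights summing to 1.
func validateWeightedRand(weights []float64, nodes, storesPerNode int) error {
	stores := nodes * storesPerNode
	if len(weights) != stores {
		return fmt.Errorf("len(weighted_rand) = %d, want %d (nodes * stores_per_node)",
			len(weights), stores)
	}
	sum := 0.0
	for i, w := range weights {
		// Written as a negated range check so that NaN is rejected too.
		if !(w >= 0 && w <= 1) {
			return fmt.Errorf("weight %d is %v, want a value in [0.0, 1.0]", i, w)
		}
		sum += w
	}
	// The tolerance is an assumption made for this sketch.
	const tolerance = 1e-9
	if !(math.Abs(sum-1) <= tolerance) {
		return fmt.Errorf("weights sum to %v, want 1", sum)
	}
	return nil
}

func main() {
	// Default static settings from the text: nodes=3, stores_per_node=1.
	fmt.Println(validateWeightedRand([]float64{0.5, 0.3, 0.2}, 3, 1))
	// Too few weights for three stores.
	fmt.Println(validateWeightedRand([]float64{0.5, 0.5}, 3, 1))
}
```

Requiring an exact length match, rather than an upper bound, means no store silently ends up with zero replicas because it fell outside the array.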

Resolves: #106311
Release note: None

109461: sql: add error and reporting when unable to fix malformed quantile r=rharding6373 a=rharding6373

This PR captures quantile information for future reproduction when we encounter a malformed quantile that we're unable to fix, instead of inducing an inactionable panic.

Epic: None
Fixes: #109060

Release note: None

Co-authored-by: wenyihu6 <[email protected]>
Co-authored-by: rharding6373 <[email protected]>
@craig craig bot closed this as completed in 120132e Aug 25, 2023
@github-project-automation github-project-automation bot moved this from Active to Done in SQL Queries Aug 25, 2023
blathers-crl bot pushed a commit that referenced this issue Aug 25, 2023
This PR captures quantile information for future reproduction when we
encounter a malformed quantile that we're unable to fix, instead of
inducing an inactionable panic.

Epic: None
Fixes: #109060

Release note: None
@michae2 michae2 added the O-rsg Random Syntax Generator label Sep 6, 2023