-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sentry: stopper.go:182: runtime.errorString: runtime error: invalid memory address or nil pointer dereference #35557
Comments
cockroach-teamcity
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
O-sentry
Originated from an in-the-wild panic report.
labels
Mar 8, 2019
nvanbenschoten
added a commit
to nvanbenschoten/arenaskl
that referenced
this issue
Mar 10, 2019
This change protects against large allocations causing Arena's internal size accounting to overflow and produce incorrect results. This overflow could allow for invalid offsets being returned from `Arena.Alloc`, leading to slice bounds and index out of range panics when passed to `Arena.GetBytes` or `Arena.GetPointer`. Specifically, if `Arena.n` overflowed in `Arena.Alloc`, which was possible because it was a uint32 and we allow up to MaxUint32 size allocations, then `Arena.Alloc` could bypass the length check against `Arena.buf`. The "padded" size would then be subtracted from the offset, allowing the offset to underflow back around to an offset that was out of `Arena.buf`'s bounds. I believe this is the underlying cause of the following two crashes: - cockroachdb/cockroach#31624 - cockroachdb/cockroach#35557 The reason for this is subtle and has to do with failed allocations slowly building up and eventually overflowing `Arena.n`. I'll explain on the PR that vendors this fix. We now protect against this overflow in two ways. We perform an initial size check before performing the atomic addition in `Arena.Alloc` to prevent failed allocations from "building up" to an overflow. We also use 64-bit arithmetic so that it would take 2^32 concurrent allocations with the maximum allocation size (math.MaxUint32) to overflow the accounting. An alternative would be to use a CAS loop to strictly serialize all additions to `Arena.n`, but that would limit concurrency.
I have a suspected fix for this: andy-kimball/arenaskl#4. |
nvanbenschoten
added a commit
to nvanbenschoten/cockroach
that referenced
this issue
Mar 12, 2019
Fixes cockroachdb#31624. Fixes cockroachdb#35557. This commit picks up andy-kimball/arenaskl#4. I strongly suspect that the uint32 overflow fixed in that PR was the cause of the two index out of bounds panics. See that commit for more details. Release note: None
craig bot
pushed a commit
that referenced
this issue
Mar 12, 2019
35321: opt: propagate set operation output types to input columns r=rytaft a=rytaft This commit updates the `optbuilder` logic for set operations in which the types of the input columns do not match the types of the output columns. This can happen if a column on one side has type Unknown, but the corresponding column on the other side has a known type such as Int. The known type must be propagated to the side with the unknown type to prevent errors in the execution engine related to decoding types. If there are any column types on either side that don't match the output, then the `optbuilder` propagates the output types of the set operation down to the input columns by wrapping the side with mismatched types in a Project operation. The Project operation passes through columns that already have the correct type, and creates cast expressions for those that don't. Fixes #34524 Release note (bug fix): Fixed an error that happened when executing some set operations containing only nulls in one of the input columns. 35587: opt: add more cost for lookup joins with more ON conditions r=RaduBerinde a=RaduBerinde This is a very limited fix for #34810. The core problem is that we don't take into account that if we have an ON condition, not only there's a cost to evaluate it on each row, but we are generating more internal rows to get a given number of output rows. I attempted to do a more general fix (for all join types), where I tried to estimate the "internal" number of rows using `unknownFilterSelectivity` for each ON condition. There were two problems: - in some cases (especially with lookup joins) we have an extra ON condition that doesn't actually do anything: `ab JOIN xy ON a=x AND a=10` becomes `ab JOIN xy ON a=x AND a=10 AND x=10` becomes and `a=10` could remain as an ON condition. This results in bad query plans in important cases (e.g. TPCC) where it prefers to do an extra lookup join (due to a non-covering index) just because of that condition. - we don't have the equality columns readily available for hash join (and didn't want to extract them each time we cost). In the future we may split the planning into a logical and physical stage, and we should then separate the logical joins from hash join. For 19.1, we simply simply add a cost for lookup joins that is proportional to the number of remaining ON conditions. This is the least disruptive method that still fixes the case observed in #34810. I will leave the issue open to address this properly in the next release. Note that although hash joins and merge joins have the same issue in principle, in practice we always generate these expressions with equality on all possible columns. Release note: None 35630: storage/tscache: Pick up andy-kimball/arenaskl fix r=nvanbenschoten a=nvanbenschoten Fixes #31624. Fixes #35557. This commit picks up andy-kimball/arenaskl#4. I strongly suspect that the uint32 overflow fixed in that PR was the cause of the two index out of bounds panics. See that commit for more details. The PR also fixes a bug in memory recylcling within the tscache. I confirmed on adriatic that over 900 64MB arenas had been allocated since it was last wiped. 35644: opt: use correct ordering for insert input in execbuilder r=RaduBerinde a=RaduBerinde We were setting up a projection on the Insert's input but we were accidentally using the parent Insert's ordering instead of that of the input. Fixes #35564. Release note (bug fix): Fixed a "column not in input" crash when `INSERT ... RETURNING` is used inside a clause that requires an ordering. 35651: jobs, sql, ui: Create `AutoCreateStats` job type r=celiala a=celiala With #34279, enabling the cluster setting `sql.stats.experimental_automatic_collection.enabled` has the potential to create many CreateStats jobs, which can cause the Jobs view on the AdminUI to become cluttered. This commit creates a new `AutoCreateStats` job type for these auto-created CreateStats jobs, so that users are able to still see their own manual runs of CREATE STATISTICS, via the pre-existing `CreateStats` type. cc @danhhz, @piyush-singh, @rolandcrosby ![jobs-auto-create-stats](https://user-images.githubusercontent.com/3051672/54212467-5cea2c80-44b9-11e9-9c11-db749814f019.gif) Release note (admin ui change): AutoCreateStats type added to Jobs page to filter automatic statistics jobs. Fixes #34377. Co-authored-by: Rebecca Taft <[email protected]> Co-authored-by: Radu Berinde <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]> Co-authored-by: Celia La <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This issue was autofiled by Sentry. It represents a crash or reported error on a live cluster with telemetry enabled.
Sentry link: https://sentry.io/organizations/cockroach-labs/issues/923688019/?project=164528&referrer=webhooks_plugin
Panic message:
Stacktrace (expand for inline code snippets):
/usr/local/go/src/runtime/asm_amd64.s#L572-L574 in runtime.call32
/usr/local/go/src/runtime/panic.go#L501-L503 in runtime.gopanic
cockroach/pkg/storage/store.go
Lines 3046 to 3048 in 0c87b11
/usr/local/go/src/runtime/asm_amd64.s#L573-L575 in runtime.call64
/usr/local/go/src/runtime/panic.go#L501-L503 in runtime.gopanic
/usr/local/go/src/runtime/panic.go#L62-L64 in runtime.panicmem
/usr/local/go/src/runtime/signal_unix.go#L387-L389 in runtime.sigpanic
cockroach/pkg/storage/store.go
Lines 3116 to 3118 in 0c87b11
/usr/local/go/src/runtime/asm_amd64.s#L572-L574 in runtime.call32
/usr/local/go/src/runtime/panic.go#L501-L503 in runtime.gopanic
/usr/local/go/src/runtime/panic.go#L34-L36 in runtime.panicslice
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/vendor/github.com/andy-kimball/arenaskl/arena.go#L83-L85 in vendor/github.com/andy-kimball/arenaskl.(*Skiplist).allocVal
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/vendor/github.com/andy-kimball/arenaskl/skl.go#L198-L200 in vendor/github.com/andy-kimball/arenaskl.(*Skiplist).allocVal
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/vendor/github.com/andy-kimball/arenaskl/iterator.go#L237-L239 in vendor/github.com/andy-kimball/arenaskl.(*Iterator).Set
cockroach/pkg/storage/tscache/interval_skl.go
Lines 869 to 871 in 0c87b11
cockroach/pkg/storage/tscache/interval_skl.go
Lines 1021 to 1023 in 0c87b11
cockroach/pkg/storage/tscache/interval_skl.go
Lines 981 to 983 in 0c87b11
cockroach/pkg/storage/tscache/interval_skl.go
Lines 931 to 933 in 0c87b11
cockroach/pkg/storage/tscache/interval_skl.go
Lines 547 to 549 in 0c87b11
cockroach/pkg/storage/tscache/interval_skl.go
Lines 487 to 489 in 0c87b11
cockroach/pkg/storage/tscache/interval_skl.go
Lines 445 to 447 in 0c87b11
cockroach/pkg/storage/tscache/skl_impl.go
Lines 115 to 117 in 0c87b11
cockroach/pkg/storage/tscache/skl_impl.go
Lines 102 to 104 in 0c87b11
cockroach/pkg/storage/replica.go
Lines 2595 to 2597 in 0c87b11
cockroach/pkg/storage/replica.go
Lines 3229 to 3231 in 0c87b11
cockroach/pkg/storage/replica.go
Lines 3096 to 3098 in 0c87b11
cockroach/pkg/storage/replica.go
Lines 2025 to 2027 in 0c87b11
cockroach/pkg/storage/replica.go
Lines 1973 to 1975 in 0c87b11
cockroach/pkg/storage/store.go
Lines 3167 to 3169 in 0c87b11
cockroach/pkg/storage/stores.go
Lines 184 to 186 in 0c87b11
cockroach/pkg/server/node.go
Lines 986 to 988 in 0c87b11
cockroach/pkg/util/stop/stopper.go
Lines 302 to 304 in 0c87b11
cockroach/pkg/server/node.go
Lines 973 to 975 in 0c87b11
cockroach/pkg/server/node.go
Lines 1014 to 1016 in 0c87b11
cockroach/pkg/roachpb/api.pb.go
Lines 6632 to 6634 in 0c87b11
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/vendor/github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc/server.go#L47-L49 in vendor/github.com/grpc-ecosystem/grpc-opentracing/go/otgrpc.OpenTracingServerInterceptor.func1
cockroach/pkg/rpc/context.go
Lines 196 to 198 in 0c87b11
cockroach/pkg/rpc/context.go
Lines 226 to 228 in 0c87b11
cockroach/pkg/roachpb/api.pb.go
Lines 6634 to 6636 in 0c87b11
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/vendor/google.golang.org/grpc/server.go#L1010-L1012 in vendor/google.golang.org/grpc.(*Server).processUnaryRPC
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/vendor/google.golang.org/grpc/server.go#L1248-L1250 in vendor/google.golang.org/grpc.(*Server).handleStream
https://github.com/cockroachdb/cockroach/blob/0c87b11cb99ba5c677c95ded55dcba385928474e/vendor/google.golang.org/grpc/server.go#L679-L681 in vendor/google.golang.org/grpc.(*Server).serveStreams.func1.1
v2.1.3
go1.10.3
The text was updated successfully, but these errors were encountered: