panic crash while on the database pages #83935

Closed
maryliag opened this issue Jul 6, 2022 · 7 comments · Fixed by #84049
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team

Comments

@maryliag
Contributor

maryliag commented Jul 6, 2022

Seeing a panic crash (other people mentioned seeing it on other occasions, but these are the steps with which I was able to reproduce it):

Create a build with make build
Start CRDB with ./cockroach demo --insecure --multitenant=false
Start the console with make ui-watch TARGET=http://localhost:8080/
Open the DB Console on one of the database table-list pages, e.g. http://localhost:3000/#/database/system

Wait for a while (sometimes it took me a few minutes, sometimes 30 minutes, and sometimes it doesn't happen at all); it will crash and the trace shows up in the terminal.

Trace:

# Server version: CockroachDB CCL v22.2.0-alpha.00000000-1090-g4dc922688e (x86_64-apple-darwin21.5.0, built 2022/07/05 21:13:24, go1.17.2) (same version as client)
# Cluster ID: 02f64d33-30bc-468c-967a-8884de0ff2ba
# Organization: Cockroach Demo
#
# Enter \? for a brief introduction.
#
[email protected]:26257/movr> panic: kvfetcher-0-unlimited-1: no bytes in account to release, current 0, free 82 [recovered]
	panic: kvfetcher-0-unlimited-1: no bytes in account to release, current 0, free 82 [recovered]
	panic: kvfetcher-0-unlimited-1: no bytes in account to release, current 0, free 82

goroutine 101639 [running]:
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError.func1()
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:58 +0x3bb
panic({0x86dc4e0, 0xc00cb04f00})
	/usr/local/opt/go/libexec/src/runtime/panic.go:1038 +0x215
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError.func1()
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:58 +0x3bb
panic({0x86dc4e0, 0xc00cb04f00})
	/usr/local/opt/go/libexec/src/runtime/panic.go:1038 +0x215
github.com/cockroachdb/cockroach/pkg/util/log/logcrash.ReportOrPanic({0xb5560e8, 0xc00e45c3c0}, 0xc001108a80, {0x8bbc4a0, 0xc00cb07f00}, {0xc00cd1d, 0x5, 0x5})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/util/log/logcrash/crash_reporting.go:378 +0x1c5
github.com/cockroachdb/cockroach/pkg/util/mon.(*BoundAccount).Shrink(0xc00e45c270, {0xb5560e8, 0xc00e45c3c0}, 0x52)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:715 +0x1d5
github.com/cockroachdb/cockroach/pkg/sql/row.(*txnKVFetcher).reset(0xc00cb7c300, {0xb5560e8, 0xc00e45c3c0})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_batch_fetcher.go:648 +0x125
github.com/cockroachdb/cockroach/pkg/sql/row.(*txnKVFetcher).close(0x400e6bd, {0xb5560e8, 0xc00e45c3c0})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_batch_fetcher.go:654 +0x25
github.com/cockroachdb/cockroach/pkg/sql/row.(*KVFetcher).Close(...)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_fetcher.go:300
github.com/cockroachdb/cockroach/pkg/sql/colfetcher.(*cFetcher).Close(0xc00ba2c000, {0xb5560e8, 0xc00e45c3c0})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colfetcher/cfetcher.go:1330 +0x6d
github.com/cockroachdb/cockroach/pkg/sql/colfetcher.(*ColBatchScan).Close(0xc00b330d20, {0x8a82b1a, 0x8a8183c})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colfetcher/colbatch_scan.go:298 +0x4b
github.com/cockroachdb/cockroach/pkg/sql/colexecop.Closers.CloseAndLogOnErr.func1()
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexecop/operator.go:177 +0xae
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError(0xe62b7f8)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:91 +0x62
github.com/cockroachdb/cockroach/pkg/sql/colexecop.Closers.CloseAndLogOnErr({0xc00cb07e90, 0xb4b50e0, 0xc00f5e42e0}, {0xb556040, 0xc00cb49800}, {0x8a9fa10, 0xc})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexecop/operator.go:175 +0xcd
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).close(0xc00cba4960)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:324 +0x98
github.com/cockroachdb/cockroach/pkg/sql/colexec.newMaterializerInternal.func1()
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexec/materializer.go:210 +0x1d
github.com/cockroachdb/cockroach/pkg/sql/execinfra.(*ProcessorBaseNoHelper).moveToTrailingMeta(0xc00cba4960)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/processorsbase.go:682 +0x364
github.com/cockroachdb/cockroach/pkg/sql/execinfra.(*ProcessorBaseNoHelper).DrainHelper(0xc00cba4960)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/processorsbase.go:561 +0x189
github.com/cockroachdb/cockroach/pkg/sql/colexec.(*Materializer).Next(0xc00cba4960)
	/Users/maryliag/go/src/github.com/cockroachdb/coc:311 +0xa5
github.com/cockroachdb/cockroach/pkg/sql/execinfra.(*ProcessorBaseNoHelper).DrainHelper(0xc003de2240)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/processorsbase.go:557 +0x11c
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*FlowCoordinator).next(0x0)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:143 +0x92
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*FlowCoordinator).nextAdapter(...)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:147
github.com/cockroachdb/cockroach/pkg/sql/colexecerror.CatchVectorizedRuntimeError(0x0)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:91 +0x62
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*FlowCoordinator).Next(0xc003de2240)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:152 +0x3e
github.com/cockroachdb/cockroach/pkg/sql/execinfra.DrainAndForwardMetadata({0xb556040, 0xc00cb49800}, {0xb57f4d8, 0xc003de2240}, {0xb4cfc50, 0xc004348700})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/base.go:220 +0x75
github.com/cockroachdb/cockroach/pkg/sql/execinfra.Run({0xb556040, 0xc00cb49800}, {0xb57f4d8, 0xc003de2240}, {0xb4cfc50, 0xc004348700})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/base.go:193 +0xe5
github.com/cockroachdb/cockroach/pkg/sql/execinfra.(*ProcessorBaseNoHelper).Run(0xc003de2240, {0xb556040, 0xc00cb49800})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/execinfra/processorsbase.go:722 +0x5b
github.com/cockroachdb/cockroach/pkg/sql/flowinfra.(*FlowBase).Run(0xc00cba4780, {0xb556040, 0xc00cb49800}, 0xc003de2240)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/flowinfra/flow.go:475 +0x258
github.com/cockroachdb/cockroach/pkg/sql/colflow.(*vectorizedFlow).Run(0xc00c7182d0, {0xb556040, 0xc00cb49800}, 0xc00cc888c0)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:249 +0x205
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).Run(0xc001ea3a40, {0xb5560e8, 0xc00e45c0c0}, 0xc00c76b420, 0xc00cc888c0, 0xc00cd02300, 0xc004348700, 0xc00c7182d0, 0x0)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:607 +0xb04
github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).PlanAndRun(0xb5560e8, {0xb5560e8, 0xc00e45c0c0}, 0xc00c718010, 0xc00c76b420, 0xc00e81bf20, {{0xb557c08, 0xc00cd02280}, 0x0}, 0xc004348700)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1461 +0x25c
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execWithDistSQLEngine(0xc00c717900, {0xb5560e8, 0xc00e45c0c0}, 0xc00c718010, 0xc00e45c0c0, {0xb5fdf18, 0xc00e81bf20}, 0x50, 0xc0091a3618)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1485 +0x614
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).dispatchToExecutionEngine(0xc00c717900, {0xb5560e8, 0xc00e81bfb0}, 0xc00c718010, {0xb5fdf18, 0xc00e81bf20})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1159 +0xb87
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmtInOpenState(0xc00c717900, {0xb556040, 0xc00cb495c0}, {{0xb586798, 0xc00c72be50}, {0xc002e1f380, 0x51}, 0x0, 0x1}, 0x0, ...)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:690 +0x2091
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execStmt(0xc00c717900, {0xb556040, 0xc00cb495c0}, {{0xb586798, 0xc00c72be50}, {0xc002e1f380, 0x51}, 0x0, 0x1}, 0x0, ...)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:145 +0x59e
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd.func1({{{0xb586798, 0xc00c72be50}, {0xc002e1f380, 0x51}, 0x0, 0x1}, {0xc0a94c0b66814cc0, 0x366afd67ace, 0x0}, {0xc0a94c0b66814cc0, ...}, ...}, ...)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1892 +0x2f6
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).execCmd(0xc00c717900)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1896 +0xb48
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).run(0xc00c717900, {0xb5560e8, 0xc00e81bbc0}, 0xc00e81b9e0, {0x0, 0x0, 0x0, 0x0, 0x0}, 0x0)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1818 +0x26c
github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx.func1()
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:206 +0xa5
created by github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:205 +0x5f1

Jira issue: CRDB-17362

@maryliag maryliag added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 6, 2022
@maryliag maryliag added the T-sql-queries SQL Queries Team label Jul 6, 2022
@maryliag
Contributor Author

maryliag commented Jul 7, 2022

I was testing again, this time without using --insecure, and it alternates between the panic above and another one:

[email protected]:26257/movr> panic: session: unexpected 10240 leftover bytes

goroutine 19233 [running]:
github.com/cockroachdb/cockroach/pkg/util/log/logcrash.ReportOrPanic({0xb58f848, 0xc0025fc2a0}, 0xc00110ca80, {0x8b338fa, 0x0}, {0xc0023f1d10, 0x6f4d750, 0x0})
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/util/log/logcrash/crash_reporting.go:378 +0x1c5
github.com/cockroachdb/cockroach/pkg/util/mon.(*BytesMonitor).doStop(0xc003ff8aa0, {0xb58f848, 0xc0025fc2a0}, 0x1)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:435 +0x233
github.com/cockroachdb/cockroach/pkg/util/mon.(*BytesMonitor).Stop(...)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/util/mon/bytes_usage.go:415
github.com/cockroachdb/cockroach/pkg/sql.(*connExecutor).close(0xc0042a5900, {0xb58f848, 0xc0025fc2a0}, 0x2)
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1170 +0x82b
github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx.func1()
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:215 +0x137
created by github.com/cockroachdb/cockroach/pkg/sql.(*InternalExecutor).initConnEx
	/Users/maryliag/go/src/github.com/cockroachdb/cockroach/pkg/sql/internal.go:205 +0x5f1

Video of reproducing the panic: https://www.loom.com/share/9a6f143ae950448b851ffd860ddb05d7

@yuzefovich
Member

Hm, I just tried doing the same as in the video with 41228d1 SHA, and it doesn't seem to crash.

Where does the SHA 4dc922688e that the binary was built on come from? I can't seem to find it. Based on the stack trace, I can see it was after #83010 was merged.

@maryliag
Contributor Author

maryliag commented Jul 7, 2022

The branch I'm using is from PR #83677 (if it helps). I rebased and rebuilt, and then got those panics.

@yuzefovich
Member

Thanks, I can repro on your branch.

@yuzefovich
Member

I'm somewhat confident that #83615 is to blame - it somehow exposed an issue with releasing prepared statements in some cases. Still looking.

@yuzefovich
Member

Alright, I have figured out the cause of the first stack trace and made some progress on the second. I believe it occurs when the --max-sql-memory budget (128MiB by default for demo) is exceeded, and on the database page we run into the memory accounting leak that is fixed by #83678.

I still don't fully understand how we get those "leftover bytes" errors - I'm pretty sure it has to do with the memory account used in PreparedStatement structs, and I found some minor issues, but couldn't get to the bottom of them. Since in the release builds we won't crash and will just release these "leftover bytes", it seems ok to leave it at that.
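
As a rough illustration of the "crash in test builds, report and release in release builds" behavior described above, here is a minimal Go sketch. It is not the CockroachDB implementation: the `monitor` type, its `stop` method, and the `crashOnLeak` parameter are hypothetical names chosen for this example, and the 10240-byte figure simply mirrors the second stack trace.

```go
package main

import "fmt"

// monitor is a hypothetical stand-in for a memory monitor; it only tracks
// how many bytes are still registered when it is stopped.
type monitor struct {
	name      string
	allocated int64
}

// stop checks for leftover bytes. When crashOnLeak is true (modelling a
// non-release build) it panics; otherwise it reports the problem, releases
// the leftover bytes, and returns how many were released.
func (m *monitor) stop(crashOnLeak bool) (released int64) {
	if m.allocated != 0 {
		msg := fmt.Sprintf("%s: unexpected %d leftover bytes", m.name, m.allocated)
		if crashOnLeak {
			panic(msg)
		}
		fmt.Println("reported:", msg)
		released = m.allocated
		m.allocated = 0
	}
	return released
}

func main() {
	m := &monitor{name: "session", allocated: 10240}
	fmt.Println("released:", m.stop(false))
}
```

With `crashOnLeak` set to true the same call panics instead, which corresponds to what the demo build in this issue was doing.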

In short, a couple of small PRs (which I'm about to open) plus #83678 should address this.

@yuzefovich
Member

Alright, I finally figured it out - #84049 will solve this stack trace, even in the face of a memory leak.

craig bot pushed a commit that referenced this issue Jul 8, 2022
83597: Colocate auth logging with auth metric for consistency r=rafiss a=ecwall

refs #83224

Release note (bug fix): Move connection OK log and metric to same location
after auth completes for consistency. This resolves an inconsistency
(see linked issue) in the DB console where the log and metric did not match.


83731: kvserver: acquire replica lease on queue check r=nvanbenschoten a=kvoli

This patch adds a check within the replication for when a replica is the
raft leader and does not have a valid lease. The necessary conditions
are that it is currently the raft leader and that the lease status is
expired.

This ensures that following a node restart, a replica with a valid
lease will be installed within the replica scanner interval.

**single nodes 10k ranges with change**

![image](https://user-images.githubusercontent.com/39606633/176971656-317c38d3-7103-47a0-a18a-d9f29c49baa5.png)

**5 node, 3k ranges**

*without change*
![image](https://user-images.githubusercontent.com/39606633/177620933-56cfe528-c45c-429f-a4d9-9d3ba90fe8e1.png)

*with change*
![image](https://user-images.githubusercontent.com/39606633/177621186-ee467043-47d5-4279-bb69-5478e7ad445a.png)



resolves #83444

Release note: None

84044: ui: option to search exact statement on SQL Activity r=maryliag a=maryliag

Previously, a search on the SQL Activity page
returned all statements that contained all terms
from the search, but not necessarily in the same order.
This commit adds an option: when you wrap the search in quotes,
it returns only results with the exact match in order.
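
The two search modes described above can be sketched in Go as follows. This is an illustration only, not the DB Console code: the `matches` function is a hypothetical name, and case-insensitive substring matching is an assumption made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// matches reports whether stmt satisfies the search string.
// A search wrapped in double quotes must appear verbatim, in order.
// Otherwise every whitespace-separated term must appear, in any order.
// Comparison is case-insensitive (an assumption for this sketch).
func matches(stmt, search string) bool {
	s := strings.ToLower(stmt)
	q := strings.ToLower(strings.TrimSpace(search))
	if len(q) >= 2 && strings.HasPrefix(q, `"`) && strings.HasSuffix(q, `"`) {
		return strings.Contains(s, q[1:len(q)-1])
	}
	for _, term := range strings.Fields(q) {
		if !strings.Contains(s, term) {
			return false
		}
	}
	return true
}

func main() {
	stmt := "SELECT name FROM users"
	fmt.Println(matches(stmt, "users select"))   // all terms present, any order
	fmt.Println(matches(stmt, `"users select"`)) // exact phrase required
	fmt.Println(matches(stmt, `"name from"`))    // exact phrase required
}
```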

https://www.loom.com/share/442c6eaee84b4c71a1acdef0b63b74bf

Release note (ui change): Ability to search for the exact terms
in order when wrapping the search in quotes.

84047: sql: remove unused error return value in a method of connExecutor r=yuzefovich a=yuzefovich

Found while looking into #83935.

Release note: None

84082: roachtest: skip multitenant/fairness r=cucaroach a=cucaroach

Informs: #83994

Release note: None


84085: roachtest: fix zipping of artifacts to include other zips r=srosenberg a=renatolabs

When artifacts are zipped in preparation for being published to
TeamCity, other zip files are skipped. The idea is that we won't try
to recursively zip artifacts.zip itself, or debug.zip, which is
published separately. However, some tests (notably, `tpchvec`)
download their own zip files in the `logs` directory so that they'll
be available for analysis when a test fails.

While there was an intention to skip only top-level zip files (as
indicated by existing comments), the code itself would skip any zip
files found in the artifacts directory. This commit updates the zipping
logic to skip only top-level zip files, allowing tests to write their
own zip files to the `logs` directory and have them available for
inspection later.
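
A minimal Go sketch of the "skip only top-level zip files" rule described above. It is not the roachtest code: `skipTopLevelZip` is a hypothetical helper, and the `artifacts` root and `bundle.zip` file name are made up for the example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// skipTopLevelZip reports whether path is a zip file that sits directly in
// root. Zip files in subdirectories (for example under logs/) are kept.
func skipTopLevelZip(root, path string) bool {
	if !strings.EqualFold(filepath.Ext(path), ".zip") {
		return false
	}
	return filepath.Clean(filepath.Dir(path)) == filepath.Clean(root)
}

func main() {
	root := "artifacts"
	for _, p := range []string{
		"artifacts/debug.zip",
		"artifacts/logs/bundle.zip",
		"artifacts/test.log",
	} {
		fmt.Printf("%-28s skip=%v\n", p, skipTopLevelZip(root, p))
	}
}
```

Comparing the parent directory against the root, rather than checking only the file extension, is what separates the intended behavior from the earlier one that skipped every zip file.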

Release note: None.

Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Marylia Gutierrez <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Tommy Reilly <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
craig bot pushed a commit that referenced this issue Jul 9, 2022
84048: row: only store the accounted for memory if the reservation is approved r=yuzefovich a=yuzefovich

Previously, we would update the counter about the reserved memory before
doing the reservation. If that reservation is denied, then later on, in
`txnKVFetcher.close` we could try to release more memory than we
registered. This is now fixed.
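
To make the ordering problem described above concrete, here is a minimal Go sketch. It is not the CockroachDB code: `account`, `fetcher`, `reserveBuggy`, and `reserveFixed` are hypothetical names, and the 64-byte limit is invented. The 82-byte request echoes the `free 82` in the panic message from this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// account is a hypothetical bounded memory account, not mon.BoundAccount.
type account struct {
	used, limit int64
}

var errBudgetExceeded = errors.New("memory budget exceeded")

// grow reserves n bytes, or returns an error if the limit would be exceeded.
func (a *account) grow(n int64) error {
	if a.used+n > a.limit {
		return errBudgetExceeded
	}
	a.used += n
	return nil
}

// shrink releases n bytes and returns an error on an over-release.
func (a *account) shrink(n int64) error {
	if n > a.used {
		return fmt.Errorf("no bytes in account to release, current %d, free %d", a.used, n)
	}
	a.used -= n
	return nil
}

// fetcher remembers how many bytes it believes it has reserved.
type fetcher struct {
	acc       *account
	accounted int64
}

// reserveBuggy updates the counter before the reservation is approved.
func (f *fetcher) reserveBuggy(n int64) error {
	f.accounted += n
	return f.acc.grow(n)
}

// reserveFixed updates the counter only if the reservation succeeds.
func (f *fetcher) reserveFixed(n int64) error {
	if err := f.acc.grow(n); err != nil {
		return err
	}
	f.accounted += n
	return nil
}

// close releases whatever the fetcher believes it has reserved.
func (f *fetcher) close() error {
	err := f.acc.shrink(f.accounted)
	f.accounted = 0
	return err
}

func main() {
	buggy := &fetcher{acc: &account{limit: 64}}
	_ = buggy.reserveBuggy(82) // denied, but still counted
	fmt.Println("buggy close:", buggy.close())

	fixed := &fetcher{acc: &account{limit: 64}}
	_ = fixed.reserveFixed(82) // denied, so not counted
	fmt.Println("fixed close:", fixed.close())
}
```

In the buggy variant the denied reservation is still counted, so `close` tries to release bytes the account never held. In the fixed variant the counter stays at zero and `close` has nothing to release.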

Addresses: #83935.

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig craig bot closed this as completed in 10853d2 Jul 11, 2022
@craig craig bot closed this as completed in #84049 Jul 11, 2022
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023

2 participants