-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: panic in backup resumer #52279
Comments
Hi @rohany, please add a C-ategory label to your issue. Check out the label system docs. While you're here, please consider adding an A- label to help keep our repository tidy. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
51503: rpc: use tenant client/server certs r=nvanbenschoten a=tbg With this PR, tenant SQL servers use tenant client certificates to connect to the tenant server endpoint at the KV layer. They were previously using the KV-internal node certs. No validation is performed as of this PR, but this is the obvious next step. Follow-up work will assertions that make sure that we don't see tenants "accidentally" use the node certs for some operations when they are available (as is typically the case during testing). Finally, there will be some work on the heartbeats exchanged by the RPC context. We don't want a SQL tenant's time signal to ever trigger KV nodes to crash, for example. Touches #47898. Release note: None - cli/flags,config: new flag for tenant KV listen addr - sql: route tenant KV traffic to tenant KV address - roachtest: configure --tenant-addr flag in acceptance/multitenant - rpc: add TenantID to rpc.ContextOptions - security: slight test improvement - rpc: pass TenantID to SecurityContext to CertManager - security: use a single test_certs dir - security: embed certs for a few hard-coded tenants - rpc: *almost* use tenant client certs (on tenants) - rpc: use tenant client/server certs where appropriate 52281: bulkio: Correctly handle exhausting retries when reading from HTTP. r=rohany a=miretskiy Fixes #52279 Return an error if we exhaust the retry budget when reading from HTTP. Release Notes: None Co-authored-by: Tobias Schottdorf <[email protected]> Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Also seeing the same crash in sqlsmith roachtests: https://teamcity.cockroachdb.com/viewLog.html?buildId=2143445 |
Yup, these have the same root cause. I was running into the panics because of pending SQLSmith backup jobs in my data directory |
52281: bulkio: Correctly handle exhausting retries when reading from HTTP. r=rohany a=miretskiy Fixes #52279 Return an error if we exhaust the retry budget when reading from HTTP. Release Notes: None 52696: cli/demo: activate Sentry crash reporting for `cockroach demo` r=irfansharif a=knz Prior to this patch, the TestServer instance was using its own `cluster.Settings` instance separate from the global one defined in `cli.serverCfg`, which is the one used by the `log` package to decide whether or not to send sentry reports. Since the `Settings` objects did not match, sentry reporting never got enabled for `cockroach demo`. This patch fixes it. This can be checked by running `cockroach demo --logtostderr=INFO`, with `COCKROACH_CRASH_REPORTS` set to a suitable URL. The reporting will be visible as the following log line (example): ``` E200812 16:27:10.520228 1764 util/log/crash_reporting.go:319 [n1,client=127.0.0.1:33741,hostssl,user=root] Queued as error 694cfa684ea14a528be7de9e79aa5947 ``` Release note (cli change): Crashes in `cockroach demo` sessions are now reported to telemetry, if telemetry is enabled. 52749: rowexec,distsql: fix a couple of things around aggregators r=yuzefovich a=yuzefovich **rowexec: fix closing of aggregator bucket in case of an error** Previously, the bucket of aggregate functions wouldn't get properly closed if an aggregate function returned an error on `Result` call. Now this is fixed. Release note: None **distsql: fix TestAggregatorAgainstProcessor** In `TestAggregatorAgainstProcessor` we generate input types randomly relying on `execinfrapb.GetAggregateInfo` function to perform the "type-checking". That function, however, can have "false negative" for several aggregate functions when a tuple is chosen as an input type, so we have special handling for those cases. Fixes: #52742. Release note: None Co-authored-by: Yevgeniy Miretskiy <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
I'm on 66a3c19, and I've just started a local node and running the
kv
workload against it. Within a few minutes I see a panic:The text was updated successfully, but these errors were encountered: