distsql: Nil pointer panic in PopulateEndpoints #26140

Closed
bdarnell opened this issue May 28, 2018 · 4 comments · Fixed by #26950
Assignees
Labels
A-sql-optimizer SQL logical planning and optimizations. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@bdarnell
Contributor

https://sentry.io/cockroach-labs/cockroachdb/issues/565592900/

*log.safeError: conn_executor.go:521: panic while executing 1 statements: SELECT _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _ FROM ((SELECT _, _, _, _, _, _, _, _, _, _ FROM _._._ WHERE (_ = (-_)) AND (_ = _) ORDER BY _, _, _ LIMIT _) JOIN (SELECT _, _, _, _, _, _, _, _, _, _, _, _, _, _ FROM _._._ WHERE (_ = (-_)) AND (_ = _) ORDER BY _, _ LIMIT _) USING (_, _)) ORDER BY _, _, _: caused by <redacted>
  File "github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go", line 489, in func2
  File "github.com/cockroachdb/cockroach/pkg/sql/distsqlplan/physical_plan.go", line 676, in PopulateEndpoints
  File "github.com/cockroachdb/cockroach/pkg/sql/distsql_physical_planner.go", line 2463, in FinalizePlan
  File "github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go", line 531, in PlanAndRun
  File "github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go", line 672, in execWithDistSQLEngine
...
(8 additional frame(s) were not displayed)

@jordanlewis jordanlewis added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-optimizer SQL logical planning and optimizations. labels May 29, 2018
@jordanlewis
Member

                p2.Spec.Input[s.DestInput].Streams = append(p2.Spec.Input[s.DestInput].Streams, endpoint)
                if endpoint.Type == distsqlrun.StreamEndpointSpec_REMOTE {
                        var ok bool
                        endpoint.TargetAddr, ok = nodeAddresses[p2.Node]
                        if !ok {
                                panic(fmt.Sprintf("node %d not in nodeAddresses map", p2.Node))
                        }
                }

The panic comes from the explicit panic call in the snippet above: p2.Node has no entry in the nodeAddresses map.
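
For illustration, here is a minimal sketch of the same "comma ok" map lookup that reports the missing node as an error instead of panicking. `NodeID` and `lookupAddr` are hypothetical stand-ins, not the types or functions used in `physical_plan.go`, and this is not the fix that was eventually merged.

```go
package main

import "fmt"

// NodeID is a simplified stand-in for the real node identifier type.
type NodeID int

// lookupAddr (hypothetical helper) performs the same "comma ok" lookup as the
// snippet above, but returns an error when the node has no known address
// instead of panicking.
func lookupAddr(nodeAddresses map[NodeID]string, node NodeID) (string, error) {
	addr, ok := nodeAddresses[node]
	if !ok {
		return "", fmt.Errorf("node %d not in nodeAddresses map", node)
	}
	return addr, nil
}

func main() {
	// Only nodes 1 and 2 have addresses; node 3 plays the role of a node
	// that the planner chose but that is missing from the map.
	addrs := map[NodeID]string{1: "10.0.0.1:26257", 2: "10.0.0.2:26257"}
	if _, err := lookupAddr(addrs, 3); err != nil {
		fmt.Println(err)
	}
}
```

Returning an error only changes how the failure surfaces. The underlying problem is still that a processor was planned on a node with no entry in the map.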

@solongordon solongordon self-assigned this Jun 25, 2018
@solongordon
Contributor

I think I see where this got broken. getNodeIDForScan used to return the gateway node ID if a node was unhealthy or incompatible. Now it returns the unhealthy node ID.

0cd1da0#diff-8840c1839af40078c429f6fed21d7916L899

I'll dig in a bit more but if that's all it is, it'll be an easy fix.
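
A rough sketch of the fallback behavior described above, assuming a health check is available as a callback. The names `pickNodeForScan`, `nodeStatus`, and `check` are hypothetical and do not reflect the actual signature of `getNodeIDForScan`.

```go
package main

import "fmt"

// NodeID is a simplified stand-in for the real node identifier type.
type NodeID int

// nodeStatus is a hypothetical result of a health/compatibility check.
type nodeStatus int

const (
	nodeOK nodeStatus = iota
	nodeUnhealthy
	nodeIncompatible
)

// pickNodeForScan (hypothetical) returns the preferred node when it passes
// the check, and falls back to the gateway node otherwise. The regression
// described above corresponds to returning the preferred node unconditionally.
func pickNodeForScan(preferred, gateway NodeID, check func(NodeID) nodeStatus) NodeID {
	if check(preferred) != nodeOK {
		return gateway
	}
	return preferred
}

func main() {
	status := map[NodeID]nodeStatus{1: nodeOK, 2: nodeUnhealthy}
	check := func(n NodeID) nodeStatus {
		s, ok := status[n]
		if !ok {
			// Treat unknown nodes as unusable in this sketch.
			return nodeUnhealthy
		}
		return s
	}
	fmt.Println(pickNodeForScan(2, 1, check)) // unhealthy node: falls back to gateway
	fmt.Println(pickNodeForScan(1, 1, check)) // healthy node: used as is
}
```

With the fallback in place, every node that ends up in the plan has passed the check, so a later address lookup for it should succeed.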

@solongordon
Contributor

This bug is present as of 2.0.2.

@solongordon
Contributor

Just realized this code path is thankfully only used by LIMIT queries, which somewhat reduces the scope of this bug.

solongordon added a commit to solongordon/cockroach that referenced this issue Jun 25, 2018
A bug was introduced in 0cd1da0 which allows table readers to be planned
on unhealthy or incompatible nodes for LIMIT queries. They should use
the gateway node instead. This was causing a panic in execution because
the node was not in the nodeAddresses map.

Fixes cockroachdb#26140

Release note (bug fix): Fixed 'node not in nodeAddresses map' panic,
which could occur when distributed LIMIT queries were run on a cluster
with at least one unhealthy node.
craig bot pushed a commit that referenced this issue Jun 25, 2018
26950: distsql: do not plan against unhealthy nodes r=solongordon a=solongordon

A bug was introduced in 0cd1da0 which allows table readers to be planned
on unhealthy or incompatible nodes for LIMIT queries. They should use
the gateway node instead. This was causing a panic in execution because
the node was not in the nodeAddresses map.

Fixes #26140

Release note (bug fix): Fixed 'node not in nodeAddresses map' panic,
which could occur when distributed queries were run on a cluster with at
least one unhealthy node.

Co-authored-by: Solon Gordon <[email protected]>
@craig craig bot closed this as completed in #26950 Jun 25, 2018
craig bot pushed a commit that referenced this issue Jun 25, 2018
26953: release-2.0: distsql: do not plan against unhealthy nodes r=solongordon a=solongordon

Backport 1/1 commits from #26950.

/cc @cockroachdb/release

---

A bug was introduced in 0cd1da0 which allows table readers to be planned
on unhealthy or incompatible nodes for LIMIT queries. They should use
the gateway node instead. This was causing a panic in execution because
the node was not in the nodeAddresses map.

Fixes #26140

Release note (bug fix): Fixed 'node not in nodeAddresses map' panic,
which could occur when distributed queries were run on a cluster with at
least one unhealthy node.


Co-authored-by: Solon Gordon <[email protected]>