distsqlrun: schedule failed streams on the gateway #17497

Closed

Conversation

andreimatei
Contributor

Before this patch, failure to schedule any remote flow caused the query to fail. This is not great: scheduling can fail because the remote node went down (it must have gone down recently, otherwise we wouldn't even have scheduled a flow on it), but also because the remote node is running an older version.
This patch makes distSQLPlanner.Run() schedule locally the flows that failed to be scheduled remotely. It does this by merging the processors in those flows into the local flow.
Moving flows to the gateway means that any other remote processor that was planned to connect to a moved processor will have a hard time connecting; we fix this by adding a fallback to outboxes: every outbox falls back to connecting to the gateway if the connection to its primary target doesn't succeed within a timeout.
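
For illustration, here is a minimal sketch of the fallback idea described above. It is not the PR's code: the actual change races the primary and fallback attempts with a timer, and names like `dialStream`, `outboxFallbackTimeout`, and the signature of `connect` are assumptions made up for this sketch.

```go
package outboxsketch

import (
	"context"
	"fmt"
	"time"
)

// outboxFallbackTimeout is an assumed value for how long an outbox waits on
// its primary target before trying the gateway.
const outboxFallbackTimeout = 3 * time.Second

// dialStream is a stand-in for establishing a DistSQL flow stream to addr.
// A real implementation would open a gRPC FlowStream; this stub only
// respects context cancellation.
func dialStream(ctx context.Context, addr string) error {
	_ = addr
	return ctx.Err()
}

// connect tries the primary target and, if that doesn't succeed within the
// timeout, falls back to the gateway (where failed flows get re-scheduled).
func connect(ctx context.Context, primaryAddr, gatewayAddr string) error {
	primaryCtx, cancel := context.WithTimeout(ctx, outboxFallbackTimeout)
	defer cancel()
	primaryErr := dialStream(primaryCtx, primaryAddr)
	if primaryErr == nil {
		return nil
	}
	if gatewayAddr == "" {
		return fmt.Errorf("failed to connect outbound stream to %s: %v", primaryAddr, primaryErr)
	}
	if err := dialStream(ctx, gatewayAddr); err != nil {
		return fmt.Errorf(
			"failed to connect outbound stream. Primary: %s (%v). Fallback: %s (%v)",
			primaryAddr, primaryErr, gatewayAddr, err)
	}
	return nil
}
```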

... to be returned by SetupFlow when the server doesn't support the
requested version, instead of the "internal error" we used to return.
Even though we can't rely on the typed error for now, because 1.0
doesn't have it, I still think it's a good idea to have it going forward.
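
As a rough illustration of such a typed error (the name `VersionMismatchError` and its fields are guesses for this sketch, not necessarily what the PR adds):

```go
package distsqlsketch

import "fmt"

// VersionMismatchError is a sketch of an error that SetupFlow could return
// when it doesn't support the version requested by the gateway. The type and
// field names here are illustrative.
type VersionMismatchError struct {
	RequestedVersion uint32
	ServerVersion    uint32
}

func (e *VersionMismatchError) Error() string {
	return fmt.Sprintf(
		"version mismatch in flow request: %d; this node accepts version %d",
		e.RequestedVersion, e.ServerVersion)
}
```

The gateway could then check for this error type (e.g. with a type assertion) and schedule the flow locally instead of failing the query, once all nodes are new enough to return it.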
…al flow on the gateway

Before this patch, when a query was compiled to flows A, B, G and A was
scheduled successfully but B's scheduling failed, we didn't schedule the
gateway flow G. If A then tried to connect to G, stream establishment
would wait for the full connection timeout before giving up. This patch
improves the situation by registering a "poisoned entry" for G that
informs anyone trying to connect that the flow is toast.
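
A hedged sketch of the "poisoned entry" mechanism, assuming a registry keyed by flow ID; the field and method names below are invented and only approximate what the PR's flowRegistry does:

```go
package flowregistrysketch

import (
	"fmt"
	"sync"
	"time"
)

// FlowID is a stand-in for distsqlrun.FlowID.
type FlowID string

type flowEntry struct {
	poisoned bool
	// ... the registered flow and its inbound stream info would live here.
}

type flowRegistry struct {
	mu    sync.Mutex
	flows map[FlowID]*flowEntry
}

// PoisonFlow records that the flow will never be scheduled on this node, so
// that producers trying to connect fail fast instead of waiting for the
// regular connection timeout. The entry expires after the given duration.
func (fr *flowRegistry) PoisonFlow(id FlowID, duration time.Duration) {
	fr.mu.Lock()
	defer fr.mu.Unlock()
	fr.flows[id] = &flowEntry{poisoned: true}
	time.AfterFunc(duration, func() {
		fr.mu.Lock()
		defer fr.mu.Unlock()
		delete(fr.flows, id)
	})
}

// ConnectInboundStream checks for poisoning before waiting for the flow.
func (fr *flowRegistry) ConnectInboundStream(id FlowID) error {
	fr.mu.Lock()
	entry, ok := fr.flows[id]
	fr.mu.Unlock()
	if ok && entry.poisoned {
		return fmt.Errorf("flow %s was aborted before being scheduled", id)
	}
	// ... otherwise wait for the flow to register, up to the usual timeout.
	return nil
}
```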
@andreimatei andreimatei requested review from a team August 7, 2017 22:10

@andreimatei
Contributor Author

I still have to figure out the testing story, but sending out for opinions.

@RaduBerinde
Member

:lgtm:

This is cool stuff! Thanks for getting this done so quickly!

For testing, we could add a testing knob to inject version mismatch errors and verify a query while turning it on and off for various nodes.
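
Such a knob could look roughly like this (CockroachDB wires test hooks through per-package TestingKnobs structs; the field and function names below are invented for this sketch):

```go
package knobsketch

import "errors"

// TestingKnobs is a sketch of a knob for injecting version mismatch errors;
// the field name is invented, not an existing knob.
type TestingKnobs struct {
	// InjectVersionMismatch, if set, makes SetupFlow on this node reject
	// every request as if it only supported an older DistSQL version.
	InjectVersionMismatch bool
}

// checkVersion sketches where SetupFlow would consult the knob.
func checkVersion(knobs *TestingKnobs) error {
	if knobs != nil && knobs.InjectVersionMismatch {
		return errors.New("injected version mismatch")
	}
	return nil
}
```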

I worked on a change at some point that introduced a "finished flow cache" which would fail-fast requests for those flows, but it never made it into a PR. The idea was that if a flow fails and gets unregistered before other flows connect to it, we don't want those connections to wait for the timeout. The same mechanism could be used instead of the poisoning stuff. If you want to look at it, this is the commit: RaduBerinde@3eb5dd0. We can of course add this later; I just thought I should point it out.
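
For reference, a minimal sketch of that fail-fast idea (not the code in the linked commit; all names are invented): record flows that finished recently, and have incoming connection attempts check the cache before waiting.

```go
package finishedflowsketch

import (
	"sync"
	"time"
)

// FlowID is a stand-in for distsqlrun.FlowID.
type FlowID string

// finishedFlowCache remembers flows that completed recently so that late
// connection attempts can fail fast instead of waiting for the timeout.
type finishedFlowCache struct {
	mu       sync.Mutex
	finished map[FlowID]time.Time
	ttl      time.Duration
}

// MarkFinished records that the flow has completed (or failed).
func (c *finishedFlowCache) MarkFinished(id FlowID) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.finished[id] = time.Now()
}

// RecentlyFinished reports whether the flow completed within the TTL;
// callers can reject a connection attempt immediately if it did.
func (c *finishedFlowCache) RecentlyFinished(id FlowID) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	t, ok := c.finished[id]
	if !ok {
		return false
	}
	if time.Since(t) > c.ttl {
		delete(c.finished, id)
		return false
	}
	return true
}
```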


Review status: 0 of 13 files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.


pkg/sql/distsql_running.go, line 41 at r3 (raw file):

// poisonedFlowDefaultTimeout is the amount of time that a poisoned flow (a
// flow that will not actually be scheduled) lives in the FlowRegistry.
const poisonedFlowDefaultTimeout time.Duration = time.Second

Can this be longer? The connection timeouts are a few seconds, I think.


pkg/sql/distsqlrun/flow_registry.go, line 205 at r3 (raw file):

// late streams attempting to connect will wait for the regular connection
// timeout before timing out.
func (fr *flowRegistry) PoisonFlow(id FlowID, timeout time.Duration) {

[nit] I would call this "duration" (as in duration of the poisoning)


pkg/sql/distsqlrun/flow_registry.go, line 229 at r3 (raw file):

		fr.UnregisterFlow(id)
	})

[nit] extra blank line


pkg/sql/distsqlrun/outbox.go, line 320 at r2 (raw file):

	}
	var fallbackTimer *time.Timer
	if m.fallbackAddr != "" {

[nit] would be a bit more readable to use a hasFallback := (m.fallbackAddr != "") instead of repeating the condition


pkg/sql/distsqlrun/outbox.go, line 323 at r2 (raw file):

		connectAttemptsRemaining++
		fallbackStreamCtx, fallbackStreamCancel = context.WithCancel(ctx)
		fallbackTimer = time.AfterFunc(outboxAttemptFallbackTimeout, func() {

Pretty smart stuff going on here with the timer!


pkg/sql/distsqlrun/outbox.go, line 359 at r2 (raw file):

	if m.fallbackAddr != "" {
		return errors.Errorf(
			"failed to connect outbound stream. Primary: %s. Fallback: %s",

I think it's ok to have the same message (with empty fallback if there is none); it's a bit more explicit than the other one.


pkg/sql/distsqlrun/outbox.go, line 365 at r2 (raw file):

}

func connectOutboundStream(

Can this not have the same name as the method above?



@knz
Contributor

knz commented Apr 27, 2018

Maybe rebase this? Also it will want a release note.

@andreimatei
Contributor Author

andreimatei commented Apr 27, 2018 via email

@jordanlewis
Member

cc @asubiotto

@rjnn
Contributor

rjnn commented Sep 18, 2018

Closing this as obsolete.
