distsqlrun: schedule failed streams on the gateway #17497

andreimatei · 2017-08-07T22:10:28Z

Before this patch, failure to schedule any remote flows was causing the
query to fail. This is not great, as scheduling can fail because the
remote node went down (but it must have gone down recently, otherwise we
wouldn't even have scheduled a flow on it), but also because the remote
node is running an older version.
This patch makes distSQLPlanner.Run() schedule locally the flows that
failed to schedule remotely. It does this by merging the processors in
those flows with the local flow.
Moving flows to the gateway means that any other remote processor that
was planned to connect to a moved processor will have a bad time
connecting; we fix it by adding a fallback to outboxes - all outboxes
are programmed to fall back to connecting to the gateway if the
connection to their primary target doesn't succeed within a timeout.

... to be returned by SetupFlow when the server doesn't support the requested version, instead of the "internal error" we used to return. Even though we won't be able to use the typed error for now because 1.0 didn't have it, I still think it's a good idea to have it from now on.

…al flow on the gateway

Before this patch, when a query was compiled to flows A, B, and G, and A was successfully scheduled but scheduling B failed, we didn't schedule the gateway flow G. If A then tried to connect to G, stream establishment would wait for the full connection timeout before giving up. This patch improves on that by registering a "poisoned entry" for G that tells anybody trying to connect that the flow is toast.

andreimatei · 2017-08-07T22:10:46Z

I still have to figure out the testing story, but sending out for opinions.

RaduBerinde · 2017-08-08T02:14:06Z

This is cool stuff! Thanks for getting this done so quickly!

For testing, we could add a testing knob to inject version mismatch errors and verify a query while turning it on and off for various nodes.
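
One possible shape for such a knob, purely as a sketch: every name here (`testingKnobs`, `injectVersionMismatch`, `setupFlow`, `placeFlows`) is hypothetical and not taken from the codebase. The knob decides per node whether flow setup fails, and the placement helper mimics the PR's behavior of moving a failed flow to the gateway.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeID identifies a node in this sketch.
type NodeID int

var errVersionMismatch = errors.New("version mismatch")

// testingKnobs holds hooks that only tests set.
type testingKnobs struct {
	// injectVersionMismatch, if non-nil, reports whether flow setup on
	// the given node should fail with a version mismatch.
	injectVersionMismatch func(NodeID) bool
}

// setupFlow pretends to schedule a flow on a node, honoring the knob.
func setupFlow(node NodeID, knobs testingKnobs) error {
	if knobs.injectVersionMismatch != nil && knobs.injectVersionMismatch(node) {
		return errVersionMismatch
	}
	return nil
}

// placeFlows maps each planned node to the node that ends up running its
// flow: the node itself, or the gateway if remote setup failed.
func placeFlows(gateway NodeID, planned []NodeID, knobs testingKnobs) map[NodeID]NodeID {
	placement := make(map[NodeID]NodeID, len(planned))
	for _, n := range planned {
		if n != gateway && setupFlow(n, knobs) != nil {
			placement[n] = gateway
			continue
		}
		placement[n] = n
	}
	return placement
}

func main() {
	knobs := testingKnobs{injectVersionMismatch: func(n NodeID) bool { return n == 3 }}
	fmt.Println(placeFlows(1, []NodeID{1, 2, 3}, knobs))
}
```

A test could run the same query with the knob on and off for different nodes and check that the results match.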

I worked on a change at some point that introduced a "finished flow cache" which would fail-fast requests for those flows, but it never made it into a PR. The idea was that if a flow fails and gets unregistered before other flows connect to it, we don't want those to wait for the timeout. The same could be used instead of the poisoning stuff. If you want to look at it, this is the commit: RaduBerinde@3eb5dd0. We can of course add this later, just thought I should point it out.

Review status: 0 of 13 files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.

pkg/sql/distsql_running.go, line 41 at r3 (raw file):

// poisonedFlowDefaultTimeout is the amount of time that a poisoned flow (a
// flow that will not actually be scheduled) lives in the FlowRegistry.
const poisonedFlowDefaultTimeout time.Duration = time.Second

Can this be longer? The connection timeouts are a few seconds, I think.

pkg/sql/distsqlrun/flow_registry.go, line 205 at r3 (raw file):

// late streams attempting to connect will wait for the regular connection
// timeout before timing out.
func (fr *flowRegistry) PoisonFlow(id FlowID, timeout time.Duration) {

[nit] I would call this "duration" (as in duration of the poisoning)

pkg/sql/distsqlrun/flow_registry.go, line 229 at r3 (raw file):

		fr.UnregisterFlow(id)
	})

[nit] extra blank line

pkg/sql/distsqlrun/outbox.go, line 320 at r2 (raw file):

	}
	var fallbackTimer *time.Timer
	if m.fallbackAddr != "" {

[nit] would be a bit more readable to use a hasFallback := (m.fallbackAddr != "") instead of repeating the condition

pkg/sql/distsqlrun/outbox.go, line 323 at r2 (raw file):

		connectAttemptsRemaning++
		fallbackStreamCtx, fallbackStreamCancel = context.WithCancel(ctx)
		fallbackTimer = time.AfterFunc(outboxAttemptFallbackTimeout, func() {

Pretty smart stuff going on here with the timer!

pkg/sql/distsqlrun/outbox.go, line 359 at r2 (raw file):

	if m.fallbackAddr != "" {
		return errors.Errorf(
			"failed to connect outbound stream. Primary: %s. Fallback: %s",

I think it's ok to have the same message (with empty fallback if there is none); it's a bit more explicit than the other one.

pkg/sql/distsqlrun/outbox.go, line 365 at r2 (raw file):

}

func connectOutboundStream(

Can this not have the same name as the method above?

knz · 2018-04-27T14:52:53Z

Maybe rebase this? Also it will want a release note.

andreimatei · 2018-04-27T17:34:36Z

This PR is dead, at least for now. The fundamental issue is that it's hard to decide when to fall back, because it's hard to conclude that you've connected to the wrong node.

jordanlewis · 2018-08-06T02:30:55Z

cc @asubiotto

rjnn · 2018-09-18T20:33:32Z

Closing this as obsolete.

andreimatei added 3 commits August 7, 2017 16:47

andreimatei assigned RaduBerinde Aug 7, 2017

andreimatei requested review from a team August 7, 2017 22:10

rjnn mentioned this pull request Aug 15, 2017

distsql: better handling of failures and dead nodes #15637

Closed

andreimatei mentioned this pull request Aug 17, 2017

sql,distsqlrun,gossip: don't plan on incompatible nodes #17747

Merged

RaduBerinde mentioned this pull request Aug 24, 2017

distsql: ensure backward compatibility during upgrade #17277

Closed

tamird requested review from a team and removed request for a team October 22, 2017 18:07

rjnn closed this Sep 18, 2018

andreimatei deleted the distsql-version-err branch September 25, 2018 00:06

yuzefovich mentioned this pull request Aug 31, 2022

roachtest: stop cockroach gracefully when upgrading nodes #87154

Merged
