importccl: refactor out all business logic from the processor #41909

spaskob · 2019-10-24T18:00:42Z

This helps in being able to run an import standalone and makes it clear
that the distSql processor is only used for propagating error and status
messages to the controller.

Release note: none.

cockroach-teamcity · 2019-10-24T18:00:49Z

This change is

miretskiy

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, and @spaskob)

pkg/ccl/importccl/import_processor.go, line 77 at r1 (raw file):

	// We don't have to worry about this go routine leaking because next we loop over progCh
	// which is closed only after the go routine returns.

I kinda think that "leaking" is a strong word... Did you mean that runImport() terminates early if the passed in context is cancelled, and so we don't need to use group.GoCtx for the cancellation propagation?

pkg/ccl/importccl/import_processor.go, line 102 at r1 (raw file):

Quoted 8 lines of code…

	if err != nil {
		cp.output.Push(nil, &execinfrapb.ProducerMetadata{Err: err})
		return
	}
	cp.output.Push(sqlbase.EncDatumRow{
		sqlbase.DatumToEncDatum(types.Bytes, tree.NewDBytes(tree.DBytes(countsBytes))),
		sqlbase.DatumToEncDatum(types.Bytes, tree.NewDBytes(tree.DBytes([]byte{}))),
	}, nil)

Not sure which one is more canonical go:
if countBytes, err := marhsal(); err == nil {
// send result
} else {
// send error.
}

This helps in being able to run an import standalone and makes it clear that the distSql processor is only used for propagating error and status messages to the controller. Release note: none.

dt

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt and @miretskiy)

pkg/ccl/importccl/import_processor.go, line 77 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

	// We don't have to worry about this go routine leaking because next we loop over progCh
	// which is closed only after the go routine returns.
I kinda think that "leaking" is a strong word... Did you mean that runImport() terminates early if the passed in context is cancelled, and so we don't need to use group.GoCtx for the cancellation propagation?

"leaking a go routine" is a common problem where the function that started a goroutine returns while the goroutine it started is still running, and nothing is responsible for making sure that goroutine actually exists.

Using group.Go is one way we avoid this (since group.Wait() will block until they all exit) and using our bespoke stopper is another, and a naked go is often treated as suspect given how easy it is to get wrong. However, in-person, Spas and I looked at this particular callsite and decided there was no need for the one-goroutine-group, since it was clear the range would block the creating function until the defer fired, and the ctx closed over would cancel at exactly the same time either way, so in this case, it was clearer this way (but with a comment for those who have acquired the fear of the naked keyword).

pkg/ccl/importccl/import_processor.go, line 102 at r1 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

	if err != nil {
		cp.output.Push(nil, &execinfrapb.ProducerMetadata{Err: err})
		return
	}
	cp.output.Push(sqlbase.EncDatumRow{
		sqlbase.DatumToEncDatum(types.Bytes, tree.NewDBytes(tree.DBytes(countsBytes))),
		sqlbase.DatumToEncDatum(types.Bytes, tree.NewDBytes(tree.DBytes([]byte{}))),
	}, nil)

Not sure which one is more canonical go:
if countBytes, err := marhsal(); err == nil {
// send result
} else {
// send error.
}

no, err != nil immediately after the error-returning call is the norm.

If you try to put the nil case in the else so you can use the if-scoped values (countBytes here), our linter will in fact tell you not to (since the first block has an early return) and instead do it the way it is in this diff now: binding the value and err outside the if, doing the err conditional early return, then using the bound result back at the original scope.

You can do it the way you wrote, putting the non-nil err in the else instead, if you really want to use the if binding to limit the scope of the countBytes, but that is not the norm, and runs a serious risk of people accustomed to the err != nil convention missing the subtle difference. (FWIW, I wish that linter didn't exist, and have been known to try your way, but have come around to ahearing to the convention instead)

spaskob requested review from dt and miretskiy October 24, 2019 18:00

spaskob force-pushed the import-refactor branch from 0eab64b to b99ec7f Compare October 24, 2019 18:03

miretskiy reviewed Oct 24, 2019

View reviewed changes

importccl: refactor out all business logic from the processor

d993f9e

This helps in being able to run an import standalone and makes it clear that the distSql processor is only used for propagating error and status messages to the controller. Release note: none.

spaskob force-pushed the import-refactor branch from b99ec7f to d993f9e Compare October 24, 2019 18:30

dt approved these changes Oct 24, 2019

View reviewed changes

spaskob merged commit 7ac14eb into cockroachdb:master Oct 24, 2019

spaskob mentioned this pull request Nov 12, 2019

release-19.2: IMPORT CSV experimental_save_rejected support #42391

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

importccl: refactor out all business logic from the processor #41909

importccl: refactor out all business logic from the processor #41909

spaskob commented Oct 24, 2019

cockroach-teamcity commented Oct 24, 2019

miretskiy left a comment

dt left a comment

importccl: refactor out all business logic from the processor #41909

importccl: refactor out all business logic from the processor #41909

Conversation

spaskob commented Oct 24, 2019

cockroach-teamcity commented Oct 24, 2019

miretskiy left a comment

Choose a reason for hiding this comment

dt left a comment

Choose a reason for hiding this comment