Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec: add cancellation check to operators #36967

Merged
merged 1 commit into from
Apr 23, 2019

Conversation

yuzefovich
Copy link
Member

We introduce a CancelChecker operator that wraps all other operators
and performs a cancellation check on every batch. Also, sorter and
hash joiner do additional checks themselves since they can spend
a lot of time in certain phases of execution.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks nice.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @jordanlewis, @solongordon, and @yuzefovich)


pkg/sql/distsqlrun/column_exec_setup.go, line 450 at r1 (raw file):

		op = exec.NewLimitOp(op, post.Limit)
	}
	return exec.NewCancelChecker(op), nil

I wonder if there's any significant advantage to doing this once per batch per operator as opposed to just once per input batch. Like, you could install a single cancel checker at the top of the pipeline, below the table reader.

In this case, I think the effects would be nearly identical as long as there aren't any long-running operators. But we're already adding fixes to all of the long running operators individually...


pkg/sql/exec/sort.go, line 364 at r1 (raw file):

		// Since we can spend a lot of time doing sorting here, it's worth to check
		// for cancellation explicitly.
		// TODO(yuzefovich): is this granular enough?

I kind of doubt it - it misses the common case of a large global sort on a single column. That might take a long time and should be interruptable.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @jordanlewis, and @solongordon)


pkg/sql/distsqlrun/column_exec_setup.go, line 450 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I wonder if there's any significant advantage to doing this once per batch per operator as opposed to just once per input batch. Like, you could install a single cancel checker at the top of the pipeline, below the table reader.

In this case, I think the effects would be nearly identical as long as there aren't any long-running operators. But we're already adding fixes to all of the long running operators individually...

Good point, done. I also think this should be sufficient with individual operators doing the checks as well.


pkg/sql/exec/sort.go, line 364 at r1 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

I kind of doubt it - it misses the common case of a large global sort on a single column. That might take a long time and should be interruptable.

Agreed. I basically copied the approach from existing sorter now.

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @solongordon, and @yuzefovich)


pkg/sql/exec/hashjoiner_tmpl.go, line 272 at r2 (raw file):

	nKeys uint64,
	sel []uint16,
	cancelChecker CancelChecker,

Would it make more sense to put the cancelChecker in the hashTable struct itself? Not sure, I guess the downside of that is that it has to know the context ahead of time, which is something that we try to avoid.


pkg/sql/exec/quicksort_tmpl.go, line 197 at r2 (raw file):

		}
		maxDepth--
		cancelChecker.check()

Do we also need one below, in the other if? This looks like only happens for small slices. I don't know how this implementation works very well.

Also could you show results of sort benchmarks?

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 8 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @jordanlewis, @solongordon, and @yuzefovich)


pkg/sql/exec/cancel_checker.go, line 25 at r2 (raw file):

)

// CancelChecker is an Operator that checks whether a query cancellation has

s/a query/query


pkg/sql/exec/cancel_checker.go, line 46 at r2 (raw file):

// Init is part of Operator interface.
func (c *CancelChecker) Init() {

I believe you can elide this definition (the embedded Init will be called by default)


pkg/sql/exec/cancel_checker.go, line 61 at r2 (raw file):

const cancelCheckInterval = 1024

// check panics with query canceled error if the associated query has been

s/with query/with a query


pkg/sql/exec/cancel_checker.go, line 70 at r2 (raw file):

	// Increment. This may rollover when the 32-bit capacity is reached, but
	// that's all right.
	atomic.AddUint32(&c.callsSinceLastCheck, 1)

TODO: come back to this


pkg/sql/exec/cancel_checker_test.go, line 33 at r2 (raw file):

	op := NewCancelChecker(NewNoop(newRepeatableBatchSource(batch)))
	cancel()
	if err := CatchVectorizedRuntimeError(func() {

You might be able to use the require package here again (using require.Panics)


pkg/sql/exec/hashjoiner_tmpl.go, line 272 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Would it make more sense to put the cancelChecker in the hashTable struct itself? Not sure, I guess the downside of that is that it has to know the context ahead of time, which is something that we try to avoid.

One solution would be to put the cancelChecker in the hashTable struct but remove the context field from the cancelChecker struct and make that an argument to check instead. This way the hashTable could take a context.Context that it then passes to its cancelChecker.check which makes for more idiomatic go and doesn't require a context field.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark of the general sorter:

name                        old time/op    new time/op    delta
Sort/rows=2048/cols=1-12      20.1µs ±10%    17.7µs ±12%    -11.94%  (p=0.002 n=10+10)
Sort/rows=2048/cols=2-12       128µs ± 6%     145µs ± 8%    +13.12%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-12       222µs ± 3%     300µs ±11%    +34.79%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12      164µs ±11%     160µs ± 9%       ~     (p=0.393 n=10+10)
Sort/rows=16384/cols=2-12     1.09ms ± 6%    1.11ms ± 3%       ~     (p=0.218 n=10+10)
Sort/rows=16384/cols=4-12     1.84ms ± 8%    2.00ms ± 7%     +8.65%  (p=0.000 n=10+9)
Sort/rows=262144/cols=1-12    2.68ms ± 9%    2.37ms ± 3%    -11.41%  (p=0.000 n=10+9)
Sort/rows=262144/cols=2-12    15.3ms ± 7%    15.3ms ± 2%       ~     (p=0.829 n=10+8)
Sort/rows=262144/cols=4-12    33.1ms ± 6%    32.3ms ± 1%       ~     (p=0.515 n=10+8)

name                        old speed      new speed      delta
Sort/rows=2048/cols=1-12     820MB/s ±10%   929MB/s ±11%    +13.31%  (p=0.002 n=10+10)
Sort/rows=2048/cols=2-12     256MB/s ± 6%   226MB/s ± 8%    -11.57%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-12     295MB/s ± 3%   219MB/s ±10%    -25.61%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12    799MB/s ±10%   819MB/s ± 8%       ~     (p=0.393 n=10+10)
Sort/rows=16384/cols=2-12    240MB/s ± 5%   236MB/s ± 3%       ~     (p=0.218 n=10+10)
Sort/rows=16384/cols=4-12    285MB/s ± 8%   262MB/s ± 6%     -8.00%  (p=0.000 n=10+9)
Sort/rows=262144/cols=1-12   785MB/s ± 9%   884MB/s ± 3%    +12.62%  (p=0.000 n=10+9)
Sort/rows=262144/cols=2-12   274MB/s ± 7%   275MB/s ± 2%       ~     (p=0.829 n=10+8)
Sort/rows=262144/cols=4-12   254MB/s ± 6%   259MB/s ± 1%       ~     (p=0.515 n=10+8)

name                        old alloc/op   new alloc/op   delta
Sort/rows=2048/cols=1-12      73.2kB ± 0%    73.3kB ± 0%     +0.09%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-12       113kB ± 0%     121kB ± 0%     +7.01%  (p=0.000 n=10+9)
Sort/rows=2048/cols=4-12       206kB ± 0%     267kB ± 0%    +29.87%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12      885kB ± 0%     885kB ± 0%     +0.01%  (p=0.000 n=10+10)
Sort/rows=16384/cols=2-12     1.52MB ± 0%    1.56MB ± 0%     +2.36%  (p=0.000 n=9+10)
Sort/rows=16384/cols=4-12     2.78MB ± 0%    2.94MB ± 0%     +5.84%  (p=0.000 n=10+10)
Sort/rows=262144/cols=1-12    16.0MB ± 0%    16.0MB ± 0%     +0.00%  (p=0.000 n=8+10)
Sort/rows=262144/cols=2-12    28.0MB ± 0%    28.0MB ± 0%     +0.13%  (p=0.000 n=8+10)
Sort/rows=262144/cols=4-12    51.5MB ± 0%    51.7MB ± 0%     +0.31%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
Sort/rows=2048/cols=1-12        27.0 ± 0%      29.0 ± 0%     +7.41%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-12        45.0 ± 0%     293.3 ± 3%   +551.85%  (p=0.000 n=10+9)
Sort/rows=2048/cols=4-12        81.0 ± 0%    1999.4 ± 1%  +2368.40%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12       65.0 ± 0%      67.0 ± 0%     +3.08%  (p=0.000 n=10+10)
Sort/rows=16384/cols=2-12        121 ± 0%      1243 ± 3%   +927.44%  (p=0.000 n=10+10)
Sort/rows=16384/cols=4-12        233 ± 0%      5304 ± 1%  +2176.22%  (p=0.000 n=10+10)
Sort/rows=262144/cols=1-12       564 ± 0%       566 ± 0%     +0.35%  (p=0.000 n=10+10)
Sort/rows=262144/cols=2-12     1.12k ± 0%     2.22k ± 1%    +98.67%  (p=0.000 n=10+10)
Sort/rows=262144/cols=4-12     2.23k ± 0%     7.28k ± 1%   +226.50%  (p=0.000 n=10+10)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @jordanlewis, and @solongordon)


pkg/sql/exec/cancel_checker.go, line 25 at r2 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

s/a query/query

Done.


pkg/sql/exec/cancel_checker.go, line 46 at r2 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I believe you can elide this definition (the embedded Init will be called by default)

Thanks, done.


pkg/sql/exec/cancel_checker.go, line 61 at r2 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

s/with query/with a query

Done.


pkg/sql/exec/cancel_checker.go, line 70 at r2 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

TODO: come back to this

Did you mean for me to leave a TODO like this? Or was it for your own attention?


pkg/sql/exec/cancel_checker_test.go, line 33 at r2 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

You might be able to use the require package here again (using require.Panics)

Done.


pkg/sql/exec/hashjoiner_tmpl.go, line 272 at r2 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

One solution would be to put the cancelChecker in the hashTable struct but remove the context field from the cancelChecker struct and make that an argument to check instead. This way the hashTable could take a context.Context that it then passes to its cancelChecker.check which makes for more idiomatic go and doesn't require a context field.

Done.


pkg/sql/exec/quicksort_tmpl.go, line 197 at r2 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Do we also need one below, in the other if? This looks like only happens for small slices. I don't know how this implementation works very well.

Also could you show results of sort benchmarks?

I also wondered about that. sqlbase/sort.go doesn't check it in case 12 > b-a > 1, I'm assuming, because that case should be extremely quick.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to try having to pass CancelChecker by pointer, and the results are surprising to me (note that this is the comparison of current master against cancellation branch and not the comparison between cancellation with values and with pointers):

name                        old time/op    new time/op    delta
Sort/rows=2048/cols=1-12      20.8µs ±13%    17.7µs ± 8%  -14.70%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-12       139µs ±14%     128µs ± 5%   -7.81%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-12       227µs ± 2%     214µs ± 2%   -6.00%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12      161µs ± 6%     149µs ± 2%   -7.43%  (p=0.000 n=10+9)
Sort/rows=16384/cols=2-12     1.08ms ± 7%    1.02ms ± 3%   -5.39%  (p=0.000 n=10+9)
Sort/rows=16384/cols=4-12     1.75ms ± 1%    1.70ms ± 2%   -2.71%  (p=0.000 n=8+10)
Sort/rows=262144/cols=1-12    2.41ms ± 9%    2.32ms ± 5%     ~     (p=0.065 n=10+9)
Sort/rows=262144/cols=2-12    16.1ms ± 7%    14.5ms ± 2%  -10.31%  (p=0.000 n=10+8)
Sort/rows=262144/cols=4-12    36.6ms ±14%    29.7ms ± 1%  -18.75%  (p=0.000 n=10+8)

name                        old speed      new speed      delta
Sort/rows=2048/cols=1-12     793MB/s ±12%   927MB/s ± 7%  +16.88%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-12     236MB/s ±12%   255MB/s ± 4%   +8.19%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-12     288MB/s ± 2%   307MB/s ± 2%   +6.37%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12    816MB/s ± 6%   881MB/s ± 2%   +7.93%  (p=0.000 n=10+9)
Sort/rows=16384/cols=2-12    243MB/s ± 6%   256MB/s ± 3%   +5.65%  (p=0.000 n=10+9)
Sort/rows=16384/cols=4-12    300MB/s ± 1%   308MB/s ± 2%   +2.79%  (p=0.000 n=8+10)
Sort/rows=262144/cols=1-12   872MB/s ± 8%   901MB/s ± 3%     ~     (p=0.122 n=10+8)
Sort/rows=262144/cols=2-12   258MB/s ± 7%   290MB/s ± 2%  +12.25%  (p=0.000 n=9+8)
Sort/rows=262144/cols=4-12   231MB/s ±15%   282MB/s ± 1%  +22.25%  (p=0.000 n=10+8)

name                        old alloc/op   new alloc/op   delta
Sort/rows=2048/cols=1-12      73.2kB ± 0%    73.3kB ± 0%   +0.04%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-12       113kB ± 0%     113kB ± 0%   +0.03%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-12       206kB ± 0%     206kB ± 0%   +0.02%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12      885kB ± 0%     885kB ± 0%   +0.00%  (p=0.000 n=10+10)
Sort/rows=16384/cols=2-12     1.52MB ± 0%    1.52MB ± 0%   +0.00%  (p=0.000 n=10+10)
Sort/rows=16384/cols=4-12     2.78MB ± 0%    2.78MB ± 0%   +0.00%  (p=0.000 n=10+10)
Sort/rows=262144/cols=1-12    16.0MB ± 0%    16.0MB ± 0%   +0.00%  (p=0.000 n=10+10)
Sort/rows=262144/cols=2-12    28.0MB ± 0%    28.0MB ± 0%   +0.00%  (p=0.000 n=9+8)
Sort/rows=262144/cols=4-12    51.5MB ± 0%    51.5MB ± 0%   +0.00%  (p=0.000 n=8+10)

name                        old allocs/op  new allocs/op  delta
Sort/rows=2048/cols=1-12        27.0 ± 0%      28.0 ± 0%   +3.70%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-12        45.0 ± 0%      46.0 ± 0%   +2.22%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-12        81.0 ± 0%      82.0 ± 0%   +1.23%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-12       65.0 ± 0%      66.0 ± 0%   +1.54%  (p=0.000 n=10+10)
Sort/rows=16384/cols=2-12        121 ± 0%       122 ± 0%   +0.83%  (p=0.000 n=10+10)
Sort/rows=16384/cols=4-12        233 ± 0%       234 ± 0%   +0.43%  (p=0.000 n=10+10)
Sort/rows=262144/cols=1-12       564 ± 0%       565 ± 0%   +0.18%  (p=0.000 n=10+10)
Sort/rows=262144/cols=2-12     1.12k ± 0%     1.12k ± 0%   +0.09%  (p=0.000 n=10+10)
Sort/rows=262144/cols=4-12     2.23k ± 0%     2.23k ± 0%   +0.04%  (p=0.000 n=10+10)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @jordanlewis, and @solongordon)

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reduced hash joiner benchmark:

HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-12        299µs ± 5%     350µs ± 7%  +16.85%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-12     25.7ms ± 2%    32.4ms ± 6%  +26.28%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-12     397ms ± 6%     488ms ± 3%  +22.97%  (p=0.000 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-12         375µs ± 7%     408µs ± 2%   +8.66%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-12      34.2ms ± 2%    39.7ms ± 2%  +16.15%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-12      540ms ± 6%     625ms ± 4%  +15.56%  (p=0.000 n=10+9)

name                                                                  old speed      new speed      delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-12      438MB/s ± 5%   375MB/s ± 6%  -14.40%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-12    654MB/s ± 2%   519MB/s ± 6%  -20.71%  (p=0.000 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-12   677MB/s ± 6%   550MB/s ± 3%  -18.72%  (p=0.000 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-12       350MB/s ± 7%   321MB/s ± 2%   -8.06%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-12     491MB/s ± 2%   422MB/s ± 2%  -13.92%  (p=0.000 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-12    497MB/s ± 6%   430MB/s ± 4%  -13.50%  (p=0.000 n=10+9)

name                                                                  old alloc/op   new alloc/op   delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-12        805kB ± 0%     805kB ± 0%     ~     (p=0.126 n=8+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-12     46.4MB ± 0%    46.4MB ± 0%     ~     (p=0.262 n=10+10)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-12     699MB ± 0%     699MB ± 0%     ~     (p=1.000 n=8+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-12         805kB ± 0%     805kB ± 0%     ~     (p=0.211 n=10+10)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-12      46.4MB ± 0%    46.4MB ± 0%     ~     (p=0.229 n=10+9)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-12      699MB ± 0%     699MB ± 0%     ~     (p=0.952 n=10+9)

name                                                                  old allocs/op  new allocs/op  delta
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=2048-12        8.29k ± 0%     8.29k ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=262144-12      1.05M ± 0%     1.05M ± 0%     ~     (all equal)
HashJoiner/nulls=false/fullOuter=false/distinct=true/rows=4194304-12     16.8M ± 0%     16.8M ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=2048-12         8.29k ± 0%     8.29k ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=262144-12       1.05M ± 0%     1.05M ± 0%     ~     (all equal)
HashJoiner/nulls=true/fullOuter=false/distinct=true/rows=4194304-12      16.8M ± 0%     16.8M ± 0%     ~     (all equal)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @jordanlewis, and @solongordon)

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you are suggesting that somehow passing the cancel checker by pointer makes the general sorter just faster? That seems suspicious... why do you think that is? Must be something more going on here.

:lgtm: though

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, and @solongordon)

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 10 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto, @georgeutsin, @solongordon, and @yuzefovich)


pkg/sql/distsqlrun/column_exec_setup.go, line 119 at r3 (raw file):

		}
		op, err = newColBatchScan(flowCtx, core.TableReader, post)
		op = exec.NewCancelChecker(op)

Maybe add a comment as to why it's only necessary to wrap the colBatchScan.


pkg/sql/exec/cancel_checker.go, line 70 at r2 (raw file):

Previously, yuzefovich wrote…

Did you mean for me to leave a TODO like this? Or was it for your own attention?

Oops. This was for my own attention. What I was going to ask is why do we need an atomic here?


pkg/sql/exec/cancel_checker.go, line 54 at r3 (raw file):

// check panics with a query canceled error if the associated query has been
// canceled. The check is performed on every 1024th call. This should be used

s/1024/cancelCheckInterval


pkg/sql/exec/quicksort_tmpl.go, line 61 at r3 (raw file):

}

func (p *sort_TYPE_DIROp) heapSort(a, b int, ctx context.Context, cancelChecker *CancelChecker) {

Should we embed this cancelChecker in the sort struct as we do with the hash table? I suppose this is what the benchmark discussion was about. It seems suspicious to me as well.

@yuzefovich
Copy link
Member Author

@jordanlewis you were right about probably my Mac being noisy. The results of running the benchmark on gce worker:

name                        old time/op    new time/op    delta
Sort/rows=2048/cols=1-24      45.6µs ± 2%    47.2µs ± 3%  +3.33%  (p=0.000 n=10+9)
Sort/rows=2048/cols=2-24       236µs ± 2%     239µs ± 2%  +1.31%  (p=0.029 n=10+10)
Sort/rows=2048/cols=4-24       412µs ± 1%     419µs ± 2%  +1.57%  (p=0.000 n=9+8)
Sort/rows=16384/cols=1-24      438µs ± 1%     469µs ± 8%  +7.16%  (p=0.013 n=9+10)
Sort/rows=16384/cols=2-24     2.00ms ± 2%    1.99ms ± 2%    ~     (p=0.515 n=8+10)
Sort/rows=16384/cols=4-24     3.38ms ± 2%    3.47ms ± 1%  +2.70%  (p=0.000 n=10+9)
Sort/rows=262144/cols=1-24    6.53ms ± 1%    6.51ms ± 1%    ~     (p=0.684 n=10+10)
Sort/rows=262144/cols=2-24    28.1ms ± 2%    28.3ms ± 3%    ~     (p=0.356 n=10+9)
Sort/rows=262144/cols=4-24    63.8ms ± 3%    62.7ms ± 3%    ~     (p=0.089 n=10+10)

name                        old speed      new speed      delta
Sort/rows=2048/cols=1-24     359MB/s ± 2%   348MB/s ± 3%  -3.23%  (p=0.000 n=10+9)
Sort/rows=2048/cols=2-24     139MB/s ± 2%   137MB/s ± 2%  -1.29%  (p=0.025 n=10+10)
Sort/rows=2048/cols=4-24     159MB/s ± 1%   155MB/s ± 6%  -2.31%  (p=0.000 n=9+9)
Sort/rows=16384/cols=1-24    299MB/s ± 1%   281MB/s ± 8%  -6.25%  (p=0.013 n=9+10)
Sort/rows=16384/cols=2-24    131MB/s ± 2%   132MB/s ± 2%    ~     (p=0.515 n=8+10)
Sort/rows=16384/cols=4-24    155MB/s ± 2%   151MB/s ± 1%  -2.64%  (p=0.000 n=10+9)
Sort/rows=262144/cols=1-24   321MB/s ± 1%   322MB/s ± 1%    ~     (p=0.684 n=10+10)
Sort/rows=262144/cols=2-24   149MB/s ± 2%   148MB/s ± 4%    ~     (p=0.368 n=10+9)
Sort/rows=262144/cols=4-24   131MB/s ± 3%   134MB/s ± 3%    ~     (p=0.089 n=10+10)

name                        old alloc/op   new alloc/op   delta
Sort/rows=2048/cols=1-24      73.2kB ± 0%    73.3kB ± 0%  +0.04%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-24       113kB ± 0%     113kB ± 0%  +0.03%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-24       206kB ± 0%     206kB ± 0%  +0.02%  (p=0.000 n=10+8)
Sort/rows=16384/cols=1-24      885kB ± 0%     885kB ± 0%  +0.00%  (p=0.000 n=9+10)
Sort/rows=16384/cols=2-24     1.52MB ± 0%    1.52MB ± 0%  +0.00%  (p=0.000 n=10+10)
Sort/rows=16384/cols=4-24     2.78MB ± 0%    2.78MB ± 0%  +0.00%  (p=0.000 n=10+10)
Sort/rows=262144/cols=1-24    16.0MB ± 0%    16.0MB ± 0%  +0.00%  (p=0.000 n=9+10)
Sort/rows=262144/cols=2-24    28.0MB ± 0%    28.0MB ± 0%  +0.00%  (p=0.000 n=10+10)
Sort/rows=262144/cols=4-24    51.5MB ± 0%    51.5MB ± 0%  +0.00%  (p=0.000 n=10+10)

name                        old allocs/op  new allocs/op  delta
Sort/rows=2048/cols=1-24        27.0 ± 0%      28.0 ± 0%  +3.70%  (p=0.000 n=10+10)
Sort/rows=2048/cols=2-24        45.0 ± 0%      46.0 ± 0%  +2.22%  (p=0.000 n=10+10)
Sort/rows=2048/cols=4-24        81.0 ± 0%      82.0 ± 0%  +1.23%  (p=0.000 n=10+10)
Sort/rows=16384/cols=1-24       65.0 ± 0%      66.0 ± 0%  +1.54%  (p=0.000 n=10+10)
Sort/rows=16384/cols=2-24        121 ± 0%       122 ± 0%  +0.83%  (p=0.000 n=10+10)
Sort/rows=16384/cols=4-24        233 ± 0%       234 ± 0%  +0.43%  (p=0.000 n=10+10)
Sort/rows=262144/cols=1-24       564 ± 0%       565 ± 0%  +0.18%  (p=0.000 n=10+10)
Sort/rows=262144/cols=2-24     1.12k ± 0%     1.12k ± 0%  +0.09%  (p=0.000 n=10+10)
Sort/rows=262144/cols=4-24     2.23k ± 0%     2.23k ± 0%  +0.04%  (p=0.000 n=10+10)

@yuzefovich yuzefovich force-pushed the cancellation branch 2 times, most recently from 8115b78 to 7d9aee1 Compare April 23, 2019 02:28
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto, @georgeutsin, and @solongordon)


pkg/sql/distsqlrun/column_exec_setup.go, line 119 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Maybe add a comment as to why it's only necessary to wrap the colBatchScan.

Done.


pkg/sql/exec/cancel_checker.go, line 70 at r2 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Oops. This was for my own attention. What I was going to ask is why do we need an atomic here?

I guess I just copy/pasted it. I don't see a reason for it at the moment, so I removed it. Maybe in the future we will want to parallelize sort or something like that? We can always add it back later if needed.


pkg/sql/exec/cancel_checker.go, line 54 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

s/1024/cancelCheckInterval

Done.


pkg/sql/exec/quicksort_tmpl.go, line 61 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Should we embed this cancelChecker in the sort struct as we do with the hash table? I suppose this is what the benchmark discussion was about. It seems suspicious to me as well.

Yeah, the benchmarks on my Mac were off.

I embedded cancelChecker into sorter structs as well. I also run the benchmark to check whether there is any significant difference, and there doesn't appear to be any (ran on gce worker):

name                        old time/op   new time/op   delta
Sort/rows=2048/cols=1-24     38.1µs ±14%   36.7µs ± 2%    ~     (p=0.905 n=10+9)
Sort/rows=2048/cols=2-24      203µs ± 1%    203µs ± 1%    ~     (p=0.353 n=10+10)
Sort/rows=2048/cols=4-24      367µs ± 1%    369µs ± 1%    ~     (p=0.156 n=10+9)
Sort/rows=16384/cols=1-24     308µs ± 2%    312µs ± 1%  +1.49%  (p=0.015 n=10+10)
Sort/rows=16384/cols=2-24    1.63ms ± 2%   1.64ms ± 1%  +1.06%  (p=0.027 n=8+9)
Sort/rows=16384/cols=4-24    3.13ms ± 4%   3.10ms ± 2%    ~     (p=0.481 n=10+10)
Sort/rows=262144/cols=1-24   4.32ms ± 1%   4.54ms ±12%  +5.13%  (p=0.003 n=10+10)
Sort/rows=262144/cols=2-24   22.7ms ± 2%   22.7ms ± 1%    ~     (p=0.853 n=10+10)
Sort/rows=262144/cols=4-24   59.1ms ± 1%   59.3ms ± 2%    ~     (p=0.579 n=10+10)

name                        old speed     new speed     delta
Sort/rows=2048/cols=1-24    433MB/s ±13%  446MB/s ± 2%    ~     (p=0.905 n=10+9)
Sort/rows=2048/cols=2-24    161MB/s ± 1%  162MB/s ± 1%    ~     (p=0.353 n=10+10)
Sort/rows=2048/cols=4-24    178MB/s ± 1%  178MB/s ± 1%    ~     (p=0.163 n=10+9)
Sort/rows=16384/cols=1-24   426MB/s ± 2%  420MB/s ± 1%  -1.47%  (p=0.015 n=10+10)
Sort/rows=16384/cols=2-24   161MB/s ± 2%  160MB/s ± 1%  -1.05%  (p=0.026 n=8+9)
Sort/rows=16384/cols=4-24   168MB/s ± 4%  169MB/s ± 2%    ~     (p=0.481 n=10+10)
Sort/rows=262144/cols=1-24  485MB/s ± 1%  463MB/s ±11%  -4.55%  (p=0.003 n=10+10)
Sort/rows=262144/cols=2-24  185MB/s ± 2%  185MB/s ± 1%    ~     (p=0.838 n=10+10)
Sort/rows=262144/cols=4-24  142MB/s ± 1%  142MB/s ± 2%    ~     (p=0.565 n=10+10)

Copy link
Member

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: once alfonso's satisfied too

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, and @solongordon)

Copy link
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: minus the ordering of the ctx arg.

Reviewed 5 of 5 files at r4.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @asubiotto, @georgeutsin, @solongordon, and @yuzefovich)


pkg/sql/exec/quicksort_tmpl.go, line 192 at r4 (raw file):

}

func (p *sort_TYPE_DIROp) quickSort(a, b, maxDepth int, ctx context.Context) {

nit: ctx is usually the first argument to a method.

We introduce a CancelChecker operator that wraps colBatchScan and
performs a cancellation check on every input batch. Also, sorter and
hash joiner do additional checks themselves since they can spend
a lot of time in certain phases of execution.

Release note: None
Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @asubiotto, @georgeutsin, and @solongordon)


pkg/sql/exec/quicksort_tmpl.go, line 192 at r4 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

nit: ctx is usually the first argument to a method.

Done. I was debating whether it should be the first or the last in this case.

Copy link
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the reviews!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @asubiotto, @georgeutsin, and @solongordon)

craig bot pushed a commit that referenced this pull request Apr 23, 2019
35585: roachtest: rationalize cluster destruction r=andreimatei a=andreimatei

Cluster destruction can be initiated from two places: tests finishing
and a signal being received. Some synchronization is required for
handling races on this destruction. This patch makes this
synchronization nicer. Before this patch, the case where a signal was
received and then a test finishes was handled OK, but the reverse I
think was not.

This patch also kills a global containing a list of all the clusters and
encapsulates it in a nicer clusterRegistry.

Release note: None

36967: exec: add cancellation check to operators r=yuzefovich a=yuzefovich

We introduce a CancelChecker operator that wraps all other operators
and performs a cancellation check on every batch. Also, sorter and
hash joiner do additional checks themselves since they can spend
a lot of time in certain phases of execution.

Release note: None

Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
@craig
Copy link
Contributor

craig bot commented Apr 23, 2019

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants