roachtest: tpchvec/perf failed #47401

cockroach-teamcity · 2020-04-12T10:03:41Z

(roachtest).tpchvec/perf failed on master@7b0f60fe2034ba8677242dbcdad86d3e5587c0d4:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200412-1869957/tpchvec/perf/run_1
	tpchvec.go:644,tpchvec.go:659,test_runner.go:753: [q12] vec ON is slower by 57.50% than vec OFF
		vec ON times: [9.81 10.19 10.41]
		vec OFF times: [6.28 6.47 6.52]

More

Artifacts: /tpchvec/perf
Related:

roachtest: tpchvec/perf failed #47118 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-20.1 release-blocker

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

The text was updated successfully, but these errors were encountered:

yuzefovich · 2020-04-12T15:26:13Z

Hm, I cannot reproduce it locally. The run times for the last five days of query 12:

vec ON times: [9.12 9.19 9.35]	 vec OFF times: [8.81 8.82 8.82]
vec ON times: [6.74 6.84 10.81]	 vec OFF times: [6.6 6.7 6.82]
vec ON times: [6.12 6.22 9.93]	 vec OFF times: [5.89 5.89 6.24]
vec ON times: [6.4 6.4 6.55]	 vec OFF times: [6.32 6.37 6.45]
vec ON times: [6.37 6.76 7.02]	 vec OFF times: [6.22 6.35 6.45]

So we had one slow run in two out of five cases, and it's always the first one (the runtimes are sorted). I'm not sure what to make of it though.

cockroach-teamcity · 2020-04-13T09:11:49Z

(roachtest).tpchvec/perf failed on master@b644071166e3a9db1fd595154a82e472d052b258:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/20200413-1870981/tpchvec/perf/run_1
	tpchvec.go:644,tpchvec.go:659,test_runner.go:753: [q15] vec ON is slower by 15.44% than vec OFF
		vec ON times: [4.4 4.56 4.57]
		vec OFF times: [3.92 3.95 4.05]

More

Artifacts: /tpchvec/perf
Related:

roachtest: tpchvec/perf failed #47118 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-20.1 release-blocker

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

asubiotto · 2020-04-15T17:11:26Z

I also ran into this when running ./bin/roachtest on my refactor branch.

cockroach-teamcity · 2020-04-18T09:44:11Z

(roachtest).tpchvec/perf failed on master@b5b4a9a55f1122ef9c82b968aa5c8cc137c7e281:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpchvec/perf/run_1
	tpchvec.go:610,tpchvec.go:715,tpchvec.go:725,test_runner.go:753: [q6] vec ON is slower by 23.55% than vec OFF
		vec ON times: [9.23 9.39 9.91]
		vec OFF times: [7.56 7.6 8.21]

More

Artifacts: /tpchvec/perf

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

cockroach-teamcity · 2020-04-20T13:28:49Z

(roachtest).tpchvec/perf failed on master@666a0ac62832b6884bc6b039b4c944dbb42924aa:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpchvec/perf/run_1
	tpchvec.go:610,tpchvec.go:715,tpchvec.go:725,test_runner.go:753: [q7] vec ON is slower by 97.57% than vec OFF
		vec ON times: [18.16 18.67 19.15]
		vec OFF times: [9.26 9.45 19.19]

More

Artifacts: /tpchvec/perf
Related:

roachtest: tpchvec/perf failed #47677 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-20.1 release-blocker
roachtest: tpchvec/perf failed #47675 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-19.2 release-blocker

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

cockroach-teamcity · 2020-04-25T12:30:55Z

(roachtest).tpchvec/perf failed on master@d620f6242ad43481e61a6af19416733cf05233a4:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpchvec/perf/run_1
	tpchvec.go:226,tpchvec.go:321,tpchvec.go:331,test_runner.go:753: [q7] vec ON is slower by 96.64% than vec OFF
		vec ON times: [9.85 18.13 18.5]
		vec OFF times: [9.18 9.22 14.1]

More

Artifacts: /tpchvec/perf
Related:

roachtest: tpchvec/perf failed #47677 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-20.1 release-blocker

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

cockroach-teamcity · 2020-04-28T08:51:54Z

(roachtest).tpchvec/perf failed on master@c3bb2bce3d69d0ceef20797d856a6ac107a3ef47:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpchvec/perf/run_1
	tpchvec.go:239,tpchvec.go:400,tpchvec.go:410,test_runner.go:753: [q1] unexpectedly wrong number of run times recorded with vec ON config: []

More

Artifacts: /tpchvec/perf
Related:

roachtest: tpchvec/perf failed #47677 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-20.1 release-blocker

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

cockroach-teamcity · 2020-05-06T09:38:06Z

(roachtest).tpchvec/perf failed on master@683f0d561bf9b73902edb27d681bca5333bdad60:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpchvec/perf/run_1
	tpchvec.go:230,tpchvec.go:400,tpchvec.go:410,test_runner.go:753: [q7] vec ON is slower by 99.89% than vec OFF
		vec ON times: [17.83 18.29 18.37]
		vec OFF times: [9.1 9.15 10.43]

More

Artifacts: /tpchvec/perf
Related:

roachtest: tpchvec/perf failed #47677 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-20.1 release-blocker

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

48522: geo/geo*fn: error when comparing two geometries of different SRIDs r=sumeerbhola a=otan When operating on more than one geo type, we must ensure that if we compare SRIDs that mismatch, we report an error. We've done this for all binary predicates implemented so far in Geometry and Geography. Furthermore, this PR adds an interface for Geospatial types. It makes things a little bit neater, and we can extend this as we go along. Release note: None 48540: tpchvec: enable some logging to debug perf test failure r=yuzefovich a=yuzefovich My current theory is that we're seeing random performance failures due to spilling to disk. To check that we now will enable some verbose logging for the components that can spill to disk. Addresses: #47401. Release note: None Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>

cockroach-teamcity · 2020-05-10T09:57:36Z

(roachtest).tpchvec/perf failed on master@c1abb272c94a437f1df9175fc30dc6a6729d3338:

The test failed on branch=master, cloud=gce:
test artifacts and logs in: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/artifacts/tpchvec/perf/run_1
	tpchvec.go:246,tpchvec.go:416,tpchvec.go:426,test_runner.go:753: [q7] vec ON is slower by 96.12% than vec OFF
		vec ON times: [18.89 19.69 20.15]
		vec OFF times: [9.89 10.04 18.99]

More

Artifacts: /tpchvec/perf
Related:

roachtest: tpchvec/perf failed #47677 roachtest: tpchvec/perf failed C-test-failure O-roachtest O-robot branch-release-20.1 release-blocker

See this test on roachdash
_{powered by pkg/cmd/internal/issues}

yuzefovich · 2020-05-10T15:55:00Z

Hm, logs indicate that the vectorized flow was not setup for query 7 for some reason - there should be messages like vectorized flow setup succeeded around these times:

I200510 09:17:30.923854 10 workload/tpch/tpch.go:465  [q6] returned 1 rows after 10.84 seconds
I200510 09:17:51.446629 25 workload/tpch/tpch.go:465  [q7] returned 4 rows after 18.89 seconds
I200510 09:18:11.594030 25 workload/tpch/tpch.go:465  [q7] returned 4 rows after 20.15 seconds

but all we have is the runtime stats at that period of time.

asubiotto · 2020-05-12T08:17:24Z

Something must be wrong with the logging, right? Because we also print failed to vectorize in vectorized_flow with verbosity 1 if there's an issue.

yuzefovich · 2020-05-12T16:44:49Z

Something must be wrong with the logging, right? Because we also print failed to vectorize in vectorized_flow with verbosity 1 if there's an issue.

I don't think so - we will print this message only if we're trying to set up a vectorized flow and we fail, but if SupportsVectorized check fails, then we will be planning row-based flow right away. I have no idea why SupportsVectorized check would fail though.

49397: roachtest: run EXPLAIN ANALYZE when tpchvec/perf fails r=yuzefovich a=yuzefovich **workload: switch TPCH queries mapping from using string to int** It also exports `NumQueries` and `QueriesByNumber` fields of `tpch` package to be used by `tpchvec` roachtest. Release note: None **roachtest: minor cleanup of tpchbench** This commit refactors `tpchbench` test slightly in order to hide `tpch` bench type from the whole `roachtest` package. Release note: None **roachtest: run EXPLAIN ANALYZE when tpchvec/perf fails** We have a mysterious rare failure on query 7 in which `vectorize=on` performs significantly worse than `vectorize=off`. I'm out of possible explanations for this, so this commit adds an ability to run `EXPLAIN ANALYZE` on the query with both `vectorize` options when the slowness threshold is exceeded. Hopefully it'll give us some insight into the perf failures. Addresses: #47401. Release note: None 49429: *: replace test usages of context.TODO() with context.Background() r=otan a=otan *: replace test usages of context.TODO() with context.Background() This commit replaces all usages of context.TODO() with context.Background() in tests, and adds a lint rule ensuring context.TODO() does not appear. Tests should always be initiating a new context if they are calling a function for the first time - it is now "unknown" and hence not a "TODO" application. Also added a new test in `lint_test.go` for this. Release note: None Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Oliver Tan <[email protected]>

yuzefovich · 2020-06-02T13:42:03Z

I'm inclined to close this issue as well because it appears to be occurring as if there is some infrastructure flake going on rarely. Additionally if it does occur again, we should have some useful info to dig into. @asubiotto thoughts?

asubiotto · 2020-06-02T14:13:45Z

Maybe it's useful to keep this "history" because it's still an issue. If it is some infrastructure flake, it'd be useful to rope dev-inf in.

yuzefovich · 2020-06-02T14:21:17Z

Well, we will remember that this particular github issue exists (even if it's closed), so we'll be able to inspect it if necessary. To me this issue seems unactionable at the moment, so I'm in favor of closing it.

asubiotto · 2020-06-02T14:30:14Z

unactionable at the moment

I think this is the point. It might become actionable later on. I think it's only useful to close issues if the problem is resolved, we've decided we won't fix it or it's unactionable and always will be so. I think it's desirable to keep issues open where we're currently waiting for something to happen so that we don't lose the context (and don't have to keep everything in our mind).

yuzefovich · 2020-06-02T14:35:12Z

Ok, I don't feel too strongly about it.

asubiotto · 2020-07-01T14:47:56Z

I think I've changed my mind. Closing this and others. If it happens again, we'll see the trace and can discuss on that issue.

cockroach-teamcity added this to the 20.1 milestone Apr 12, 2020

cockroach-teamcity assigned andreimatei Apr 12, 2020

yuzefovich assigned yuzefovich and unassigned andreimatei Apr 12, 2020

This was referenced Apr 19, 2020

roachtest: tpchvec/perf failed #47675

Closed

roachtest: tpchvec/perf failed #47677

Closed

yuzefovich mentioned this issue Apr 28, 2020

roachtest: fix tpchvec perf #48107

Merged

yuzefovich mentioned this issue May 7, 2020

tpchvec: enable some logging to debug perf test failure #48540

Merged

yuzefovich mentioned this issue May 21, 2020

roachtest: run EXPLAIN ANALYZE when tpchvec/perf fails #49397

Merged

This was referenced Jun 8, 2020

roachtest: tpchvec/perf failed #49955

Closed

roachtest: tpchvec/perf failed #50029

Closed

asubiotto closed this as completed Jul 1, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

roachtest: tpchvec/perf failed #47401

roachtest: tpchvec/perf failed #47401

cockroach-teamcity commented Apr 12, 2020

yuzefovich commented Apr 12, 2020

cockroach-teamcity commented Apr 13, 2020

asubiotto commented Apr 15, 2020

cockroach-teamcity commented Apr 18, 2020

cockroach-teamcity commented Apr 20, 2020

cockroach-teamcity commented Apr 25, 2020

cockroach-teamcity commented Apr 28, 2020

cockroach-teamcity commented May 6, 2020

cockroach-teamcity commented May 10, 2020

yuzefovich commented May 10, 2020

asubiotto commented May 12, 2020

yuzefovich commented May 12, 2020

yuzefovich commented Jun 2, 2020

asubiotto commented Jun 2, 2020

yuzefovich commented Jun 2, 2020

asubiotto commented Jun 2, 2020

yuzefovich commented Jun 2, 2020

asubiotto commented Jul 1, 2020

roachtest: tpchvec/perf failed #47401

roachtest: tpchvec/perf failed #47401

Comments

cockroach-teamcity commented Apr 12, 2020

yuzefovich commented Apr 12, 2020

cockroach-teamcity commented Apr 13, 2020

asubiotto commented Apr 15, 2020

cockroach-teamcity commented Apr 18, 2020

cockroach-teamcity commented Apr 20, 2020

cockroach-teamcity commented Apr 25, 2020

cockroach-teamcity commented Apr 28, 2020

cockroach-teamcity commented May 6, 2020

cockroach-teamcity commented May 10, 2020

yuzefovich commented May 10, 2020

asubiotto commented May 12, 2020

yuzefovich commented May 12, 2020

yuzefovich commented Jun 2, 2020

asubiotto commented Jun 2, 2020

yuzefovich commented Jun 2, 2020

asubiotto commented Jun 2, 2020

yuzefovich commented Jun 2, 2020

asubiotto commented Jul 1, 2020