-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachtest: tpchvec/perf failed #47401
Comments
Hm, I cannot reproduce it locally. The run times for the last five days of query 12:
So we had one slow run in two out of five cases, and it's always the first one (the runtimes are sorted). I'm not sure what to make of it though. |
(roachtest).tpchvec/perf failed on master@b644071166e3a9db1fd595154a82e472d052b258:
More
Artifacts: /tpchvec/perf
See this test on roachdash |
I also ran into this when running |
(roachtest).tpchvec/perf failed on master@b5b4a9a55f1122ef9c82b968aa5c8cc137c7e281:
|
(roachtest).tpchvec/perf failed on master@666a0ac62832b6884bc6b039b4c944dbb42924aa:
More
Artifacts: /tpchvec/perf
See this test on roachdash |
(roachtest).tpchvec/perf failed on master@d620f6242ad43481e61a6af19416733cf05233a4:
More
Artifacts: /tpchvec/perf
See this test on roachdash |
(roachtest).tpchvec/perf failed on master@c3bb2bce3d69d0ceef20797d856a6ac107a3ef47:
More
Artifacts: /tpchvec/perf
See this test on roachdash |
(roachtest).tpchvec/perf failed on master@683f0d561bf9b73902edb27d681bca5333bdad60:
More
Artifacts: /tpchvec/perf
See this test on roachdash |
48522: geo/geo*fn: error when comparing two geometries of different SRIDs r=sumeerbhola a=otan When operating on more than one geo type, we must ensure that if we compare SRIDs that mismatch, we report an error. We've done this for all binary predicates implemented so far in Geometry and Geography. Furthermore, this PR adds an interface for Geospatial types. It makes things a little bit neater, and we can extend this as we go along. Release note: None 48540: tpchvec: enable some logging to debug perf test failure r=yuzefovich a=yuzefovich My current theory is that we're seeing random performance failures due to spilling to disk. To check that we now will enable some verbose logging for the components that can spill to disk. Addresses: #47401. Release note: None Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
(roachtest).tpchvec/perf failed on master@c1abb272c94a437f1df9175fc30dc6a6729d3338:
More
Artifacts: /tpchvec/perf
See this test on roachdash |
Hm, logs indicate that the vectorized flow was not setup for query 7 for some reason - there should be messages like
but all we have is the runtime stats at that period of time. |
Something must be wrong with the logging, right? Because we also print |
I don't think so - we will print this message only if we're trying to set up a vectorized flow and we fail, but if |
49397: roachtest: run EXPLAIN ANALYZE when tpchvec/perf fails r=yuzefovich a=yuzefovich **workload: switch TPCH queries mapping from using string to int** It also exports `NumQueries` and `QueriesByNumber` fields of `tpch` package to be used by `tpchvec` roachtest. Release note: None **roachtest: minor cleanup of tpchbench** This commit refactors `tpchbench` test slightly in order to hide `tpch` bench type from the whole `roachtest` package. Release note: None **roachtest: run EXPLAIN ANALYZE when tpchvec/perf fails** We have a mysterious rare failure on query 7 in which `vectorize=on` performs significantly worse than `vectorize=off`. I'm out of possible explanations for this, so this commit adds an ability to run `EXPLAIN ANALYZE` on the query with both `vectorize` options when the slowness threshold is exceeded. Hopefully it'll give us some insight into the perf failures. Addresses: #47401. Release note: None 49429: *: replace test usages of context.TODO() with context.Background() r=otan a=otan *: replace test usages of context.TODO() with context.Background() This commit replaces all usages of context.TODO() with context.Background() in tests, and adds a lint rule ensuring context.TODO() does not appear. Tests should always be initiating a new context if they are calling a function for the first time - it is now "unknown" and hence not a "TODO" application. Also added a new test in `lint_test.go` for this. Release note: None Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Oliver Tan <[email protected]>
I'm inclined to close this issue as well because it appears to be occurring as if there is some infrastructure flake going on rarely. Additionally if it does occur again, we should have some useful info to dig into. @asubiotto thoughts? |
Maybe it's useful to keep this "history" because it's still an issue. If it is some infrastructure flake, it'd be useful to rope dev-inf in. |
Well, we will remember that this particular github issue exists (even if it's closed), so we'll be able to inspect it if necessary. To me this issue seems unactionable at the moment, so I'm in favor of closing it. |
I think this is the point. It might become actionable later on. I think it's only useful to close issues if the problem is resolved, we've decided we won't fix it or it's unactionable and always will be so. I think it's desirable to keep issues open where we're currently waiting for something to happen so that we don't lose the context (and don't have to keep everything in our mind). |
Ok, I don't feel too strongly about it. |
I think I've changed my mind. Closing this and others. If it happens again, we'll see the trace and can discuss on that issue. |
(roachtest).tpchvec/perf failed on master@7b0f60fe2034ba8677242dbcdad86d3e5587c0d4:
More
Artifacts: /tpchvec/perf
Related:
See this test on roachdash
powered by pkg/cmd/internal/issues
The text was updated successfully, but these errors were encountered: