importccl: roachtest import/tpch/nodes=8 is unexpectedly slow #68117

Open
adityamaru opened this issue Jul 27, 2021 · 0 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-sql-queries SQL Queries Team


adityamaru commented Jul 27, 2021

There is a longstanding TODO above the test definition in registerImportTPCH:

		// TODO(dt): this test seems to have become slower as of 19.2. It previously
		// had 4, 8 and 32 node configurations with comments claiming they ran in
		// 4-5h for 4 node and 3h for 8 node. As of 19.2, it seems to be timing out
		// -- potentially because 8 secondary indexes is worst-case for direct
		// ingestion and seems to cause a lot of compaction, but further profiling
		// is required to confirm this. Until then, the 4 and 32 node configurations
		// are removed (4 is too slow and 32 is pretty expensive) while 8-node is
		// given a 50% longer timeout (which running by hand suggests should be OK).
		// (07/27/21) The timeout was increased again to 10 hours. The test runs in
		// ~7 hours which causes it to occasionally exceed the previous timeout of 8
		// hours.

We bumped the timeout to 8 hours in 2019 and then to 10 hours in 2021. The test usually runs in ~7 hours. This issue is a placeholder for any investigation into the cause of the slowness. The TODO is likely outdated, since IMPORT has changed considerably since 19.2.

Jira issue: CRDB-8877

adityamaru added the C-bug and A-disaster-recovery labels Jul 27, 2021
adityamaru added a commit to adityamaru/cockroach that referenced this issue Jul 27, 2021
Previously, the roachtest had a timeout of 8h. The test
usually runs in ~7hrs but occasionally tips over the
configured time out. While we investigate the slowness
of this import as tracked in cockroachdb#68117,
we are bumping the timeout to 10h.

Release note: None
craig bot pushed a commit that referenced this issue Jul 27, 2021
67989: cluster-ui: bump package.json version r=celiala a=celiala

In #67722, we updated peerDependencies, but forgot to bump
the version in package.json.

This fast-follow PR bumps the version for #67722.

Note: this has been preemptively backported to 21.1 and 20.2, in #67981 and #67982 respectively.


Release note: None

68088: roachtest: create perf dir for bulk roachtests r=pbardea a=pbardea

When the bulk op roachtests were updated to avoid racing when writing
their stats files, the creation of the perf directory itself was
removed. This adds it back.

There was some consideration to update PutString to create the
filepath.Dir of its destination but that refactor was left for a
potential follow up since it applies to other tests as well.

Fixes #67870.

Release note: None

68118: roachtest: bump import/tpch/nodes=8 timeout to 10h r=tbg,pbardea a=adityamaru

Previously, the roachtest had a timeout of 8h. The test
usually runs in ~7hrs but occasionally tips over the
configured time out. While we investigate the slowness
of this import as tracked in #68117,
we are bumping the timeout to 10h.

Release note: None

68119: rowexec: ask for at least 8MiB in the join reader memory limit r=yuzefovich a=yuzefovich

The join reader doesn't know how to spill to disk, so previously in some
cases (namely, when `distsql_workmem` session variable is low) the
queries would error out. Now this is temporarily fixed by requiring the
memory limit to be at least 8MiB (to accommodate 4MiB scratch input
rows). This shouldn't really matter in the production setting but makes
`tpchvec/disk` roachtest happy.

Fixes: #68036.

Release note: None
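
The floor described in 68119 amounts to clamping the configured limit from below. The sketch below shows that idea only; `minJoinReaderMemLimit` and `effectiveMemLimit` are hypothetical names, and the real join reader code in `rowexec` is not reproduced here.

```go
package main

import "fmt"

// minJoinReaderMemLimit is a hypothetical constant name: 8 MiB, the minimum
// quoted in the commit message.
const minJoinReaderMemLimit int64 = 8 << 20

// effectiveMemLimit returns the configured limit, raised to the 8 MiB floor
// when the configured value (e.g. a low distsql_workmem) is smaller.
// Hypothetical helper, for illustration only.
func effectiveMemLimit(configured int64) int64 {
	if configured < minJoinReaderMemLimit {
		return minJoinReaderMemLimit
	}
	return configured
}

func main() {
	for _, limit := range []int64{1 << 20, 64 << 20} {
		fmt.Printf("configured=%d effective=%d\n", limit, effectiveMemLimit(limit))
	}
}
```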

Co-authored-by: Celia La
Co-authored-by: Paul Bardea
Co-authored-by: Aditya Maru
Co-authored-by: Yahor Yuzefovich
adityamaru added a commit to adityamaru/cockroach that referenced this issue Sep 21, 2021
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Jul 5, 2023
@mgartner mgartner moved this to New Backlog in SQL Queries Jul 24, 2023