Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove bench job from extended ci #136470

Closed
msbutler opened this issue Dec 2, 2024 · 1 comment · Fixed by #137049
Closed

remove bench job from extended ci #136470

msbutler opened this issue Dec 2, 2024 · 1 comment · Fixed by #137049
Assignees
Labels
A-ci Continuous Integration branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-dev-inf

Comments

@msbutler
Copy link
Collaborator

msbutler commented Dec 2, 2024

While I haven't polled the org, I don't think db engineers investigate and take action if the bench job fails because you cant see the benchmark results and because it's incredibly flakey. Furthermore, this job implicitly increases a PRs's time to merge because 1) many engineers wait for extended CI to complete before merging and 2) this job is the longest running extended ci job. I don't know the details, but I imagine we're also spending significant money running this job.

Also, @herkolategan, crdb's micro bench infra expert in residence, signed off on this idea here.

Jira issue: CRDB-45088

@msbutler msbutler added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-dev-inf labels Dec 2, 2024
@rickystewart rickystewart self-assigned this Dec 9, 2024
@rickystewart rickystewart added the A-ci Continuous Integration label Dec 9, 2024
craig bot pushed a commit that referenced this issue Dec 9, 2024
136780: stats: use no-op DatumAlloc when decoding EncDatums r=yuzefovich a=yuzefovich

**tree: make nil DatumAlloc act as a no-op**

This commit makes so that `nil` `DatumAlloc` value makes the alloc struct a no-op, i.e. each datum type passed by value results in a separate allocation. This can be useful when the callers will keep only a subset of all allocated datums.

Release note: None

**stats: use no-op DatumAlloc when decoding EncDatums**

This commit fixes a bounded memory leak that could previously occur due to usage of the DatumAlloc when decoding EncDatums into `tree.Datum`s during table stats collection. We have two use cases for decoded `tree.Datum`s:
- we need it when adding _all_ rows into the sketch (we compute the datum's fingerprint ("hash") to use in the distinct count estimation). This usage is very brief and we don't need the decoded datum afterwards.
- we also need it when sampling _some_ rows when we decide to keep a particular row. In this case, the datum is needed throughout the whole stats collection job (or until it's replaced by another sample) for the histogram bucket computation.

The main observation is that the DatumAlloc allocates datums in batches of 16 objects, and even if at least one of the objects is kept by the sample, then all others from the same batch are as well. We only perform memory accounting for the ones we explicitly keep, yet others would be considered live by the Go runtime, resulting in a bounded memory leak. This behavior has been present since forever, but it became more problematic in 24.2 release with the introduction of dynamically computing the sample size. To go around this problematic behavior this commit uses `nil` DatumAlloc which makes it so that every decoded datum incurs a separate allocation. This will have a minor performance hit and increase in total number of allocations, but at least most of them should be short-lived. Alternatively, we could use an "operating" DatumAlloc during fingerprinting and a no-op during the sampling, but we'd need to explicitly nil out the decoded datum after fingerprinting which would result in decoding some datums twice.

Fixes: #136394.

Release note: None

136901: kvserver: deflake TestStoreRangeMergeConcurrentRequests r=iskettaneh a=iskettaneh

Fixes: #136774

Release note: None

136994: kvserver: deflake TestLeaseRequestFromExpirationToEpochOrLeaderDoesNotRegressExpiration r=iskettaneh a=iskettaneh

This commit deflakes
TestLeaseRequestFromExpirationToEpochOrLeaderDoesNotRegressExpiration by establishing the node liveness record before pausing node liveness heartbeats.

Fixes: #136549

Release Note: None

137039: sql/tests: add `ANALYZE` to sysbench microbenchmark r=mgartner a=mgartner

Collecting table statistics manually during the benchmark setup makes
the performance more representative of a real-world workload without the
variance in results that automatic stats collection would introduce.

Epic: None

Release note: None


137049: teamcity: remove `bench` scripts r=jlinder a=rickystewart

These jobs are very flaky and extremely time-consuming at about an hour per run. Simply delete the scripts.

Closes: #136470
Epic: none
Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Ibrahim Kettaneh <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
@craig craig bot closed this as completed in 42e1e79 Dec 9, 2024
Copy link

blathers-crl bot commented Dec 9, 2024

Based on the specified backports for linked PR #137049, I applied the following new label(s) to this issue: branch-release-23.1, branch-release-23.2, branch-release-24.1, branch-release-24.2, branch-release-24.3. Please adjust the labels as needed to match the branches actually affected by this issue, including adding any known older branches.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 labels Dec 9, 2024
blathers-crl bot pushed a commit that referenced this issue Dec 9, 2024
These jobs are very flaky and extremely time-consuming at about an hour
per run. Simply delete the scripts.

Closes: #136470
Epic: none
Release note: None
blathers-crl bot pushed a commit that referenced this issue Dec 9, 2024
These jobs are very flaky and extremely time-consuming at about an hour
per run. Simply delete the scripts.

Closes: #136470
Epic: none
Release note: None
blathers-crl bot pushed a commit that referenced this issue Dec 9, 2024
These jobs are very flaky and extremely time-consuming at about an hour
per run. Simply delete the scripts.

Closes: #136470
Epic: none
Release note: None
blathers-crl bot pushed a commit that referenced this issue Dec 9, 2024
These jobs are very flaky and extremely time-consuming at about an hour
per run. Simply delete the scripts.

Closes: #136470
Epic: none
Release note: None
blathers-crl bot pushed a commit that referenced this issue Dec 9, 2024
These jobs are very flaky and extremely time-consuming at about an hour
per run. Simply delete the scripts.

Closes: #136470
Epic: none
Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ci Continuous Integration branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 branch-release-24.1 Used to mark GA and release blockers, technical advisories, and bugs for 24.1 branch-release-24.2 Used to mark GA and release blockers, technical advisories, and bugs for 24.2 branch-release-24.3 Used to mark GA and release blockers, technical advisories, and bugs for 24.3 C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-dev-inf
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants