
chore(NA): rebalance x-pack cigroups #85797

Merged
merged 20 commits into elastic:master on Dec 16, 2020

Conversation

@mistic (Member) commented Dec 14, 2020

After #85191 our CI is again over 2h20m and has been very unstable, hitting the overall timeout a couple of times, which is generating a lot of failures. This change rebalances the ciGroups by introducing a new ciGroup12, bringing CI back to around 1h40m. I also expect CI to become more stable again once this change is merged.
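For context on the mechanism being rebalanced: in Kibana's functional test runner, a suite is assigned to a ciGroup by tagging it in the suite's index file, and CI runs each tag as its own parallel job, so introducing ciGroup12 amounts to moving some of these tags around. A minimal sketch, assuming a hypothetical suite (the file path, suite name, and child test files below are illustrative, not taken from this PR's diff):

```ts
// x-pack/test/some_api_integration/tests/index.ts -- hypothetical path for illustration
import { FtrProviderContext } from '../ftr_provider_context';

export default function ({ loadTestFile }: FtrProviderContext) {
  // A regular function (not an arrow function) so `this` refers to the mocha suite.
  describe('some api integration', function () {
    // Assign the whole suite to the new group; CI runs each ciGroup tag in parallel.
    this.tags('ciGroup12');

    loadTestFile(require.resolve('./feature_one'));
    loadTestFile(require.resolve('./feature_two'));
  });
}
```

The pipeline then selects suites per group via the functional test runner's include-tag filtering, which is why a rebalance like this one is mostly a matter of moving tags between groups rather than moving test code.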

@mistic (Member, Author) commented Dec 14, 2020

@elasticmachine merge upstream

@mistic changed the title from "chore(NA): rebalance some ciGroup11 into ciGroup4" to "chore(NA): rebalance x-pack cigroups by introducing a new one" Dec 15, 2020
@mistic self-assigned this Dec 15, 2020
@mistic added labels chore, Feature:CI (Continuous integration), release_note:skip (Skip the PR/issue when compiling release notes), Team:Operations (Team label for Operations Team), v7.11.0, v8.0.0 Dec 15, 2020
@mistic marked this pull request as ready for review December 15, 2020 05:03
@mistic requested review from a team as code owners December 15, 2020 05:03
@elasticmachine (Contributor) commented:

Pinging @elastic/kibana-operations (Team:Operations)

@tylersmalley (Contributor) commented:

Current times on master:

| name | time |
| --- | --- |
| ciGroup1 | 6:24 |
| ciGroup2 | 0:20 |
| ciGroup3 | 26 |
| ciGroup4 | 0:23 |
| ciGroup5 | 0:15 |
| ciGroup6 | 26 |
| ciGroup7 | 0:14 |
| ciGroup8 | 0:16 |
| ciGroup9 | 0:23 |
| ciGroup10 | 0:16 |
| ciGroup11 | 0:25 |
| ciGroup12 | 26 |
| xpack-ciGroup1 | 0:57 |
| xpack-ciGroup2 | 1:07 |
| xpack-ciGroup3 | 1:00 |
| xpack-ciGroup4 | 0:58 |
| xpack-ciGroup5 | 0:57 |
| xpack-ciGroup6 | 0:55 |
| xpack-ciGroup7 | 0:53 |
| xpack-ciGroup8 | 0:51 |
| xpack-ciGroup9 | 0:58 |
| xpack-ciGroup10 | 1:04 |
| xpack-ciGroup11 | 1:36 |

@brianseeders raised a concern last time about limited resources when adding another CI group. If that is the case, it seems like we could drastically decrease the number of OSS ciGroups to compensate.

@spong (Member) left a comment:

LGTM -- thank you for all your efforts here in keeping the build times down @mistic! 🙂

@dmlemeshko (Member) left a comment:

LGTM

@FrankHassanabad (Contributor) left a comment:

LGTM

@mistic requested a review from a team as a code owner December 15, 2020 14:52
@mistic (Member, Author) commented Dec 15, 2020

@elasticmachine merge upstream

@legrego (Member) left a comment:

Group change in x-pack/test/encrypted_saved_objects_api_integration/tests/index.ts LGTM! Thanks for rebalancing!

@brianseeders (Contributor) commented:

> @brianseeders raised a concern last time about limited resources when adding another CI group. If that is the case, it seems like we could drastically decrease the number of OSS ciGroups to compensate.

This is primarily a problem on jobs that don't use the tasks framework (ES snapshots and code coverage, for example). In those jobs, OSS and x-pack ciGroups run on different machines, so reducing the number of OSS groups won't do anything for x-pack. We could make the machines a little bigger for those jobs if we need to.

@mistic requested a review from a team as a code owner December 16, 2020 02:14
@tylersmalley (Contributor) commented:

@mistic why did you add two new groups? I see ciGroup12 taking 38 minutes and ciGroup13 taking 25. Together that would be roughly the 1 hour average for the other ciGroups.

@mistic (Member, Author) commented Dec 16, 2020

@tylersmalley the best I could do with only one added ciGroup was getting CI to 1h57m on https://kibana-ci.elastic.co/job/elastic+kibana+pipeline-pull-request/94337/, and that would leave only a little room for new tests to be added. I think the best thing to do here is to add those two new groups until we find a different fix for those time-consuming tests.

@ymao1 (Contributor) left a comment:

Alerting changes LGTM

@tylersmalley merged commit 1e3a483 into elastic:master Dec 16, 2020
tylersmalley pushed a commit to tylersmalley/kibana that referenced this pull request Dec 16, 2020
# Conflicts:
#	vars/kibanaCoverage.groovy
@tylersmalley changed the title from "chore(NA): rebalance x-pack cigroups by introducing a new one" to "by introducing a new one" Dec 16, 2020
@tylersmalley changed the title from "by introducing a new one" to "chore(NA): rebalance x-pack cigroups" Dec 16, 2020
@kibanamachine (Contributor) commented:

💚 Build Succeeded

Metrics [docs]

Distributable file count

| id | before | after | diff |
| --- | --- | --- | --- |
| default | 47268 | 48028 | +760 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

tylersmalley pushed a commit that referenced this pull request Dec 16, 2020
# Conflicts:
#	vars/kibanaCoverage.groovy

Co-authored-by: Tiago Costa <[email protected]>
tylersmalley pushed a commit that referenced this pull request Dec 16, 2020
tylersmalley pushed a commit that referenced this pull request Dec 16, 2020
@tylersmalley (Contributor) commented:

Reverted. This led to us hitting the memory ceiling on the ES verification job. I increased those instances in #86192, which resulted in getting "message": "index [.kibana_1] blocked by: [TOO_MANY_REQUESTS/12/disk usage exceeded flood-stage watermark, index has read-only-allow-delete block];: cluster_block_exception" even though I am only seeing ~3% disk utilization. Will take another swing at this tomorrow.

master: 6671cf3
7.x: af5b7af
