[Fleet] added cardinality agg when counting acks in action_status #141651

Merged
merged 5 commits into from
Sep 26, 2022

Conversation

juliaElastic
Contributor

@juliaElastic juliaElastic commented Sep 23, 2022

Summary

Related to #140267

During scalability testing with 75k agents, we encountered an issue where the number of acks for an action was greater than the number of agents actioned, which resulted in the action showing up as in progress.
One agent can ack multiple times if the update of the action result takes a long time and the agent checks in multiple times.
To fix this, this PR adds a cardinality agg on agent id when fetching action results, to make sure we only count one ack per agent.
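
A minimal sketch of the idea, not the code merged in this PR: the first function builds a per-action aggregation like the one in the diff, and the second reproduces in memory what the cardinality agg counts. The field names `action_id`, `agent_id`, `@timestamp` and the agg names `max_timestamp` and `agent_count` come from the diff. The `ack_counts` name, the `ActionResult` type, both function names, and the `precision_threshold` value are assumptions made for illustration.

```typescript
// Hypothetical shape of one action result (ack) document.
interface ActionResult {
  action_id: string;
  agent_id: string;
  '@timestamp': string;
}

interface AckCounts {
  docCount: number; // raw number of ack documents, duplicates included
  agentCount: number; // distinct agents that acked
}

// Builds a terms agg per action with a cardinality sub-agg on agent_id.
// The default precision_threshold of 40000 is an assumed value; Elasticsearch
// counts are approximate above the threshold.
function buildAckAggs(actionCount: number, precisionThreshold: number = 40000) {
  return {
    ack_counts: {
      terms: { field: 'action_id', size: actionCount || 10 },
      aggs: {
        max_timestamp: { max: { field: '@timestamp' } },
        agent_count: {
          cardinality: { field: 'agent_id', precision_threshold: precisionThreshold },
        },
      },
    },
  };
}

// In-memory equivalent: count ack documents and distinct agents per action.
function countAcks(results: ActionResult[]): Record<string, AckCounts> {
  const seen: Record<string, Record<string, boolean>> = Object.create(null);
  const counts: Record<string, AckCounts> = Object.create(null);
  for (const r of results) {
    if (!counts[r.action_id]) {
      counts[r.action_id] = { docCount: 0, agentCount: 0 };
      seen[r.action_id] = Object.create(null);
    }
    counts[r.action_id].docCount += 1;
    if (!seen[r.action_id][r.agent_id]) {
      seen[r.action_id][r.agent_id] = true;
      counts[r.action_id].agentCount += 1;
    }
  }
  return counts;
}

// Agent a1 acks the same action twice, agent a2 acks once.
const sampleAcks: ActionResult[] = [
  { action_id: 'x', agent_id: 'a1', '@timestamp': '2022-09-23T10:00:00Z' },
  { action_id: 'x', agent_id: 'a1', '@timestamp': '2022-09-23T10:00:30Z' },
  { action_id: 'x', agent_id: 'a2', '@timestamp': '2022-09-23T10:01:00Z' },
];
const sampleCounts = countAcks(sampleAcks);
console.log(sampleCounts['x'].docCount, sampleCounts['x'].agentCount);
```

With two agents actioned, the raw document count for action `x` is 3, which exceeds the number of agents, while the distinct agent count is 2 and matches it.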

The issue is hard to reproduce locally with a small agent count, so the fix will have to be tested again with larger clusters on cloud.

cc @joshdover

@juliaElastic juliaElastic added release_note:skip Skip the PR/issue when compiling release notes v8.5.0 labels Sep 23, 2022
@juliaElastic juliaElastic self-assigned this Sep 23, 2022
@juliaElastic juliaElastic requested a review from a team as a code owner September 23, 2022 14:32
@botelastic botelastic bot added the Team:Fleet Team label for Observability Data Collection Fleet team label Sep 23, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@@ -39,6 +39,11 @@ export async function getActionStatuses(
terms: { field: 'action_id', size: actions.length || 10 },
aggs: {
max_timestamp: { max: { field: '@timestamp' } },
agent_count: {
cardinality: {
field: 'agent_id',

@juliaElastic
Contributor Author

@elasticmachine merge upstream

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| fleet | 914.9KB | 915.1KB | +174.0B |

Unknown metric groups

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| fleet | 61 | 62 | +1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| fleet | 69 | 70 | +1 |

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @juliaElastic

@juliaElastic juliaElastic requested review from nchaulet and a team September 26, 2022 10:39
@juliaElastic juliaElastic merged commit adffaa4 into elastic:main Sep 26, 2022
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Sep 26, 2022
…astic#141651)

* added cardinality agg when counting acks in action_status

* added precision_threshold, added tests for activity flyout

* fixed tests

* fixed tests

Co-authored-by: Kibana Machine <[email protected]>
(cherry picked from commit adffaa4)
@kibanamachine
Contributor

💚 All backports created successfully

Status Branch Result
8.5

Note: Successful backport PRs will be merged automatically after passing CI.

Questions?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Sep 26, 2022
…41651) (#141764)

* added cardinality agg when counting acks in action_status

* added precision_threshold, added tests for activity flyout

* fixed tests

* fixed tests

Co-authored-by: Kibana Machine <[email protected]>
(cherry picked from commit adffaa4)

Co-authored-by: Julia Bardi <[email protected]>
Labels
release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team v8.5.0 v8.6.0
5 participants