[Fleet] using point in time for agent status query to avoid discrepancy #135816

Merged
juliaElastic merged 5 commits into elastic:main on Jul 8, 2022

Conversation

juliaElastic (Contributor) commented Jul 6, 2022

Summary

Fixes #134798

  • Query agent statuses within a point in time (PIT), so that every status count is computed against the same view of the index and the counts no longer disagree with each other.
  • Merged the getAgentsByKuery and getAgentsByKueryPit functions into one.
  • Fixed a small bug in the inactive query: the inactive count was not returned correctly with the default showInactive: false parameter because an unnecessary active:true filter was added.
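
As a rough illustration of the first bullet, the sketch below runs several counts against one point in time and always closes it afterwards. The `PitClient` interface, `getStatusCounts`, and the in-memory fake are hypothetical stand-ins written for this example; they are not the Fleet code or the Elasticsearch client API, and only the open, query, close-in-finally shape is the point.

```typescript
type AgentStatus = 'online' | 'offline';
interface Agent { status: AgentStatus; }

// Hypothetical minimal client, NOT the real Elasticsearch client API.
interface PitClient {
  openPointInTime(index: string, keepAlive: string): Promise<string>;
  count(pitId: string, status: AgentStatus | 'any'): Promise<number>;
  closePointInTime(pitId: string): Promise<void>;
}

interface StatusCounts { total: number; online: number; offline: number; }

// Run every count against the same PIT so they all see one consistent view.
async function getStatusCounts(client: PitClient, index: string): Promise<StatusCounts> {
  const pitId = await client.openPointInTime(index, '1m');
  try {
    const [total, online, offline] = await Promise.all([
      client.count(pitId, 'any'),
      client.count(pitId, 'online'),
      client.count(pitId, 'offline'),
    ]);
    return { total, online, offline };
  } finally {
    // Always release the PIT, even if one of the queries throws.
    await client.closePointInTime(pitId);
  }
}

// In-memory fake for demonstration: opening a PIT freezes a copy of the data,
// and every count call flips the live agents to offline to simulate writes
// that happen while the status queries are running.
function createFakeClient(live: Agent[]) {
  const snapshots: Record<string, Agent[]> = {};
  let opened = 0;
  let closed = 0;
  const client: PitClient = {
    async openPointInTime() {
      opened += 1;
      const id = `pit-${opened}`;
      snapshots[id] = live.map((a) => ({ status: a.status }));
      return id;
    },
    async count(pitId, status) {
      const snapshot = snapshots[pitId] || [];
      live.forEach((a) => { a.status = 'offline'; });
      return status === 'any'
        ? snapshot.length
        : snapshot.filter((a) => a.status === status).length;
    },
    async closePointInTime(pitId) {
      delete snapshots[pitId];
      closed += 1;
    },
  };
  return { client, closedCount: () => closed };
}
```

With the fake, the live data changes between the individual counts, yet the counts still add up because they all read the frozen copy.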

Tested by:

  • Enroll 3k agents with horde.
  • Halt horde.
  • Wait 5 minutes so that agents start moving to offline.
  • Around the 5-minute mark, keep running the status query in Console: GET kbn:/api/fleet/agent_status
  • There should be no discrepancy where online + offline > total. I verified this by logging a message whenever that condition occurred.
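
The check in the last step could be sketched roughly as follows. `findDiscrepancy` and the `AgentStatusCounts` type are hypothetical names for this example; the field names mirror the total, online, and offline counts discussed in this PR.

```typescript
// Subset of the agent status counts that the check needs.
interface AgentStatusCounts {
  total: number;
  online: number;
  offline: number;
}

// Returns a log message when online + offline exceeds total, otherwise null.
function findDiscrepancy(counts: AgentStatusCounts): string | null {
  const sum = counts.online + counts.offline;
  return sum > counts.total
    ? `online + offline (${sum}) > total (${counts.total})`
    : null;
}
```

Running this against each response while agents move to offline gives a simple way to spot an inconsistent reading.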

@juliaElastic juliaElastic requested a review from a team as a code owner July 6, 2022 13:36
@juliaElastic juliaElastic self-assigned this Jul 6, 2022
@botelastic botelastic bot added the Team:Fleet label (Team label for Observability Data Collection Fleet team) Jul 6, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

kuery: joinKuerys(
  ...[
    kuery,
    filterKuery,
    `${AGENTS_PREFIX}.attributes.active:true`,
Contributor Author

Adding the active filter here is incorrect, because getAgentsByKuery already adds it based on showInactive:

if (showInactive === false) {
  filters.push(ACTIVE_AGENT_CONDITION);
}
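
To make the point concrete, here is a minimal sketch of a helper that owns the active filter, so callers only pass showInactive and never add the condition themselves. The constant values, the body of `joinKuerys`, and the `buildAgentKuery` name are assumptions for illustration; the real definitions live in the Fleet plugin and may differ.

```typescript
// Assumed values for illustration only.
const AGENTS_PREFIX = 'fleet-agents';
const ACTIVE_AGENT_CONDITION = `${AGENTS_PREFIX}.attributes.active:true`;

// Stand-in implementation: joins the non-empty kuery fragments with "and".
function joinKuerys(...kuerys: Array<string | undefined>): string {
  return kuerys
    .filter((k): k is string => typeof k === 'string' && k.length > 0)
    .map((k) => `(${k})`)
    .join(' and ');
}

// Hypothetical helper: the active condition is added in exactly one place.
function buildAgentKuery(kuery: string | undefined, showInactive = false): string {
  const filters: Array<string | undefined> = [kuery];
  if (showInactive === false) {
    filters.push(ACTIVE_AGENT_CONDITION);
  }
  return joinKuerys(...filters);
}
```

If a caller also appended the active condition to its own kuery, a query that passes showInactive: true would still be restricted to active agents, which matches the kind of miscount described in the summary.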

@jlind23 jlind23 requested a review from joshdover July 7, 2022 08:32
await closePointInTime(esClient, pitId);
}

const result = {
  total: allActive.total,
Contributor Author

I noticed that total doesn't include inactive agents. Is that intentional? It can be a little confusing, for example:

{
  "results": {
    "total": 981,
    "inactive": 20,
    "online": 981,
    "error": 0,
    "offline": 0,
    "updating": 0,
    "other": 20,
    "events": 0
  }
}

Contributor

Inactive agents are the ones that have been unenrolled, so they're generally excluded from the status because they're not relevant most of the time. We should probably have two fields, one for the real total and one for the total active, but I think we should only do this in a non-breaking way. Maybe total (all active, deprecated), all (all active and inactive), and active (all active).
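
One possible reading of this proposal, sketched as a type plus a small helper. `AgentStatusSummary` and `summarize` are hypothetical names for this example, not an agreed API; the field semantics follow the comment above.

```typescript
interface AgentStatusSummary {
  /** @deprecated Same value as `active`; kept so existing consumers don't break. */
  total: number;
  /** All agents, active and inactive. */
  all: number;
  /** Active agents only. */
  active: number;
  /** Inactive (unenrolled) agents only. */
  inactive: number;
}

// Builds the summary from the two counts the status query already produces.
function summarize(activeCount: number, inactiveCount: number): AgentStatusSummary {
  return {
    total: activeCount,
    all: activeCount + inactiveCount,
    active: activeCount,
    inactive: inactiveCount,
  };
}
```

With the counts from the example response (981 active, 20 inactive), total and active would both be 981 and all would be 1001.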

Contributor Author

I agree, this would be cleaner.
Do you happen to know whether the other status is used anywhere? It might be good to deprecate it as well, since it seems confusing. It is not mentioned in the UI or docs either.

I agree it is confusing. total seems to mean total active, which you wouldn't assume from the name. My vote would be to fix this.

Contributor Author

Raised an issue to deprecate total status: #135980

@@ -54,6 +54,7 @@ export const TableRowActions: React.FunctionComponent<{
  onClick={(event) => {
    onAddRemoveTagsClick((event.target as Element).closest('button')!);
  }}
  disabled={!agent.active}
Contributor Author

Small change to disable the Add / remove tags action on inactive agents, consistent with the other actions, which are already disabled for them.

@juliaElastic
Contributor Author

@elasticmachine merge upstream

@kibana-ci
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id     before   after    diff
fleet  842.7KB  842.7KB  +19.0B

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @juliaElastic

@juliaElastic juliaElastic merged commit 943c366 into elastic:main Jul 8, 2022
@kibanamachine kibanamachine added the backport:skip label (This commit does not require backporting) Jul 8, 2022
@joshdover
Contributor

@juliaElastic should we backport this to 8.3?

Labels
backport:skip (This commit does not require backporting), release_note:fix, Team:Fleet (Team label for Observability Data Collection Fleet team), v8.4.0
Development

Successfully merging this pull request may close these issues.

/api/fleet/agent_status provides incorrect counts
7 participants