Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add /raftz monitoring endpoint #5530

Merged
merged 1 commit into from
Jun 17, 2024
Merged

Add /raftz monitoring endpoint #5530

merged 1 commit into from
Jun 17, 2024

Conversation

neilalexander
Copy link
Member

@neilalexander neilalexander commented Jun 13, 2024

We often find ourselves without good visibility into what's really going on in the Raft layer. This endpoint dumps quite a substantial amount of internal Raft node state.

Filters include:

  • ?acc=ACCNAME to filter by account (defaults to system account if not specified)
  • ?group=GROUP to show only specific groups

Signed-off-by: Neil Twigg [email protected]

@neilalexander neilalexander requested a review from a team as a code owner June 13, 2024 14:18
@derekcollison
Copy link
Member

This 2.11 correct?

@neilalexander
Copy link
Member Author

I didn't have a specific version in mind, whichever you think it's best.

@bruth
Copy link
Member

bruth commented Jun 13, 2024

@neilalexander how expensive is this operation for all accounts?

@@ -3006,6 +3007,8 @@ func (s *Server) startMonitoring(secure bool) error {
mux.HandleFunc(s.basePath(HealthzPath), s.HandleHealthz)
// IPQueuesz
mux.HandleFunc(s.basePath(IPQueuesPath), s.HandleIPQueuesz)
// Raftz
mux.HandleFunc(s.basePath(RaftzPath), s.HandleRaftz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not making this available in the system account?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason it can't be, just haven't done it yet. Don't know if it's better to keep it HTTP only until we're definitely happy with the shape of the request parameters & response.

@neilalexander
Copy link
Member Author

@bruth The cost scales linearly with the number of HA assets but the PR is careful to only hold read locks for one Raft group at a time, so it shouldn't result in any global pauses.

I could just make it default to $SYS if no account is specified (which will probably only show the _meta_ group), which would prevent looking up all accounts and instead force you to request accounts specifically?

@derekcollison
Copy link
Member

Make $SYS only..

We often find ourselves without good visibility into what's really going
on in the Raft layer. This endpoint dumps quite a substantial amount of
internal Raft node state.

Filters include:

* `?acc=ACCNAME` to filter by account (defaults to system account if not specified)
* `?group=GROUP` to show only specific groups

Signed-off-by: Neil Twigg <[email protected]>
@neilalexander
Copy link
Member Author

Done, defaults to the system account now if no ?acc=.

@derekcollison
Copy link
Member

Sorry I might have mis-spoke, I assume this is only accessible via the system account, which is probably the case. And by default you gather all HAAsset data, but you can scope to only a certain accounts assets yes?

@neilalexander
Copy link
Member Author

At the moment it's only on the monitoring HTTP port and not accessible via $SYS or anywhere else. Would you prefer it to be via $SYS for now and not via HTTP?

For the filtering, previously if you didn't specify an ?acc= then it would list all HA assets across all accounts. Now it doesn't, it defaults to showing HA assets in the system account only if no account is specified, in case enumerating the entire system is expensive.

Copy link
Member

@derekcollison derekcollison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@derekcollison derekcollison merged commit 9d0b227 into main Jun 17, 2024
4 checks passed
@derekcollison derekcollison deleted the neil/raftz branch June 17, 2024 02:56
neilalexander added a commit that referenced this pull request Jun 17, 2024
We often find ourselves without good visibility into what's really going
on in the Raft layer. This endpoint dumps quite a substantial amount of
internal Raft node state.

Filters include:

* `?acc=ACCNAME` to filter by account (defaults to system account if not
specified)
* `?group=GROUP` to show only specific groups

Signed-off-by: Neil Twigg <[email protected]>

Signed-off-by: Neil Twigg <[email protected]>
neilalexander added a commit that referenced this pull request Jun 17, 2024
We often find ourselves without good visibility into what's really going
on in the Raft layer. This endpoint dumps quite a substantial amount of
internal Raft node state.

Filters include:

* `?acc=ACCNAME` to filter by account (defaults to system account if not
specified)
* `?group=GROUP` to show only specific groups

Signed-off-by: Neil Twigg <[email protected]>

Signed-off-by: Neil Twigg <[email protected]>
wallyqs added a commit that referenced this pull request Jun 17, 2024
Includes the following:

* #5524
* #5528
* #5533
* #5535
* #5538
* #5543
* #5546
* #5545
* #5547
* #5548
* #5530 (**BETA**)
* #5549

The following PRs were **NOT** included as they were later reverted:

* #5532

Signed-off-by: Neil Twigg <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants