disconnected clients: Observability plumbing #12141

DerekStrickland

@DerekStrickland DerekStrickland commented Feb 28, 2022

NOTE TO REVIEWER: Converting this back to draft, as it makes sense to include the TaskGroupSummary changes in this batch of commits.

This PR contains three changes.

  • It adds disconnect and reconnect counts to the GoString implementation for reconcileResults, so that the sizes of
    these maps are included in log output.
  • It adds a call to emitRescheduleInfo in createTimeoutLaterEvals, so that the emitted metrics include reschedule information for disconnects.
  • It adds an Unknown field to the TaskGroupSummary struct and includes that value in all related processing and metrics.

@DerekStrickland DerekStrickland added this to the 1.3.0 milestone Feb 28, 2022
@DerekStrickland DerekStrickland self-assigned this Feb 28, 2022
@DerekStrickland DerekStrickland marked this pull request as ready for review February 28, 2022 11:32
@DerekStrickland DerekStrickland marked this pull request as draft February 28, 2022 11:55
@DerekStrickland DerekStrickland marked this pull request as ready for review February 28, 2022 17:07
@DerekStrickland DerekStrickland changed the title from "disconnected clients: Observability improvements" to "disconnected clients: Observability plumbing" Feb 28, 2022

@tgross tgross left a comment

LGTM!

Comment on lines +148 to +149
base := fmt.Sprintf("Total changes: (place %d) (destructive %d) (inplace %d) (stop %d) (disconnect %d) (reconnect %d)",
len(r.place), len(r.destructiveUpdate), len(r.inplaceUpdate), len(r.stop), len(r.disconnectUpdates), len(r.reconnectUpdates))

Possibly nitpicky: because this report is for "Total changes", are disconnect and reconnect the right names for these changes? That reads to me like the scheduler is asking for the allocs to be disconnected or reconnected. (Might be a good question to throw to the team for bikeshedding 😀 )

@DerekStrickland DerekStrickland merged commit a497f12 into f-disconnected-client-allocation-handling Mar 4, 2022
@DerekStrickland DerekStrickland deleted the f-observability-disconnected-clients branch March 4, 2022 16:04
DerekStrickland added a commit that referenced this pull request Apr 5, 2022
* Add disconnects/reconnect to log output and emit reschedule metrics

* TaskGroupSummary: Add Unknown, update StateStore logic, add to metrics
@github-actions

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 27, 2022