Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disconnected clients: ensure servers meet minimum required version #12202

Conversation

DerekStrickland
Copy link
Contributor

@DerekStrickland DerekStrickland commented Mar 7, 2022

Note for Reviewer: This is a pragmatic compromise that might need discussion.

  • The CoreScheduler has a direct reference to the nomad.Server and thus its members and is responsible for
    performing server version checks for core jobs.
  • Initially, I tried exposing this functionality via the Scheduler interface, however, the GenericScheduler cannot have a
    direct reference to the nomad.Server because it is in the scheduler package and that creates an import cycle.
  • The Worker fulfills the Planner interface, is made available to the GenericScheduler by that interface, and also
    has a reference to the nomad.Server. So the Planner interface was extended instead.

Overview

This PR contains the logic necessary to ensure clusters running mixed version servers do not enable the max_client_disconnect feature until all servers meet the minimum required version.

  • bf71504
    • Exposes the ServersMeetMinimumVersion utility function via the Planner interface.
  • 6e60a2c
    • Modifies filterByTained to include a boolean flag indicating whether the caller supports disconnected clients.
    • If not, the disconnected and reconnecting allocSets are never populated.
    • Effectively short-circuits disconnect/reconnect logic from ever triggering in the reconciler.
  • b41401b
    • Adds the version check logic in the GenericScheduler
    • Adds a field to the allocReconciler.
    • ModifiesNewAllocReconciler to accept a parameter to set the field.
    • Plumbs it through to all calls tofilterByTainted.
  • cfc5f49
    • Updates affected/dependent tests

@vercel vercel bot temporarily deployed to Preview – nomad March 7, 2022 12:16 Inactive
@DerekStrickland DerekStrickland changed the title [WIP] server guard disconnected clients [WIP] disconnected clients: ensure servers meet minimum required version Mar 7, 2022
@DerekStrickland DerekStrickland changed the title [WIP] disconnected clients: ensure servers meet minimum required version disconnected clients: ensure servers meet minimum required version Mar 7, 2022
@DerekStrickland DerekStrickland marked this pull request as ready for review March 7, 2022 12:58
@DerekStrickland DerekStrickland requested a review from tgross March 7, 2022 13:03
Copy link
Member

@tgross tgross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once my comment about the tests is addressed.

scheduler/generic_sched.go Show resolved Hide resolved
scheduler/reconcile_test.go Outdated Show resolved Hide resolved
scheduler/reconcile_util.go Outdated Show resolved Hide resolved
@vercel vercel bot temporarily deployed to Preview – nomad March 7, 2022 18:03 Inactive
@DerekStrickland DerekStrickland merged this pull request into f-disconnected-client-allocation-handling Mar 7, 2022
@DerekStrickland DerekStrickland deleted the f-server-guard-disconnected-clients branch March 7, 2022 18:40
DerekStrickland added a commit that referenced this pull request Mar 7, 2022
…12202)

* planner: expose ServerMeetsMinimumVersion via Planner interface
* filterByTainted: add flag indicating disconnect support
* allocReconciler: accept and pass disconnect support flag
* tests: update dependent tests
DerekStrickland added a commit that referenced this pull request Apr 5, 2022
…12202)

* planner: expose ServerMeetsMinimumVersion via Planner interface
* filterByTainted: add flag indicating disconnect support
* allocReconciler: accept and pass disconnect support flag
* tests: update dependent tests
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 26, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants