Add telemetry job #1448
Merged
sjberman
reviewed
Jan 5, 2024
pleshakov
commented
Jan 8, 2024
kate-osborn
reviewed
Jan 8, 2024
Just a couple of questions, but it looks good to me!
sjberman
approved these changes
Jan 9, 2024
bjee19
reviewed
Jan 9, 2024
Problem: We want a telemetry job that reports product telemetry every 24h. For now, the telemetry data is empty and the report is sent to the debug log.

Solution:
- Refactor leader election to use controller-runtime manager capabilities. This simplifies the existing code and makes it easier to add a telemetry Job.
- Add a telemetry Job that periodically reports empty telemetry to the debug log.
- Make the period configurable at build time via the TELEMETRY_REPORT_PERIOD Makefile variable.

Note: the leader elector refactoring changes the behavior of the NGF process when leadership is lost:
- Before: the Manager would shut down, waiting for the runnables to exit.
- After: the Manager doesn't wait. This is similar to the NGF process panicking.
This should be OK, as the NGF container will restart and recover any potentially broken state (update statuses that were not fully populated, restore the correct NGINX configuration).

Testing:
- Unit tests
- Manual testing:
  - Ensure leader election works as expected: both leader and non-leader pods run successfully.
  - Ensure the NGF container exits when it stops being the leader.
  - Ensure an upgrade from Release 1.1.0 is successful for leader election: the leader gets elected among the new pods.
  - Ensure the telemetry Job reports telemetry multiple times, using a small value of TELEMETRY_REPORT_PERIOD.

CLOSES nginx#1382
Co-authored-by: Saylor Berman <[email protected]>
pleshakov
force-pushed the feature/telemetry-job branch from 5c5fcf1 to 06832aa
January 10, 2024 15:37
bjee19
approved these changes
Jan 10, 2024
kate-osborn
approved these changes
Jan 10, 2024