[Engineer] On call Primary and Secondary for Sprint 1 #3439

Closed
BerniXiongA6 opened this issue Sep 10, 2024 · 1 comment

Comments

BerniXiongA6 commented Sep 10, 2024

Responsibilities for the primary on-call engineer:

  • Monitor the #benefits-vro-alerts channel for alerts triggered through Datadog. For any incident that impacts partner team applications, follow the issue triage procedure below.
  • Monitor #benefits-vro-support for potential incidents
  • Monitor SecRel
  • Monitor Dependabot
  • Give Berni the MTTR to post in the Sprint Review Deck
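
For the MTTR item above, a minimal sketch of one way to compute the number, assuming MTTR here means mean time to resolve and that each incident is recorded as an (opened, resolved) timestamp pair. The function name and the sample incidents are hypothetical and only illustrate the arithmetic; they are not taken from the team's tooling.

```python
from datetime import datetime, timedelta


def mean_time_to_resolve(incidents):
    """Average (resolved - opened) duration, or None if there were no incidents."""
    durations = [resolved - opened for opened, resolved in incidents]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Hypothetical incidents for illustration only: (opened, resolved)
incidents = [
    (datetime(2024, 9, 11, 9, 0), datetime(2024, 9, 11, 10, 30)),   # 90 minutes
    (datetime(2024, 9, 12, 14, 0), datetime(2024, 9, 12, 14, 30)),  # 30 minutes
]

print(mean_time_to_resolve(incidents))  # 1:00:00
```

If the team measures from alert acknowledgement rather than from when the alert fired, the first timestamp in each pair would change accordingly.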

Issue Triage Procedure

Upon receiving a notification, promptly evaluate the severity of the incident and perform triage accordingly.
Collect pertinent information related to the triggered alert(s), with a focus on communicating the impact and, if possible, identifying the root cause. Notify all relevant parties, including LHDI or partner teams, about the observed behavior, and create a corresponding ticket for the issue.
If the issue is considered straightforward to fix, proceed to address it. Notify the team and bring a user story into the current sprint to represent the work.
For issues deemed complex and requiring more discussion, create a ticket and collaborate with the PM to prioritize it effectively.
Maintain transparent and frequent communication with the team and partners through the support channel, especially if the issues hinder their ability to deploy or use applications appropriately.
Document the findings and any issues created in a wiki page, listed on the wiki homepage under the heading "Partner Teams", subheading "Partner Team Incident Reports".

See also: wiki page for Incident Response.

Secondary responsibilities

Remain accessible to the primary for assistance as required, and concentrate on addressing smaller tickets or collaborating on larger ones during the Sprint.

BerniXiongA6 assigned chengjie8 and Ponnia-M and unassigned chengjie8 and Ponnia-M on Sep 10, 2024
BerniXiongA6 changed the title from "Copy of On call Primary and Secondary" to "[Engineer] On call Primary and Secondary for Sprint 1" on Sep 10, 2024
BerniXiongA6 (Author) commented

Closing this ticket since it's the end of the sprint. @PaulKBaumann, please let us know at sprint planning whether any new tickets came out of pager duty this past sprint that need to be handled in Sprint 2. cc: @lisac @meganhicks
