Agent remains in the UPG_SCHEDULED state past the scheduled action start time #3817

Closed
cmacknz opened this issue Nov 24, 2023 · 2 comments · Fixed by #3902

@cmacknz
Member

cmacknz commented Nov 24, 2023

When an agent upgrade is scheduled, the agent remains in the UPG_SCHEDULED state past the scheduled start time.

~/Downloads/elastic-agent-8.3.0-SNAPSHOT-darwin-aarch64 ······························ 02:17:09 PM
❯ sudo elastic-agent status
┌─ fleet
│  └─ status: (HEALTHY) Connected
├─ elastic-agent
│  └─ status: (HEALTHY) Running
└─ upgrade_details
   ├─ target_version: 8.11.1
   ├─ state: UPG_SCHEDULED
   ├─ action_id: 4ecb0512-8f59-49a8-b27a-00acba792c6c
   └─ metadata
      └─ scheduled_at: 2023-11-24T19:15:00Z

Note the scheduled start time of 2023-11-24T19:15:00Z. The prompt shows my local time as 02:17:09 PM EST, which is 2023-11-24T19:17:09Z, about 2 minutes past when the upgrade should have started.

The current implementation takes the upgrade's scheduled time from the start time of the next upgrade action held by the dispatcher, so this may be an artifact of the dispatcher implementation:

upgradeDetails := details.NewDetails(nextUpgrade.Version, details.StateScheduled, nextUpgrade.ID())
startTime, _ := nextUpgrade.StartTime()
upgradeDetails.Metadata.ScheduledAt = &startTime

@cmacknz cmacknz added the Team:Elastic-Agent Label for the Agent team label Nov 24, 2023
@elasticmachine
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@cmacknz
Member Author

cmacknz commented Dec 12, 2023

Adding notes from a Slack discussion on how to handle expired upgrades:

It would be nice to know if this happened. I think moving to UPG_FAILED might be best so that it shows up in telemetry; every case I know of where an upgrade action expired was treated as a bug or a surprise by the user. The other options are to stay in the UPG_SCHEDULED state with the details indicating that the action expired, or to introduce a new state just for this. I think a new state is probably too much work at this point, since you'd have to change Fleet as well.

I currently think we should treat expired upgrades as failed upgrades to make sure they are flagged. This would have allowed us to more easily detect and fix elastic/kibana#170322.
