Agent remains in the UPG_SCHEDULED state past the scheduled action start time #3817

cmacknz · 2023-11-24T20:31:57Z

When an agent upgrade is scheduled, the agent remains in the UPG_SCHEDULED state past the scheduled start time.

~/Downloads/elastic-agent-8.3.0-SNAPSHOT-darwin-aarch64 ······························ 02:17:09 PM
❯ sudo elastic-agent status
┌─ fleet
│  └─ status: (HEALTHY) Connected
├─ elastic-agent
│  └─ status: (HEALTHY) Running
└─ upgrade_details
   ├─ target_version: 8.11.1
   ├─ state: UPG_SCHEDULED
   ├─ action_id: 4ecb0512-8f59-49a8-b27a-00acba792c6c
   └─ metadata
      └─ scheduled_at: 2023-11-24T19:15:00Z

Note that the scheduled start time is 2023-11-24T19:15:00Z, but the prompt shows my local time as 02:17:09 PM EST, which is 2023-11-24T19:17:09Z. That is 2 minutes past when the upgrade should have started.

The current implementation sets the start time of the upgrade to the start time of the dispatcher, so this may be an artifact of the dispatcher implementation.

elastic-agent/internal/pkg/agent/application/dispatcher/dispatcher.go

Lines 322 to 324 in 97e8217

    
    upgradeDetails := details.NewDetails(nextUpgrade.Version, details.StateScheduled, nextUpgrade.ID())
    startTime, _ := nextUpgrade.StartTime()
    upgradeDetails.Metadata.ScheduledAt = &startTime


elasticmachine · 2023-11-24T20:31:59Z

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

cmacknz · 2023-12-12T15:04:10Z

Adding notes from a Slack discussion on how to handle expired upgrades:

It would be nice to know if this happened. I think moving to UPG_FAILED might be best so that it shows up in telemetry; every case I know of where an upgrade action expired was treated as a bug or a surprise by the user. The other options are to stay in the UPG_SCHEDULED state with the details indicating that the action expired, or to introduce a new state just for this. I think a new state is probably too much work at this point, since you’d have to change Fleet as well.

I currently think we should treat expired upgrades as failed upgrades to make sure they are flagged. This would have allowed us to more easily detect and fix elastic/kibana#170322.

cmacknz added the Team:Elastic-Agent label Nov 24, 2023

cmacknz assigned ycombinator Nov 24, 2023

blakerouse assigned AndersonQ and unassigned ycombinator Nov 27, 2023

AndersonQ mentioned this issue Dec 13, 2023

Clean up UPG_SCHEDULED state when action expires #3902

Merged

4 tasks

AndersonQ closed this as completed in #3902 Dec 28, 2023
