Remote Agent restart capability #22

Closed
tigrannajaryan opened this issue Nov 15, 2021 · 4 comments · Fixed by #64

Comments

@tigrannajaryan
Member

Do we need a capability for the Server to order the Agent to restart?

@djaglowski
Member

It seems reasonable to me to have basic control capabilities like this.

I've personally required this in similar implementations and have found it useful for various troubleshooting and recovery scenarios.

Similarly, a "shutdown" capability can be useful for retiring unsupported versions / zombie instances.

I support both cases as long as they have corresponding capability flags.
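
As a rough illustration of what gating these commands on capability flags could look like, here is a minimal Python sketch. The flag names, the bit positions, and the `may_send` helper are all made up for this example; nothing here is taken from the OpAMP spec.

```python
from enum import IntFlag


class AgentCapabilities(IntFlag):
    """Hypothetical capability bits an Agent could report to the Server."""
    NONE = 0
    ACCEPTS_RESTART = 1 << 0   # assumed bit position, illustrative only
    ACCEPTS_SHUTDOWN = 1 << 1  # assumed bit position, illustrative only


# Which capability bit each (hypothetical) command name requires.
REQUIRED_CAPABILITY = {
    "restart": AgentCapabilities.ACCEPTS_RESTART,
    "shutdown": AgentCapabilities.ACCEPTS_SHUTDOWN,
}


def may_send(command: str, reported: AgentCapabilities) -> bool:
    """Return True only if the Agent advertised support for the command."""
    required = REQUIRED_CAPABILITY.get(command)
    if required is None:
        # Unknown commands are never sent.
        return False
    return (reported & required) == required
```

With this shape, a Server would only issue "restart" or "shutdown" to Agents that explicitly opted in, and anything the Agent did not advertise is simply never sent.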

@tigrannajaryan
Member Author

> Similarly, a "shutdown" capability can be useful for retiring unsupported versions / zombie instances.

Is it assumed that the OpAMP connection is lost after the "shutdown"? This would mean "shutdown" is irreversible (unless the Agent is run again, e.g. after a machine restart)? Alternatively, do we think of "shutdown" as applicable when the Supervisor model is used, essentially meaning that the Supervisor stays up and running and can be instructed to start the Agent again?

@djaglowski
Member

> Is it assumed that the OpAMP connection is lost after the "shutdown"? This would mean "shutdown" is irreversible (unless the Agent is run again, e.g. after a machine restart)?

That is how I would define it.

I have implemented a shutdown in a supervised model but not one where the supervisor had a connection. I'll describe that first as a possible model:

  • Server sends a shutdown signal to the agent
  • The agent process shuts down w/ a particular exit code
  • The supervisor checks the exit code and interprets it to mean that it should itself shut down
  • If there is an externally driven restart (e.g. machine restart)
    • The agent connects to the server, transmits (among other things) its ID
    • The server recognizes the ID as having been retired, and responds with another "shutdown" signal
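
The flow in the list above can be sketched as a small in-memory simulation. This is only a sketch under stated assumptions: the `EXIT_RETIRED` value, the `Server` class, and the `run_agent` / `supervise` helpers are hypothetical names invented for illustration, and a real agent would of course talk to the server over a connection rather than call it directly.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical exit code meaning "the server retired this agent".
EXIT_RETIRED = 64


@dataclass
class Server:
    """Toy server that remembers which agent IDs have been retired."""
    retired_ids: Set[str] = field(default_factory=set)

    def retire(self, agent_id: str) -> None:
        self.retired_ids.add(agent_id)

    def on_connect(self, agent_id: str) -> str:
        # A retired ID always receives another "shutdown" signal.
        return "shutdown" if agent_id in self.retired_ids else "ok"


def run_agent(agent_id: str, server: Server) -> Optional[int]:
    """Connect and report the ID; return an exit code, or None if still running."""
    if server.on_connect(agent_id) == "shutdown":
        return EXIT_RETIRED
    return None


def supervise(agent_id: str, server: Server) -> str:
    """Start the agent once and decide what the supervisor does next."""
    code = run_agent(agent_id, server)
    if code == EXIT_RETIRED:
        # The special exit code tells the supervisor to shut itself down too.
        return "exit"
    return "keep running"
```

Calling `supervise` again for a retired ID models the externally driven restart: the agent reconnects, the server recognizes the ID, and both the agent and the supervisor stop again.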

In the case where the supervisor has a connection, it may be worth differentiating between the cases where a total shutdown is desired vs. waiting for further instructions. Even a standalone agent could differentiate between the two behaviors:

  • "shutdown": end the process
  • "pause": cease all activity except for the connection to server
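
To make the difference between the two behaviors concrete, here is a minimal sketch. The `Command` enum and the `Agent` state fields are hypothetical and only model the distinction described above; they are not a proposed message format.

```python
from enum import Enum


class Command(Enum):
    """Hypothetical commands, named after the two behaviors above."""
    SHUTDOWN = "shutdown"
    PAUSE = "pause"


class Agent:
    """Toy agent that tracks three independent pieces of state."""

    def __init__(self) -> None:
        self.process_alive = True  # the agent process exists
        self.connected = True      # the connection to the server is open
        self.working = True        # the agent is doing its normal activity

    def handle(self, command: Command) -> None:
        if command is Command.SHUTDOWN:
            # End the process: all activity stops and the connection is lost.
            self.working = False
            self.connected = False
            self.process_alive = False
        elif command is Command.PAUSE:
            # Cease all activity except for the connection to the server.
            self.working = False
```

The practical consequence is that a paused agent can still receive further instructions, while a shut-down agent cannot be reached until something external starts it again.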

I'm in the weeds in relation to your original question - happy to open separate issues if more appropriate.

@tigrannajaryan
Member Author

> I'm in the weeds in relation to your original question - happy to open separate issues if more appropriate.

Yes, I think it would be useful to move this into a separate issue, since it seems complicated enough to warrant its own discussion.

tigrannajaryan pushed a commit that referenced this issue Mar 22, 2022
Resolves #22

- Add `ServerToAgentCommand` into `ServerToAgentMessage`.
- Documents `ServerToAgentCommand`