Store an agents list and add agent heartbeats #1189
Where does the agent store its ID between restarts?
It does not yet. For user-registered agents I would like to get the DB entry by searching for the token. "System"-level agents (the type we already have) get a generated ID on first start. That ID would need to be saved somewhere, for example to a file.
Or do you have a nice idea?
I'd like the agents to be stateless ... we could generate a UUID for each run, but use an environment variable to send a user-defined value for reidentification?
IMO our agents are still more or less stateless if we save that UUID to a file. The ID will only be used by system-level agents to reidentify themselves, and if it does not exist the agent will simply be treated as a new one. In the long term we could think about removing support for the shared agent secret completely.
A file would probably be OK; otherwise there is no real way to reidentify the same agent, and every agent restart would create a new record in the database. It would also be nice if an agent could unregister itself, so that auto-scaled agents do not leave behind a large list of agents that won't be coming back, but that is probably for future PRs.
If agents are scaled in Docker and multiple agent instances run on the same Docker host, they would have the same persistent volume attached. That could be a problem, as they would all read the same file with the same UUID.
Well, I would not implement autoscaling by spawning new threads ... Instead, I would refactor the loop in the agent implementation to have only one main loop and use goroutines for tasks. With a heartbeat, I would delete the agent from the DB if it timed out, based on the mentioned UUID.
By auto-scaling I meant firing up/down new VMs etc
This would probably be the best option, as I don't see how UUID persistence could be achieved reliably on Docker.
This timeout can then also mark procs that are marked as running on that specific agent as pending again, so another agent can pick them up. This would fix an issue where, if an agent gets killed, the steps running on it currently stay in the running state forever, until the pipeline timeout cancels them and they are marked as failed, which does not have to happen. PS: note that the reset has to happen for the whole pipeline, as the steps share the workspace ...
please resolve ;)
Oh, and regarding the rpc move ... we might move the rpc stuff into its own root directory, as it's shared between agent and server but nothing else should use or touch it ...
and we need to update https://woodpecker-ci.org/docs/next/development/architecture accordingly
I would leave that for another PR. 😉 Should I undo the move of the agent code, to keep changes to a minimum?
closes #536
closes #252
prepares #267
TODO
RegisterAgent
and agent heartbeat calls to agent