Store an agents list and add agent heartbeats #1189

Merged
merged 42 commits into from
Jan 28, 2023


anbraten
Member

@anbraten anbraten commented Sep 14, 2022

closes #536
closes #252

prepares #267

TODO

  • update agent model and add datastore functions
  • update grpc proto and add register agent and heartbeat
  • add RegisterAgent and agent heartbeat calls to agent
  • fix agent not retrying to register / doing the heartbeat
  • find a way to store the agent-id of system agents (currently all agents) so the agent can be matched to the db entry
  • add api endpoint to get all agents as admin
  • show agents in admin UI
  • allow to change name of an agent in admin UI
  • fix system agents updating the same (first) agent containing the system token
    • return agent-id on register
    • use agent-id for heartbeat?

[Image: Screenshot from 2022-09-30 18-54-02]

@anbraten anbraten changed the title update agent model and add new agents to table Store an agents list and add agent heatbeats Sep 14, 2022
@anbraten anbraten changed the title Store an agents list and add agent heatbeats Store an agents list and add agent heartbeats Sep 14, 2022
@6543 6543 added feature add new functionality server agent labels Sep 14, 2022
@6543 6543 added this to the 1.0.0 milestone Sep 14, 2022
@lafriks
Contributor

lafriks commented Sep 17, 2022

Where does the agent store its ID between restarts?

@anbraten
Member Author

Where does the agent store its ID between restarts?

It does not yet. For user-registered agents I would like to look up the db entry by searching for the token. "System"-level agents (the type we already have today) get a generated ID on first start. That ID would need to be saved somewhere, for example in a file.

@anbraten
Member Author

Or do you have a nice idea?

@6543
Member

6543 commented Sep 17, 2022

I'd like the agents to be stateless ... we could generate a UUID for each run but use some .env var to send a user-defined value for re-identification?

@anbraten
Member Author

IMO our agents are still more or less stateless if we save that UUID to a file. The ID will only be used by system-level agents to re-identify themselves, and if it does not exist the agent will simply be treated as a new one. In the long term we could think about removing support for the shared agent secret completely.

@lafriks
Contributor

lafriks commented Sep 19, 2022

A file would probably be OK; otherwise there is no real way to re-identify the same agent, and every agent restart would create a new record in the database. It would also be nice if an agent could unregister itself, so that auto-scaled agents do not leave behind a large list of agents that won't be coming back, but that is probably for future PRs.

@lafriks
Contributor

lafriks commented Sep 19, 2022

If agents are scaled in Docker and multiple agent instances run on the same Docker host, they would have the same persistent volume attached. That could be a problem, as they would all get the same file with the same UUID.

@6543
Member

6543 commented Sep 19, 2022

A file would probably be OK; otherwise there is no real way to re-identify the same agent, and every agent restart would create a new record in the database. It would also be nice if an agent could unregister itself, so that auto-scaled agents do not leave behind a large list of agents that won't be coming back, but that is probably for future PRs.

Well, I would not implement autoscaling by spawning new threads ... and would instead refactor the loop in the agent implementation to have only one main loop and use goroutines for tasks.

And with a heartbeat I would delete the agent from the db if it timed out, based on the mentioned UUID.

@lafriks
Contributor

lafriks commented Sep 19, 2022

Well, I would not implement autoscaling by spawning new threads ... and would instead refactor the loop in the agent implementation to have only one main loop and use goroutines for tasks.

By auto-scaling I meant spinning VMs up and down, etc.

And with a heartbeat I would delete the agent from the db if it timed out, based on the mentioned UUID.

This would probably be the best option, as I don't see how UUID persistence could be achieved reliably on Docker.

@6543
Member

6543 commented Sep 19, 2022

This timeout can then also mark procs that are marked as running on that specific agent as pending again, so another agent can pick them up ...

This will fix an issue where, if an agent gets killed, the steps running on it currently stay at running forever until the pipeline timeout cancels them. They are then marked as failed, although they don't have to be.

PS: note that the reset has to happen to the whole pipeline, as the procs share the workspace ...
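
A sketch of that reset, assuming a simplified `Proc` record that is not the project's actual model. Once an agent has timed out, every pipeline that had a running proc on it is reset as a whole, because its procs share one workspace.

```go
package main

import (
	"fmt"
	"sort"
)

// Proc is a simplified, hypothetical record of one pipeline step.
type Proc struct {
	Name       string
	PipelineID int64
	AgentID    string
	State      string
}

const (
	statePending = "pending"
	stateRunning = "running"
	stateSuccess = "success"
)

// requeueForLostAgent finds every pipeline with a running proc on the lost
// agent and resets all procs of those pipelines to pending. It returns the
// affected pipeline IDs in ascending order.
func requeueForLostAgent(procs []*Proc, lostAgent string) []int64 {
	affected := make(map[int64]bool)
	for _, p := range procs {
		if p.AgentID == lostAgent && p.State == stateRunning {
			affected[p.PipelineID] = true
		}
	}
	for _, p := range procs {
		if affected[p.PipelineID] {
			// Finished procs are reset too, since the workspace is shared.
			p.State = statePending
			p.AgentID = ""
		}
	}
	ids := make([]int64, 0, len(affected))
	for id := range affected {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	procs := []*Proc{
		{Name: "clone", PipelineID: 1, AgentID: "agent-a", State: stateSuccess},
		{Name: "build", PipelineID: 1, AgentID: "agent-a", State: stateRunning},
		{Name: "build", PipelineID: 2, AgentID: "agent-b", State: stateRunning},
	}
	fmt.Println("requeued pipelines:", requeueForLostAgent(procs, "agent-a"))
	for _, p := range procs {
		fmt.Println(p.PipelineID, p.Name, p.State)
	}
}
```

Here pipeline 1 is reset completely, including the already finished clone proc, while pipeline 2 on the other agent is left untouched.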

@6543
Member

6543 commented Nov 10, 2022

please resolve ;)

@6543
Member

6543 commented Nov 10, 2022

Oh, and regarding the rpc move ... we might move the rpc stuff into its own root directory, as it is shared between agent and server but nothing else should use or touch it ...

@6543
Member

6543 commented Nov 10, 2022

And we need to update https://woodpecker-ci.org/docs/next/development/architecture accordingly.

@anbraten anbraten mentioned this pull request Nov 22, 2022
1 task
@anbraten
Member Author

anbraten commented Nov 26, 2022

Oh, and regarding the rpc move ... we might move the rpc stuff into its own root directory, as it is shared between agent and server but nothing else should use or touch it ...

I would leave that for another PR. 😉 Should I undo the move of the agent code, to keep changes to a minimum?

@anbraten anbraten requested a review from a team January 11, 2023 08:33
@anbraten anbraten requested a review from lafriks January 28, 2023 12:34
@anbraten anbraten merged commit d960323 into woodpecker-ci:master Jan 28, 2023
@anbraten anbraten deleted the agent branch February 12, 2024 16:03
Successfully merging this pull request may close these issues.

add agent heartbeat Show agents and some information about them in webinterface