Tie a specific workflow job to a specific ephemeral runner via labels #2147

Closed
alexellis opened this issue Sep 21, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@alexellis

Describe the enhancement

I would like to be able to tie a specific workflow job to a specific ephemeral runner via labels

I would imagine this working as a generated or optional label with a unique identifier like the job ID.

  1. A workflow job event comes in with id 12345678
  2. I create a VM, and instruct it to have the label 12345678
  3. After 2-3 seconds, the runner starts on the VM
  4. The runner picks up job 12345678
  5. The job is completed, the runner exits and the VM shuts down, the ephemeral runner is removed from the org.
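
The steps above could be sketched from the orchestrator's side roughly as follows. This is a minimal illustration, not an existing feature: it assumes the `workflow_job` webhook payload (with its `action` and `workflow_job.id` fields) and the runner's `config.sh` flags `--url`, `--token`, `--name`, `--labels`, `--ephemeral` and `--unattended`. The names `plan_runner` and `RunnerPlan`, and the `runner-<id>` VM naming, are hypothetical. The key assumption is the one this issue asks for: that a runner labelled with a job ID would only ever be handed that job.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerPlan:
    """What the orchestrator needs in order to boot one ephemeral runner VM."""
    vm_name: str
    label: str
    config_args: list[str]


def plan_runner(event: dict, org_url: str, token: str) -> RunnerPlan | None:
    """Turn a queued workflow_job event into a plan for a single VM.

    Returns None for any other action (in_progress, completed, ...).
    """
    if event.get("action") != "queued":
        return None

    # Step 1: the job ID from the event becomes the unique label.
    job_id = str(event["workflow_job"]["id"])
    vm_name = f"runner-{job_id}"  # hypothetical naming scheme

    # Step 2: arguments the VM would pass to the runner's config.sh on boot.
    # ASSUMPTION: binding the job to this label is the requested behaviour,
    # not something the runner guarantees today.
    args = [
        "./config.sh",
        "--url", org_url,
        "--token", token,
        "--name", vm_name,
        "--labels", job_id,
        "--ephemeral",
        "--unattended",
    ]
    return RunnerPlan(vm_name=vm_name, label=job_id, config_args=args)
```

With a one-to-one mapping like this, the VM name and label both carry the job ID, so every later lifecycle decision can be keyed on that ID alone.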

Alternative flow using cancellation

  1. Two workflow job events come in with ids 1234 and 5678
  2. Two VMs are created with these IDs, and the corresponding labels
  3. Before the runner starts, the user cancels job 1234
  4. I delete the VM with the name runner-1234
  5. After 2-3 seconds, the runner starts on the VM for job 5678
  6. The runner picks up job 5678
  7. The job is completed, the runner exits and the VM shuts down, the ephemeral runner is removed from the org.
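
The cancellation flow could then reduce to a lookup by job ID. The sketch below is an illustration under assumptions: it assumes a cancelled job arrives as a `workflow_job` event with action `completed` (and a `cancelled` conclusion), and that the job-to-runner binding requested in this issue exists. `VmRegistry`, `create_vm` and `delete_vm` are hypothetical names standing in for whatever the VM provider offers.

```python
from __future__ import annotations

from typing import Callable


class VmRegistry:
    """Tracks exactly one VM per job ID so cleanup is deterministic."""

    def __init__(
        self,
        create_vm: Callable[[str, str], None],  # hypothetical: (vm_name, label)
        delete_vm: Callable[[str], None],       # hypothetical: (vm_name)
    ) -> None:
        self._create_vm = create_vm
        self._delete_vm = delete_vm
        self._vms: dict[int, str] = {}

    def handle(self, event: dict) -> None:
        """Create a VM when a job is queued, delete it when the job ends."""
        job_id = event["workflow_job"]["id"]
        action = event["action"]

        if action == "queued" and job_id not in self._vms:
            name = f"runner-{job_id}"
            self._create_vm(name, str(job_id))
            self._vms[job_id] = name
        elif action == "completed" and job_id in self._vms:
            # Covers cancellation before the runner started as well as a
            # normal finish; in the latter case the delete is just a
            # safety net for a VM that should already be shutting down.
            self._delete_vm(self._vms.pop(job_id))

    def active(self) -> dict[int, str]:
        """Return a copy of the job ID to VM name mapping."""
        return dict(self._vms)
```

Because each VM is keyed by the job it was created for, cancelling job 1234 can only ever remove `runner-1234`, and the VM for job 5678 is left alone.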

Why is this necessary?

When creating ephemeral runners with individual VMs, there is no deterministic way to manage the lifecycle of the VM. If you create 5 VMs because a workflow run has 5 jobs within it, then the workflow gets cancelled, you can't remove any of the VMs you created, because a separate build from another repo could have started running jobs there.

In addition, suppose a workflow run has 10 jobs, so 10 VMs are created. One job starts, and the user cancels the other 9. How do you know which VMs to remove?

Additional information

NOTE: if the feature request has been agreed upon then the assignee will create an ADR. See docs/adrs/README.md

This may also be useful to the Philips solution recommended in the official GitHub docs (philips-labs/terraform-aws-github-runner#1853); they also seem to have run into similar issues with managing lifecycle.

If a runner can be created and set to only run the job with a specific, deterministic job ID, lifecycle management becomes much easier.

@alexellis alexellis added the enhancement New feature or request label Sep 21, 2022
@nikola-jokic
Contributor

Hey @alexellis, thank you for submitting the issue. The issue is a duplicate of #620 and #2106, so I will close it now. Please add 👍 to the original issue #620 so we can see the interest!

@alexellis
Author

Ack @nikola-jokic, it's just unfortunate that the issue you closed doesn't describe the problem quite so well, so I wonder if it will make it harder for your team at GitHub to understand the issue as clearly.

@nikola-jokic
Contributor

Hi @alexellis, you are right, you described the problem really well and thank you for doing so! I closed the issue so we don't have multiple of them pointing to the same feature. By mentioning it, there is a link pointing to your issue so we can find it easily. ☺️
