Unable to get dockerVolumeMounts working #452
Comments
@Puneeth-n Hey! I have no experience using FSx, but is a PVC backed by FSx supposed to mount the "same volume" into multiple pods? Also, sharing /var/lib/docker isn't how it's supposed to work.
@mumoshu Thanks for reminding me about this article. Then, what is the main purpose of dockerVolumeMounts? To have, let's say, 50 persistent volumes for 50 runners? Aren't runners short-lived? i.e., don't they die and a new one come up after the action is done? More importantly, how can I improve docker pull and docker build performance? I'm trying to build huge docker containers with GHA and I'm not happy with the docker build times.
@Puneeth-n I think that's mainly for sharing more files between (1) the runner container that runs your job steps and (2) docker containers run within the dind container.
That being said, I would appreciate it very much if you could share more use-cases for dockerVolumeMounts, if you find any 😄
It was added in #439 to resolve #435. Unfortunately, it is another feature that has been added without any documentation by the author, so it's not clear how it is expected to be used. Those issues may help @Puneeth-n with figuring that out. A PR to add docs would be greatly appreciated, from either yourself or @asoldino, the original author.
Thanks @toast-gear! I haven't read @asoldino's original motivation very carefully.
When you use e.g. host volumes, this would only work when you have one runner pod per host. In a public cloud like AWS, that implies you may prefer combining smaller EC2 instances with the one-pod-per-host model. But I think it would be preferable to avoid using a PV just to make docker builds faster. You'd probably have a better experience with e.g. using the nearest container image registry, like ECR when you're in AWS, with docker's
Hi @mumoshu, in my scenario we leverage private runners mainly because we need powerful machines to run container jobs based on very large images (~10s of GB hosted on ACR/ECR) and complex compilation units. In this scenario, single-tenancy of runners is desirable. Initial benchmarks showed an improvement of ~5 minutes per build (on a 10 GB image). What do you think?
@asoldino Hey! Your scenario and the use of the feature seem completely valid. To be clear, you aren't concurrently writing to /var/lib/docker from multiple dockerd processes, right?
Exactly, that's not supposed to happen.
@asoldino Thanks for confirming! @Puneeth-n Hey! I believe I wasn't clear. I only wanted to say that sharing /var/lib/docker from multiple concurrently-running containers is wrong. If you can ensure only one container is writing to /var/lib/docker, it should work fine. That being said, if you'd like to share /var/lib/docker using a host volume, you will likely need to set some pod anti-affinity and/or big resource requests/limits to avoid two or more runner pods being concurrently scheduled onto the same host.
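[Editor's sketch] A minimal illustration of the host-volume variant described above, assuming the Runner spec accepts a volumes list alongside dockerVolumeMounts (as used in #435/#439); the host path, resource sizes, and volume name are placeholders, not values from this thread:
#...
kind: RunnerDeployment
spec:
  template:
    spec:
      nodeSelector:
        agentpool: runners             # dedicated node pool for runners
      resources:
        limits:
          cpu: "4.0"                   # sized so at most one runner pod fits per node,
          memory: "16Gi"               # so only one dockerd ever writes to the host path below
      dockerVolumeMounts:
        - name: docker-cache
          mountPath: /var/lib/docker   # mounted into the dind sidecar
      volumes:
        - name: docker-cache
          hostPath:
            path: /var/lib/docker-cache   # placeholder host directory
            type: DirectoryOrCreate
#...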
@Puneeth-n If you still need to use FSx, I think actions-runner-controller needs to be enhanced to enable the user to specify a PVC template rather than a PVC, like a K8s StatefulSet.
@mumoshu Thanks for clarifying.
@asoldino could you describe your setup a bit more for me please? You have RunnerDeployments in a k8s cluster tied to a single really big k8s node which isn't shared with any other runners. Do your runners just have huge requests/limits?
Sure:
#...
kind: RunnerDeployment
spec:
  template:
    spec:
      nodeSelector:
        agentpool: runners
#...
#...
resources:
  limits:
    cpu: "4.0"
    memory: "16Gi"
dockerContainerResources:
  limits:
    cpu: "4.0"
    memory: "16Gi"
#...
#...
kind: HorizontalRunnerAutoscaler
spec:
  scaleTargetRef:
    name: runners
#...
To recap: the resource limits force Kubernetes to schedule one pod per runner node; when the runner autoscaler kicks in, the node autoscaler provisions the extra nodes required, and Kubernetes can eventually run the additional pod.
@asoldino are you using
@Puneeth-n I'm not actively working on the workflows, I'm "just" a platform provider for my company. I can tell there are a few teams using
@mumoshu when do you plan to create a new release? I can't wait to test this feature out :)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
We've done another release 🎉
@Puneeth-n @callum-tait-pbx @mumoshu Question: yesterday I deployed a registry pod, configured
^^ taking this to discussion #813 (comment)
@Puneeth-n @pratikbalar have you guys got the PVC to mount
@antodoms Sharing
@mumoshu I was more thinking of mounting
@antodoms I'm not an expert, but I guess docker doesn't maintain an exclusive lock under the whole
@antodoms I do not recommend EFS for anything. I tried it years back, mounting an EFS volume across multiple Jenkins agents to share the same source code. I had issues with file consistency across AZs.
Thank you @Puneeth-n and @mumoshu. So I have found this metric better. Instead of scaling the replicas down to 0, I am using multi-stage builds, and BuildKit definitely helped speed up a few steps of the build. But still, without image caching, each new runner that gets created has to fetch the postgres and redis service images from Docker Hub again, which eats up some time. Having a few runners always running makes sure those images are cached inside those runners.
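[Editor's sketch] A minimal illustration of the keep-a-few-warm-runners idea above, assuming the HorizontalRunnerAutoscaler minReplicas/maxReplicas fields; the numbers are placeholders:
#...
kind: HorizontalRunnerAutoscaler
spec:
  scaleTargetRef:
    name: runners
  minReplicas: 2      # never scale to 0, so the postgres/redis service images stay cached on these runners
  maxReplicas: 10
#...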
@antodoms why not mount a local volume into the docker container and pin one runner per node? Or try volumeClaimTemplates and RunnerSet?
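[Editor's sketch] A rough illustration of the RunnerSet/volumeClaimTemplates route suggested above; the apiVersion matches the other actions-runner-controller CRDs, but the surrounding RunnerSet fields (runner options, selector, serviceName, template) are elided, and the name, storage class, and size are placeholders:
#...
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerSet
metadata:
  name: example-runnerset
spec:
  #...
  volumeClaimTemplates:
    - metadata:
        name: var-lib-docker          # one PVC is stamped out per runner pod, so the docker cache is per-runner, not shared
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: local-path  # placeholder storage class
        resources:
          requests:
            storage: 20Gi
#...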
@Puneeth-n @mumoshu @asoldino Wouldn't using subPathExpr work here?
@prein Hey! I believe we have had only two options so far: mount the host /var/lib/docker onto the runner pod and ensure there's only one runner per node, or use emptyDir/a dynamic local volume. Neither solution shares /var/lib/docker across pods. It remains best practice NOT to share it. I'd consider the use of subPathExpr in this context a variant of the latter option, because it enables you to have a unique /var/lib/docker volume per pod, not a shared one.
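[Editor's sketch] To illustrate the subPathExpr variant described above with plain Kubernetes (not the runner CRD, whose wiring to the dind sidecar is not shown in this thread), each pod writes to its own subdirectory of a shared parent path; the pod name, image, and host path are placeholders:
apiVersion: v1
kind: Pod
metadata:
  name: dind-subpath-example
spec:
  containers:
    - name: dind
      image: docker:dind
      securityContext:
        privileged: true               # dind needs privileged mode
      env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name # unique per pod
      volumeMounts:
        - name: docker-cache
          mountPath: /var/lib/docker
          subPathExpr: $(POD_NAME)     # expands to a per-pod subdirectory
  volumes:
    - name: docker-cache
      hostPath:
        path: /mnt/docker-cache        # placeholder shared parent directory
        type: DirectoryOrCreate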
Hi, I am trying to mount an AWS FSx volume into the docker:dind image with the new dockerVolumeMounts feature and I am not sure if it is working as expected. I pulled a docker image from inside one runner and tried to do the same from another runner. The expectation was that it would not pull the image again in the 2nd runner, but it did.
The nodes are in the same AZ as the FSx volume and all the GHA jobs are running on these nodes.
Chart version: 0.10.5
Controller: v0.18.2
Runner config
k -n ci describe runner comtravo-github-actions-deployment-8f2gx-5bmhm