Workspace support for VolumeClaimTemplates #1986
Comments
That's an interesting idea. The default could be …
Could use the configuration we use currently for pipelineresources (or similar): https://github.com/tektoncd/pipeline/blob/master/docs/install.md#how-are-resources-shared-between-tasks
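For reference, that PipelineResources storage configuration is a ConfigMap; a minimal sketch based on the linked install docs (the size and storageClassName values here are illustrative, not defaults):

```yaml
# Sketch of the config-artifact-pvc ConfigMap described in the linked
# install docs; the values below are illustrative examples.
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-artifact-pvc
  namespace: tekton-pipelines
data:
  size: 5Gi                      # size of PVCs created for sharing artifacts
  storageClassName: my-fast-ssd  # storage class for those PVCs (illustrative)
```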
Discussed in the beta working group and it doesn't feel like we need this for beta at least, so we want to wait until a user indicates they definitely need this functionality.
My 2 cents... Any feature that assumes the ability to automatically create a PVC (PVs, attaching volumes, or anything to do with underlying storage resources of any kind) is a no-can-do for us. Any resource like that must be pre-allocated and pre-created, and its reference injected into a PipelineRun/TaskRun. This is why Tekton Pipelines simply wasn't usable for us here at eBay until Workspaces were implemented. As such, I see no need for such a feature, as our underlying infrastructure would be unable to support it in the first place.
It could be a configuration switch (that is on or off by default)
In #2174 @tstmrk brought up the example of volumeClaimTemplates, which are a feature of StatefulSets. They provide a similar kind of system to what we're discussing here, I think, although they don't clean up the volumes after the Pod is destroyed. In our case we might want to do that when a Task or Pipeline has completed.
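For context, this is roughly how StatefulSets declare volumeClaimTemplates; a minimal sketch (all names, images, and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: example
spec:
  serviceName: example
  replicas: 1
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      containers:
        - name: app
          image: nginx
          volumeMounts:
            - name: data
              mountPath: /data
  # One PVC per Pod is created from this template; note that the
  # StatefulSet controller does not delete these PVCs when Pods go away.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```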
Volume claim templates were mentioned again in #2218 as a possible implementation for this feature. Useful to know: they are used in Argo Workflows for a similar use case. Quoting @cameronbraid:
@sbwsg +1 for claim templates. It would be useful to be able to configure a default claim template for the cluster as well, in addition to per pipeline.
For others currently creating PVCs dynamically as the result of triggers firing, so that each PipelineRun gets a unique one... can you please share how you are doing this now?
Currently, if a PipelineRun doesn't provide any workspace bindings, the PipelineRun fails immediately. Would it make sense for the controller to bind those to …? This is in addition to the other paths discussed, like claim templates, etc.
Yeah, that sounds useful! Behaviour of the fallback would probably be configurable somehow?
Expected Behavior

If I have a pipeline for a project, I expect that … Use case for a CI-pipeline: … I also expect the client starting those PipelineRuns not to be too smart.

Actual Behavior

Currently, the client starting those PipelineRuns must know some things, e.g. creating … Unless, this scenario is happening:

Example Pipeline:

…
and a number of PipelineRuns pr1, pr2 ... pr6
Starting all those PipelineRuns concurrently with …
Some of the pods are Running, but some are stuck in Init state.
I can see that the PVC is mounted by all pods.
I think this is related to kubernetes/kubernetes#60903 (comment)

Proposed Solution

Add …
And then each PipelineRun could get a unique one by appending the generateName, e.g. …

Types: …
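A sketch of what this proposal could look like on a PipelineRun, mirroring the volumeClaimTemplate field that was eventually merged (pipeline and workspace names are illustrative, and the apiVersion may have been v1alpha1 at the time of this thread):

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  generateName: ci-run-   # each run gets a unique name, and so a unique PVC
spec:
  pipelineRef:
    name: ci-pipeline     # illustrative pipeline name
  workspaces:
    - name: source        # must match a workspace declared by the Pipeline
      volumeClaimTemplate:
        spec:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
```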
Hey @jlpettersson! Thanks for the detailed problem statement and proposal! I'm wondering: if the PVCs were created for you by the controller, and/or the controller managed a pool of PVCs such that the PipelineRuns just needed to indicate that they needed a PVC and the controller took care of the rest, would that work for you? I think …
Basically I'm wondering if you'd rather configure this kind of thing: …
@bobcatfish I see your point. As I see it, PV resources belong to the folks operating the Tekton installation. But a PVC is a resource that is created in the user's namespace, and by declaring the … The Persistent Storage design document writes about PV and PVC responsibilities: https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/persistent-storage.md The PVCs created in a namespace also count toward the user's quota: https://kubernetes.io/docs/concepts/policy/resource-quotas/#storage-resource-quota so I think it would be great if the user could control size and type, and there is also a default storageClass. In a bigger picture, I think …
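For illustration, the namespace storage quota mentioned above could look like this (the quota name, namespace, and limits are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota
  namespace: team-java            # illustrative namespace
spec:
  hard:
    persistentvolumeclaims: "10"  # max number of PVCs in the namespace
    requests.storage: 20Gi        # total storage all PVCs may request
```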
There is also this issue tektoncd/triggers#476, but I feel that the PVC belongs to the …
To make sure I'm totally clear, @jlpettersson: you'd like the user creating the PipelineRun to be able to control this? For example, in the way that PipelineResources create PVCs, you can configure the storage class across the installation: https://github.com/tektoncd/pipeline/blob/master/docs/install.md#configuring-a-persistent-volume
Interesting point! Would you want the PipelineRun controller to always delete the PVC when the PipelineRun finishes, or would you want to be able to configure that?
@bobcatfish yes, that is correct. Managing one PVC per PipelineRun can currently only be done programmatically, and it is very hard to get correct from the outside (it's hard to know when to delete). This is the biggest obstacle for me implementing Tekton for a larger enterprise at the moment. I want to do this declaratively (not using any params here, for clarity), e.g. …
Then I can have my declarative Tekton manifest files stored in e.g. git and practice GitOps, as with most Kubernetes resources. The strength of using … Consider a cluster with namespaces: …
The machine learning team may use huge machine learning models in their apps, so they need workspace volumes of e.g. 10Gi. The Windows team needs a custom storageClass. And the Java team can use the default storageClass, but with smaller volumes, e.g. 1Gi. (A sketch of such per-team bindings follows this comment.)

Preferably, the PVC created by the PipelineRun (from the user-provided template) should be deleted when the PipelineRun completes successfully. Then you are only paying for it for a short time, and your namespace storage quota is not filled up. Meanwhile we can avoid deleting PVCs for failed PipelineRuns, for investigation. Yes, this can be configurable. E.g. the namespace hackathon-playground gets its namespace storage quota filled up fast since their flaky tests cause many PipelineRuns to fail. But since there is a quota, they will not allocate terabytes of storage; they have to clean it up and manage the resources, so storage is kept within budget.

The alternative to all of the above, offered by Tekton currently, is to reuse the same workspace PVC. This has many disadvantages. First, you need to add an extra Step to clean up the PVC at the end of the PipelineRun. But for concurrent builds, e.g. PipelineRun … The problem here is that all PipelineRuns use the same PVC, and they may interfere. In a Pipeline …

I love the Tekton project so far, it looks very promising and we are trying to adopt it. But this is almost a showstopper. I would love to contribute and implement this; I am currently investigating it since I may need to use a branch for it anyway.
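A hedged sketch of the per-team tuning described above, assuming the proposed volumeClaimTemplate workspace binding (these are PipelineRun workspace fragments; all sizes and storage class names are illustrative):

```yaml
# Machine learning team: big volumes for model artifacts
workspaces:
  - name: ws
    volumeClaimTemplate:
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
---
# Windows team: same binding, but with a custom storageClass
workspaces:
  - name: ws
    volumeClaimTemplate:
      spec:
        storageClassName: windows-compatible  # illustrative class name
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```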
An existing PersistentVolumeClaim can currently be used as a Workspace volume source. There are two ways of using an existing PVC as a volume:

- Reuse an existing PVC
- Create a new PVC before each PipelineRun.

There are disadvantages to reusing the same PVC for every PipelineRun:

- You need to clean the PVC at the end of the Pipeline
- All Tasks using the workspace will be scheduled to the node where the PV is bound
- Concurrent PipelineRuns may interfere; an artifact or file from one PipelineRun may slip into or be used in another PipelineRun, with very few audit tracks.

There are also disadvantages to creating a new PVC before each PipelineRun:

- This cannot (easily) be done declaratively
- This is hard to do programmatically, because it is hard to know when to delete the PVC. The PipelineRun cannot be set as OwnerReference since the PVC must be created first.

This commit adds 'volumeClaimTemplate' as a volume source for workspaces. This has several advantages:

- The syntax is used in the k8s StatefulSet and other k8s projects, so it is familiar in the Kubernetes ecosystem
- It is possible to declare, declaratively, that a PVC should be created for each PipelineRun, e.g. from a TriggerTemplate.
- The user can choose a storageClass (or omit it to get the cluster default) to e.g. get a faster SSD volume, or a volume compatible with e.g. Windows.
- The user can adapt the size to the job, e.g. use 5Gi for apps that contain machine learning models, or 1Gi for microservice apps. It can be changed on demand in a configuration that lives in the user's namespace, e.g. in a TriggerTemplate.
- The size affects the storage quota that is set on the namespace, and it may affect billing and cost depending on the cluster environment.
- The PipelineRun or TaskRun with the template is created first and is used as the OwnerReference on the PVC. That means the PVC will have the same lifecycle as the PipelineRun.

Related to tektoncd#1986

See also:

- tektoncd#2174
- tektoncd#2218
- tektoncd/triggers#476
- tektoncd/triggers#482
- kubeflow/kfp-tekton#51
@bobcatfish @sbwsg I went ahead and submitted an implementation similar to what I proposed above. It is possible to try it out. The implementation is inspired by StatefulSet. Nothing is removed; this is purely an additional volume source option when using workspaces. For lifecycle management, I set the …
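Presumably the controller-created PVC then carries an ownerReference to the PipelineRun, so Kubernetes garbage collection deletes the PVC along with the run; a sketch with illustrative names and a placeholder UID:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-ci-run-abc12        # name generated by the controller (illustrative)
  ownerReferences:
    - apiVersion: tekton.dev/v1beta1
      kind: PipelineRun
      name: ci-run-abc12
      uid: 00000000-0000-0000-0000-000000000000  # placeholder UID
      controller: true
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
```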
OK, I'm going to split this issue into two. I'll leave this one for discussion of the volumeClaimTemplate proposal and will cut a new one for the Tekton artifact ConfigMap. @jlpettersson thanks for taking this on! Would you be open to bringing this PR up in our weekly working group on Wednesday? It would be good to share this with the group, perhaps demo it, and get any feedback on the design as implemented. If you're unable to make it, would you mind if I presented it to the WG? Edit: Here are the details of the Working Group. This week's call will be 12-1 ET.
@sbwsg yes, I can have a demo on Wednesday :)
Add a short example and a link to a full example of using volumeClaimTemplate as a volume source in a workspace. Requested in a comment on PR #2326 (comment) that fixes #1986
Jonas has recently become a regular contributor. He started by adding a minor [_missing_ `omitempty`](tektoncd/pipeline#2301), then [proposed some ideas](tektoncd/pipeline#1986 (comment)) around workspaces and PersistentVolumeClaim creation and continued to [elaborate on those ideas](tektoncd/pipeline#1986 (comment)). A sunny day a few days later, he also submitted an [extensive implementation for volumeClaimTemplate](tektoncd/pipeline#2326), corresponding to the idea discussions. A few days later he submitted a [small refactoring PR](tektoncd/pipeline#2392), and he also listened to community members who [proposed changes](tektoncd/pipeline#2450) to his volumeClaimTemplate implementation and did an [implementation for that proposal](tektoncd/pipeline#2453). A rainy day, he also wrote [technical documentation about PVCs](tektoncd/pipeline#2521), including adding an example that caused _flaky_ integration tests for the whole community over multiple days. When he understood his mistake, he submitted a [removal of the example](tektoncd/pipeline#2546) that caused the flaky tests. He has also dipped his toe into the Tekton Catalog and [contributed to the buildah task](tektoncd/pipeline#2546). More PRs to the Pipeline project have followed:

- tektoncd/pipeline#2460
- tektoncd/pipeline#2491
- tektoncd/pipeline#2502
- tektoncd/pipeline#2506
- tektoncd/pipeline#2632
- tektoncd/pipeline#2633
- tektoncd/pipeline#2634
- tektoncd/pipeline#2636
- tektoncd/pipeline#2601
- tektoncd/pipeline#2630

Jonas is excited about the great community around Tekton and the project! He now would like to join the org.
Expected Behavior
This issue started out as a general discussion of a mechanism for dynamically allocating storage for TaskRuns at runtime but has become focused on volume claim templates. So this issue is now for discussion of Volume Claim Templates to define PVCs in workspaces at runtime, similar to the way StatefulSets and Argo handle storage (see comments in this PR for links to each of these).
See this issue about the artifact configmap for discussion of Workspaces supporting the config-artifact-pvc approach that we use with PipelineResources and to revisit the conversation around "Auto Workspaces".
Actual Behavior
Users have to explicitly define the PersistentVolumeClaim configuration every time they bind a workspace that spans multiple tasks.
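For comparison, a minimal sketch of the explicit binding users currently repeat for every run: a pre-created PVC referenced by name (pipeline, workspace, and claim names are illustrative):

```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: example-run
spec:
  pipelineRef:
    name: example-pipeline
  workspaces:
    - name: source
      persistentVolumeClaim:
        claimName: my-pre-created-pvc  # must exist before the run starts
```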