-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Link git outputs between tasks #270
Link git outputs between tasks #270
Conversation
c744923
to
33d6996
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to start off by saying this is a great PR and thanks for putting all this thorough work in 😍
I especially appreciate your attention to test coverage and all of the well factored unit tests :D
Of course I've always got some feedback tho, my apologies 😅 ! Initial thoughts below:
- Could this PR also include some info in the docs about how the outputs are passed b/w the tasks? maybe at https://github.com/knative/build-pipeline/blob/master/docs/using.md#creating-resources
- And some docs about the new
targetPath
resource type
- And some docs about the new
I'm confused about the fact that in the e2e test, it looks liek the two fanned out tasks modify the same resource is that right? I thought we wanted to explicitly avoid that :S :S :S
Replaces current e2e test for multiple tasks pipelinerun test with fan-in and fan-out pipeline. (More complex than existing e2e test and covers all use cases previous e2e test as well)
Since we haven't completed #168 yet, what order does the fan-in, fan-out end up executing in?
Quick plug for rewriting some of the commit messages, see https://github.com/knative/build-pipeline/blob/master/CONTRIBUTING.md#commit-messages (commit messages like "POC outputs", "Fix merge conflicts" could have more info - also might lend themselves to better messages if they were rebased together? and it would be great if the issue #s could be in some of the commit messages)
test/helm_task_test.go
Outdated
Name: "workspace", | ||
ProvidedBy: []string{createImageTaskName}, | ||
Name: "workspace", | ||
// ProvidedBy: []string{createImageTaskName}, //Yet to be implemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we add a TODO(some issue # here)
, e.g.:
// TODO(#148 ) add support for xyz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean you are going to implement this as part of this same PR?
test/crd_checks.go
Outdated
@@ -30,7 +30,7 @@ import ( | |||
|
|||
const ( | |||
interval = 1 * time.Second | |||
timeout = 5 * time.Minute | |||
timeout = 10 * time.Minute // TODO: timeout should be configurable via go test timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go test timeout
is the timeout for all tests being run, while this timeout is for individual calls to PollImmediate
, I don't think you can use go test timeout
for this
which call to PollImmediate
needs 10 minutes? That seems extremely long
if it's unavoidable to have a 10 minute timeout, I suggest that we change the interface to these functions to take a timeout parameter so we can use the timeout that is needed each time we call it (i.e. certainly most calls to Poll
don't need 10 min)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, if we DO need to up one of these timeouts to 10 min, we should update the test README (https://github.com/knative/build-pipeline/blob/master/test/README.md) to indicate that the go test
default timeout won't cut it anymore
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New pipelinerun e2e test includes 4 tasks and this takes upto 6min (on my gke) cluster. I wanted to add timeout to be on safer side so increased it upto 10min but it is very specific to that e2e test. So probably I will just update timeout for that test instead of all crds check timeout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just bumping this before we merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh wait my bad, this is an outdated comment. ignore me!
}, &v1alpha1.Task{ | ||
ObjectMeta: metav1.ObjectMeta{ | ||
Namespace: namespace, | ||
Name: "check-create-files-exists", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hahaha i feel like ive seen this exact example in the concourse docs (as we slowly re-create concourse XD)
test/pipelinerun_test.go
Outdated
Name: "read-something-else", | ||
Image: "ubuntu", | ||
Command: []string{"cat"}, | ||
Args: []string{"/workspace/readingspace/something-else"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- im a bit confused by what each task and each step in each task is doing - maybe we could use the names of the steps to make it a bit more clear, e.g.:
- In the first task, "write-something" could become "write-task-1"
- In the next two tasks that run in parallel, "read-stuff" could be something like "read-data-from-task-1", and "write-soemthing-else" could be "write-data-from-task-2-1" or something
- what do you think if we had the tasks themselves assert? i.e. have the step fail if the expected text doesnt exist, e.g. using grep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think if we had the tasks themselves assert? i.e. have the step fail if the expected text doesnt exist, e.g. using grep
I like the idea. I will update the tasks to assert contents.
im a bit confused by what each task and each step in each task is doing - maybe we could use the names of the steps to make it a bit more clear
I tried to name tasks with "stuff" and "something" because the tasks are actually writing the same words. I could update them to pattern you mentioned.
}) | ||
} | ||
|
||
resources.WrapSteps(&tr.Spec, pr.Spec.PipelineTaskResources, pt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i really like breaking this logic into a separate function!! ❤️
// +optional | ||
PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"` | ||
// +optional | ||
PostBuiltSteps []TaskBuildStep `json:"postBuiltSteps,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does Built
mean in this context? I think this is intended to run after all the steps run, is that right? i think since this CRD isn't called Build
anymore, using Built
is a bit confusing, some other ideas:
PreSteps
BeforeSteps
InitializationSteps
SetupSteps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I leaned toward Built
because thats the end object we are creating now. I like the name BeforeSteps
for pre build steps, for post build steps we could go with AfterSteps
.
} | ||
|
||
// TaskBuildStep contains information to construct build pre and post steps | ||
type TaskBuildStep struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again im not sure about calling this Build
(maybe Im missing something tho)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about TaskStep? I agree with moving away from associating with Build
keyword
// +optional | ||
PVCName string `json:"pvcName,omitempty"` | ||
// +optional | ||
PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment it would look to a user of TaskRun
as if these are fields we expect them to provide - do we? What would it look like if a user provided these fields? Can we document it?
If not, can we somehow make it obvious we don't want users to provide them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to have these steps in the definition of the taskRun type as a field, can we not append them in the appropriate order while creating the taskRun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If not, can we somehow make it obvious we don't want users to provide them?
How about add validation in taskrun to check if owner reference is pipelinerun Kind
then post build and pre build steps should be allowed else throw error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't allow user to provide those field, it shouldn't be in the spec (should it ?), but I guess they are required for pipeline/pipelinerun to populate them… so… yeah most likely a validation error could work there.
Is there a use case for a user to provides those in addition to normal steps ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a use case for a user to provides those in addition to normal steps ?
@vdemeester: @shashwathi and i discussed, and we were thinking that maybe a use case where a user wants to debug tasks that are part of a pipeline in isolation? it might be a bit of a stretch tho!
i think adding paths
to the interface instead of the builtSteps
makes it a bit clearer hopefully!
tests := []struct { | ||
name string | ||
steps TaskBuildStep | ||
wantErr *apis.FieldError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like we don't need wantErr
, we have no error cases here (and id prefer those in a separate test anyway!)
@@ -83,3 +83,24 @@ func TestInitializeConditions(t *testing.T) { | |||
t.Fatalf("PipelineRun status getting reset") | |||
} | |||
} | |||
|
|||
func Test_GetPVC(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @shashwathi
ITs a little bit hard to follow, so wondering if you can maybe update the commit message and add some more comments
value: admin | ||
|
||
--- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need a new resource that has the same properties as existing resource?
Type PipelineResourceType `json:"type"` | ||
Name string `json:"name"` | ||
Type PipelineResourceType `json:"type"` | ||
TargetPath string `json:"targetPath"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be optional since its not applicable to all types of resources
@@ -118,6 +119,8 @@ type PipelineResource struct { | |||
type TaskRunResource struct { | |||
Name string `json:"name"` | |||
ResourceRef PipelineResourceRef `json:"resourceRef"` | |||
// +optional | |||
SourcePaths []string `json:"sourcePaths"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment to what this new field is for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think I am using that. I will go ahead and delete that field
} | ||
newSteps = append(newSteps, []corev1.Container{{ | ||
Name: fmt.Sprintf("source-mkdir-%s", source.Name), | ||
Image: "busybox", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer not to use busybox and instead use some base image controlled by the project
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I follow up with another PR to fix image?
// +optional | ||
PVCName string `json:"pvcName,omitempty"` | ||
// +optional | ||
PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to have these steps in the definition of the taskRun type as a field, can we not append them in the appropriate order while creating the taskRun
test/helm_task_test.go
Outdated
Name: "workspace", | ||
ProvidedBy: []string{createImageTaskName}, | ||
Name: "workspace", | ||
// ProvidedBy: []string{createImageTaskName}, //Yet to be implemented |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean you are going to implement this as part of this same PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, most of my question were already covered by @bobcatfish and @pivotal-nader-ziada 😉
// +optional | ||
PVCName string `json:"pvcName,omitempty"` | ||
// +optional | ||
PreBuiltSteps []TaskBuildStep `json:"preBuildSteps,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't allow user to provide those field, it shouldn't be in the spec (should it ?), but I guess they are required for pipeline/pipelinerun to populate them… so… yeah most likely a validation error could work there.
Is there a use case for a user to provides those in addition to normal steps ?
6ed542e
to
7381dab
Compare
@pivotal-nader-ziada @bobcatfish @vdemeester I updated the design in recent change. Please take a look at it and leave comments. Thanks to @bobcatfish for slack brainstorming.
|
docs/Concepts.md
Outdated
@@ -37,6 +37,7 @@ A Task is a collection of sequential steps you would want to run as part of your | |||
A task will run inside a container on your cluster. A Task declares: | |||
|
|||
1. Inputs the task needs. | |||
Task input resource can provide `targetPath` to initialize resource in specific directory. Resource will be placed under `/workspace/targetPath`. If `targetPath` is not specified then resource will be initialized under `/workspace`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the PR is looking great and I really like the interface change!
I find this part of the docs a bit hard to follow - maybe it would be more clear if it was in a different section, such as Resources
or TaskRun
, and maybe had an example?
I'm not 100% sure how to improve it tho, feel free to merge this and we can iterate on it.
@@ -73,12 +73,12 @@ func WaitForPodState(c *clients, name string, namespace string, inState func(r * | |||
// interval until inState returns `true` indicating it is done, returns an | |||
// error or timeout. desc will be used to name the metric that is emitted to | |||
// track how long it took for name to get into the state checked by inState. | |||
func WaitForPipelineRunState(c *clients, name string, inState func(r *v1alpha1.PipelineRun) (bool, error), desc string) error { | |||
func WaitForPipelineRunState(c *clients, name string, polltimeout time.Duration, inState func(r *v1alpha1.PipelineRun) (bool, error), desc string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Name: "read-from-task-0", | ||
Image: "ubuntu", | ||
Command: []string{"bash"}, | ||
Args: []string{"-c", "[[ stuff == $(cat /workspace/stuff) ]]"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
niiiiiice :D
/lgtm In case you want to make some tweaks to the docs - altho it's unlikely a user would want to use the /hold And also I'm still wondering about fan-out and fan-in, seems like we can end up with tasks running in parallel changing the same data? |
I will take another shot at docs tomorrow. 👍
Yes tasks have individual copy of the resources so even if they do modify it, I do not see why it could be a problem. I agree that fan-in of resources could lead to undesired side effects. I can see a use case where this feature is useful like using same git resource across whole pipeline but user has to be aware of the changes in each tasks which does not seem too unreasonable I think. |
7381dab
to
71e86e4
Compare
@bobcatfish : I have updated the docs to include examples. Please review that and let me know if you have any feedback. |
- Implementation details Pipelinerun creates pvc for the lifetime for object and uses that pvc as scratch space to transfer git resources between them. This information is passed to taskrun via resource paths. Paths are array of strings and incase of inouts these paths will be considered as new source of pipeline resource. In the case of outputs paths will be considered as new destination directory. - Update docs to include examples of paths Partially fixes tektoncd#148
71e86e4
to
95a17ab
Compare
The following is the coverage report on pkg/.
|
@@ -169,6 +199,66 @@ Creating a `TaskRun` will invoke a [Task](#task), running all of the steps until | |||
completion or failure. Creating a `TaskRun` will require satisfying all of the input | |||
requirements of the `Task`. | |||
|
|||
`TaskRun` definition includes `inputs`, `outputs` for `Task` referred in spec. | |||
|
|||
Input resource includes name and reference to pipeline resource and optionally `paths`. `paths` will be used by `TaskRun` as the resource's new source paths i.e., copy the resource from specified list of paths. `TaskRun` expects the folder and contents to be already present in specified paths. `paths` feature could be used to provide extra files or altered version of existing resource before execution of steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah great, I get it now! thanks for the example above and the extra detail in the docs :D
/lgtm |
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@bobcatfish : I need |
whoops, just took it for granted that @shashwathi had approver rights already, she's one of the top contributors!! /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bobcatfish, shashwathi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel |
In this PR I have implemented outputs using PVC within lifetime of pipelinerun. Closed old PR #255 in favor of this. Discussion for design is in previous PR so please refer to that for more details.
Note:
examples
to demonstrate git resource dependencyWhat will not be included?
Support for images
This PR only includes support for git resource dependency. Support for images includes some design of metadata and I would like to follow up with another PR for that.
/assign @imjasonh
/assign @bobcatfish
/assign @pivotal-nader-ziada