-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(sdk): Relax the requirement that component inputs/outputs must appear on the command line. #6268
Conversation
/hold Still working on this, will ping folks when ready for review |
/test kubeflow-pipelines-samples-v2 |
2 similar comments
/test kubeflow-pipelines-samples-v2 |
/test kubeflow-pipelines-samples-v2 |
/retest |
on the command line. This change enables component builders to define their implementation (i.e. the container arg and command) independent of the actual component inputs/outputs defined in component.yaml. The latter will be used when determining inputs/outputs during compilation. Only works for tasks that turn into ContainerOps during compilation. Naturally, does not work for ContainerOps (which have no inputs/outputs anyway).
35e6cb1
to
49ae9e6
Compare
/hold cancel |
if (name in original_arguments and | ||
type_utils.is_parameter_type(spec_type) and | ||
name not in task._parameter_arguments and | ||
name not in task.artifact_arguments): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my education, in which scenario would name not in task._parameter_arguments
and name not in task.artifact_arguments
both evaluate to true
?
My guess is this could happen for optional inputs without default value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I added this while debugging, but it turns out the previous lines actually cover the case we care about. I don't think this can ever happen actually. I've removed this bit now.
@@ -848,7 +850,7 @@ def inputs(self): | |||
# called the 1st time (because there are in-place updates to `PipelineParam` | |||
# during compilation - remove in-place updates for easier debugging?) | |||
if not self._inputs: | |||
self._inputs = [] | |||
self._inputs = self._component_spec_inputs or [] | |||
# TODO replace with proper k8s obj? | |||
for key in self.attrs_with_pipelineparams: | |||
self._inputs += _pipeline_param.extract_pipelineparams_from_any( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like _component_spec_inputs
is a list of inputs reference PipelineParam
, in another word, inputs with constant values would not appear in this list. It that right?
If so, 1) maybe we can rename it to better reflect this fact. 2) I wonder if this self._inputs += _pipeline_param.extract_pipelineparams_from_any
is still needed, because it looks to me you would have covered them when you construct _component_spec_inputs
down below?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Re: 1) I updated the names to be more specific and added a comment.
Re: 2), I can't remove that, as some params are specified on things like the container image. So this won't work I think. Also, the same code is used by things like ResourceOp, which does not have a component spec so it won't work there either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. That makes sense.
(This also reminds me the topic that component may have implicit inputs not captured in its inputs spec, and whether we should allow this to happen)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@@ -848,7 +850,7 @@ def inputs(self): | |||
# called the 1st time (because there are in-place updates to `PipelineParam` | |||
# during compilation - remove in-place updates for easier debugging?) | |||
if not self._inputs: | |||
self._inputs = [] | |||
self._inputs = self._component_spec_inputs or [] | |||
# TODO replace with proper k8s obj? | |||
for key in self.attrs_with_pipelineparams: | |||
self._inputs += _pipeline_param.extract_pipelineparams_from_any( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation. That makes sense.
(This also reminds me the topic that component may have implicit inputs not captured in its inputs spec, and whether we should allow this to happen)
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: chensun The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @neuromage @chensun, I guess this PR broke postsubmit integration tests, here's my investigations: I've ruled out other possibilities -- the failure is a change in KFP v1 compiler, and this PR looks like the only KFP v1 compiler change since the last time postsubmit test was passing. Can you take a look? |
Relax the requirement that component inputs/outputs must appear on the command line.
This change enables component builders to define their implementation (i.e. the container arg and command) independent of the actual component inputs/outputs defined in component.yaml. The latter will be used when determining inputs/outputs during compilation.
With this change, v2 lightweight components do not need to include the superfluous arguments for inputs and outputs (since
ExecutorInput
andExecutorOutput
is what is actually used to pass data).Only works for tasks that turn into ContainerOps during compilation. Naturally, does not work for ContainerOps (which have no inputs/outputs anyway).
Checklist: