Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(sdk): Relax the requirement that component inputs/outputs must appear on the command line. #6268

Merged
merged 10 commits into from
Aug 10, 2021

Conversation

neuromage
Copy link
Contributor

@neuromage neuromage commented Aug 9, 2021

Relax the requirement that component inputs/outputs must appear on the command line.

This change enables component builders to define their implementation (i.e. the container arg and command) independent of the actual component inputs/outputs defined in component.yaml. The latter will be used when determining inputs/outputs during compilation.

With this change, v2 lightweight components do not need to include the superfluous arguments for inputs and outputs (since ExecutorInput and ExecutorOutput is what is actually used to pass data).

Only works for tasks that turn into ContainerOps during compilation. Naturally, does not work for ContainerOps (which have no inputs/outputs anyway).

Checklist:

@neuromage
Copy link
Contributor Author

/hold

Still working on this, will ping folks when ready for review

@neuromage
Copy link
Contributor Author

/test kubeflow-pipelines-samples-v2

2 similar comments
@neuromage
Copy link
Contributor Author

/test kubeflow-pipelines-samples-v2

@neuromage
Copy link
Contributor Author

/test kubeflow-pipelines-samples-v2

@neuromage
Copy link
Contributor Author

/retest

on the command line.

This change enables component builders to define their implementation
(i.e. the container arg and command) independent of the actual
component inputs/outputs defined in component.yaml. The latter will be
used when determining inputs/outputs during compilation.

Only works for tasks that turn into ContainerOps during compilation.
Naturally, does not work for ContainerOps (which have no inputs/outputs
anyway).
@neuromage
Copy link
Contributor Author

/assign @chensun
/assign @ji-yaqi

This is ready to be looked at now. Thanks!

@neuromage
Copy link
Contributor Author

/hold cancel

if (name in original_arguments and
type_utils.is_parameter_type(spec_type) and
name not in task._parameter_arguments and
name not in task.artifact_arguments):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my education, in which scenario would name not in task._parameter_arguments and name not in task.artifact_arguments both evaluate to true?
My guess is this could happen for optional inputs without default value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I added this while debugging, but it turns out the previous lines actually cover the case we care about. I don't think this can ever happen actually. I've removed this bit now.

@@ -848,7 +850,7 @@ def inputs(self):
# called the 1st time (because there are in-place updates to `PipelineParam`
# during compilation - remove in-place updates for easier debugging?)
if not self._inputs:
self._inputs = []
self._inputs = self._component_spec_inputs or []
# TODO replace with proper k8s obj?
for key in self.attrs_with_pipelineparams:
self._inputs += _pipeline_param.extract_pipelineparams_from_any(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like _component_spec_inputs is a list of inputs reference PipelineParam, in another word, inputs with constant values would not appear in this list. It that right?
If so, 1) maybe we can rename it to better reflect this fact. 2) I wonder if this self._inputs += _pipeline_param.extract_pipelineparams_from_any is still needed, because it looks to me you would have covered them when you construct _component_spec_inputs down below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Re: 1) I updated the names to be more specific and added a comment.

Re: 2), I can't remove that, as some params are specified on things like the container image. So this won't work I think. Also, the same code is used by things like ResourceOp, which does not have a component spec so it won't work there either.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. That makes sense.
(This also reminds me the topic that component may have implicit inputs not captured in its inputs spec, and whether we should allow this to happen)

Copy link
Member

@chensun chensun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@@ -848,7 +850,7 @@ def inputs(self):
# called the 1st time (because there are in-place updates to `PipelineParam`
# during compilation - remove in-place updates for easier debugging?)
if not self._inputs:
self._inputs = []
self._inputs = self._component_spec_inputs or []
# TODO replace with proper k8s obj?
for key in self.attrs_with_pipelineparams:
self._inputs += _pipeline_param.extract_pipelineparams_from_any(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. That makes sense.
(This also reminds me the topic that component may have implicit inputs not captured in its inputs spec, and whether we should allow this to happen)

@chensun
Copy link
Member

chensun commented Aug 10, 2021

/approve

@google-oss-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chensun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-robot google-oss-robot merged commit 911562f into kubeflow:master Aug 10, 2021
@Bobgy
Copy link
Contributor

Bobgy commented Aug 12, 2021

Hi @neuromage @chensun, I guess this PR broke postsubmit integration tests, here's my investigations:
#6311 (comment)

I've ruled out other possibilities -- the failure is a change in KFP v1 compiler, and this PR looks like the only KFP v1 compiler change since the last time postsubmit test was passing. Can you take a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants