-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate to KfpV2 #477
Migrate to KfpV2 #477
Conversation
ae4d1c2
to
146e728
Compare
adresses: #417
PR that adds option to set hardware accelerators to KFP V2 (nodepool Notes: * Each component was previously configured as a sub-pipeline because of an error in detecting a certain field in the V2 spec (`exec-<sanitizied_component_name`) which forces to kubeflow to compile component to a pipeline spec instead of a component spec [link](https://github.com/kubeflow/pipelines/blob/ad058b5321f5fe74bbd10845778e6627fb9aa5cb/sdk/python/kfp/dsl/structures.py#L462). This was resolved by adding the same custom method used by KFP to format the component name to Fondant. This makes it possible to assign resources to each component since KFP does not allow assigning resources on a pipeline level * Setting up and running nodepools is still to be tested on native KFP
PR that fixes the v2 default /optional types. KFP v2 has two different optional types: - Optional values that default to None -> defined by setting `isOptional` to True and not passing the default value - Optional values that default to another values -> defined by setting the value in `defaultValue` and having `isOptional` False or empty. This PR adjusts the componentSpec accordingly. Another nice win is that we do not have to define bools/dicts/lists as strings in the component spec Few things no note: * Due to strict type. Mixing different data types is not allowed -> we had to change the type of `input_partition_rows` to int. Which means it does not accept the value `disable` anymore. Need to fix this in a seperate PR * 'inf` as float is not allowed
146e728
to
32a8c04
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @GeorgesLorre.
Mostly looks good to me! Having a bit of a hard time understanding some of the argument parsing / converting, but the argument usage in the tests seems logical.
Left some comments.
src/fondant/component_spec.py
Outdated
) -> "KubeflowComponentSpec": | ||
"""Create a Kubeflow component spec from a Fondant component spec.""" | ||
def from_fondant_component_spec(cls, fondant_component: ComponentSpec): | ||
"""Generate a Kubeflow component spec from a ComponentOp.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Generate a Kubeflow component spec from a ComponentOp.""" | |
"""Generate a Kubeflow component spec from a Fondant component spec.""" |
29e27c8
to
fa14ea0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Robbe! Left a few comments. Can we maybe merge this one into another branch and then tackle all other issues on other branches or would you prefer to merge directly to main?
n_rows_to_load: | ||
description: Optional argument that defines the number of rows to load. Useful for testing pipeline runs on a small scale | ||
type: int | ||
default: None | ||
default: -1 | ||
index_column: | ||
description: Column to set index to in the load component, if not specified a default globally unique index will be set | ||
type: str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to later on revert back to None and handle the conversion internally? It could be something like setting an arbitrary value if it's None in the fondant component and parsing it back to None again when the component is run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's such a bad thing that the default needs to match the argument type. We would also need an arbitrary value for each type to make this work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we would assign a value as an optional argument if the default is set to None
. Now the notion of optional arguments is missing and optional arguments are checked implicitly against arbitrary values defined in the default which can be a bit confusing.
Was more thinking from the perspective of the end user. So if they would assign a default to None
, we would handle it as optional and return it as None
to the end user. Implementing this is still not very clear to me but we could just assign arbitrary known values. For example, if the users sets an list argument to None
, we would pass it as []
and the convert it back to None
during parsing
image_column_names: t.Optional[list], | ||
n_rows_to_load: t.Optional[int], | ||
index_column: t.Optional[str], | ||
image_column_names: list, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep them as optional since they have a default or does optional
only apply when the default is None
by convention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional means that a None
value can be passed in or set as a default. This is no longer the case now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we should have optional argument (see comment above)
# entry: python scripts/component_readme/generate_readme.py | ||
# files: ^components/.*/fondant_component.yaml | ||
# additional_dependencies: ["fondant"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to deactivate this since it installs the latest version of fondant on PyPI, which breaks on the changes we made to the component spec schema. We can re-enable this later.
|
||
# add ops to pipeline | ||
pipeline.add_op(load_from_hub_op) | ||
# pipeline.add_op(download_images_op, dependencies=load_from_hub_op) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you uncomment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Robbe, left one comment regarding arguments but I think we can discuss it separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, let's merge!
No description provided.