-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable optional component arguments #201
Conversation
79c42d6
to
64614f7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @PhilippeMoussalli, I think this approach is indeed quite elegant! Left one comment and you'll need to rebase / merge :)
fondant/component_spec.py
Outdated
"String": lambda value: str(value) if value != "None" else None, | ||
"Integer": lambda value: int(value) if value != "None" else None, | ||
"Float": lambda value: float(value) if value != "None" else None, | ||
"Boolean": lambda value: ast.literal_eval(value) if value != "None" else None, | ||
"JsonObject": lambda value: json.loads(value) if value != "None" else None, | ||
"JsonArray": lambda value: json.loads(value) if value != "None" else None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it makes sense converting this into a function.
"String": lambda value: str(value) if value != "None" else None, | |
"Integer": lambda value: int(value) if value != "None" else None, | |
"Float": lambda value: float(value) if value != "None" else None, | |
"Boolean": lambda value: ast.literal_eval(value) if value != "None" else None, | |
"JsonObject": lambda value: json.loads(value) if value != "None" else None, | |
"JsonArray": lambda value: json.loads(value) if value != "None" else None, | |
kubeflow_to_python_type_dict = { | |
"String": str, | |
"Integer": int, | |
"Float": float, | |
"Boolean": ast.literal_eval, | |
"JsonObject": json.loads, | |
"JsonArray": json.loads, | |
} | |
def kubeflow_to_python_type(type_: str) -> t.Any: | |
map_fn = kubeflow_to_python_type_dict[type_] | |
return lambda value: map_fn(value) if value != "None" else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or use the match case
syntax: https://peps.python.org/pep-0622/
def kubeflow_to_python_type(type: str):
match type:
case "String": lambda value: str(value) if value != "None" else None
case "Integer": lambda value: int(value) if value != "None" else None
case "Float": lambda value: float(value) if value != "None" else None
case "Boolean": lambda value: ast.literal_eval(value) if value != "None" else None
case "JsonObject": lambda value: json.loads(value) if value != "None" else None
case "JsonArray": lambda value: json.loads(value) if value != "None" else None
case _: raise ValueError(f"Unknown type: {type}")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ignore the above match case
is python 3.10+ and fondant is 3.8+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback :) I think it's ready to go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thx!
PR that enables defining optional arguments to the component spec. This is needed to enable building reusable components. For example, the `load_from_hub` component will need to format images from hf format to bytes and this can be achieved by passing a specific argument (`image_column_name`). This can be achieved either by: - Passing `None` explicitly as an argument value at the `ComponentOp` level - Assigning `None` as the `default` argument field in the component spec. Previous attempt was to achieve this with the [optional field of the kubeflow spec](https://www.kubeflow.org/docs/components/pipelines/v1/reference/component-spec/#:~:text=Kubeflow%20Pipelines%20SDK.-,optional,-%3A%20Specifies%20if%20input) but this returned many errors and also requires adding in an additional `optional` field to the fondant spec. The current approach achieves the same outcome with a simpler approach. The `component_spec` argument is now also defined as an optional argument. This allows use to define a component using the `from_file()` method without having to explicitly define that argument. We still return a valid error if `from_args()` is used and no `component_spec` argument is provided since it defaults to `None` [link](https://github.com/ml6team/fondant/blob/e90ac782a8c253720456fcbd05a9ced086ca5547/fondant/component.py#L61)
PR that enables defining optional arguments to the component spec. This is needed to enable building reusable components.
For example, the
load_from_hub
component will need to format images from hf format to bytes and this can be achieved by passing a specific argument (image_column_name
). This can be achieved either by:None
explicitly as an argument value at theComponentOp
levelNone
as thedefault
argument field in the component spec.Previous attempt was to achieve this with the optional field of the kubeflow spec but this returned many errors and also requires adding in an additional
optional
field to the fondant spec. The current approach achieves the same outcome with a simpler approach.The
component_spec
argument is now also defined as an optional argument. This allows use to define a component using thefrom_file()
method without having to explicitly define that argument. We still return a valid error iffrom_args()
is used and nocomponent_spec
argument is provided since it defaults toNone
link