Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add writer component #196

Merged
merged 2 commits into from
Jun 13, 2023
Merged

Add writer component #196

merged 2 commits into from
Jun 13, 2023

Conversation

PhilippeMoussalli
Copy link
Contributor

PR that adds a writer class as discussed in #138.
This enables us to write the final dataset without having to write the dataset and manifest since there is no modification made on the data.

Next steps:

  • Enable default and optional arguments in components. The optional arguments are needed to make the Reader/Writer components generic (e.g. Write to hub requires special hf metadata to be attached to the image column in case there is any, user needs to pass an optional argument specifying the columns name of the image)
  • Re implement load/Write to hub component to make them more generic.

@PhilippeMoussalli PhilippeMoussalli requested review from RobbeSneyders and GeorgesLorre and removed request for RobbeSneyders June 12, 2023 09:46
@PhilippeMoussalli PhilippeMoussalli self-assigned this Jun 12, 2023
@PhilippeMoussalli PhilippeMoussalli added Core Core framework Components Implementation of components labels Jun 12, 2023
@PhilippeMoussalli PhilippeMoussalli added this to the 0.2.0 milestone Jun 12, 2023
@PhilippeMoussalli PhilippeMoussalli linked an issue Jun 12, 2023 that may be closed by this pull request
Copy link
Member

@RobbeSneyders RobbeSneyders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"""Base class for a Fondant write component."""

@classmethod
def _add_and_parse_args(cls, spec: ComponentSpec):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently the same as the TransformComponent, but I guess the output manifest is not required here?

We still have a lot of duplication in these methods for only small differences, so would like to figure out a better way to do this. Maybe we should still create separate schemas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tackled this in a the PR that follows this by specifying the optional parameters as a specific attribute per component type

https://github.com/ml6team/fondant/pull/199/files#:~:text=if%20arg.name%20in%20cls.optional_fondant_arguments()%3A

we could also tackle it here by having separate secs per component type. This of course would need more reworking and would also required specifying the component type within the component spec yaml (which is not currently the case). Is this what you mean by separate schemas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could be a better place to do it indeed. But let's merge the current implementation, it's already an improvement.

@PhilippeMoussalli PhilippeMoussalli merged commit 8b60c48 into main Jun 13, 2023
@PhilippeMoussalli PhilippeMoussalli deleted the write-component-class branch June 13, 2023 14:03
PhilippeMoussalli added a commit that referenced this pull request Jun 13, 2023
PR that enables adding default arguments as discussed in #179. 

Users can now define default arguments in the component specs. Those
arguments do not have to be explicitly defined in the `ComponentOp` but
could still be overridden.

Note: please review this
[PR](#196) beforehand since I am
branched off from it.

Created a separate ticket to do the necessary changes #198. Best to
handle it in a separate PR.

---------

Co-authored-by: Robbe Sneyders <[email protected]>
Hakimovich99 pushed a commit that referenced this pull request Oct 16, 2023
PR that adds a writer class as discussed in #138. 
This enables us to write the final dataset without having to write the
dataset and manifest since there is no modification made on the data.

Next steps: 
- Enable default and optional arguments in components. The optional
arguments are needed to make the Reader/Writer components generic (e.g.
Write to hub requires special hf metadata to be attached to the image
column in case there is any, user needs to pass an optional argument
specifying the columns name of the image)
- Re implement load/Write to hub component to make them more generic.
Hakimovich99 pushed a commit that referenced this pull request Oct 16, 2023
PR that enables adding default arguments as discussed in #179. 

Users can now define default arguments in the component specs. Those
arguments do not have to be explicitly defined in the `ComponentOp` but
could still be overridden.

Note: please review this
[PR](#196) beforehand since I am
branched off from it.

Created a separate ticket to do the necessary changes #198. Best to
handle it in a separate PR.

---------

Co-authored-by: Robbe Sneyders <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Components Implementation of components Core Core framework
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Implement a WriteComponent class
2 participants