Add writer component #196
Thanks @PhilippeMoussalli!
"""Base class for a Fondant write component."""

@classmethod
def _add_and_parse_args(cls, spec: ComponentSpec):
This is currently the same as the TransformComponent, but I guess the output manifest is not required here?
We still have a lot of duplication in these methods for only small differences, so I would like to figure out a better way to do this. Maybe we should still create separate schemas?
I tackled this in the PR that follows this one by specifying the optional parameters as a specific attribute per component type.
We could also tackle it here by having separate specs per component type. This would of course need more reworking and would also require specifying the component type within the component spec YAML (which is not currently the case). Is this what you mean by separate schemas?
That could be a better place to do it indeed. But let's merge the current implementation, it's already an improvement.
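To illustrate the idea discussed above (declaring the optional parameters as an attribute per component type), here is a minimal sketch. It is not Fondant's actual implementation: the `Component` base class, the `optional_args` attribute, the `build_parser` method, and the argument names are all hypothetical and chosen only for this example.

```python
import argparse
from typing import List


class Component:
    """Hypothetical base class, not Fondant's real one."""

    # Names of arguments that this component type does not require.
    optional_args: List[str] = []

    @classmethod
    def build_parser(cls, arg_names: List[str]) -> argparse.ArgumentParser:
        """Shared parsing logic; only `optional_args` differs per subclass."""
        parser = argparse.ArgumentParser()
        for name in arg_names:
            parser.add_argument(
                f"--{name}",
                type=str,
                required=name not in cls.optional_args,
            )
        return parser


class TransformComponent(Component):
    # A transform reads a manifest and writes one, so both are required.
    optional_args: List[str] = []


class WriteComponent(Component):
    # A writer produces no output manifest, so that argument is optional.
    optional_args: List[str] = ["output_manifest_path"]


# Assumed argument names, for illustration only.
ARG_NAMES = ["input_manifest_path", "output_manifest_path"]

write_args = WriteComponent.build_parser(ARG_NAMES).parse_args(
    ["--input_manifest_path", "in.json"]
)
```

With this layout the parsing method lives in one place, and each component type only states which arguments it may omit.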
PR that enables adding default arguments as discussed in #179. Users can now define default arguments in the component specs. Those arguments do not have to be explicitly defined in the `ComponentOp` but can still be overridden.

Note: please review this [PR](#196) first, since this branch is based on it.

Created a separate ticket, #198, for the necessary changes. Best to handle it in a separate PR.

Co-authored-by: Robbe Sneyders <[email protected]>
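The behaviour described in this commit message (defaults come from the component spec, and the user can still override them) can be sketched as a simple merge. This is an illustration under assumptions, not the code from the PR: `resolve_arguments`, the spec layout, and the argument names are hypothetical.

```python
from typing import Any, Dict


def resolve_arguments(
    spec_args: Dict[str, Dict[str, Any]],
    user_args: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge user-provided arguments with defaults from a (hypothetical) spec."""
    unknown = set(user_args) - set(spec_args)
    if unknown:
        raise ValueError(f"Unknown arguments: {sorted(unknown)}")

    resolved: Dict[str, Any] = {}
    for name, definition in spec_args.items():
        if name in user_args:
            # An explicit value always wins over the default.
            resolved[name] = user_args[name]
        elif "default" in definition:
            resolved[name] = definition["default"]
        else:
            raise ValueError(f"Missing required argument: {name}")
    return resolved


# Assumed spec contents, for illustration only.
spec_args = {
    "dataset_name": {"type": "str"},
    "batch_size": {"type": "int", "default": 8},
}

with_default = resolve_arguments(spec_args, {"dataset_name": "images"})
overridden = resolve_arguments(
    spec_args, {"dataset_name": "images", "batch_size": 32}
)
```

Here `with_default` picks up `batch_size` from the spec, while `overridden` uses the value passed by the user.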
PR that adds a writer class, as discussed in #138. This lets us write the final dataset without having to write an output dataset and manifest, since the writer makes no modification to the data.

Next steps:
- Enable default and optional arguments in components. The optional arguments are needed to make the Reader/Writer components generic (e.g. writing to the hub requires special HF metadata to be attached to the image column if there is one, so the user needs to pass an optional argument specifying the name of the image column).
- Reimplement the load/write to hub components to make them more generic.