-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't use from_registry for generic components #251
Labels
Core
Core framework
Comments
Closed
RobbeSneyders
moved this from Ready for development
to In Progress
in Fondant development
Jul 9, 2023
RobbeSneyders
added a commit
that referenced
this issue
Jul 18, 2023
Fixes #251, #252, #253 Before this PR, the operation for a generic component (a reusable component that dynamically takes a component specification), needed to be created as follows: ```python component_op = ComponentOp.from_registry( name="generic_component", component_spec_path="components/custom_generic_component/fondant_component.yaml", ) ``` But the provided name wasn't actually used, since the component specification already contains a reference to the reusable image that it should use. Now we can define both custom and generic components as follows: ```python component_op = ComponentOp( component_dir="components/custom_generic_component", ) ``` There is still a difference in how we want to handle custom and generic components though. Custom components should be built by the local runner, while generic components should not, since they use a reusable image. To make this differentiation, we now simply check if a `Dockerfile` is present in the provided `Component_dir`. This will be the case for a custom component, but not for a generic component. --- This has a nice implicit result for reusable components as well, which can still be defined as: ```python component_op = ComponentOp.from_registry( name="reusable_component", ) ``` Which `fondant` now resolves to: ```python component_op = ComponentOp( component_dir="{fondant_install_dir}/components/custom_generic_component", ) ``` I added a change to this PR which no longer packages the reusable component code with the `fondant` package, but only the component specifications, as those are all that is needed since they contain a reference to the reusable image on the registry. This means that the `component_dir` above doesn't contain a `Dockerfile` when `fondant` is installed from PyPI, but does when you locally install `fondant` using `poetry install`. So the local runner doesn't build reusable components when users install `fondant` from PyPI, but it does when you're working on the `fondant` repo, which is useful for us `fondant` developers. --- The only thing we still need is an option on the runner to provide `build_args`, so we can pass in the `fondant` version to build into the image. But I'll open a separate PR for that.
github-project-automation
bot
moved this from Validation
to Done
in Fondant development
Jul 18, 2023
satishjasthi
pushed a commit
to satishjasthi/fondant
that referenced
this issue
Jul 21, 2023
Fixes ml6team#251, ml6team#252, ml6team#253 Before this PR, the operation for a generic component (a reusable component that dynamically takes a component specification), needed to be created as follows: ```python component_op = ComponentOp.from_registry( name="generic_component", component_spec_path="components/custom_generic_component/fondant_component.yaml", ) ``` But the provided name wasn't actually used, since the component specification already contains a reference to the reusable image that it should use. Now we can define both custom and generic components as follows: ```python component_op = ComponentOp( component_dir="components/custom_generic_component", ) ``` There is still a difference in how we want to handle custom and generic components though. Custom components should be built by the local runner, while generic components should not, since they use a reusable image. To make this differentiation, we now simply check if a `Dockerfile` is present in the provided `Component_dir`. This will be the case for a custom component, but not for a generic component. --- This has a nice implicit result for reusable components as well, which can still be defined as: ```python component_op = ComponentOp.from_registry( name="reusable_component", ) ``` Which `fondant` now resolves to: ```python component_op = ComponentOp( component_dir="{fondant_install_dir}/components/custom_generic_component", ) ``` I added a change to this PR which no longer packages the reusable component code with the `fondant` package, but only the component specifications, as those are all that is needed since they contain a reference to the reusable image on the registry. This means that the `component_dir` above doesn't contain a `Dockerfile` when `fondant` is installed from PyPI, but does when you locally install `fondant` using `poetry install`. So the local runner doesn't build reusable components when users install `fondant` from PyPI, but it does when you're working on the `fondant` repo, which is useful for us `fondant` developers. --- The only thing we still need is an option on the runner to provide `build_args`, so we can pass in the `fondant` version to build into the image. But I'll open a separate PR for that.
Hakimovich99
pushed a commit
that referenced
this issue
Oct 16, 2023
Fixes #251, #252, #253 Before this PR, the operation for a generic component (a reusable component that dynamically takes a component specification), needed to be created as follows: ```python component_op = ComponentOp.from_registry( name="generic_component", component_spec_path="components/custom_generic_component/fondant_component.yaml", ) ``` But the provided name wasn't actually used, since the component specification already contains a reference to the reusable image that it should use. Now we can define both custom and generic components as follows: ```python component_op = ComponentOp( component_dir="components/custom_generic_component", ) ``` There is still a difference in how we want to handle custom and generic components though. Custom components should be built by the local runner, while generic components should not, since they use a reusable image. To make this differentiation, we now simply check if a `Dockerfile` is present in the provided `Component_dir`. This will be the case for a custom component, but not for a generic component. --- This has a nice implicit result for reusable components as well, which can still be defined as: ```python component_op = ComponentOp.from_registry( name="reusable_component", ) ``` Which `fondant` now resolves to: ```python component_op = ComponentOp( component_dir="{fondant_install_dir}/components/custom_generic_component", ) ``` I added a change to this PR which no longer packages the reusable component code with the `fondant` package, but only the component specifications, as those are all that is needed since they contain a reference to the reusable image on the registry. This means that the `component_dir` above doesn't contain a `Dockerfile` when `fondant` is installed from PyPI, but does when you locally install `fondant` using `poetry install`. So the local runner doesn't build reusable components when users install `fondant` from PyPI, but it does when you're working on the `fondant` repo, which is useful for us `fondant` developers. --- The only thing we still need is an option on the runner to provide `build_args`, so we can pass in the `fondant` version to build into the image. But I'll open a separate PR for that.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Summary of discussion in #224
Currently the operation for a generic component (a reusable component that dynamically takes a component specification), needs to be created as follows:
But the name isn't actually used. Instead the component specification contains a reference to the generic image that it should use. So ideally, we could create it as follows instead:
One issue here is that we currently use the way a
ComponentOp
is initialized to decide if the local runner should add a build step for it. Since this looks like a custom component, it will try to build it, while there is no Dockerfile or code available at the specified location.We should therefore find a different mechanism to decide if a component needs to be built. If we agree that a component should have a fixed directory structure, we can instead detect if a Dockerfile is present. We currently already use the same directory structure for each component, which is:
If we fix this structure, we can also deduct the location of the component specification and further simplify the definition of a component to:
Such a fixed directory structure also provides opportunities to implement a component generator / cookie-cutter in the future.
Tasks
The text was updated successfully, but these errors were encountered: