[sdk] Containerized Python Component module not found error #8385
Comments
Hi @connor-mccarthy, I am trying to follow the docs and I ended up with the same error as #8353. How can I help to close this issue? |
I can reproduce it with even just one module. Working directory:
├── main.py
├── poetry.lock
├── poetry.toml
├── pyproject.toml
└── utils.py
Where:
main.py
utils.py
Results:
╰─>$ kfp component build .
Building component using KFP package path: kfp==2.0.0-beta.11
attempted relative import with no known parent package
Alternatively:
╰─>$ kfp component build bug/
Building component using KFP package path: kfp==2.0.0-beta.11
attempted relative import with no known parent package
Adding a […] has the same result.
Moving the files to a […]
Update: Even with changing the […]
I still get the same error. My Poetry configuration just has the package installation. |
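The error message above can be reproduced outside of KFP. The following is a minimal sketch, assuming the build step loads the component file by path as a top-level module (an assumption about the mechanism, not KFP's actual code). The file contents are hypothetical stand-ins for `main.py` and `utils.py`.

```python
import importlib.util
import pathlib
import tempfile


def load_by_path(module_name: str, file_path: pathlib.Path):
    """Load a Python file as a top-level module, i.e. with no parent package."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def reproduce() -> str:
    """Return the ImportError message raised by the relative import."""
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        # Hypothetical file contents, for illustration only.
        (root / "utils.py").write_text("def helper():\n    return 'ok'\n")
        (root / "main.py").write_text("from .utils import helper\n")
        try:
            load_by_path("main", root / "main.py")
        except ImportError as err:
            return str(err)
    return ""


if __name__ == "__main__":
    print(reproduce())
```

Because the module is loaded under a bare name, Python has no package to resolve `.utils` against, so the relative import fails regardless of which directory the command is run from.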
@Davidnet, thank you for the detailed reproduction notes. This is on our backlog.
If you're interested, you're welcome to address this bug and submit a PR! |
@connor-mccarthy Awesome, thanks for the response. Any ideas on where I could start looking? |
Thanks, @Davidnet. Some links: |
Hey guys,
I patched the file by adding […]
Now comes the funny part, bear with me here :)
In other words, Python would never need to check […]
However, if the order returned from glob were, for example, […]
As a quick fix, I added the following to the kfp file (
This should work when all the component Python files are in the current directory where you execute […]
It also works with slightly more structure, e.g.:
Then from
|
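A minimal sketch of the kind of order dependence described in this comment. It assumes that each matched file is loaded by path and registered in `sys.modules` under its file stem; that is an assumption about the mechanism, not KFP's actual code, and the module names are hypothetical.

```python
import importlib.util
import pathlib
import sys
import tempfile

# Hypothetical module names, chosen to avoid clashing with real packages.
UTILS = "kfp_demo_order_utils"
MAIN = "kfp_demo_order_main"


def load_in_order(directory: pathlib.Path, stems: list) -> None:
    """Load each file by path and register it in sys.modules under its stem."""
    for stem in stems:
        spec = importlib.util.spec_from_file_location(
            stem, directory / f"{stem}.py")
        module = importlib.util.module_from_spec(spec)
        sys.modules[stem] = module
        spec.loader.exec_module(module)


def try_order(directory: pathlib.Path, stems: list) -> bool:
    """Return True if loading in this order succeeds, False otherwise."""
    try:
        load_in_order(directory, stems)
        return True
    except ModuleNotFoundError:
        return False
    finally:
        # Reset state so the next attempt starts from scratch.
        for stem in stems:
            sys.modules.pop(stem, None)


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        directory = pathlib.Path(tmp)
        (directory / f"{UTILS}.py").write_text("VALUE = 1\n")
        (directory / f"{MAIN}.py").write_text(f"import {UTILS}\n")
        utils_first = try_order(directory, [UTILS, MAIN])
        main_first = try_order(directory, [MAIN, UTILS])
    return utils_first, main_first


if __name__ == "__main__":
    print(demo())
```

When the dependency happens to be loaded first, the later `import` is satisfied from `sys.modules` and `sys.path` is never consulted. In the opposite order the import has to search `sys.path`, which does not contain the directory.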
Thanks for this thorough investigation, @b4sus. This makes sense based on the errors we've observed.
Is this quickfix a final fix? Or is there a more robust fix you have in mind? If final, are you interested in submitting a PR? I would be happy to review promptly. |
Hey @connor-mccarthy, let's assume a standard (I think) Python project (let's call it
Now (considering my fix) you have to run
This will work, but there are caveats:
All this is fine for me, but needs to be considered. |
I agree with this. I'm working on a v2 docs refresh currently and will keep this in mind.
Thank you for laying out these considerations. In general, as long as there are no regressions (all existing user code that uses Containerized Python Components will still work), I'm happy to eagerly merge a better but not perfect fix. It sounds like some of these constraints may have been introduced by the proposed fix, however (such as the requirement to use the cwd […]
If we do wait, I'll take a stab at this soon and certainly leverage your investigation. Thank you again for the thorough writeup -- this helps a lot. |
Thanks for the issue breakdown @b4sus and @connor-mccarthy. I have created a PR #9157 to essentially add the module directory to the sys.path within the |
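The idea of adding the module directory to `sys.path` can be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: the function name, the choice to insert at position 0, and the cleanup afterwards are all hypothetical.

```python
import importlib.util
import pathlib
import sys
import tempfile


def load_module_with_dir_on_path(module_name: str, file_path: pathlib.Path):
    """Load a file by path with its directory temporarily first on sys.path."""
    module_dir = str(file_path.parent)
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.path.insert(0, module_dir)
    try:
        spec.loader.exec_module(module)
    finally:
        # Cleanup is a choice made for this sketch only.
        sys.path.remove(module_dir)
    return module


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        directory = pathlib.Path(tmp)
        # Hypothetical sibling modules living next to each other.
        (directory / "kfp_demo_sibling_helpers.py").write_text(
            "def greet():\n    return 'hello from helper'\n")
        (directory / "kfp_demo_sibling_component.py").write_text(
            "from kfp_demo_sibling_helpers import greet\nRESULT = greet()\n")
        module = load_module_with_dir_on_path(
            "kfp_demo_sibling_component",
            directory / "kfp_demo_sibling_component.py")
        return module.RESULT


if __name__ == "__main__":
    print(demo())
```

With the directory on `sys.path`, sibling modules resolve by name no matter which file is loaded first.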
Just tested the fix with b15, nice to see progress on this topic 👍
[…] and using absolute imports. From the component module I import a util function from another module (via absolute import, e.g. in file |
Thanks for testing and updating this, @b4sus. Reopening. |
…nents. Fixes kubeflow#8385 (kubeflow#9157)
* Add module directory to sys.path
* Add nested module imports unit test
* Add release note to release.md
This bug seems to have been introduced in |
@b4sus, I've been looking into a fix for this and my sense is that we may not want to support absolute imports at this time, since it requires that the KFP component executor […]
In a bit more detail, the […]
This is of course something that we could change by adding additional parameters to the component build process or doing something "smart" under the hood in the component build logic, but I'm not sure the cost of either (a) exposing those parameters to users or (b) maintaining that smart logic is worth the benefit. It seems reasonable that the module that contains the component definition(s) should be runnable as a script (with relative imports), rather than something that requires pip installation (with absolute imports).
Let me know if you have other thoughts or if my understanding needs refinement. Thank you, by the way, for the very helpful minimal reproducible example. |
Hey @connor-mccarthy, I might have another idea for a fix though :). What about adding […]
import sys
from typing import Optional

def build(components_directory: str, component_filepattern: str, engine: str,
          kfp_package_path: Optional[str], overwrite_dockerfile: bool,
          build_image: bool, platform: str, push_image: bool):
    """Builds containers for KFP v2 Python-based components."""
    # Make modules under components_directory importable during the build.
    sys.path.append(components_directory)
    if build_image and engine != 'docker':
        ...  # remainder not shown
Then I can run […] |
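To illustrate why appending the components directory to `sys.path` would let absolute imports resolve, here is a self-contained sketch. The package name, file layout, and loader function are hypothetical and only stand in for a project root passed as `components_directory`.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

PACKAGE = "kfp_demo_project"  # hypothetical package name


def load_by_path(module_name: str, file_path: pathlib.Path):
    """Load a Python file by path, as a stand-in for the build's loader."""
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        root = pathlib.Path(tmp)
        package_dir = root / PACKAGE
        package_dir.mkdir()
        (package_dir / "__init__.py").write_text("")
        (package_dir / "utils.py").write_text("def helper():\n    return 'ok'\n")
        component = package_dir / "component.py"
        component.write_text(
            f"from {PACKAGE}.utils import helper\nRESULT = helper()\n")

        # Without the project root on sys.path the absolute import fails.
        try:
            load_by_path("component_before", component)
            failed_before = False
        except ModuleNotFoundError:
            failed_before = True

        # Mirror the proposal: append the root directory, then load again.
        sys.path.append(str(root))
        importlib.invalidate_caches()
        try:
            result = load_by_path("component_after", component).RESULT
        finally:
            sys.path.remove(str(root))
            for name in [n for n in sys.modules
                         if n == PACKAGE or n.startswith(PACKAGE + ".")]:
                del sys.modules[name]
    return failed_before, result


if __name__ == "__main__":
    print(demo())
```

The sketch only covers import resolution at build time. Whether the same imports resolve inside the built container depends on what is copied into the image and on its working directory, which this example does not model.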
Hey @b4sus, I have a project structure similar to the one you mentioned above. I wonder how you handle the components within the pipelines and how you separate them. If the pipeline consists of more than one component, let's say
For now I use a CI/CD pipeline to build and push the components and finally compile and submit the pipeline job (using Google Vertex AI). The component build command looks like the following. This might be unorthodox, but it was the only way I could find that works with absolute imports.
kfp component build . --component-filepattern ${{ matrix.component }}/component.py --push-image
# The matrix.component variable looks like, for example, "myproject/components/preprocessing"
Now the problem is that the generated Dockerfile always has a […]
Before the containerized Python components I used the Docker components and only copied the parts of the project relevant to the component. But that required a lot of manual work and led to errors because of missing files. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it. |
There is a bug when building a containerized Python component that happens (at least) in the case when the longest path of the import graph ending at the component involves >2 modules.
Environment
KFP SDK 2.0.0-beta.6
Steps to reproduce
For example:
Then:
kfp component build .
You get a No module named error.
Expected result
Should build without an error.
Materials and Reference
Related: #8353