-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sdk): using component's pip_index_urls for Dockerfile generation. Fixes #8816 #8871
feat(sdk): using component's pip_index_urls for Dockerfile generation. Fixes #8816 #8871
Conversation
Hi @b4sus. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@b4sus, this looks great! Thank you for the contribution.
Just a few small nitpicks. Could you also add a release note to ./sdk/RELEASE.md
that describes this as a feature to support pip index_urls
in Containerized Python Components?
if comp.pip_index_urls is not None: | ||
pip_index_urls.extend(comp.pip_index_urls) | ||
if len(pip_index_urls) > 0: | ||
self._pip_index_urls = list(dict.fromkeys(pip_index_urls)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain what's happening here? It seems like this converts a list to dict then back to list, but it's possible I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a little trick to get rid of possible duplicates preserving order (dict guarantees insertion order since 3.7, see here.
I will remove this as it is a bit too defensive and it is not done for any other list arguments I've seen - to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat! Thank you for explaining and good thinking.
I actually think it makes sense to keep this deduping. I can imagine containerizing many components that each use the same single URL. We don't need to list that package registry multiple times. This is probably the most realistic use case, since combining the pip_index_urls
of components that each use a different registry wouldn't really make sense.
And the order preserving is good, too. There's no logical total ordering across multiple components, but it's easy enough to preserve a partial ordering, so let's do it.
The entries in the requirements file (runtime-requirements.txt
) are deduplicated (via set()
) and have some of the same concerns/considerations that arise from combining the component parameters for multiple components, so there's one precedent for deduping.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, the deduplication is actually necessary when more components in a file share the same pypi url, dockerfile needs it only once - what is I guess expected use-case - company having own pypi index.
I've put it back in :)
Preserving order is nice to have for unit tests, otherwise set would be fine (like in runtime-requirements case).
sdk/python/kfp/cli/component_test.py
Outdated
|
||
WORKDIR /usr/local/src/kfp/components | ||
COPY runtime-requirements.txt runtime-requirements.txt | ||
RUN pip install --index-url https://pypi.org/simple --trusted-host https://pypi.org/simple --extra-index-url https://example.com/pypi/simple --trusted-host https://example.com/pypi/simple --no-cache-dir -r runtime-requirements.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(not your code; feel free to ignore) Looks like string joiner implementation results in two spaces between the first entry and the second (before the first --extra-index-url
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
Co-authored-by: Connor McCarthy <[email protected]>
…s string; duplicate removal not necessary
…url - we need it only once for docker image
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
Thank you, @b4sus! This is a valuable contribution.
/remove-lgtm Just caught this: looks like the release notes entry is under |
@b4sus: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/lgtm Thanks again, @b4sus! |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: connor-mccarthy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
As described in issue #8816, pip_index_urls can be used for
pip install
command inside of generatedDockerfile
- handy for containerized python components.The PR is reusing same code for generating the index url options string, which is used for lightweight python component.