Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Allow using repository mirrors #3624

Closed
wants to merge 6 commits into from
Closed

feat: Allow using repository mirrors #3624

wants to merge 6 commits into from

Conversation

ZeroAurora
Copy link

@ZeroAurora ZeroAurora commented Jan 31, 2021

Pull Request Check List

Current status: ready for review and merge

Resolves: #1632, Resolves: #1841, Resolves: #2151, Resolves: #3355, Resolves: #3474, Resolves: #3479, Resolves: #3553

  • Added basic functionality.
  • Resolved the blocking problems below.
  • Added tests for changed code.
    (btw I don't know how to write good tests so I need some help...)
  • Updated documentation for changed code.

Details

This PR introduces a config repositories.<repo>.mirror.url, which is used to specify a mirror url of a existing repository.
This can be extremely helpful in regions where it's hard to connect to PyPI and other sources. (And also makes poetry my first choice for my python projects.)

Blocking Problems

  • Avoid publishing to mirror repositories.
    Currently I achieved the goals by modifing factory.py, which is used to initialize the Poetry object used by others. I'm not sure if it will break uploading.
    I checked poetry/publisher.py and ensured this won't happen.

Non-blocking Problems

  • Fallback to upstream sources. Related: Poetry source fallback fails when a source is unreachable #3565
    Currently (in the PR), poetry won't fallback to original source if a source fails. I couldn't think of a way to resolve it. But both pip and pipenv seem not to fallback, so we can be a bit evil and not fallback too lolol
  • Exported requirements.txt should not use mirrors. Related: #2617
    I've tested it and PyPI mirror won't show up in requirements.txt.

This is my first PR to such a large project 🎉 so I would be ❤️ if you help with this PR!

poetry/factory.py Outdated Show resolved Hide resolved
@sinoroc
Copy link

sinoroc commented Jan 31, 2021

I'm glad to see someone attacking this topic. @ZeroAurora I hope you manage to get it done.

@ZeroAurora
Copy link
Author

ZeroAurora commented Jan 31, 2021

OK most things of this PR are done. Now I need some help with the test...

@ZeroAurora ZeroAurora marked this pull request as ready for review January 31, 2021 12:07
@ZeroAurora
Copy link
Author

I'm trying to learn testing little by little. Maybe I will push some tests later.

poetry/factory.py Outdated Show resolved Hide resolved
@ZeroAurora
Copy link
Author

I'm making huge changes to the code I've written and will rebase and force-push again later, since I just want to keep the log clean.

@ZeroAurora
Copy link
Author

ZeroAurora commented Jan 31, 2021

Now everything is ready. This PR is ready for merging.
I'm surprised at the large number of mistakes I'm made and the large number of rebases I've used. And a lot of force-push is also used. Maybe I'm too eager to see if I have done. If you're annoyed at it, I feel really, really sorry.
Thank you git rebase, you've saved my life.

@ZeroAurora
Copy link
Author

ZeroAurora commented Jan 31, 2021

🎉 Completed! This time is true. Apologize again for so many force pushes.

@ZeroAurora

This comment has been minimized.

@finswimmer finswimmer requested a review from a team January 31, 2021 19:15
@finswimmer
Copy link
Member

Thanks a lot for your contribution @ZeroAurora. 👍

Because I'm not familiar with this part of poetry I've pinged the other maintainer to have a look at your PR.

@ZeroAurora
Copy link
Author

ZeroAurora commented Feb 1, 2021

Thanks! At the same time I'm going to rebase and modify the commits again to keep the log clean.

@ZeroAurora
Copy link
Author

Hey I've rebased this to the latest master branch. How's the reviewing process going?
(idk why flake is blocking linting. Seems not my fault.)

@ZeroAurora
Copy link
Author

Fixed something. Seems the linting problem is still not my fault.

@sinoroc
Copy link

sinoroc commented Feb 7, 2021

Fixed something. Seems the linting problem is still not my fault.

Yes, looks like the linter runner itself is failing somehow.

Hey I've rebased this to the latest master branch. How's the reviewing process going?

Patience. Maintainers will get to this eventually, but it might take some time, months maybe, because they do not have much time/resources to invest on unpaid volunteer work.

My advice to improve your chances of getting this PR reviewed earlier, is to make the maintainers' work easier. For example you could get as many eyes as possible on this. In your case there are many people affected by the underlying issue, as shown by the many linked issues. So if you feel confident enough with your PR, then go post on those issues and ask others to test your PR and see if it does indeed fix their use cases.

Good luck!

@ZeroAurora
Copy link
Author

Thanks @sinoroc ! I'm going to follow your advice.

@ZeroAurora
Copy link
Author

I know where's the problem! fixing.

... to enable further improvements on mirrored repository.

One example is that the cache dir of the mirrored PyPI repository should
also be named "pypi". However, the repository name "pypi" is reserved
for the original PyPI. Doing this change to LegacyRepository will not
only allow this, but also enable other developers to make changes if
mirrored LegacyRepository is broken.

The corresponding tests are also added. Note that there are some changes
on the paths to some files.
... and reduced some debug messages to ensure good reading experience.
Comment on lines -23 to -24
if TYPE_CHECKING:
from .repositories.legacy_repository import LegacyRepository
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've deleted these two lines since LegacyRepository is already imported above.

@amotl
Copy link

amotl commented Feb 9, 2021

Dear @ZeroAurora,

thanks a stack for your attempt to tackle this improvement to Poetry. We have been talking about topics related to this at earthobservations/wetterdienst#332 just recently. At earthobservations/wetterdienst#332 (comment), I just summarized different discussions which already took place within the Poetry issue tracker.

Specifically, it would be really nice if your patch could also address the use case not only to use a repository mirror, but also to turn off reaching out to the internet at all by going to a repository on the local filesystem (or the ~/.cache/wheelhouse) directory. This option would preferably be aligned with pips interactive commandline flag --no-index as outlined by @webknjaz, so that it can be specifically used on CI environments when installing dependencies from a local folder being already populated (and cached) from a previous build.

Good luck and keep up the spirit!

With kind regards,
Andreas.

@ZeroAurora
Copy link
Author

ZeroAurora commented Feb 10, 2021

Hey @amotl !
Thank you for your encouragement! But as I look into the issue you mentioned, I find that using a local flat-file repository may be out of the scope of this PR.
A solution might be adding a new repository type (maybe we should call it FlatRepository or whatever), but as I've mentioned, this won't be in the scope of the PR.
I've looked into your issue and I think the best solution can be caching poetry's cache directory ($XDG_CACHE_DIR/pypoetry). Here poetry caches wheels downloaded from each repository. This is similar to what you want to do, and you can achieve the same goal of speeding up CI by doing this.
I've reviewed your issue again and found that you've already tried that. I'm sorry I don't have any solutions, but I would love to hear someone working on this.
Good luck!

@amotl
Copy link

amotl commented Feb 10, 2021

Hey @ZeroAurora,

thanks for your kind response. I recognize that this request is out of scope of this patch, thanks to your elaboration. Eventually, I might come back to this in order to improve the situation for our use case, thus bringing similar features to Poetry like what pip has to offer in this regard.

Through your patch, it is already very valuable to us to know which places to look at.

With kind regards,
Andreas.

@sinoroc
Copy link

sinoroc commented Feb 10, 2021

@amotl
I have not exactly followed the discussion in details, but looks like you might be interested in: https://github.com/uranusjr/simpleindex
Basically it allows you to easily start a local index server based on simple rules, and the files can be served from the local filesystem. So I guess maybe if configured right, it could do what you are after.

@amotl
Copy link

amotl commented Feb 10, 2021

Dear @sinoroc,

simpleindex looks nice, indeed! But within a CI job like Travis CI oder GitHub Actions, even that would be additional hassle to bring up a sidecar service and wrap that around the wheelhouse directory just for that purpose. However, it might be doable.

Thank you very much for your suggestion. Let's continue our discussion at earthobservations/wetterdienst#332 when applicable? Then, we won't hijack the discussion around @ZeroAurora's contribution any longer. Sorry again for doing that on the first hand!

With kind regards,
Andreas.

@koterpillar
Copy link

I have just tried this branch (e0bc6d4) with CodeArtifact (essentially a private PyPI mirror that requires authentication). I have the following configuration:

[repositories.pypi.mirror]
url = "https://xxxxxxxx-NNNNNNNNNNN.d.codeartifact.xxxxxx-n.amazonaws.com/pypi/xxxxxxxx/"

and I have provided credentials for it using poetry config http-basic.pypi aws <AWS_TOKEN_HERE>.

However, loading any packages fails with "Unable to find installation candidates".

@koterpillar
Copy link

Some fixes you're welcome to merge:

8288f5b
997d6a1

After these, I can install from CodeArtifact:

ENDPOINT="$(aws codeartifact get-repository-endpoint --domain xxxxxxxx --domain-owner nnnnnnnnnnnn --repository xxxxxxxx --format pypi --query repositoryEndpoint --output text)simple/"
TOKEN="$(aws codeartifact get-authorization-token --domain xxxxxxxx --domain-owner nnnnnnnnnnnn --query authorizationToken --output text)"
poetry config repositories.pypi.mirror "$ENDPOINT"
poetry config http-basic.pypi aws "$TOKEN"

However, when updating or adding new packages, Poetry starts inserting these sections into the lockfile:

[package.source]
type = "legacy"
url = "https://xxxxxxxx-nnnnnnnnnnnn.d.codeartifact.rrrrrr-n.amazonaws.com/pypi/xxxxxxxx/simple"
reference = "PyPI"

@ZeroAurora
Copy link
Author

thx for the patches, I'm considering to merge them when I'm free (now very busy as a student)
As for the source lines in the lockfiles I don't have any ideas... now when I review this PR I made, I feel there are indeed much more things we need to do. Maybe adding some lines to stop lockfile generator from inserting mirrors into lockfile? I may investigate it soom(tm)

@ZeroAurora
Copy link
Author

I feel sorry, but as I couldn't handle this PR any longer, I have to close it. Debugging and keeping up-to-date with upstream are not easy for me (I'm a student, both in real world and in programming world, after all), and I don't have time to manage them either.
I do hope someone will take my commits and continue to investigate, or just rewrite from scratch. Sorry again for not solving this issue.

@ZeroAurora ZeroAurora closed this May 1, 2021
@ZeroAurora
Copy link
Author

I will not delete the branch, just in case that someone would like to base on it.

@ZeroAurora
Copy link
Author

I'm deleting the branch since it has already had many diffs with the upstream. Re-implementing is more realistic.
I hope someone would take this issue up again someday.

@ZeroAurora ZeroAurora deleted the feat/repository-mirror branch December 11, 2021 16:39
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 29, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.