Git dependencies' submodules with relative URLs handled incorrectly (regression from 1.1) #6499
Comments
Poetry has never recursively cloned submodules -- this isn't exactly a regression, but a new deficiency in the Dulwich Git implementation. I would encourage you to open a Dulwich issue for this as I do not believe there is anything to do from the Poetry end to make Dulwich understand these submodules as relative to the base remote. |
Thanks for looking into this! My understanding was that Poetry added code specifically to recursively clone submodules when using Dulwich, as it's not something Dulwich itself does: poetry/src/poetry/vcs/git/backend.py Lines 425 to 428 in 890c6a3
Poetry then calls Dulwich's Edit: I guess jelmer/dulwich#506 tracks providing the interface Poetry would want to use to recursively clone a repository and its submodules, rather than Poetry having to figure this out by itself. |
Ah, I see. We have a PR open to make the system Git client clone recursively, and I guess I got turned around on what the Dulwich implementation is doing. It looks like we probably need to munge the submodule URLs ourselves then (and add recursive cloning to the system Git client), or remove recursive cloning, to match the behavior of the system Git client. |
@neersighted I can probably do this, though am new to Poetry so would need some guidance. What approach is preferable, munging the submodule URLs and adding recursive cloning, or simply removing recursive cloning? |
Munging would be preferable in my opinion -- we don't want to regress new functionality unless there's no choice, and a PR making the system git client execute a recursive clone has already been merged. |
@neersighted Do we still want to implement this, or would we rather just track the Dulwich PR mentioned above? |
The linked Dulwich issue is an issue and not a PR, and barring someone from Dulwich commenting that they're interested in implementing it soon, I think we should handle this on the Poetry side. |
The issue on the Dulwich side is for porcelain; Poetry uses the plumbing from Dulwich directly. There's already a function in the plumbing for getting a list of submodules. |
@neersighted can you please give some guidance on how to test this? The other submodule test simply checks to see if the submodule directory exists, which is simple enough. Given that the submodule specified in python-poetry/test-fixture-vcs-repository#5 is itself a branch of the main The only difference between the submodule ( Appreciate it in advance. |
Hi @evanrittenhouse -- I'm not sure what you mean. Poetry should choke and die on the bad submodule (indeed, I forgot we couldn't merge this until we had the fix in, and broke CI for a bit there) -- you can just make your PR test against https://github.com/python-poetry/test-fixture-vcs-repository/tree/relative_submodule (I moved the commit there) instead of main, and we'll swap main back/fix up your PR right before merge. Essentially, the existing tests capture this failure state -- you just need to make sure they pass/we don't choke on the URL. |
Hey @neersighted, thanks for doing that - I see the submodule folder on the Sorry for all the questions + appreciate the patience. I haven't worked much with Git submodules, but if I understand them correctly we'd need to update |
Unless @jelmer has a better idea, what I'd expect is this: When cloning submodules, determine if any is a relative path instead of a URL. If so, resolve the relative path relative to the URL of the 'main' repository. This would look like:
Those URLs would need to be mutated after we use the Dulwich plumbing to list them, and before we pass the source URLs and destination paths back to Dulwich to clone. |
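For illustration, the mutation step described above might look roughly like the following. This is a minimal sketch, not Poetry's or Dulwich's actual code: the function names and the (path, url) pair shape are hypothetical, and it assumes the parent remote is in scheme://host/path form (scp-style remotes such as user@host:org/repo are not handled).

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def is_relative_submodule_url(url: str) -> bool:
    # Git treats a submodule URL as relative to the superproject
    # when it starts with "./" or "../".
    return url.startswith(("./", "../"))


def resolve_submodule_url(parent_url: str, submodule_url: str) -> str:
    # Hypothetical helper: absolute URLs pass through untouched.
    if not is_relative_submodule_url(submodule_url):
        return submodule_url
    parts = urlsplit(parent_url)
    # Treat the parent repository path as a directory and normalise
    # the "." / ".." segments of the combined path.
    joined = posixpath.normpath(posixpath.join(parts.path, submodule_url))
    return urlunsplit(parts._replace(path=joined))


def resolve_all(parent_url, submodules):
    # `submodules` is assumed to be an iterable of (path, url) pairs,
    # i.e. whatever was listed before handing the URLs back for cloning.
    return [(path, resolve_submodule_url(parent_url, url)) for path, url in submodules]
```

The design choice here is to only touch the path component, so the scheme and host of the parent remote are carried over unchanged regardless of protocol.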
Yep, exactly what @neersighted mentioned. I don't think you necessarily want the (current) porcelain, you probably want to open .submodules and call |
Apologies for the delay, work has been crazy. Commenting here so the issue doesn't get picked up by someone else as I'm basically code complete - will hopefully be able to submit a PR next weekend. |
are there any updates on this? |
This has not been properly fixed by #7017. Unfortunately, urljoin does not resolve relative paths against ssh:// URLs:
>>> from urllib.parse import urljoin
>>> urljoin("ssh://[email protected]/org/repo", "../other-repo")
'../other-repo'
Rather than the expected 'ssh://[email protected]/org/other-repo'. The way to work around this would be to join the paths from a URL (with a trailing slash on the repository path, so its last segment is kept), and then recombine with the protocol:
>>> from urllib.parse import urlparse, urlunparse, urljoin
>>> repo_url = urlparse("ssh://[email protected]/org/repo")
>>> other_repo_url = repo_url._replace(path=urljoin(repo_url.path + "/", "../other-repo"))
>>> urlunparse(other_repo_url)
'ssh://[email protected]/org/other-repo' |
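One subtlety with joining on the path alone: urljoin drops the last segment of a base path that has no trailing slash, whereas Git treats the repository path as a directory. The sketch below wraps the workaround in a helper that normalises the trailing slash first. The name join_repo_url and the example.com URLs are made up for illustration, and scp-style remotes are out of scope.

```python
from urllib.parse import urljoin, urlparse, urlunparse


def join_repo_url(repo_url: str, relative: str) -> str:
    # Hypothetical helper, not part of Poetry or Dulwich.
    parsed = urlparse(repo_url)
    # Force exactly one trailing slash so urljoin keeps the final
    # path segment (the repository name) when resolving "..".
    base_path = parsed.path.rstrip("/") + "/"
    new_path = urljoin(base_path, relative)
    # Recombine with the original scheme and host.
    return urlunparse(parsed._replace(path=new_path))


# Without the trailing slash the repository name is discarded first,
# so ".." climbs one level too far.
without_slash = urljoin("/org/repo", "../other-repo")
with_slash = urljoin("/org/repo/", "../other-repo")
```

Here without_slash comes out as "/other-repo" while with_slash is "/org/other-repo", which is the result Git itself would produce for a sibling repository.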
This issue should be re-opened. |
@neersighted, as per the comment above, this should be re-opened. I had a fix ready some months back but it hasn't gotten any attention. |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
I am on the latest Poetry version.
I have searched the issues of this repo and believe that this is not a duplicate.
If an exception occurs when executing a command, I executed it again in debug mode (-vvv option).
OS version and name: Ubuntu 20.04
Poetry version: 1.2.0
Issue
I have a project with a git dependency. The dependency uses git submodules, and the dependency's .gitmodules file contains a submodule using a relative URL, e.g.:
When Poetry attempts to install the dependency, it clones the repository and then attempts to clone each submodule, but it passes the raw relative URL from the configuration to _clone() and thus _fetch_remote_refs(), which calls Dulwich's get_transport_and_path(), which decides the URL is for a local repository so it attempts to find ../bar.git on the filesystem, which fails. I think it should instead detect relative URLs and append them to the root repository's URL so that Dulwich will realise it's a remote repo and return the appropriate client.
If I clone the dependency using git on the command line, with --recurse-submodules, git correctly uses a relative URL to fetch the submodule. If I force Poetry to use the legacy system git instead of Dulwich, it doesn't seem to clone the submodules at all, so also doesn't run into trouble (in this case the dependency's submodule isn't needed to install the dependency, just for testing). Similarly this also worked OK on poetry 1.1 which I think used the system git to recursively clone submodules.
It looks like this is from #5428.
Here's a representative traceback:
Traceback