Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Git submodules in source-package-repository #7625

Merged
merged 3 commits into from
Oct 22, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 30 additions & 17 deletions cabal-install/src/Distribution/Client/VCS.hs
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ import qualified Data.Char as Char
import qualified Data.List as List
import qualified Data.Map as Map
import System.FilePath
( takeDirectory )
( takeDirectory, (</>) )
import System.Directory
( doesDirectoryExist
, removeDirectoryRecursive
Expand Down Expand Up @@ -398,17 +398,18 @@ vcsGit =
vcsCloneRepo verbosity prog repo srcuri destdir =
[ programInvocation prog cloneArgs ]
-- And if there's a tag, we have to do that in a second step:
++ [ (programInvocation prog (checkoutArgs tag)) {
progInvokeCwd = Just destdir
}
| tag <- maybeToList (srpTag repo) ]
++ [ git (resetArgs tag) | tag <- maybeToList (srpTag repo) ]
++ [ git (["submodule", "sync", "--recursive"] ++ verboseArg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much does this, and other more complex git commands you introduced, slow down the cases where there are no submodules or where submodules are handled fine already (are there such cases?). What other risks there are from these more complex commands?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much does this, and other more complex git commands you introduced, slow down the cases where there are no submodules?

I just tried timing the new commands on the cabal repo and got:

$ time git submodule deinit --force --all
git submodule deinit --force --all  0.03s user 0.04s system 113% cpu 0.056 total

$ time git submodule sync
git submodule sync  0.04s user 0.03s system 109% cpu 0.062 total

$ time git submodule update --init --recursive --force
git submodule update --init --recursive --force  0.03s user 0.04s system 109% cpu 0.066 total

$ time git submodule foreach --recursive "git clean -ffdxq"
git submodule foreach --recursive "git clean -ffdxq"  0.03s user 0.04s system 110% cpu 0.061 total

$ time git clean -ffdxq
git clean -ffdxq  0.00s user 0.01s system 96% cpu 0.011 total

So, with this imprecise test, it seems like this PR may have added extra latency of ~0.15 seconds per repository (which doesn't have submodules).

or where submodules are handled fine already (are there such cases?)

I don't think there were any cases where submodules worked.

What other risks there are from these more complex commands?

I can see this becoming a breaking change for some setup where a package repository submodules a private repository which is not needed for compiling haskell. To support this, a flag might be required to disable submodule fetching.

I can't think of any other risks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see this becoming a breaking change for some setup where a package repository submodules a private repository which is not needed for compiling haskell. To support this, a flag might be required to disable submodule fetching.

How do you feel about waiting with that flag until we get actual reports that people need it? Is there a workaround for such people in the absence of the flag?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am OK with waiting for the flag.
As a work around maybe people could maintain forks of of depedencies which remove such submodules, but this would be very inconvenient for repositories which need to be updated a lot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So finally we will not have the flag to hide the new feature?

Out of curiosity, did you test it (performance oriented) with large git repos with many submodules?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested it with a large git repo, there wasn't much difference, most time was just spent first time cloning the base repo. The repo had 3 submodules, which were all small, so fetching them was comparatively quick.

, git (["submodule", "update", "--init", "--force", "--recursive"] ++ verboseArg)
]
where
git args = (programInvocation prog args) {progInvokeCwd = Just destdir}
cloneArgs = ["clone", srcuri, destdir]
++ branchArgs ++ verboseArg
branchArgs = case srpBranch repo of
Just b -> ["--branch", b]
Nothing -> []
checkoutArgs tag = "checkout" : verboseArg ++ [tag, "--"]
resetArgs tag = "reset" : verboseArg ++ ["--hard", tag, "--"]
verboseArg = [ "--quiet" | verbosity < Verbosity.normal ]

vcsSyncRepos :: Verbosity
Expand All @@ -431,24 +432,36 @@ vcsGit =
if exists
then git localDir ["fetch"]
else git (takeDirectory localDir) cloneArgs
git localDir checkoutArgs
-- Before trying to checkout other commits, all submodules must be
-- de-initialised and the .git/modules directory must be deleted. This
-- is needed because sometimes `git submodule sync` does not actually
-- update the submodule source URL. Detailed description here:
-- https://git.coop/-/snippets/85
git localDir ["submodule", "deinit", "--force", "--all"]
let gitModulesDir = localDir </> ".git" </> "modules"
gitModulesExists <- doesDirectoryExist gitModulesDir
when gitModulesExists $ removeDirectoryRecursive gitModulesDir
git localDir resetArgs
git localDir $ ["submodule", "sync", "--recursive"] ++ verboseArg
git localDir $ ["submodule", "update", "--force", "--init", "--recursive"] ++ verboseArg
git localDir $ ["submodule", "foreach", "--recursive"] ++ verboseArg ++ ["git clean -ffxdq"]
git localDir $ ["clean", "-ffxdq"]
where
git :: FilePath -> [String] -> IO ()
git cwd args = runProgramInvocation verbosity $
(programInvocation gitProg args) {
progInvokeCwd = Just cwd
}

cloneArgs = ["clone", "--no-checkout", loc, localDir]
++ case peer of
Nothing -> []
Just peerLocalDir -> ["--reference", peerLocalDir]
++ verboseArg
where loc = srpLocation
checkoutArgs = "checkout" : verboseArg ++ ["--detach", "--force"
, checkoutTarget, "--" ]
checkoutTarget = fromMaybe "HEAD" (srpBranch `mplus` srpTag)
verboseArg = [ "--quiet" | verbosity < Verbosity.normal ]
cloneArgs = ["clone", "--no-checkout", loc, localDir]
++ case peer of
Nothing -> []
Just peerLocalDir -> ["--reference", peerLocalDir]
++ verboseArg
where loc = srpLocation
resetArgs = "reset" : verboseArg ++ ["--hard", resetTarget, "--" ]
resetTarget = fromMaybe "HEAD" (srpBranch `mplus` srpTag)
verboseArg = [ "--quiet" | verbosity < Verbosity.normal ]

gitProgram :: Program
gitProgram = (simpleProgram "git") {
Expand Down
Loading