Optimize submodules fetching #5399

Closed
wants to merge 2 commits into from

@Kha (Contributor) commented Oct 17, 2021

Replace the current naive submodule-fetching code with code that does close to the minimal amount of necessary work and fetching.

  • For remote repos, we clone into .cache/nix as usual, then checkout & submodule update into the temp directory using --work-tree. It doesn't get much more direct than that.
  • For local repos, we shouldn't do the same because we don't want to change the HEAD of that repo (and there is no direct way to check out submodules for a specific commit that is not HEAD). Instead, we do a cheap clone into the temp dir that shares all objects, including those of submodules, with the original repo. Note that if the submodule commit has not yet been fetched into the original repo, submodule update will then fetch it into the temp dir, so checking out another commit afterwards will fetch the upstream again. This could potentially be improved by fetching into .cache/nix even for local repos.
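The two strategies above can be sketched as plain git argument lists. This is only an illustration: the remote-case arguments mirror the runProgram calls in this PR, while the local-case flags (`--shared`, `--no-checkout`) and the function names are assumptions, since the description does not spell out the exact clone command or how submodule objects end up shared.

```cpp
#include <string>
#include <vector>

using Args = std::vector<std::string>;

// Remote case: repoDir is the clone kept in the cache; files are written
// directly into tmpDir via --work-tree. These arguments follow the PR's code.
std::vector<Args> remoteFetchCommands(const std::string & repoDir,
                                      const std::string & tmpDir,
                                      const std::string & rev)
{
    return {
        { "git", "--git-dir", repoDir, "--work-tree", tmpDir,
          "checkout", "--quiet", rev, "." },
        { "git", "--git-dir", repoDir, "--work-tree", tmpDir, "-C", tmpDir,
          "submodule", "--quiet", "update", "--init", "--recursive" },
    };
}

// Local case: leave the user's repo (and its HEAD) untouched by cloning into
// tmpDir. ASSUMPTION: "--shared --no-checkout" stands in for the "cheap clone
// that shares all objects"; the PR may use different flags.
std::vector<Args> localFetchCommands(const std::string & repoDir,
                                     const std::string & tmpDir,
                                     const std::string & rev)
{
    return {
        { "git", "clone", "--quiet", "--shared", "--no-checkout", repoDir, tmpDir },
        { "git", "-C", tmpDir, "checkout", "--quiet", rev },
        { "git", "-C", tmpDir, "submodule", "--quiet", "update", "--init", "--recursive" },
    };
}
```

In both cases only tmpDir receives a working tree; the difference is whether the git directory is the cached clone (remote) or a throwaway clone of the local repo.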

Some measurements:

Store-copying (prefetched) Nixpkgs without submodules/with submodules (old)/with submodules (new)

$ nix store delete /nix/store/jq0sw73dh69bgm588277jvxzg881vjmq-source && sync && time nix eval --expr 'builtins.fetchGit { url = "https://github.com/NixOS/nixpkgs"; rev = "de084e5de9a4279320f902c3b92df34c47798fdd"; }'
1 store paths deleted, 91.36 MiB freed
{ lastModified = 1633782067; lastModifiedDate = "20211009122107"; narHash = "sha256-GGWh0sqDoll1JSqRri7CAm5dWfhtTMt3OG0arGF+Lxw="; outPath = "/nix/store/jq0sw73dh69bgm588277jvxzg881vjmq-source"; rev = "de084e5de9a4279320f902c3b92df34c47798fdd"; revCount = 322195; shortRev = "de084e5"; submodules = false; }
nix eval --expr   3.35s user 1.08s system 89% cpu 4.954 total
$ /run/current-system/sw/bin/nix store delete /nix/store/jq0sw73dh69bgm588277jvxzg881vjmq-source && sync && time nix eval --expr 'builtins.fetchGit { url = "https://github.com/NixOS/nixpkgs"; rev = "de084e5de9a4279320f902c3b92df34c47798fdd"; submodules = true; }'
1 store paths deleted, 91.36 MiB freed
{ lastModified = 1633782067; lastModifiedDate = "20211009122107"; narHash = "sha256-GGWh0sqDoll1JSqRri7CAm5dWfhtTMt3OG0arGF+Lxw="; outPath = "/nix/store/jq0sw73dh69bgm588277jvxzg881vjmq-source"; rev = "de084e5de9a4279320f902c3b92df34c47798fdd"; revCount = 322195; shortRev = "de084e5"; submodules = true; }
nix eval --expr   369.88s user 17.36s system 321% cpu 2:00.27 total
$ nix store delete /nix/store/jq0sw73dh69bgm588277jvxzg881vjmq-source && sync && time nix eval --expr 'builtins.fetchGit { url = "https://github.com/NixOS/nixpkgs"; rev = "de084e5de9a4279320f902c3b92df34c47798fdd"; }'
1 store paths deleted, 91.36 MiB freed
{ lastModified = 1633782067; lastModifiedDate = "20211009122107"; narHash = "sha256-GGWh0sqDoll1JSqRri7CAm5dWfhtTMt3OG0arGF+Lxw="; outPath = "/nix/store/jq0sw73dh69bgm588277jvxzg881vjmq-source"; rev = "de084e5de9a4279320f902c3b92df34c47798fdd"; revCount = 322195; shortRev = "de084e5"; submodules = false; }
nix eval --expr   3.37s user 1.12s system 90% cpu 4.983 total

Store-copying a repository with actual submodules:

$ nix store delete /nix/store/2m5fx9xvm0pmzxkcrv1y37bab43jhkq3-source && sync && time /run/current-system/sw/bin/nix eval --expr 'builtins.fetchGit { url = "https://github.com/rust-lang/rust"; rev = "bb918d0a5bf22211df0423f7474e4e4056978007"; }'
don't know how to build these paths:
  /nix/store/2m5fx9xvm0pmzxkcrv1y37bab43jhkq3-source
0 store paths deleted, 0.00 MiB freed
{ lastModified = 1633773654; lastModifiedDate = "20211009100054"; narHash = "sha256-4gxA+B1VjwQuZp3YBCMD5u/nHqtSmgJxBixdAitGwQQ="; outPath = "/nix/store/2m5fx9xvm0pmzxkcrv1y37bab43jhkq3-source"; rev = "bb918d0a5bf22211df0423f7474e4e4056978007"; revCount = 155932; shortRev = "bb918d0"; submodules = false; }
/run/current-system/sw/bin/nix eval --expr   1.92s user 0.79s system 91% cpu 2.947 total
$ nix store delete /nix/store/4cx1ma5z5nj6g1gzmhgklqcpyhi28gi7-source && sync && time /run/current-system/sw/bin/nix eval --expr 'builtins.fetchGit { url = "https://github.com/rust-lang/rust"; rev = "bb918d0a5bf22211df0423f7474e4e4056978007"; submodules = true; }'
1 store paths deleted, 66.49 MiB freed
{ lastModified = 1633773654; lastModifiedDate = "20211009100054"; narHash = "sha256-hfbswcOMUjFMW9duxtnYfg1670QKfbwJ2+LMmu9hGvo="; outPath = "/nix/store/4cx1ma5z5nj6g1gzmhgklqcpyhi28gi7-source"; rev = "bb918d0a5bf22211df0423f7474e4e4056978007"; revCount = 155932; shortRev = "bb918d0"; submodules = true; }
/run/current-system/sw/bin/nix eval --expr   544.72s user 42.96s system 263% cpu 3:43.00 total
$ nix store delete /nix/store/smls8ip6zb6lijp0m8yj7yfdigxcjc58-source && sync && time nix eval --expr 'builtins.fetchGit { url = "https://github.com/rust-lang/rust"; rev = "bb918d0a5bf22211df0423f7474e4e4056978007"; submodules = true; }'
1 store paths deleted, 1183.98 MiB freed
{ lastModified = 1633773654; lastModifiedDate = "20211009100054"; narHash = "sha256-hfbswcOMUjFMW9duxtnYfg1670QKfbwJ2+LMmu9hGvo="; outPath = "/nix/store/4cx1ma5z5nj6g1gzmhgklqcpyhi28gi7-source"; rev = "bb918d0a5bf22211df0423f7474e4e4056978007"; revCount = 155932; shortRev = "bb918d0"; submodules = true; }
nix eval --expr   7.07s user 3.46s system 73% cpu 14.291 total

And the local repo that triggered #5280 in the first place:

$ nix store delete /nix/store/kabkv0czslhw3q35i0z0f2pi9brvprbc-source && sync && time nix eval --expr 'builtins.fetchGit { url = /home/sebastian/lean/lean; rev = "e843fb7ca5b596945820b76eaaa51754e8acfe47"; }'
1 store paths deleted, 101.73 MiB freed
{ lastModified = 1634457683; lastModifiedDate = "20211017080123"; narHash = "sha256-tG3ynyLzBXfE2lOWFPmH7Gpqme3tRmWcSTQlO4uY3hg="; outPath = "/nix/store/kabkv0czslhw3q35i0z0f2pi9brvprbc-source"; rev = "e843fb7ca5b596945820b76eaaa51754e8acfe47"; revCount = 26227; shortRev = "e843fb7"; submodules = false; }
nix eval --expr   0.34s user 0.29s system 88% cpu 0.711 total
$ nix store delete /nix/store/6vrqagyxshba0ym66147y2ha21ar6j51-source && sync && time /run/current-system/sw/bin/nix eval --expr 'builtins.fetchGit { url = /home/sebastian/lean/lean; rev = "e843fb7ca5b596945820b76eaaa51754e8acfe47"; submodules = true; }'
1 store paths deleted, 101.87 MiB freed
{ lastModified = 1634457683; lastModifiedDate = "20211017080123"; narHash = "sha256-PzoqcAswT89mBObk0q36eiHNFCtaHRIf3Eps1oKbons="; outPath = "/nix/store/6vrqagyxshba0ym66147y2ha21ar6j51-source"; rev = "e843fb7ca5b596945820b76eaaa51754e8acfe47"; revCount = 26227; shortRev = "e843fb7"; submodules = true; }
/run/current-system/sw/bin/nix eval --expr   105.60s user 3.50s system 428% cpu 25.441 total
$ nix store delete /nix/store/6vrqagyxshba0ym66147y2ha21ar6j51-source && sync && time nix eval --expr 'builtins.fetchGit { url = /home/sebastian/lean/lean; rev = "e843fb7ca5b596945820b76eaaa51754e8acfe47"; submodules = true; }'
1 store paths deleted, 101.87 MiB freed
{ lastModified = 1634457683; lastModifiedDate = "20211017080123"; narHash = "sha256-PzoqcAswT89mBObk0q36eiHNFCtaHRIf3Eps1oKbons="; outPath = "/nix/store/6vrqagyxshba0ym66147y2ha21ar6j51-source"; rev = "e843fb7ca5b596945820b76eaaa51754e8acfe47"; revCount = 26227; shortRev = "e843fb7"; submodules = true; }
nix eval --expr   0.36s user 0.27s system 45% cpu 1.375 total

TL;DR: no overhead from submodules = true; on repos without submodules, and a reasonable overhead on repos with submodules.

I am only 90% sure this all makes sense and covers at least as many cases as the current code, so please do give it a try & come up with weird edge cases.

Comment on lines +459 to +461
runProgram("git", true, { "--git-dir", repoDir, "--work-tree", tmpDir, "checkout", "--quiet", input.getRev()->gitRev(), "." });
// `-C` should be redundant, probably a bug in `git submodule`
runProgram("git", true, { "--git-dir", repoDir, "--work-tree", tmpDir, "-C", tmpDir, "submodule", "--quiet", "update", "--init", "--recursive" });
@Kha (Contributor Author) commented Oct 17, 2021

This sequence (which existed before) should probably be covered by a repo lock to prevent races.
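A minimal sketch of what such a lock could look like, assuming a POSIX system and an exclusive flock(2) on a lock file next to the repo. RepoLock, isLocked and the lock file path are hypothetical names for illustration; this is not Nix's own locking helper, and the PR does not implement it.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

#include <stdexcept>
#include <string>

// RAII guard holding an exclusive flock(2) on lockPath for its lifetime.
// ASSUMPTION: illustrative only, not part of the PR.
class RepoLock
{
    int fd = -1;

public:
    explicit RepoLock(const std::string & lockPath)
    {
        fd = open(lockPath.c_str(), O_CREAT | O_RDWR | O_CLOEXEC, 0600);
        if (fd == -1)
            throw std::runtime_error("cannot open lock file " + lockPath);
        if (flock(fd, LOCK_EX) == -1) { // blocks until other holders release
            close(fd);
            throw std::runtime_error("cannot lock " + lockPath);
        }
    }

    ~RepoLock()
    {
        flock(fd, LOCK_UN);
        close(fd);
    }

    RepoLock(const RepoLock &) = delete;
    RepoLock & operator=(const RepoLock &) = delete;
};

// Demonstration helper: reports whether another open file description
// currently holds the lock, without waiting for it.
bool isLocked(const std::string & lockPath)
{
    int fd = open(lockPath.c_str(), O_CREAT | O_RDWR | O_CLOEXEC, 0600);
    if (fd == -1)
        throw std::runtime_error("cannot open lock file " + lockPath);
    bool busy = flock(fd, LOCK_EX | LOCK_NB) == -1;
    close(fd); // closing drops the lock again if the probe acquired it
    return busy;
}
```

The checkout and submodule update calls would then run inside a scope that holds a RepoLock, so a second process working on the same cached repo waits until both steps have finished.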

@kaii-zen (Contributor) commented Nov 4, 2021

... siphoning submodules stakeholders to #5497

@stale (bot) commented Jun 12, 2022

I marked this as stale due to inactivity.

@stale (bot) added and removed the stale label on Jun 12, 2022
@stale (bot) added the stale label on Jan 7, 2023
@Kha (Contributor Author) commented Mar 18, 2023

Closing in favor of #6530, see also #7862 (comment)

@Kha closed this on Mar 18, 2023
@stale (bot) removed the stale label on Mar 18, 2023