repo-server duplicates git packfiles, filling up disk #8845
Comments
The code itself is a little difficult to follow. Here's the important part.
err = gitClient.Init()
// the old way
//err = gitClient.Fetch("")
//err = gitClient.Checkout(revision, false)
// push a commit here
//err = gitClient.Fetch("")
//err = gitClient.Checkout(revision, false)
// observe that the pack file has not been duplicated
// the new way
err = gitClient.Fetch("some-revision")
err = gitClient.Checkout("FETCH_HEAD", false)
// push a commit here
err = gitClient.Fetch("some-revision")
err = gitClient.Checkout("FETCH_HEAD", false)
// observe that the pack file HAS been duplicated
Basically if you comment out the new way and uncomment the old way, you won't observe the duplicated pack files. I've created a way to reproduce the bug with just git:
I'm not sure what this teaches us.
Okay. So when you run [...] git does however support fetching a specific SHA. That will resolve even refs which are not in the default refspec. So when a user specifies [...]
Unfortunately, it seems like git isn't very tidy when you fetch specific SHAs like that.
So here is my proposal. Instead of defaulting to fetching specific commits, first just do a standard [...]
For users of non-standard refs, this causes a performance hit, because you run [...]
For users of standard refs, this improves disk usage, because [...]
Will put up a PR tomorrow.
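Read literally, the proposal above amounts to: run a plain fetch first, try the checkout, and only fetch the specific revision when that checkout fails. The following is a minimal shell sketch of that ordering, not the code from the eventual PR. The function name checkout_revision is made up for illustration, and it assumes the current directory is a repo with a remote named origin.

```shell
# Hypothetical helper: NOT the actual Argo CD implementation.
# Assumes the current directory is a git repo with a remote named "origin".
checkout_revision() {
    rev="$1"

    # Standard fetch first: this uses the remote's configured refspec.
    git fetch origin || return 1

    # Common case: the revision is already available after the plain fetch.
    if git checkout "$rev"; then
        return 0
    fi

    # Fallback for revisions outside the default refspec:
    # fetch that one revision explicitly, then check out FETCH_HEAD.
    git fetch origin "$rev" || return 1
    git checkout FETCH_HEAD
}
```

With this ordering, only revisions that the plain fetch cannot resolve pay for a second fetch, which is the performance trade-off the proposal describes.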
Another demonstration:
mkdir argo-cd
cd argo-cd/
git init
git remote add origin https://github.com/argoproj/argo-cd.git
git fetch origin
git checkout 497e53b0203638409e3083fa2ffac7d8fb3cce14
git fetch origin
git checkout 32be020af0f8bf6438201ee79b4d2b8037c57154
git fetch origin
git checkout 32d33dedcc70d94177384b235891b99d89497273
git fetch origin
git checkout 2e65b42f05bcc1401d1489e751993ec197f6942c
git fetch origin
git checkout b1ff9dbe1e3e3b2520e94eefc77d0322c765cd75
ls .git/objects/pack # shows two files
du -h . # current directory is 96M
cd ..
mkdir argo-cd-fetch
cd argo-cd-fetch/
git init
git remote add origin https://github.com/argoproj/argo-cd.git
git fetch origin 497e53b0203638409e3083fa2ffac7d8fb3cce14
git checkout FETCH_HEAD
git fetch origin 32be020af0f8bf6438201ee79b4d2b8037c57154
git checkout FETCH_HEAD
git fetch origin 32d33dedcc70d94177384b235891b99d89497273
git checkout FETCH_HEAD
git fetch origin 2e65b42f05bcc1401d1489e751993ec197f6942c
git checkout FETCH_HEAD
git fetch origin b1ff9dbe1e3e3b2520e94eefc77d0322c765cd75
git checkout FETCH_HEAD
ls .git/objects/pack # shows ten files
du -sh . # current directory is 244M
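To compare the two directories without reading ls output by eye, a small helper can count the .pack files directly. This is only a sketch: count_packs is a hypothetical name, and it assumes the standard .git/objects/pack layout of a non-bare repo.

```shell
# Hypothetical helper: count packfiles in a non-bare git repo.
# Usage: count_packs path/to/repo
count_packs() {
    pack_dir="$1/.git/objects/pack"
    count=0
    for f in "$pack_dir"/*.pack; do
        # If the glob matched nothing, "$f" is the literal pattern; skip it.
        [ -e "$f" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}
```

If each pack comes with exactly one .idx file, the two and ten files listed above would correspond to 1 and 5 packs respectively. Running git count-objects -v inside a repo reports a comparable number on its packs line.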
I asked on StackOverflow why the packfile behavior is so different and got a really interesting answer: https://stackoverflow.com/questions/71618307/why-would-fetching-specific-git-commits-use-more-disk-space-than-fetching-all
Looks like this is a pretty bad regression. We should cherry-pick the fix into v2.3.
fix: prevent excessive repo-server disk usage for large repos (#8845) (#8897)
Signed-off-by: Michael Crenshaw <[email protected]>
Describe the bug
In version 2.3.1, each time repo-server fetches a new commit from a repo which has a packfile, the packfile is duplicated. So N commits means N packfiles, when git should only have one packfile. If the file is big, this can fill up the repo-server disk.
This doesn't happen in 2.2.7.
To Reproduce
Create an app using a repo which has a packfile. I've been using a repo which has ~70k commits. When I clone that locally, I can see that there's a packfile in .git/objects/pack.
Remote into the repo-server pod and list the files in /tmp/_argocd-repo//.git/objects/pack. You'll have to chmod +rx some directories to get access. There should be one .idx and one .pack file.
Push a new commit to the repo and do a hard refresh on the app. List pack files again, and you'll see an additional .idx and an additional .pack file.
Expected behavior
I expected git to maintain one pack file.
Version
v2.3.1
Logs
I've manually added --verbose and GIT_TRACE=1 to the git calls. There's nothing interesting in the logs as far as I can tell.
I've also commented out the initializer and closer logic that sets repo directory permissions as well as manually setting the _argocd-repo permissions to rwx. No effect.
Finally I've tried downgrading git to 2.30.2 by building an image based on Ubuntu 21.04. Same bug.
I'm out of hunches.
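For anyone repeating the tracing step outside of repo-server, the same two settings can be applied to a manual fetch from the command line. The wrapper below is a sketch with a made-up name, traced_fetch. It is not how the flags were wired into the Argo CD code.

```shell
# Hypothetical wrapper: run "git fetch" with tracing and verbose output.
# GIT_TRACE=1 makes git print trace messages to stderr;
# --verbose makes fetch report more detail about what it does.
# Usage: traced_fetch origin <revision>
traced_fetch() {
    GIT_TRACE=1 git fetch --verbose "$@"
}
```

Because the trace output goes to stderr, something like traced_fetch origin some-revision 2> trace.log keeps it separate from the normal fetch output.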