-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pkg/cache: optimize peak memory usage during cache build #1281
pkg/cache: optimize peak memory usage during cache build #1281
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: joelanford The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1281 +/- ##
=======================================
Coverage 54.02% 54.03%
=======================================
Files 108 108
Lines 11266 11314 +48
=======================================
+ Hits 6087 6113 +26
- Misses 4190 4207 +17
- Partials 989 994 +5 ☔ View full report in Codecov by Sentry. |
/hold |
77c0299
to
092a36d
Compare
/hold cancel |
}) | ||
|
||
var mapMu sync.Mutex | ||
for i := 0; i < runtime.NumCPU(); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means that data for up to runtime.NumCPU
packages will be in memory at once. I noticed the following when building a cache for the operatorhub catalog on my 10-core Mac M1 Pro:
On master
:
- With unmigrated catalog
- With
GOMAXPROCS=1
: Peak memory=778Mb, Duration=15s - With unset GOMAXPROCS, on master branch: Peak memory=731Mb, Duration=10s
- With
- With migrated catalog
- With
GOMAXPROCS=1
on master branch: Peak memory=166Mb, Duration=2.7s - With unset GOMAXPROCS, on master branch: Peak memory=141Mb, Duration=2.0s
- With
On PR branch:
- With unmigrated catalog
- With
GOMAXPROCS=1
on PR branch: Peak memory=234Mb, Duration=16s - With unset GOMAXPROCS, on PR branch: Peak memory=290Mb, Duration=6.6s
- With
- With migrated catalog
- With
GOMAXPROCS=1
on PR branch: Peak memory=115Mb, Duration=3s - With unset GOMAXPROCS, on PR branch: Peak memory=117Mb, Duration=1.39s
- With
I'm going to close this one. I think we should focus on #1278. |
Description of the change:
I extracted a commit from #1278 , which can be implemented on its own with our existing caching algorithm. As I mentioned there:
In order to maintain the performance of cache building, we need to ensure that WalkMetasFS can make use of concurrency in the same way that LoadFS can (which is what the cache builder currently uses). Therefore, the first commit in this PR includes those changes.
Motivation for the change:
There have been numerous issues reported about how finicky pre-built caches are. There are cases where a catalog image with a pre-built cache works correctly on one node, but not another. There are other cases where caches built outside the image and then injected in are mangled enough to throw off the digest calculation. While these cases are likely problems with the specific digest algorithm we use, this could all be avoided if we were able to build the cache on-the-fly.
Reviewer Checklist
/docs