Add doc for node-agent memory preserve #8167
Conversation
Force-pushed from 7be784d to c540104.
Codecov Report: All modified and coverable lines are covered by tests ✅

Coverage Diff (main vs. #8167):

|          | main   | #8167  | +/-    |
|----------|--------|--------|--------|
| Coverage | 59.05% | 59.10% | +0.04% |
| Files    | 364    | 365    | +1     |
| Lines    | 30324  | 30336  | +12    |
| Hits     | 17909  | 17931  | +22    |
| Misses   | 10972  | 10962  | -10    |
| Partials | 1443   | 1443   |        |

View full report in Codecov by Sentry.
@@ -641,6 +641,16 @@ Both the uploader and repository consume remarkable CPU/memory during the backup
Velero node-agent uses [BestEffort as the QoS][14] for node-agent pods (so no CPU/memory request/limit is set), so that backups/restores won't fail due to resource throttling in any case.
If you want to constrain the CPU/memory usage, you need to [customize the resource limits][15]. The CPU/memory consumption is always related to the scale of data to be backed up/restored (refer to [Performance Guidance][16] for more details), so it is highly recommended that you perform your own testing to find the best resource limits for your data.

For the Kopia path, some memory is preserved by the node-agent to avoid frequent memory allocations; therefore, after you run a file-system backup/restore, you won't see the node-agent release all of its memory. There is a limit on the memory preservation, so the memory won't grow indefinitely. The limit varies with the number of CPU cores in the cluster nodes, as calculated below:
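The concrete calculation follows in the full doc page and is not reproduced in this excerpt. Purely to illustrate the shape of a core-count-based ceiling, here is a minimal Go sketch; the per-core budget and the overall cap below are assumed values for the example, not Velero's actual numbers.

```go
// Illustrative sketch only: the per-core budget and hard cap are assumed
// values, not Velero's actual memory-preserve calculation.
package main

import (
	"fmt"
	"runtime"
)

const (
	assumedPerCoreBudget = 64 << 20  // 64 MiB preserved per CPU core (assumption)
	assumedHardCap       = 512 << 20 // 512 MiB overall ceiling (assumption)
)

// preservedMemoryLimit scales the preserved-memory ceiling with the number
// of CPU cores on the node, but never lets it exceed the hard cap.
func preservedMemoryLimit() int {
	limit := runtime.NumCPU() * assumedPerCoreBudget
	if limit > assumedHardCap {
		limit = assumedHardCap
	}
	return limit
}

func main() {
	fmt.Printf("cores=%d, preserved memory limit=%d MiB\n",
		runtime.NumCPU(), preservedMemoryLimit()>>20)
}
```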
Can we clarify how much, if at all, is released? Should there be a timeout for the preserved memory? If you only back up once every 6 months, you may rather spend the time to reallocate memory on the next backup.
Added the clarification that there is no timeout for the preserved memory, so you won't see the node-agent release all the memory until it restarts.
> Can we clarify how much, if at all, is released?

The released memory is actually unknown, because released memory = total allocated memory - preserved memory. As for the total allocated memory, we've already clarified it as below:

> The CPU/memory consumption is always related to the scale of data to be backed up/restored (refer to [Performance Guidance][16] for more details), so it is highly recommended that you perform your own testing to find the best resource limits for your data.
> Should there be a timeout for the preserved memory?

Yes, it would be more rational to have a smarter mechanism than preserving the memory forever, but whether a timeout is the ideal approach needs further consideration.
At present, we just document the behavior and leave it as is. We think it is not a high-priority task, for these reasons:
- From 1.15 on, this affects fs-backup only, because data movers no longer run in the long-running node-agent pods.
- The preserved memory won't reach the limit easily; normally it stays below the limit.
- Backups are usually scheduled tasks, e.g., one or several per day, so the preserved memory is normally reused effectively.
Force-pushed from 138c4d9 to 6a9a827.
Signed-off-by: Lyndon-Li <[email protected]>
Force-pushed from 6a9a827 to 43de32a.
Partially fix issue #8138: add doc for node-agent memory preserve.