Available Xen free memory not used #4891
One issue I've found is that qmemman fails to see that dom0 is limited to 1GB (or even 4GB in the default setup). It loads the dom0 max mem from
You may need to restart qmemman to reload the value.
I did not have to restart qmemman. Not so fast: I tried my main 16GB -current system, which was at its limit with 1367MB free. It has dom0 mem max set to 1.5G, so I wrote that to static-max (checked with xenstore-read). qmemman sprang into life, but only got another 30MB for VMs, and there was no further change after restarting qmemman.
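For anyone wanting to run the same check, a minimal sketch of reading the dom0 static-max entry the way it is described above; the exact xenstore path and the KiB unit are my assumptions, not confirmed from qmemman's source:

```python
import subprocess

# Assumed xenstore key for dom0's static maximum; memory entries in this
# part of the tree are expected to be in KiB.
STATIC_MAX_PATH = "/local/domain/0/memory/static-max"

def dom0_static_max_mb():
    """Read the static-max entry with xenstore-read and convert KiB -> MiB."""
    out = subprocess.check_output(["xenstore-read", STATIC_MAX_PATH])
    return int(out.decode().strip()) // 1024

if __name__ == "__main__":
    print("dom0 static-max as seen in xenstore: %d MiB" % dom0_static_max_mb())
    # Compare this against the dom0_mem=...,max=... value on the Xen command
    # line; a mismatch is the symptom discussed in this comment.
```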
Try collecting the logs mentioned in #4890 (comment).
System as above, with static-max still at 1.5G, 3 VM start attempts: fixed 650MB - failed, fixed 600MB - failed, and finally fixed 550MB succeeded. Runs of
The 30MB I thought I got above was just memory being shuffled around. As you will note, most VMs are fixed memory, as having too many enabled for balancing sends qmemman bananas - writing 16+ lines/sec to the dom0 journal and getting nowhere.
Something about dom0 is still odd. Multiple qmemman lines say it's at ~2.8G, while in fact it is below 1.5G. Are you sure the static-max entry is set correctly?
That looks suspiciously similar to
Generally, OOM in dom0 is quite unpleasant, see #3079. The 4GB default is definitely on the safe side, but a lower default value could be risky depending on the use case (desktop environment, screen(s) size, etc.). Wrong
I was wondering what that parameter was* - that and the min qube size were being wound down slowly by Qubes a while back, and it has ended up at 185MB (same in qmemman.conf). So it is not that. Any other ideas? Shouldn't it be XEN_FREE_MEM_LEFT?
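For reference, a small sketch to dump whatever balancing parameters are currently set in dom0; the /etc/qubes/qmemman.conf path and the [global] section name are assumptions based on the mention of qmemman.conf above:

```python
import configparser

# Path and section name are assumptions; adjust if your install differs.
CONF_PATH = "/etc/qubes/qmemman.conf"

cfg = configparser.ConfigParser()
cfg.read(CONF_PATH)
if "global" in cfg:
    for key, value in cfg["global"].items():
        print("%s = %s" % (key, value))
else:
    print("no [global] section found in %s" % CONF_PATH)
```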
I did not read all of that issue, but it sounded at the start like it might have been triggered by downward ballooning in dom0 - my reading on best practice with Xen says not to allow that and to always run a fixed-size dom0. I have configured my next boot for 1800M fixed. Where is the best place to script up the patch to

This does bring up another point that I have been pondering while the qmemman issue unfolded: is it possible that qmemman opens up a side-channel attack vector? Given ITL's awareness of these issues, I have to assume that it must be OK. I may be paranoid, however I feel more comfortable with dom0 and any sensitive VMs being excluded from memory balancing. Perhaps a note for this memory guide. A fixed dom0 should avoid any fast memory surprises, so that the very well over-provisioned swap will cope.

On the 4GB default, I did notice that it did not change on a 4GB machine.

[ed] Deleted my previous post: too much information. Summary: Marek was right and adjusting
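On excluding individual VMs from balancing: a hedged sketch using the dom0 qubesadmin module, on the assumption (not confirmed here) that setting maxmem to 0 pins a VM to its fixed memory size and takes it out of qmemman's hands; the VM name is hypothetical:

```python
from qubesadmin import Qubes

app = Qubes()
vm = app.domains["work"]  # hypothetical VM name

# Assumption: maxmem = 0 requests a fixed allocation equal to 'memory',
# which excludes this VM from dynamic balancing.
vm.maxmem = 0
print(vm.name, "memory:", vm.memory, "maxmem:", vm.maxmem)
```

The same setting is also reachable from the command line via qvm-prefs or the per-VM settings dialog.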
The bug is in
with
picks up the size of dom0 running during startup. In my testing on fixed-size dom0s it picks up the same number that dom0_mem was set to at startup for larger sizes, and a few MB under for smaller settings (under ~1GB); that seems to have the effect of having dom0 further trimmed by that amount over the next 10+ minutes. I have never looked at Xen before and could not find where the allocated size at startup was expressed internally... over to Marek. Both my main and test systems are running SO much better for the last couple of days now than with the default configuration. Sold on fixed-size dom0. :-)
Wait, what? So the fact that I only have like 7-8 GB available to use out of my pool of 12 GB of memory after I start qubes was a bug after all? I thought that was normal all this time.
@Eric678 Thanks! I've had dom0 limited to 1500MB for many months and might never have noticed there was something wrong with the qmemman dom0 values. I look forward to trying this.
@tasket, thank you for your posts in the qubes-devel topic linked above that led me down this path. * f*** me, it worked
I had a few spare hours so decided to pull down the sources and have a look at qmemman - now, I have never looked at a Python program before, so bear that in mind. It is relatively small (good, I hate complexity), however I have to say it looks like it was written by trial and error. I was trying to figure out how to stop dom0 from being added to

Since no one has reassured me on the side-channel issue, I have to assume that it is real and that is why I have heard nothing - damage control - and my email address seems to have been taken off the qubes-users whitelist. 😒 I am not a security professional either, bear that in mind. Only picked up Qubes from 4.01 and still finding my feet.

I am getting a bit of a queasy feeling: all of the above, and then thinking that it was OK to stop and start qmemman. I tried leaving it stopped, thinking I needed a big switch to get it out of the picture, shut down a qube, and it disappeared from Xen but stayed running in Qubes. OK fine, so I started qmemman and Qubes crapped on itself: the domain widget went into an infinite startup loop, throwing an exception for the first domain in its list -

Seems like little or no stress testing is done on Qubes - if there is, it is not working.

On the size of dom0: after working out how to easily track real instantaneous memory and swap demands, the peak usage requirements in dom0 are, unsurprisingly, during updates. A 4GB system seems good at a 900M dom0, and it looks like 1300M is enough for 16GB with up to 20 VMs, which still has ~360MB unusable Xen free memory - not too bad. BTW I found the

Need to get my dom0 X killed cheat sheet ready.
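On tracking instantaneous memory and swap demand in dom0, a minimal polling sketch using nothing but /proc/meminfo (field names are standard Linux, nothing Qubes-specific):

```python
import time

def meminfo_mib(keys=("MemAvailable", "SwapFree")):
    """Return selected /proc/meminfo fields, converted from kB to MiB."""
    values = {}
    with open("/proc/meminfo") as f:
        for line in f:
            name, rest = line.split(":", 1)
            if name in keys:
                values[name] = int(rest.split()[0]) // 1024
    return values

# Print a one-line snapshot every second; watch it during an update to see
# the peak dom0 demand mentioned above.
while True:
    snap = meminfo_mib()
    print("  ".join("%s=%d MiB" % (k, v) for k, v in sorted(snap.items())))
    time.sleep(1)
```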
@marmarek After such a brief time with a nicely working Qubes system, the awful memory allocation is back with the update from qubes-core-dom0 4.0.41 to 4.0.42. Max allocation (from xentop sums) is back to 6750MB, about 1GB lower than I was getting before the last update. Can anyone explain how my startup-misc.sh reverted to using
The fix hasn't been cherry-picked into the release4.0 branch yet. |
OK, so this issue was flagged as major and the fix hasn't made it to testing in nearly a month...
This value needs to be set to the actual static max for qmemman to work properly. If it's set higher than the real static-max, qmemman will try to assign more memory to dom0 than dom0 can use - it will be wasted. Since this script is executed before any VM is started, simply take the current dom0 memory usage instead of parsing the dom0_mem Xen argument. There doesn't seem to be a nice API to get this value from Xen directly. Fixes QubesOS/qubes-issues#4891 (cherry picked from commit 56ec271)
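As a rough illustration of what that commit message describes (not the actual patch), one could take dom0's current allocation from xl list and write it to the static-max xenstore entry; the Mem column index and the xenstore path are assumptions:

```python
import subprocess

STATIC_MAX_PATH = "/local/domain/0/memory/static-max"  # assumed key, value in KiB

def dom0_current_mem_mib():
    """Read dom0's current allocation (Mem column, MiB) from 'xl list 0'."""
    out = subprocess.check_output(["xl", "list", "0"]).decode()
    fields = out.splitlines()[1].split()
    return int(fields[2])

def set_static_max_to_current():
    mib = dom0_current_mem_mib()
    subprocess.check_call(["xenstore-write", STATIC_MAX_PATH, str(mib * 1024)])
    print("wrote %d MiB to %s" % (mib, STATIC_MAX_PATH))

if __name__ == "__main__":
    set_static_max_to_current()
```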
Automated announcement from builder-github The package
Automated announcement from builder-github The package
Or update dom0 via Qubes Manager. |
Automated announcement from builder-github The package
Qubes OS version:
4.0
Affected component(s) or functionality:
core: qmemman? Whoever decides to issue "Not enough memory to start ..."
Steps to reproduce the behavior:
On my 4GB current-testing system (Xen 4.8.5 & kernel 4.19): clamp dom0 memory (dom0_mem=1024M,max=1024M), disable memory balancing on all qubes, start qubes until "Not enough memory to start...", and chop qube sizes to find where the threshold is.
Expected or desired behavior:
All Xen free memory will be allocated to qubes except XEN_FREE_MEM_LEFT (50MB).
Actual behavior:
The minimum Xen free memory (in xl info) was 655MB (21% of domU space).
General notes:
I was trying to track down why more than 1.3G of free memory always remained on my 16GB work system, where memory is always at a premium.
In the 4GB example, qmemman has no one to talk to except dom0, and its memory is fixed, so it should not be doing anything; on the 4GB system at minimum free memory it is not writing anything to the journal.
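A small helper for watching the figure in question; it parses the free_memory line (MiB) from xl info, which is where the 655MB reading above came from:

```python
import subprocess

XEN_FREE_MEM_LEFT_MIB = 50  # the margin qmemman is expected to leave, per the expected behaviour above

def xen_free_memory_mib():
    """Return the 'free_memory' value (MiB) reported by 'xl info'."""
    out = subprocess.check_output(["xl", "info"]).decode()
    for line in out.splitlines():
        if line.startswith("free_memory"):
            return int(line.split(":", 1)[1].strip())
    raise RuntimeError("free_memory not found in 'xl info' output")

if __name__ == "__main__":
    free = xen_free_memory_mib()
    print("Xen free memory: %d MiB" % free)
    if free > 2 * XEN_FREE_MEM_LEFT_MIB:
        print("well above the expected ~%d MiB margin" % XEN_FREE_MEM_LEFT_MIB)
```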
I have consulted the following relevant documentation:
https://www.qubes-os.org/doc/qmemman/
This topic in qubes-devel started me off: https://groups.google.com/forum/#!topic/qubes-devel/o3ZoOsGPR7o and its subject expresses many new users' frustration - certainly mine!
I am aware of the following related, non-duplicate issues: