-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optionally reallocate large+small buffer after each module #537
Conversation
Thanks a lot, @jodavies . This looks indeed promising. I will look into it when I find a bit of time. |
In case it is useful to see how your own scripts behave, I logged the RSS during the run with something like
|
I added an On/Off switch for this, I think it is better than just a preprocessor instruction since the option can easily be enabled for multiple modules (nice when calling procedures etc) and can still be used for a "single module" (by turning it off again in the next). I think the "free and reallocate" version is simpler in that we don't need to worry about aligned memory allocations and whether or not this works in Windows. Edit: I don't know what is going on with valgrind on the runners these days, I see a lot of:
|
@jodavies Maybe it would be great to also have a preprocessor command #sortreallocate to be active only for a single module. Could you possibly add this? |
Do you strongly prefer that to doing
? Of course it is straightforward to have both options, so I can add a |
I think it would be a nice feature when you're dynamically generating code with the preprocessor and happen to not have access to the previous .sort statement (but don't want to sort right now). What do you think? |
d3eaca4
to
5302a8d
Compare
5302a8d
to
933dd7e
Compare
In trying to test this further, it seems that actually it does not work in @msloechner if you happen to have your test code still around, are you by chance able to check if you used the reallocation mode or the (now deleted) madvise version? |
I just tried to run with the tform binaries I compiled in June for buffer reallocation. The tform buffer reallocation immediately aborts with a segfault. |
18057f2
to
2bfbafd
Compare
Rebased on master (which now has the separated sort allocation). Merged the tform fix into the first commit, to avoid a broken commit in the master history. |
This branch just force-enables the reallocation for every module, just to see how it goes on the CI: https://github.com/jodavies/form/tree/sort-realloc-ci-test The test for Issue508 fails, where we have an 8G ulimit, and a reallocation fails. Outside of valgrind it runs OK with the ulimit, so I think we can ignore this. |
Maybe I'll add a minor review comment to this PR today, but I'll probably review the core part tomorrow. (Right now, I'm hitting some minor VSCode C++ extension issues...) |
The code looks fine. But why did you choose |
Not really any reason I suppose, it would also fit nicely as a moduleoption alongside "inparallel" and "noparallel", i.e. options which don't really have anything to do with the processing of the actual terms (like "local" etc) but more to do with the high-level operation of the module. I am not opposed to adding a moduleoption (and removing the pre-proc command?) if you prefer. |
Actually, I don't have a strong opinion either :-)
I'm not entirely sure about the use case described in
So, @msloechner, what do you think? Which syntax option would be easy to use? The current implementation supports both
and
Instead of the latter, we could implement
and/or
|
2bfbafd
to
f1809b5
Compare
RSS decreases. If the OS is under memory pressure, it will not be swapping out useless pages. The SortBlocks contain pointers into the master thread's lBuffer, which need updating after reallocation. Use an "UpdateSortBlocks" function to do this, which is a trimmed-down version of IniSortBlocks which only sets the pointers.
Enable the reallocation for a single module. If specified in the same module as "Off sortreallocate;", the reallocation will still happen in that module.
f1809b5
to
12fa261
Compare
Rebased and review addressed. |
@tueda: Very sorry for answering only now. I would prefer the first option, because the sortreallocation may only take place at a .sort or .store, and in Form statements like ModuleOption become effective at the end of the module. |
@msloechner Thanks for your input! So, by "the first option", you mean the current implementation ( If the current one is the best, then I think this PR can be merged. |
By "first option" I meant |
This is a test of a request from @msloechner , which makes FORM's resident set size better track its actual memory usage. This comes on top of the split allocations branch, since it only deals with the largest buffer. Since FORM allocates its buffers and keeps them for the whole run, the apparent memory usage remains constant after a "peak". If the OS is under memory pressure it will swap pages, but they might be pages that FORM actually isn't using any more.
It seems there are two ways to do it: free and re-allocate the buffer, or use
madvise
to specifyMADV_DONTNEED
. Both have the same effect and have about the same performance, but there is an impact of more than 10% compared to doing nothing. This shouldn't be done by default, and maybe never "every module". (However, the longer each module takes, with lots of disk access during sorting, etc, the less the performance impact will be).The options could be, a preprocessor statement like
#reallocatesort
which must be specified when the user really wants this (say, after a heavy module) or maybe justOn reallocatesort;
to enable it for every module which follows. Any thoughts?@msloechner , can you test this and see if it helps your multiple concurrent jobs scenario? I could also see it being useful when running FORM on cluster machines which kill jobs based on RSS: both of these commits have a lower peak RSS in my tests compared to the original behaviour.
Here is a small MINCER performance test for
orig
(unmodified form),split
(checking the split allocations have no impact)realloc
(the first commit) andmadv
(the second):Here is the RSS profile of this test: