Split the large sort allocation into separate allocations #529
Conversation
The second point is a bit tricky to resolve, since there is some circular interaction. I am not sure about the current code (lines 884 to 903 in 83e3d41):
`IObuffersize` is in units of WORD (L884). Then comes some code which checks `(LargeSize+SmallEsize)` against `MaxFpatches*IObuffersize` (with a constant offset). If it is too small, the large (or small) buffer is increased in size accordingly. This code runs, basically, only if the user has set `sortiosize` significantly larger.
But then we make the same check in reverse, and adjust
I have cleaned this up a bit, but now I have a few questions. In `RecalcSetups`, the buffer-size constraints are not consistent with those at line 357 in 83e3d41:
In lines 567 to 592 in 83e3d41:

Another thing I noticed:
- Buffer overruns in these allocations will be visible to valgrind.
- Fix units mismatch re: `IObuffersize`, `IOtry`.
- Print warnings in debug mode, if any buffer sizes are altered by `AllocSort`.
- Update "default" buffer sizes, so that we have no warnings.
Here the "changes" to the default buffer sizes imply a change to the manual also. |
Do you see any performance regression because of the memory nonlocality? If this is unclear, maybe we could turn on
I've not measured any performance difference, no. This is required in non-DEBUGGING mode by #537.
For reference, what happens with `SmallEsize = (SmallEsize+15) & (-16L);` is not guaranteed prior to C23, which standardizes that signed integers are two's complement. That said, we can keep this code, as it is now 2024. Anyway, I don't think anyone is interested in running FORM on exotic systems like Unisys ClearPath Dorado Servers.
Many lines are duplicated for both the `#ifndef SPLITALLOC` cases. Could we instead do something like:

```c
#ifndef SPLITALLOC
LONG allocation = ...
char *allocated = Malloc1(allocation,"sort buffers");
#define Malloc1(size, msg) (allocated += (size), (void *)(allocated - (size)))
#endif
sort = Malloc1(sortsize, "sorting struct");
...
#ifndef SPLITALLOC
#undef Malloc1
#endif
```
I left the old code there for future debugging purposes, but it could also just be removed entirely if that is neater. One can always use git to bisect a potential bug to this change (and, assuming I did not make a mistake in the separated allocations, crashes due to this change in principle imply bugs elsewhere in the code).
Yes, you are right. We can simply remove the old code, which lowers the code maintenance overhead.
Now it looks good to me.
Here I have split the large single allocation in `AllocSort` into individual allocations. Only Large+Small(+extension) must be contiguous. The aim is that overrunning any of these buffers will cause a crash and not silent corruption of the program state. In principle this has a performance penalty, though whether it is measurable or not has to be determined by benchmarking.
The second commit adds warning messages if any of the buffers are adjusted from their requested values. I have changed some of the default (64-bit) allocations, such that they are consistent with FORM's own requirements and no warnings are printed. This is not completely resolved yet: the `sublargesize` is still shifted, and more things are shifted in 32-bit mode.