New PERL_STRLEN_NEW_MIN & PERL_ARRAY_NEW_MIN_KEY definitions #20529
In a nutshell, for a long time the minimum PV length (hardcoded in Perl_sv_grow) has been 10 bytes and the minimum AV array size (hardcoded in av_extend_guts) has been 4 elements. These numbers have been used elsewhere for consistency (e.g. Perl_sv_grow_fresh) in the past couple of development cycles. Having a standard definition, rather than hardcoding in multiple places, is more maintainable.
This commit therefore introduces into perl.h:
* PERL_ARRAY_NEW_MIN_KEY
* PERL_STRLEN_NEW_MIN
(Note: Subsequent commit(s) will actually change the values.)
Major malloc implementations, including the popular dlmalloc derivatives, all return chunks of memory that are a multiple of the platform's pointer size. Perl's traditional default string allocation of 10 bytes will almost certainly result in a larger allocation than requested. Consequently, the interpreter may try to Renew() an allocation to increase the PV buffer size when it does not actually need to do so. This commit increases the default string size to the nearest pointer multiple (12 bytes for 32-bit pointers, 16 bytes for 64-bit pointers). This is almost certainly unnecessarily small for 64-bit platforms, since most common malloc implementations seem to return 3*pointer size (i.e. 24 bytes) as the smallest allocation. However, 16 bytes was chosen to prevent an increase in memory usage on memory-constrained platforms, which might have a smaller minimum memory allocation.
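The rounding described in the commit message above can be sketched as follows. This is an illustration only: `round_up_to_multiple` is a hypothetical helper, not a function in the Perl source, and the actual definition in perl.h may well be written differently (for instance as fixed constants selected by pointer size).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of perl): round `len` up to the next
 * multiple of `unit`, e.g. the platform's pointer size. */
static size_t round_up_to_multiple(size_t len, size_t unit)
{
    assert(unit > 0);
    return ((len + unit - 1) / unit) * unit;
}
```

With the traditional 10-byte minimum, `round_up_to_multiple(10, 4)` gives 12 and `round_up_to_multiple(10, 8)` gives 16, matching the figures in the commit message. Passing `sizeof(void *)` as `unit` would pick the value for the build platform.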
comment replaced. see below.
I vaguely remember that. I believe it is related to sv_gets(). IIRC we end If you look for a thread titled "sv_grow() and malloc" you will find a Sayeth @iabyn (Dave Mitchell): in https://marc.info/?l=perl5-porters&m=141147301828156&w=3 I've spent the last week or two (on and off) looking into how all the TL;DR: COW caused problems with memory wastage when using readline, so Background: There are two core functions I'm interested in here: sv_setsv_flags(), sv_grow() applies various fiddle factors to the requested length; these As regards chumminess, some platforms have a malloc_size() function which
The other function of interest, sv_setsv_flags() (which is the thing The issues: A while back we had an issue with COW and readline() interacting:
was causing each pushed SV to be large, even if the lines were short. This This subsequently caused an issue on Darwin (and any platform that has
In that ticket, something like:
was slow (took minutes) up to and including 5.18.0; it then became fast
What I propose: I think that the criteria for skipping COW on divergent SvCUR/SvLEN Further, based on the outputs of that little program I asked people
which will allow us to determine two constants A and B such that the size Finally, there are issues with sv_grow()'s "I know better than the caller" What now: So the main thing I want to do right now is tweak the SvCUR versus SvLEN The main action shown in each table is that which prevailed before the The main pre-patch criterion was to copy rather than COW if the string was First, test whether the buffer is swipeable:
If not swipeable, then test for COWability. There are two sets of If the src is already COW:
If the src is not already COW:
As can be seen, for large strings the I think this is basically the wrong way round; I'm not sure if this was
With Linux/glibc/64-bit where A=24 and B=16, this would yield thresholds So in conclusion, I am proposing for discussion right now, --
Thanks, @demerphq. I'd remembered Dave talking about runtime probing but couldn't unearth the correct discussion. That thread seemed to end on (FC's email?):
Thoughts:
I also saw this, which seemed like an interesting approach: https://doc.qt.io/qt-5/containers.html#growth-strategies Independent of all this, do you think this PR could be applied as-is, with us revisiting the rest as a follow up?
@richardleach Yes, sorry, my mail was just explaining why this stuff is sensitive. It doesn't change the picture much except for replacing the hard-coded constants with defines, which is an obvious improvement. I don't think the change in min string size makes much of a difference. So I approved and merged the patch. I think further discussion on this should go into an issue where it can be tracked independently. FWIW, as an issue for another ticket: hashes are presized to 8 buckets, why should arrays be any different? Should hashes be reduced or arrays expanded? Food for thought for a future ticket.
On Sun, Nov 20, 2022 at 02:21:04PM -0800, Richard Leach wrote:
* I'm unclear what the `A` and `B` constants would be. `A` is the
smallest possible allocation? If `B` is some kind of scaling constant,
it could break down if, as someone suggested in the thread, the
`malloc()` has multiple different sizing strategies.
Well that was 8 years ago and I've forgotten much; but I think the
idea was based on the observation that malloc() will almost certainly
round up small allocations to some sort of block size, e.g. 8 bytes.
So malloc(2) followed by realloc(8) will likely re-use the same address.
But further, the first block it uses may contain some malloc housekeeping
overhead, meaning there is less space to use for the first block. So for
example if two bytes are used for overhead, then the first block can only
contain up to 6 bytes. So in this context, A=6, B=8 and blocks are filled
on an A+Bn basis for n=0,1,2,....
Or to put it another way, whenever perl's string allocating code wants to
do a malloc for a block sized anywhere from 1..6, it should do a malloc(6)
and set SvLEN to 6. Similarly, for lengths of 7..14 it should malloc(14),
15..22, malloc(22) and so on.
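The rounding rule just described can be sketched as a small function. This is only an illustration under the assumptions stated in the paragraph above (a first block with `a` usable bytes, then further blocks of `b` bytes each); `round_to_malloc_block` is a hypothetical name, not something in the Perl source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: smallest size of the form a + b*n (n = 0, 1, 2, ...)
 * that is >= want.  a = usable bytes in the first block, b = block size. */
static size_t round_to_malloc_block(size_t want, size_t a, size_t b)
{
    assert(a > 0 && b > 0);
    if (want <= a)
        return a;
    /* number of extra blocks needed beyond the first, rounded up */
    size_t n = (want - a + b - 1) / b;
    return a + b * n;
}
```

With A=6 and B=8, requests of 1..6 map to 6, 7..14 map to 14, and 15..22 map to 22, as in the text. The rule is only expected to hold for small allocations.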
Of course this won't hold for large block sizes. But the idea being that
for systems which don't have a malloc_size()-type introspection facility,
perl could, by doing some basic probing at startup, do a better job than
it does at the moment of guessing how malloc request sizes get rounded
up.
I don't think I ever acted upon any of the suggestions in that thread. It
got put on my back-burner and then forgotten about.
The whole COW thing needs some careful love and tweaking. In particular,
IIRC, COW strings being freed or normalized (e.g. having an integer
assigned to them) currently take a very slow code path. COW being set
triggers SV_CHECK_THINKFIRST(), which tends to then take a slow code path
checking for lots of things like SvREADONLY, SvOOK() etc. See the 'case
SAVEt_CLEARSV' branch in Perl_leave_scope() for example.
Finally, here's a trivial example of a malloc probe. A real one would be
less crude, and would increment i by greater than 1 in later stages.
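#include <stdio.h>
#include <stdlib.h>   /* malloc(), realloc(); <malloc.h> is non-standard */
#include <stdint.h>   /* uintptr_t */

int main(void)
{
    void *m, *r;
    /* volatile so the compiler is less likely to drop the unused mallocs */
    void * volatile throwaway;
    uintptr_t old;
    int i, prev_i;

    /* the throwaway mallocs are to encourage the allocated block to be
     * hemmed in, necessitating realloc() using new memory */
    throwaway = malloc(1);
    m = malloc(1);
    throwaway = malloc(1);
    if (!m)
        return 1;

    prev_i = 1;
    for (i = 2; i <= 2048; i++) {
        /* remember the old address as an integer: the old pointer value
         * must not be used once realloc() has moved the block */
        old = (uintptr_t)m;
        r = realloc(m, i);
        if (!r)
            return 1;
        //printf("i=%3d; r=%p\n", i, r);
        if ((uintptr_t)r != old) {
            printf("at size %3d realloc used new buffer, diff=%3d\n",
                    i, i - prev_i);
            prev_i = i;
            throwaway = malloc(1);
        }
        m = r;
    }
    (void)throwaway;
    return 0;
}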
On my Linux system this output the following. It shows that for up to about
1K, A=24 bytes, B=16 bytes.
$ ./c
at size 25 realloc used new buffer, diff= 24
at size 41 realloc used new buffer, diff= 16
at size 57 realloc used new buffer, diff= 16
at size 73 realloc used new buffer, diff= 16
at size 89 realloc used new buffer, diff= 16
at size 105 realloc used new buffer, diff= 16
at size 121 realloc used new buffer, diff= 16
at size 137 realloc used new buffer, diff= 16
at size 153 realloc used new buffer, diff= 16
at size 169 realloc used new buffer, diff= 16
at size 185 realloc used new buffer, diff= 16
at size 201 realloc used new buffer, diff= 16
at size 217 realloc used new buffer, diff= 16
at size 233 realloc used new buffer, diff= 16
at size 249 realloc used new buffer, diff= 16
at size 265 realloc used new buffer, diff= 16
at size 281 realloc used new buffer, diff= 16
at size 297 realloc used new buffer, diff= 16
at size 313 realloc used new buffer, diff= 16
at size 329 realloc used new buffer, diff= 16
at size 345 realloc used new buffer, diff= 16
at size 361 realloc used new buffer, diff= 16
at size 377 realloc used new buffer, diff= 16
at size 393 realloc used new buffer, diff= 16
at size 409 realloc used new buffer, diff= 16
at size 425 realloc used new buffer, diff= 16
at size 441 realloc used new buffer, diff= 16
at size 457 realloc used new buffer, diff= 16
at size 473 realloc used new buffer, diff= 16
at size 489 realloc used new buffer, diff= 16
at size 505 realloc used new buffer, diff= 16
at size 521 realloc used new buffer, diff= 16
at size 537 realloc used new buffer, diff= 16
at size 553 realloc used new buffer, diff= 16
at size 569 realloc used new buffer, diff= 16
at size 585 realloc used new buffer, diff= 16
at size 601 realloc used new buffer, diff= 16
at size 617 realloc used new buffer, diff= 16
at size 633 realloc used new buffer, diff= 16
at size 649 realloc used new buffer, diff= 16
at size 665 realloc used new buffer, diff= 16
at size 681 realloc used new buffer, diff= 16
at size 697 realloc used new buffer, diff= 16
at size 713 realloc used new buffer, diff= 16
at size 729 realloc used new buffer, diff= 16
at size 745 realloc used new buffer, diff= 16
at size 761 realloc used new buffer, diff= 16
at size 777 realloc used new buffer, diff= 16
at size 793 realloc used new buffer, diff= 16
at size 809 realloc used new buffer, diff= 16
at size 825 realloc used new buffer, diff= 16
at size 841 realloc used new buffer, diff= 16
at size 857 realloc used new buffer, diff= 16
at size 873 realloc used new buffer, diff= 16
at size 889 realloc used new buffer, diff= 16
at size 905 realloc used new buffer, diff= 16
at size 921 realloc used new buffer, diff= 16
at size 937 realloc used new buffer, diff= 16
at size 953 realloc used new buffer, diff= 16
at size 969 realloc used new buffer, diff= 16
at size 985 realloc used new buffer, diff= 16
at size 1001 realloc used new buffer, diff= 16
at size 1017 realloc used new buffer, diff= 16
at size 1033 realloc used new buffer, diff= 16
at size 1049 realloc used new buffer, diff= 16
…
--
A power surge on the Bridge is rapidly and correctly diagnosed as a faulty
capacitor by the highly-trained and competent engineering staff.
-- Things That Never Happen in "Star Trek" #9
This PR consists of two commits, both concerned with the
hardcoded default minimum length for PV buffers and AV arrays.
The PR changes PERL_STRLEN_NEW_MIN to be closer to the
likely actual allocation returned by malloc(). The intention
is to reduce the potential for unnecessary calls to sv_grow,
such as when building a string through concatenation, when
the PV buffer is actually big enough for the operation and
perl just does not know it.
The minimum value chosen for 64-bit is very likely too
small, but I am unfamiliar with memory-constrained
systems and wanted to play it safe. (Happy to raise it to
the very common 3*8 bytes if people are comfortable
with that.)
Actually measuring the minimum allocation size at
Configure time, for malloc() implementations that support
it, would be more accurate and I might try that in the next
development cycle.
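As a rough idea of what such a measurement could look like, here is a minimal sketch. It is not how Configure does or would do it. It assumes glibc's `malloc_usable_size()` extension where available and simply falls back to the requested size elsewhere; `probe_usable_size` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#if defined(__GLIBC__)
#  include <malloc.h>   /* malloc_usable_size() (glibc extension) */
#endif

/* Hypothetical probe: report how many bytes are usable in the block
 * returned for a `request`-byte malloc().  Where no introspection call
 * is known, fall back to the requested size.  Returns 0 if malloc()
 * fails. */
static size_t probe_usable_size(size_t request)
{
    assert(request > 0);
    void *p = malloc(request);
    size_t usable;
    if (!p)
        return 0;
#if defined(__GLIBC__)
    usable = malloc_usable_size(p);
#else
    usable = request;   /* assumption: no introspection available */
#endif
    free(p);
    return usable;
}
```

Calling `probe_usable_size(1)` would give a guess at the smallest allocation the platform's malloc() hands out. Other platforms offer similar calls under different names (the thread quoted earlier mentions malloc_size()), which a real probe would need to detect separately.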
The PERL_UNWARANTED_CHUMMINESS_WITH_MALLOC
stuff would also help with this problem, but I've previously been
told that it was disabled due to problems with COW that have
not been ironed out yet.