WIP: Guess pgresult size allocated by libpq #23
Nice. Do you know what the performance impact of the added accounting is in the pathological case, an ultra-wide table (say 40 columns) with a single row? Also, perhaps we should open a discussion on the pg mailing list about adding something to libpq for pulling out the number, because it seems it would be as simple as number of buffers * buffer size.
Absolutely - we should open a discussion about a libpq memsize function. Possibly there's also a smarter way of guessing than the one I used. This patch is only an interim workaround.
A quick test showed around 10% performance decrease. I used: https://gist.github.com/SamSaffron/409805f6c8447d344e04ad68505ec43f
Interestingly, the decrease is roughly the same for single rows as for larger result sets with 1000 rows. I guess that although only 20 rows are scanned, the scan causes a lot of memory cache misses.
I wonder, can we make this opt-out then? I think aiming for safety out of the box is fine, but AR and other code that is careful to call .clear should not be penalized here.
Sure, there should be a switch to enable or disable something with such a performance regression. Since it has worked without this for all these years, I think maybe opt-in.
Agreed, maybe opt-in until we get a patch into libpq to get the size cheaply.
…On Thu, 14 Jun 2018 at 8:40 pm, Lars Kanis ***@***.***> wrote:
Sure, there should be a switch to enable/disable such a performance
regression. Since it worked all the years now, I think maybe opt-in.
35b64d9 to 7138926
As published here
https://discuss.samsaffron.com/t/rubys-external-malloc-problem/431
memory allocated by libpq is not considered by Ruby's GC.
This patch tries to guess the memory size based on the field sizes of a small sample set of fields. The sample fields are chosen by a simple heuristic. The size calculation mimics parts of libpq's allocation mechanism.
It makes use of rb_gc_adjust_memory_usage() introduced in Ruby-2.4:
https://bugs.ruby-lang.org/issues/12690
This ensures that Ruby's GC is triggered early enough to free PG::Result objects.
Also use TypedData to allow retrieval of PG::Result sizes by ObjectSpace.memsize_of(result)
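The sampling idea in the commit message can be illustrated in plain Ruby. This is only a sketch under stated assumptions, not the patch itself: the real calculation is C code inside the pg extension and reports its estimate through rb_gc_adjust_memory_usage(). The method name `estimate_result_bytes`, the constants `SAMPLE_LIMIT` and `PER_FIELD_OVERHEAD`, and the evenly spaced sampling rule are all hypothetical and do not reproduce the patch's heuristic or libpq's real allocation sizes.

```ruby
# Hypothetical pure-Ruby sketch; the actual patch is C code in the pg extension.
# Assumption: a result is modelled as an array of rows, each row being an
# array of String values (or nil for NULL).

SAMPLE_LIMIT = 20          # assumed cap on the number of inspected fields
PER_FIELD_OVERHEAD = 16    # assumed bookkeeping bytes per field, not libpq's figure

def estimate_result_bytes(rows)
  fields = rows.flatten(1)
  return 0 if fields.empty?

  # Inspect evenly spaced fields instead of scanning every value.
  step = [fields.size / SAMPLE_LIMIT, 1].max
  sample = fields.each_slice(step).map(&:first).first(SAMPLE_LIMIT)

  # Average payload of the sample; +1 models a terminating NUL byte.
  avg = sample.sum { |f| f.nil? ? 0 : f.bytesize + 1 } / sample.size.to_f

  # Extrapolate the sampled average to all fields.
  (fields.size * (avg + PER_FIELD_OVERHEAD)).round
end

puts estimate_result_bytes([["abcd", "abcd"]])
```

The trade-off is the one discussed in this thread: a small fixed sample keeps the cost per result nearly constant, at the price of a rough estimate when field sizes vary a lot.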
7138926 to c0cad36
I did some more fine-tuning and benchmarking. The result memsize calculation is now opt-out, because the performance impact is below the precision of measurement.
I would merge this as is now, unless there are any concerns.
Very interesting that you cannot measure this anymore. Curious, can you confirm it is working as expected, and perhaps just measure by hand? I'm not against merging this in, but I am curious about the actual cost here.
@SamSaffron Yes, it's working as expected. This can be seen with MemoryProfiler or with your exploit: it doesn't bloat memory any longer. I measured the time for calculating the memory size per
The measurement itself takes around 70 nsecs, so this should be subtracted.
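Subtracting the cost of the measurement itself, as described above, might look roughly like this in Ruby. It is a sketch only: the helper names `now_ns`, `avg_ns` and `net_cost_ns` are hypothetical, and the block being timed is a stand-in, not the memsize calculation from the patch.

```ruby
# Monotonic clock in nanoseconds, unaffected by wall-clock adjustments.
def now_ns
  Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond)
end

# Average time per call of the given block, in nanoseconds.
def avg_ns(iterations)
  start = now_ns
  iterations.times { yield }
  (now_ns - start) / iterations.to_f
end

# Cost of the work with the empty-loop overhead subtracted.
def net_cost_ns(iterations, &work)
  baseline = avg_ns(iterations) { }   # overhead of the measurement alone
  gross = avg_ns(iterations, &work)   # overhead plus the actual work
  gross - baseline
end

# Stand-in workload; replace with the operation under test.
cost = net_cost_ns(100_000) { "x" * 100 }
puts format("%.1f ns per call", cost)
```

For operations in the range of tens of nanoseconds the result is noisy and can even come out slightly negative, so several runs are usually needed before trusting the number.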
Nice! I think at this cost it is cheap enough to have default on. I do still feel like we should work with libpq here so I will post something to pghackers and see what they say. Perhaps we can merge this (and maybe Jeremy's patches) and cut a release? There is so much goodness that was added to the gem in the last 5-6 months.
As published here
https://discuss.samsaffron.com/t/rubys-external-malloc-problem/431
memory allocated by libpq is not considered by Ruby's GC.
This patch tries to guess the memory size based on the field sizes of a sample set of rows. It also mimics parts of libpq's allocation mechanism.
It makes use of rb_gc_adjust_memory_usage() introduced in Ruby-2.4:
https://bugs.ruby-lang.org/issues/12690
This ensures that Ruby's GC is triggered early enough to free PG::Result objects.
Also use TypedData to allow retrieval of PG::Result sizes by ObjectSpace.memsize_of(result)
It's WIP because of missing tests and missing fine tuning of size approximation.
cc @ged , @SamSaffron
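For readers unfamiliar with ObjectSpace.memsize_of, which the commit message names as the way to retrieve PG::Result sizes, here is a minimal illustration using plain strings instead of a PG::Result, so it runs without a database. It assumes CRuby and shows only the general behaviour of memsize_of, not the numbers the patch would report.

```ruby
require 'objspace'

small = "x" * 10
large = "x" * 100_000

# memsize_of returns the bytes Ruby attributes to one object, including
# heap buffers that the object's type reports through its size callback.
puts ObjectSpace.memsize_of(small)
puts ObjectSpace.memsize_of(large)
```

A TypedData object in a C extension can supply such a size callback, which is what would let a result object show up with a meaningful size here.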