Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Guess pgresult size allocated by libpq #23

Merged
merged 1 commit into from
Jun 23, 2018

Conversation

larskanis
Copy link
Collaborator

As published here
https://discuss.samsaffron.com/t/rubys-external-malloc-problem/431
memory allocated by libpq is not considered by Ruby's GC.

This patch tries to guess the memory size based on the field sized of a sample set of rows. It also mimics parts of libpq's allocation mechanism.

It makes use of rb_gc_adjust_memory_usage() introduced in Ruby-2.4:
https://bugs.ruby-lang.org/issues/12690
This ensures, that Ruby's GC is triggered early enough to free PG::Result objects.

Also use TypedData to allow retrieval of PG::Result sizes by

  ObjectSpace.memsize_of(result)

It's WIP because of missing tests and missing fine tuning of size approximation.

cc @ged , @SamSaffron

@SamSaffron
Copy link

Nice,

Would you know what the impact here is performance wise on the pathological case which is an ultra wide table (say 40 columns) single row, for the added accounting?

Also perhaps let's open a discussion on the pg mailing list about adding something to libpq here for pulling out the number cause it seems it would be as simple as number of buffers * buffer size?

@larskanis
Copy link
Collaborator Author

larskanis commented Jun 14, 2018 via email

@SamSaffron
Copy link

I wonder, can we make this opt-out then, I think trying for safety out of the box is fine, but AR and places that are careful to .clear should not be penalized here.

@larskanis
Copy link
Collaborator Author

Sure, there should be a switch to enable/disable such a performance regression. Since it worked all the years now, I think maybe opt-in.

@SamSaffron
Copy link

SamSaffron commented Jun 14, 2018 via email

@larskanis larskanis force-pushed the account-result-size branch 3 times, most recently from 35b64d9 to 7138926 Compare June 18, 2018 20:02
As published here
  https://discuss.samsaffron.com/t/rubys-external-malloc-problem/431
memory allocated by libpq is not considered by Ruby's GC.

This patch tries to guess the memory size based on the field sizes of
a small sample set of fields. The sample fields are chose by a simple
heuristic. The size calculation mimics parts of libpq's allocation
mechanism.

It makes use of rb_gc_adjust_memory_usage() introduced in Ruby-2.4:
  https://bugs.ruby-lang.org/issues/12690
This ensures, that Ruby's GC is triggered early enough to free PG::Result
objects.

Also use TypedData to allow retrieval of PG::Result sizes by
  ObjectSpace.memsize_of(result)
@larskanis
Copy link
Collaborator Author

I did some more fine tuning and benchmarking. The result memsize calculation is now opt-out, because the performance impact is below precision of measurement. true means memsize enabled, false memsize disabled.

Calculating -------------------------------------
         raw_id true    291.497  (± 2.1%) i/s -      1.484k in   5.093319s
        raw_all true     26.881  (± 3.7%) i/s -    136.000  in   5.062055s
      raw_all_1 true    580.691  (± 4.3%) i/s -      2.950k in   5.088647s
   raw_id_clear true    293.475  (± 1.4%) i/s -      1.484k in   5.057401s
  raw_all_clear true     26.934  (± 0.0%) i/s -    136.000  in   5.050548s
raw_all_1_clear true    554.946  (±13.0%) i/s -      2.688k in   5.034526s
        raw_id false    292.696  (± 1.4%) i/s -      1.479k in   5.054145s
       raw_all false     27.222  (± 0.0%) i/s -    138.000  in   5.070949s
     raw_all_1 false    577.804  (± 2.1%) i/s -      2.928k in   5.069737s
  raw_id_clear false    297.485  (± 1.7%) i/s -      1.512k in   5.084062s
 raw_all_clear false     26.607  (± 3.8%) i/s -    134.000  in   5.053072s
raw_all_1_clear false   563.031  (±12.1%) i/s -      2.750k in   5.069865s

I would merge this as is now, unless there are any concerns.

@SamSaffron
Copy link

very interesting you can not measure this anymore. curious can you confirm it is working as expected and perhaps just measure by hand with clock_gettime(CLOCK_MONOTONIC, &ts); before and after method call to see what the actual nonsec cost is?

Not against merging this in, but I am curious on the actual cost here.

@larskanis
Copy link
Collaborator Author

@SamSaffron Yes it's working as expected. It can be seen by the MemoryProfiler or by your exploit. It doesn't bloat memory any longer.

I measured the time for calculating the memory size per clock_gettime(CLOCK_MONOTONIC) and got the following results on my Ryzen system:

rows x cols memsize query time
1x45 551 nsec 125 usec
1000x45 1132 nsec 4200 usec
1000x1 561 nsec 416 usec

The measurement itself takes around 70 nsecs, so this should be substracted.

@SamSaffron
Copy link

Nice! I think at this cost it is cheap enough to have default on. I do still feel like we should work with libpq here so I will post something to pghackers and see what they say.

Perhaps we can merge this (and maybe Jeremy's patches) and cut a release? There is so much goodness that was added to the gem in the last 5-6 months.

@larskanis larskanis merged commit c0cad36 into ged:master Jun 23, 2018
@larskanis larskanis deleted the account-result-size branch August 17, 2021 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants