Split cache keyword from method replacement #43
Comments
I took a look at a synthetic benchmark. All the numbers here are arbitrary but feel reasonable! I define two classes: one with 1,000 new methods (beyond those from Object) and one with a single method. Then, I do a very basic load: just replacing a value with itself. I do this 1,000 times for each case, and in each case I then call a method 1,000 times on the loaded object. I tested this with and without caching on both objects. I also added my proposed method_replacement option:

require 'benchmark'
require './lib/batch-loader'
class ManyMethods
1.upto(1000) do |i|
define_method("method_#{i}") { i }
end
end
class FewMethods
def method_1
1
end
end
def load_value(x, **opts)
  BatchLoader.for(x).batch(**opts) do |xs, loader|
xs.each { |x| loader.call(x, x) }
end
end
def benchmark(klass:, **opts)
1000.times do
    value = load_value(klass.new, **opts)
1000.times { value.method_1 }
end
end
Benchmark.bmbm do |x|
x.report('replacement + many methods') { benchmark(klass: ManyMethods) }
x.report('replacement + few methods') { benchmark(klass: FewMethods) }
x.report('no replacement + many methods') { benchmark(klass: ManyMethods, method_replacement: false) }
x.report('no replacement + few methods') { benchmark(klass: FewMethods, method_replacement: false) }
x.report('no cache + many methods') { benchmark(klass: ManyMethods, cache: false, method_replacement: false) }
x.report('no cache + few methods') { benchmark(klass: FewMethods, cache: false, method_replacement: false) }
end

And here are the results (for me):
I found it a bit suspicious that no replacement was always faster than caching, so I bumped it up to 10,000 calls on each loaded object, and I got (skipping the cache-less case as it's too slow):
Which shows that, for an object with a lot of methods, the technique of replacing the methods can be slower than not replacing them at all.

@exAspArk, what do you think? The actual patch so far is just:

diff --git a/lib/batch_loader.rb b/lib/batch_loader.rb
index ac1638c..f2e2e29 100644
--- a/lib/batch_loader.rb
+++ b/lib/batch_loader.rb
@@ -23,10 +23,11 @@ class BatchLoader
@__executor_proxy = executor_proxy
end
- def batch(default_value: nil, cache: true, key: nil, &batch_block)
+ def batch(default_value: nil, cache: true, key: nil, method_replacement: true, &batch_block)
@default_value = default_value
@cache = cache
@key = key
+ @method_replacement = method_replacement
@batch_block = batch_block
__executor_proxy.add(item: @item)
@@ -74,7 +75,7 @@ class BatchLoader
def __sync!
loaded_value = __sync
- if @cache
+ if @method_replacement
__replace_with!(loaded_value)
else
       loaded_value
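With a patch like this, callers could opt out of replacement per batch. A minimal usage sketch, assuming a hypothetical User model and lazy_user helper (neither is part of the patch itself; only the method_replacement: false keyword comes from the diff above):

# Hypothetical caller; only method_replacement: false comes from the patch above.
def lazy_user(user_id)
  BatchLoader.for(user_id).batch(method_replacement: false) do |user_ids, loader|
    User.where(id: user_ids).each { |user| loader.call(user.id, user) }
  end
end

With the flag set to false, each call on the returned lazy object keeps going through method_missing instead of copying User's methods onto the proxy.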
http://franck.verrot.fr/blog/2015/07/12/benchmarking-ruby-method-missing-and-define-method (although written in 2015) seems to suggest that dispatching through method_missing (batch-loader/lib/batch_loader.rb, line 122 at 43e4f62) is the slower path per call.
Yes, I think that implementation will be faster over many calls, but maybe not if there aren't that many. Having this configurable by the library's users would be nice so that we can experiment, anyway 🙂
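For anyone who wants to see the per-call dispatch difference the blog post is about, here is a self-contained sketch that is independent of batch-loader. Note it only compares raw dispatch cost; it says nothing about the up-front cost of defining or replacing many methods, which is the other half of the trade-off:

require 'benchmark'

# A value whose single reader is served via method_missing.
class ViaMethodMissing
  def method_missing(name, *args, &block)
    name == :value ? 42 : super
  end

  def respond_to_missing?(name, include_private = false)
    name == :value || super
  end
end

# The same reader defined up front with define_method.
class ViaDefineMethod
  define_method(:value) { 42 }
end

n = 1_000_000
mm = ViaMethodMissing.new
dm = ViaDefineMethod.new

Benchmark.bm(16) do |x|
  x.report('method_missing') { n.times { mm.value } }
  x.report('define_method')  { n.times { dm.value } }
end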
Hey @smcgivern and @stanhu,

Thanks a lot for opening the issue and diving into the performance optimizations! Does this approach help to solve the mentioned GitLab issues? It would be amazing if you could open a PR.

Maybe we could rename the flag to replace_methods:

foo = BatchLoader.for(bar).batch(replace_methods: true) { ... }

From your benchmarks, using the flag doesn't always improve the performance. I guess it primarily depends on 2 factors: the number of methods on the loaded object, and the number of method calls made on it.
We could also update the existing synthetic benchmarks (https://github.com/exAspArk/batch-loader/tree/master/spec/benchmarks) so that other devs could check before and after their changes locally, for example :)
Exactly that! I don't want to make a decision for people here; I'm sure that within the GitLab application, there are cases where replacing methods is faster, and cases where it isn't.

I've created #45 now, thanks for the pointers!
Currently, the cache keyword argument is used for two things:

1. #__sync uses it to set a @synced instance variable, which tells later calls to #__sync not to reload the value when another method is called on it (https://github.com/exAspArk/batch-loader/blob/v1.3.0/lib/batch_loader.rb#L49-L55).
2. #__sync! uses it to replace methods on the proxied object with their 'real' equivalents (https://github.com/exAspArk/batch-loader/blob/v1.3.0/lib/batch_loader.rb#L77-L78).

In our use of this gem at GitLab, we've noticed that it's very easy for objects with a large interface to spend a lot of time in #__replace_with!. See https://gitlab.com/gitlab-org/gitlab-ce/issues/60373#note_159582633 and https://gitlab.com/gitlab-org/gitlab-ce/issues/43065#note_160469960 for some examples.

In some exploratory testing, we found that retaining only item 1 from my list above gave us better performance than either disabling the cache entirely, or having both 1 and 2 coupled together.

Could we consider a replace_methods keyword argument? If unset, it would default to the same value as cache, but if set to false, it would skip item 2 above. I am happy to create a PR if that sounds reasonable.
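To make the coupling concrete for readers who haven't opened the source, here is a self-contained toy proxy that mimics the two behaviours described above. It is an illustration written for this issue, not batch-loader's actual implementation:

# Toy lazy proxy: both memoisation (item 1) and method copying (item 2)
# hang off the single cache flag, mirroring the coupling described above.
class ToyLazyProxy
  def initialize(cache: true, &loader)
    @cache = cache
    @loader = loader
  end

  def method_missing(name, *args, &block)
    value = sync
    return super unless value.respond_to?(name)
    replace_with!(value) if @cache   # item 2 is tied to the same flag as item 1
    value.public_send(name, *args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    sync.respond_to?(name, include_private) || super
  end

  private

  # Item 1: remember the loaded value so later calls don't reload it.
  def sync
    return @value if @cache && @synced
    @value = @loader.call
    @synced = true
    @value
  end

  # Item 2: copy the loaded object's interface onto the proxy. Defining one
  # singleton method per public method is what gets expensive for objects
  # with a large interface.
  def replace_with!(value)
    value.class.public_instance_methods(false).each do |name|
      define_singleton_method(name) { |*args, &block| value.public_send(name, *args, &block) }
    end
  end
end

Splitting the two flags would mean keeping the memoisation in sync while being able to skip replace_with! entirely.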