-
Notifications
You must be signed in to change notification settings - Fork 15.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby: Call "Class#new" over rb_class_new_instance in decoding #7352
Ruby: Call "Class#new" over rb_class_new_instance in decoding #7352
Conversation
This patch has almost no change in behaviour where users have not patched the implementation of new on either a specific proto object class, or `Google::Protobuf::MessageExts::ClassMethods`. The default implementation of `new`, and `rb_class_new_instance` have the same behaviour. By default when we call `new` on a class in Ruby, it goes to the `Class` class's implementation: ```ruby class Foo end >> Foo.method(:new).owner => Class ``` the `Class` implementation of `new` is (pseudocode, it's actually in c): ```ruby class Class def new(*args, &blk) instance = alloc instance.initialize(*args, &blk) instance end end ``` `rb_class_new_instance` does the same thing, it calls down to [`rb_class_s_new`](https://github.com/ruby/ruby/blob/v2_5_5/object.c#L2147), which calls `rb_class_alloc`, then `rb_obj_call_init`. `rb_funcall` is a variadic c function for calling a ruby method on an object, it takes: * A `VALUE` on to which the method should be called * An `ID` which should be an ID of a method, usually created with `rb_intern`, to get an ID from a string * An integer, the number of arguments calling the method with, * A number of `VALUE`s, to send to the method call. `rb_funcall` is the same as calling a method directly in Ruby, and will perform ancestor chain respecting method lookup. This means that in C extensions, if nobody has defined the `new` method on any classes or modules in a class's inheritance chain calling `rb_class_new_instance` is the same as calling `rb_funcall(klass, rb_intern("new"))`, *however* this removes the ability for users to define or monkey patch their own constructors in to the objects created by the C extension. In Ads, we define [`new`](https://git.io/JvFC9) on `Google::Protobuf::MessageExts::ClassMethods` to allow us to insert a monkeypatch which makes it possible to assign primitive values to wrapper type fields (e.g. Google::Protobuf::StringValue). The monkeypatch we apply works for objects that we create for the user via the `new` method. Before this commit, however, the patch does not work for the `decode` method, for the reasons outlined above. Before this commit, protobuf uses `rb_class_new_instance`. After this commit, we use `rb_funcall(klass, rb_intern("new"), 0);` to construct protobuf objects during decoding. While I haven't measured it this will have a very minor performance impact for decoding (`rb_funcall` will have to go to the method cache, which `rb_class_new_instance` will not). This does however do the "more rubyish" thing of respecting the protobuf object's inheritance chain to construct them during decoding. I have run both Ads and Cloud's test suites for Ruby libraries against this patch, as well as the protobuf Ruby gem's test suite locally.
LGTM, thanks Penelope! |
Thanks for writing this up Penelope! I'm glad to know that this can't crash our C extension. I do have a few concerns about this change:
That said, I understand that this fixes a bug in Ads right now though, and we already do something like this in other places: If you can verify that the performance impacts of this won't adversely affect the Ads client, I'll merge it. |
Hi Josh, I benchmarked this vs current master, and decoding was the same speed (within the margin of error)
|
This patch has almost no change in behaviour where users have not
patched the implementation of new on either a specific proto object
class, or
Google::Protobuf::MessageExts::ClassMethods
. The defaultimplementation of
new
, andrb_class_new_instance
have the samebehaviour.
By default when we call
new
on a class in Ruby, it goes to theClass
class's implementation:
the
Class
implementation ofnew
is (pseudocode, it's actually in c):rb_class_new_instance
does the same thing, it calls down torb_class_s_new
,which calls
rb_class_alloc
, thenrb_obj_call_init
.rb_funcall
is a variadic c function for calling a ruby method on an object,it takes:
VALUE
on to which the method should be calledID
which should be an ID of a method, usually created withrb_intern
,to get an ID from a string
VALUE
s, to send to the method call.rb_funcall
is the same as calling a method directly in Ruby, and will performancestor chain respecting method lookup.
This means that in C extensions, if nobody has defined the
new
method on anyclasses or modules in a class's inheritance chain calling
rb_class_new_instance
is the same as callingrb_funcall(klass, rb_intern("new"))
, however this removes the ability for users to define ormonkey patch their own constructors in to the objects created by the C
extension.
In Ads, we define
new
onGoogle::Protobuf::MessageExts::ClassMethods
to allow us to insert amonkeypatch which makes it possible to assign primitive values to wrapper type
fields (e.g. Google::Protobuf::StringValue). The monkeypatch we apply works for
objects that we create for the user via the
new
method. Before this commit,however, the patch does not work for the
decode
method, for the reasonsoutlined above. Before this commit, protobuf uses
rb_class_new_instance
.After this commit, we use
rb_funcall(klass, rb_intern("new"), 0);
to constructprotobuf objects during decoding. While I haven't measured it this will have
a very minor performance impact for decoding (
rb_funcall
will have to go to themethod cache, which
rb_class_new_instance
will not). This does however dothe "more rubyish" thing of respecting the protobuf object's inheritance chain
to construct them during decoding. What this means practically is that the monkeypatch where we override new works for the ads library in decoding. Other users could override new on
Google::Protobuf::MessageExts::ClassMethods
to introduce their own behavioursI have run both Ads and Cloud's test suites for Ruby libraries against this
patch, as well as the protobuf Ruby gem's test suite locally. I also tested what happens if a user does not return the right object type from their overriden new method. A reasonable exception is produced:
TypeError: wrong argument type Object (expected Message)
, and no crash occurs in the C code.