Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce binary size overhead of new-style constructors #1014

Merged
merged 2 commits into from
Aug 28, 2017

Conversation

dean0x7d
Copy link
Member

This is a follow-up to #805. It removes the binary size increase of the new-style constructors (about ~1% binary size reduction).

The lookup of the self type and value pointer are moved out of template code and into dispatcher. This brings down the binary size of constructors back to the level of the old placement-new approach. (It also avoids a second lookup for init_instance.)

With this implementation, mixing old- and new-style constructors in the same overload set may result in some runtime overhead for temporary allocations/deallocations, but this should be fine as
old style constructors are phased out.

@@ -472,6 +490,18 @@ class cpp_function : public function {
size_t args_to_copy = std::min(pos_args, n_args_in);
size_t args_copied = 0;

// 0. Inject new-style `self` argument
if (it->is_new_style_constructor) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really minor: it-> -> func., for consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I missed that. Fixed now.

@@ -352,9 +351,9 @@ def test_reallocations(capture, msg):
create_and_destroy(1.5)
assert msg(capture) == strip_comments("""
noisy new # allocation required to attempt first overload
noisy delete # have to dealloc before stashing factory-generated pointer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before stashing factory-generted pointer -> before considering factory init overload

@jagerman
Copy link
Member

Looks good to me (I left a couple of very minor cosmetic changes).

With this implementation, mixing old- and new-style constructors in the same overload set may result in some runtime overhead for temporary allocations/deallocations, but this should be fine as
old style constructors are phased out.

That was basically true for the previous implementation, too. The only extra overhead I see is that the deallocation gets called even if the new-style constructor isn't invokable with the given constructor arguments. But for that to matter, I think you'd have to do something like interspersing old and new (e.g. .def(...old...).def(...new...).def(...old...)); otherwise I don't see any more overhead here. That likely unusual pattern, combined with the fact that we're really only talking about an extra malloc/free of memory big enough to hold an instance, seems like a perfectly reasonable inefficiency to me.

@dean0x7d
Copy link
Member Author

But for that to matter, I think you'd have to do something like interspersing old and new (e.g. .def(...old...).def(...new...).def(...old...)); otherwise I don't see any more overhead here.

Yeah, seems unlikely.

This reminds me: What do you think about adding a deprecation warning for the old-style constructors? (And eventually removing them in v3.0?). It should be possible to detect at module initialization time as name == "__init__" && !rec.is_new_style_constructor. I'm just a bit worried that a runtime deprecation warning might be more annoying than usual.

@jagerman
Copy link
Member

I don't like the idea of a runtime deprecation warning; it has the nasty feature of annoying the end-user rather than just the module developer (unless it's something pretty serious like "this is definitely going to break in the next release"). Deprecating it in the changelog and removing it from the documentation is fine with me, though. The placement new stuff is currently still there, but I can't see anything that placement new would let you do anything you couldn't do with a (simpler) factory function constructor.

@jagerman
Copy link
Member

On the other hand, a runtime warning when not under NDEBUG might annoy the right people without being annoying more generally.

@dean0x7d dean0x7d modified the milestone: v2.2 Aug 23, 2017
@dean0x7d
Copy link
Member Author

On the other hand, a runtime warning when not under NDEBUG might annoy the right people without being annoying more generally.

That sounds pretty good to me. Python's builtin warning handlers print messages only for the first unique appearance, so there would be a single warning per class instead of per __init__. I'll put something together.

@dean0x7d
Copy link
Member Author

Before deprecating placement-new, there was still one more place where it was being used that didn't have an alternative: __setstate__. py::pickle() aims to replace it here:

    // old
    py::class<Foo>(m, "Foo")
        ...
        .def("__getstate__", [](const Foo &self) {
            return py::make_tuple(self.value1(), self.value2(), ...);
        })
        .def("__setstate__", [](Foo &self, py::tuple t) {
            new (&self) Foo(t[0].cast<std::string>(), ...);
        });

    // new
    py::class<Foo>(m, "Foo")
        ...
        .def(py::pickle(
            [](const Foo &self) { // __getstate__
                return py::make_tuple(f.value1(), f.value2(), ...); // unchanged
            },
            [](py::tuple t) { // __setstate__, note: no `self` argument
                return new Foo(t[0].cast<std::string>(), ...);
                // or: return std::make_unique<Foo>(...); // return by holder
                // or: return Foo(...); // return by value (move constructor)
            }
        ));

py::pickle makes sure that the return type of __getstate__ matches the argument type of __setstate__. The new-style __setstate__ matches the new __init__ and reuses the same implementation.

@jagerman
Copy link
Member

This looks good to me.

The lookup of the `self` type and value pointer are moved out of
template code and into `dispatcher`. This brings down the binary
size of constructors back to the level of the old placement-new
approach. (It also avoids a second lookup for `init_instance`.)

With this implementation, mixing old- and new-style constructors
in the same overload set may result in some runtime overhead for
temporary allocations/deallocations, but this should be fine as
old style constructors are phased out.
@dean0x7d
Copy link
Member Author

I left out the 2 commits with pickling changes and I'll put them in a separate PR since they are not related to this and they introduce a significant change (sorry, should have done this initially). That also keeps the changelog entries clear.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants