Remove abc inheritance from Serializable #8254
Here are some back of the envelope benchmarks run using IPython's
Codecov Report
@@ Coverage Diff @@
## branch-21.06 #8254 +/- ##
===============================================
Coverage ? 82.89%
===============================================
Files ? 105
Lines ? 17934
Branches ? 0
===============================================
Hits ? 14866
Misses ? 3068
Partials ? 0
LGTM. @jakirkham could you please take a quick look too when you have a moment? :)
LGTM
Are there use cases where this |
@jakirkham Yes, unfortunately. In an ideal world we would rely on duck typing everywhere, but that's definitely not the case internally for various reasons. As a result, lots of different code paths are impacted here. I included a benchmark above of the difference in constructing an Index from different types; here's the relevant data again:
That was just intended as a representative example. The same type of difference will be observed in a large number of different operations. Take this simple binop example (run via IPython):
The before/after numbers there are In addition, removing |
@gpucibot merge
Currently the `Serializable` class provides `serialize` and `deserialize` as `abstractmethod`s via the mechanisms afforded by inheritance from `abc.ABC`. Since this class is purely internal to `cudf` and is not describing an abstract interface in a manner useful to consumers of our code, the benefits of the abstract base class concept are outweighed by the performance and maintenance costs. In particular, `isinstance` checks on subclasses of `abc.ABC` are much more expensive than for normal classes (due to an expensive implementation of `__instancecheck__`), and (for better or worse) our code base currently makes use of these checks extensively. In addition, in certain places we can benefit from the use of custom metaclasses in `cudf`, but their usage becomes more cumbersome with `ABC` because metaclasses then also have to inherit from `ABCMeta` (which brings along any associated complexities). This PR removes that inheritance, replacing it with a much simpler approach that simply implements `serialize` and `deserialize` as raising `NotImplementedError`.
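The change described above can be illustrated with a minimal, self-contained sketch. This is not the actual `cudf` code: the class names `AbstractSerializable`, `Payload`, and `CustomMeta`, the `(header, frames)` shapes, and the helper `time_isinstance` are all hypothetical stand-ins chosen for illustration. It contrasts an `abc.ABC`-based base class with a plain class whose methods raise `NotImplementedError`, shows the metaclass conflict that `ABCMeta` introduces, and provides a rough way to compare `isinstance` costs.

```python
import abc
import timeit


class AbstractSerializable(abc.ABC):
    """Hypothetical stand-in for the ABC-based design described above."""

    @abc.abstractmethod
    def serialize(self):
        ...

    @classmethod
    @abc.abstractmethod
    def deserialize(cls, header, frames):
        ...


class Serializable:
    """Hypothetical stand-in for the plain-class design: no ABC machinery."""

    def serialize(self):
        raise NotImplementedError("Subclasses must implement serialize")

    @classmethod
    def deserialize(cls, header, frames):
        raise NotImplementedError("Subclasses must implement deserialize")


class Payload(Serializable):
    """Illustrative subclass; the (header, frames) layout is an assumption."""

    def __init__(self, data):
        self.data = data

    def serialize(self):
        header = {"type": type(self).__name__}
        frames = [self.data]
        return header, frames

    @classmethod
    def deserialize(cls, header, frames):
        return cls(frames[0])


class CustomMeta(type):
    """A custom metaclass that does not inherit from ABCMeta."""


def metaclass_conflicts(base):
    """Return True if combining `base` with CustomMeta raises a TypeError."""
    try:
        type.__call__(CustomMeta, "Combined", (base,), {})
    except TypeError:
        return True
    return False


def time_isinstance(obj, cls, number=200_000):
    """Rough timing of repeated isinstance checks (seconds for `number` calls)."""
    return timeit.timeit(lambda: isinstance(obj, cls), number=number)
```

With these definitions, `time_isinstance(Payload(b"x"), Serializable)` can be compared against the same call on an instance of a concrete `AbstractSerializable` subclass. The absolute numbers depend on the machine and Python version, so none are quoted here. The trade-off is that a missing override in the plain-class version is only detected when the method is called, not when the class is instantiated.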