-
Notifications
You must be signed in to change notification settings - Fork 167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pickling of generic annotations/types in 3.5+ #318
Conversation
Codecov Report
@@ Coverage Diff @@
## master #318 +/- ##
==========================================
- Coverage 93.33% 93.29% -0.05%
==========================================
Files 2 2
Lines 765 790 +25
Branches 156 162 +6
==========================================
+ Hits 714 737 +23
Misses 26 26
- Partials 25 27 +2
Continue to review full report at Codecov.
|
Don't worry about the failing CI. It should be resolved as soon as we merge: #342. |
Maybe the lint failures should be fixed though. |
You may want to rebase, since we recently dropped Python 2 support. |
@pierreglaser Done. I had to struggle a bit to get the coverage test passing, but it's good now :p |
Using a potentially pathological example inspired from python/typing#511, I get: In [7]: from cloudpickle import loads, dumps
...: from typing import Generator
...: g = Generator[int, None, None]
...: loads(dumps(g))
Out[7]: typing.Generator[int, NoneType, NoneType] Could you look into that @valtron ? |
Could you elaborate? Do you mean I should add a test for |
My bad, I did not check the |
Out of curiosity, do we need My concern comes from this comment (from python/typing#511), suggesting that pickling typing constructs is impossible by design.
(From a quick skim at the diff between |
Thank you for this remark, we (at least I) was not aware of that. Indeed, on In [14]: from cloudpickle import loads, dumps
...: import typing
...: T = typing.TypeVar('T', int, float, complex)
...: s = dumps(T)
...: del T
...: depickled_T = loads(s) # raises AttributeError Which is, to me, a bug (as we did not stated explicitly that we do not support it) |
Just for the
Yeah, those two methods do a lot of stuff I don't understand, but I don't think it loses any information, so in principle it can be done. I just wrote the most straightforward code that worked, and only a small fixup was needed to account for the "conversion" that it does. I know it's wonky to try to write this without having a full understanding of what
Glad you found that bug! I should add that as a test. |
I'd have to double check, but I'm in favour of switching to |
Let's not introduce |
One other question: why is it that this feature is not backported to |
I left it for later, so I'll be doing it soon.
Will do. |
Seems to already work for 3.5. |
Added test for I couldn't get a test that failed without the PR's changes, because of this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also +1 for not adding a hard dependency on typing_extensions. We need to install typing_extensions on some CI workers though to make sure the integration works as expected.
Here are some comments about testing:
tests/cloudpickle_test.py
Outdated
] | ||
|
||
for obj in objs: | ||
_ = pickle_depickle(obj, protocol=self.protocol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need a new test (or extend this one to check that classes and functions defined using such type annotations are pickled as expected. However because of the class tracking feature of cloudpickle, using pickle_depickle
+ checks might not be enough to properly test this (#313). We probably need to use a subprocess in a similar way to the way we do insubprocess_pickle_echo
.
For instance we could have a subprocess_pickle_and_run_checks
helper and use it as follows:
for type_hint in type_hints:
class MyClass:
attribute: type_hint
type_hint_repr: str
def __init__(self):
# executed in the parent process
self.type_hint_repr = repr(self.__annotations__["attribute"])
def run_checks(self):
# executed in the child process
assert self.type_hint_repr == repr(self.__annotations__["attribute"])
instance = MyClass()
subprocess_pickle_run_checks(instance)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I don't know how I missed that. I'll add those tests.
About class tracking: it sounds like all tests should be checking both ways. Maybe pickle_depickle
can be a test parameter (like self.protocol
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a POC of that idea: 873e556
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not opposed to use functools.partial
on pickle_echo
to remove the redundant protocol argument but I am not sure how it relates to my original concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have done a better job linking to the relevant part of the PR :p Here it is: 873e556#diff-ad0a048ef2f4c0dc3eaf2baa553b134eR2097.
By parametrizing pickle_depickle
, it becomes possible to add a subclass that runs all the tests using subprocess_pickle_echo
.
About this - the contract of from typing import Generic, TypeVar
from cloudpickle import loads, dumps
T = TypeVar('T')
G = Generic[T]
class MyClass(G): # this semantic allows people to use MyClass[int] as a type hint
pass
loads(dumps(MyClass)) errors out on PS: I needed some time to get around the meaning of this semantic ( |
As far as I can tell, that snippet fails because of the global lookup for typevar Here's the errors I got: Python < 3.7:
Python >= 3.7:
|
My bad, I was using It does not change my statement though: this error is a breach of In addition, this error on 3.6 is a little bit worrying. It comes from an issue in this part of the code that is revealed by a combination of |
Ah, thanks for the explanation. So there are two small changes that can be factored out of here:
Should I make a PR for each? |
Sure :) |
I tried making a PR just for this test, but
... which is because it's calling I think maybe this PR should be split up into two parts:
What do you think? |
A few updates:
In [28]: from typing import TypeVar, Generic
...: tvs = []
...: gvs = []
...: for i in range(200):
...: t = TypeVar('t'+str(i))
...: g = Generic[t]
...: tvs.append(t)
...: gvs.append(g)
...:
...: print(gvs[0] is Generic[tvs[0]]) prints For these two reasons I wonder if The last commit disabled Also, I need to investigate why I'm not merging this yet -- I'd like @ogrisel to take a look at it before that, but it might take a couple extra weeks given the current circumstances. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but it would be great to understand what's going on with code coverage.
@@ -941,6 +955,31 @@ def inject_addons(self): | |||
"""Plug in system. Register additional pickling functions if modules already loaded""" | |||
pass | |||
|
|||
if sys.version_info < (3, 7): # pragma: no branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be enabled if typing_extensions
is not available, right?
if sys.version_info < (3, 7): # pragma: no branch | |
if sys.version_info < (3, 7) and _typing_extensions is not None: # pragma: no branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
save_parametrized_type_hints
is useful to pickle:
typing_extensions
construct such asLiteral
andFinal
- some stdlib
typing
construct, such asClassVar
,Union
,Tuple
,Generic
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But do want to enable a half broken feature in cloudpickle when typing_extensions
is not installed?
I would rather have the type annotations to be dropped or raise an explicit error message that states that installing typing_extensions
on Python 3.6 an earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this code breaks if typing_extensions
is not installed: for instance, the CI tests the pickling of typing
constructs in all entries of the CI whether typing_extensions
is installed or not (typing_extensions
is installed in only one entry in the CI).
All typing_extensions
objects in this branch are created through this process, which include import guards.
Can you precise what the "half-broken feature" is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typing extensions is only an extension of the typing
module in Python 3.5-3.6
. It only exposes new type annotations and does not impact the behavior of cloudpickle
when pickling type annotations from the typing
module.
I guess that the fact that we deleted more lines than we added in this PR can potentially make the overall coverage percentage decrease even though the coverage diff is 100% (as we can potentially increase the proportion of uncovered lines by deleting only covered lines). We can try to have smarter (or at least a little bit looser) |
No worries. The diff coverage looks good. |
Maybe launch CI downstream before merging this PR? |
Yes. I'll also do a last pass on the tests before merging. |
Co-Authored-By: Olivier Grisel <[email protected]>
Merged! Thank you very much @valtron and @pierreglaser! |
Thanks a lot @valtron for this sequence of contributions! |
Thanks from me too @valtron et al, it solves a production issue with dask for us! |
2.0.0 ===== - Python 3.5 is no longer supported. - Support for registering modules to be serialised by value. This allows code defined in local modules to be serialised and executed remotely without those local modules installed on the remote machine. ([PR #417](cloudpipe/cloudpickle#417)) - Fix a side effect altering dynamic modules at pickling time. ([PR #426](cloudpipe/cloudpickle#426)) - Support for pickling type annotations on Python 3.10 as per [PEP 563]( https://www.python.org/dev/peps/pep-0563/) ([PR #400](cloudpipe/cloudpickle#400)) - Stricter parametrized type detection heuristics in _is_parametrized_type_hint to limit false positives. ([PR #409](cloudpipe/cloudpickle#409)) - Support pickling / depickling of OrderedDict KeysView, ValuesView, and ItemsView, following similar strategy for vanilla Python dictionaries. ([PR #423](cloudpipe/cloudpickle#423)) - Suppressed a source of non-determinism when pickling dynamically defined functions and handles the deprecation of co_lnotab in Python 3.10+. ([PR #428](cloudpipe/cloudpickle#428)) 1.6.0 ===== - `cloudpickle`'s pickle.Pickler subclass (currently defined as `cloudpickle.cloudpickle_fast.CloudPickler`) can and should now be accessed as `cloudpickle.Pickler`. This is the only officially supported way of accessing it. ([issue #366](cloudpipe/cloudpickle#366)) - `cloudpickle` now supports pickling `dict_keys`, `dict_items` and `dict_values`. ([PR #384](cloudpipe/cloudpickle#384)) 1.5.0 ===== - Fix a bug causing cloudpickle to crash when pickling dynamically created, importable modules. ([issue #360](cloudpipe/cloudpickle#354)) - Add optional dependency on `pickle5` to get improved performance on Python 3.6 and 3.7. ([PR #370](cloudpipe/cloudpickle#370)) - Internal refactoring to ease the use of `pickle5` in cloudpickle for Python 3.6 and 3.7. ([PR #368](cloudpipe/cloudpickle#368)) 1.4.1 ===== - Fix incompatibilities between cloudpickle 1.4.0 and Python 3.5.0/1/2 introduced by the new support of cloudpickle for pickling typing constructs. ([issue #360](cloudpipe/cloudpickle#360)) - Restore compat with loading dynamic classes pickled with cloudpickle version 1.2.1 that would reference the `types.ClassType` attribute. ([PR #359](cloudpipe/cloudpickle#359)) 1.4.0 ===== **This version requires Python 3.5 or later** - cloudpickle can now all pickle all constructs from the ``typing`` module and the ``typing_extensions`` library in Python 3.5+ ([PR #318](cloudpipe/cloudpickle#318)) - Stop pickling the annotations of a dynamic class for Python < 3.6 (follow up on #276) ([issue #347](cloudpipe/cloudpickle#347)) - Fix a bug affecting the pickling of dynamic `TypeVar` instances on Python 3.7+, and expand the support for pickling `TypeVar` instances (dynamic or non-dynamic) to Python 3.5-3.6 ([PR #350](cloudpipe/cloudpickle#350)) - Add support for pickling dynamic classes subclassing `typing.Generic` instances on Python 3.7+ ([PR #351](cloudpipe/cloudpickle#351)) 1.3.0 ===== - Fix a bug affecting dynamic modules occuring with modified builtins ([issue #316](cloudpipe/cloudpickle#316)) - Fix a bug affecting cloudpickle when non-modules objects are added into sys.modules ([PR #326](cloudpipe/cloudpickle#326)). - Fix a regression in cloudpickle and python3.8 causing an error when trying to pickle property objects. ([PR #329](cloudpipe/cloudpickle#329)). - Fix a bug when a thread imports a module while cloudpickle iterates over the module list ([PR #322](cloudpipe/cloudpickle#322)). - Add support for out-of-band pickling (Python 3.8 and later). https://docs.python.org/3/library/pickle.html#example ([issue #308](cloudpipe/cloudpickle#308)) - Fix a side effect that would redefine `types.ClassTypes` as `type` when importing cloudpickle. ([issue #337](cloudpipe/cloudpickle#337)) - Fix a bug affecting subclasses of slotted classes. ([issue #311](cloudpipe/cloudpickle#311)) - Dont pickle the abc cache of dynamically defined classes for Python 3.6- (This was already the case for python3.7+) ([issue #302](cloudpipe/cloudpickle#302)) 1.2.2 ===== - Revert the change introduced in ([issue #276](cloudpipe/cloudpickle#276)) attempting to pickle functions annotations for Python 3.4 to 3.6. It is not possible to pickle complex typing constructs for those versions (see [issue #193]( cloudpipe/cloudpickle#193)) - Fix a bug affecting bound classmethod saving on Python 2. ([issue #288](cloudpipe/cloudpickle#288)) - Add support for pickling "getset" descriptors ([issue #290](cloudpipe/cloudpickle#290)) 1.2.1 ===== - Restore (partial) support for Python 3.4 for downstream projects that have LTS versions that would benefit from cloudpickle bug fixes. 1.2.0 ===== - Leverage the C-accelerated Pickler new subclassing API (available in Python 3.8) in cloudpickle. This allows cloudpickle to pickle Python objects up to 30 times faster. ([issue #253](cloudpipe/cloudpickle#253)) - Support pickling of classmethod and staticmethod objects in python2. arguments. ([issue #262](cloudpipe/cloudpickle#262)) - Add support to pickle type annotations for Python 3.5 and 3.6 (pickling type annotations was already supported for Python 3.7, Python 3.4 might also work but is no longer officially supported by cloudpickle) ([issue #276](cloudpipe/cloudpickle#276)) - Internal refactoring to proactively detect dynamic functions and classes when pickling them. This refactoring also yields small performance improvements when pickling dynamic classes (~10%) ([issue #273](cloudpipe/cloudpickle#273)) 1.1.1 ===== - Minor release to fix a packaging issue (Markdown formatting of the long description rendered on pypi.org). The code itself is the same as 1.1.0. 1.1.0 ===== - Support the pickling of interactively-defined functions with positional-only arguments. ([issue #266](cloudpipe/cloudpickle#266)) - Track the provenance of dynamic classes and enums so as to preseve the usual `isinstance` relationship between pickled objects and their original class defintions. ([issue #246](cloudpipe/cloudpickle#246)) 1.0.0 ===== - Fix a bug making functions with keyword-only arguments forget the default values of these arguments after being pickled. ([issue #264](cloudpipe/cloudpickle#264)) 0.8.1 ===== - Fix a bug (already present before 0.5.3 and re-introduced in 0.8.0) affecting relative import instructions inside depickled functions ([issue #254](cloudpipe/cloudpickle#254)) 0.8.0 ===== - Add support for pickling interactively defined dataclasses. ([issue #245](cloudpipe/cloudpickle#245)) - Global variables referenced by functions pickled by cloudpickle are now unpickled in a new and isolated namespace scoped by the CloudPickler instance. This restores the (previously untested) behavior of cloudpickle prior to changes done in 0.5.4 for functions defined in the `__main__` module, and 0.6.0/1 for other dynamic functions. 0.7.0 ===== - Correctly serialize dynamically defined classes that have a `__slots__` attribute. ([issue #225](cloudpipe/cloudpickle#225)) 0.6.1 ===== - Fix regression in 0.6.0 which breaks the pickling of local function defined in a module, making it impossible to access builtins. ([issue #211](cloudpipe/cloudpickle#211)) 0.6.0 ===== - Ensure that unpickling a function defined in a dynamic module several times sequentially does not reset the values of global variables. ([issue #187](cloudpipe/cloudpickle#205)) - Restrict the ability to pickle annotations to python3.7+ ([issue #193]( cloudpipe/cloudpickle#193) and [issue #196]( cloudpipe/cloudpickle#196)) - Stop using the deprecated `imp` module under Python 3. ([issue #207](cloudpipe/cloudpickle#207)) - Fixed pickling issue with singleton types `NoneType`, `type(...)` and `type(NotImplemented)` ([issue #209](cloudpipe/cloudpickle#209)) 0.5.6 ===== - Ensure that unpickling a locally defined function that accesses the global variables of a module does not reset the values of the global variables if they are already initialized. ([issue #187](cloudpipe/cloudpickle#187)) 0.5.5 ===== - Fixed inconsistent version in `cloudpickle.__version__`. 0.5.4 ===== - Fixed a pickling issue for ABC in python3.7+ ([issue #180]( cloudpipe/cloudpickle#180)). - Fixed a bug when pickling functions in `__main__` that access global variables ([issue #187]( cloudpipe/cloudpickle#187)). 0.5.3 ===== - Fixed a crash in Python 2 when serializing non-hashable instancemethods of built-in types ([issue #144](cloudpipe/cloudpickle#144)). - itertools objects can also pickled ([PR #156](cloudpipe/cloudpickle#156)). - `logging.RootLogger` can be also pickled ([PR #160](cloudpipe/cloudpickle#160)). 0.5.2 ===== - Fixed a regression: `AttributeError` when loading pickles that hold a reference to a dynamically defined class from the `__main__` module. ([issue #131]( cloudpipe/cloudpickle#131)). - Make it possible to pickle classes and functions defined in faulty modules that raise an exception when trying to look-up their attributes by name. 0.5.1 ===== - Fixed `cloudpickle.__version__`. 0.5.0 ===== - Use `pickle.HIGHEST_PROTOCOL` by default. 0.4.4 ===== - `logging.RootLogger` can be also pickled ([PR #160](cloudpipe/cloudpickle#160)). 0.4.3 ===== - Fixed a regression: `AttributeError` when loading pickles that hold a reference to a dynamically defined class from the `__main__` module. ([issue #131]( cloudpipe/cloudpickle#131)). - Fixed a crash in Python 2 when serializing non-hashable instancemethods of built-in types. ([issue #144](cloudpipe/cloudpickle#144)) 0.4.2 ===== - Restored compatibility with pickles from 0.4.0. - Handle the `func.__qualname__` attribute. 0.4.1 ===== - Fixed a crash when pickling dynamic classes whose `__dict__` attribute was defined as a [`property`](https://docs.python.org/3/library/functions.html#property). Most notably, this affected dynamic [namedtuples](https://docs.python.org/2/library/collections.html#namedtuple-factory-function-for-tuples-with-named-fields) in Python 2. (cloudpipe/cloudpickle#113) - Cloudpickle now preserves the `__module__` attribute of functions (cloudpipe/cloudpickle#118). - Fixed a crash when pickling modules that don't have a `__package__` attribute (cloudpipe/cloudpickle#116). 0.4.0 ===== * Fix functions with empty cells * Allow pickling Logger objects * Fix crash when pickling dynamic class cycles * Ignore "None" mdoules added to sys.modules * Support WeakSets and ABCMeta instances * Remove non-standard `__transient__` support * Catch exception from `pickle.whichmodule()` 0.3.1 ===== * Fix version information and ship a changelog 0.3.0 ===== * Import submodules accessed by pickled functions * Support recursive functions inside closures * Fix `ResourceWarnings` and `DeprecationWarnings` * Assume modules with `__file__` attribute are not dynamic 0.2.2 ===== * Support Python 3.6 * Support Tornado Coroutines * Support builtin methods
This PR adds support for pickling annotations on 3.5+, and fixes some problems with generic annotations on 3.7+.
TODO
TypeError: type() doesn't support MRO entry resolution; use types.new_class()
if not usingtypes.new_class
for reconstructing classestyping_extensions
dependency_
pickle_depickle
'ing annotated functions/classesDetails
The
types.new_class
change (in_make_skeleton_class
) is because of aTypeError: type() doesn't support MRO entry resolution; use types.new_class()
error on 3.7+, similar to this issue. Also see python/cpython#6319.I'm not sure if there are any downsides to
TypeVar
s being__reduce__
'd now. Previously, they were only supported as globals (so always imported, I think).The functions
try_decompose_generic
andget_bases
are brittle the way they are written, because they check for attributes. There might be a better way.Tests
Passing, and added some new ones.