Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot pickle attrs classes #320

Open
Lingepumpe opened this issue Jan 1, 2020 · 7 comments
Open

Cannot pickle attrs classes #320

Lingepumpe opened this issue Jan 1, 2020 · 7 comments
Labels

Comments

@Lingepumpe
Copy link

from attr import attrs, attrib
import cloudpickle

@attrs
class Foo:
    x = attrib()

cloudpickle.dumps(Foo)

Fails with TypeError: cannot pickle 'property' object for me.
python 3.8.1, attrs = "19.3.0", cloudpickle = "1.2.2"

As mentioned here: python-attrs/attrs#458 there is also some issue with
TypeError: cannot pickle '_thread._local' object that can be worked around by using @attrs(repr=False). The repr fix does not help with the above.

See below for a snipet to reproduce the TypeError: cannot pickle '_thread._local' object:

from attr import attrs
import cloudpickle

@attrs
class Foo:
    pass

cloudpickle.dumps(Foo)
@ogrisel ogrisel added the bug label Jan 20, 2020
@ogrisel
Copy link
Contributor

ogrisel commented Jan 20, 2020

I am not sure if it's a bug, a regression or a missing feature but someone should definitely investigate :) Thanks for the report.

@Lingepumpe
Copy link
Author

At least the first part with TypeError: cannot pickle 'property' object might be a duplicate of issue #321 [or vice versa] - the python version and error message are the same.

The second part of the ticket (with TypeError: cannot pickle '_thread._local' object) is attrs specific and less critical due to a work-around via repr=False being known.

@ogrisel
Copy link
Contributor

ogrisel commented Jan 20, 2020

I thought #321 was only about Python 3.6 while this issue is reported for 3.8.1. Will need to investigate but cannot do it in the short term.

@Lingepumpe
Copy link
Author

With the merge of @pierreglaser MR ( 8f851f7 ) now both of my above test-cases produce the same output:
TypeError: cannot pickle '_thread._local' object. It seems the python-3.8 specific bug has been fixed and the attr specific bug is the only thing remaining. As mentioned above requesting no repr function to be generated by attr is a workaround, when supplying @attrs(repr=False) as the class decorator cloudpickle manages to dump the class.

benbovy added a commit to xarray-contrib/xarray-simlab that referenced this issue Apr 2, 2020
benbovy added a commit to xarray-contrib/xarray-simlab that referenced this issue Apr 6, 2020
* add and document API

* implement run batch in parallel (+ tests)

* simplify model context manager

* fix pickle process classes

See
cloudpipe/cloudpickle#320
python-attrs/attrs#458

* fix zarr in-memory store and dask processes

* disable dask parallel schedulers on CI

Is it really supported?

* get a dask lock for create / resize zarr datasets

* clean-up

* add test for DummyLock

* run model processes in parallel + more docstrings

* update release notes

* docstrings tweaks

* doc: add run parallel section
@danpf
Copy link

danpf commented Jun 12, 2020

Hello,

I was looking at this and found that for some reason I cannot recreate the thread.local error in pytest. I was wondering if anyone had any thoughts on why that is? or possibly the solution!

for reference the stdout/err is:

# pytest tests/cloudpickle_test.py -k test_attr
tests/cloudpickle_test.py b'\x80\x05\x95"\x00\x00\x00\x00\x00\x00\x00\x8c\x16tests.cloudpickle_test\x94\x8c\x03Foo\x94\x93\x94.'
Traceback (most recent call last):
  File "hello.py", line 6, in <module>
    print(cloudpickle.dumps(Foo))
  File "/home/danpf/git/cloudpickle/cloudpickle/cloudpickle_fast.py", line 64, in dumps
    cp.dump(obj)
  File "/home/danpf/git/cloudpickle/cloudpickle/cloudpickle_fast.py", line 549, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle '_thread._local' object

======================================================================================== FAILURES ========================================================================================
_______________________________________________________________________________ CloudPickleTest.test_attr ________________________________________________________________________________

self = <tests.cloudpickle_test.CloudPickleTest testMethod=test_attr>

    def test_attr(self):
        print(cloudpickle.dumps(Foo))
        test_str = (
                "import cloudpickle\n"
                "import attr\n"
                "@attr.s\n"
                "class Foo:\n"
                "    x: int = attr.ib()\n"
                "print(cloudpickle.dumps(Foo))\n")
        with open("hello.py", "w") as fh:
            fh.write(test_str)
        ret = subprocess.call(["python", "hello.py"])
>       assert ret == 0
E       AssertionError: assert 1 == 0

patch below

diff --git a/dev-requirements.txt b/dev-requirements.txt
index 4e26b21..7dcd389 100644
--- a/dev-requirements.txt
+++ b/dev-requirements.txt
@@ -13,3 +13,4 @@ codecov
 coverage
 # Utility package used when running the cloudpickle test suite
 ./tests/cloudpickle_testpkg
+attrs
diff --git a/tests/cloudpickle_test.py b/tests/cloudpickle_test.py
index b1821db..4e9afc3 100644
--- a/tests/cloudpickle_test.py
+++ b/tests/cloudpickle_test.py
@@ -26,6 +26,7 @@ import typing
 from functools import wraps
 
 import pytest
+import attr
 
 try:
     # try importing numpy and scipy. These are not hard dependencies and
@@ -58,6 +59,11 @@ from _cloudpickle_testpkg import relative_imports_factory
 _TEST_GLOBAL_VARIABLE = "default_value"
 
 
[email protected]
+class Foo:
+    x: int = attr.ib()
+
+
 class RaiserOnPickle(object):
 
     def __init__(self, exc):
@@ -191,6 +197,20 @@ class CloudPickleTest(unittest.TestCase):
         self.assertTrue("exit" in foo.__code__.co_names)
         cloudpickle.dumps(foo)
 
+    def test_attr(self):
+        print(cloudpickle.dumps(Foo))
+        test_str = (
+                "import cloudpickle\n"
+                "import attr\n"
+                "@attr.s\n"
+                "class Foo:\n"
+                "    x: int = attr.ib()\n"
+                "print(cloudpickle.dumps(Foo))\n")
+        with open("hello.py", "w") as fh:
+            fh.write(test_str)
+        ret = subprocess.call(["python", "hello.py"])
+        assert ret == 0
+
     def test_buffer(self):
         try:
             buffer_obj = buffer("Hello")

thetorpedodog added a commit to thetorpedodog/attrs that referenced this issue Nov 1, 2021
Generated `__repr__` methods for Pythons with f-strings included
a `threading.local` object in that method's `globals()` dictionary.
Because `cloudpickle` attempts to serialize all globals of this method,
it ends up trying to pickle the `threading.local`, which cannot be
pickled.

Instead, we now pull use of the thread-local out into its own function,
which `cloudpickle` will happily serialize. As an added benefit, this
eliminates some duplicated code between f-string and non–f-string
`__repr__`s.

Should fix:

- python-attrs#458
- cloudpipe/cloudpickle#320
thetorpedodog added a commit to thetorpedodog/attrs that referenced this issue Nov 1, 2021
Generated `__repr__` methods for Pythons with f-strings included
a `threading.local` object in that method's `globals()` dictionary.
Because `cloudpickle` attempts to serialize all globals of this method,
it ends up trying to pickle the `threading.local`, which cannot be
pickled.

Instead, we now pull use of the thread-local out into its own function,
which `cloudpickle` will happily serialize. As an added benefit, this
eliminates some duplicated code between f-string and non–f-string
`__repr__`s.

Should fix:

- python-attrs#458
- cloudpipe/cloudpickle#320
thetorpedodog added a commit to thetorpedodog/attrs that referenced this issue Nov 2, 2021
Generated `__repr__` methods for Pythons with f-strings included
a `threading.local` object in that method's `globals()` dictionary.
Because `cloudpickle` attempts to serialize all globals of this method,
it ends up trying to pickle the `threading.local`, which cannot be
pickled.

Instead, we now pull use of the thread-local out into its own function,
which `cloudpickle` will happily serialize. As an added benefit, this
eliminates some duplicated code between f-string and non–f-string
`__repr__`s.

Should fix:

- python-attrs#458
- cloudpipe/cloudpickle#320
thetorpedodog added a commit to thetorpedodog/attrs that referenced this issue Nov 3, 2021
Because cloudpickle tries to pickle a function's globals, when it
pickled an attrs instance, it would try to pickle the `__repr__` method
and its globals, which included a `threading.local`. This broke
cloudpickle for all attrs classes unless they explicitly specified
`repr=False`. Modules, however, are pickled by reference, not by value,
so moving the repr into a different module means we can put `_compat`
into the function's globals and not worry about direct references.
Includes a test to ensure that attrs and cloudpickle remain compatible.

Also adds an explanation of the reason we even *have* that global
thread-local variable.  It wasn't completely obvious to a reader why
the thread-local was needed to track reference cycles in `__repr__`
calls, and the test did not previously contain a cycle that touched
a non-attrs value. This change adds a comment explaining the need
and also adds a non-attrs value in the reference cycle of
`test_infinite_recursion`.

Fixes:
- python-attrs#458
- cloudpipe/cloudpickle#320
thetorpedodog added a commit to thetorpedodog/attrs that referenced this issue Nov 3, 2021
Because cloudpickle tries to pickle a function's globals, when it
pickled an attrs instance, it would try to pickle the `__repr__` method
and its globals, which included a `threading.local`. This broke
cloudpickle for all attrs classes unless they explicitly specified
`repr=False`. Modules, however, are pickled by reference, not by value,
so moving the repr into a different module means we can put `_compat`
into the function's globals and not worry about direct references.
Includes a test to ensure that attrs and cloudpickle remain compatible.

Also adds an explanation of the reason we even *have* that global
thread-local variable.  It wasn't completely obvious to a reader why
the thread-local was needed to track reference cycles in `__repr__`
calls, and the test did not previously contain a cycle that touched
a non-attrs value. This change adds a comment explaining the need
and tests a cycle that contains non-attrs values.

Fixes:
- python-attrs#458
- cloudpipe/cloudpickle#320
hynek pushed a commit to python-attrs/attrs that referenced this issue Nov 4, 2021
Because cloudpickle tries to pickle a function's globals, when it
pickled an attrs instance, it would try to pickle the `__repr__` method
and its globals, which included a `threading.local`. This broke
cloudpickle for all attrs classes unless they explicitly specified
`repr=False`. Modules, however, are pickled by reference, not by value,
so moving the repr into a different module means we can put `_compat`
into the function's globals and not worry about direct references.
Includes a test to ensure that attrs and cloudpickle remain compatible.

Also adds an explanation of the reason we even *have* that global
thread-local variable.  It wasn't completely obvious to a reader why
the thread-local was needed to track reference cycles in `__repr__`
calls, and the test did not previously contain a cycle that touched
a non-attrs value. This change adds a comment explaining the need
and tests a cycle that contains non-attrs values.

Fixes:
- #458
- cloudpipe/cloudpickle#320
@ritsuki1227
Copy link

Hi, any update on the issue?
I have a similar problem on the code bellow.

from tenacity import retry
import cloudpickle

@retry
def foo(bar):
    print(bar)
    
cloudpickle.dumps(foo)

@thetorpedodog
Copy link

With attrs release 21.4.0, this incompatibility has been fixed and attrs instances can be cloudpickled.

manu-18l added a commit to manu-18l/xarray-contrib that referenced this issue Jan 7, 2022
* add and document API

* implement run batch in parallel (+ tests)

* simplify model context manager

* fix pickle process classes

See
cloudpipe/cloudpickle#320
python-attrs/attrs#458

* fix zarr in-memory store and dask processes

* disable dask parallel schedulers on CI

Is it really supported?

* get a dask lock for create / resize zarr datasets

* clean-up

* add test for DummyLock

* run model processes in parallel + more docstrings

* update release notes

* docstrings tweaks

* doc: add run parallel section
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants