Unsupported pickle protocol: 5 #27

Closed
stsievert opened this issue Jul 19, 2020 · 6 comments
stsievert commented Jul 19, 2020

I am following Dask's HyperbandSearchCV example verbatim (each cell is copy/pasted into my notebook).

When I get to the line that does the computation, search.fit(X, y, classes=[0, 1, 2, 3]), I get this error:

ValueError: unsupported pickle protocol: 5

Full traceback, printed in notebook
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/core.py", line 130, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 302, in deserialize
    return loads(header, frames)
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 64, in pickle_loads
    return pickle.loads(x, buffers=buffers)
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 75, in loads
    return pickle.loads(x)
ValueError: unsupported pickle protocol: 5
distributed.utils - ERROR - unsupported pickle protocol: 5
Traceback (most recent call last):
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/utils.py", line 656, in log_errors
    yield
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/client.py", line 1221, in _handle_report
    msgs = await self.scheduler_comm.comm.read()
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/comm/tcp.py", line 206, in read
    allow_offload=self.allow_offload,
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/comm/utils.py", line 87, in from_frames
    res = _from_frames()
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/comm/utils.py", line 66, in _from_frames
    frames, deserialize=deserialize, deserializers=deserializers
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/core.py", line 130, in loads
    value = _deserialize(head, fs, deserializers=deserializers)
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 302, in deserialize
    return loads(header, frames)
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/serialize.py", line 64, in pickle_loads
    return pickle.loads(x, buffers=buffers)
  File "/home/stsievert/miniconda3/lib/python3.7/site-packages/distributed/protocol/pickle.py", line 75, in loads
    return pickle.loads(x)
ValueError: unsupported pickle protocol: 5

I don't see any error messages on the Jupyter output stream in the terminal.

This is on Python 3.7.7:

>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=7, micro=7, releaselevel='final', serial=0)

The relevant portion of history:

(base) [stsievert@submit3 ~]$ history
    2  wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    3  bash Miniconda3-latest-Linux-x86_64.sh
    5  source ~/.bashrc
    7  conda install numpy scipy pandas scikit-learn
   12  dask-chtc jupyter run lab
   13  which conda
   14  conda install numpy scipy pandas scikit-learn
   15  conda install dask dask dask-ml distributed -c conda-forge
   17  conda install pytorch torchvision cpuonly -c pytorch
   18  pip install --upgrade git+https://github.com/JoshKarpel/dask-chtc.git
@JoshKarpel
Contributor

I've been hitting this myself while testing. It's a Python 3.7 vs. 3.8 compatibility problem in pickle: 3.8 adds protocol version 5, which 3.7 does not support. If I remember correctly, daskdev/dask is on 3.8 now, so you'll need to upgrade your client-side Python to 3.8 as well.

Related Dask issue, where it looks like they concluded that there wasn't an easy fix on their side and you just need to make sure not to cross the protocol boundary: dask/dask#6007
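
To make the protocol boundary concrete, here is a minimal sketch that uses only the standard library `pickle` module. It shows that a pickle records its protocol number in its first two bytes, and that bytes written with `protocol=4` stay readable on both Python 3.7 and 3.8. The helper name `pickle_protocol_of` is made up for illustration and is not part of Dask or dask-chtc; whether Dask's own serialization can be pinned this way is not something this sketch answers.

```python
import pickle


def pickle_protocol_of(data: bytes) -> int:
    """Return the protocol number recorded in a pickle's header.

    Pickles written with protocol 2 or later start with the PROTO
    opcode (0x80) followed by one byte holding the protocol number.
    """
    if len(data) < 2 or data[0] != 0x80:
        raise ValueError("no PROTO opcode: protocol 0 or 1, or not a pickle")
    return data[1]


payload = {"classes": [0, 1, 2, 3]}

# Protocol 4 is readable by Python 3.4 and later, so a 3.7 process can load it.
portable = pickle.dumps(payload, protocol=4)
print(pickle_protocol_of(portable))

# HIGHEST_PROTOCOL is 4 on Python 3.7 and 5 on Python 3.8.
print(pickle.HIGHEST_PROTOCOL)
```

A process whose `pickle.HIGHEST_PROTOCOL` is lower than the number reported by `pickle_protocol_of` cannot load those bytes, which is the situation in the traceback above.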

On our end, since we're sitting right at this boundary, this is probably worth adding a note in the docs. Maybe a good candidate for the first entry in a "troubleshooting" page?

JoshKarpel added the bug and documentation labels Jul 19, 2020
JoshKarpel self-assigned this Jul 19, 2020
@stsievert
Contributor Author

> Maybe a good candidate for the first entry in a "troubleshooting" page?

I'll make Python 3.8 a requirement in #28. I've found it easiest to install a new version of Python in a conda env (like in #25).

@stsievert
Contributor Author

Never mind, I'll leave the Python 3.8 requirement out. It would require documenting the creation of a virtual env, and #25 already does that.

JoshKarpel added a commit that referenced this issue Jul 23, 2020
@JoshKarpel
Contributor

"Resolved" by b01d7fe, but I want to keep this in mind for #25. We should make it easy to keep your Python version in sync.

@stsievert
Contributor Author

Why not raise an error if Python <3.8 is installed? Or use Dask's ability to version check?
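
A minimal sketch of the first suggestion, assuming the workers run Python 3.8 as described earlier in the thread. The function name `check_client_python` and the `MIN_PYTHON` constant are hypothetical and not part of dask-chtc. For the second suggestion, `distributed.Client.get_versions(check=True)` is the existing Dask call for comparing versions across client, scheduler, and workers; it appears only in a comment here because it needs a running cluster.

```python
import sys

# Assumption: the worker image runs Python 3.8, per the discussion above.
MIN_PYTHON = (3, 8)


def check_client_python(version_info=None, minimum=MIN_PYTHON):
    """Raise RuntimeError if the client's Python is older than `minimum`.

    `version_info` defaults to the running interpreter's sys.version_info.
    Returns the (major, minor) pair that was checked.
    """
    current = tuple(version_info or sys.version_info)[:2]
    if current < tuple(minimum):
        raise RuntimeError(
            "Python {}.{} found; use Python {}.{} or later on the client "
            "to match the workers.".format(*current, *minimum)
        )
    return current


# With a running cluster, client.get_versions(check=True) is the
# Dask-side version comparison (not called in this sketch).
```

Calling `check_client_python()` with no arguments checks the running interpreter; emitting a warning instead of raising would be a softer variant of the same idea.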

@JoshKarpel
Contributor

We can't guarantee that users can use Python >=3.8 for the rest of their code (unlikely, but possible). Dask itself seems unwilling to add a hard incompatibility check for this, so I'd prefer to follow their lead and let the error happen.

One thing I have been a little confused about is that sometimes I get the version mismatch text from Dask when starting workers, but sometimes I don't. Haven't looked into that.
