Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin Matplotlib < 3.1.1 to avoid mutiprocessing MacOS error #3912

Closed
wants to merge 11 commits into from

Conversation

AlexAndorra
Copy link
Contributor

This PR adresses #3849 by pinning Matplotlib to versions < 3.1.1.
This avoids an issue that appeared in MPL 3.1.1, where multiprocessing crashes on MacOS when plotting code is executed before code using multiprocessing -- see this issue on MPL repo.

The pin should be removed only when the issue is resolved on MPL side.
Should we keep #3849 open though, as a reminder?

In passing, I reordered requirements.txt alphabetically, to mirror requirements-dev.txt

  • No breaking changes
  • Added to RELEASE-NOTES.md

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good.
You could add a manual downgrade of the installed matplotlib in create_testenv.sh line 45 ?

@michaelosthege
Copy link
Member

Should we keep #3849 open though, as a reminder?

I'd rather and create a new issue for taking it back out when matplotlib/matplotlib#15410 is fixed.
(Plus a reminder about changes to the test environment..)

@AlexAndorra
Copy link
Contributor Author

@michaelosthege I could use your help here: do you see what I'm doing wrong? 🤔
I specify pip install --no-cache-dir --force-reinstall matplotlib==3.1.0 in create_testenv.sh, and yet Travis installs... matplotlib-3.2.0 😅 (see lines 515 and 564 respectively: https://travis-ci.org/github/pymc-devs/pymc3/jobs/683468312#L564)

@michaelosthege
Copy link
Member

@michaelosthege I could use your help here: do you see what I'm doing wrong? 🤔
I specify pip install --no-cache-dir --force-reinstall matplotlib==3.1.0 in create_testenv.sh, and yet Travis installs... matplotlib-3.2.0 😅 (see lines 515 and 564 respectively: https://travis-ci.org/github/pymc-devs/pymc3/jobs/683468312#L564)

Wow, that looks super weird!
Other than putting quotes "matplotlib==3.1.0" I have no idea.
it does uninstall, but then installs a different version than it downloaded previously?!

@AlexAndorra
Copy link
Contributor Author

Exactly 😅 I'll try with the quotes but I'm skeptical, as I already had quotes on earlier tries with relative installs ("matplotlib<=3.1.0"). I'll also uninstall MPL before installing. We'll see how it goes 🤷‍♂️

@AlexAndorra
Copy link
Contributor Author

AlexAndorra commented May 5, 2020

This is definitely a mystery: now it uninstalls MPL 3.2.1, downloads 3.1.0... but installs 3.1.3 😅 (see lines 522 and 535 in Travis).

@codecov
Copy link

codecov bot commented May 6, 2020

Codecov Report

Merging #3912 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #3912   +/-   ##
=======================================
  Coverage   83.45%   83.45%           
=======================================
  Files         103      103           
  Lines       14178    14178           
=======================================
  Hits        11832    11832           
  Misses       2346     2346           

@AlexAndorra
Copy link
Contributor Author

@michaelosthege, after multiple attempts, tests finally passed 🍾
However, I had to move Matplotlib from requirements.txt to requirements-dev.txt. Is that a problem in your opinion?
I'm also thinking that @twiecki @ColCarroll and @canyon289 could add valuable input here 😉

requirements-dev.txt Outdated Show resolved Hide resolved
@AlexAndorra
Copy link
Contributor Author

It worked 🎉 And I think I understand why: it probably comes from the --ignore-installed flag, used lines 46 and 47 of create_testenv.sh. As explained in this SO thread, this flag doesn't uninstall previous versions of packages -- it just ignores them. As a result, you can end up with multiple versions of MPL in the venv, which is why I had to uninstall MPL thre times -- until the most recent version could be 3.1.0

In a nutshell, if --ignore-installed is here to ensure packages are reinstalled each time, I think we should use --force-reinstall instead -- much safer, as it will uninstall previous versions before installing new ones. If you agree, I'll make the change (maybe in another PR?).

@twiecki
Copy link
Member

twiecki commented May 6, 2020 via email

@tacaswell
Copy link

tacaswell commented May 7, 2020

Can you please check https://bugs.python.org/issue33725 In particular https://bugs.python.org/issue33725#msg329923 which suggests that adding

mp.set_start_method('forkserver')

will fix this problem on all versions of python3 (the reason upgrading to 3.8.2 is that this is now the default). This seems like a better fix than forcing an old Matplotlib version.

The pin should be removed only when the issue is resolved on MPL side.

This is not a MPL bug, this is a python + multi-process + OSX bug.

Copy link
Member

@twiecki twiecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see if the forkserver thing fixes the problem instead.

@AlexAndorra
Copy link
Contributor Author

Thanks a lot for the proposed solution @tacaswell !
I still have what might seem like a trivial question: what is the mp referring to in mp.set_start_method('forkserver')?

Depending on the answer, it seems to me that the fix could be incorporated into PyMC source code (where is to be determined), or added by each user individually if they encounter the bug. Am I right?

@lucianopaz
Copy link
Contributor

lucianopaz commented May 7, 2020

We have to be aware that fork server isn't available on windows, so mp.set_start_method('forkserver') should be used depending on the os.

what is the mp referring to in...?

mp stands for multiprocessing. The different process start methods can be found here

@AlexAndorra
Copy link
Contributor Author

Thanks for the link @lucianopaz, very useful!

  • IIUC, we should use multiprocessing.get_context("forkserver") instead of multiprocessing.set_start_method('forkserver'), because it looks safer:

A library which wants to use a particular start method should probably use get_context() to avoid interfering with the choice of the library user.

  • Woud __init__.py at the root be a good place to do that? There already is a __set_compiler_flags function there, that changes the theano config depending on the platform (Darwin or Windows), so this looked liked a good place to me.

  • Since this is fixed in python >= 3.8.2, should we also check the user's python version, and modify the start method only for python < 3.8.2? Note that I don't know if that's possible and how to do that.

Disclaimer: I'm punching well above my league here, so please tell me if I talk nonsense!

@tacaswell
Copy link

@AlexAndorra Sorry for not being clearer.

@lucianopaz
Copy link
Contributor

IIUC, we should use multiprocessing.get_context("forkserver") instead of multiprocessing.set_start_method('forkserver'), because it looks safer

That's great! I didn't know about the get_context

Woud init.py at the root be a good place to do that? There already is a __set_compiler_flags function there, that changes the theano config depending on the platform (Darwin or Windows), so this looked liked a good place to me.

That sounds good to me.

Since this is fixed in python >= 3.8.2, should we also check the user's python version, and modify the start method only for python < 3.8.2? Note that I don't know if that's possible and how to do that.

I don't think it's necessary to also test the python version here.

@tacaswell
Copy link

The default going forward is 'spawn', however I am not sure on the trade offs between 'forksever' and 'spawn'.

@AlexAndorra
Copy link
Contributor Author

Ha ha, me neither @tacaswell 😅 What I think I understood is that this problem doesn't appear on python >= 3.8.2 anymore because the default start method was switched to spawn.
Manually setting it to forksever fixes it on version < 3.8.2 (correct me if I'm wrong).

@lucianopaz, thanks for the feedback 👌 I'll start marking the changes ASAP, probably on another PR as this is very different than this branch now.

@twiecki
Copy link
Member

twiecki commented May 11, 2020

Closing this then to wait for other PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants