We shouldn't require a recent version of setuptools to install xarray #4295

Closed
shoyer opened this issue Aug 1, 2020 · 33 comments · Fixed by #4296

Comments

@shoyer
Member

shoyer commented Aug 1, 2020

@canol reports on our mailing list that our setuptools 41.2 (released 21 August 2019) install requirement is making it hard to install recent versions of xarray at his company:
https://groups.google.com/g/xarray/c/HS_xcZDEEtA/m/GGmW-3eMCAAJ

Hello, this is just feedback about an issue we experienced, which caused our internal tools stack to stay on xarray 0.15 instead of a newer version.

We are a company using xarray in our internal frameworks, and at the beginning we didn't have any restrictions on the xarray version in our requirements file, so new installations of our framework were using the latest version of xarray. But a few months ago we started to hear complaints from users who were having problems installing our framework: the installation was failing because of xarray's requirement to use at least setuptools 41.2, which was released on the 21st of August last year. So it hasn't been a year since it was released, which might be considered relatively new.

During the installation of our framework, pip was failing to update setuptools by saying that some other process is already using setuptools files so it cannot update setuptools. The people who are using our framework are not software developers so they didn't know how to solve this problem and it became so overwhelming for us maintainers that we set the xarray requirement to version >=0.15 <0.16. We also share our internal framework with customers of our company so we didn't want to bother the customers with any potential problems.

You can see some other people having a similar problem when trying to update setuptools here (although not related to xarray): https://stackoverflow.com/questions/49338652/pip-install-u-setuptools-fail-windows-10

It is not a big deal but I just wanted to give this as feedback. I don't know how much xarray depends on setuptools 41.2.

I was surprised to see this in our setup.cfg file, added by @crusaderky in #3628. The version requirement is not documented in our docs.

Given that setuptools may be challenging to upgrade, would it be possible to relax this version requirement?

@shoyer
Member Author

shoyer commented Aug 1, 2020

It looks like the actual hard requirement for setup.cfg may be setuptools 30.3.0 from 8 December 2016:
https://setuptools.readthedocs.io/en/latest/setuptools.html#configuring-setup-using-setup-cfg-files

This is shortly before the release date of Python 3.6.0, so I suspect this would be a fine requirement to impose for our users.

@crusaderky
Contributor

I was surprised to see this in our setup.cfg file, added by @crusaderky in #3628. The version requirement is not documented in our docs.

It is documented:
https://xarray.pydata.org/en/stable/installing.html#minimum-dependency-versions

xarray adopts a rolling policy regarding the minimum supported version of its dependencies:
[...]
all other libraries: 6 months

The requirement is explicitly set in setup.cfg because don't ship what you don't test.

I see no problem in explicitly adding a special case to the policy for setuptools - I guess 24 months should be fine for all? I do not recommend just going back to "whatever the very first version that works" as we were doing before the introduction of the rolling policy.
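To make the effect of the window length concrete, here is a minimal sketch that picks a minimum version from a rolling window. It assumes the policy means "the newest release that is at least N months old", which is one reading of the rule quoted above, and it only lists the setuptools releases whose exact dates are given in this thread. The helper names `months_before` and `policy_minimum` are hypothetical, and rounding the cutoff to the first of the month is a simplification.

```python
from datetime import date

# Release dates quoted in this thread (only versions with an exact date).
RELEASES = {
    "30.3.0": date(2016, 12, 8),
    "37": date(2017, 11, 20),
    "41.2": date(2019, 8, 21),
}


def months_before(day, months):
    """Return the first day of the month lying `months` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def policy_minimum(releases, today, window_months):
    """Newest listed release that is at least `window_months` old, or None."""
    cutoff = months_before(today, window_months)
    old_enough = {v: d for v, d in releases.items() if d <= cutoff}
    if not old_enough:
        return None
    # Pick the version with the latest release date among those old enough.
    return max(old_enough, key=old_enough.get)


if __name__ == "__main__":
    today = date(2020, 8, 1)  # the day this issue was opened
    for window in (6, 24):
        print(window, "months ->", policy_minimum(RELEASES, today, window))
```

Under these assumptions the 6-month window lands on 41.2, the version that caused the report, while a 24-month window falls back to the oldest sufficiently aged release in the list.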

I'm preparing a PR...

@shoyer
Member Author

shoyer commented Aug 1, 2020

I think setuptools should be treated more like Python/NumPy because it's a hard installation requirement (and can be challenging to install).

The requirement is explicitly set in setup.cfg because don't ship what you don't test.

My sense is that setuptools is somewhat unique as a dependency because it's only used as part of installation

I am supportive of bumping minimum version requirements according to our policy when it serves a purpose, but I don't think we should do it "just because we can".

I see no problem in explicitly adding a special case to the policy for setuptools - I guess 24 months should be fine for all? I do not recommend just going back to "whatever the very first version that works" as we were doing before the introduction of the rolling policy.

24 months sounds about right to me. Or given that setuptools is typically bundled with Python, maybe "Whatever version of setuptools corresponds to our oldest supported Python release"?

(This is assuming that it's still possible to get that version of setuptools in CI environments. If not, we may need to reconsider...)

@shoyer
Member Author

shoyer commented Aug 1, 2020

I'm preparing a PR...

Thanks! This is greatly appreciated :)

@ChrisBarker-NOAA

As I said in the mailing list thread, if setuptools is only required for installation, it's not really a requirement at all. In fact, I'm pretty sure pip requires it, so it will always be there if pip is there.

But I see:

install_requires =
    numpy >= 1.15
    pandas >= 0.25
    setuptools >= 41.2  # For pkg_resources

in setup.cfg -- so if you are using pkg_resources at run time, then you do need it here, in some version :-(

(the :-( is for pkg_resources being built into setuptools -- it REALLY should be a separate package!)

In regards to "don't ship what you don't test" -- if you take that philosophy, which is a good one, then you should be testing with-as-old-as-they-can-be versions of the dependencies anyway, which it looks like #4296 is doing :-)

@crusaderky
Contributor

then you should be testing with-as-old-as-they-can-be versions

We used to do that and we abandoned that policy in favour of the current rolling window, because it made developers (particularly the less experienced ones) waste a considerable amount of effort retaining backwards compatibility with obsolete versions of the dependencies that nobody cared about.

@crusaderky
Contributor

setuptools-scm doesn't work with setuptools < 36.7 (Nov 2017).
The conda metadata is malformed for setuptools < 38.4 (Jan 2018) - it's missing a timestamp which prevents the minimum versions tool from working.

Is everybody happy with >= 38.4?
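The proposed floor is simply the strictest of the lower bounds raised so far. A small sketch of that reasoning, using only the versions and reasons mentioned in this thread (the names `parse_version`, `CONSTRAINTS` and `effective_minimum` are hypothetical, and the parser only handles plain dotted numbers):

```python
def parse_version(text):
    """Turn '38.4' into (38, 4) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))


# Lower bounds mentioned in this thread and the reason given for each.
CONSTRAINTS = {
    "30.3.0": "declarative configuration in setup.cfg",
    "36.7": "setuptools-scm works",
    "38.4": "conda metadata carries the timestamp the minimum versions tool needs",
}


def effective_minimum(constraints):
    """Return the highest lower bound, which is the one that actually binds."""
    return max(constraints, key=parse_version)


if __name__ == "__main__":
    floor = effective_minimum(CONSTRAINTS)
    print("setuptools >=", floor, "because:", CONSTRAINTS[floor])
```

Comparing tuples rather than strings matters here: as plain strings, "9.0" would sort above "30.3.0".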

@ChrisBarker-NOAA

it made developers (particularly the less experienced ones) waste a considerable amount of effort retaining backwards compatibility with obsolete versions of the dependencies that nobody cared about.

That all depends on what you mean by "as-old-as-they-can-be" -- to me, it means as old as they can be while still having the features you need. In this case, setuptools was being upgraded, even though xarray didn't need any of the new functionality.

What are you actually using for development? One option is to keep versions pinned to old versions until a developer wants a feature that requires a newer version -- then you update that one, if it's within your rolling window

But I can see that keeping track of what you need is tricky -- so the rolling window is a lot easier to manage

Honestly, the real problem here is that setuptools is a build, install, and run-time dependency all at once -- which is going to require special treatment.

@crusaderky
Contributor

The key problem in "as-old-as-they-can-be" is that you end up with dependencies that depend on each other and are 1 year apart in release date. Since very frequently other projects are a lot less rigorous with testing vs old dependencies (if they test at all!) that has caused an endless amount of breakages in the past. Testing with all packages as of 1 year ago is a lot less bug-prone and time-wasting.

@dopplershift
Contributor

Rolling window seems fine to me. I will say that I don't generally bother bumping that on other projects until we run into an issue/new feature that necessitates it, though.

@crusaderky
Contributor

PR ready for review

@keewis
Collaborator

keewis commented Aug 2, 2020

the reason we install-depend on setuptools is that we use pkg_resources for constructing the version with setuptools_scm and for getting the paths to images and css used for the HTML repr. Both of these can be replaced with modules in the standard library: importlib.resources (available since 3.7) and importlib.metadata (available since 3.8). Both also have backports (importlib-resources and importlib-metadata), so we should be able to get rid of the install-dependency on setuptools.
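As a rough illustration of the version half of this proposal, the lookup could be written with importlib.metadata, falling back to the importlib-metadata backport on older Pythons. This is a sketch under those assumptions, not xarray's actual code; the helper `get_version` and its `default` argument are hypothetical.

```python
try:
    # Standard library on Python >= 3.8.
    from importlib.metadata import PackageNotFoundError, version
except ImportError:
    # Older Pythons: requires the importlib-metadata backport to be installed.
    from importlib_metadata import PackageNotFoundError, version


def get_version(dist_name, default="unknown"):
    """Return the installed version of `dist_name` without importing pkg_resources."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed, e.g. when running from a bare checkout.
        return default


if __name__ == "__main__":
    print(get_version("xarray"))
```

In a package's `__init__.py` this would stand in for the `pkg_resources.get_distribution(...)` call, at the cost of the conditional backport dependency discussed in the replies.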

setup_requires has been deprecated in favor of specifying the build dependency in pyproject.toml. Maybe we should use that instead? That way we don't have to care about users failing to bootstrap setuptools because pip will create an isolated environment with just the build dependencies, build the source into a wheel and then install that without using setuptools. So I think that means you can have an old version of setuptools installed in your environment and still pip-install a package that requires a newer version.

An additional advantage is that our setup.py can become

from setuptools import setup

if __name__ == "__main__":
    setup()

(we can't remove setup.py entirely because it's required for editable installs)

@crusaderky
Contributor

importlib.resources (available since 3.7) and importlib.metadata (available since 3.8). Both also have backports (importlib-resources and importlib-metadata), so we should be able to get rid of the install-dependency on setuptools.

-1 from me, because dependencies that are only required on a specific Python version are incompatible with noarch conda recipes. This would force us to change conda to build one package for each OS x python version.

@keewis
Collaborator

keewis commented Aug 2, 2020

I was going to suggest using preprocessing selectors, but as you say these are incompatible with noarch because they're used at package build time, not when installing.

The pint recipe worked around that by unconditionally installing importlib_metadata, even on python 3.8. Not sure if that's the best option, though. Other than that, NEP29 states that we can drop python 3.6 since Jun 23 2020, so if we bump python we could use stdlib's importlib.resources.

@shoyer
Member Author

shoyer commented Aug 4, 2020

My preference would be to say that we support setuptools 30.3 and newer, even if we can't test it:

  1. setuptools is extremely stable, compared to any of our other dependencies. I have a very hard time imagining any of the limited functionality we use breaking.
  2. It is apparently tricky to upgrade, at least it can't be done automatically with pip install on some platforms.

I don't think it's worth the hassle of switching to importlib backports, at least for now. Likewise, I would lean against switching to pyproject.toml until it is well established. There's just not much to be gained by switching to novel packaging technology...

@dopplershift
Contributor

I'm not here to argue, but pyproject.toml was introduced in PEP-518, which was accepted over 4 years ago. I know packaging moves slowly but I'm curious how long something has to be around before becoming "established" and ceases to be "novel". 😉

@shoyer
Member Author

shoyer commented Aug 4, 2020

I'm not here to argue, but pyproject.toml was introduced in PEP-518, which was accepted over 4 years ago. I know packaging moves slowly but I'm curious how long something has to be around before becoming "established" and ceases to be "novel". 😉

It looks like pip has supported pyproject.toml since version 10.0.0, on 2018-04-14. That's more recent than Python 3.6 (which, to be fair, we are about to drop support for).

Consistent with my earlier suggestion about setuptools, I think we should support the oldest packaging tools that were released at the time of our earliest supported Python release. So if we switch to requiring Python 3.7 in our next major release, we could switch to using pyproject.toml, too.

@dopplershift
Contributor

Wow...two years for pip to gain support...I would not have expected that.

@ChrisBarker-NOAA

and when pyproject.toml was supported isn't even the point -- for whatever reason, it's still not very common practice.

But when it comes to dependencies, setuptools is an odd one -- as it is used both to build the package and potentially at run time. I can rant about what a bad design that is, but it's a fact.

However, in the case of xarray, it seems to be used at runtime very sparingly:

(and then, only pkg_resources -- which really should be distributed by itself)

in __init__.py to get the version number:
__version__ = pkg_resources.get_distribution("xarray").version

I would argue that that is NOT a good way to manage your versioning, but certainly not the only way.

and in formatting_html.py

pkg_resources.resource_string("xarray", fname).decode("utf8")

I'm not sure I know what that actually does.

If it were me, I'd replace those and not have a setuptools run time dependency at all.

But in any case, unless pkg_resources has changed in recent years, you could probably have the run-time setuptools dependency be really old without problem.

That is separate from the install time dependency, which is really outside xarray anyway.

To summarize:

setuptools is used:

at build time: for that, xarray can use as recent a version as you want.

at install time: this does overlap if folks are doing a source install -- but xarray has wheels up on PyPI (and conda is pre-built). In that case, it's up to the user / pip to have setuptools if needed. When installing a wheel, I'm not sure setuptools is used at all, but in any case, pip does require it.

at run time: here is where the dependency matters: if I were you I'd get rid of the run time dependency, but if not, an old version should be fine here.

Looking at the setup.cfg file, I see:

install_requires =
    numpy >= 1.15
    pandas >= 0.25
    setuptools >= 41.2  # For pkg_resources
setup_requires =
    setuptools >= 41.2
    setuptools_scm

note that you don't have to have the same version in both of these stanzas -- I think if you push the install_requires version back, you'll solve this problem without having to worry about building with old versions.
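A quick way to see that the two stanzas are independent is to parse them separately. The sketch below reads a setup.cfg fragment with the standard-library configparser and reports the setuptools pin in each field. The `[options]` header is where setuptools expects these fields; the relaxed `>= 38.4` run-time pin is only an illustrative value taken from this discussion, and the helper names are hypothetical.

```python
import configparser
import re

# Illustrative fragment: the run-time pin is relaxed, the build-time pin is not.
SETUP_CFG = """\
[options]
install_requires =
    numpy >= 1.15
    pandas >= 0.25
    setuptools >= 38.4  # For pkg_resources
setup_requires =
    setuptools >= 41.2
    setuptools_scm
"""


def requirements(raw_value):
    """Split a multi-line setup.cfg value into requirement strings, dropping comments."""
    lines = (line.split("#", 1)[0].strip() for line in raw_value.splitlines())
    return [line for line in lines if line]


def setuptools_pins(cfg_text):
    """Map each requirements field to its setuptools requirement, if it has one."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    pins = {}
    for field in ("install_requires", "setup_requires"):
        for req in requirements(parser["options"][field]):
            # Match the whole project name so setuptools_scm is not mistaken for setuptools.
            name = re.match(r"[A-Za-z0-9_.-]+", req).group(0)
            if name == "setuptools":
                pins[field] = req
    return pins


if __name__ == "__main__":
    for field, pin in setuptools_pins(SETUP_CFG).items():
        print(field, "->", pin)
```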

@shoyer
Member Author

shoyer commented Aug 4, 2020

and in formatting_html.py

pkg_resources.resource_string("xarray", fname).decode("utf8")

This is used for pulling out static files (CSS/HTML) for xarray's HTML repr.

We could inline these resources as Python strings, but I think using separate files is cleaner and to my knowledge there is no better alternative than pkg_resources prior to Python 3.7.

On Python 3.7 we could use importlib.resources: https://docs.python.org/3/library/importlib.html#module-importlib.resources
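For comparison with the pkg_resources.resource_string call, a minimal sketch of reading a packaged static file through importlib.resources might look like this. It assumes the file ships inside an importable package; the helper name `read_static` is hypothetical. It prefers the `files()` API, which only exists on Python 3.9 and later, and otherwise falls back to `read_text`, which is what Python 3.7 offers.

```python
import importlib.resources


def read_static(package, fname):
    """Return the text of `fname` shipped inside `package`, decoded as UTF-8."""
    if hasattr(importlib.resources, "files"):
        # Python >= 3.9: traversable API.
        resource = importlib.resources.files(package).joinpath(fname)
        return resource.read_text(encoding="utf8")
    # Python 3.7 and 3.8: functional API.
    return importlib.resources.read_text(package, fname, encoding="utf8")
```

Usage would mirror the existing call, with the package that holds the static files as the first argument and a file name such as a CSS file as the second. Unlike a path built from `__file__`, this also works when the package is imported from a zip archive.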

@ChrisBarker-NOAA

well, I've had fine luck with simply using __file__ and a relative path to get files. Though I suppose that would fail with a zipped package -- is xarray zipsafe otherwise?

Also, I appreciate the goal here, but in this case, it's only two files, though the CSS is pretty big.

but in any case, you have something that works, and you'd be hard pressed to find a python install that doesn't have setuptools -- so I'd say bump the install_requires version back to an old one, and you're done.

NOTE: I'm still feeling the scars from setuptools nightmares bundling stand-alone applications from 15 years ago -- so I still don't like to use it at run-time -- but yeah, probably not relevant anymore :-)

and maybe revisit when you're no longer supporting 3.6

@max-sixty
Collaborator

FWIW according to https://numpy.org/neps/nep-0029-deprecation_policy.html, we can drop 3.6 support as-of June 2020.

@crusaderky
Contributor

My preference would be to say that we support setuptools 30.3 and newer, even if we can't test it

I have tested that setuptools < 36.7 breaks setuptools-scm; the installed version becomes 0.0.0, which in turn breaks any other package that contains a minimum version check (namely, pandas).

Also, I think we agreed when we implemented NEP29 that we should not support Python 3.6.0, but only the latest patch version for any given minor version of a package. Python 3.6.11 (released 1 month ago) is shipped with setuptools 40.6.

Any pip or conda-based environment can trivially upgrade from Python 3.6.0 to 3.6.11.
The only users that have problems with getting setuptools >=38.4 (2.5 years old!!!) are those that use /usr/bin/python3 from a very old Linux distribution, which for some reason never got the patch updates of Python, AND expect everything to be compatible with the very latest python packages freshly downloaded from the internet. I mean, seriously?

@crusaderky
Contributor

Ubuntu 18.04 ships Python 3.6.5 and setuptools 39.0.
Ubuntu 16.04 ships Python 3.5 so it's not to be taken into consideration anyway.

@ChrisBarker-NOAA

I have tested that setuptools < 36.7 breaks setuptools-scm; the installed version becomes 0.0.0 which in turns breaks any other package that contains a minimum version check (namely, pandas).

Does it break anything if you use an older version only at run-time? I wouldn't think so.

It's not clear from the OP how they were installing -- i.e. from wheels or source, but if wheels, then pushing the run time dependency back would fix it.

And setuptools 37 is from 20 Nov 2017, so probably safe :-)

@crusaderky
Contributor

It's not clear from the OP how they were installing -- i.e. from wheels or source, but if wheels, then pushing the run time dependency back would fix it.

I don't think we should be discussing a solution that works on wheels but breaks on sources...

@ChrisBarker-NOAA

I'm making the distinction between the setuptools version used for building, and at run time.

I think you CAN expect people to have newish setuptools if they are building from source.

But that doesn't mean you have to require anyone running xarray to have a newish setuptools.

This is why there really should be separate packages for run time vs build time functionality, but that's out of our control.

@ChrisBarker-NOAA

This is kinda out of date:

https://stackoverflow.com/questions/58521386/should-setuptools-be-included-in-setup-requires-in-python

so maybe not true anymore, but it makes the point that setup_requires should not include setuptools itself -- as it won't be read until after setuptools has been imported anyway.

This may be a bit different with everything in setup.cfg, but I'm pretty sure that's still how it's usually run by pip -- that is setup.py is run, which imports setuptools, and then when setup() from setuptools is run, it reads the setup.cfg. So it's too late to check the setuptools version.

@dopplershift
Contributor

cough Solving the "setuptools won't work in setup_requires because it's too late" was basically the entire driving force of pyproject.toml.

@ChrisBarker-NOAA

@dopplershift: indeed -- it's really the right way to go, but the community has been very, very slow to get there :(

BTW, I'm testing now, and as far as I can tell, pip has no problem up/downgrading setuptools. Even when installing xarray -- so the OP may really be facing a pip / setuptools / distro bug that xarray should not try to accommodate :-)

However, some more testing shows that pip doesn't appear to try to upgrade setuptools to the version in setup_requires anyway. For example, I created a clean environment, installed setuptools version: 47.3.2, then edited the setup.cfg to:

install_requires =
    numpy >= 1.15
    pandas >= 0.25
    setuptools >= 41.2  # For pkg_resources
setup_requires =
    setuptools >= 49
    setuptools_scm

note that the installed setuptools meets the spec for install_requires, but not for setup_requires.

When I run the install, it works fine, and does not upgrade setuptools, or complain about it.

If I update the install_requires setuptools version, then it does upgrade it (successfully) on install.

So I suggest that we remove the setuptools requirement from setup_requires (or keep it where it is), and bump down the install_requires version to 30.3, or 37, if you really want.

@crusaderky
Contributor

Discussion seems to have died down here. Can we get to a consensus and wrap this up?
My vote is to simply require setuptools >= 38.4 at runtime (for which PR #4296 is ready to go).

@canol

canol commented Aug 11, 2020

Thanks for all the work and discussion. I think requiring >= 38.4 would certainly improve our situation of adopting new versions of xarray.

Related to general dependency decisions, it would be nice to keep supporting old versions unless there is a new feature that xarray can take advantage of and the version that includes that new feature is within the policy window.

I see that some library dependencies, like pandas and scipy, have a policy window of 12 months. One-year-old software is really young in enterprise, so I might prefer a window of 2 years, which might help adoption by companies. But I am not very familiar with the pace of new features in the scipy scene, so maybe there are really nice features that xarray is taking advantage of; in that case just ignore my comment. I am writing from the narrow perspective of an enterprise company.

@crusaderky
Contributor

pandas is really unstable and its API breaks every other version. Extending its support window from 1 to 2 years would be extremely expensive and frustrating to maintain.


7 participants