Specify build requirements in pyproject.toml (PEP 517) #25227

Closed
wants to merge 12 commits into from

xhochy
Contributor

@xhochy xhochy commented Feb 8, 2019

Verified this using the following Dockerfile:

FROM alpine:3.8
RUN apk update
RUN apk add g++ libstdc++ python3-dev bash git
RUN pip3 install --upgrade pip

COPY dist/pandas-0.25.0.dev0+86.gba77fb900.tar.gz /
RUN ls -l /
RUN pip3 install -vvv /pandas-0.25.0.dev0+86.gba77fb900.tar.gz

@pep8speaks

pep8speaks commented Feb 8, 2019

Hello @xhochy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-05-12 21:02:14 UTC

@xhochy
Contributor Author

xhochy commented Feb 8, 2019

@jorisvandenbossche I hope this works around the issues that were reported before. If you know of others that came with pyproject.toml please link me to them. I'll have a look then.

Member

@jorisvandenbossche jorisvandenbossche left a comment

Thanks! Added some comments.

pyproject.toml Outdated
@@ -0,0 +1,3 @@
# When changing the version numbers here, also adjust them in `setup.py`
[build-system]
requires = ["setuptools", "wheel", "cython >= 0.28.2", "numpy >= 1.12.0"]
Member

I think we need a fixed old numpy version, depending on the python version.
At least it's what we did last time (77bfe21, and it is also what scipy did before they removed it again).
This is because the build happens in an isolated build environment (which, with the above syntax, will always use the latest numpy version), so if the numpy you actually have installed is older, it causes problems.
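
To illustrate the concern, here is a minimal sketch of the compatibility rule at play: an extension compiled against one numpy version generally works with the same or a newer numpy at runtime, but not with an older one. The helper names are hypothetical, the version numbers are only examples, and the check is a simplification rather than how pip or numpy verify compatibility.

```python
def parse_version(version):
    """Turn a plain 'X.Y.Z' string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def runtime_is_compatible(build_numpy, installed_numpy):
    """Return True when the installed numpy is at least as new as the build-time one."""
    return parse_version(installed_numpy) >= parse_version(build_numpy)


# Isolated build picked a recent numpy, but the user has an older one installed.
print(runtime_is_compatible("1.16.1", "1.12.1"))
# Build pinned to the oldest supported numpy: newer runtime versions are fine.
print(runtime_is_compatible("1.12.1", "1.16.1"))
```

Under this simplified rule, pinning the build requirement to the oldest supported numpy is what keeps every supported runtime version on the safe side of the comparison.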

Contributor Author

Ah, I'll add that to the toml.

Contributor Author

Actually this fails in my tests as numpy==1.12.1 doesn't compile on Alpine.

Member

Actually this fails in my tests as numpy==1.12.1 doesn't compile on Alpine.

Is that a known issue with numpy?
But I am not really sure how to deal with that. I think we need the pins here for the general case. It might be that, if our setup does not work for a specific use case or platform, you will need to use --no-build-isolation.

Contributor Author

Yes, this is a known issue with NumPy and was fixed in the 1.13 series. I have now pinned the Python 3.6 build also to 1.13.1 and added an explanatory comment. As Alpine+Python3.6 was the initial origin for tackling the pyproject.toml again, I made this change.

setup.py Outdated
# workaround bug in pip 19.0
here = os.path.dirname(__file__)
if here not in sys.path:
    sys.path.insert(0, here)
Member

Why is this needed?
For versioneer?

Contributor Author

Yes, added a comment

Member

@jorisvandenbossche jorisvandenbossche Feb 8, 2019

BTW, this workaround might not be needed any more, as it should be fixed in setuptools/pip (pypa/pip#6212), which will be released in the coming hours it seems

Contributor Author

I would keep this in here for a bit as the pip release for that was not done. (From another bug report it seems like it will be in the release coming out this weekend but I won't count on that).

Member

New pip/setuptools releases that fix this have come out in the meantime.

Contributor Author

Removed this again

@codecov

codecov bot commented Feb 8, 2019

Codecov Report

Merging #25227 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #25227   +/-   ##
=======================================
  Coverage   92.15%   92.15%           
=======================================
  Files         166      166           
  Lines       52294    52294           
=======================================
  Hits        48194    48194           
  Misses       4100     4100
Flag Coverage Δ
#multiple 90.62% <ø> (ø) ⬆️
#single 42.29% <ø> (+0.02%) ⬆️

Last update 1d1b14c...6bc1d5e.

@codecov

codecov bot commented Feb 8, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@b48d1ff).
The diff coverage is 100%.

@@            Coverage Diff            @@
##             master   #25227   +/-   ##
=========================================
  Coverage          ?   91.73%           
=========================================
  Files             ?      173           
  Lines             ?    52839           
  Branches          ?        0           
=========================================
  Hits              ?    48472           
  Misses            ?     4367           
  Partials          ?        0
Flag Coverage Δ
#multiple 90.29% <100%> (?)
#single 41.72% <100%> (?)
Impacted Files Coverage Δ
pandas/compat/numpy/__init__.py 93.1% <100%> (ø)

Last update b48d1ff...cec7053.

@gfyoung gfyoung added the Build, Regression, and Dependencies labels Feb 9, 2019
@jorisvandenbossche
Member

Yes, this is a known issue with NumPy and was fixed in the 1.13 series. I have now pinned the Python 3.6 build also to 1.13.1 and added an explanatory comment. As Alpine+Python3.6 was the initial origin for tackling the pyproject.toml again, I made this change.

This might ensure it works on Alpine + Python 3.6, but it can then give errors on other platforms if you, e.g., run pip install pandas numpy==1.12 --no-binary :all: (or the same without --no-binary on platforms that don't have wheels, but where numpy 1.12 works).

I am not sure what the best way to go is here (although I am inclined to keep it at the lowest we support (1.12), so it works when numpy 1.12 works on the specific platform).
But, it is a problem that many other packages also will encounter when adding (back) pyproject.toml file, like scipy, scikit-learn, ... So it might be interesting to know what they would do / to be consistent (as often those packages will be installed together)

cc @rgommers (since you were involved in the pyproject.toml file in scipy)

@rgommers
Contributor

Pip support should finally be good enough now with pip 19, we're adding it back in SciPy soon: scipy/scipy#8734

(although I am inclined to keep it at the lowest we support (1.12), so it works when numpy 1.12 works on the specific platform).

Yes, this is the only correct way to do it, you have to build against the lowest supported numpy version for each Python version. If that lowest version varies (e.g. Python 3.7 + numpy 1.12.x won't work), then specify multiple versions like so: PyWavelets/pywt@5e6d53a#diff-522adf759addbd3b193c74ca85243f7d
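
As a rough illustration of "one pin per Python version", the sketch below generates requirement strings with PEP 508 environment markers, which is the form a build-system `requires` list accepts. The pin table is hypothetical: only the 3.6 pin to 1.13.1 comes from this thread, the other entries are placeholders, and the function name is invented for the example.

```python
# Hypothetical table: oldest numpy assumed to build for each Python minor version.
# Only the 3.6 -> 1.13.1 entry is taken from this discussion.
OLDEST_NUMPY = {"3.5": "1.12.1", "3.6": "1.13.1", "3.7": "1.14.5"}


def build_requires(pins=OLDEST_NUMPY):
    """Return a build-requirements list with one exact numpy pin per Python version."""
    requirements = ["setuptools", "wheel", "cython>=0.28.2"]
    for python_version, numpy_version in sorted(pins.items()):
        requirements.append(
            f"numpy=={numpy_version}; python_version=='{python_version}'"
        )
    return requirements


for requirement in build_requires():
    print(requirement)
```

Each generated string applies only on the matching interpreter, so an isolated build on a given Python version sees exactly one numpy pin.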

@jorisvandenbossche
Member

Yes, this is the only correct way to do it, you have to build against the lowest supported numpy version for each Python version.

Yes, that's what we are doing (and what we did before we removed the pyproject.toml again ... just like scipy did :)).
But the question was more specifically about the situation on Alpine Linux, where apparently numpy 1.12 doesn't work for Python 3.6. So pinning to numpy 1.12 will mean that people on Alpine (where typically no wheels are available, so this pyproject.toml is relevant) will still need to pass --no-build-isolation or --no-use-pep517 to be able to install pandas.
But I suppose there is not really a way around that?

@jreback
Contributor

jreback commented Feb 15, 2019

maybe soln is to bump numpy requirement to 1.13? or higher

@xhochy
Contributor Author

xhochy commented Feb 15, 2019

Is it really that common that people are using Pandas with a numpy version older than a year? Updating numpy has been quite smooth in the last years, so I would have expected that this is one of the simpler migrations.

@jreback
Contributor

jreback commented Feb 15, 2019

agree if bumping helps here let’s do it
only downside is that then this patch must be for 0.25 but that should be soonish anyhow

@jorisvandenbossche
Member

only downside is that then this patch must be for 0.25 but that should be soonish anyhow

I would keep this patch for 0.25 anyway. We don't want to experiment with pyproject.toml in a bug fix release (it should not cause as much trouble as the last time we added it, but still).

@jorisvandenbossche jorisvandenbossche added this to the 0.25.0 milestone Feb 15, 2019
@rgommers
Contributor

But I suppose there is not really a way around that?

Indeed there isn't. There are no features of pyproject.toml that allow you to make the numpy version that you build against depend on Alpine/MUSL vs other linux distros.
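
One way to see the limitation: the environment markers usable in a requirement string (PEP 508) are derived from values like the ones below, and none of them identifies the C library. This is a rough standard-library sketch; the function name is hypothetical and the dictionary only approximates the marker environment, it is not pip's implementation.

```python
import os
import platform
import sys


def marker_environment():
    """Approximate the PEP 508 marker values for the running interpreter."""
    return {
        "os_name": os.name,
        "sys_platform": sys.platform,
        "platform_machine": platform.machine(),
        "platform_python_implementation": platform.python_implementation(),
        "platform_release": platform.release(),
        "platform_system": platform.system(),
        "platform_version": platform.version(),
        "python_version": ".".join(platform.python_version_tuple()[:2]),
        "python_full_version": platform.python_version(),
        "implementation_name": sys.implementation.name,
    }


for name, value in sorted(marker_environment().items()):
    print(f"{name} = {value!r}")
```

Values such as `sys_platform` and `platform_system` describe the operating system rather than the libc, so a marker expression built from them has no direct way to say "musl only".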

Is it really that common that people are using Pandas with a numpy version older than a year?

Yes it is. I'm quite sure that that's more common than people using non-manylinux-compatible distros.

maybe soln is to bump numpy requirement to 1.13? or higher

In this case, SciPy is anyway bumping to 1.13.3 as minimum numpy version for the next release. So bumping won't impact many users. So +1 for that solution.

@xhochy
Contributor Author

xhochy commented Feb 17, 2019

The pip version on Azure is not yet 19.0.2. I wonder whether it would be better to have the workaround in place to support the missing current directory in the sys.path rather than updating pip in the CI jobs.

@TomAugspurger
Contributor

Does this fully close #25193, or is there an additional issue there with cythonize being called unconditionally?

@jorisvandenbossche
Member

If we are fine with cython being a build dependency (also for sdists, as pyproject.toml cannot really distinguish from source vs from sdist builds), then the unconditional call to cythonize should not really be a problem.

But as I said already before, I don't think this PR closes #25193 for 0.24.2 (we don't want to start using pyproject.toml in a bug fix release, and also the numpy version increase should go in 0.25).
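
For context, a common alternative to calling cythonize unconditionally is to use Cython only when it is importable and otherwise fall back to C files shipped in the sdist. The sketch below shows that general pattern only; it is not pandas' setup.py, and the helper names and the example module path are hypothetical.

```python
def cython_available():
    """Report whether Cython can be imported in the current environment."""
    try:
        import Cython  # noqa: F401
    except ImportError:
        return False
    return True


def extension_source(module_path, use_cython):
    """Pick the .pyx source when Cython can regenerate C, else the shipped .c file."""
    return module_path + (".pyx" if use_cython else ".c")


# "pkg/_example" is a placeholder path, not a real pandas module.
print(extension_source("pkg/_example", cython_available()))
```

With Cython listed as a build requirement, the fallback branch would simply never be taken in an isolated build, which is the point made in the comment above.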

@jorisvandenbossche
Member

I wonder whether it would be better to have the workaround in place to support the missing current directory in the sys.path rather than updating pip in the CI jobs.

It's not that urgent to merge this I think, so we might still be able to remove it before that.
But, that said, we should maybe also think about adding versioneer to the build-system requires instead of relying on the copy included in the source, if we want to avoid needing the current directory on the Python path.

@h-vetinari
Contributor

Bumping numpy comes with a bit more complexity (esp. re: compat code) - I've started a PR dedicated to just that in #25554.

@xhochy
Contributor Author

xhochy commented Mar 5, 2019

@h-vetinari Thanks for taking care of the NumPy bump. I hope I can spend some time on this again once the bump is merged.

@jorisvandenbossche
Member

Yes, it is a good idea to do the numpy version bump in a separate PR.

@h-vetinari
Contributor

@xhochy: @h-vetinari Thanks for taking care of the NumPy bump. I hope I can spend some time on this again once the bump is merged.

The numpy bump was merged a few days ago. :)

@jreback
Contributor

jreback commented Apr 5, 2019

@xhochy can you merge master. we are now in 1.13 min, so you can remove a lot of these changes.

@jreback
Contributor

jreback commented Apr 20, 2019

can you merge master

@jreback jreback modified the milestone: 0.25.0 Apr 20, 2019
@jreback
Contributor

jreback commented May 7, 2019

https://travis-ci.org/pandas-dev/pandas/jobs/529095120

ERROR: Error installing 'file:///home/travis/build/pandas-dev/pandas': editable mode is not supported for pyproject.toml-style projects. pip is processing this project as pyproject.toml-style because it has a pyproject.toml file. Since the project has a setup.py and the pyproject.toml has no "build-backend" key for the "build_system" value, you may pass --no-use-pep517 to opt out of pyproject.toml-style processing. See PEP 517 for details on pyproject.toml-style projects.
The command "ci/setup_env.sh" failed and exited with 1 during .

seems like pypa/pip#6434
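
As a sketch of the opt-out that the error message suggests, the helper below assembles a pip command line with the `--no-use-pep517` and `--no-build-isolation` flags mentioned in this thread. The function name and its parameters are hypothetical; it only builds the argument list and does not run pip.

```python
import sys


def pip_install_command(target=".", editable=False, use_pep517=True,
                        build_isolation=True):
    """Build an argument list for `python -m pip install` with optional opt-outs."""
    command = [sys.executable, "-m", "pip", "install"]
    if not use_pep517:
        command.append("--no-use-pep517")
    if not build_isolation:
        command.append("--no-build-isolation")
    if editable:
        command.append("-e")
    command.append(target)
    return command


# Editable install that opts out of pyproject.toml-style processing.
print(pip_install_command(editable=True, use_pep517=False))
```

The resulting list could be passed to `subprocess.check_call` in a CI script; whether a given pip version accepts the combination depends on that version's behaviour.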

@jorisvandenbossche
Member

Yes, there was indeed a regression in the latest pip 19.1.0 for projects that use editable installs and already have a pyproject.toml file (although I think pip 19.1.1, which was just released, should mostly fix that).

We could choose to postpone adopting a pyproject.toml somewhat more until the dust settles around it (I think people are working on a PEP update to include editable installs in PEP517/518, but that will take some more time).

@@ -415,6 +415,7 @@ Reshaping
- Bug in :func:`pandas.cut` where large bins could incorrectly raise an error due to an integer overflow (:issue:`26045`)
- Bug in :func:`DataFrame.sort_index` where an error is thrown when a multi-indexed DataFrame is sorted on all levels with the initial level sorted last (:issue:`26053`)
- Bug in :meth:`Series.nlargest` treats ``True`` as smaller than ``False`` (:issue:`26154`)
- Bug in :func:`factorize` when passing an ``ExtensionArray`` with a custom ``na_sentinel`` (:issue:`25696`).
Member

some merge left-overs?

setup.py Outdated
# commit: 0f16dc307b72e613e71067b6498f82728461434a
#
# ensure cwd is on sys.path
# workaround bug in pip 19.0
Member

I think all of this can be removed now, see also discussion below

@jreback jreback removed this from the 0.25.0 milestone May 12, 2019
@TomAugspurger
Contributor

Is this a blocker for 0.25.0, or were all the blocking issues resolved elsewhere?

@jorisvandenbossche
Member

I don't think this is urgent (e.g. #25193, which triggered this, you already solved on master).

@jreback
Contributor

jreback commented Jun 27, 2019

closing; we need to see if we actually need this.

Labels: Build, Dependencies, Regression
Development

Successfully merging this pull request may close these issues.

Pip install of version 0.24 is broken for platforms without wheels
8 participants