
Increasing the ABI level to follow NEP29 #324

Closed · hmaarrfk opened this issue Aug 16, 2024 · 18 comments

@hmaarrfk (Contributor)

I noticed that the ABI level is set to 1.19, which reaches quite aggressively far backward.

This is somewhat troublesome since we have encouraged the following practices:

  1. Asked users to remove numpy constraints from the run section.
  2. Asked users to rely on run-exports from numpy (see the sketch below).
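
As a rough sketch of what those two practices look like in a recipe (illustrative; the exact pin is whatever numpy's run-exports currently declare):

    requirements:
        host:
            # building against numpy in host is enough; numpy's run-exports
            # inject the ABI range (currently numpy >=1.19,<3) into this
            # package's run requirements at render time
            - numpy
        run:
            - python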

While the ABI is compatible with 1.19, many packages are likely not supporting numpy that old.

NEP29 and SPEC0 suggest numpy 1.24 as the minimum today.
For example, scikit-image now has a constraint of numpy >=1.19,<3, but upstream does not test the runtime with 1.19:

https://conda-metadata-app.streamlit.app/?q=conda-forge%2Flinux-64%2Fscikit-image-0.24.0-py39hfc16268_1.conda

@rgommers (Contributor)

While the ABI is compatible with 1.19, many packages are likely not supporting numpy that old.

That doesn't matter at all, it's a lower bound only. If >=1.24,<3 would work, then >=1.19,<3 works too since that's a superset of versions.

Is this a hypothetical worry, or do you have a real-world problem?

@rgommers (Contributor)

For example, scikit-image now has a constraint of numpy >=1.19,<3,

Ah wait, missed this bit. You should also be setting a numpy runtime dependency constraint in your scikit-image meta.yaml, with the same bounds as you have in your pyproject.toml.
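
A minimal sketch of that pattern (the bounds shown are illustrative; copy whatever pyproject.toml actually declares):

    requirements:
        host:
            - numpy
        run:
            # mirror the runtime requirement from pyproject.toml; the
            # run-export from numpy in host adds the ABI range on top
            - numpy >=1.23,<3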

@rgommers (Contributor)

You have a direct runtime dependency on numpy, but it's missing from:
https://github.com/conda-forge/scikit-image-feedstock/blob/03b94eb66111686becbf7a6e60a62bc5191dc46f/recipe/meta.yaml#L44-L53

That's a bug (and technically it has always been a bug).

@hmaarrfk (Contributor, Author)

If >=1.24,<3 would work, then >=1.19,<3 works too since that's a superset of versions.

Do you want all downstream packages to have the upper bound hardcoded? The whole point of the run_exports was that downstream packages didn't have to worry about the upper bound of compatibility.

It would have to be a

run_constrained:
    - numpy >1.23

That's a bug (and technically it has always been a bug).

I'm happy to make the PR to scikit-image, but the point is to make it easy for other maintainers not to make these mistakes.

I'm with you that it is a bug, but I think that having the default point to 1.19 is really counter to the whole spirit of SPEC0 and NEP29.

In the spirit of SPEC0/NEP29, support for 1.19 should be opt-out instead of opt-in.

In my mind, we should help downstream packages shed the burden of infinitely long support windows. If we can let the defaults follow SPEC0 and/or NEP29, that would shed a lot of the maintenance from ourselves in general.

@hmaarrfk (Contributor, Author)

NEP29 or not, the issue is that with the start of the numpy 2.0 migration, the default lower bound unexpectedly (to me) changed from 1.22 to 1.19.

@hmaarrfk (Contributor, Author)

Ultimately, @rgommers, this isn't just a scikit-image problem. Even the scipy feedstock (which I would argue has the most knowledgeable eyes on it) suffers from the same problem:

https://github.com/scipy/scipy/blob/v1.14.0/pyproject.toml#L55

$ mamba create --name sp scipy=1.14.0 numpy=1.21 python=3.10 --channel conda-forge --override-channels

Looking for: ['scipy=1.14.0', 'numpy=1.21', 'python=3.10']

conda-forge/linux-64                                        Using cache
conda-forge/noarch                                          Using cache
Transaction

  Prefix: /home/mark/miniforge3/envs/sp

  Updating specs:

   - scipy=1.14.0
   - numpy=1.21
   - python=3.10


  Package              Version  Build                Channel           Size
─────────────────────────────────────────────────────────────────────────────
  Install:
─────────────────────────────────────────────────────────────────────────────

  + _libgcc_mutex          0.1  conda_forge          conda-forge        3kB
  + python_abi            3.10  4_cp310              conda-forge        6kB
  + ld_impl_linux-64      2.40  hf3520f5_7           conda-forge     Cached
  + ca-certificates   2024.7.4  hbcca054_0           conda-forge     Cached
  + libgomp             14.1.0  h77fa898_0           conda-forge     Cached
  + _openmp_mutex          4.5  2_gnu                conda-forge       24kB
  + libgcc-ng           14.1.0  h77fa898_0           conda-forge     Cached
  + openssl              3.3.1  h4bc722e_2           conda-forge     Cached
  + libxcrypt           4.4.36  hd590300_1           conda-forge     Cached
  + libzlib              1.3.1  h4ab18f5_1           conda-forge     Cached
  + libffi               3.4.2  h7f98852_5           conda-forge     Cached
  + bzip2                1.0.8  h4bc722e_7           conda-forge     Cached
  + ncurses                6.5  h59595ed_0           conda-forge     Cached
  + libstdcxx-ng        14.1.0  hc0a3c3a_0           conda-forge     Cached
  + libgfortran5        14.1.0  hc5f4f2c_0           conda-forge     Cached
  + libuuid             2.38.1  h0b41bf4_0           conda-forge     Cached
  + libnsl               2.0.1  hd590300_0           conda-forge     Cached
  + xz                   5.2.6  h166bdaf_0           conda-forge     Cached
  + tk                  8.6.13  noxft_h4845f30_101   conda-forge     Cached
  + libsqlite           3.46.0  hde9e2c9_0           conda-forge     Cached
  + readline               8.2  h8228510_1           conda-forge     Cached
  + libgfortran-ng      14.1.0  h69a702a_0           conda-forge       50kB
  + libopenblas         0.3.27  pthreads_hac2b453_1  conda-forge     Cached
  + libblas              3.9.0  23_linux64_openblas  conda-forge       15kB
  + libcblas             3.9.0  23_linux64_openblas  conda-forge       15kB
  + liblapack            3.9.0  23_linux64_openblas  conda-forge       15kB
  + tzdata               2024a  h0c530f3_0           conda-forge     Cached
  + python             3.10.14  hd12c33a_0_cpython   conda-forge       26MB
  + wheel               0.44.0  pyhd8ed1ab_0         conda-forge     Cached
  + setuptools          72.1.0  pyhd8ed1ab_0         conda-forge     Cached
  + pip                   24.2  pyhd8ed1ab_0         conda-forge     Cached
  + numpy               1.21.6  py310h45f3432_0      conda-forge        6MB
  + scipy               1.14.0  py310h93e2701_1      conda-forge     Cached

  Summary:

  Install: 33 packages

  Total download: 32MB

─────────────────────────────────────────────────────────────────────────────


Confirm changes: [Y/n]

@jakirkham (Member) commented Aug 16, 2024

Missed this discussion when looking at the scikit-image PR

It should be possible to just add the additional numpy constraints in requirements/run

If that doesn't work, would think that is a bug in build tools (like conda-build). Though I'm pretty sure I've seen this work elsewhere

@h-vetinari (Member)

I understand the sentiment, I also didn't want to go "backward". We discussed this a lot in conda-forge/conda-forge.github.io#1997, see comments starting here.

There are two separate concerns here:

  1. what numpy ABI do we compile for
  2. what numpy API is required

The first part is where we get the 1.19 from. We could override this with NPY_FEATURE_VERSION in some activation script, but that's somewhat brittle, and Ralf made the case that we really shouldn't touch this and should keep what numpy does by default (out of a variety of considerations). This is also where the run-export comes from, i.e. it describes the correct information in terms of the compile-time-to-runtime relationship.
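
For illustration only, a hedged sketch of the kind of override being discussed (and advised against). Numpy's documented mechanism for choosing the oldest-supported ABI at compile time is the NPY_TARGET_VERSION macro, so a recipe could in principle force a newer floor via compiler flags; the recipe layout below is an assumption, and the brittleness is exactly that it must track numpy's macros and defaults:

    build:
        script:
            # hypothetical: compile against the 1.22 feature level instead
            # of numpy's default oldest-supported ABI
            - export CFLAGS="$CFLAGS -DNPY_TARGET_VERSION=NPY_1_22_API_VERSION"
            - $PYTHON -m pip install . -vv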

For the second point, packages can of course require newer numpy API. That constraint should still be specified in the run requirements, as the 2.0 migrator ToDos said:

If upstream requires a minimum numpy version newer than 1.19, you can add numpy >=x.y under run:.

In retrospect that should really have been formulated as "you need to add" rather than "you can" - sorry about that. I'll open a PR to improve that.

For SciPy, we do catch the upper bound, but indeed we're missing the lower bound (and that it changed for 1.14). Thanks for pointing that out, I'll go fix that.

I'm sure we can improve the messaging on this (i.e. distinguish that removing pin_compatible doesn't mean that there can be no run-constraint on numpy), but I'm not convinced yet we need to touch the run-exports. I agree that it's not great that the numpy-portion of NEP29 gets slowed down in some vague sense, but projects are still free to bump their run-time requirements as described by NEP29, and our metadata should match that of course (but not necessarily the run-exports).

@jakirkham (Member)

For SciPy, we do catch the upper bound, but indeed we're missing the lower bound (and that it changed for 1.14). Thanks for pointing that out, I'll go fix that.

This is being done in these PRs:

I'm sure we can improve the messaging on this (i.e. distinguish that removing pin_compatible doesn't mean that there can be no run-constraint on numpy)...

Also this is being done in this PR:

It should be possible to just add the additional numpy constraints in requirements/run

Just to follow up on this point, this in fact is what is now happening in the scipy case.

If we look at a recent scipy package build with this change (screenshot below), we can see it adds 3 numpy constraints to depends. They are the numpy upper bound (<2.3), the run_exports from numpy (>=1.19,<3), and the numpy lower bound now added (>=1.23.5)

[Screenshot of the rendered scipy package metadata showing the three numpy entries in depends.]

IOW adding multiple numpy constraints to requirements/run (in addition to run_exports) works as expected
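
To make that concrete, a sketch of the rendered effect (versions as described above; the solver takes the intersection of all entries for the same package):

    run:
        - numpy <2.3          # explicit upper bound from the recipe
        - numpy >=1.19,<3     # injected by numpy's run-exports (ABI range)
        - numpy >=1.23.5      # explicit lower bound from the recipe
        # effective constraint: numpy >=1.23.5,<2.3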

@hmaarrfk (Contributor, Author)

Maybe my knowledge is outdated, but previously some solver, one of

  • conda
  • conda + libmamba
  • mamba

had real trouble with double run constraints like the ones you added for scipy:

        - numpy <2.{{ numpy_minor + 3 }}
        - numpy >=1.23.5

It's somewhat hard to test this today in 2024, since numpy 2.4 doesn't exist.

@hmaarrfk (Contributor, Author) commented Aug 17, 2024

Honestly, I probably need to get over myself and accept that the recommendations of NEP29/SPEC0 are so loose and non-binding that the burden continues to get pushed onto package maintainers to learn new "best practices" on a daily basis.

I'll just close this issue on the note that neither

had a lower bound set on numpy at the time I opened this issue. So instead of helping people make better packages, we are actively encouraging maintainers to continue to spend their resources (their own human time) to make PRs and write repo-data patches (that only core can merge).

While it is true that numpy 2.0 is ABI compatible with numpy 1.19, that doesn't mean that we have to allow this at conda-forge. I don't see anybody in the near future making a PR to the 1.19 branch in case anything accidentally breaks
https://github.com/conda-forge/numpy-feedstock/commits/numpy119/

Setting the lower bound to 1.22 would just help create a better safety net when mistakes in packages happen (and they did in the 4 popular packages I listed above)
https://github.com/conda-forge/conda-forge-pinning-feedstock/blob/main/recipe/conda_build_config.yaml#L689
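
For concreteness, a hypothetical sketch of what such a bump might look like in the global pinning (the key name and layout here are assumptions; the real change might equally live in numpy's run-exports):

    numpy:
        # hypothetical: raise the default floor from 1.19 to 1.22
        - 1.22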

edit: added 1.19 in the sentence:

While it is true that numpy 2.0 is ABI compatible with numpy 1.19

@jakirkham (Member)

Think some misunderstanding may be happening here

No one is saying packages need to support numpy's full range of versions with ABI compatibility. Packages should feel free to add tighter constraints

If anything, I think the improvements on the NumPy ABI side with NumPy 2, and the use of run_exports instead of pin_compatible in conda-forge, have made this easier for package maintainers to do (not harder). Today if a package needs NumPy as part of the build, they...

  • Add numpy to requirements/host
  • Copy the numpy constraint from the source's pyproject.toml's project.dependencies
  • Re-render

There is no pin_compatible (or the earlier x.x). No need for maintainers to think about what numpy versions are being built against. Support across architectures is also better aligned. We collectively have tried very hard to make building with numpy as easy as possible


If you are still feeling that there are pain points here, would encourage reopening and discussing further. Would like us to sort out any remaining issues so that the experience of using numpy in package builds is as smooth as possible

As a particular example, if there are issues with repeated constraints, we should identify what they are and track them down and make sure they are fixed. It should also be possible to create example test cases with any package (not just numpy)

IOW this...

    - numpy >=1.23.5
    - numpy <2.0

...should be identical to one constraint using commas (which mean AND)...

    - numpy >=1.23.5,<2.0

...and solvers should be able to handle both easily (as they can map to one another). If that is not the case, we can raise issues and work on resolution

We could also look at having build tooling do this transformation for us when generating the package metadata

@h-vetinari (Member)

FWIW, with the upcoming numpy 2.1.0, we'll bump this to 1.21: numpy/numpy#27187

@rgommers (Contributor)

The bugs in recipes are unfortunate indeed. On the upside, the only reason we didn't catch them yet is that no one opened a bug report (until this one yesterday) - so in practice it is very rare to get the latest version of scipy/pandas/etc. with a 3-year-old numpy version.

+1 to @jakirkham's explanation. ABI is managed by run_exports, and is now cleanly separated from the runtime (API) dependency, so "meta.yaml should match pyproject.toml" is all one has to remember now.

@hmaarrfk (Contributor, Author)

I closed this issue because I agree with the technical merits of your decisions. Numpy's ABI is compatible down to 1.19; strictly technically, we should expose that. Conda-forge shouldn't be picking and choosing which features it exposes.

I do think we don't do enough to set lower bounds aggressively, which would help us minimize our bug exposure surface.

Few people are asking for a 3 year old numpy in new packages in 2024, so why make that the default?

@h-vetinari (Member)

Few people are asking for a 3 year old numpy in new packages in 2024, so why make that the default?

Because everything is a trade-off... taking numpy's default ABI saves us some bug surface (in packaging) too, at the risk that someone installing a very old numpy might get something broken. But that's not a very common case, so IMO the current trade-off is still favourable. 🤷‍♂️

@hmaarrfk (Contributor, Author)

@jakirkham where should I post this kind of feature suggestion? I was going to post on conda/grayskull but I'm not sure if that is the appropriate spot.

Feature request to discuss

Is your feature request related to a problem? Please describe.
As a maintainer, I want to provide flexible packages that follow the spirit of what the upstream developers want; however, upstream testing is often limited and doesn't always cover the very lowest bounds of the allowable versions.

Often these issues arise for hardware configurations not available upstream to test on, be it osx-arm64, aarch64, or just package combinations that affect each other's behavior (like numpy / cupy / dask / xarray dispatching).

As a maintainer, I would like the choice to restrict the lower bound of the packages to those suggested by NEP29, and not really think about it.

Describe the solution you'd like

build:
    skip: true  # [win]
    skip: true  # [py==38]

requirements:
    host:
        - numpy
    run:
        - numpy >=1.21,<3.0a0
        - scipy
        - scikit-image !=0.17.2,>0.16.0,!=0.21.0,<1.0.0
        - dask >=2022.9.1

to be filtered with the command

grayskull --nep29-lower-bounds

would update on August 17 to:

  • Dropping Python 3.9 and lower
  • numpy lower bound of 1.24.0
  • scipy lower bound of 1.10.0
  • scikit-image lower bound of 0.20.0
  • dask lower bound of 2022.9.0, but the package asks for 2022.9.1 so that won't change

resulting in:

build:
    skip: true  # [win]
    skip: true  # [py<=39]

requirements:
    host:
        - numpy
    run:
        - numpy >=1.24.0,<3.0a0
        - scipy >=1.10.0
        - scikit-image >=0.20.0,!=0.21.0,<1.0.0
        - dask >=2022.9.1

I have a package called https://github.com/hmaarrfk/nep29 (on PyPI and conda-forge) that can generate these lower bounds for you, according to my understanding of the heuristics spelled out in https://numpy.org/neps/nep-0029-deprecation_policy.html

Describe alternatives you've considered

Ultimately the answer seems to be:

The tooling should be flexible, and maintainers should opt in.

This is a solution that I think would provide an option for maintainers to opt in.


@h-vetinari (Member)

where should I post this kind of feature suggestion?

I think the best would be an issue in https://github.com/conda-forge/conda-forge.github.io

rgommers added a commit to rgommers/pandas-feedstock that referenced this issue Aug 26, 2024
Follows up on conda-forge/numpy-feedstock#324 (comment), which pointed out that the dependency is currently missing and that it isn't correct to depend only on the `run_exports` of `numpy` itself (that's ABI-only and allows >=1.19).

The upstream constraints for v2.2.2 are at https://github.com/pandas-dev/pandas/blob/v2.2.2/pyproject.toml#L30-L32. Note that (unlike in `pyproject.toml`) it is not necessary to have separate lower bounds for different Python versions that correspond to the first `numpy` version with support for that Python version. A single lower bound is enough; there will not be (for example) `numpy` conda-forge packages for 1.23.0 and python 3.11. In `pyproject.toml` the constraints have to be more complex only to avoid `pip` & co trying to build a too-old version from source.
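
A sketch of the resulting pattern (the version shown is illustrative, taken as the lowest of the per-Python bounds in the upstream pyproject.toml):

    requirements:
        run:
            # one lower bound suffices on conda-forge; the per-Python split
            # in pyproject.toml exists only to stop pip building a too-old
            # numpy from source; run-exports supply the <3 upper bound
            - numpy >=1.22.4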