Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

awkward._v2 imports broken with subpackages #1771

Closed
douglasdavis opened this issue Oct 5, 2022 · 8 comments · Fixed by #1857
Closed

awkward._v2 imports broken with subpackages #1771

douglasdavis opened this issue Oct 5, 2022 · 8 comments · Fixed by #1857
Labels
bug (unverified) The problem described would be a bug, but needs to be triaged

Comments

@douglasdavis
Copy link
Contributor

douglasdavis commented Oct 5, 2022

Version of Awkward Array

2.0.0rc1

Description and code to reproduce

In dask-awkward we have some imports that break after installing from the main branch (but not removing the _v2 imports). For example: https://github.com/ContinuumIO/dask-awkward/blob/7a0cceeade6aa8f64f5f3c29fb59c75a807230f7/src/dask_awkward/lib/io/io.py#L9
This is because awkward._v2.types used to be package and now it's just a module with the updated repo structure.

Note that dask-awkward is fine if we transition to not importing _v2 things (https://github.com/ContinuumIO/dask-awkward/pull/86)

@douglasdavis douglasdavis added the bug (unverified) The problem described would be a bug, but needs to be triaged label Oct 5, 2022
@agoose77
Copy link
Collaborator

agoose77 commented Oct 5, 2022

This is a known "bug", and it seems like it is the least-worst option of the solutions to implementing a v2 shim. It doesn't seem like there is a way to import the same package under different names in sys.modules, without Python creating a new module object. We could pre-populate sys.modules with pre-generated aliases, but then if we miss something, an erroneous import would give the user new state and overwrite the name of the submodule in the parent module namespace.

Given that v2 has had so many bug fixes over the last couple of versions, and we made no guarantees initially that we would provide a shim, my feeling is that we just have to put up with a hard-breakage if users aren't already treating ak._v2 as a non-package. @jpivarski thoughts?

@jpivarski
Copy link
Member

The code change of

import awkward._v2 as ak

to

import awkward as ak

is an easy one. What's hard is coordinating versions of packages, particularly in a CI matrix, in which each one might get a different version from PyPI.

But if the shim isn't working or only partially works, then that's even worse, since downstream users have to adapt anyway, and it's not the simple adaptation described above.

Okay, so I'm thinking that you're right and we should give up on the idea of a shim, just doing a hard break. That's also different from what I told @alexander-held, so I'm including him in the loop here, too. Since we haven't released even an RC yet, going back and forth like this won't be too disruptive.

Suppose instead of a shim, we ask all downstream users to replace

import awkward._v2 as ak

with

try:
    import awkward._v2 as ak
except ImportError:
    import awkward as ak

As of 2.0.0rc1 (not released yet), we would have no _v2 submodule. A user who hasn't adapted yet would get an error, but the above fix is easy. The try-except block would only do the wrong thing (pass through v1) if the version of Awkward predates the first release of a _v2 submodule, which is so long ago, I can't even remember. (A year, say least.) Only early adopters of v2 would be affected, so I think no one would fall between the cracks.

Maybe it needs to be ModuleNotFoundError, but that's a subclass of ImportError, so it still works.

Anyone who has a testing matrix that's not complicated by Python versions that Awkward v2 doesn't support (3.6, specifically) can just

import awkward as ak

after 2.0.0rc1, and that includes Uproot.

@douglasdavis
Copy link
Contributor Author

@agoose77 thanks for the explanation of the imports issue. And wearing my early adopter hat (and having easily transitioned to removing ._v2 usage) I don't feel strongly either way about the eventual conclusion on having or not having the v2 shim.

@alexander-held
Copy link
Member

I've been using

try:
    import awkward._v2 as ak
except ModuleNotFoundError:
    import awkward as ak

so far, with the intention of monitoring the developments to see whether I would need to update this. I also do not feel strongly either way, this was easy to adopt.

I imagine the most challenging setup for projects depending on awkward is a potential conflict due to other dependencies that may put an upper bound on awkward due to a delayed adoption. Such a project would in any case be using awkward<2 in the meantime and _v2 would be there regardless.

@martindurant
Copy link
Contributor

There is a simple fix: make "_v2" a symbolic file link back to the root directory. The only downside is that you would have an akwards._v2._v2._v2..., but that's fine.

@agoose77
Copy link
Collaborator

agoose77 commented Oct 6, 2022

I'm not sure that will work 🤔; wheels don't support symlinks. We could create a v2 symlink at runtime (only on unix), but that's fairly unappealing. Furthermore, if we could establish the link by other means, we would then end up with two separate instances of Awkward in the same Python interpreter which I'm not sure about. On the one hand, that's what we had with v1/v2, but on the other, awkward.v2 is the same as awkward and it would feel unnatural to have them behave distinctly.

@jpivarski
Copy link
Member

A decision needs to be made on this.

We've tried two ways of implementing an awkward._v2 that points back to awkward, and they both seemed fine in our first tests and then failed in subtle ways (in someone else's workflow!) later. I suspect that there isn't a foolproof method, and these subtle failures are worse than an outright ModuleNotFoundError if there's no awkward._v2 at all.

So I think we should have no awkward._v2 at all. Here's what the consequences of that would be.

  • Early adopters of v2 have import awkward._v2 as ak in their scripts or libraries. When they install 2.0.0 or one of its RC releases, it will fail with ModuleNotFoundError.
    • If their code is always running under 2.0.0(rcX), they can replace the import awkward._v2 as ak with import awkward as ak.
    • If their code is sometimes running under 2.0.0(rcX) and sometimes v2 within 1.8.X, 1.9.X, or 1.10.X, perhaps because they have a test matrix with one of the tests under Python 3.6 and 2.0.0(rcX) doesn't support Python 3.6, then this try-except would give them v2 in all circumstances:
try:
    import awkward._v2 as ak
except ModuleNotFoundError:
    import awkward as ak
  • Mainstream users who encounter v2 for the first time when pip install awkward gives them 2.0.0 (not RC) will just see v1 behavior switch over to v2 behavior. Depending on what they use, they might not notice any difference at all: there's a large subset of the API that is backward-compatible. If they do see changes, it will be like any other version upgrade: either they'll have to adapt or they'll have to cap their awkward version until they're up to date. A submodule named v2 would never factor into that at all.
  • The only way to use v1 and v2 in the same process is with one of the releases that includes an incomplete v2 implementation, from 1.8.0rc1/1.8.0 (January 7, 2022/March 1, 2022) onward (introduced in C++ refactoring: ak._v2 namespace is now filled with the right symbols. #1208). We will not recommend this because the most recent v2 someone can get this way is in 1.10.1 (September 22, 2022).

This set of consequences sound pretty good to me. It would be strictly nicer to provide a _v2 stub, so that early adopters can adjust their import statements after the 2.0.0 release, but the consequences of a botched _v2 stub are not nice, and I'm not confident that there's a clean way to do it.

Actually, a little better than having no _v2 at all: we introduce a _v2 submodule that immediately raises ModuleNotFoundError with a descriptive error message saying that import awkward as ak or the try-except technique should be used instead. I'll start a PR and we can continue the discussion there.

@agoose77
Copy link
Collaborator

agoose77 commented Nov 1, 2022

Personally, I'm in favour of this. We don't want people to be running on _v2 forever, and so at some point we need to push them over. I don't see any huge issue with doing that now rather than in N months. At least now it is less likely to be in released packages that are part of some dependency chain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug (unverified) The problem described would be a bug, but needs to be triaged
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants