Awkward v2 update #620
The above commit has all instances of @jpivarski There are 6 places in Aryan's work had a few instances of Lastly as discussed, the typeparser was added to v2. The release hasn't made it to pypi at the time of writing, so I just did Essentially this had the same effect as directly importing ak_v1 inline and using that v1 parser, as the same tests fail, and the same tests are resolved (the 5 which were failing earlier). Please have a look at the last few tests in the CI for the error. |
I found 7 major ways in which the tests are failing:
I believe this is because somewhere in the tests we might be using ak_v1 arrays and that is conflicting.
I'm not sure what this is, but you provided a few notes on the highlevel submodule, I'll go through that, some code and get back to this test.
But somewhere along the stacktrace the typeparser code in awkward is trying to use |
src/uproot/interpretation/library.py
- cls = getattr(awkward.layout, form["class"])
+ cls = getattr(
+     awkward.contents, form["class"]
+ )  # PLEASE SEE: I'M NOT SURE WHAT TO DO HERE! PLS CONFIRM
This form is apparently a dict that came from JSON-parsing the JSON serialization of a Form. Every JSON object representing a node of a Form tree has a "class" attribute, which names the associated Content class.
So, for instance, you know that a particular JSON node represents a ListOffsetForm if the "class" is "ListOffsetArray32", "ListOffsetArray64", or an unqualified name like "ListOffsetArray". (This JSON has to be backward compatible, so it really can have the "64" on it or not, regardless of whether it's v1 or v2.)
This code would only have worked for the "ListOffsetArray32", "ListOffsetArray64" kinds of strings in v1, so it's not completely valid anyway. What it did was getattr the Content subclass by name, such as awkward.layout.ListOffsetArray64.
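To make the "class" attribute concrete, here is a minimal sketch that parses a small Form-like JSON string and collects the "class" value of every node. The JSON below is illustrative only: the keys other than "class" are assumptions, and class_names is a hypothetical helper, not part of Uproot or Awkward Array.

```python
import json

# Illustrative Form JSON. Only the "class" attribute matters here;
# the other keys are assumptions made for the sake of the example.
form_json = """
{
  "class": "ListOffsetArray64",
  "offsets": "i64",
  "content": {"class": "NumpyArray", "primitive": "float64"}
}
"""


def class_names(node):
    """Collect the "class" value of every node in a parsed Form tree."""
    found = []
    if isinstance(node, dict):
        if "class" in node:
            found.append(node["class"])
        for value in node.values():
            found.extend(class_names(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(class_names(item))
    return found


form = json.loads(form_json)
print(class_names(form))  # ['ListOffsetArray64', 'NumpyArray']
```

Each of these strings is what the getattr call would receive as form["class"], which is why a suffixed name such as "ListOffsetArray64" needs handling.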
Maybe the right thing to do is to have a helper function:
def _content_cls_from_name(name):
    # Check the longest suffixes first: "8_U32" also ends with "U32" and "32".
    if name.endswith("8_U32"):
        name = name[:-5]
    elif name.endswith("8_32") or name.endswith("8_64"):
        name = name[:-4]
    elif name.endswith("U32"):
        name = name[:-3]
    elif name.endswith("32") or name.endswith("64"):
        name = name[:-2]
    # The remaining unqualified name is looked up in awkward.contents.
    return getattr(awkward.contents, name)
These are all of the possible suffixes that have to be ignored in v2. Since it's a fixed set of classes, it could have been a dict mapping names to class objects, but then that dict would have to be updated if a new Content subclass is ever added to Awkward Array (unlikely). This, at least, is open-minded about new Content subclasses, as long as they stay with a limited set of suffixes.
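The suffix handling can be checked on its own, without importing Awkward Array. This is a minimal sketch, assuming the suffix set listed above is complete; strip_width_suffix is a hypothetical name, and the sample class names are only examples of the v1 naming pattern.

```python
# Hypothetical helper: string handling only, no Awkward import needed.
# Suffixes are ordered longest first, because "8_U32" also ends with
# "U32" and "32", and a shorter match would strip too little.
_SUFFIXES = ("8_U32", "8_32", "8_64", "U32", "32", "64")


def strip_width_suffix(name):
    """Return the class name with any v1 index-width suffix removed."""
    for suffix in _SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


samples = [
    "ListOffsetArray64",
    "ListOffsetArrayU32",
    "UnionArray8_32",
    "UnionArray8_U32",
    "ListOffsetArray",
    "NumpyArray",
]
for name in samples:
    print(name, "->", strip_width_suffix(name))
```

Names without a suffix pass through unchanged, so the same function works for both the qualified and the unqualified spellings of "class".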
src/uproot/interpretation/library.py
- cls = getattr(awkward.layout, form["class"])
+ cls = getattr(
+     awkward.contents, form["class"]
+ )  # PLEASE SEE: AND HERE, PLS CONFIRM
Same.
src/uproot/interpretation/library.py
return awkward.Array(
    awkward.contents.RecordArray([], keys=[])
)  # PLEASE SEE: RECORD ARRAY API seems to have changed
Yes: "keys" is now named "fields" and it is no longer an optional argument.
Suggested change:
- return awkward.Array(
-     awkward.contents.RecordArray([], keys=[])
- )  # PLEASE SEE: RECORD ARRAY API seems to have changed
+ return awkward.Array(awkward.contents.RecordArray([], []))
src/uproot/interpretation/library.py
out = awkward.Array(
    awkward.contents.RecordArray([], keys=[])
)  # PLEASE SEE: RECORD ARRAY API seems to have changed
Same.
Suggested change:
- out = awkward.Array(
-     awkward.contents.RecordArray([], keys=[])
- )  # PLEASE SEE: RECORD ARRAY API seems to have changed
+ out = awkward.Array(awkward.contents.RecordArray([], []))
src/uproot/interpretation/library.py
awkward.contents.RecordArray(
    [], keys=[]
)  # PLEASE SEE: RECORD ARRAY API seems to have changed
Same.
Suggested change:
- awkward.contents.RecordArray(
-     [], keys=[]
- )  # PLEASE SEE: RECORD ARRAY API seems to have changed
+ awkward.contents.RecordArray([], [])
src/uproot/interpretation/library.py
awkward.contents.RecordArray(
    [], keys=[]
)  # PLEASE SEE: RECORD ARRAY API seems to have changed
Same.
Suggested change:
- awkward.contents.RecordArray(
-     [], keys=[]
- )  # PLEASE SEE: RECORD ARRAY API seems to have changed
+ awkward.contents.RecordArray([], [])
Done.
I'll be fixing (mostly replacing) the new This error doesn't sound like it's related, since If you need to make pytest's output less noisy, you can put a @pytest.mark.skip(reason="FIXME: remove this mark before end of PR") decorator on every test that fails except the one you're working on. The pytest output will list a lot of skipped tests at the end. Then, you can leave the ones related to
Better yet: replace the v1
This On Zoom, I said that the submodules within awkward.operations.describe.type becomes awkward._v2.operations.type But for
If the new JSON string doesn't have the "
awkward._v2.forms.from_json
Is this (It's also possible to drop the word "
The argument named "
I don't see a way for v1 types to get into the v2 |
c22ada9 to 5957e0e
@jpivarski Is it possible, we might not have pushed a change? I remember you correcting the test that is currently failing. |
The failure is in the thing I thought I'd have to change in Awkward (and is therefore temporary):
The question is whether a record name like |
@jpivarski In order to solve them, I had to change the arguments of from_datashape. As already discussed, in ak v1 the default for high-level was False, and in v2 the default is True. Due to this change, the instances of from_datashape in uproot now have to explicitly mention Now, based on our meet, we did talk about how maybe it should have been |
Approved!
But it will get merged after #644 because we want to do a performance test with that one (today), and not updating to v2 yet will make it easier to do that right away.
After #644 is merged into main, I'll help merge that into this branch, and then this can be merged. That should happen later today (I can do that alone) or tomorrow.
As I said in Slack, we're having trouble with #644: it doesn't pass all tests, and so we won't merge it yet. The trouble has been traced back to Awkward Array, and it will take a long time to get an Awkward Array fix into PyPI so that Uproot's CI can see it. Therefore, we'll merge this one first, and do #644 later. |
@all-contributors please add @kkothari2001 for code |
I've put up a pull request to add @kkothari2001! 🎉 |
@all-contributors please add @kkothari2001 for code |
@kkothari2001 already contributed before to code |
* Changed everything we _think_ we need to change.
* All tests pass.
* Change test 0034
* Alter instances of from_datashape
* Removed 'v1/v2 insensitivity' checks. We're all-in for v2 now.

Co-authored-by: Jim Pivarski <[email protected]>