Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: complain about CannotBeAwkward earlier, before reading data. #838

Merged
merged 2 commits into from
Feb 21, 2023

Conversation

jpivarski
Copy link
Member

This came up in Discussion #826 (Issue #833): the AwkwardForth errors occur on an Interpretation that could never have been Awkward anyway.

An early hint was that it was going through read_object_any, which means arbitrary-typed pointers, circular references, and a whole host of things that are beyond the scope of Awkward Array, but the clearest indication was that we got the "Cannot be Awkward" error in TTrees with zero entries—if it could have gotten past the AwkwardForth read, it would have encountered this error, anyway. (It didn't get past the AwkwardForth read because we don't generate AwkwardForth inside read_object_any.)

So this PR

  • takes the check out of Library.finalize
  • puts it into _ranges_or_baskets_to_arrays, on the main thread, before any reading starts1
  • fixes an unrelated error in _util, but it was uncovered by this PR.

Footnotes

  1. Well, actually, Source.chunks has already been called, so the bytes have been read from local files or a remote server has been queried, but we haven't waited for any data to come back yet and won't be decompressing it or interpreting it if it does.

@jpivarski jpivarski linked an issue Feb 20, 2023 that may be closed by this pull request
Comment on lines +573 to +580
elif model.fields is not None:
fields = []
contents = []
for field, (dtype, _) in model.fields.items():
fields.append(field)
contents.append(awkward_form(dtype, file, context))
# directly return; don't cache RecordForms in _primitive_awkward_form
return awkward.forms.RecordForm(contents, fields)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unrelated to the main purpose of the PR, but calling the awkward_forth method at this point in the code revealed that the leaf-list case isn't handled. Somehow, test_0033-more-interpretations-2.py::test_leaflist_awkward was passing before because something must have figured out that it needs a RecordArray without calling awkward_forth. (I'm not sure which code path that is, but now, at least, this is consistent with that.)

Copy link
Collaborator

@ioanaif ioanaif left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Should tests be added for this issue?

@jpivarski
Copy link
Member Author

The file that revealed the issue is private—otherwise, that would have been used as a test.

However, every existing test with library="ak" goes through this new code path, so it is being tested in the sense of having full coverage. In fact, the unrelated bug (highlighted above) was discovered because it went through this code path, and in the old system, it somehow managed to avoid the old (late) code path. I'd take that as an indication that coverage has increased.

@jpivarski jpivarski enabled auto-merge (squash) February 21, 2023 15:50
@jpivarski jpivarski merged commit f7dee75 into main Feb 21, 2023
@jpivarski jpivarski deleted the jpivarski/complain-about-CannotBeAwkward-earlier branch February 21, 2023 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error in untested AwkwardForth case
2 participants