-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use custom resolver for query and eval with nested frames. #150
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #150 +/- ##
==========================================
+ Coverage 99.47% 99.49% +0.01%
==========================================
Files 14 14
Lines 948 981 +33
==========================================
+ Hits 943 976 +33
Misses 5 5 ☔ View full report in Codecov by Sentry. |
Verify preflighting of nested expressions using AST visitation. Remove logic for splitting queries by string. Now the evaluation is handled by a nested column resolver, and the mixed-mode expressions are preflighted by examining the parsed abstract syntax tree for the query expression.
Also remove `_ensure_spacing`, no longer used.
29335eb
to
ee57673
Compare
Click here to view all benchmarks. |
This logic was copied from pd.DataFrame.query, and the accompanying comment said that it was to handle an occasional case where `self.loc[b]` would raise an error on a multi-dimensional `b`, but `self[b]` would succeed. I can't cause this error with `.loc` anymore, so the code coverage complains about the unexercised exception clause. Removing it since the limitation that inspired it seems to be gone now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
I'd ask for more testing and not-implemented error-handling. For example, I'm not sure that assignment with eval
is supported: .eval("nested.c = nested.a + nested.b")
or .eval("nested.b = x + nested.a").
It looks like it may also close #144 and #146? If yes, could you please add tests for covering those issues?
Yes, this PR does resolve #146, thanks for the tip! #144 is not resolved by this PR, but perhaps this same approach could be extended to cover it in a separate PR. Good catch on the assignment problem. The existing ability for |
Prevent users from depending on the public interface, and also from assuming that a `NestedSeries` has a nest within it, since that is the existing meaning of that prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @gitosaurus , looks really good! I added a few comments, in addition to the feedback Kostya has provided.
Permits the creation of new nests and new nested columns within the `.eval()` method.
Include more unit tests of `NestedFrame.eval`.
The changes I pushed yesterday do permit assignment of nested columns, and even the creation of new nests, from within |
Some of the dict-like overloads were an overachievement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, thank you so much!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I think the eval interface here is a huge addition to nested-pandas, to the level that I would love to see it in our documentation. I wouldn't ask it for this PR, but perhaps in a future PR we could add a section to the data manipulation tutorial showcasing some of the uses of eval?
Great idea! I'll add a task for that. |
Verify preflighting of nested expressions using AST visitation.
Remove logic for splitting queries by string. Now the evaluation is handled by a nested column resolver, and the mixed-mode expressions are preflighted by examining the parsed abstract syntax tree for the query expression.
Resolves #59.
Change Description
Solution Description
Code Quality
Project-Specific Pull Request Checklists
Bug Fix Checklist