Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid the overhead of creating a PyErr for downcasting. #326

Merged
merged 3 commits into from
Apr 25, 2022

Conversation

adamreichold
Copy link
Member

This does give us the best of both worlds, i.e. avoid having to create a PyErr for downcast but still keeping a single implementation for the extract logic. I am just not sure whether this is wort the additional effort, i.e. the extra ExtractionError enum.

Base automatically changed from repr-transparent to main April 21, 2022 19:25
@kngwyu
Copy link
Member

kngwyu commented Apr 22, 2022

I'm not sure how it matters in user experience. Anyway users can't get the enum and just get PyErr, right?

@adamreichold
Copy link
Member Author

I'm not sure how it matters in user experience. Anyway users can't get the enum and just get PyErr, right?

The difference is purely internal w.r.t. performance, but it actually turns out to be a mixed bag after adding the benchmarks (I guess the enum is larger than PyErr hence slowing down everything but downcast_failure):

 name              main ns/iter  pr ns/iter  diff ns/iter   diff %  speedup 
 downcast_failure  84            52                   -32  -38.10%   x 1.62 
 downcast_success  18            21                     3   16.67%   x 0.86 
 extract_failure   84            111                   27   32.14%   x 0.76 
 extract_success   18            21                     3   16.67%   x 0.86 

I therefore redid the whole thing using a more tricky IgnoreError type to just drop the error on the floor for downcast and still directly convert it into PyErr for extract.

This, together with checking pointer equality before calling into PyArray_EquivTypes, then yields an improvement across the board which I feel makes this change more reasonable:

 name              main ns/iter  pr ns/iter  diff ns/iter   diff %  speedup 
 downcast_failure  84            48                   -36  -42.86%   x 1.75 
 downcast_success  18            15                    -3  -16.67%   x 1.20 
 extract_failure   84            84                     0    0.00%   x 1.00 
 extract_success   18            15                    -3  -16.67%   x 1.20 

@adamreichold adamreichold changed the title RFC: Use a bespoke enum error type to avoid the overhead of creating a PyErr for downcasting. Avoid the overhead of creating a PyErr for downcasting. Apr 22, 2022
@kngwyu
Copy link
Member

kngwyu commented Apr 25, 2022

The hack in is_equiv_to looks reasonable, but I'm a bit struggling in understanding what IgnoreError speeds up. It currently only works for is_type_of?

@adamreichold
Copy link
Member Author

The hack in is_equiv_to looks reasonable, but I'm a bit struggling in understanding what IgnoreError speeds up. It currently only works for is_type_of?

Yes, it is used only for is_type_of which however backs downcast. The point of IgnoreError is that downcast does not return a PyErr, but a PyDowncastError iff is_type_of returns false.

So as indicated by the downcast_failure benchmark, calling downcast on something that does not match gets significantly faster, because no PyErr is allocated on the Python heap by is_type_of just to be dropped immediately afterwards. IgnoreError exploits this by just ignoring the error information entirely (since it will never reach a public API anyway).

Finally, failed calls to extract and downcast are important because Python methods which support multiple element data types safely will need to repeatedly try to extract/downcast a given PyAny value into any of the supported PyArray instantiations.

Copy link
Member

@kngwyu kngwyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your explanation. Let's land this PR after documenting the internal usage a bit.

src/error.rs Show resolved Hide resolved
@kngwyu
Copy link
Member

kngwyu commented Apr 25, 2022

Hmm, the pypy failure looks like a bug of maturin 😓

@adamreichold
Copy link
Member Author

Hmm, the pypy failure looks like a bug of maturin sweat

Reported at PyO3/maturin#882

@adamreichold
Copy link
Member Author

Will hold back on merging until Maturin 0.12.14 is released so that I can drop the last commit.

@adamreichold adamreichold merged commit c1cc96f into main Apr 25, 2022
@adamreichold adamreichold deleted the extraction-error branch April 25, 2022 22:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants