Avoid the overhead of creating a PyErr for downcasting. #326

adamreichold · 2022-04-21T09:36:18Z

This does give us the best of both worlds, i.e. it avoids creating a PyErr for downcast while still keeping a single implementation of the extract logic. I am just not sure whether this is worth the additional effort, i.e. the extra ExtractionError enum.

kngwyu · 2022-04-22T08:02:50Z

I'm not sure how this matters for user experience. Users can't get the enum anyway and just get a PyErr, right?

adamreichold · 2022-04-22T08:32:44Z

> I'm not sure how this matters for user experience. Users can't get the enum anyway and just get a PyErr, right?

The difference is purely internal w.r.t. performance, but it actually turns out to be a mixed bag after adding the benchmarks (I guess the enum is larger than PyErr hence slowing down everything but downcast_failure):

 name              main ns/iter  pr ns/iter  diff ns/iter   diff %  speedup 
 downcast_failure  84            52                   -32  -38.10%   x 1.62 
 downcast_success  18            21                     3   16.67%   x 0.86 
 extract_failure   84            111                   27   32.14%   x 0.76 
 extract_success   18            21                     3   16.67%   x 0.86

I therefore redid the whole thing using a trickier IgnoreError type, which just drops the error on the floor for downcast while extract still converts it directly into a PyErr.

This, together with checking pointer equality before calling into PyArray_EquivTypes, then yields an improvement across the board which I feel makes this change more reasonable:

 name              main ns/iter  pr ns/iter  diff ns/iter   diff %  speedup 
 downcast_failure  84            48                   -36  -42.86%   x 1.75 
 downcast_success  18            15                    -3  -16.67%   x 1.20 
 extract_failure   84            84                     0    0.00%   x 1.00 
 extract_success   18            15                    -3  -16.67%   x 1.20
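The pointer-equality shortcut mentioned above can be illustrated with a small std-only sketch. It is not the rust-numpy code: `Descr`, `equiv_types_slow` and `SLOW_PATH_CALLS` are hypothetical stand-ins for the array descriptor, for the call into PyArray_EquivTypes and for a counter that makes the skipped call visible.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter, only here to observe how often the slow path runs.
static SLOW_PATH_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for an array descriptor.
#[derive(Debug)]
struct Descr {
    kind: char,
    size: usize,
}

// Stand-in for the comparatively expensive call into PyArray_EquivTypes.
fn equiv_types_slow(a: &Descr, b: &Descr) -> bool {
    SLOW_PATH_CALLS.fetch_add(1, Ordering::Relaxed);
    a.kind == b.kind && a.size == b.size
}

// Two references to the very same descriptor are trivially equivalent,
// so the slow comparison only runs when the pointers differ.
fn is_equiv_to(a: &Descr, b: &Descr) -> bool {
    std::ptr::eq(a, b) || equiv_types_slow(a, b)
}
```

Since descriptors for the common element types are usually shared objects, the cheap pointer comparison would be expected to hit often, which is consistent with the improvement in the `*_success` rows.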

kngwyu · 2022-04-25T07:35:28Z

The hack in is_equiv_to looks reasonable, but I'm struggling a bit to understand what IgnoreError speeds up. Does it currently only work for is_type_of?

adamreichold · 2022-04-25T08:12:33Z

> The hack in is_equiv_to looks reasonable, but I'm struggling a bit to understand what IgnoreError speeds up. Does it currently only work for is_type_of?

Yes, it is used only for is_type_of, which however backs downcast. The point of IgnoreError is that downcast does not return a PyErr but a PyDowncastError, and it does so iff is_type_of returns false.

So as indicated by the downcast_failure benchmark, calling downcast on something that does not match gets significantly faster, because no PyErr is allocated on the Python heap by is_type_of just to be dropped immediately afterwards. IgnoreError exploits this by just ignoring the error information entirely (since it will never reach a public API anyway).
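A minimal std-only sketch of the idea, assuming the shared logic is generic over its error type: `Value`, `TypeMismatch`, `HeavyError`, `extract_int`, `is_int` and `extract` are hypothetical names, with `HeavyError` playing the role of PyErr (costly to build) and `is_int` the role of is_type_of. Only the name `IgnoreError` comes from the discussion; its definition here is illustrative.

```rust
// Hypothetical dynamically typed value, standing in for a PyAny.
enum Value {
    Int(i64),
    Float(f64),
}

// Cheap, allocation-free description of what went wrong.
struct TypeMismatch {
    expected: &'static str,
    found: &'static str,
}

// Stand-in for PyErr: building it allocates a message.
#[derive(Debug)]
struct HeavyError(String);

impl From<TypeMismatch> for HeavyError {
    fn from(e: TypeMismatch) -> Self {
        HeavyError(format!("expected {}, found {}", e.expected, e.found))
    }
}

// Zero-sized error that throws all error information away.
struct IgnoreError;

impl From<TypeMismatch> for IgnoreError {
    fn from(_: TypeMismatch) -> Self {
        IgnoreError
    }
}

// Single implementation of the checking logic, generic over the error type.
fn extract_int<E: From<TypeMismatch>>(value: &Value) -> Result<i64, E> {
    match value {
        Value::Int(i) => Ok(*i),
        Value::Float(_) => Err(TypeMismatch { expected: "int", found: "float" }.into()),
    }
}

// Type check: only success or failure matters, so no message is ever built.
fn is_int(value: &Value) -> bool {
    extract_int::<IgnoreError>(value).is_ok()
}

// Full extraction: the caller gets the detailed (and costly) error.
fn extract(value: &Value) -> Result<i64, HeavyError> {
    extract_int(value)
}
```

With this shape the check and the extraction cannot drift apart, and the failing check pays only for constructing and dropping a zero-sized value.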

Finally, failed calls to extract and downcast are important because Python methods which support multiple element data types safely will need to repeatedly try to extract/downcast a given PyAny value into any of the supported PyArray instantiations.
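The repeated-attempt pattern described here can be sketched with the standard library alone. This uses `std::any::Any::downcast_ref` as a stand-in for downcasting a PyAny into different PyArray instantiations; `sum_any` and the choice of `Vec<f64>`, `Vec<i64>` and `Vec<i32>` are assumptions made purely for illustration.

```rust
use std::any::Any;

// Try each supported element type in turn. Every attempt but the matching
// one fails, so the cost of a failed downcast is paid on most calls.
fn sum_any(value: &dyn Any) -> Option<f64> {
    if let Some(v) = value.downcast_ref::<Vec<f64>>() {
        return Some(v.iter().sum());
    }
    if let Some(v) = value.downcast_ref::<Vec<i64>>() {
        return Some(v.iter().map(|&x| x as f64).sum());
    }
    if let Some(v) = value.downcast_ref::<Vec<i32>>() {
        return Some(v.iter().map(|&x| f64::from(x)).sum());
    }
    // None of the supported element types matched.
    None
}
```

An input of the last supported type goes through two failed attempts before it matches, which is why the failure path is worth optimizing.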

kngwyu

Thank you for your explanation. Let's land this PR after documenting the internal usage a bit.

src/error.rs

kngwyu · 2022-04-25T10:28:52Z

Hmm, the PyPy failure looks like a bug in maturin 😓

adamreichold · 2022-04-25T10:38:40Z

> Hmm, the PyPy failure looks like a bug in maturin 😓

Reported at PyO3/maturin#882

adamreichold · 2022-04-25T17:59:43Z

Will hold back on merging until Maturin 0.12.14 is released so that I can drop the last commit.

Base automatically changed from repr-transparent to main April 21, 2022 19:25

adamreichold force-pushed the extraction-error branch from 40b727c to dee5c3a Compare April 21, 2022 19:28

adamreichold added 2 commits April 22, 2022 10:06

Add downcast to benchmarks, in addition to extract.

7f5e79f

Do not call into PyArray_EquivTypes if the descriptors are identical.

fb8d706

adamreichold force-pushed the extraction-error branch from dee5c3a to cbb3c1c Compare April 22, 2022 08:30

adamreichold changed the title ~~RFC: Use a bespoke enum error type to avoid the overhead of creating a PyErr for downcasting.~~ Avoid the overhead of creating a PyErr for downcasting. Apr 22, 2022

kngwyu requested changes Apr 25, 2022

Avoid overhead of creating a PyErr during downcasting.

3795010

adamreichold force-pushed the extraction-error branch from cbb3c1c to 3795010 Compare April 25, 2022 10:20

kngwyu approved these changes Apr 25, 2022

adamreichold force-pushed the extraction-error branch from af79124 to 3795010 Compare April 25, 2022 22:08

adamreichold merged commit c1cc96f into main Apr 25, 2022

adamreichold deleted the extraction-error branch April 25, 2022 22:20
