Simplify code generated for for_each_method_def and for_each_proto_slot #1641

1tgr · 2021-05-28T16:37:05Z

This change replaces slice1.iter().chain(slice2)....chain(sliceN).for_each(f) with f(slice1); f(slice2); ... f(sliceN);.

On a real-world project it significantly reduces number of lines of LLVM code generated by the compiler, with a similar improvement in compilation time.

I adapted the word_count example to add a #[pyclass]:

#[pyclass]
struct WordCount {
    contents: String,
}

#[pymethods]
impl WordCount {
    #[new]
    fn new(contents: String) -> Self {
        Self { contents }
    }

    fn search(&self, needle: &str) -> usize {
        search(&self.contents, needle)
    }

    fn search_sequential(&self, needle: &str) -> usize {
        search_sequential(&self.contents, needle)
    }

    fn search_sequential_allow_threads(&self, py: Python, needle: &str) -> usize {
        search_sequential_allow_threads(py, &self.contents, needle)
    }
}
...
m.add_class::<WordCount>()?;

According to cargo llvm-lines --release, before this change the top 10 functions in the word_count example, by number of LLVM lines, are:

  Lines         Copies       Function name
  -----         ------       -------------
  50562 (100%)  2257 (100%)  (TOTAL)
   2323 (4.6%)    16 (0.7%)  <core::iter::adapters::chain::Chain<A,B> as core::iter::traits::iterator::Iterator>::fold
   2125 (4.2%)    24 (1.1%)  core::iter::traits::iterator::Iterator::fold
   1255 (2.5%)    21 (0.9%)  alloc::alloc::box_free
   1026 (2.0%)    16 (0.7%)  core::result::Result<T,E>::map_err
    844 (1.7%)    12 (0.5%)  std::panicking::try
    789 (1.6%)     8 (0.4%)  pyo3::callback::handle_panic
    753 (1.5%)     3 (0.1%)  word_count::<impl pyo3::class::impl_::PyMethods<word_count::WordCount> for pyo3::class::impl_::PyClassImplCollector<word_count::WordCount>>::py_methods::METHODS::__wrap::{{closure}}
    665 (1.3%)    16 (0.7%)  core::iter::adapters::chain::Chain<A,B>::new
    648 (1.3%)    16 (0.7%)  core::iter::traits::iterator::Iterator::chain
    598 (1.2%)    30 (1.3%)  core::ptr::read

And after:

  Lines         Copies       Function name
  -----         ------       -------------
  45552 (100%)  2168 (100%)  (TOTAL)
   1255 (2.8%)    21 (1.0%)  alloc::alloc::box_free
   1026 (2.3%)    16 (0.7%)  core::result::Result<T,E>::map_err
    844 (1.9%)    12 (0.6%)  std::panicking::try
    789 (1.7%)     8 (0.4%)  pyo3::callback::handle_panic
    753 (1.7%)     3 (0.1%)  word_count::<impl pyo3::class::impl_::PyMethods<word_count::WordCount> for pyo3::class::impl_::PyClassImplCollector<word_count::WordCount>>::py_methods::METHODS::__wrap::{{closure}}
    657 (1.4%)     7 (0.3%)  core::iter::traits::iterator::Iterator::fold
    598 (1.3%)    30 (1.4%)  core::ptr::read
    570 (1.3%)    10 (0.5%)  alloc::raw_vec::RawVec<T,A>::current_memory
    558 (1.2%)     8 (0.4%)  alloc::boxed::Box<T,A>::into_unique
    546 (1.2%)     7 (0.3%)  std::thread::local::LocalKey<T>::try_with

1tgr · 2021-05-28T16:49:38Z

I expect the actual machine code to be the same before and after, since slice's iterator doesn't do anything special in its for_each, meaning it's equivalent to this for loop.

pyo3-macros-backend/src/pyclass.rs

davidhewitt

Thanks, the slice approach looks good to me!

mejrs requested changes May 28, 2021

View reviewed changes

pyo3-macros-backend/src/pyclass.rs Outdated Show resolved Hide resolved

Simplify code generated for for_each_method_def and for_each_proto_slot

030aa43

1tgr force-pushed the for-each branch from 770cdfc to 030aa43 Compare May 29, 2021 08:10

Method and slot visitors accept slices instead of individual items

1ba3217

1tgr force-pushed the for-each branch from 96153c2 to 1ba3217 Compare May 29, 2021 08:38

davidhewitt approved these changes May 29, 2021

View reviewed changes

davidhewitt merged commit 8bf3ade into PyO3:main May 29, 2021

1tgr deleted the for-each branch May 29, 2021 15:12

alonblade mentioned this pull request May 31, 2021

using subclass, args on new, and overriding __init__ fails #1644

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify code generated for for_each_method_def and for_each_proto_slot #1641

Simplify code generated for for_each_method_def and for_each_proto_slot #1641

1tgr commented May 28, 2021 •

edited

Loading

1tgr commented May 28, 2021

davidhewitt left a comment

Simplify code generated for for_each_method_def and for_each_proto_slot #1641

Simplify code generated for for_each_method_def and for_each_proto_slot #1641

Conversation

1tgr commented May 28, 2021 • edited Loading

1tgr commented May 28, 2021

davidhewitt left a comment

Choose a reason for hiding this comment

1tgr commented May 28, 2021 •

edited

Loading