Support for Rayon integration #33

Open
rohitjoshi opened this issue Apr 19, 2018 · 6 comments

Comments

@rohitjoshi

Rayon supports parallel iterators and a mapv function for processing data on multiple threads. How can we integrate with Rayon so that we can leverage both SIMD and thread-level parallelism?

@AdamNiederer
Owner

You can do something like arr.par_iter(|chunk| chunk.simd_iter(|vec| ...)) to use both multithreading and SIMD.

@rohitjoshi
Author

rohitjoshi commented May 7, 2018

I tried your suggestion and am getting an error, e.g.:

pub fn sqrt_par_simd(a: &[f64]) -> Vec<f64> {
    a.par_iter(|chunk| {
        chunk
            .simd_iter()
            .simd_map(f64s(0.0), |index| index.sqrt())
            .scalar_collect()
    }).collect()
}

Error:

error[E0061]: this function takes 0 parameters but 1 parameter was supplied
  --> src/prior.rs:34:7
   |
34 |     a.par_iter(|chunk| {
   |       ^^^^^^^^ expected 0 parameters

error[E0277]: the trait bound `std::vec::Vec<f64>: rayon::iter::FromParallelIterator<&f64>` is not satisfied
  --> src/prior.rs:39:8
   |
39 |     }).collect()
   |        ^^^^^^^ the trait `rayon::iter::FromParallelIterator<&f64>` is not implemented for `std::vec::Vec<f64>`
   |
   = help: the following implementations were found:
             <std::vec::Vec<T> as rayon::iter::FromParallelIterator<T>>

@andersk
Contributor

andersk commented Dec 31, 2018

The Rayon syntax you’re looking for is a.par_chunks(128).flat_map(|chunk| …).collect(). (Pick your favorite chunk size.)
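
The chunk-then-flatten shape of that suggestion can be sketched without either crate. What follows is a minimal, dependency-free illustration and not Rayon's or Faster's API: `std::thread::scope` stands in for Rayon's `par_chunks`, and the hypothetical helper `sqrt_chunk` is a plain scalar loop standing in for the per-chunk Faster SIMD kernel. The names `sqrt_par_chunks` and `sqrt_chunk` are assumptions made for this example.

```rust
use std::thread;

/// Hypothetical per-chunk kernel. This scalar loop is a stand-in for the
/// SIMD work that Faster would do on each chunk.
fn sqrt_chunk(chunk: &[f64]) -> Vec<f64> {
    chunk.iter().map(|x| x.sqrt()).collect()
}

/// Splits `a` into chunks of `chunk_size`, processes each chunk on its own
/// scoped thread, and concatenates the per-chunk results in input order.
pub fn sqrt_par_chunks(a: &[f64], chunk_size: usize) -> Vec<f64> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    thread::scope(|s| {
        // Spawn every worker first so the chunks really run concurrently.
        let handles: Vec<_> = a
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || sqrt_chunk(chunk)))
            .collect();
        // Joining in spawn order keeps the output aligned with the input.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling `sqrt_par_chunks(&data, 128)` returns the square roots in the same order as `data`. The sketch spawns one thread per chunk, which is only reasonable for a small demonstration; Rayon schedules chunks on a shared thread pool instead.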

@Titaniumtown
Contributor

> The Rayon syntax you’re looking for is a.par_chunks(128).flat_map(|chunk| …).collect(). (Pick your favorite chunk size.)

How would that translate if I wanted to filter elements instead of map?

@andersk
Contributor

andersk commented Mar 24, 2021

AFAIK, Faster doesn’t currently provide a way to accelerate filter, with or without Rayon, so you’d just use the normal Rayon filter.

(If in the future some kind of filter is added to Faster, the same construction would work: you’d use Rayon’s .par_chunks().flat_map() around the hypothetical filter.)
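
To make the shape of that construction concrete, here is a minimal, dependency-free sketch of a chunked parallel filter. It is not Rayon's or Faster's API: `std::thread::scope` stands in for Rayon's `.par_chunks()`, and the per-chunk filter is an ordinary scalar `filter`, since no accelerated filter exists in Faster. The function name `filter_par_chunks` and its `keep` parameter are assumptions made for this example.

```rust
use std::thread;

/// Filters `a` chunk by chunk on scoped threads. Each chunk may keep a
/// different number of elements, which is why the per-chunk results are
/// flattened rather than mapped one-to-one.
pub fn filter_par_chunks<F>(a: &[f64], chunk_size: usize, keep: F) -> Vec<f64>
where
    F: Fn(f64) -> bool + Sync,
{
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    // Share the predicate by reference so every worker can call it.
    let keep = &keep;
    thread::scope(|s| {
        let handles: Vec<_> = a
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || {
                    // Scalar stand-in for a hypothetical accelerated filter.
                    chunk
                        .iter()
                        .copied()
                        .filter(|&x| keep(x))
                        .collect::<Vec<f64>>()
                })
            })
            .collect();
        // Joining in spawn order preserves the relative order of survivors.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

For example, `filter_par_chunks(&data, 128, |x| x > 0.0)` keeps the positive elements of `data` in their original order.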

@AdamNiederer
Owner

> AFAIK, Faster doesn’t currently provide a way to accelerate filter, with or without Rayon, so you’d just use the normal Rayon filter.

That's correct, and it's unlikely that it will without AVX-512; SSE and AVX don't really have the underlying instructions required to yield an appreciable performance improvement, save for specific cases which wouldn't work well in a general library.


4 participants