Add AVG Pooling cpu implementation #2296

Open: WenheLI wants to merge 4 commits into base: main

@WenheLI WenheLI commented Jun 28, 2024

Attempts to resolve #2294.

@EricLBuehler
Member

@WenheLI I think this is correct.

@WenheLI
Author

WenheLI commented Jun 29, 2024

@EricLBuehler Thanks! A follow-up question: the CPU backend implementation could probably be sped up with vectorization. Does candle's codebase already have any infrastructure that would help with this?

@EricLBuehler
Member

Hi @WenheLI, I think you could use something like Rayon: pick one of the for loops to parallelize (probably just one, since Rayon by default uses as many threads as there are CPU cores) and replace .iter() with .par_iter() on it.
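To illustrate the "parallelize only one loop" suggestion, here is a minimal sketch of a CPU average pool that splits work across the outermost (batch × channel) loop and leaves the inner loops sequential. It is not the code from this PR and does not use candle's types: the names `avg_pool_plane` and `avg_pool2d`, the flat `&[f32]` layout, and the no-padding assumption are all hypothetical. To stay dependency-free it uses `std::thread::scope` instead of Rayon, and it spawns one thread per plane for brevity; with Rayon, the outer `for` would instead iterate `out.par_chunks_mut(out_len)` zipped with `input.par_chunks(h * w)` on a shared thread pool. Note that this is thread-level parallelism, which is separate from SIMD vectorization.

```rust
use std::thread;

/// Average-pool a single row-major (h x w) plane into `dst`.
/// Assumes no padding and a kernel no larger than the plane.
fn avg_pool_plane(
    src: &[f32],
    dst: &mut [f32],
    (h, w): (usize, usize),
    (kh, kw): (usize, usize),
    (sh, sw): (usize, usize),
) {
    let out_h = (h - kh) / sh + 1;
    let out_w = (w - kw) / sw + 1;
    let scale = 1.0 / (kh * kw) as f32;
    for oy in 0..out_h {
        for ox in 0..out_w {
            let mut sum = 0.0f32;
            for ky in 0..kh {
                // Start of the kernel row inside the source plane.
                let start = (oy * sh + ky) * w + ox * sw;
                sum += src[start..start + kw].iter().sum::<f32>();
            }
            dst[oy * out_w + ox] = sum * scale;
        }
    }
}

/// Average-pool `planes` (= batch * channels) contiguous planes.
/// Only the outermost loop is parallel: each plane gets its own scoped thread.
fn avg_pool2d(
    input: &[f32],
    planes: usize,
    hw: (usize, usize),
    k: (usize, usize),
    s: (usize, usize),
) -> Vec<f32> {
    let (h, w) = hw;
    assert_eq!(input.len(), planes * h * w, "input length mismatch");
    assert!(k.0 >= 1 && k.1 >= 1 && s.0 >= 1 && s.1 >= 1);
    assert!(k.0 <= h && k.1 <= w, "kernel larger than plane");

    let out_len = ((h - k.0) / s.0 + 1) * ((w - k.1) / s.1 + 1);
    let mut out = vec![0.0f32; planes * out_len];

    thread::scope(|scope| {
        // Disjoint output chunks let each thread write without locking.
        for (src, dst) in input.chunks(h * w).zip(out.chunks_mut(out_len)) {
            scope.spawn(move || avg_pool_plane(src, dst, hw, k, s));
        }
    });
    out
}
```

Splitting on the outer loop keeps each unit of work large, so scheduling overhead stays small relative to the pooling itself; nesting parallel iterators on the inner loops as well would rarely help once every core is already busy.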

@WenheLI
Author

WenheLI commented Jun 29, 2024

Thanks! Added vectorization. Could someone take a look and review this?

Development

Successfully merging this pull request may close these issues.

How to get raw tensor data?
2 participants