Use specialized kernel for f-arrays and sum by axis=1. Add keepdims support #1489


AlexanderKalistratov
Collaborator

@AlexanderKalistratov AlexanderKalistratov commented Jul 19, 2023

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • If this PR is a work in progress, are you filing the PR as a draft?

Tests in #1488 cover this functionality, except for keepdims, which is blocked by #1487.

Speeds up cov with the rowvar=False parameter, as it is used in the PCA workload.

import time

import dpnp

a = dpnp.ones((1048576, 128))

# Untimed first call (warm-up)
dpnp.cov(a, rowvar=False, dtype=a.dtype)

# Timed second call; prints elapsed wall-clock seconds
start = time.time()
dpnp.cov(a, rowvar=False, dtype=a.dtype)
print(time.time() - start)

Before:

0.6248629093170166

After:

0.229935884475708
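
To show why cov with rowvar=False ends up as a sum over axis=1 of an F-contiguous array, here is a minimal sketch in plain NumPy. It is not the dpnp kernel from this PR. It assumes dpnp.cov follows NumPy's approach of viewing the input as its transpose when rowvar=False. The small array and the names `x`, `mean` and `centered` are illustrative only.

```python
import numpy as np

# Small stand-in for the (1048576, 128) benchmark array:
# 4 observations (rows) of 3 variables (columns), C-contiguous.
a = np.arange(12, dtype=np.float64).reshape(4, 3)

# With rowvar=False each column is a variable, so the data is viewed as a.T.
# Transposing a C-contiguous 2-D array yields an F-contiguous view.
x = a.T
print(x.flags.f_contiguous)

# Per-variable mean: a reduction along axis=1 of the F-contiguous view.
# keepdims=True keeps the reduced axis with length 1, so the result
# has shape (3, 1) and broadcasts against x without a reshape.
mean = x.sum(axis=1, keepdims=True) / x.shape[1]
centered = x - mean

# Unbiased covariance (divides by N - 1, NumPy's default).
cov = centered @ centered.T / (x.shape[1] - 1)
print(np.allclose(cov, np.cov(a, rowvar=False)))
```

The reduction runs over the long axis (the observations), which is the case the title describes as "f-arrays and sum by axis=1".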

@antonwolfy
Contributor

I've checked the PCA workload on the laptop:

| | numpy | dpnp CPU | dpnp GPU OCL | dpnp GPU L0 | size |
|---|---|---|---|---|---|
| baseline | 0.61 s | 0.75 s | 0.71 s | 0.68 s | 1048576 |
| PR #1489 | 0.6 s | 0.57 s | 0.3 s | 0.29 s | 1048576 |

Both CPU and GPU now meet the target for the workload, awesome.
I believe it closes #1398.
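
The speedups implied by the timings in the table above can be worked out with a short script. This is a sketch under one assumption that the thread does not state: that "target" means being no slower than NumPy in the same run. The dictionary names are illustrative.

```python
# Timings in seconds, copied from the table above (size 1048576).
baseline = {"numpy": 0.61, "dpnp CPU": 0.75, "dpnp GPU OCL": 0.71, "dpnp GPU L0": 0.68}
with_pr = {"numpy": 0.60, "dpnp CPU": 0.57, "dpnp GPU OCL": 0.30, "dpnp GPU L0": 0.29}

# Speedup of each configuration: baseline time divided by time with the PR.
speedup = {name: baseline[name] / with_pr[name] for name in baseline}

# Assumption (not stated in the thread): a dpnp configuration meets the
# target when it is no slower than NumPy measured in the same run.
meets_target = {
    name: with_pr[name] <= with_pr["numpy"]
    for name in with_pr
    if name != "numpy"
}

for name in baseline:
    print(f"{name}: {speedup[name]:.2f}x")
print(meets_target)
```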

@AlexanderKalistratov AlexanderKalistratov merged commit 70e5aa8 into IntelPython:master Jul 21, 2023