[BUG] cuDF covariance does not agree with pandas when ddof=N #10303
Comments
This was accidentally closed because of my comment here: #9889 (review)
In PR #9889, @isVoid observed that cudf and pandas handle covariance calculations differently when `N == ddof`. Here is a minimal example showing a few differences in behavior with `ddof`. This is a bug because cudf should give the same results as pandas. The bug affects both `df.cov()` and `df.groupby(...).cov()`. Note that this is not related to pandas bug 45814 regarding `ddof` with missing data, which was also found during #9889 (not directly linked here because it is not related), because no data is missing here.

Results (warnings not shown):
Notice that cudf isn't self-consistent between `df.cov()` and `df.groupby(...).cov()` in its results for `a`.

Originally posted by @bdice in #9889 (comment)
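
For context on why `N == ddof` is an edge case: the covariance is the sum of products of deviations divided by `N - ddof`, so the denominator becomes zero. Below is a minimal pure-Python sketch of that formula. The helper name `cov_with_ddof` is hypothetical, and returning NaN for a non-positive denominator is an assumed convention for illustration only; it does not claim to reproduce what pandas or cudf return.

```python
import math


def cov_with_ddof(x, y, ddof=1):
    """Covariance of two equal-length sequences, dividing by N - ddof.

    Hypothetical helper for illustration. Returning NaN when
    N - ddof <= 0 is an assumed convention, not pandas or cudf behavior.
    """
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n == 0:
        return math.nan

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of products of deviations from the means.
    total = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    denom = n - ddof
    if denom <= 0:
        # N == ddof (or N < ddof): the denominator vanishes, so the
        # result is undefined and libraries can disagree on what to return.
        return math.nan
    return total / denom


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]
print(cov_with_ddof(x, y, ddof=1))  # ordinary sample covariance
print(cov_with_ddof(x, y, ddof=3))  # N == ddof: NaN under this sketch's convention
```

Any library has to pick some behavior for the zero-denominator branch (NaN, inf, or an error), which is where two implementations can diverge even when no data is missing.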