t-test and p-value for partial and semi-partial correlation #171
Comments
Thanks @anthonybritto for spotting this critical issue. I need to fix this ASAP. Please see below for more details.

The problem

pingouin.partial_corr does not adjust the degrees of freedom by the number of covariates, which means that the p-value, confidence interval, power and Bayes factor are wrong. The correlation coefficient is unaffected by this issue.

The plot below shows how the p-value is affected as a function of the correlation coefficient, sample size and number of covariates. In short, the p-values currently returned by Pingouin are smaller than the exact p-values, and the gap widens with a small sample size, a large number of covariates, or a weak correlation coefficient. If the sample size is large enough, or the correlation is high, the p-value is largely unaffected.

How to fix / testing

The calculation of the p-value, confidence interval and power must be done within pingouin.partial_corr itself, and not by calling pingouin.corr as is currently the case.
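To make the size of the effect concrete, here is a minimal sketch of the p-value calculation with and without the degrees-of-freedom adjustment. The helper `partial_corr_pvalue` and the values of `r`, `n` and `k` are hypothetical and chosen only for illustration; this is not Pingouin's code, just the standard t-test on a correlation coefficient with `n - k - 2` degrees of freedom.

```python
import numpy as np
from scipy import stats


def partial_corr_pvalue(r, n, k):
    """Two-sided t-test for a partial correlation.

    r: partial correlation coefficient
    n: number of observations
    k: number of covariates (k=0 gives the plain correlation test)
    """
    dof = n - k - 2
    if dof <= 0:
        raise ValueError("need n > k + 2 observations")
    t = r * np.sqrt(dof / (1.0 - r**2))
    p = 2.0 * stats.t.sf(np.abs(t), dof)
    return t, p


# Illustrative values only: weak correlation, small sample, several covariates.
r, n, k = 0.3, 20, 5
t_exact, p_exact = partial_corr_pvalue(r, n, k)  # df = n - k - 2
t_naive, p_naive = partial_corr_pvalue(r, n, 0)  # df = n - 2 (covariates ignored)
print(f"adjusted df: t={t_exact:.3f}, p={p_exact:.4f}")
print(f"naive df:    t={t_naive:.3f}, p={p_naive:.4f}")
```

For a fixed `r`, ignoring the covariates inflates the t-statistic and uses a thinner-tailed distribution, so the naive p-value always comes out smaller than the adjusted one.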
Your fixes look good, as far as I can tell! Unfortunately, I don't know anything about the Bayes factor. I will look at the sources you reference at some point this week to see if I can help.
@anthonybritto I just released a new stable version of Pingouin (v0.3.12) which fixes the p-value and confidence intervals of the partial correlation. Please see the full list of modifications here: https://pingouin-stats.org/changelog.html
FYI I have just pushed a commit on the develop branch (81d1aaf) to re-implement the partial correlation using the same method as ppcor: pingouin/pingouin/correlation.py Lines 781 to 822 in 81d1aaf
We're now getting exactly the same results as ppcor for partial/semi-partial correlations (either Pearson or Spearman). I think this will be included in the next stable release of Pingouin.
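For readers unfamiliar with the ppcor-style approach, the sketch below computes partial correlations from the inverse of the covariance matrix. It is an assumption-laden illustration, not the code at the referenced lines of correlation.py: the function name `partial_corr_matrix` and the simulated data are hypothetical, and only the Pearson partial (not semi-partial) case is shown.

```python
import numpy as np


def partial_corr_matrix(data):
    """Partial correlation of every pair of columns, controlling for all others.

    data: array of shape (n_samples, n_variables).
    """
    cov = np.cov(data, rowvar=False)
    prec = np.linalg.pinv(cov)           # (pseudo-)inverse covariance matrix
    d = 1.0 / np.sqrt(np.diag(prec))     # scaling factors from the diagonal
    pcor = -prec * np.outer(d, d)        # negated, rescaled precision matrix
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated example: x and y both depend on two covariates z.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
x = z @ np.array([0.5, -0.3]) + rng.normal(size=200)
y = 0.4 * x + z @ np.array([0.2, 0.7]) + rng.normal(size=200)

data = np.column_stack([x, y, z])
pcor = partial_corr_matrix(data)
r_xy = pcor[0, 1]  # partial correlation of x and y given both columns of z
print(f"partial r(x, y | z) = {r_xy:.4f}")
```

The off-diagonal entry `pcor[0, 1]` equals the correlation between the residuals of `x` and `y` after regressing each on the covariates, which gives a convenient cross-check.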
@raphaelvallat Thanks for keeping me updated. I see that you have updated the p-value calculation to use the beta-distribution: I am not familiar with this approach, so I'm afraid I can't comment. If you are however reproducing the
Hi @anthonybritto,
Thanks,
Closing this issue but feel free to reopen.
As per the documentation of the R package ppcor (https://dx.doi.org/10.5351%2FCSAM.2015.22.6.665), the correct degrees of freedom for the calculation of the t-statistic and p-values are n - g - 2, where g is the number of covariates. I believe that the subtraction of g from df has not been implemented in pingouin, as the three screenshots below show.

Here, I simply pull the p-value from partial_corr to show that I haven’t altered anything.

I now compute what I believe to be the correct p-value, with df = n - (k - 1) - 2, since I control for all predictors but one.

Finally, I show how, by using the incorrect df = n - 2, I am able to reproduce pingouin’s results.
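For reference, the test described above can be written out as follows. The numerical values (n = 30, g = 3, r = 0.4) are assumptions picked for illustration and do not come from the screenshots.

```latex
% t-statistic for a partial correlation r with n observations and g covariates
t = r \sqrt{\frac{n - 2 - g}{1 - r^{2}}}, \qquad \mathrm{df} = n - 2 - g

% Illustrative values: n = 30, g = 3, r = 0.4
% Adjusted: df = 25, \quad t = 0.4 \sqrt{25 / 0.84} \approx 2.18
% Naive:    df = 28, \quad t = 0.4 \sqrt{28 / 0.84} \approx 2.31
```

The naive version both overstates t and refers it to a distribution with more degrees of freedom, which is why it yields the smaller p-value.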