Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal qsturng Implementation Substitution #156

Merged
merged 8 commits into from
Feb 13, 2021

Conversation

DominicChm
Copy link
Contributor

This PR resolves #155 by removing the internal qsturng and psturng function implementations (used for tukey's pairwise comparison) and replacing them with statsmodels' version.

The below is a worst-case example of the behavior seen with the current version. Pingouin's error compared to Statsmodels' is much greater than it should be, considering that the implementation of their psturng and qsturng functions is almost identical. With this change, that implementation is replaced with Statmodels' more accurate one.

cdf-0 01-8 225917112372644-20-10_DATA
Note: This graph is of the error of various different functions at varying degrees of input precision. In this case, statmodels' and pingouin's lines are generated by calling their psturng functions with the labeled input parameters, then subtracted from the high-precision numerical result of the integral the psturng function emulates to get error.

@codecov
Copy link

codecov bot commented Feb 12, 2021

Codecov Report

Merging #156 (e8eb5b9) into master (58df891) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #156   +/-   ##
=======================================
  Coverage   99.38%   99.38%           
=======================================
  Files          18       18           
  Lines        3239     3239           
  Branches      531      531           
=======================================
  Hits         3219     3219           
  Misses          8        8           
  Partials       12       12           
Impacted Files Coverage Δ
pingouin/pairwise.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 58df891...e8eb5b9. Read the comment docs.

@raphaelvallat raphaelvallat self-assigned this Feb 12, 2021
@raphaelvallat raphaelvallat added feature request 🚧 New feature or request invalid 🚩 This doesn't seem right labels Feb 12, 2021
@raphaelvallat
Copy link
Owner

Thanks @DominicChm, that's really great! A few minor comments before merging:

  1. Can you add the changes to the docs/changelog.rst?
  2. Can you also update the Examples section in the documentation of the pairwise_tukey and pairwise_gameshowell function?
  3. Finally, make sure that all the flake8 and doctest run smoothly by following the instructions here: https://pingouin-stats.org/contributing.html

@DominicChm
Copy link
Contributor Author

Ok, @raphaelvallat, I've updated examples and changelog in a way that I hope is consistent with previous ones. I added the change to the 0.3.9 (current?) version because I'm not sure about when this change should ship, but could start another section instead if it makes more sense.

@raphaelvallat
Copy link
Owner

Amazing @DominicChm ! So, we actually need to start a new section, I think let's go for 0.3.10 as I'm hoping to release this bugfix soon. Thanks again!

@DominicChm
Copy link
Contributor Author

@raphaelvallat No problem, happy to help! I've added the new changelog section. I included the same disclaimer from the previous bugfix because it seems like this problem could significantly affect p-values.

@@ -15,6 +15,8 @@ We therefore strongly recommend that all users UPDATE Pingouin (:code:`pip insta

Furthermore, and to prevent a similar issue, we have now disabled ``marginal=False`` in two-way repeated measure design. As of this release, ``marginal=False`` will therefore only have an impact on the between-factor T-test(s) of a mixed design.

In addition, the internal function responsible for generating a significance Q value for :py:func:`pingouin.pairwise_ttests` has been replaced with an equivalent from statsmodels, as it was found to be substantially inaccurate for certain combinations of treatments, degrees of freedom, and significances.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DominicChm

  1. The changes proposed in this PR only affects the pairwise_tukey and pairwise_gameshowell functions, not the pairwise_ttests function

  2. Could you add a hyperlink to the PR, as well as your GitHub user name (if you wish)?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can also be more exhaustive and say that there was an error in the algorithm for the studentized range approximation that lead to minor errors in the p-values of the pairwise_tukey / pairwise_gamesshowell function. Most likely however, this did not affect the significance levels, with the exception of the p-values that may have been very close to significance (around the 0.05 threshold) -- or am I wrong?

@raphaelvallat raphaelvallat merged commit 6307ba8 into raphaelvallat:master Feb 13, 2021
@raphaelvallat
Copy link
Owner

Merging now, thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request 🚧 New feature or request invalid 🚩 This doesn't seem right
Projects
None yet
Development

Successfully merging this pull request may close these issues.

qsturng.py Inaccuracy
2 participants